In this module, we learned how to calculate the confidence interval for a single population mean where the population standard deviation is known. When estimating a population mean, the margin of error is called the error bound for a population mean (*EBM*). A confidence interval has the general form:

(lower bound, upper bound) = (point estimate – *EBM*, point estimate + *EBM*)

The calculation of *EBM* depends on the size of the sample and the level of confidence desired. The confidence level is the percentage of all possible samples whose confidence intervals can be expected to contain the true population parameter. As the confidence level increases, the corresponding *EBM* increases as well. As the sample size increases, the *EBM* decreases. By the central limit theorem,

$EBM=z\frac{\sigma}{\sqrt{n}}$
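As an illustration of the formula above, here is a minimal Python sketch that computes *EBM* and the resulting interval when *σ* is known. The function names and the numbers in the example (sample mean 68, *σ* = 3, *n* = 36, 90% confidence) are hypothetical and chosen only for demonstration; *z* is taken to be the standard normal score with area *α*/2 to its right.

```python
from math import sqrt
from statistics import NormalDist


def error_bound_mean(sigma, n, confidence):
    """Return EBM = z * sigma / sqrt(n) for a known population sigma."""
    alpha = 1 - confidence
    # z is the standard normal score with area alpha/2 to its right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


def mean_confidence_interval(x_bar, sigma, n, confidence):
    """Return (point estimate - EBM, point estimate + EBM)."""
    ebm = error_bound_mean(sigma, n, confidence)
    return (x_bar - ebm, x_bar + ebm)


# Hypothetical example values, not taken from the text
lower, upper = mean_confidence_interval(x_bar=68, sigma=3, n=36, confidence=0.90)
print(f"({lower:.3f}, {upper:.3f})")
```

Calling `error_bound_mean` with a higher confidence level gives a larger *EBM*, and calling it with a larger *n* gives a smaller one, matching the relationships described above.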

Given a confidence interval, you can work backwards to find the error bound (*EBM*) or the sample mean. To find the error bound, subtract the sample mean from the upper bound of the interval. If you do not know the sample mean, you can find the error bound by calculating half the difference of the upper and lower bounds. To find the sample mean given a confidence interval, subtract the error bound from the upper bound. If the error bound is unknown, then average the upper and lower bounds of the confidence interval to find the sample mean.
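The two "work backwards" rules that need only the interval itself can be sketched in a few lines of Python. The helper names and the interval (67.18, 68.82) are hypothetical, used only to show the arithmetic.

```python
def error_bound_from_interval(lower, upper):
    """EBM is half the width of the confidence interval."""
    return (upper - lower) / 2


def sample_mean_from_interval(lower, upper):
    """The sample mean is the midpoint of the confidence interval."""
    return (lower + upper) / 2


# Hypothetical interval, not taken from the text
lower, upper = 67.18, 68.82
ebm = error_bound_from_interval(lower, upper)
x_bar = sample_mean_from_interval(lower, upper)
print(f"EBM = {ebm:.2f}, sample mean = {x_bar:.2f}")
```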

Sometimes researchers know in advance that they want to estimate a population mean within a specific margin of error for a given level of confidence. In that case, solve the *EBM* formula for *n* to discover the size of the sample that is needed to achieve this goal:

$n=\frac{{z}^{2}{\sigma}^{2}}{EB{M}^{2}}$
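A short Python sketch of this calculation follows. The function name and the example values (*σ* = 15, *EBM* = 2, 95% confidence) are hypothetical. The result is rounded up to the next whole number, an assumption made here because rounding down would leave the sample slightly too small to achieve the desired error bound.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, ebm, confidence):
    """Return the smallest whole n with z * sigma / sqrt(n) <= ebm."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n = (z ** 2) * (sigma ** 2) / (ebm ** 2)
    # Round up so the achieved error bound does not exceed the target
    return ceil(n)


# Hypothetical example values, not taken from the text
n_needed = required_sample_size(sigma=15, ebm=2, confidence=0.95)
print(n_needed)
```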

In many cases, the researcher does not know the population standard deviation, *σ*, of the measure being studied. In these cases, it is common to use the sample standard deviation, *s*, as an estimate of *σ*. The normal distribution creates accurate confidence intervals when *σ* is known, but it is not as accurate when *s* is used as an estimate. In this case, the Student’s t-distribution is a better choice, particularly when the sample is small. Define a t-score using the following formula:

$t=\frac{\overline{x}-\mu}{s/\sqrt{n}}$

The *t*-score follows the Student’s t-distribution with *n* – 1 degrees of freedom. The confidence interval under this distribution is calculated with *EBM* = $\left({t}_{\frac{\alpha}{2}}\right)\frac{s}{\sqrt{n}}$ where ${t}_{\frac{\alpha}{2}}$ is the *t*-score with area to the right equal to $\frac{\alpha}{2}$, *s* is the sample standard deviation, and *n* is the sample size. Use a table, calculator, or computer to find ${t}_{\frac{\alpha}{2}}$ for a given *α*.
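The following Python sketch applies the *EBM* formula for the t-distribution to raw data. Because the Python standard library has no inverse t-distribution function, the critical value ${t}_{\frac{\alpha}{2}}$ is passed in as an argument, as if it had been read from a table. The function name and the data are hypothetical; the value 2.776 is the table entry for 4 degrees of freedom and a right-tail area of 0.025.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(data, t_crit):
    """Return (lower, upper) using EBM = t_crit * s / sqrt(n).

    t_crit must be the t-score for n - 1 degrees of freedom with
    area alpha/2 to its right (from a table, calculator, or computer).
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    ebm = t_crit * s / sqrt(n)
    return (x_bar - ebm, x_bar + ebm)


# Hypothetical data: n = 5, so df = 4; t_crit = 2.776 for 95% confidence
sample = [4, 6, 5, 7, 8]
lower, upper = t_confidence_interval(sample, t_crit=2.776)
print(f"({lower:.3f}, {upper:.3f})")
```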

Some statistical measures, like many survey questions, measure qualitative rather than quantitative data. In this case, the population parameter being estimated is a proportion. It is possible to create a confidence interval for the true population proportion following procedures similar to those used in creating confidence intervals for population means. The formulas are slightly different, but they follow the same reasoning.

Let *p′* represent the sample proportion, *x/n*, where *x* represents the number of successes and *n* represents the sample size. Let *q′* = 1 – *p′*. Then the confidence interval for a population proportion is given by the following formula:

(lower bound, upper bound) $=({p}^{\prime}-EBP,{p}^{\prime}+EBP)=\left({p}^{\prime}-z\sqrt{\frac{{p}^{\prime}{q}^{\prime}}{n}},{p}^{\prime}+z\sqrt{\frac{{p}^{\prime}{q}^{\prime}}{n}}\right)$
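The formula above can be sketched in Python as follows. The function name and the example counts (60 successes in 100 trials, 95% confidence) are hypothetical, and *z* is again taken to be the standard normal score with area *α*/2 to its right.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence):
    """Return (p' - EBP, p' + EBP) with EBP = z * sqrt(p' q' / n)."""
    p_prime = x / n          # sample proportion
    q_prime = 1 - p_prime
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ebp = z * sqrt(p_prime * q_prime / n)
    return (p_prime - ebp, p_prime + ebp)


# Hypothetical example values, not taken from the text
lower, upper = proportion_confidence_interval(x=60, n=100, confidence=0.95)
print(f"({lower:.4f}, {upper:.4f})")
```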

The “plus four” method for calculating confidence intervals is an attempt to balance the error introduced by using estimates of the population proportion when calculating the standard deviation of the sampling distribution. Simply imagine four additional trials in the study; two are successes and two are failures. Calculate ${p}^{\prime}=\frac{x+2}{n+4}$, and proceed to find the confidence interval. When sample sizes are small, this method has been demonstrated to provide more accurate confidence intervals than the standard formula used for larger samples.
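A minimal Python sketch of the "plus four" adjustment is shown below. The function name and the example counts (6 successes in 25 trials) are hypothetical. The sketch assumes that the four imagined trials also enlarge the sample size used in the error bound, so *n* + 4 replaces *n* under the square root.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_interval(x, n, confidence):
    """Return (lower, upper) after adding two successes and two failures."""
    n_adj = n + 4                 # four imagined extra trials
    p_adj = (x + 2) / n_adj       # adjusted sample proportion
    q_adj = 1 - p_adj
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Assumption: the adjusted sample size n + 4 is used in the error bound
    ebp = z * sqrt(p_adj * q_adj / n_adj)
    return (p_adj - ebp, p_adj + ebp)


# Hypothetical small-sample example, not taken from the text
lower, upper = plus_four_interval(x=6, n=25, confidence=0.95)
print(f"({lower:.4f}, {upper:.4f})")
```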