Introductory Business Statistics

7.3 The Central Limit Theorem for Proportions


The Central Limit Theorem tells us that the sample mean, $\bar{x}$, which is our point estimate of the population mean, comes from a normal distribution of $\bar{x}$'s. This theoretical distribution is called the sampling distribution of $\bar{x}$'s. We now investigate the sampling distribution for another important parameter we wish to estimate: $p$ from the binomial probability density function.

If the random variable is discrete, such as for categorical data, then the parameter we wish to estimate is the population proportion. This is, of course, the probability of drawing a success in any one random draw. Unlike the case just discussed for a continuous random variable, where we did not know the population distribution of the X's, here we actually know the underlying probability density function for these data: it is the binomial. The random variable is X = the number of successes, and the parameter we wish to know is $p$, the probability of drawing a success, which is also the proportion of successes in the population. The question at issue is: from what distribution was the sample proportion, $p' = \frac{x}{n}$, drawn? Here $n$ is the sample size and $x$ is the number of successes found in that sample. This parallels the question just answered by the Central Limit Theorem: from what distribution was the sample mean, $\bar{x}$, drawn? We saw that once we know that this distribution is the normal distribution, we are able to create confidence intervals for the population parameter, $\mu$. We will also use this same information to test hypotheses about the population mean later. We now wish to develop confidence intervals for the population parameter $p$ from the binomial probability density function.

In order to find the distribution from which sample proportions come, we need to develop the sampling distribution of sample proportions, just as we did for sample means. So again imagine that we randomly sample, say, 50 people and ask them if they support the new school bond issue. From this we find a sample proportion, $p'$, and graph it on the axis of $p'$ values. We do this again and again until we have the theoretical distribution of $p'$ values. Some sample proportions will show high favorability toward the bond issue and others will show low favorability, because random sampling reflects the variation of views within the population. What we have done can be seen in Figure 7.9. The top panel is the population distribution of probabilities for each possible value of the random variable X. While we do not know what the specific distribution looks like because we do not know $p$, the population parameter, we do know that it must look something like this. In reality, we do not know either the mean or the standard deviation of this population distribution, the same difficulty we faced when analyzing the X's previously.
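The repeated-sampling thought experiment above can be sketched as a small simulation. This is only an illustration under stated assumptions: the true level of support, `p = 0.6`, is a hypothetical value (in practice it is unknown), and the function name `simulate_sample_proportions` is made up for this example.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample's p' = x / n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _sample in range(num_samples):
        # x = number of successes ("yes" answers) among n independent draws
        x = sum(1 for _person in range(n) if rng.random() < p)
        proportions.append(x / n)
    return proportions

# Hypothetical values: true support p = 0.6, n = 50 people per sample
p_primes = simulate_sample_proportions(p=0.6, n=50, num_samples=10_000)
print("mean of the p' values:   ", round(statistics.mean(p_primes), 4))
print("std dev of the p' values:", round(statistics.pstdev(p_primes), 4))
```

Each entry of `p_primes` is one point on the axis of sample proportions; a histogram of the list would approximate the lower panel described for Figure 7.9.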

Figure 7.9

Figure 7.9 places the mean on the distribution of population probabilities as $\mu = np$, but of course we do not actually know the population mean because we do not know the population probability of success, $p$. Below the distribution of the population values is the sampling distribution of the $p'$ values. Again the Central Limit Theorem tells us that this distribution is normally distributed, just like the sampling distribution for $\bar{x}$'s. This sampling distribution also has a mean, the mean of the $p'$ values, and a standard deviation, $\sigma_{p'}$.
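One way to see how well the normal curve describes the sample proportions is to compare an exact binomial probability with its normal approximation. The sketch below is illustrative only: the values `n = 50` and `p = 0.6` are hypothetical, the function names are invented for this example, and it uses the standard deviation $\sqrt{p(1-p)/n}$ that this section derives further on.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_prob_between(x_lo, x_hi, n, p):
    """Exact P(x_lo <= X <= x_hi) for a binomial X, i.e. P(x_lo/n <= p' <= x_hi/n)."""
    return sum(
        comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(x_lo, x_hi + 1)
    )

def normal_prob_between(x_lo, x_hi, n, p):
    """Normal approximation to the same probability (no continuity correction)."""
    sampling_dist = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))
    return sampling_dist.cdf(x_hi / n) - sampling_dist.cdf(x_lo / n)

# Hypothetical values: n = 50, p = 0.6; probability that p' lies between 0.50 and 0.70
n, p = 50, 0.6
exact = exact_prob_between(25, 35, n, p)
approx = normal_prob_between(25, 35, n, p)
print(f"exact binomial: {exact:.4f}")
print(f"normal approx:  {approx:.4f}")
```

The two numbers will not match exactly. Part of the gap comes from $p'$ moving in discrete steps of $1/n$ while the normal curve is continuous, and the agreement generally improves as $n$ grows.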

Importantly, in the analysis of the distribution of sample means, the Central Limit Theorem told us the expected value of the sample means in the sampling distribution and the standard deviation of the sampling distribution. The Central Limit Theorem provides the same information for the sampling distribution of proportions. The answers are:

  1. The expected value of the sampling distribution of sample proportions, $\mu_{p'}$, is the population proportion, $p$.
  2. The standard deviation of the sampling distribution of sample proportions, $\sigma_{p'}$, is the standard deviation of a single draw from the population, $\sqrt{p(1-p)}$, divided by the square root of the sample size, $n$.

Both these conclusions are the same as we found for the sampling distribution for sample means. However, in this case, because the mean and standard deviation of the binomial distribution both rely upon $p$, the formula for the standard deviation of the sampling distribution requires algebraic manipulation to be useful. We will take that up in the next chapter. The proof of these important conclusions from the Central Limit Theorem is provided below.

$$E(p') = E\left(\frac{x}{n}\right) = \left(\frac{1}{n}\right)E(x) = \left(\frac{1}{n}\right)np = p$$

(The expected value of X, E(x), is simply the mean of the binomial distribution which we know to be np.)

$$\sigma_{p'}^2 = \mathrm{Var}(p') = \mathrm{Var}\left(\frac{x}{n}\right) = \frac{1}{n^2}\mathrm{Var}(x) = \frac{1}{n^2}\bigl(np(1-p)\bigr) = \frac{p(1-p)}{n}$$

The standard deviation of the sampling distribution for proportions is thus:

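$$\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}}$$

| Parameter | Population distribution | Sample | Sampling distribution of $p'$ |
|---|---|---|---|
| Mean | $\mu = np$ | $p' = \frac{x}{n}$ | $p'$ and $E(p') = p$ |
| Standard deviation | $\sigma = \sqrt{npq}$ | | $\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}}$ |

Table 7.2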

Table 7.2 summarizes these results and shows the relationship between the population, the sample, and the sampling distribution. Notice the parallel between this table and Table 7.1 for the case where the random variable is continuous and we were developing the sampling distribution for means.
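The two results for the sampling distribution can be checked numerically by computing the mean and variance of $p' = x/n$ directly from the binomial probabilities. The following is a minimal sketch, not part of the original derivation; the values `n = 50` and `p = 0.6` and the function name `sampling_moments` are assumptions chosen for illustration.

```python
from math import comb, isclose

def sampling_moments(n, p):
    """Return (mean, variance) of p' = X / n, computed from the binomial probabilities."""
    # Probability of each possible count of successes x = 0, 1, ..., n
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum((x / n) * prob for x, prob in enumerate(pmf))
    variance = sum((x / n - mean) ** 2 * prob for x, prob in enumerate(pmf))
    return mean, variance

n, p = 50, 0.6  # hypothetical sample size and population proportion
mean, variance = sampling_moments(n, p)
print("E(p')   from probabilities:", mean, "  formula p:", p)
print("Var(p') from probabilities:", variance, "  formula p(1-p)/n:", p * (1 - p) / n)
print("formulas agree:", isclose(mean, p) and isclose(variance, p * (1 - p) / n))
```

Because the check sums over every possible value of $x$, it involves no random sampling; any difference from the formulas is floating-point rounding.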

Reviewing the formula for the standard deviation of the sampling distribution for proportions, we see that as $n$ increases the standard deviation decreases. This is the same observation we made for the standard deviation of the sampling distribution for means. Again, as the sample size increases, the point estimate for either $\mu$ or $p$ comes from a narrower and narrower distribution. We concluded that, for a given level of probability, the range from which the point estimate comes is smaller as the sample size, $n$, increases. Figure 7.8 shows this result for the case of sample means. Simply substitute $p'$ for $\bar{x}$ and we can see the impact of the sample size on the estimate of the population proportion.
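The effect of the sample size can be made concrete by evaluating the standard deviation formula for a few values of $n$. The sketch below is illustrative: `p = 0.6`, the sample sizes, the 95% level, and the function name `proportion_range` are all assumptions, and the range relies on the normal approximation to the sampling distribution.

```python
from math import sqrt
from statistics import NormalDist

def proportion_range(p, n, level=0.95):
    """Return (sigma, low, high): the std dev of p' and the central range that
    holds `level` of the p' values under the normal approximation."""
    sigma = sqrt(p * (1 - p) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    return sigma, p - z * sigma, p + z * sigma

# Hypothetical population proportion p = 0.6 and three sample sizes
for n in (50, 200, 800):
    sigma, low, high = proportion_range(p=0.6, n=n)
    print(f"n={n:4d}  sigma={sigma:.4f}  95% range=({low:.3f}, {high:.3f})")
```

Since $n$ sits under a square root, quadrupling the sample size halves the standard deviation, so the range shrinks steadily but more slowly than the sample grows.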

Citation/Attribution

Want to cite, share, or modify this book? This book is Creative Commons Attribution License 4.0 and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introductory-business-statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introductory-business-statistics/pages/1-introduction
Citation information

© Nov 29, 2017 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License 4.0 license. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.