Introductory Statistics

7.3 Using the Central Limit Theorem

It is important for you to understand when to use the central limit theorem (CLT). If you are asked to find the probability of a sample mean, use the CLT for the mean. If you are asked to find the probability of a sum or total, use the CLT for sums. The same rule applies to percentiles for means and sums.

NOTE

If you are asked to find the probability of an individual value, do not use the CLT. Use the distribution of its random variable.

Examples of the Central Limit Theorem

Law of Large Numbers

The law of large numbers says that if you take samples of larger and larger size from any population, then the mean $\bar{x}$ of the sample tends to get closer and closer to $\mu$. From the central limit theorem, we know that as $n$ gets larger and larger, the sample means follow a normal distribution. The larger $n$ gets, the smaller the standard deviation gets. (Remember that the standard deviation for $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$.) This means that the sample mean $\bar{x}$ must be close to the population mean $\mu$. We can say that $\mu$ is the value that the sample means approach as $n$ gets larger. The central limit theorem illustrates the law of large numbers.
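The shrinking spread of the sample mean can be checked numerically. The short sketch below evaluates the standard deviation of the sample mean, σ/√n, for growing n. The function name is hypothetical, and σ = 1.15 is only an illustrative value (it matches the stress-score example later in this section).

```python
import math

def sd_of_sample_mean(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)

# sigma = 1.15 is an illustrative assumption, not part of the law itself.
for n in (1, 25, 100, 400, 10_000):
    print(n, round(sd_of_sample_mean(1.15, n), 4))
```

Each fourfold increase in n halves the spread, so the sample means cluster more and more tightly around μ.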

Central Limit Theorem for the Mean and Sum Examples

Example 7.8

A study involving stress is conducted among the students on a college campus. The stress scores follow a uniform distribution with the lowest stress score equal to one and the highest equal to five. Using a sample of 75 students, find:

  a. The probability that the mean stress score for the 75 students is less than two.
  b. The 90th percentile for the mean stress score for the 75 students.
  c. The probability that the total of the 75 stress scores is less than 200.
  d. The 90th percentile for the total stress score for the 75 students.

Let X = one stress score.

Problems a and b ask you to find a probability or a percentile for a mean. Problems c and d ask you to find a probability or a percentile for a total or sum. The sample size, n, is equal to 75.

Since the individual stress scores follow a uniform distribution, X ~ U(1, 5) where a = 1 and b = 5 (see Continuous Random Variables for an explanation of the uniform distribution).

$\mu_X = \frac{a+b}{2} = \frac{1+5}{2} = 3$

$\sigma_X = \sqrt{\frac{(b-a)^2}{12}} = \sqrt{\frac{(5-1)^2}{12}} = 1.15$
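The two formulas for the uniform distribution can be evaluated directly. This is a minimal sketch; the helper name `uniform_mean_sd` is hypothetical.

```python
import math

def uniform_mean_sd(a: float, b: float) -> tuple[float, float]:
    """Mean and standard deviation of a continuous uniform distribution on [a, b]."""
    mean = (a + b) / 2
    sd = math.sqrt((b - a) ** 2 / 12)
    return mean, sd

mu_x, sigma_x = uniform_mean_sd(1, 5)
print(mu_x, round(sigma_x, 2))  # 3.0 1.15
```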

For problems a and b, let $\bar{X}$ = the mean stress score for the 75 students. Then,

$\bar{X} \sim N\left(3, \frac{1.15}{\sqrt{75}}\right)$

Problem

a. Find $P(\bar{x} < 2)$. Draw the graph.

Problem

b. Find the 90th percentile for the mean of 75 stress scores. Draw a graph.
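One way to check problems a and b is with Python's standard-library `statistics.NormalDist`. The sketch below assumes the rounded value σ = 1.15 used in the example and treats the sample mean as normal by the central limit theorem. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 75
mu_x, sigma_x = 3, 1.15  # values from the example (sigma rounded to 1.15)

# Sampling distribution of the mean: N(mu, sigma / sqrt(n))
mean_dist = NormalDist(mu=mu_x, sigma=sigma_x / math.sqrt(n))

p_mean_below_2 = mean_dist.cdf(2)       # problem a
pct90_mean = mean_dist.inv_cdf(0.90)    # problem b

print(f"P(xbar < 2) = {p_mean_below_2:.4f}")
print(f"90th percentile of xbar = {pct90_mean:.2f}")
```

A mean of 2 lies more than seven standard deviations below 3, so the probability in problem a is essentially zero. The 90th percentile comes out near 3.17.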

For problems c and d, let $\Sigma X$ = the sum of the 75 stress scores. Then, $\Sigma X \sim N\left[(75)(3), (\sqrt{75})(1.15)\right]$

Problem

c. Find P(Σx < 200). Draw the graph.

Problem

d. Find the 90th percentile for the total of 75 stress scores. Draw a graph.
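Problems c and d can be checked the same way, using the distribution of the sum instead of the mean. This is a sketch under the same assumptions as the example (σ rounded to 1.15, normality from the central limit theorem).

```python
import math
from statistics import NormalDist

n = 75
mu_x, sigma_x = 3, 1.15  # values from the example (sigma rounded to 1.15)

# Sampling distribution of the sum: N(n * mu, sqrt(n) * sigma)
sum_dist = NormalDist(mu=n * mu_x, sigma=math.sqrt(n) * sigma_x)

p_sum_below_200 = sum_dist.cdf(200)    # problem c
pct90_sum = sum_dist.inv_cdf(0.90)     # problem d

print(f"P(sum < 200) = {p_sum_below_200:.4f}")
print(f"90th percentile of the sum = {pct90_sum:.1f}")
```

The probability is about 0.006, and the 90th percentile is about 237.8.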

Try It 7.8

Use the information in Example 7.8, but use a sample size of 55 to answer the following questions.

  1. Find $P(\bar{x} < 7)$.
  2. Find P(Σx > 170).
  3. Find the 80th percentile for the mean of 55 scores.
  4. Find the 85th percentile for the sum of 55 scores.

Example 7.9

Suppose that a market research analyst for a cell phone company conducts a study of their customers who exceed the time allowance included on their basic cell phone contract; the analyst finds that for those people who exceed the time included in their basic contract, the excess time used follows an exponential distribution with a mean of 22 minutes.

Consider a random sample of 80 customers who exceed the time allowance included in their basic cell phone contract.

Let X = the excess time used by one INDIVIDUAL cell phone customer who exceeds his contracted time allowance.

$X \sim Exp\left(\frac{1}{22}\right)$. From previous chapters, we know that $\mu = 22$ and $\sigma = 22$.

Let $\bar{X}$ = the mean excess time used by a sample of n = 80 customers who exceed their contracted time allowance.

$\bar{X} \sim N\left(22, \frac{22}{\sqrt{80}}\right)$ by the central limit theorem for sample means

Problem

Using the CLT to find probability

  a. Find the probability that the mean excess time used by the 80 customers in the sample is longer than 20 minutes. This is asking us to find $P(\bar{x} > 20)$. Draw the graph.
  b. Suppose that one customer who exceeds the time limit for his cell phone contract is randomly selected. Find the probability that this individual customer's excess time is longer than 20 minutes. This is asking us to find $P(x > 20)$.
  c. Explain why the probabilities in parts a and b are different.
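The contrast between parts a and b shows up clearly in a quick computation. The sketch below uses the normal distribution for the sample mean (part a) and the exponential survival function P(X > x) = e^(−x/μ) for a single customer (part b). Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu = sigma = 22  # for an exponential distribution, mean and sd are equal
n = 80

# Part a: the sample mean is approximately N(mu, sigma / sqrt(n))
mean_dist = NormalDist(mu, sigma / math.sqrt(n))
p_mean_over_20 = 1 - mean_dist.cdf(20)

# Part b: one individual follows the exponential distribution itself
p_individual_over_20 = math.exp(-20 / mu)

print(f"P(xbar > 20) = {p_mean_over_20:.4f}")
print(f"P(x > 20)    = {p_individual_over_20:.4f}")
```

The mean of 80 customers is far less variable than a single customer's time, so its distribution concentrates near 22 and the probability of exceeding 20 is roughly 0.79, compared with roughly 0.40 for an individual.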

Problem

Using the CLT to find percentiles

Find the 95th percentile for the sample mean excess time for samples of 80 customers who exceed their basic contract time allowances. Draw a graph.
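A percentile of the sample mean is an inverse-CDF lookup on the same normal distribution. The sketch below assumes the distribution of the sample mean given earlier in the example, N(22, 22/√80).

```python
import math
from statistics import NormalDist

mu = sigma = 22
n = 80

mean_dist = NormalDist(mu, sigma / math.sqrt(n))
pct95_mean = mean_dist.inv_cdf(0.95)  # value with 95% of sample means below it

print(f"95th percentile of xbar = {pct95_mean:.1f} minutes")
```

The result is about 26.0 minutes: roughly 95 percent of samples of 80 customers would have a mean excess time below that value.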

Try It 7.9

Use the information in Example 7.9, but change the sample size to 144.

  1. Find $P(20 < \bar{x} < 30)$.
  2. Find P(Σx is at least 3,000).
  3. Find the 75th percentile for the sample mean excess time of 144 customers.
  4. Find the 85th percentile for the sum of 144 excess times used by customers.

Example 7.10

In the United States, someone is sexually assaulted every two minutes, on average, according to a number of studies. Suppose the standard deviation is 0.5 minutes and the sample size is 100.

Problem

  1. Find the median, the first quartile, and the third quartile for the sample mean time of sexual assaults in the United States.
  2. Find the median, the first quartile, and the third quartile for the sum of sample times of sexual assaults in the United States.
  3. Find the probability that a sexual assault occurs on the average between 1.75 and 1.85 minutes.
  4. Find the value that is two standard deviations above the sample mean.
  5. Find the IQR for the sum of the sample times.
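The five parts can be checked with one small script. This is a sketch under stated assumptions: the sample mean and the sum are treated as normal by the central limit theorem with n = 100, and part 3 is read as P(1.75 < x̄ < 1.85). Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2, 0.5, 100

mean_dist = NormalDist(mu, sigma / math.sqrt(n))     # N(2, 0.05)
sum_dist = NormalDist(n * mu, math.sqrt(n) * sigma)  # N(200, 5)

# Parts 1 and 2: first quartile, median, third quartile
mean_quartiles = [mean_dist.inv_cdf(q) for q in (0.25, 0.50, 0.75)]
sum_quartiles = [sum_dist.inv_cdf(q) for q in (0.25, 0.50, 0.75)]

# Part 3: probability that the sample mean falls between 1.75 and 1.85
p_between = mean_dist.cdf(1.85) - mean_dist.cdf(1.75)

# Part 4: two standard deviations (of the sample mean) above the mean
two_sd_above = mu + 2 * mean_dist.stdev

# Part 5: interquartile range of the sum
iqr_sum = sum_quartiles[2] - sum_quartiles[0]

print([round(v, 2) for v in mean_quartiles])
print([round(v, 2) for v in sum_quartiles])
print(round(p_between, 4), round(two_sd_above, 2), round(iqr_sum, 2))
```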

Try It 7.10

Based on data from the National Health Survey, women between the ages of 18 and 24 have an average systolic blood pressure (in mm Hg) of 114.8 with a standard deviation of 13.1. Systolic blood pressure for women between the ages of 18 and 24 follows a normal distribution.

  1. If one woman from this population is randomly selected, find the probability that her systolic blood pressure is greater than 120.
  2. If 40 women from this population are randomly selected, find the probability that their mean systolic blood pressure is greater than 120.
  3. If the sample were four women between the ages of 18 to 24 and we did not know the original distribution, could the central limit theorem be used?

Example 7.11

Problem

A study was done about violence against prostitutes and the symptoms of the posttraumatic stress that they developed. The age range of the prostitutes was 14 to 61. The mean age was 30.9 years with a standard deviation of nine years.

  1. In a sample of 25 prostitutes, what is the probability that the mean age of the prostitutes is less than 35?
  2. Is it likely that the mean age of the sample group could be more than 50 years? Interpret the results.
  3. In a sample of 49 prostitutes, what is the probability that the sum of the ages is no less than 1,600?
  4. Is it likely that the sum of the ages of the 49 prostitutes is at most 1,595? Interpret the results.
  5. Find the 95th percentile for the sample mean age of 65 prostitutes. Interpret the results.
  6. Find the 90th percentile for the sum of the ages of 65 prostitutes. Interpret the results.
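All six parts use the same two sampling distributions with different sample sizes, so a pair of small helpers covers them. This is a sketch, not the textbook's own solution: the helper names are hypothetical, and it assumes the central limit theorem applies at each sample size, since the example does not state the shape of the age distribution.

```python
import math
from statistics import NormalDist

MU, SIGMA = 30.9, 9.0  # population mean and standard deviation of age

def mean_dist(n: int) -> NormalDist:
    """Approximate distribution of the sample mean for sample size n."""
    return NormalDist(MU, SIGMA / math.sqrt(n))

def sum_dist(n: int) -> NormalDist:
    """Approximate distribution of the sample sum for sample size n."""
    return NormalDist(n * MU, math.sqrt(n) * SIGMA)

p_mean_under_35 = mean_dist(25).cdf(35)         # part 1
p_mean_over_50 = 1 - mean_dist(25).cdf(50)      # part 2
p_sum_at_least_1600 = 1 - sum_dist(49).cdf(1600)  # part 3
p_sum_at_most_1595 = sum_dist(49).cdf(1595)     # part 4
pct95_mean = mean_dist(65).inv_cdf(0.95)        # part 5
pct90_sum = sum_dist(65).inv_cdf(0.90)          # part 6

for name, value in [
    ("P(xbar < 35), n=25", p_mean_under_35),
    ("P(xbar > 50), n=25", p_mean_over_50),
    ("P(sum >= 1600), n=49", p_sum_at_least_1600),
    ("P(sum <= 1595), n=49", p_sum_at_most_1595),
    ("95th pct of xbar, n=65", pct95_mean),
    ("90th pct of sum, n=65", pct90_sum),
]:
    print(f"{name}: {value:.4f}")
```

A sample mean above 50 is more than ten standard deviations from 30.9, so the probability in part 2 is effectively zero.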

Try It 7.11

According to Boeing data, the 757 airliner carries 200 passengers and has doors with a height of 72 inches. Assume for a certain population of men we have a mean height of 69.0 inches and a standard deviation of 2.8 inches.

  a. What doorway height would allow 95% of men to enter the aircraft without bending?
  b. Assume that half of the 200 passengers are men. What mean doorway height satisfies the condition that there is a 0.95 probability that this height is greater than the mean height of 100 men?
  c. For engineers designing the 757, which result is more relevant: the height from part a or part b? Why?

HISTORICAL NOTE: Normal Approximation to the Binomial

Historically, being able to compute binomial probabilities was one of the most important applications of the central limit theorem. Binomial probabilities with a small value for n (say, 20) were displayed in a table in a book. To calculate the probabilities with large values of n, you had to use the binomial formula, which could be very complicated. Using the normal approximation to the binomial distribution simplified the process. To compute the normal approximation to the binomial distribution, take a simple random sample from a population. You must meet the conditions for a binomial distribution:

  • there are a certain number n of independent trials
  • the outcomes of any trial are success or failure
  • each trial has the same probability of a success p

Recall that if X is the binomial random variable, then X ~ B(n, p). The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities np and nq must both be greater than five (np > 5 and nq > 5; the approximation is better if they are both greater than or equal to 10). Then the binomial can be approximated by the normal distribution with mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$. Remember that q = 1 – p. In order to get the best approximation, add 0.5 to x or subtract 0.5 from x (use x + 0.5 or x – 0.5). The number 0.5 is called the continuity correction factor and is used in the following example.
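The condition check and the two parameter formulas fit in a few lines. In this minimal sketch the function name is hypothetical, and raising an error when np or nq is not greater than five is a design choice for illustration. The values n = 300 and p = 0.53 are those of Example 7.12.

```python
import math

def normal_approx_to_binomial(n: int, p: float) -> tuple[float, float]:
    """Return (mean, sd) of the approximating normal distribution.

    Raises ValueError unless both np > 5 and nq > 5.
    """
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq must both be greater than 5")
    return n * p, math.sqrt(n * p * q)

mean, sd = normal_approx_to_binomial(300, 0.53)
print(round(mean, 4), round(sd, 4))
```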

Example 7.12

Suppose in a local Kindergarten through 12th grade (K - 12) school district, 53 percent of the population favor a charter school for grades K through 5. A simple random sample of 300 is surveyed.

  a. Find the probability that at least 150 favor a charter school.
  b. Find the probability that at most 160 favor a charter school.
  c. Find the probability that more than 155 favor a charter school.
  d. Find the probability that fewer than 147 favor a charter school.
  e. Find the probability that exactly 175 favor a charter school.

Let X = the number that favor a charter school for grades K through 5. X ~ B(n, p) where n = 300 and p = 0.53. Since np > 5 and nq > 5, use the normal approximation to the binomial. The formulas for the mean and standard deviation are $\mu = np$ and $\sigma = \sqrt{npq}$. The mean is 159 and the standard deviation is 8.6447. The random variable for the normal distribution is Y. Y ~ N(159, 8.6447). See The Normal Distribution for help with calculator instructions.

For part a, you include 150 so P(X ≥ 150) has normal approximation P(Y ≥ 149.5) = 0.8641.

normalcdf(149.5,10^99,159,8.6447) = 0.8641

For part b, you include 160 so P(X ≤ 160) has normal approximation P(Y ≤ 160.5) = 0.5689.

normalcdf(0,160.5,159,8.6447) = 0.5689

For part c, you exclude 155 so P(X > 155) has normal approximation P(Y > 155.5) = 0.6572.

normalcdf(155.5,10^99,159,8.6447) = 0.6572

For part d, you exclude 147 so P(X < 147) has normal approximation P(Y < 146.5) = 0.0741.

normalcdf(0,146.5,159,8.6447) = 0.0741

For part e, P(X = 175) has normal approximation P(174.5 < Y < 175.5) = 0.0083.

normalcdf(174.5,175.5,159,8.6447) = 0.0083
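The calculator steps for parts a through e can be mirrored with Python's standard-library `statistics.NormalDist`. This is a sketch rather than the textbook's method: it integrates from minus infinity instead of the calculator's lower bound of 0, a difference that is negligible here. The dictionary name is illustrative.

```python
import math
from statistics import NormalDist

n, p = 300, 0.53
y = NormalDist(n * p, math.sqrt(n * p * (1 - p)))  # Y ~ N(159, 8.6447)

# Each bound is shifted by 0.5 (the continuity correction factor).
approx = {
    "P(X >= 150)": 1 - y.cdf(149.5),
    "P(X <= 160)": y.cdf(160.5),
    "P(X > 155)": 1 - y.cdf(155.5),
    "P(X < 147)": y.cdf(146.5),
    "P(X = 175)": y.cdf(175.5) - y.cdf(174.5),
}

for label, value in approx.items():
    print(f"{label} is approximately {value:.4f}")
```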

Because of calculators and computer software that let you calculate binomial probabilities for large values of n easily, it is not necessary to use the the normal approximation to the binomial distribution, provided that you have access to these technology tools. Most school labs have Microsoft Excel, an example of computer software that calculates binomial probabilities. Many students have access to the TI-83 or 84 series calculators, and they easily calculate probabilities for the binomial distribution. If you type in "binomial probability distribution calculation" in an Internet browser, you can find at least one online calculator for the binomial.

For Example 7.12, the probabilities are calculated using the following binomial distribution: (n = 300 and p = 0.53). Compare the binomial and normal distribution answers. See Discrete Random Variables for help with calculator instructions for the binomial.

P(X ≥ 150): 1 - binomialcdf(300,0.53,149) = 0.8641

P(X ≤ 160): binomialcdf(300,0.53,160) = 0.5684

P(X > 155): 1 - binomialcdf(300,0.53,155) = 0.6576

P(X < 147): binomialcdf(300,0.53,146) = 0.0742

P(X = 175): (You use the binomial pdf.) binomialpdf(300,0.53,175) = 0.0083
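The exact binomial values can also be computed without a calculator. The sketch below builds the probability function from `math.comb` and sums it for the cumulative probabilities. The helper names `binomial_pmf` and `binomial_cdf` are hypothetical stand-ins for the calculator functions, not library calls.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p = 300, 0.53
exact = {
    "P(X >= 150)": 1 - binomial_cdf(149, n, p),
    "P(X <= 160)": binomial_cdf(160, n, p),
    "P(X > 155)": 1 - binomial_cdf(155, n, p),
    "P(X < 147)": binomial_cdf(146, n, p),
    "P(X = 175)": binomial_pmf(175, n, p),
}

for label, value in exact.items():
    print(f"{label} = {value:.4f}")
```

The exact values should agree with the normal approximations to about three decimal places, which is why the continuity correction matters.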

Try It 7.12

In a city, 46 percent of the population favor the incumbent, Dawn Morgan, for mayor. A simple random sample of 500 is taken. Using the continuity correction factor, find the probability that at least 250 favor Dawn Morgan for mayor.

Citation/Attribution

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
Citation information

© Jun 23, 2022 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.