Contemporary Mathematics

8.6 The Normal Distribution

A combinational graph titled, results of 100 flips. The combinational graph includes a histogram and a normal distribution curve. The horizontal axis representing the number of heads ranges from 30 to 70, in increments of 10. The vertical axis representing frequency ranges from 0 to 800, in increments of 200. The histogram shows bars of varying heights. A normal distribution curve is drawn across the heights of the bars and it infers the following data. The curve begins at (30, 0), has a peak value at (48, 780), and ends at (70, 0).
Figure 8.32 Symmetric, bell-shaped distributions arise naturally in many different situations, including coin flips.

Learning Objectives

After completing this section, you should be able to:

  1. Describe the characteristics of the normal distribution.
  2. Apply the 68-95-99.7 percent groups to normal distribution datasets.
  3. Use the normal distribution to calculate a z-score.
  4. Find and interpret percentiles and quartiles.

Many datasets that result from natural phenomena tend to have histograms that are symmetric and bell-shaped. Imagine finding yourself with a whole lot of time on your hands, and nothing to keep you entertained but a coin, a pencil, and paper. You decide to flip that coin 100 times and record the number of heads. With nothing else to do, you repeat the experiment ten times total. Using a computer to simulate this series of experiments, here’s a sample for the number of heads in each trial:

54, 51, 40, 42, 53, 50, 52, 52, 47, 54

It makes sense that we’d get somewhere around 50 heads when we flip the coin 100 times, and it makes sense that the result won’t always be exactly 50 heads. In our results, we can see numbers that were generally near 50 and not always 50, like we thought.

Moving Toward Normality

Let’s take a look at a histogram for the dataset in our section opener:

A histogram titled, results of 10 coin flips. The horizontal axis representing the number of heads ranges from 40 to 55, in increments of 1. The vertical axis representing frequency ranges from 0 to 2, in increments of 1. The histogram infers the following data. 40 to 41: 1. 42 to 43: 1. 47 to 48: 1. 50 to 51: 1. 51 to 52: 1. 52 to 53: 2. 53 to 54: 1. 54 to 55: 2.
Figure 8.33

This is interesting, but the data seem pretty sparse. There were no trials where you saw between 43 and 47 heads, for example. Those results don’t seem impossible; we just didn’t flip enough times to give them a chance to pop up. So, let’s do it again, but this time we'll perform 100 coin flips 100 times. Rather than review all 100 results, which could be overwhelming, let's instead visualize the resulting histogram.

A histogram titled, results of 100 coin flips. The horizontal axis representing the number of heads ranges from 35 to 71, in increments of 1. The vertical axis representing frequency ranges from 0 to 15, in increments of 5. The histogram infers the following data. 35 to 36: 1. 37 to 38: 1. 39 to 40: 1. 40 to 41: 1. 41 to 42: 1. 42 to 43: 3. 43 to 44: 1. 44 to 45: 6. 45 to 46: 4. 46 to 47: 7. 47 to 48: 10. 48 to 49: 6. 49 to 50: 7. 50 to 51: 13. 51 to 52: 5. 52 to 53: 5. 53 to 54: 7. 54 to 55: 5. 55 to 56: 3. 56 to 57: 4. 57 to 58: 2. 58 to 59: 2. 59 to 60: 2. 61 to 62: 2. 62 to 63: 1. 70 to 71: 1. Note: all values are approximate.
Figure 8.34

From the histogram, we see that most of the trials resulted in between, say, 44 and 56 heads. There were some more unusual results: one trial resulted in 70 heads, which seems really unlikely (though still possible!). But we’re starting to maybe get a sense of the distribution. More data would help, though. Let’s simulate another 900 trials and add them to the histogram!

A histogram titled, results of 1000 coin flips. The horizontal axis representing the number of heads ranges from 35 to 71, in increments of 1. The vertical axis representing frequency ranges from 0 to 100, in increments of 25. The histogram infers the following data. 35 to 36: 2. 37 to 38: 4. 38 to 39: 3. 39 to 40: 9. 40 to 41: 12. 41 to 42: 10. 42 to 43: 24. 43 to 44: 33. 44 to 45: 40. 45 to 46: 55. 46 to 47: 50. 47 to 48: 73. 48 to 49: 71. 49 to 50: 83. 50 to 51: 81. 51 to 52: 85. 52 to 53: 75. 53 to 54: 58. 54 to 55: 52. 55 to 56: 38. 56 to 57: 45. 57 to 58: 27. 58 to 59: 24. 59 to 60: 13. 60 to 61: 13. 61 to 62: 13. 62 to 63: 5. 63 to 64: 3. 64 to 65: 2. 65 to 66: 2. 66 to 67: 2. 68 to 69: 2. 70 to 71: 2. Note: all values are approximate.
Figure 8.35

We can still see that 70 is a really unusual observation, though we came close in another trial (one that had 68 heads). Now, the distribution is coming more into focus: It looks quite symmetric and bell-shaped. Let’s just go ahead and take this thought experiment to an extreme conclusion: 10,000 trials.

A histogram titled, results of 10,000 coin flips. The horizontal axis representing the number of heads ranges from 30 to 70, in increments of 1. The vertical axis representing frequency ranges from 0 to 1000, in increments of 250. The histogram infers the following data. 30 to 31: 5. 33 to 34: 5. 34 to 35: 5. 35 to 36: 10. 36 to 37: 15. 37 to 38: 15. 38 to 39: 35. 39 to 40: 60. 40 to 41: 100. 41 to 42: 160. 42 to 43: 245. 43 to 44: 270. 44 to 45: 320. 45 to 46: 530. 46 to 47: 540. 47 to 48: 620. 48 to 49: 710. 49 to 50: 820. 50 to 51: 800. 51 to 52: 790. 52 to 53: 680. 53 to 54: 660. 54 to 55: 620. 55 to 56: 470. 56 to 57: 440. 57 to 58: 310. 58 to 59: 230. 59 to 60: 200. 60 to 61: 150. 61 to 62: 120. 62 to 63: 100. 63 to 64: 50. 64 to 65: 40. 65 to 66: 10. 66 to 67: 10. 67 to 68: 8. 68 to 69: 8. 69 to 70: 5. 70 to 71: 5. Note: all values are approximate.
Figure 8.36

The distribution is pretty clear now. Distributions that are symmetric and bell-shaped like this pop up in all sorts of natural phenomena, such as the heights of people in a population, the circumferences of eggs of a particular bird species, and the numbers of leaves on mature trees of a particular species. All of these have bell-shaped distributions. Additionally, the results of many types of repeated experiments generally follow this same pattern, as we saw with the coin-flipping example; this fact is the basis for much of the work done by statisticians. It’s a fact that’s important enough to have its own name: the Central Limit Theorem.
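The coin-flipping simulation described above can be reproduced in a few lines. The following is a minimal sketch in Python and is not the procedure used to produce the figures; the function name `count_heads`, the seed value, and the choice of Python's `random` module are assumptions made for illustration.

```python
import random
import statistics

def count_heads(num_flips, rng):
    """Flip a fair coin num_flips times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(num_flips))

# A fixed seed (an arbitrary choice) makes the run repeatable.
rng = random.Random(2024)

# 10,000 trials of 100 flips each, as in the final histogram.
results = [count_heads(100, rng) for _ in range(10_000)]

mean_heads = statistics.mean(results)
sd_heads = statistics.pstdev(results)
print(f"mean = {mean_heads:.2f}, standard deviation = {sd_heads:.2f}")
```

With this many trials, the mean should land close to 50 and the standard deviation close to 5, which is consistent with a bell shape centered at 50.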

People in Mathematics

John Kerrich

Having enough time on your hands to actually perform this coin-flipping experiment may sound far-fetched, but the English mathematician John Kerrich found himself in just such a situation. While he was studying abroad in Denmark in 1940, that country was invaded by the Germans. Kerrich was captured and placed in an internment camp, where he remained for the duration of the war. Kerrich knew that he had all kinds of time on his hands. He also studied statistics, so he knew what should happen theoretically if he flipped a coin many, many times. He also knew of nobody who had ever tested that theory with an actual, large-scale experiment. So, he did it: While he was incarcerated, Kerrich flipped a regular coin 10,000 times and recorded the results. Sure enough, the theory held up!

The Normal Distribution

In the coin flipping example above, the distribution of the number of heads for 10,000 trials was close to perfectly symmetric and bell-shaped:

A combinational graph titled, results of 100 flips. The combinational graph includes a histogram and a normal distribution curve. The horizontal axis representing the number of heads ranges from 30 to 70, in increments of 10. The vertical axis representing frequency ranges from 0 to 800, in increments of 200. The histogram shows bars of varying heights. A normal distribution curve is drawn across the heights of the bars and it infers the following data. The curve begins at (30, 0), has a peak value at (48, 780), and ends at (70, 0).
Figure 8.37

Because distributions with this shape appear so often, we have a special name for them: normal distributions. Normal distributions can be completely described using two numbers we’ve seen before: the mean of the data and the standard deviation of the data. You may remember that we described the mean as a measure of centrality; for a normal distribution, the mean tells us exactly where the center of the distribution falls. The peak of the distribution happens at the mean (and, because the distribution is symmetric, it’s also the median). The standard deviation is a measure of dispersion; for a normal distribution, it tells us how spread out the histogram looks. To illustrate these points, let’s look at some examples.

Example 8.31

Identifying the Mean of a Normal Distribution

This graph shows three normal distributions. What are their means?

A graph shows three normal distribution curves. The horizontal axis ranges from negative 1 to 5, in increments of 1. The three curves are described as follows. The first curve (red) begins at negative 1, has a peak value at 1, and ends at 3. The second curve (blue) begins at 0, has a peak value at 2, and ends at 4. The third curve (yellow) begins at 1, has a peak value at 3, and ends at 5. The three curves overlap each other and their peaks are of equal height.
Figure 8.38

Your Turn 8.31

1.

Identify the means of these three distributions:

A graph shows three normal distribution curves. The horizontal axis ranges from 10 to 15, in increments of 1. The three curves are described as follows. The first curve (red) begins before 10, has a peak value at 11, and ends at 14. The second curve (blue) begins at 11, has a peak value at 13, and ends at 15. The third curve (yellow) begins at 11, has a peak value at 14, and ends after 15. The three curves overlap each other and their peaks are of equal height.

Example 8.32

Identifying the Standard Deviation of a Normal Distribution

This graph shows three distributions, all with mean 2. What are their standard deviations?

A graph shows three normal distribution curves. The horizontal axis ranges from negative 4 to 10, in increments of 2. The three curves are described as follows. The first curve (red) begins before negative 4, has a peak value at 2, and ends at 10. The second curve (blue) begins before negative 4, has a peak value at 2, and ends at 10. The third curve (yellow) begins before negative 4, has a peak value at 2, and ends after 10. The first curve has the highest peak and the third curve has the lowest peak.
Figure 8.40

Your Turn 8.32

1.
Estimate the standard deviation of this normal distribution, centered at 5:
A normal distribution curve. The horizontal axis ranges from negative 10 to 20, in increments of 2. The curve begins before negative 10, has a peak value at 5, and ends after 20.

Let’s put it all together to identify a completely unknown normal distribution.

Example 8.33

Identifying the Mean and Standard Deviation of a Normal Distribution

Using the graph, identify the mean and standard deviation of the normal distribution.

A normal distribution curve. The horizontal axis ranges from 45 to 70, in increments of 1. The curve begins at 45, has a peak value at 55, and ends at 70.
Figure 8.45

Your Turn 8.33

1.
Identify the mean and standard deviation of this distribution. Any estimate of the standard deviation within 5 of the actual value is acceptable.
A normal distribution curve. The horizontal axis ranges from 80 to 200, in increments of 5. The curve begins at 80, has a peak value at 150, and ends at 200.

Properties of Normal Distributions: The 68-95-99.7 Rule

The most important property of a normal distribution is tied to its standard deviation. If a dataset is perfectly normally distributed, then 68% of the data values will fall within one standard deviation of the mean. For example, suppose we have a set of data that follows the normal distribution with mean 400 and standard deviation 100. This means 68% of the data would fall between the values of 300 (one standard deviation below the mean: 400 − 100 = 300) and 500 (one standard deviation above the mean: 400 + 100 = 500). Looking at the histogram below, the shaded area represents 68% of the total area under the graph and above the axis:

A normal distribution curve. The horizontal axis ranges from 0 to 800, in increments of 50. The curve begins at 0, has a peak value at 400, and ends at 800. The region from 300 to 500 is shaded and marked 68 percent. The regions to the left and right of the shaded region are marked 16 percent, each.
Figure 8.47

Since 68% of the area is in the shaded region, this means that 100% − 68% = 32% of the area is found in the unshaded regions. We know that the distribution is symmetric, so that 32% must be divided evenly into the two unshaded tails: 16% in each.

Of course, datasets in the real world are never perfect; when dealing with actual data that seem to follow a symmetric, bell-shaped distribution, we’ll give ourselves a little bit of wiggle room and say that approximately 68% of the data fall within one standard deviation of the mean.

The rule for one standard deviation can be extended to two standard deviations. Approximately 95% of a normally distributed dataset will fall within 2 standard deviations of the mean. If the mean is 400 and the standard deviation is 100, that means 95% of the data will fall between 200 (two standard deviations below the mean: 400 − 2 × 100 = 200) and 600 (two standard deviations above the mean: 400 + 2 × 100 = 600). We can visualize this in the following histogram:

A normal distribution curve. The horizontal axis ranges from 0 to 800, in increments of 50. The curve begins at 0, has a peak value at 400, and ends at 800. The region from 200 to 600 is shaded and marked 95 percent. The regions to the left and right of the shaded region are marked 2.5 percent, each.
Figure 8.48

As before, since 95% of the data are in the shaded area, that leaves 5% of the data to go into the unshaded tails. Since the histogram is symmetric, half of the 5% (that’s 2.5%) is in each.

We can even take this one step further: 99.7% of normally distributed data fall within 3 standard deviations of the mean. In this example, we’d see 99.7% of the data between 100 (calculated as 400 − 3 × 100 = 100) and 700 (calculated as 400 + 3 × 100 = 700). We can see this in the histogram below, although you may need to squint to find the unshaded bits in the tails!

A normal distribution curve. The horizontal axis ranges from 0 to 800, in increments of 50. The curve begins at 0, has a peak value at 400, and ends at 800. The region from 100 to 700 is shaded and marked 99.7 percent. The regions to the left and right of the shaded region are marked 0.15 percent, each.
Figure 8.49

This observation is formally known as the 68-95-99.7 Rule.
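The three intervals in the rule can be computed directly from the mean and standard deviation. The following Python sketch uses the example values from this section (mean 400, standard deviation 100); the function name `empirical_rule_intervals` is hypothetical, and `statistics.NormalDist` from the Python standard library is used only to show the exact shares for a perfectly normal distribution.

```python
from statistics import NormalDist

def empirical_rule_intervals(mean, sd):
    """Return the intervals covering about 68%, 95%, and 99.7% of the data."""
    return {
        "68%": (mean - sd, mean + sd),
        "95%": (mean - 2 * sd, mean + 2 * sd),
        "99.7%": (mean - 3 * sd, mean + 3 * sd),
    }

intervals = empirical_rule_intervals(400, 100)
dist = NormalDist(mu=400, sigma=100)

for label, (low, high) in intervals.items():
    # Exact share of a perfectly normal distribution inside the interval.
    exact = dist.cdf(high) - dist.cdf(low)
    print(f"{label}: {low} to {high} (exact share {exact:.4f})")
```

The exact shares come out near 0.6827, 0.9545, and 0.9973, so the percentages in the rule are rounded values.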

Example 8.34

Using the 68-95-99.7 Rule to Find Percentages

  1. If data are normally distributed with mean 8 and standard deviation 2, what percent of the data falls between 4 and 12?
  2. If data are normally distributed with mean 25 and standard deviation 5, what percent of the data falls between 20 and 30?
  3. If data are normally distributed with mean 200 and standard deviation 15, what percent of the data falls between 155 and 245?

Your Turn 8.34

1.
If data are distributed normally with mean 0 and standard deviation 3, what percent of the data fall between –9 and 9?
2.
If data are distributed normally with mean 50 and standard deviation 10, what percent of the data fall between 30 and 70?
3.
If data are distributed normally with mean 60 and standard deviation 5, what percent of the data fall between 55 and 65?

Example 8.35

Using the 68-95-99.7 Rule to Find Data Values

  1. If data are distributed normally with mean 100 and standard deviation 20, between what two values will 68% of the data fall?
  2. If data are distributed normally with mean 0 and standard deviation 15, between what two values will 95% of the data fall?
  3. If data are distributed normally with mean 14 and standard deviation 2, between what two values will 99.7% of the data fall?

Your Turn 8.35

1.

If data are distributed normally with mean 70 and standard deviation 5, between what two values will 68% of the data fall?

2.

If data are distributed normally with mean 40 and standard deviation 7, between what two values will 95% of the data fall?

3.

If data are distributed normally with mean 200 and standard deviation 30, between what two values will 99.7% of the data fall?

There are more problems we can solve using the 68-95-99.7 Rule, but first we must understand what the rule implies. Remember, the rule says that 68% of the data falls within one standard deviation of the mean. Thus, with normally distributed data with mean 100 and standard deviation 10, we have this distribution:

A normal distribution curve. The horizontal axis ranges from 60 to 140, in increments of 5. The curve begins at 60, has a peak value at 100, and ends at 140. The region from 90 to 110 is shaded in blue and marked 68 percent.
Figure 8.50

Since we know that 68% of the data lie within one standard deviation of the mean, the implication is that 32% of the data must fall beyond one standard deviation away from the mean. Since the histogram is symmetric, we can conclude that half of the 32% (or 16%) is more than one standard deviation above the mean and the other half is more than one standard deviation below the mean:

A normal distribution curve. The horizontal axis ranges from 60 to 140, in increments of 5. The curve begins at 60, has a peak value at 100, and ends at 140. The region from 90 to 110 is shaded in blue and marked 68 percent. The region to the left and right of the shaded region inside the curve is shaded in red. The region from 60 to 90 is marked 16 percent. The region from 110 to 140 is marked 16 percent.
Figure 8.51

Further, we know that the middle 68% can be split in half at the peak of the histogram, leaving 34% on either side:

A normal distribution curve. The horizontal axis ranges from 60 to 140, in increments of 5. The curve begins at 60, has a peak value at 100, and ends at 140. A vertical line is drawn at 100. The region from 90 to 100 and 100 to 110 are shaded in blue and marked 34 percent, each. The region to the left and right of the shaded region inside the curve is shaded in red. The region from 60 to 90 is marked 16 percent. The region from 110 to 140 is marked 16 percent.
Figure 8.52

So, just the “68” part of the 68-95-99.7 Rule gives us four other proportions in addition to the 68% in the rule. Similarly, the “95” and “99.7” parts each give us four more proportions:

A normal distribution curve. The horizontal axis ranges from 60 to 140, in increments of 5. The curve begins at 60, has a peak value at 100, and ends at 140. A vertical line is drawn at 100. The region from 80 to 120 is shaded in blue. The region from 80 to 100 and 100 to 120 are marked 47.5 percent, each. The region to the left and right of the shaded region inside the curve is shaded in red. The region from 60 to 80 is marked 2.5 percent. The region from 120 to 140 is marked 2.5 percent.
Figure 8.53
A normal distribution curve. The horizontal axis ranges from 60 to 140, in increments of 5. The curve begins at 60, has a peak value at 100, and ends at 140. A vertical line is drawn at 100. The region from 70 to 130 is shaded in blue. The region from 70 to 100 and 100 to 130 is marked 49.85 percent, each. The region to the left and right of the shaded region inside the curve is shaded. The region from 60 to 70 is marked 0.15 percent. The region from 130 to 140 is marked 0.15 percent.
Figure 8.54

We can put all these together to find even more complicated proportions. For example, since the proportion between 100 and 120 is 47.5% and the proportion between 100 and 110 is 34%, we can subtract to find that the proportion between 110 and 120 is 47.5% − 34% = 13.5%:

A normal distribution curve. The horizontal axis ranges from 60 to 140, in increments of 5. The curve begins at 60, has a peak value at 100, and ends at 140. A vertical line is drawn at 100. The region from 100 to 110 is shaded in blue. The region from 110 to 120 is shaded in red. The total shaded region from 100 to 120 is marked 47.5 percent. The shaded region from 100 to 110 is marked 34 percent. The shaded region from 110 to 120 is marked 13.5 percent.
Figure 8.55
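The subtraction above generalizes to any range whose endpoints sit a whole number of standard deviations from the mean. The following Python sketch is one way to organize that bookkeeping; the names `HALF_SHARE` and `rule_proportion` are hypothetical, and the sketch assumes both endpoints are within three standard deviations of the mean.

```python
# Percent of data between the mean and k standard deviations on one side,
# taken from the 68-95-99.7 Rule (half of each percentage).
HALF_SHARE = {0: 0.0, 1: 34.0, 2: 47.5, 3: 49.85}

def rule_proportion(low, high, mean, sd):
    """Percent of data between low and high.

    Assumes low and high are whole-number multiples of sd away from the
    mean, and no more than three standard deviations from it.
    """
    def signed_share(value):
        k = round((value - mean) / sd)   # whole number of standard deviations
        share = HALF_SHARE[abs(k)]
        return share if k >= 0 else -share

    return signed_share(high) - signed_share(low)

print(rule_proportion(110, 120, 100, 10))  # the 13.5% region from the text
print(rule_proportion(90, 110, 100, 10))   # the middle 68%
```

Values to the left of the mean get a negative share, so a range that straddles the mean adds the two sides while a range on one side subtracts them.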

Example 8.36

Finding Other Proportions Using the 68-95-99.7 Rule

Assume that we have data that are normally distributed with mean 80 and standard deviation 3.

  1. What proportion of the data will be greater than 86?
  2. What proportion of the data will be between 74 and 77?
  3. What proportion of the data will be between 74 and 83?

Your Turn 8.36

Suppose we have data that are normally distributed with mean 500 and standard deviation 100. What proportions of the data fall in these ranges?

1.

300 to 500

2.

600 to 800

3.

400 to 700

Standardized Scores

When we want to apply the 68-95-99.7 Rule, we must first figure out how many standard deviations above or below the mean our data fall. This calculation is common enough that it has its own name: the standardized score. Values above the mean have positive standardized scores, while those below the mean have negative standardized scores. Since it's common to use the letter z to represent a standard score, this value is also often referred to as a z-score.

So far, we’ve only really considered z-scores that are whole numbers, but in general they can be any number at all. For example, if we have data that are normally distributed with mean 80 and standard deviation 6, the value 85 is five units above the mean, which is less than one standard deviation. Dividing by the standard deviation, we get 5/6. Since 85 is 5/6 of one standard deviation above the mean, we’d say that the standardized score for 85 is z = 5/6 (which is positive, since 85 > 80). This calculation describes the way we compute standardized scores.

FORMULA

If x is a member of a normally distributed dataset with mean μ and standard deviation σ, then the standardized score for x is

z = (x − μ) / σ.

If you know a z-score but not the original data value x, you can find it by solving the previous equation for x:

x = μ + z × σ.

Checkpoint

The symbols μ and σ are the Greek letters mu and sigma. They are the analogues of the English letters m and s, which stand for mean and standard deviation.

If you convert every data value in a dataset into its z-score, the resulting set of data will have mean 0 and standard deviation 1. This is why we call these standardized scores: the normal distribution with mean 0 and standard deviation 1 is often called the standard normal distribution.
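The two formulas, and the claim that standardized data have mean 0 and standard deviation 1, can be checked with a short Python sketch. The function names `z_score` and `from_z_score` are hypothetical, and the small dataset is made up purely for illustration.

```python
import statistics

def z_score(x, mean, sd):
    """Standardized score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

def from_z_score(z, mean, sd):
    """Recover the original data value from a standardized score."""
    return mean + z * sd

# The worked case from the text: mean 80, standard deviation 6, value 85.
z = z_score(85, 80, 6)
print(z)                       # 5/6 written as a decimal
print(from_z_score(z, 80, 6))  # converts back to (approximately) 85

# Standardizing a whole dataset (made-up values for illustration).
data = [52, 47, 61, 55, 40, 58]
mu = statistics.mean(data)
sigma = statistics.pstdev(data)
standardized = [z_score(x, mu, sigma) for x in data]

# Up to floating-point rounding, these are 0 and 1.
print(statistics.mean(standardized), statistics.pstdev(standardized))
```

The sketch uses the population standard deviation (`pstdev`); standardizing with that value is what makes the result have standard deviation exactly 1, apart from rounding.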

Example 8.37

Standardizing Data

Suppose we have data that are normally distributed with mean 50 and standard deviation 6. Compute the standardized scores (rounded to three decimal places) for these data values:

  1. 52
  2. 40
  3. 68

Your Turn 8.37

1.
Suppose we have data that are normally distributed with mean 75 and standard deviation 5. Compute standardized scores for each of these data values: 66, 83, and 72.

Example 8.38

Converting Standardized Scores to Original Values

Suppose we have data that are normally distributed with mean 10 and standard deviation 2. Convert the following standardized scores into data values.

  1. 1.4
  2. −0.9
  3. 3.5

Your Turn 8.38

1.
Suppose you have a normally distributed dataset with mean 2 and standard deviation 20. Convert these standardized scores to data values: –2.3, 1.4, and 0.2.

Using Google Sheets to Find Normal Percentiles

The 68-95-99.7 Rule is great when we’re dealing with whole-number z-scores. However, if the z-score is not a whole number, the Rule isn’t going to help us. Luckily, we can use technology to help us out. We’ll talk here about the built-in functions in Google Sheets, but other tools work similarly.

Let’s say we’re working with normally distributed data with mean 40 and standard deviation 7, and we want to know at what percentile a data value of 50 would fall. That corresponds to finding the proportion of the data that are less than 50. If we create our histogram and mark off whole-number multiples of the standard deviation like we did before, we’ll see why the 68-95-99.7 Rule isn’t going to help:

A normal distribution curve. The horizontal axis ranges from 20 to 60, in increments of 2. The curve begins before 20, has a peak value at 40, and ends after 60. Five vertical lines are drawn at 26, 33, 40, 47, and 54. The region to the left of 50 is shaded in blue.
Figure 8.62

Since 50 doesn’t line up with one of our lines, the 68-95-99.7 Rule fails us. Looking back at Figure 8.47 and Figure 8.48, the best we can say is that 50 is between the 84th and 97.5th percentiles (it lies between 47, one standard deviation above the mean, and 54, two standard deviations above), but that’s a pretty wide range. Google Sheets has a function that can help; it’s called NORM.DIST. Here’s how to use it:

  1. Click in an empty cell in your worksheet.
  2. Type “=NORM.DIST(“
  3. Inside the parentheses, we must enter a list of four things, separated by commas: the data value, the mean, the standard deviation, and the word “TRUE”. These have to be entered in this order!
  4. Close the parentheses, and hit Enter. The result is then displayed in the cell; convert it to a percent to get the percentile.

So, for our example, we should type “=NORM.DIST(50, 40, 7, TRUE)” into an empty cell, and hit Enter. The result is 0.9234362745; converting to a percent and rounding, we can conclude that 50 is at the 92nd percentile. Let’s walk through a few more examples.
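
For readers working outside a spreadsheet, the same calculation can be sketched in Python. This assumes the standard library's statistics.NormalDist class (Python 3.8 or later), whose cdf method plays the role of NORM.DIST with TRUE as the last argument; the helper name proportion_below is made up for illustration.

```python
from statistics import NormalDist


def proportion_below(x, mean, std_dev):
    """Proportion of normally distributed data that falls below x."""
    # cdf is the cumulative distribution function, the counterpart of
    # NORM.DIST(x, mean, std_dev, TRUE) in Google Sheets.
    return NormalDist(mu=mean, sigma=std_dev).cdf(x)


# The running example: mean 40, standard deviation 7, data value 50.
p = proportion_below(50, 40, 7)
print(f"{p:.4f} -> percentile {round(p * 100)}")
```

As in the spreadsheet, the result is a proportion between 0 and 1, so it still has to be multiplied by 100 to read it as a percentile.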

Example 8.39

Using Google Sheets to Find Percentiles

Suppose we have data that are normally distributed with mean 28 and standard deviation 4. At what percentile do each of the following data values fall?

  1. 30
  2. 23
  3. 35

Your Turn 8.39

1.
Suppose you have data that are normally distributed with mean 20 and standard deviation 6. Determine at what percentiles these data values fall: 25, 12, and 31.

Google Sheets can also help us go the other direction: If we want to find the data value that corresponds to a given percentile, we can use the NORM.INV function. For example, if we have normally distributed data with mean 150 and standard deviation 25, we can find the data value at the 30th percentile as follows:

  1. Click on an empty cell in your worksheet.
  2. Type “=NORM.INV(”
  3. Inside the parentheses, we’ll enter a list of three numbers, separated by commas: the percentile in question expressed as a decimal, the mean, and the standard deviation. These must be entered in this order!
  4. Close the parentheses and hit Enter. The desired data value will be in the cell!

In our example, we want the 30th percentile; converting 30% to a decimal gives us 0.3. So, we’ll type “=NORM.INV(0.3, 150, 25)” to get 136.8899872; let’s round that off to 137.
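
The reverse lookup can be sketched in Python as well. This assumes the standard library's statistics.NormalDist class (Python 3.8 or later), whose inv_cdf method corresponds to NORM.INV; the helper name value_at_percentile is hypothetical.

```python
from statistics import NormalDist


def value_at_percentile(proportion, mean, std_dev):
    """Data value below which the given proportion of the data falls."""
    # inv_cdf undoes cdf; the proportion must be strictly between 0 and 1,
    # so the 30th percentile is passed as 0.30, not 30.
    return NormalDist(mu=mean, sigma=std_dev).inv_cdf(proportion)


# The running example: 30th percentile, mean 150, standard deviation 25.
x = value_at_percentile(0.30, 150, 25)
print(f"{x:.1f}")
```

Passing the percentile as a whole number instead of a decimal raises an error here, which mirrors the need to convert 30% to 0.3 before using NORM.INV.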

Example 8.40

Using Google Sheets to Find the Data Value Corresponding to a Percentile

Suppose we have data that are normally distributed with mean 47 and standard deviation 9. Find the data values (rounded to the nearest tenth) corresponding to these percentiles:

  1. 75th (that’s the third quartile)
  2. 12th
  3. 90th

Your Turn 8.40

1.
Suppose we have data that are normally distributed with mean 5 and standard deviation 1.6. Identify which data values (rounded to the nearest tenth) correspond to these percentiles: 25th (the first quartile), 80th, and 10th.

Check Your Understanding

For each of these problems, assume we’re working with normally distributed data with mean 100 and standard deviation 12.
42.
What percentage of the data falls between 76 and 124? Use the 68-95-99.7 Rule.
43.
What percentage of the data falls between 100 and 112? Use the 68-95-99.7 Rule.
44.
At what percentile does 112 fall? Use the 68-95-99.7 Rule.
45.
What’s the z-score of the data value 107? Round to three decimal places.
46.
What data value’s z-score is –2.4?
47.
At what percentile does 107 fall? Use Google Sheets (or another technology).
48.
What data value is at the 90th percentile? Use Google Sheets (or another technology), and round to the nearest hundredth.
For the following exercises, explain how you can tell the histogram does NOT represent normally distributed data.
1 .

A bimodal distribution curve. The horizontal axis ranges from 5 to 20, in increments of 1. The curve begins at 2, rises up and to the right, reaches a peak value at 10, goes down and to the right, reaches a low point at 13, goes up and to the right, reaches a peak point at 16, goes down and to the right, and ends after 24.
2 .

A positively skewed distribution curve. The horizontal axis ranges from 0 to 15, in increments of 1. The curve begins at 0, rises steeply, reaches a peak point at 2, goes down and to the right, declines rapidly, and ends at 16.
For the following exercises, use the 68-95-99.7 Rule to answer the given questions about normally distributed data with mean 100 and standard deviation 5.
3 .
What proportion of the data fall between 95 and 105?
4 .
What proportion of the data fall between 90 and 110?
5 .
What proportion of the data fall between 85 and 100?
6 .
What proportion of the data fall between 110 and 120?
7 .
What proportion of the data are less than 90?
8 .
What proportion of the data are greater than 105?
9 .
What proportion of the data fall between 90 and 105?
10 .
What proportion of the data are between 95 and 115?
For the following exercises, use the 68-95-99.7 Rule to answer the given questions about normally distributed data with mean 9 and standard deviation 1.
11 .
What proportion of the data are less than 7?
12 .
What proportion of the data are greater than 12?
13 .
What proportion of the data are between 6 and 12?
14 .
What proportion of the data are between 8 and 9?
15 .
What proportion of the data are between 6 and 8?
16 .
What proportion of the data are between 8 and 11?
17 .
What proportion of the data are less than 10?
18 .
What proportion of the data are greater than 6?
In the following exercises, convert the given data values to standardized scores. Assume the data are distributed normally with mean 15 and standard deviation 3. Round to the nearest hundredth.
19 .
x = 17
20 .
x = 11
21 .
x = 10
22 .
x = 21
23 .
x = 16
24 .
x = 24
25 .
x = 8
26 .
x = 7.2
In the following exercises, convert the given z-scores to data values. Assume the data are distributed normally with mean 15 and standard deviation 3.
27 .
z = 1.2
28 .
z = 0.4
29 .
z = 3.6
30 .
z = 2.1
31 .
z = 2.8
32 .
z = 4
33 .
z = 0.4
34 .
z = 3.4
For the following exercises, answer the questions about normally distributed data with mean 200 and standard deviation 20. Round percentiles to the nearest whole number and round data values to the nearest tenth.
35 .
At what percentile is x = 225?
36 .
At what percentile is x = 184?
37 .
At what percentile is x = 192?
38 .
At what percentile is x = 206?
39 .
At what percentile is x = 239?
40 .
At what percentile is x = 202?
41 .
At what percentile is x = 190?
42 .
At what percentile is x = 175?
43 .
What data value is at the 40th percentile?
44 .
What data value is at the 10th percentile?
45 .
What data value is at the 55th percentile?
46 .
What data value is at the 95th percentile?
47 .
What data value is at the 33rd percentile?
48 .
What data value is at the third quartile?
49 .
What data value is at the 65th percentile?
50 .
What data value is at the 99th percentile?
Citation/Attribution

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/contemporary-mathematics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/contemporary-mathematics/pages/1-introduction
Citation information

© Dec 21, 2023 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.