Statistics

4.4 Geometric Distribution (Optional)


There are three main characteristics of a geometric experiment:

  1. Repeating independent Bernoulli trials until a success is obtained. Recall that a Bernoulli trial is a binomial experiment with number of trials n = 1. In other words, you keep repeating what you are doing until the first success. Then you stop. For example, you throw a dart at a bull's-eye until you hit the bull's-eye. The first time you hit the bull's-eye is a success so you stop throwing the dart. It might take six tries until you hit the bull's-eye. You can think of the trials as failure, failure, failure, failure, failure, success, stop.
  2. In theory, the number of trials could go on forever. There must be at least one trial.
  3. The probability, p, of a success and the probability, q, of a failure do not change from trial to trial. p + q = 1 and q = 1 − p. For example, the probability of rolling a three when you throw one fair die is 1/6. This is true no matter how many times you roll the die. Suppose you want to know the probability of getting the first three on the fifth roll. On rolls one through four, you do not get a face with a three. The probability for each of these rolls is q = 5/6, the probability of a failure. The probability of getting the first three on the fifth roll is (5/6)(5/6)(5/6)(5/6)(1/6) = .0804.
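The fifth-roll calculation in the die example can be checked with a few lines of Python. This is a minimal sketch: the helper name `first_success_on` is hypothetical (it does not come from the text), and exact fractions are used only to postpone rounding until the last step.

```python
from fractions import Fraction


def first_success_on(k, p):
    """Probability that the first success occurs on trial k.

    The first k - 1 trials are failures (each with probability q = 1 - p)
    and trial k is a success (probability p).
    """
    if k < 1:
        raise ValueError("there must be at least one trial")
    q = 1 - p
    return q ** (k - 1) * p


# Fair die: a success is rolling a three, so p = 1/6 and q = 5/6.
prob = first_success_on(5, Fraction(1, 6))
print(prob, round(float(prob), 4))  # 625/7776 0.0804
```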

X = the number of independent trials until the first success.

p = the probability of a success, q = 1 – p = the probability of a failure.

There are shortcut formulas for calculating the mean μ, variance σ², and standard deviation σ of a geometric probability distribution. The formulas are given below; their derivation is not covered in this book.

$$\mu = \frac{1}{p}, \qquad \sigma^2 = \left(\frac{1}{p}\right)\left(\frac{1}{p} - 1\right), \qquad \sigma = \sqrt{\left(\frac{1}{p}\right)\left(\frac{1}{p} - 1\right)}$$
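The shortcut formulas translate directly into code. The sketch below is illustrative only: the function name `geometric_summary` is an assumption, and the input p = 1/6 is borrowed from the die example (success = rolling a three).

```python
import math


def geometric_summary(p):
    """Return (mean, variance, standard deviation) of a geometric
    distribution with success probability p, using the shortcut formulas
    mu = 1/p and sigma^2 = (1/p)(1/p - 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    mean = 1 / p
    variance = (1 / p) * (1 / p - 1)
    return mean, variance, math.sqrt(variance)


# Fair die, success = rolling a three (p = 1/6).
mean, variance, sd = geometric_summary(1 / 6)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```

For the die, the mean is about 6 rolls and the variance is about 30, so the standard deviation is roughly 5.5 rolls.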

Example 4.16

Suppose a game has two outcomes, win or lose. You repeatedly play that game until you lose. The probability of losing is p = 0.57.

If we let X = the number of games you play until you lose (includes the losing game), then X is a geometric random variable. All three characteristics are met. Each game you play is a Bernoulli trial, either win or lose. You would need to play at least one game before you stop. X takes on the values 1, 2, 3, . . . (could go on indefinitely). Since we are measuring the number of games you play until you lose, we define a success as losing a game and a failure as winning a game. The probability of a success is p = .57, and the probability of a failure is q = 1 – p = 1 – .57 = .43. Both p and q remain the same from game to game.

If we want to find the probability that it takes five games until you lose, then the probability could be written as P(x = 5). We will explain how to find a geometric probability later in this section.

Try It 4.16

You throw darts at a board until you hit the center area. Your probability of hitting the center area is p = 0.17. You want to find the probability that it takes eight throws until you hit the center. What values does X take on?

Example 4.17

A safety engineer feels that 35 percent of all industrial accidents in her plant are caused by failure of employees to follow instructions. She decides to look at the accident reports (selected randomly and replaced in the pile after reading) until she finds one that shows an accident caused by failure of employees to follow instructions.

If we let X = the number of accidents the safety engineer must examine until she finds a report showing an accident caused by employee failure to follow instructions, then X is a geometric random variable. All three characteristics are met. Each accident report she reads is a Bernoulli trial: the accident was either caused by failure of employees to follow instructions or not. She would need to read at least one accident report before she stops. X takes on the values 1, 2, 3, . . . (could go on indefinitely). Since we are measuring the number of reports she needs to read until one that shows an accident caused by failure of employees to follow instructions, we define a success as an accident caused by failure of employees to follow instructions. If an accident was caused by another reason, the report is defined as a failure. The probability of a success is p = .35, and the probability of a failure is q = 1 – p = 1 – .35 = .65. Both p and q remain the same from report to report.

If we want to find the probability that the safety engineer will have to examine at least three reports until she finds a report showing an accident caused by employee failure to follow instructions, then the probability could be written as P(x ≥ 3). If we want to find how many reports, on average, the safety engineer would expect to look at until she finds a report showing an accident caused by employee failure to follow instructions, we need to find the expected value E(x). We will explain how to solve these questions later in this section.
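As a preview, here is one way to compute both quantities. It is a sketch, not necessarily the method the text develops later. It rests on the observation that needing at least three reports is the same event as the first two reports both being failures, so P(x ≥ 3) = q². The helper names `prob_at_least` and `expected_trials` are hypothetical.

```python
def prob_at_least(k, p):
    """P(X >= k): the first k - 1 trials must all be failures."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return (1 - p) ** (k - 1)


def expected_trials(p):
    """Expected number of trials until the first success, E(X) = 1/p."""
    return 1 / p


p = 0.35  # probability a report shows failure to follow instructions
print(round(prob_at_least(3, p), 4))  # 0.4225
print(round(expected_trials(p), 2))   # 2.86
```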

Try It 4.17

An instructor feels that 15 percent of students get below a C on their final exam. She decides to look at final exams (selected randomly and replaced in the pile after reading) until she finds one that shows a grade below a C. We want to know the probability that the instructor will have to examine at least 10 exams until she finds one with a grade below a C. What is the probability question stated mathematically?

Example 4.18

Suppose that you are looking for a student at your college who lives within five miles of you. You know that 55 percent of the 25,000 students do live within five miles of you. You randomly contact students from the college until one says he or she lives within five miles of you. What is the probability that you need to contact four people?

This is a geometric problem because you may have a number of failures before you have the one success you desire. Also, the probability of a success stays the same each time you ask a student if he or she lives within five miles of you. There is no definite number of trials (number of times you ask a student).

Problem

a. Let X = the number of ________ you must ask ________ one says yes.

Problem

b. What values does X take on?

Problem

c. What are p and q?

Problem

d. The probability question is P(_______).

Try It 4.18

You need to find a store that carries a special printer ink. You know that of the stores that carry printer ink, 10 percent of them carry the special ink. You randomly call each store until one has the ink you need. What are p and q?

Notation for the Geometric: G = Geometric Probability Distribution Function

X ~ G(p)

Read this as X is a random variable with a geometric distribution. The parameter is p; p = the probability of a success for each trial.

Example 4.19

Assume that the probability of a defective computer component is 0.02. Components are randomly selected. Find the probability that the first defect is found on the seventh component tested. How many components do you expect to test until one is found to be defective?

Let X = the number of computer components tested until the first defect is found.

X takes on the values 1, 2, 3, . . . where p = .02. X ~ G(.02)

Find P(x = 7). There is a formula that defines the geometric probability P(x), and we can use it to find P(x = 7). Because the calculation is tedious and time-consuming, however, people usually use a graphing calculator or software to get the answer. Using a graphing calculator, you can get P(x = 7) = .0177. Instructions for the TI-83, 83+, 84, and 84+ are given below.
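For reference, the standard geometric probability formula, which this section mentions but does not write out, multiplies the probabilities of x − 1 failures and one success. Applied to this example with p = .02 and q = .98, it reproduces the calculator result:

```latex
P(X = x) = q^{\,x-1}\,p, \qquad x = 1, 2, 3, \ldots

P(X = 7) = (0.98)^{6}(0.02) \approx 0.0177
```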

Using the TI-83, 83+, 84, 84+ Calculator

Go into 2nd DISTR. The syntax for the instructions is as follows:

To calculate the probability of a value P(x = value), use geometpdf(p, number). Here geometpdf represents geometric probability density function. It is used to find the probability that a geometric random variable is equal to an exact value. p is the probability of a success and number is the value.

To calculate the cumulative probability P(x ≤ value), use geometcdf(p, number). Here geometcdf represents geometric cumulative distribution function. It is used to determine the probability of “at most” type of problem, the probability that a geometric random variable is less than or equal to a value. p is the probability of a success and number is the value.

To find P(x = 7), enter 2nd DISTR, arrow down to geometpdf(. Press ENTER. Enter .02,7). The result is P(x = 7) = .0177.

If we need to find P(x ≤ 7), enter 2nd DISTR, arrow down to geometcdf(. Press ENTER. Enter .02,7). The result is P(x ≤ 7) = .1319.
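For readers working without a TI calculator, the two functions can be mimicked in plain Python. This is a sketch under stated assumptions: the names `geometpdf` and `geometcdf` are chosen here only to mirror the calculator's argument order (p, number); they are hypothetical helpers, not part of any Python library.

```python
def geometpdf(p, number):
    """P(X = number): number - 1 failures followed by one success."""
    return (1 - p) ** (number - 1) * p


def geometcdf(p, number):
    """P(X <= number): one minus the probability that the first
    `number` trials are all failures."""
    return 1 - (1 - p) ** number


print(round(geometpdf(0.02, 7), 4))  # 0.0177
print(round(geometcdf(0.02, 7), 4))  # 0.1319
```

If SciPy is available, `scipy.stats.geom.pmf(7, 0.02)` and `scipy.stats.geom.cdf(7, 0.02)` should give the same two quantities; note that the argument order there is (value, p).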

The graph of X ~ G(.02) is

This graph shows a geometric probability distribution. It consists of bars that peak at the left and slope downwards with each successive bar to the right. The values on the x-axis count the number of computer components tested until the defect is found. The y-axis is scaled from 0 to 0.02 in increments of 0.005.
Figure 4.2

The previous probability distribution histogram gives all the probabilities of X. The position of each bar on the x-axis is a value of X = the number of computer components tested until the first defect is found, and the height of that bar is the probability of that value occurring. For example, the x value of the first bar is 1 and the height of the first bar is 0.02. That means the probability that the first computer component tested is defective is .02.
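The shape of the histogram can be reproduced numerically. The short sketch below (variable names are illustrative) lists the heights of the first five bars for p = .02. Each bar is q = .98 times the height of the one before it, which is why the bars peak at the left and slope downward.

```python
p = 0.02
q = 1 - p

# Height of the bar at x is P(X = x) = q**(x - 1) * p.
heights = [q ** (x - 1) * p for x in range(1, 6)]

for x, height in enumerate(heights, start=1):
    print(x, round(height, 4))
```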

The expected value or mean of X is $E(X) = \mu = \frac{1}{p} = \frac{1}{.02} = 50$.

The variance of X is $\sigma^2 = \left(\frac{1}{p}\right)\left(\frac{1}{p} - 1\right) = \left(\frac{1}{.02}\right)\left(\frac{1}{.02} - 1\right) = (50)(49) = 2{,}450$.

The standard deviation of X is $\sigma = \sqrt{\sigma^2} = \sqrt{2{,}450} = 49.5$.

Here is how we interpret the mean and standard deviation. The number of components that you would expect to test until you find the first defective one is 50 (the mean), and you expect that number to vary by about 50 components (the standard deviation) on average.
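A quick simulation can make this interpretation concrete. The sketch below is only an illustration: the seed, the number of repetitions (20,000), and the helper name `trials_until_first_success` are arbitrary choices, not from the text. It repeats the testing experiment many times and summarizes how many components were tested each time.

```python
import random
import statistics


def trials_until_first_success(p, rng):
    """Run independent Bernoulli trials until the first success and
    return how many trials were needed (always at least one)."""
    count = 1
    while rng.random() >= p:  # a failure occurs with probability 1 - p
        count += 1
    return count


rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [trials_until_first_success(0.02, rng) for _ in range(20_000)]

sim_mean = statistics.mean(samples)
sim_sd = statistics.pstdev(samples)
print(round(sim_mean, 1), round(sim_sd, 1))
```

The simulated mean and standard deviation should land near the theoretical values of 50 and 49.5, with some run-to-run variation if the seed is changed.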

Try It 4.19

The probability of a defective steel rod is .01. Steel rods are selected at random. Find the probability that the first defect occurs on the ninth steel rod. Use the TI-83+ or TI-84 calculator to find the answer.

Example 4.20

Problem

The lifetime risk of developing pancreatic cancer is about one in 78 (1.28 percent). Let X = the number of people you ask until one says he or she has pancreatic cancer. Then X is a discrete random variable with a geometric distribution: X ~ G(1/78) or X ~ G(.0128).

  1. What is the probability that you ask 10 people before one says he or she has pancreatic cancer?
  2. What is the probability that you must ask 20 people?
  3. Find the (i) mean and (ii) standard deviation of X.

Try It 4.20

The literacy rate for a nation measures the proportion of people age 15 and over who can read and write. The literacy rate for women in Afghanistan is 12 percent. Let X = the number of Afghan women you ask until one says that she is literate.

  1. What is the probability distribution of X?
  2. What is the probability that you ask five women before one says she is literate?
  3. What is the probability that you must ask 10 women?
  4. Find the (i) mean and (ii) standard deviation of X.

Citation/Attribution

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/statistics/pages/1-introduction
Citation information

© Jan 18, 2023 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.