1.

two proportions

3.

matched or paired samples

5.

single mean

7.

independent group means, population standard deviations and/or variances unknown

9.

two proportions

11.

independent group means, population standard deviations and/or variances unknown

13.

independent group means, population standard deviations and/or variances unknown

15.

two proportions

17.

The random variable is the difference between the mean amounts of sugar in the two soft drinks.

19.

means

21.

two-tailed

23.

the difference between the mean life spans of whites and nonwhites

25.

This is a comparison of two population means with unknown population standard deviations.

27.

Check student’s solution.

29.
  1. Reject the null hypothesis
  2. p-value < 0.05
  3. There is enough evidence at the 5% level of significance to support the claim that life expectancy in the 1900s is different between whites and nonwhites.
31.

The difference in mean speeds of the fastball pitches of the two pitchers

33.

–2.46

35.

At the 1% significance level, we can reject the null hypothesis. There is sufficient data to conclude that the mean speed of Rodriguez’s fastball is faster than Wesley’s.

37.

Subscripts: 1 = Food, 2 = No Food
H0: μ1 ≤ μ2
Ha: μ1 > μ2

39.
This is a normal distribution curve with mean equal to zero. The values 0 and 0.1 are labeled on the horizontal axis. A vertical line extends from 0.1 to the curve. The region under the curve to the right of the line is shaded to represent p-value = 0.0198.
Figure 10.18
41.

Subscripts: 1 = Gamma, 2 = Zeta
H0: μ1 = μ2
Ha: μ1 ≠ μ2

43.

0.0062

45.

There is sufficient evidence to reject the null hypothesis. The data support that the melting point for Alloy Zeta is different from the melting point of Alloy Gamma.

47.

POS1 – POS2 = difference in the proportions of phones that had system failures within the first eight hours of operation with OS1 and OS2.

49.

0.1018

51.

proportions

53.

right-tailed

55.

The random variable is the difference in proportions (percents) of the populations that are of two or more races in Nevada and North Dakota.

57.

Our sample sizes are much greater than five each, so we use the normal distribution for two proportions for this hypothesis test.

59.

Check student’s solution.

61.
  1. Reject the null hypothesis.
  2. p-value < alpha
  3. At the 5% significance level, there is sufficient evidence to conclude that the proportion (percent) of the population that is of two or more races in Nevada is statistically higher than that in North Dakota.
63.

the mean difference of the system failures

65.

0.0067

67.

With a p-value of 0.0067, we can reject the null hypothesis. There is enough evidence to support that the software patch is effective in reducing the number of system failures.

69.

0.0021

71.
This is a normal distribution curve with mean equal to zero. The values 0 and 1.67 are labeled on the horizontal axis. A vertical line extends from 1.67 to the curve. The region under the curve to the right of the line is shaded to represent p-value = 0.0021.
Figure 10.19
73.

H0: μd ≥ 0

Ha: μd < 0

75.

0.0699

77.

We decline to reject the null hypothesis. There is not sufficient evidence to support that the medication is effective.

79.

Subscripts: 1: two-year colleges; 2: four-year colleges

  1. H0: μ1 ≥ μ2
  2. Ha: μ1 < μ2
  3. X̄1 – X̄2 is the difference between the mean enrollments of the two-year colleges and the four-year colleges.
  4. Student’s-t
  5. test statistic: -0.2480
  6. p-value: 0.4019
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Do not reject
    3. Reason for Decision: p-value > alpha
    4. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean enrollment at four-year colleges is higher than at two-year colleges.
81.

Subscripts: 1: mechanical engineering; 2: electrical engineering

  1. H0: µ1 ≥ µ2
  2. Ha: µ1 < µ2
  3. X̄1 – X̄2 is the difference between the mean entry level salaries of mechanical engineers and electrical engineers.
  4. Student’s t with 108 degrees of freedom (t108)
  5. test statistic: t = –0.82
  6. p-value: 0.2061
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Do not reject the null hypothesis.
    3. Reason for Decision: p-value > alpha
    4. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean entry-level salary of mechanical engineers is lower than that of electrical engineers.
83.
  1. H0: µ1 = µ2
  2. Ha: µ1 ≠ µ2
  3. X̄1 – X̄2 is the difference between the mean times for completing a lap in races and in practices.
  4. Student’s t with 20.32 degrees of freedom (t20.32)
  5. test statistic: –4.70
  6. p-value: 0.0001
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for Decision: p-value < alpha
    4. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean time for completing a lap in races is different from that in practices.
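
The non-integer degrees of freedom in this answer (20.32) are what one would expect from the approximation used when two independent samples have unknown and possibly unequal population standard deviations. The following is a minimal Python sketch of that calculation. The function name `welch_df` and the summary statistics are hypothetical, chosen only for illustration; they are not the data from the exercise.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for a two-sample t-test
    when the population standard deviations are unknown and
    not assumed equal."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical sample standard deviations and sizes (not from the exercise).
df = welch_df(s1=2.0, n1=10, s2=3.0, n2=12)
print(round(df, 2))
```

The result is generally not a whole number, which is why a distribution such as t20.32 can appear in an answer.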
85.
  1. H0: µ1 = µ2
  2. Ha: µ1 ≠ µ2
  3. X̄1 – X̄2 is the difference between the mean times for completing a lap in races and in practices.
  4. Student’s t with 40.94 degrees of freedom (t40.94)
  5. test statistic: –5.08
  6. p-value: zero
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for Decision: p-value < alpha
    4. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean time for completing a lap in races is different from that in practices.
88.

c

90.

Test: two independent sample means, population standard deviations unknown.

Random variable: X̄1 – X̄2

Distribution:

H0: μ1 = μ2
Ha: μ1 < μ2

The mean age of entering prostitution in Canada is lower than the mean age in the United States.

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0157.
Figure 10.20

Graph: left-tailed

p-value : 0.0151

Decision: Do not reject H0.

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean age of entering prostitution in Canada is lower than the mean age in the United States.

92.

d

94.

Subscripts: 1 = boys, 2 = girls

  1. H0: µ1 ≤ µ2
  2. Ha: µ1 > µ2
  3. The random variable is the difference in the mean auto insurance costs for boys and girls.
  4. normal
  5. test statistic: z = 2.50
  6. p-value: 0.0062
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for Decision: p-value < alpha
    4. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean cost of auto insurance for teenage boys is greater than that for girls.
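
As a check on the step from test statistic to p-value, the sketch below converts a z statistic into a tail area of the standard normal distribution using Python's standard library. The helper name `z_p_value` and its `tail` argument are assumptions made for this illustration. It is applied to the values in this answer (z = 2.50, right-tailed).

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """Return the p-value for a z statistic.

    tail is "left", "right", or "two", matching the direction of Ha.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


p = z_p_value(2.50, "right")
print(round(p, 4))  # 0.0062
```

The same helper with `tail="two"` doubles the area beyond |z|, which is how the two-tailed answers in this section are obtained.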
96.

Subscripts: 1 = non-hybrid sedans, 2 = hybrid sedans

  1. H0: µ1 ≥ µ2
  2. Ha: µ1 < µ2
  3. The random variable is the difference in the mean miles per gallon of non-hybrid sedans and hybrid sedans.
  4. normal
  5. test statistic: 6.36
  6. p-value: 0
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for decision: p-value < alpha
    4. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean miles per gallon of non-hybrid sedans is less than that of hybrid sedans.
98.
  1. H0: µd = 0
  2. Ha: µd < 0
  3. The random variable X̄d is the average difference between husband’s and wife’s satisfaction level.
  4. Student’s t with 9 degrees of freedom (t9)
  5. test statistic: t = –1.86
  6. p-value: 0.0479
  7. Check student’s solution
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis, but run another test.
    3. Reason for Decision: p-value < alpha
    4. Conclusion: This is a weak test because alpha and the p-value are close. However, at the 5% significance level, there is sufficient evidence to conclude that the mean difference is negative.
100.
  1. H0: PW = PB
  2. Ha: PW ≠ PB
  3. The random variable is the difference in the proportions of white and black suicide victims, aged 15 to 24.
  4. normal for two proportions
  5. test statistic: –0.1944
  6. p-value: 0.8458
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Do not reject the null hypothesis.
    3. Reason for decision: p-value > alpha
    4. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportions of white and black female suicide victims, aged 15 to 24, are different.
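
The decision step in these answers follows one rule: reject the null hypothesis only when the p-value is smaller than alpha. A minimal sketch of that rule, using the p-value and alpha from this answer, is shown below. The function name `decide` is an assumption made for illustration.

```python
def decide(p_value, alpha):
    """Apply the p-value decision rule for a hypothesis test."""
    if p_value < alpha:
        return "Reject the null hypothesis"
    return "Do not reject the null hypothesis"


decision = decide(p_value=0.8458, alpha=0.05)
print(decision)  # Do not reject the null hypothesis
```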
102.

Subscripts: 1 = Cabrillo College, 2 = Lake Tahoe College

  1. H0: p1 = p2
  2. Ha: p1 ≠ p2
  3. The random variable is the difference between the proportions of Hispanic students at Cabrillo College and Lake Tahoe College.
  4. normal for two proportions
  5. test statistic: 4.29
  6. p-value: 0.00002
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for decision: p-value < alpha
    4. Conclusion: There is sufficient evidence to conclude that the proportions of Hispanic students at Cabrillo College and Lake Tahoe College are different.
104.

a

106.

Test: two independent sample proportions.

Random variable: p1 - p2

Distribution:
H0: p1 = p2
Ha: p1 ≠ p2

The proportion of eReader users is different for the 16- to 29-year-old users from that of the 30 and older users.

Graph: two-tailed

This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0017.
Figure 10.21

p-value : 0.0033

Decision: Reject the null hypothesis.

Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that the proportion of eReader users 16 to 29 years old is different from the proportion of eReader users 30 and older.

108.

Test: two independent sample proportions

Random variable: p′1 – p′2

Distribution:

H0: p1 = p2
Ha: p1 > p2

A higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.

Graph: right-tailed

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.2354.
Figure 10.22

p-value: 0.2354

Decision: Do not reject the H0.

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.

110.

Subscripts: 1: men; 2: women

  1. H0: p1 ≤ p2
  2. Ha: p1 > p2
  3. P1 – P2 is the difference between the proportions of men and women who enjoy shopping for electronic equipment.
  4. normal for two proportions
  5. test statistic: 0.22
  6. p-value: 0.4133
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Do not reject the null hypothesis.
    3. Reason for Decision: p-value > alpha
    4. Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportion of men who enjoy shopping for electronic equipment is more than the proportion of women.
112.
  1. H0: p1 = p2
  2. Ha: p1 ≠ p2
  3. P1 – P2 is the difference between the proportions of men and women that have at least one pierced ear.
  4. normal for two proportions
  5. test statistic: –4.82
  6. p-value: zero
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for Decision: p-value < alpha
    4. Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportions of males and females with at least one pierced ear are different.
114.
  1. H0: µd = 0
  2. Ha: µd > 0
  3. The random variable X̄d is the mean difference in work times on days when eating breakfast and on days when not eating breakfast.
  4. Student’s t with 9 degrees of freedom (t9)
  5. test statistic: 4.8963
  6. p-value: 0.0004
  7. Check student’s solution.
    1. Alpha: 0.05
    2. Decision: Reject the null hypothesis.
    3. Reason for Decision: p-value < alpha
    4. Conclusion: At the 5% level of significance, there is sufficient evidence to conclude that the mean difference in work times on days when eating breakfast and on days when not eating breakfast has increased.
115.

p-value = 0.1494

At the 5% significance level, there is insufficient evidence to conclude that the medication lowered cholesterol levels after 12 weeks.

117.

b

119.

c

121.

Test: two matched pairs or paired samples (t-test)

Random variable: X̄d

Distribution: Student’s t with 12 degrees of freedom (t12)

H0: μd = 0
Ha: μd > 0

The mean of the differences of new female breast cancer cases in the south between 2013 and 2012 is greater than zero. The estimate for new female breast cancer cases in the south is higher in 2013 than in 2012.

Graph: right-tailed

p-value: 0.0004

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.0004.
Figure 10.23

Decision: Reject H0

Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that there was a higher estimate of new female breast cancer cases in 2013 than in 2012.

123.

Test: matched or paired samples (t-test)

Difference data: {–0.9, –3.7, –3.2, –0.5, 0.6, –1.9, –0.5, 0.2, 0.6, 0.4, 1.7, –2.4, 1.8}

Random Variable: X̄d

Distribution:

H0: μd = 0
Ha: μd < 0

The mean of the differences of the rate of underemployment in the northeastern states between 2012 and 2011 is less than zero. The underemployment rate went down from 2011 to 2012.

Graph: left-tailed.

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.1207.
Figure 10.24

p-value: 0.1207

Decision: Do not reject H0.

Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that there was a decrease in the underemployment rates of the northeastern states from 2011 to 2012.
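
The p-value in answer 123 can be reproduced from the difference data listed there. The sketch below is one way to do so with only the Python standard library. It computes the paired t statistic and then approximates the left-tail area of the Student's t distribution by numerical integration (Simpson's rule). The helper names `t_pdf` and `t_cdf` and the step count are assumptions for this illustration; a statistics package or calculator would normally supply the tail area directly.

```python
import math
from statistics import mean, stdev

# Difference data from answer 123.
diffs = [-0.9, -3.7, -3.2, -0.5, 0.6, -1.9, -0.5, 0.2, 0.6, 0.4, 1.7, -2.4, 1.8]


def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about zero.

    steps must be even.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


n = len(diffs)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))  # sample mean over its standard error
p_value = t_cdf(t_stat, n - 1)  # left-tailed test, df = n - 1 = 12
print(round(t_stat, 2), round(p_value, 4))  # approximately -1.23 and 0.1207
```

Because the p-value exceeds 0.05, the calculation agrees with the decision not to reject H0.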

125.

e

127.

d

129.

f

131.

e

133.

f

135.

a

Citation/Attribution

Want to cite, share, or modify this book? This book is Creative Commons Attribution License 4.0 and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
Citation information

© Sep 19, 2013 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License 4.0 license. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.