Skip to Content
OpenStax Logo
Buy book
  1. Preface
  2. 1 Sampling and Data
    1. Introduction
    2. 1.1 Definitions of Statistics, Probability, and Key Terms
    3. 1.2 Data, Sampling, and Variation in Data and Sampling
    4. 1.3 Frequency, Frequency Tables, and Levels of Measurement
    5. 1.4 Experimental Design and Ethics
    6. 1.5 Data Collection Experiment
    7. 1.6 Sampling Experiment
    8. Key Terms
    9. Chapter Review
    10. Practice
    11. Homework
    12. Bringing It Together: Homework
    13. References
    14. Solutions
  3. 2 Descriptive Statistics
    1. Introduction
    2. 2.1 Stem-and-Leaf Graphs (Stemplots), Line Graphs, and Bar Graphs
    3. 2.2 Histograms, Frequency Polygons, and Time Series Graphs
    4. 2.3 Measures of the Location of the Data
    5. 2.4 Box Plots
    6. 2.5 Measures of the Center of the Data
    7. 2.6 Skewness and the Mean, Median, and Mode
    8. 2.7 Measures of the Spread of the Data
    9. 2.8 Descriptive Statistics
    10. Key Terms
    11. Chapter Review
    12. Formula Review
    13. Practice
    14. Homework
    15. Bringing It Together: Homework
    16. References
    17. Solutions
  4. 3 Probability Topics
    1. Introduction
    2. 3.1 Terminology
    3. 3.2 Independent and Mutually Exclusive Events
    4. 3.3 Two Basic Rules of Probability
    5. 3.4 Contingency Tables
    6. 3.5 Tree and Venn Diagrams
    7. 3.6 Probability Topics
    8. Key Terms
    9. Chapter Review
    10. Formula Review
    11. Practice
    12. Bringing It Together: Practice
    13. Homework
    14. Bringing It Together: Homework
    15. References
    16. Solutions
  5. 4 Discrete Random Variables
    1. Introduction
    2. 4.1 Probability Distribution Function (PDF) for a Discrete Random Variable
    3. 4.2 Mean or Expected Value and Standard Deviation
    4. 4.3 Binomial Distribution
    5. 4.4 Geometric Distribution
    6. 4.5 Hypergeometric Distribution
    7. 4.6 Poisson Distribution
    8. 4.7 Discrete Distribution (Playing Card Experiment)
    9. 4.8 Discrete Distribution (Lucky Dice Experiment)
    10. Key Terms
    11. Chapter Review
    12. Formula Review
    13. Practice
    14. Homework
    15. References
    16. Solutions
  6. 5 Continuous Random Variables
    1. Introduction
    2. 5.1 Continuous Probability Functions
    3. 5.2 The Uniform Distribution
    4. 5.3 The Exponential Distribution
    5. 5.4 Continuous Distribution
    6. Key Terms
    7. Chapter Review
    8. Formula Review
    9. Practice
    10. Homework
    11. References
    12. Solutions
  7. 6 The Normal Distribution
    1. Introduction
    2. 6.1 The Standard Normal Distribution
    3. 6.2 Using the Normal Distribution
    4. 6.3 Normal Distribution (Lap Times)
    5. 6.4 Normal Distribution (Pinkie Length)
    6. Key Terms
    7. Chapter Review
    8. Formula Review
    9. Practice
    10. Homework
    11. References
    12. Solutions
  8. 7 The Central Limit Theorem
    1. Introduction
    2. 7.1 The Central Limit Theorem for Sample Means (Averages)
    3. 7.2 The Central Limit Theorem for Sums
    4. 7.3 Using the Central Limit Theorem
    5. 7.4 Central Limit Theorem (Pocket Change)
    6. 7.5 Central Limit Theorem (Cookie Recipes)
    7. Key Terms
    8. Chapter Review
    9. Formula Review
    10. Practice
    11. Homework
    12. References
    13. Solutions
  9. 8 Confidence Intervals
    1. Introduction
    2. 8.1 A Single Population Mean using the Normal Distribution
    3. 8.2 A Single Population Mean using the Student t Distribution
    4. 8.3 A Population Proportion
    5. 8.4 Confidence Interval (Home Costs)
    6. 8.5 Confidence Interval (Place of Birth)
    7. 8.6 Confidence Interval (Women's Heights)
    8. Key Terms
    9. Chapter Review
    10. Formula Review
    11. Practice
    12. Homework
    13. References
    14. Solutions
  10. 9 Hypothesis Testing with One Sample
    1. Introduction
    2. 9.1 Null and Alternative Hypotheses
    3. 9.2 Outcomes and the Type I and Type II Errors
    4. 9.3 Distribution Needed for Hypothesis Testing
    5. 9.4 Rare Events, the Sample, Decision and Conclusion
    6. 9.5 Additional Information and Full Hypothesis Test Examples
    7. 9.6 Hypothesis Testing of a Single Mean and Single Proportion
    8. Key Terms
    9. Chapter Review
    10. Formula Review
    11. Practice
    12. Homework
    13. References
    14. Solutions
  11. 10 Hypothesis Testing with Two Samples
    1. Introduction
    2. 10.1 Two Population Means with Unknown Standard Deviations
    3. 10.2 Two Population Means with Known Standard Deviations
    4. 10.3 Comparing Two Independent Population Proportions
    5. 10.4 Matched or Paired Samples
    6. 10.5 Hypothesis Testing for Two Means and Two Proportions
    7. Key Terms
    8. Chapter Review
    9. Formula Review
    10. Practice
    11. Homework
    12. Bringing It Together: Homework
    13. References
    14. Solutions
  12. 11 The Chi-Square Distribution
    1. Introduction
    2. 11.1 Facts About the Chi-Square Distribution
    3. 11.2 Goodness-of-Fit Test
    4. 11.3 Test of Independence
    5. 11.4 Test for Homogeneity
    6. 11.5 Comparison of the Chi-Square Tests
    7. 11.6 Test of a Single Variance
    8. 11.7 Lab 1: Chi-Square Goodness-of-Fit
    9. 11.8 Lab 2: Chi-Square Test of Independence
    10. Key Terms
    11. Chapter Review
    12. Formula Review
    13. Practice
    14. Homework
    15. Bringing It Together: Homework
    16. References
    17. Solutions
  13. 12 Linear Regression and Correlation
    1. Introduction
    2. 12.1 Linear Equations
    3. 12.2 Scatter Plots
    4. 12.3 The Regression Equation
    5. 12.4 Testing the Significance of the Correlation Coefficient
    6. 12.5 Prediction
    7. 12.6 Outliers
    8. 12.7 Regression (Distance from School)
    9. 12.8 Regression (Textbook Cost)
    10. 12.9 Regression (Fuel Efficiency)
    11. Key Terms
    12. Chapter Review
    13. Formula Review
    14. Practice
    15. Homework
    16. Bringing It Together: Homework
    17. References
    18. Solutions
  14. 13 F Distribution and One-Way ANOVA
    1. Introduction
    2. 13.1 One-Way ANOVA
    3. 13.2 The F Distribution and the F-Ratio
    4. 13.3 Facts About the F Distribution
    5. 13.4 Test of Two Variances
    6. 13.5 Lab: One-Way ANOVA
    7. Key Terms
    8. Chapter Review
    9. Formula Review
    10. Practice
    11. Homework
    12. References
    13. Solutions
  15. A | Review Exercises (Ch 3-13)
  16. B | Practice Tests (1-4) and Final Exams
  17. C | Data Sets
  18. D | Group and Partner Projects
  19. E | Solution Sheets
  20. F | Mathematical Phrases, Symbols, and Formulas
  21. G | Notes for the TI-83, 83+, 84, 84+ Calculators
  22. H | Tables
  23. Index

11.1 Facts About the Chi-Square Distribution

1.

If the number of degrees of freedom for a chi-square distribution is 25, what is the population mean and standard deviation?

2.

If df > 90, the distribution is _____________. If df = 15, the distribution is ________________.

3.

When does the chi-square curve approximate a normal distribution?

4.

Where is μ located on a chi-square curve?

5.

Is it more likely the df is 90, 20, or two in the graph?

This is a nonsymmetrical  chi-square curve which slopes downward continually.
Figure 11.13

11.2 Goodness-of-Fit Test

Determine the appropriate test to be used in the next three exercises.

6.

An archeologist is calculating the distribution of the frequency of the number of artifacts she finds in a dig site. Based on previous digs, the archeologist creates an expected distribution broken down by grid sections in the dig site. Once the site has been fully excavated, she compares the actual number of artifacts found in each grid section to see if her expectation was accurate.

7.

An economist is deriving a model to predict outcomes on the stock market. He creates a list of expected points on the stock market index for the next two weeks. At the close of each day’s trading, he records the actual points on the index. He wants to see how well his model matched what actually happened.

8.

A personal trainer is putting together a weight-lifting program for her clients. For a 90-day program, she expects each client to lift a specific maximum weight each week. As she goes along, she records the actual maximum weights her clients lifted. She wants to know how well her expectations met with what was observed.

Use the following information to answer the next five exercises: A teacher predicts that the distribution of grades on the final exam will be and they are recorded in Table 11.27.

Grade Proportion
A 0.25
B 0.30
C 0.35
D 0.10
Table 11.27

The actual distribution for a class of 20 is in Table 11.28.

Grade Frequency
A 7
B 7
C 5
D 1
Table 11.28
9.

df= df= ______

10.

State the null and alternative hypotheses.

11.

χ2 test statistic = ______

12.

p-value = ______

13.

At the 5% significance level, what can you conclude?


Use the following information to answer the next nine exercises: The following data are real. The cumulative number of AIDS cases reported for Santa Clara County is broken down by ethnicity as in Table 11.29.

Ethnicity Number of Cases
White 2,229
Hispanic 1,157
Black/African-American 457
Asian, Pacific Islander 232
Total = 4,075
Table 11.29

The percentage of each ethnic group in Santa Clara County is as in Table 11.30.

Ethnicity Percentage of total county population Number expected (round to two decimal places)
White 42.9% 1748.18
Hispanic 26.7%
Black/African-American 2.6%
Asian, Pacific Islander 27.8%
Total = 100%
Table 11.30
14.

If the ethnicities of AIDS victims followed the ethnicities of the total county population, fill in the expected number of cases per ethnic group.
Perform a goodness-of-fit test to determine whether the occurrence of AIDS cases follows the ethnicities of the general population of Santa Clara County.

15.

H0: _______

16.

Ha: _______

17.

Is this a right-tailed, left-tailed, or two-tailed test?

18.

degrees of freedom = _______

19.

χ2 test statistic = _______

20.

p-value = _______

21.

Graph the situation. Label and scale the horizontal axis. Mark the mean and test statistic. Shade in the region corresponding to the p-value.

This is a blank graph template. The vertical and horizontal axes are unlabeled.
Figure 11.14

Let α = 0.05

Decision: ________________

Reason for the Decision: ________________

Conclusion (write out in complete sentences): ________________

22.

Does it appear that the pattern of AIDS cases in Santa Clara County corresponds to the distribution of ethnic groups in this county? Why or why not?

11.3 Test of Independence

Determine the appropriate test to be used in the next three exercises.

23.

A pharmaceutical company is interested in the relationship between age and presentation of symptoms for a common viral infection. A random sample is taken of 500 people with the infection across different age groups.

24.

The owner of a baseball team is interested in the relationship between player salaries and team winning percentage. He takes a random sample of 100 players from different organizations.

25.

A marathon runner is interested in the relationship between the brand of shoes runners wear and their run times. She takes a random sample of 50 runners and records their run times as well as the brand of shoes they were wearing.


Use the following information to answer the next seven exercises: Transit Railroads is interested in the relationship between travel distance and the ticket class purchased. A random sample of 200 passengers is taken. Table 11.31 shows the results. The railroad wants to know if a passenger’s choice in ticket class is independent of the distance they must travel.

Traveling Distance Third class Second class First class Total
1–100 miles 21 14 6 41
101–200 miles 18 16 8 42
201–300 miles 16 17 15 48
301–400 miles 12 14 21 47
401–500 miles 6 6 10 22
Total 73 67 60 200
Table 11.31
26.

State the hypotheses.
H0: _______
Ha: _______

27.

df = _______

28.

How many passengers are expected to travel between 201 and 300 miles and purchase second-class tickets?

29.

How many passengers are expected to travel between 401 and 500 miles and purchase first-class tickets?

30.

What is the test statistic?

31.

What is the p-value?

32.

What can you conclude at the 5% level of significance?


Use the following information to answer the next eight exercises: An article in the New England Journal of Medicine, discussed a study on smokers in California and Hawaii. In one part of the report, the self-reported ethnicity and smoking levels per day were given. Of the people smoking at most ten cigarettes per day, there were 9,886 African Americans, 2,745 Native Hawaiians, 12,831 Latinos, 8,378 Japanese Americans and 7,650 whites. Of the people smoking 11 to 20 cigarettes per day, there were 6,514 African Americans, 3,062 Native Hawaiians, 4,932 Latinos, 10,680 Japanese Americans, and 9,877 whites. Of the people smoking 21 to 30 cigarettes per day, there were 1,671 African Americans, 1,419 Native Hawaiians, 1,406 Latinos, 4,715 Japanese Americans, and 6,062 whites. Of the people smoking at least 31 cigarettes per day, there were 759 African Americans, 788 Native Hawaiians, 800 Latinos, 2,305 Japanese Americans, and 3,970 whites.

33.

Complete the table.

Smoking Level Per Day African American Native Hawaiian Latino Japanese Americans White TOTALS
1-10
11-20
21-30
31+
TOTALS
Table 11.32 Smoking Levels by Ethnicity (Observed)
34.

State the hypotheses.
H0: _______
Ha: _______

35.

Enter expected values in Table 11.32. Round to two decimal places.

Calculate the following values:

36.

df = _______

37.

χ 2 χ 2 test statistic = ______

38.

p-value = ______

39.

Is this a right-tailed, left-tailed, or two-tailed test? Explain why.

40.

Graph the situation. Label and scale the horizontal axis. Mark the mean and test statistic. Shade in the region corresponding to the p-value.

Blank graph with vertical and horizontal axes.
Figure 11.15

State the decision and conclusion (in a complete sentence) for the following preconceived levels of α.

41.

α = 0.05

  1. Decision: ___________________
  2. Reason for the decision: ___________________
  3. Conclusion (write out in a complete sentence): ___________________
42.

α = 0.01

  1. Decision: ___________________
  2. Reason for the decision: ___________________
  3. Conclusion (write out in a complete sentence): ___________________

11.4 Test for Homogeneity

43.

A math teacher wants to see if two of her classes have the same distribution of test scores. What test should she use?

44.

What are the null and alternative hypotheses for Exercise 11.43?

45.

A market researcher wants to see if two different stores have the same distribution of sales throughout the year. What type of test should he use?

46.

A meteorologist wants to know if East and West Australia have the same distribution of storms. What type of test should she use?

47.

What condition must be met to use the test for homogeneity?

Use the following information to answer the next five exercises: Do private practice doctors and hospital doctors have the same distribution of working hours? Suppose that a sample of 100 private practice doctors and 150 hospital doctors are selected at random and asked about the number of hours a week they work. The results are shown in Table 11.33.

20–30 30–40 40–50 50–60
Private Practice 16 40 38 6
Hospital 8 44 59 39
Table 11.33
48.

State the null and alternative hypotheses.

49.

df = _______

50.

What is the test statistic?

51.

What is the p-value?

52.

What can you conclude at the 5% significance level?

11.5 Comparison of the Chi-Square Tests

53.

Which test do you use to decide whether an observed distribution is the same as an expected distribution?

54.

What is the null hypothesis for the type of test from Exercise 11.53?

55.

Which test would you use to decide whether two factors have a relationship?

56.

Which test would you use to decide if two populations have the same distribution?

57.

How are tests of independence similar to tests for homogeneity?

58.

How are tests of independence different from tests for homogeneity?

11.6 Test of a Single Variance

Use the following information to answer the next three exercises: An archer’s standard deviation for his hits is six (data is measured in distance from the center of the target). An observer claims the standard deviation is less.

59.

What type of test should be used?

60.

State the null and alternative hypotheses.

61.

Is this a right-tailed, left-tailed, or two-tailed test?


Use the following information to answer the next three exercises: The standard deviation of heights for students in a school is 0.81. A random sample of 50 students is taken, and the standard deviation of heights of the sample is 0.96. A researcher in charge of the study believes the standard deviation of heights for the school is greater than 0.81.

62.

What type of test should be used?

63.

State the null and alternative hypotheses.

64.

df = ________


Use the following information to answer the next four exercises: The average waiting time in a doctor’s office varies. The standard deviation of waiting times in a doctor’s office is 3.4 minutes. A random sample of 30 patients in the doctor’s office has a standard deviation of waiting times of 4.1 minutes. One doctor believes the variance of waiting times is greater than originally thought.

65.

What type of test should be used?

66.

What is the test statistic?

67.

What is the p-value?

68.

What can you conclude at the 5% significance level?

Citation/Attribution

Want to cite, share, or modify this book? This book is Creative Commons Attribution License 4.0 and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
Citation information

© Sep 19, 2013 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License 4.0 license. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.