Introductory Statistics

13.3 Facts About the F Distribution


Here are some facts about the F distribution.

  1. The curve is not symmetrical but skewed to the right.
  2. There is a different curve for each set of dfs.
  3. The F statistic is greater than or equal to zero.
  4. As the degrees of freedom for the numerator and for the denominator get larger, the curve approximates the normal.
  5. Other uses for the F distribution include comparing two variances and two-way Analysis of Variance. Two-Way Analysis is beyond the scope of this chapter.
This graph has an unmarked Y axis and an X axis that ranges from 0.00 to 4.00. It has three plot lines. The plot line labeled F subscript 1, 5 starts near the top of the Y axis at the extreme left of the graph and drops quickly to near the bottom at 0.50, at which point it slowly decreases in a curved fashion to the 4.00 mark on the X axis. The plot line labeled F subscript 100, 100 remains at Y = 0 for much of its length, except for a distinct peak between 0.50 and 1.50. The peak is a smooth curve that reaches about halfway up the Y axis. The plot line labeled F subscript 5, 10 increases slightly as it progresses from 0.00 to 0.50, after which it peaks and slowly decreases down the remainder of the X axis. The peak only reaches about one fifth of the height of the Y axis.
Figure 13.3
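The facts listed above can be checked numerically. The following is a minimal sketch, not part of the original text, that evaluates the standard F density formula with Python's standard library. The function name f_pdf is chosen for illustration, and the three pairs of degrees of freedom are the ones shown in Figure 13.3.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 (numerator) and d2 (denominator) df."""
    if x <= 0:
        # Simplification for this sketch: report 0 at and below zero,
        # since the F statistic cannot be negative.
        return 0.0
    log_pdf = (
        math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2) - math.lgamma(d2 / 2)
        + (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
    )
    return math.exp(log_pdf)

# One row of curve heights per pair of degrees of freedom.
for d1, d2 in [(1, 5), (5, 10), (100, 100)]:
    heights = [round(f_pdf(x, d1, d2), 3) for x in (0.25, 0.5, 1.0, 2.0, 4.0)]
    print(f"F({d1},{d2}) density at x = 0.25, 0.5, 1, 2, 4:", heights)
```

Each pair of degrees of freedom produces a different set of heights, and with 100 degrees of freedom in both numerator and denominator the density is concentrated near 1, which is consistent with the figure.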

Example 13.2

Let’s return to the slicing tomato exercise in Try It. The means of the tomato yields under the five mulching conditions are represented by μ1, μ2, μ3, μ4, μ5. We will conduct a hypothesis test to determine if all means are the same or at least one is different. Using a significance level of 5%, test the null hypothesis that there is no difference in mean yields among the five groups against the alternative hypothesis that at least one mean is different from the rest.

Solution 13.2

The null and alternative hypotheses are:

H0: μ1 = μ2 = μ3 = μ4 = μ5

Ha: μi ≠ μj for some i ≠ j

The one-way ANOVA results are shown in Table 13.5.

| Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS) | F |
|---|---|---|---|---|
| Factor (Between) | 36,648,561 | 5 – 1 = 4 | 36,648,561 / 4 = 9,162,140 | 9,162,140 / 2,044,672.6 = 4.4810 |
| Error (Within) | 20,446,726 | 15 – 5 = 10 | 20,446,726 / 10 = 2,044,672.6 | |
| Total | 57,095,287 | 15 – 1 = 14 | | |
Table 13.5

Distribution for the test: F4,10

df(num) = 5 – 1 = 4

df(denom) = 15 – 5 = 10

Test statistic: F = 4.4810

This graph shows a nonsymmetrical F distribution curve. The horizontal axis extends from 0 - 5, and the vertical axis ranges from 0 - 0.7. The curve is strongly skewed to the right.
Figure 13.4

Probability Statement: p-value = P(F > 4.481) = 0.0248.

Compare α and the p-value: α = 0.05, p-value = 0.0248

Make a decision: Since α > p-value, we reject H0.

Conclusion: At the 5% significance level, we have reasonably strong evidence that differences in mean yields for slicing tomato plants grown under different mulching conditions are unlikely to be due to chance alone. We may conclude that at least some of the mulches led to different mean yields.
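The arithmetic behind Table 13.5 and the p-value can be sketched in a few lines of Python. This is an illustrative example rather than the textbook's method. The variable and function names are assumptions, and the p-value uses a shortcut that holds only when both degrees of freedom are even (here 4 and 10), in which case the upper tail of the F distribution reduces to a finite binomial sum.

```python
import math

ss_between, ss_within = 36_648_561, 20_446_726  # sums of squares from Table 13.5
k, n = 5, 15                                    # number of groups, total observations

df_num, df_den = k - 1, n - k
ms_between = ss_between / df_num
ms_within = ss_within / df_den
f_stat = ms_between / ms_within

def f_upper_tail_even_df(f, d1, d2):
    """P(F > f) when d1 and d2 are both even (shortcut assumed by this sketch)."""
    if d1 % 2 or d2 % 2:
        raise ValueError("this shortcut assumes both degrees of freedom are even")
    a, b = d2 // 2, d1 // 2
    x = d2 / (d2 + d1 * f)
    m = a + b - 1
    # Binomial form of the regularized incomplete beta function I_x(a, b).
    return sum(math.comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))

p_value = f_upper_tail_even_df(f_stat, df_num, df_den)
print(f"F = {f_stat:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Because the shortcut raises an error for odd degrees of freedom, it cannot be applied to every example in this section; a general routine is needed in those cases.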

Using the TI-83, 83+, 84, 84+ Calculator

To find these results on the calculator:

Press STAT. Press 1:EDIT. Put the data into the lists L1, L2, L3, L4, L5.

Press STAT, and arrow over to TESTS, and arrow down to ANOVA. Press ENTER, and then enter L1, L2, L3, L4, L5). Press ENTER. You will see that the values in the foregoing ANOVA table are easily produced by the calculator, including the test statistic and the p-value of the test.

The calculator displays:
F = 4.4810
p = 0.0248 (p-value)
Factor
df = 4
SS = 36648560.9
MS = 9162140.23
Error
df = 10
SS = 20446726
MS = 2044672.6

Try It 13.2

MRSA, or methicillin-resistant Staphylococcus aureus, can cause serious bacterial infections in hospital patients. Table 13.6 shows various colony counts from different patients who may or may not have MRSA. The data from the table is plotted in Figure 13.5.

| Conc = 0.6 | Conc = 0.8 | Conc = 1.0 | Conc = 1.2 | Conc = 1.4 |
|---|---|---|---|---|
| 9 | 16 | 22 | 30 | 27 |
| 66 | 93 | 147 | 199 | 168 |
| 98 | 82 | 120 | 148 | 132 |
Table 13.6

Plot of the data for the different concentrations:

This graph is a scatterplot for the data provided. The horizontal axis is labeled 'Colony counts' and extends from 0 - 200. The vertical axis is labeled 'Tryptone concentrations' and extends from 0.6 - 1.4.
Figure 13.5

Test whether the mean numbers of colonies are the same or different. Construct the ANOVA table (by hand or by using a TI-83, 83+, or 84+ calculator), find the p-value, and state your conclusion. Use a 5% significance level.

Example 13.3

Four sororities took a random sample of sisters regarding their grade means for the past term. The results are shown in Table 13.7.

| Sorority 1 | Sorority 2 | Sorority 3 | Sorority 4 |
|---|---|---|---|
| 2.17 | 2.63 | 2.63 | 3.79 |
| 1.85 | 1.77 | 3.78 | 3.45 |
| 2.83 | 3.25 | 4.00 | 3.08 |
| 1.69 | 1.86 | 2.55 | 2.26 |
| 3.33 | 2.21 | 2.45 | 3.18 |
Table 13.7 MEAN GRADES FOR FOUR SORORITIES

Using a significance level of 1%, is there a difference in mean grades among the sororities?

Solution 13.3

Let μ1, μ2, μ3, μ4 be the population means of the sororities. Remember that the null hypothesis claims that the sorority groups are from the same normal distribution. The alternate hypothesis says that at least two of the sorority groups come from populations with different normal distributions. Notice that the four sample sizes are each five.

Note

This is an example of a balanced design, because each level of the factor (i.e., each sorority) has the same number of observations.

H0: μ1 = μ2 = μ3 = μ4

Ha: Not all of the means μ1, μ2, μ3, μ4 are equal.

Distribution for the test: F3,16

where k = 4 groups and n = 20 observations in total

df(num)= k – 1 = 4 – 1 = 3

df(denom) = n – k = 20 – 4 = 16

Calculate the test statistic: F = 2.23

Graph:

This graph shows a nonsymmetrical F distribution curve with values of 0 and 2.23 on the x-axis representing the test statistic of sorority grade averages. The curve is slightly skewed to the right, but is approximately normal. A vertical upward line extends from 2.23 to the curve and the area to the right of this is shaded to represent the p-value.
Figure 13.6

Probability statement: p-value = P(F > 2.23) = 0.1241

Compare α and the p-value: α = 0.01
p-value = 0.1241
α < p-value

Make a decision: Since α < p-value, you cannot reject H0.

Conclusion: There is not sufficient evidence to conclude that there is a difference among the mean grades for the sororities.

Using the TI-83, 83+, 84, 84+ Calculator

Put the data into lists L1, L2, L3, and L4. Press STAT and arrow over to TESTS. Arrow down to F:ANOVA. Press ENTER and enter (L1,L2,L3,L4).

The calculator displays the F statistic, the p-value and the values for the one-way ANOVA table:
F = 2.2303
p = 0.1241 (p-value)
Factor
df = 3
SS = 2.88732
MS = 0.96244
Error
df = 16
SS = 6.9044
MS = 0.431525
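For readers without a calculator, the same table can be reproduced from the raw data in Table 13.7. The sketch below uses only Python's standard library. The variable names are illustrative assumptions, and the between-group and within-group sums of squares are computed directly from their definitions.

```python
from statistics import mean

# Grade means from Table 13.7, one list per sorority.
groups = {
    "Sorority 1": [2.17, 1.85, 2.83, 1.69, 3.33],
    "Sorority 2": [2.63, 1.77, 3.25, 1.86, 2.21],
    "Sorority 3": [2.63, 3.78, 4.00, 2.55, 2.45],
    "Sorority 4": [3.79, 3.45, 3.08, 2.26, 3.18],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)
k, n = len(groups), len(all_values)

# Between groups: squared distance of each group mean from the grand mean,
# weighted by group size. Within groups: squared distance from the group mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

df_num, df_den = k - 1, n - k
ms_between, ms_within = ss_between / df_num, ss_within / df_den
f_stat = ms_between / ms_within

print(f"Factor: df = {df_num}, SS = {ss_between:.5f}, MS = {ms_between:.5f}")
print(f"Error:  df = {df_den}, SS = {ss_within:.4f}, MS = {ms_within:.6f}")
print(f"F = {f_stat:.4f}")
```

The sketch stops at the F statistic; the p-value still has to come from the F distribution with 3 and 16 degrees of freedom.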

Try It 13.3

Four sports teams took a random sample of players regarding their GPAs for the last year. The results are shown in Table 13.8.

| Basketball | Baseball | Hockey | Lacrosse |
|---|---|---|---|
| 3.6 | 2.1 | 4.0 | 2.0 |
| 2.9 | 2.6 | 2.0 | 3.6 |
| 2.5 | 3.9 | 2.6 | 3.9 |
| 3.3 | 3.1 | 3.2 | 2.7 |
| 3.8 | 3.4 | 3.2 | 2.5 |
Table 13.8 GPAs FOR FOUR SPORTS TEAMS

Use a significance level of 5%, and determine if there is a difference in GPA among the teams.

Example 13.4

A fourth grade class is studying the environment. One of the assignments is to grow bean plants in different soils. Tommy chose to grow his bean plants in soil found outside his classroom mixed with dryer lint. Tara chose to grow her bean plants in potting soil bought at the local nursery. Nick chose to grow his bean plants in soil from his mother's garden. No chemicals were used on the plants, only water. They were grown inside the classroom next to a large window. Each child grew five plants. At the end of the growing period, each plant was measured, producing the data (in inches) in Table 13.9.

| Tommy's Plants | Tara's Plants | Nick's Plants |
|---|---|---|
| 24 | 25 | 23 |
| 21 | 31 | 27 |
| 23 | 23 | 22 |
| 30 | 20 | 30 |
| 23 | 28 | 20 |
Table 13.9

Does it appear that the three media in which the bean plants were grown produce the same mean height? Test at a 3% level of significance.

Solution 13.4

This time, we will perform the calculations that lead to the F' statistic. Notice that each group has the same number of plants, so we will use the formula $F' = \dfrac{n s_{\bar{x}}^2}{s^2_{\text{pooled}}}$.

First, calculate the sample mean and sample variance of each group.

| | Tommy's Plants | Tara's Plants | Nick's Plants |
|---|---|---|---|
| Sample Mean | 24.2 | 25.4 | 24.4 |
| Sample Variance | 11.7 | 18.3 | 16.3 |
Table 13.10

Next, calculate the variance of the three group means (calculate the variance of 24.2, 25.4, and 24.4). Variance of the group means = 0.413 = $s_{\bar{x}}^2$

Then $MS_{\text{between}} = n s_{\bar{x}}^2 = (5)(0.413)$, where n = 5 is the sample size (number of plants each child grew).

Calculate the mean of the three sample variances (calculate the mean of 11.7, 18.3, and 16.3). Mean of the sample variances = 15.433 = $s^2_{\text{pooled}}$

Then $MS_{\text{within}} = s^2_{\text{pooled}} = 15.433$.

The F statistic (or F ratio) is $F = \dfrac{MS_{\text{between}}}{MS_{\text{within}}} = \dfrac{n s_{\bar{x}}^2}{s^2_{\text{pooled}}} = \dfrac{(5)(0.413)}{15.433} = 0.134$

The dfs for the numerator = the number of groups – 1 = 3 – 1 = 2.

The dfs for the denominator = the total number of observations – the number of groups = 15 – 3 = 12.

The distribution for the test is F2,12 and the F statistic is F = 0.134

The p-value is P(F > 0.134) = 0.8759.

Decision: Since α = 0.03 and the p-value = 0.8759, do not reject H0. (Why?)

Conclusion: With a 3% level of significance, from the sample data, the evidence is not sufficient to conclude that the mean heights of the bean plants are different.
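The balanced-design shortcut used in this solution can be sketched as follows. This is a hedged illustration using Python's standard library, with names chosen for this example. The p-value line relies on a closed form that applies only when the numerator has 2 degrees of freedom, P(F > f) = (1 + 2f/df(denom))^(–df(denom)/2), so it is not a general F routine. Because the code keeps full precision instead of rounding intermediate values to three decimals, the last digit of a result may differ slightly from the value in the text.

```python
from statistics import mean, variance

# Plant heights in inches from Table 13.9.
heights = {
    "Tommy": [24, 21, 23, 30, 23],
    "Tara": [25, 31, 23, 20, 28],
    "Nick": [23, 27, 22, 30, 20],
}

n = 5                                                  # plants per child (balanced design)
group_means = [mean(h) for h in heights.values()]
group_vars = [variance(h) for h in heights.values()]   # sample variances

ms_between = n * variance(group_means)   # n times the variance of the group means
ms_within = mean(group_vars)             # pooled variance for equal group sizes
f_stat = ms_between / ms_within

df_num = len(heights) - 1
df_den = n * len(heights) - len(heights)

# Closed form valid only because df_num == 2 (assumption of this sketch).
p_value = (1 + 2 * f_stat / df_den) ** (-df_den / 2)

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_stat:.3f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.03 else "do not reject H0")
```

Averaging the sample variances gives the pooled variance only because every group has the same size; with unequal group sizes the variances would need to be weighted by their degrees of freedom.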

Using the TI-83, 83+, 84, 84+ Calculator

To calculate the p-value:

* Press 2nd DISTR.

* Arrow down to Fcdf( and press ENTER.

* Enter 0.134, E99, 2, 12).

* Press ENTER.

The p-value is 0.8759.
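A rough software counterpart of the calculator's Fcdf( for an upper-tail area can be written without external libraries. The sketch below is an illustration under stated assumptions, not the calculator's algorithm. The function names are hypothetical, and the code evaluates the regularized incomplete beta function I_x(a, b) with a standard continued-fraction expansion, using the identity P(F > f) = I_x(df(denom)/2, df(num)/2) with x = df(denom) / (df(denom) + df(num)·f).

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Each iteration applies the even and then the odd term of the fraction.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = c * d
            h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise RuntimeError("continued fraction did not converge")

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log(1.0 - x)
    )
    # Use the symmetry I_x(a, b) = 1 - I_(1-x)(b, a) where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_upper_tail(f, df_num, df_den):
    """P(F > f), playing the role of Fcdf(f, E99, df_num, df_den)."""
    if f <= 0:
        return 1.0
    x = df_den / (df_den + df_num * f)
    return reg_inc_beta(df_den / 2, df_num / 2, x)

print(round(f_upper_tail(0.134, 2, 12), 4))   # Example 13.4
print(round(f_upper_tail(4.4810, 4, 10), 4))  # Example 13.2
print(round(f_upper_tail(2.2303, 3, 16), 4))  # Example 13.3
```

For the three examples in this section, the printed values should agree with the p-values reported in the text (0.8759, 0.0248, and 0.1241) to about three decimal places.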

Try It 13.4

Another fourth grader also grew bean plants, but this time in a jelly-like mass. The heights were (in inches) 24, 28, 25, 30, and 32. Do a one-way ANOVA test on the four groups. Are the heights of the bean plants different? Use the same method as shown in Example 13.4.

Collaborative Exercise

From the class, create four groups of the same size as follows: men under 22, men at least 22, women under 22, women at least 22. Have each member of each group record the number of states in the United States he or she has visited. Run an ANOVA test to determine if the average number of states visited is the same in all four groups. Test at a 1% level of significance. Use one of the solution sheets in Table C3.

Citation/Attribution

Want to cite, share, or modify this book? This book is Creative Commons Attribution License 4.0 and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
Citation information

© Sep 19, 2013 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License 4.0 license. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.