- In a hypothesis test problem, you may see words such as "the level of significance is 1%." The "1%" is the preconceived or preset α.
- The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data.
- If no level of significance is given, a common standard to use is α = 0.05.
- When you calculate the p-value and draw the picture, the p-value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left, right, or two tailed.
- The alternative hypothesis, Ha, tells you if the test is left, right, or two-tailed. It is the key to conducting the appropriate test.
- Ha never has a symbol that contains an equal sign.
- Thinking about the meaning of the p-value: A data analyst (and anyone else) should have more confidence that he made the correct decision to reject the null hypothesis with a smaller p-value (for example, 0.001 as opposed to 0.04) even if using the 0.05 level for alpha. Similarly, for a large p-value such as 0.4, as opposed to a p-value of 0.056 (alpha = 0.05 is less than either number), a data analyst should have more confidence that she made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.
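The link between the tail of the test, the p-value, and the decision rule can be sketched in a few lines of Python. This is a minimal illustration that assumes the test statistic is a z-score; the function names `p_value_from_z` and `decide` and the tail labels are hypothetical helpers introduced here, not part of the text.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """Return the p-value for a z test statistic.

    tail is "left", "right", or "two" (illustrative labels chosen here).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)                 # area to the left of z
    if tail == "right":
        return 1 - std_normal.cdf(z)             # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # area split evenly between both tails
    raise ValueError("tail must be 'left', 'right', or 'two'")


def decide(p_value, alpha=0.05):
    """Reject H0 when alpha > p-value; otherwise do not reject H0."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# A p-value of 0.001 and a p-value of 0.04 both lead to rejection at alpha = 0.05,
# but the smaller one is stronger evidence against H0.
print(decide(0.001), "|", decide(0.04), "|", decide(0.056))
```

The decision function returns only "reject" or "do not reject"; the size of the p-value itself still carries the information about how much confidence to place in that decision.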
The following examples illustrate a left-, right-, and two-tailed test.
Example 9.11
H0: μ = 5, Ha: μ < 5
Test of a single population mean. Ha tells you the test is left-tailed. The picture of the p-value is as follows:
Try It 9.11
H0: μ = 10, Ha: μ < 10
Assume the p-value is 0.0935. What type of test is this? Draw the picture of the p-value.
Example 9.12
H0: p ≤ 0.2 Ha: p > 0.2
This is a test of a single population proportion. Ha tells you the test is right-tailed. The picture of the p-value is as follows:
Try It 9.12
H0: μ ≤ 1, Ha: μ > 1
Assume the p-value is 0.1243. What type of test is this? Draw the picture of the p-value.
Example 9.13
H0: μ = 50 Ha: μ ≠ 50
This is a test of a single population mean. Ha tells you the test is two-tailed. The picture of the p-value is as follows.
Try It 9.13
H0: p = 0.5, Ha: p ≠ 0.5
Assume the p-value is 0.2564. What type of test is this? Draw the picture of the p-value.
Full Hypothesis Test Examples
Example 9.14
Problem
Jeffrey, as an eight-year-old, established a mean time of 16.43 seconds for swimming the 25-yard freestyle, with a standard deviation of 0.8 seconds. His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for 15 25-yard freestyle swims. For the 15 swims, Jeffrey's mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds. Conduct a hypothesis test using a preset α = 0.05. Assume that the swim times for the 25-yard freestyle are normal.
Solution
Set up the Hypothesis Test:
Since the problem is about a mean, this is a test of a single population mean.
H0: μ = 16.43 Ha: μ < 16.43
For Jeffrey to swim faster, his time will be less than 16.43 seconds. The "<" tells you this is left-tailed.
Determine the distribution needed:
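Random variable: X̄ = the mean time to swim the 25-yard freestyle.
Distribution for the test: X̄ is normal (population standard deviation is known: σ = 0.8)
Therefore, X̄ ~ N(16.43, 0.8/√15), where the second value is the standard deviation of the sample mean, σ/√n.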
μ = 16.43 comes from H0 and not the data. σ = 0.8, and n = 15.
Calculate the p-value using the normal distribution for a mean:
p-value = P(x̄ < 16) = 0.0187 where the sample mean in the problem is given as 16.
p-value = 0.0187 (This is called the actual level of significance.) The p-value is the area to the left of the sample mean, which is given as 16.
Graph:
μ = 16.43 comes from H0. Our assumption is μ = 16.43.
Interpretation of the p-value: If H0 is true, there is a 0.0187 probability (1.87%) that Jeffrey's mean time to swim the 25-yard freestyle is 16 seconds or less. Because a 1.87% chance is small, the mean time of 16 seconds or less is unlikely to have happened randomly. It is a rare event.
Compare α and the p-value:
α = 0.05 p-value = 0.0187 α > p-value
Make a decision: Since α > p-value, reject H0.
This indicates that you reject the null hypothesis that the mean time to swim the 25-yard freestyle is at least 16.43 seconds.
Conclusion: At the 5% significance level, there is sufficient evidence that Jeffrey's mean time to swim the 25-yard freestyle is less than 16.43 seconds. Thus, based on the sample data, we conclude that Jeffrey swims faster using the new goggles.
The Type I and Type II errors for this problem are as follows:
The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in at least 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)
The Type II error is that there is not evidence to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually does swim the 25-yard free-style, on average, in less than 16.43 seconds. (Do not reject the null hypothesis when the null hypothesis is false.)
Using the TI-83, 83+, 84, 84+ Calculator
Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Stats and press ENTER. Arrow down and enter 16.43 for μ0 (null hypothesis), .8 for σ, 16 for the sample mean, and 15 for n. Arrow down to μ: (alternate hypothesis) and arrow over to < μ0. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p-value (p = 0.0187) but it also calculates the test statistic (z-score) for the sample mean. μ < 16.43 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with z = -2.08 (test statistic) and p = 0.0187 (p-value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
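For readers without a calculator, the same z-test can be sketched in Python using only the standard library. The variable names are illustrative choices made here; the numbers are the ones given in the problem.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 16.43   # mean under H0
sigma = 0.8    # known population standard deviation
n = 15         # number of timed swims
x_bar = 16.0   # observed sample mean

# Sampling distribution of the mean under H0: normal with
# mean mu_0 and standard deviation sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu_0, sigma=sigma / sqrt(n))

z = (x_bar - mu_0) / (sigma / sqrt(n))   # test statistic
p_value = sampling_dist.cdf(x_bar)       # left tail: P(sample mean < 16)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The printed values should agree, up to rounding, with the test statistic and p-value reported by the calculator.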
Try It 9.14
The mean throwing distance of a football for Marco, a high school quarterback, is 40 yards, with a standard deviation of two yards. The team coach tells Marco to adjust his grip to get more distance. The coach records the distances for 20 throws. For the 20 throws, Marco’s mean distance was 45 yards. The coach thought the different grip helped Marco throw farther than 40 yards. Conduct a hypothesis test using a preset α = 0.05. Assume the throw distances for footballs are normal.
First, determine what type of test this is, set up the hypothesis test, find the p-value, sketch the graph, and state your conclusion.
Using the TI-83, 83+, 84, 84+ Calculator
Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Stats and press ENTER. Arrow down and enter 40 for μ0 (null hypothesis), 2 for σ, 45 for the sample mean, and 20 for n. Arrow down to μ: (alternative hypothesis) and set it either as <, ≠, or >. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p-value but it also calculates the test statistic (z-score) for the sample mean. Select <, ≠, or > for the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with test statistic and p-value. Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
Historical Note (Example 9.14)
The traditional way to compare the two probabilities, α and the p-value, is to compare the critical value (z-score from α) to the test statistic (z-score from data). The calculated test statistic for the p-value is –2.08. (From the Central Limit Theorem, the test statistic formula is z = (x̄ – μX)/(σX/√n). For this problem, x̄ = 16, μX = 16.43 from the null hypothesis, σX = 0.8, and n = 15.) You can find the critical value for α = 0.05 in the normal table (see 15.Tables in the Table of Contents). The z-score for an area to the left equal to 0.05 is midway between –1.65 and –1.64 (0.05 is midway between 0.0505 and 0.0495). The z-score is –1.645. Since –1.645 > –2.08 (which demonstrates that α > p-value), reject H0. Traditionally, the decision to reject or not reject was done in this way. Today, comparing the two probabilities α and the p-value is very common. For this problem, the p-value, 0.0187, is considerably smaller than α, 0.05. You can be confident about your decision to reject. The graph shows α, the p-value, the test statistic, and the critical value.
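The critical-value comparison described in this note can be sketched in Python. Instead of reading the normal table, the sketch uses the inverse of the standard normal cumulative distribution; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z_test = -2.08   # test statistic from Example 9.14

# Critical value: the z-score with area alpha to its left
z_critical = NormalDist().inv_cdf(alpha)

# For a left-tailed test, reject H0 when the test statistic
# lies to the left of the critical value
reject = z_test < z_critical

print(f"critical value = {z_critical:.3f}, reject H0: {reject}")
```

Comparing z-scores in this way always gives the same decision as comparing α with the p-value, because both comparisons describe the same two areas under the curve.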
Example 9.15
Problem
A college football coach records the mean weight that the players can bench press as 275 pounds, with a standard deviation of 55 pounds. Three of the players thought that the mean weight was more than that amount. They asked 30 of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. The actual different weights were (frequencies are in parentheses) 205(3); 215(3); 225(1); 241(2); 252(2); 265(2); 275(2); 313(2); 316(5); 338(2); 341(1); 345(2); 368(2); 385(1).
Conduct a hypothesis test using a 2.5% level of significance to determine if the bench press mean is more than 275 pounds.
Solution
Set up the Hypothesis Test:
Since the problem is about a mean weight, this is a test of a single population mean.
H0: μ = 275
Ha: μ > 275
This is a right-tailed test.
Calculating the distribution needed:
Random variable: X̄ = the mean weight, in pounds, lifted by the football players.
Distribution for the test: It is normal because σ is known.
x̄ = 286.2 pounds (from the data).
σ = 55 pounds (Always use σ if you know it.) We assume μ = 275 pounds unless our data shows us otherwise.
Calculate the p-value using the normal distribution for a mean and using the sample mean as input (see Appendix G NOTEs for the TI-83, 83+, 84, 84+ Calculators for using the data as input):
p-value = P(x̄ > 286.2) = 0.1323.
Interpretation of the p-value: If H0 is true, then there is a 0.1323 probability (13.23%) that the football players can lift a mean weight of 286.2 pounds or more. Because a 13.23% chance is large enough, a mean weight lift of 286.2 pounds or more is not a rare event.
Compare α and the p-value:
α = 0.025 p-value = 0.1323
Make a decision: Since α < p-value, do not reject H0.
Conclusion: At the 2.5% level of significance, from the sample data, there is not sufficient evidence to conclude that the true mean weight lifted is more than 275 pounds.
Using the TI-83, 83+, 84, 84+ Calculator
Put the data and frequencies into lists. Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Data and press ENTER. Arrow down and enter 275 for μ0, 55 for σ, the name of the list where you put the data, and the name of the list where you put the frequencies. Arrow down to μ: and arrow over to > μ0. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p-value (p = 0.1331, a little different from the previous calculation, which used the sample mean rounded to one decimal place instead of the data) but it also calculates the test statistic (z-score) for the sample mean, the sample mean, and the sample standard deviation. μ > 275 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with z = 1.112 (test statistic) and p = 0.1331 (p-value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
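The small difference between the two p-values (one from the rounded sample mean, one from the raw data) can be reproduced with a short Python sketch. The helper `right_tail_p` and the variable names are hypothetical, introduced only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# (weight in pounds, frequency) pairs from the problem
data = [(205, 3), (215, 3), (225, 1), (241, 2), (252, 2), (265, 2), (275, 2),
        (313, 2), (316, 5), (338, 2), (341, 1), (345, 2), (368, 2), (385, 1)]

mu_0, sigma = 275, 55
n = sum(freq for _, freq in data)                       # sample size
x_bar = sum(weight * freq for weight, freq in data) / n  # sample mean


def right_tail_p(sample_mean):
    """Return (z, p-value) for a right-tailed z-test of the mean."""
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)


z_data, p_data = right_tail_p(x_bar)               # unrounded mean, as the calculator uses
z_round, p_round = right_tail_p(round(x_bar, 1))   # mean rounded to one decimal place

print(f"n = {n}, sample mean = {x_bar:.4f}")
print(f"from data:    z = {z_data:.3f}, p-value = {p_data:.4f}")
print(f"rounded mean: z = {z_round:.3f}, p-value = {p_round:.4f}")
```

Either p-value is far larger than α = 0.025, so the rounding does not affect the decision.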
Try It 9.15
A company records the mean time of employees working in a day. The mean comes out to be 475 minutes, with a standard deviation of 45 minutes. A manager recorded times of 20 employees. The times of working were (frequencies are in parentheses) 460(3); 465(2); 470(3); 475(1); 480(6); 485(3); 490(2).
Conduct a hypothesis test using a 2.5% level of significance to determine if the mean time is more than 475.
Example 9.16
Problem
Statistics students believe that the mean score on the first statistics test is 65. A statistics instructor thinks the mean score is higher than 65. He samples ten statistics students and obtains the scores 65; 65; 70; 67; 66; 63; 63; 68; 72; 71. He performs a hypothesis test using a 5% level of significance. The data are assumed to be from a normal distribution.
Solution
Set up the hypothesis test:
A 5% level of significance means that α = 0.05. This is a test of a single population mean.
H0: μ = 65 Ha: μ > 65
Since the instructor thinks the average score is higher, use a ">". The ">" means the test is right-tailed.
Determine the distribution needed:
Random variable: X̄ = average score on the first statistics test.
Distribution for the test: If you read the problem carefully, you will notice that there is no population standard deviation given. You are only given n = 10 sample data values. Notice also that the data come from a normal distribution. This means that the distribution for the test is a Student's t.
Use t with df = n – 1 degrees of freedom. Therefore, the distribution for the test is t₉, where n = 10 and df = 10 – 1 = 9.
Calculate the p-value using the Student's t-distribution:
p-value = P(x̄ > 67) = 0.0396 where the sample mean and sample standard deviation are calculated as 67 and 3.1972 from the data.
Interpretation of the p-value: If the null hypothesis is true, then there is a 0.0396 probability (3.96%) that the sample mean is 67 or more.
Compare α and the p-value:
Since α = 0.05 and p-value = 0.0396, α > p-value.
Make a decision: Since α > p-value, reject H0.
This means you reject μ = 65. In other words, you believe the average test score is greater than 65.
Conclusion: At a 5% level of significance, the sample data show sufficient evidence that the mean (average) test score is greater than 65, just as the statistics instructor thinks.
Using the TI-83, 83+, 84, 84+ Calculator
Put the data into a list. Press STAT and arrow over to TESTS. Press 2:T-Test. Arrow over to Data and press ENTER. Arrow down and enter 65 for μ0, the name of the list where you put the data, and 1 for Freq:. Arrow down to μ: and arrow over to > μ0. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p-value (p = 0.0396) but it also calculates the test statistic (t-score) for the sample mean, the sample mean, and the sample standard deviation. μ > 65 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with t = 1.9781 (test statistic) and p = 0.0396 (p-value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
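The t-test above can also be sketched in Python. The standard library has no Student's t-distribution, so this sketch makes an assumption of its own: it finds the tail area by integrating the t density numerically with Simpson's rule. The helpers `t_pdf` and `t_right_tail` are hypothetical names written for this illustration, not a library API.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    upper = 0.5 - area_0_to_a           # P(T > |t|), using symmetry about 0
    return upper if t >= 0 else 1 - upper


scores = [65, 65, 70, 67, 66, 63, 63, 68, 72, 71]
mu_0 = 65
n = len(scores)

x_bar = mean(scores)                      # sample mean
s = stdev(scores)                         # sample standard deviation (divides by n - 1)
t_stat = (x_bar - mu_0) / (s / sqrt(n))   # test statistic
p_value = t_right_tail(t_stat, df=n - 1)  # right-tailed test

print(f"mean = {x_bar}, s = {s:.4f}, t = {t_stat:.4f}, p-value = {p_value:.4f}")
```

If SciPy is installed, `scipy.stats.t.sf(t_stat, n - 1)` gives the same right-tail area without the hand-written integration.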
Try It 9.16
It is believed that a stock price for a particular company will grow at a rate of $5 per week with a standard deviation of $1. An investor believes the stock won’t grow as quickly. The changes in stock price are recorded for ten weeks and are as follows: $4, $3, $2, $3, $1, $7, $2, $1, $1, $2. Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, find the p-value, state your conclusion, and identify the Type I and Type II errors.
Example 9.17
Problem
Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50%. Joon samples 100 first-time brides and 53 reply that they are younger than their grooms. For the hypothesis test, she uses a 1% level of significance.
Solution
Set up the hypothesis test:
The 1% level of significance means that α = 0.01. This is a test of a single population proportion.
H0: p = 0.50 Ha: p ≠ 0.50
The words "is the same or different from" tell you this is a two-tailed test.
Calculate the distribution needed:
Random variable: P′ = the percent of first-time brides who are younger than their grooms.
Distribution for the test: The problem contains no mention of a mean. The information is given in terms of percentages. Use the distribution for P′, the estimated proportion.
Therefore, P′ ~ N(p, √(p·q/n)) = N(0.50, √((0.50)(0.50)/100)),
where p = 0.50, q = 1−p = 0.50, and n = 100
Calculate the p-value using the normal distribution for proportions:
p-value = P (p′ < 0.47 or p′ > 0.53) = 0.5485
where x = 53, p′ = x/n = 53/100 = 0.53.
Interpretation of the p-value: If the null hypothesis is true, there is a 0.5485 probability (54.85%) that the sample (estimated) proportion is 0.53 or more OR 0.47 or less (see the graph in Figure 9.10).
μ = p = 0.50 comes from H0, the null hypothesis.
p′ = 0.53. Since the curve is symmetrical and the test is two-tailed, the p′ for the left tail is equal to 0.50 – 0.03 = 0.47 where μ = p = 0.50. (0.03 is the difference between 0.53 and 0.50.)
Compare α and the p-value:
Since α = 0.01 and p-value = 0.5485, α < p-value.
Make a decision: Since α < p-value, you cannot reject H0.
Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of first-time brides who are younger than their grooms is different from 50%.
Using the TI-83, 83+, 84, 84+ Calculator
Press STAT and arrow over to TESTS. Press 5:1-PropZTest. Enter .5 for p0, 53 for x and 100 for n. Arrow down to Prop and arrow to not equals p0. Press ENTER. Arrow down to Calculate and press ENTER. The calculator calculates the p-value (p = 0.5485) and the test statistic (z-score). Prop not equals .5 is the alternate hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with z = 0.6 (test statistic) and p = 0.5485 (p-value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
The Type I and Type II errors are as follows:
The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true).
The Type II error is there is not enough evidence to conclude that the proportion of first time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)
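The two-tailed p-value in this example is the sum of two equal tail areas, one below 0.47 and one above 0.53. A minimal Python sketch of that calculation follows; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.50        # proportion under H0
x, n = 53, 100    # successes and sample size
alpha = 0.01

p_prime = x / n                    # sample proportion
se = sqrt(p_0 * (1 - p_0) / n)     # standard deviation of P' under H0
dist = NormalDist(mu=p_0, sigma=se)

# Mirror the observed distance from p_0 into both tails
distance = abs(p_prime - p_0)
left_area = dist.cdf(p_0 - distance)        # P(p' < 0.47)
right_area = 1 - dist.cdf(p_0 + distance)   # P(p' > 0.53)
p_value = left_area + right_area

z = (p_prime - p_0) / se
decision = "reject H0" if p_value < alpha else "do not reject H0"

print(f"z = {z:.1f}, p-value = {p_value:.4f}, decision: {decision}")
```

Because the normal curve is symmetric, the two tail areas are equal, which is why the p-value can also be found by doubling one tail.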
Try It 9.17
A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. The teacher performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.
First, determine what type of test this is, set up the hypothesis test, find the p-value, sketch the graph, and state your conclusion.
Example 9.18
Problem
Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three cell phones.
a. The value that helps determine the p-value is p′. Calculate p′.
b. What is a success for this problem?
c. What is the level of significance?
d. Draw the graph for this problem. Draw the horizontal axis. Label and shade appropriately.
Calculate the p-value.
e. Make a decision. _____________(Reject/Do not reject) H0 because____________.
Solution
Set up the Hypothesis Test:
H0: p = 0.30 Ha: p ≠ 0.30
Determine the distribution needed:
The random variable is P′ = proportion of households that have three cell phones.
The distribution for the hypothesis test is P′ ~ N(0.30, √((0.30)(0.70)/150)).
a. p′ = x/n, where x is the number of successes and n is the total number in the sample.
x = 43, n = 150
p′ = 43/150 ≈ 0.287
b. A success is having three cell phones in a household.
c. The level of significance is the preset α. Since α is not given, assume that α = 0.05.
d. p-value = 0.7216
e. Assuming that α = 0.05, α < p-value. The decision is do not reject H0 because there is not sufficient evidence to conclude that the proportion of households that have three cell phones is not 30%.
Try It 9.18
Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of whom 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypotheses, find the p-value, state your conclusion, and identify the Type I and Type II errors.
The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter p. The distribution for the test is normal. The estimated proportion p′ is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived α = 0.01, for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!
Example 9.19
Problem
My dog has so many fleas,
They do not come off with ease.
As for shampoo, I have tried many types
Even one called Bubble Hype,
Which only killed 25% of the fleas,
Unfortunately I was not pleased.
I've used all kinds of soap,
Until I had given up hope
Until one day I saw
An ad that put me in awe.
A shampoo used for dogs
Called GOOD ENOUGH to Clean a Hog
Guaranteed to kill more fleas.
I gave Fido a bath
And after doing the math
His number of fleas
Started dropping by 3's!
Before his shampoo
I counted 42.
At the end of his bath,
I redid the math
And the new shampoo had killed 17 fleas.
So now I was pleased.
Now it is time for you to have some fun
With the level of significance being .01,
You must help me figure out
Use the new shampoo or go without?
Solution
Set up the hypothesis test:
H0: p ≤ 0.25 Ha: p > 0.25
Determine the distribution needed:
In words, CLEARLY state what your random variable or P′ represents.
P′ = The proportion of fleas that are killed by the new shampoo
State the distribution to use for the test.
Normal: N(0.25, √((0.25)(1 – 0.25)/42))
Test Statistic: z = 2.3163
Calculate the p-value using the normal distribution for proportions:
p-value = 0.0103
In one to two complete sentences, explain what the p-value means for this problem.
If the null hypothesis is true (the proportion is 0.25), then there is a 0.0103 probability that the sample (estimated) proportion is 0.4048 or more.
Use the previous information to sketch a picture of this situation. CLEARLY, label and scale the horizontal axis and shade the region(s) corresponding to the p-value.
Compare α and the p-value:
Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using complete sentences.
alpha | decision | reason for decision |
---|---|---|
0.01 | Do not reject | α < p-value |
Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of fleas that are killed by the new shampoo is more than 25%.
Construct a 95% confidence interval for the true mean or proportion. Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval.
Confidence Interval: (0.26,0.55) We are 95% confident that the true population proportion p of fleas that are killed by the new shampoo is between 26% and 55%.
NOTE
This test result is not very definitive since the p-value is very close to alpha. In reality, one would probably do more tests by giving the dog another bath after the fleas have had a chance to return.
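The test and the confidence interval for the flea problem use slightly different standard errors, which a short Python sketch makes explicit. The sketch assumes the usual normal-approximation interval p′ ± z*·√(p′q′/n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.25       # kill rate of the old shampoo (H0 boundary)
x, n = 17, 42    # fleas killed, fleas counted before the bath
std_normal = NormalDist()

p_prime = x / n  # sample proportion

# Hypothesis test: the standard error is computed from p_0, the value assumed under H0
se_null = sqrt(p_0 * (1 - p_0) / n)
z = (p_prime - p_0) / se_null
p_value = 1 - std_normal.cdf(z)           # right-tailed test

# Confidence interval: the standard error is computed from p_prime, the estimate
se_hat = sqrt(p_prime * (1 - p_prime) / n)
z_star = std_normal.inv_cdf(0.975)        # leaves 2.5% in each tail for 95% confidence
lower = p_prime - z_star * se_hat
upper = p_prime + z_star * se_hat

print(f"p' = {p_prime:.4f}, z = {z:.4f}, p-value = {p_value:.4f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The interval lies entirely above 0.25 even though the test at α = 0.01 does not reject H0. The two results are not in conflict: a 95% interval corresponds to a less strict standard than a 1% significance level.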
Try It 9.19
A car soap gets rid of 30% of stains on the car. After adding a new compound to the soap, the soap is used on a car and found to wash 20 stains out of the 50 stains on the car. With the level of significance being 0.01, find out if adding the new compound to soap is beneficial.
Example 9.20
Problem
The National Institute of Standards and Technology provides exact data on conductivity properties of materials. Following are conductivity measurements for 11 randomly selected pieces of a particular type of glass.
1.11; 1.07; 1.11; 1.07; 1.12; 1.08; .98; .98; 1.02; .95; .95
Is there convincing evidence that the average conductivity of this type of glass is greater than one? Use a significance level of 0.05. Assume the population is normal.
Solution
Let’s follow a four-step process to answer this statistical question.
- State the Question: We need to determine if, at a 0.05 significance level, the average conductivity of the selected glass is greater than one. Our hypotheses will be
- H0: μ ≤ 1
- Ha: μ > 1
- Plan: We are testing a sample mean without a known population standard deviation. Therefore, we need to use a Student's t-distribution. Assume the underlying population is normal.
- Based on the sample of 11 data values shown above, the sample mean, sample standard deviation, and test statistic are calculated as follows: x̄ = 1.04, s ≈ 0.06588, and t = (x̄ – μ0)/(s/√n) = (1.04 – 1)/(0.06588/√11) ≈ 2.014.
To calculate the p-value, note that this is a right-tailed test. Then, find the area under the t-distribution to the right of the test statistic 2.014 (using 10 degrees of freedom). This area in the right tail is 0.036, and thus the p-value = 0.036.
- State the Conclusions: Since the p-value (p = 0.036) is less than our alpha value, we will reject the null hypothesis. It is reasonable to state that the data support the claim that the average conductivity level is greater than one.
Try It 9.20
The boiling point of a specific liquid is measured for 15 samples, and the boiling points are obtained as follows:
205; 206; 206; 202; 199; 194; 197; 198; 198; 201; 201; 202; 207; 211; 205
Is there convincing evidence that the average boiling point is greater than 200? Use a significance level of 0.1. Assume the population is normal.
Example 9.21
Problem
In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.
Solution
We will follow the four-step process.
- We need to conduct a hypothesis test on the claimed cancer rate. Our hypotheses will be
- H0: p ≤ 0.00034
- Ha: p > 0.00034
If we commit a Type I error, we are essentially accepting a false claim. Since the claim describes cancer-causing environments, we want to minimize the chances of incorrectly identifying causes of cancer.
- We will be testing a sample proportion with x = 172 and n = 420,019. The sample is sufficiently large because we have np = 420,019(0.00034) = 142.8, nq = 420,019(0.99966) = 419,876.2, two independent outcomes, and a fixed probability of success p = 0.00034. Thus we will be able to generalize our results to the population.
- The associated TI results give a p-value of 0.0073.
- Since the p-value = 0.0073 is greater than our alpha value = 0.005, we cannot reject the null. Therefore, we conclude that there is not enough evidence to support the claim of higher brain cancer rates for the cell phone users.
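The sample-size check and the test in this example can be sketched in Python. The cutoff of 5 for np and nq is the common rule of thumb for the normal approximation and is stated here as an assumption; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.00034        # brain cancer rate for non-cell phone users (0.0340%)
x, n = 172, 420_019  # cases observed, cell phone users studied
alpha = 0.005

# Conditions for the normal approximation (rule of thumb: both counts above 5)
expected_successes = n * p_0
expected_failures = n * (1 - p_0)
large_enough = expected_successes > 5 and expected_failures > 5

p_prime = x / n
z = (p_prime - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_value = 1 - NormalDist().cdf(z)   # right-tailed test

decision = "reject H0" if p_value < alpha else "do not reject H0"

print(f"np = {expected_successes:.1f}, nq = {expected_failures:.1f}, ok: {large_enough}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

At the more common α = 0.01 the same data would lead to rejecting H0, which shows how strongly the choice of significance level shapes the conclusion here.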
Try It 9.21
In a study of 390,000 moisturizer users, 138 of the subjects developed skin diseases. Test the claim that moisturizer users developed skin diseases at a greater rate than that for non-moisturizer users (the rate of skin diseases for non-moisturizer users is 0.041%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.
Example 9.22
Problem
Statistical data indicates that in a certain country there are approximately 268,608,618 residents aged 12 and older. For a certain period of time, statistical data also indicates that the number of residents with blood type AB negative (AB-) is 207,754. This translates into a percentage of 0.078% with this rather rare blood type. In a certain province of the country, there were 11 people with blood type AB- out of the population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the percentage of residents in the entire country with blood type AB- versus the percentage in the local province. Use a significance level of 0.01.
Solution
We will follow the four-step plan.
- We need to test whether the proportion of residents with AB- blood type in the local province is statistically different as compared to the proportion in the entire country.
- Since we are presented with proportions, we will use a one-proportion z-test. The hypotheses for the test will be:
- H0: p = 0.00078
- Ha: p ≠ 0.00078
- Note the sample proportion is p′ = 11/37,937 ≈ 0.00029.
The test statistic is calculated as z = -3.4189. To calculate the p-value, note that this is a two-tailed test. Find the area under the normal distribution to the left of the test statistic and then double this area. The area to the left of the test statistic is 0.000314, and this area doubled results in the p-value of 0.00063.
- Since the p-value, p = 0.00063, is less than the alpha level of 0.01, the sample data indicate that we should reject the null hypothesis. In conclusion, the sample data support the claim that the proportion of individuals with blood type AB- in the local province is different from the proportion of individuals in the entire country.
Try It 9.22
According to the U.S. Census, there are approximately 201,456,463 residents 20 and older. Statistics from the Criminal National Network indicate that, on average, 104,354 murders occur each year for people aged 20 and older. This translates into a percentage of murder of 0.052%. In Ohio, there were reported 127 murders for a population of 427,648. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local murder percentage and the national murder percentage. Use a significance level of 0.01.