Barbara Illowsky; Susan Dean

When conducting a hypothesis test that compares two independent population proportions, the following characteristics should be present:

The two samples are simple random samples that are independent of each other.
The number of successes is at least five, and the number of failures is at least five, for each of the samples.
A growing body of literature states that each population must be at least ten or 20 times the size of its sample. This keeps each population from being over-sampled, which would cause incorrect results.
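The conditions above can be checked before running a test. The following is a minimal sketch, not a standard routine: the function names, the default threshold of five, and the default factor of ten are illustrative choices taken from the wording of the list, and the sample counts are hypothetical.

```python
def meets_count_condition(successes, n, minimum=5):
    """Return True if a sample has at least `minimum` successes and failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum


def population_large_enough(population_size, n, factor=10):
    """Return True if the population is at least `factor` times the sample size."""
    return population_size >= factor * n


# Hypothetical samples used only to illustrate the checks.
print(meets_count_condition(20, 200))      # 20 successes and 180 failures
print(meets_count_condition(3, 50))        # only 3 successes
print(population_large_enough(5000, 200))  # compares 5000 with 10 * 200
```

Both samples in a comparison need to pass these checks; if either fails, the normal approximation used in this section may not be reliable.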

Comparing two proportions, like comparing two means, is common. If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance. A hypothesis test can help determine if a difference in the estimated proportions reflects a difference in the population proportions.

Like the case of differences in sample means, we construct a sampling distribution for differences in sample proportions:

$p'_{A} = \frac{X_{A}}{n_{A}}$ and $p'_{B} = \frac{X_{B}}{n_{B}}$ are the sample proportions for the two samples, where $X_{A}$ and $X_{B}$ are the numbers of successes and $n_{A}$ and $n_{B}$ are the sample sizes.

The difference of two proportions follows an approximate normal distribution. Generally, the null hypothesis states that the two proportions are the same. That is, H₀: p_A = p_B. To conduct the test, we use a pooled proportion, p_c.

The pooled proportion is calculated as follows:

$p_{c} = \frac{x_{A} + x_{B}}{n_{A} + n_{B}}$

The distribution for the differences is:

${P^{'}}_{A} - {P^{'}}_{B} \sim N [0, \sqrt{p_{c} (1 - p_{c}) (\frac{1}{n_{A}} + \frac{1}{n_{B}})}]$

The test statistic (z-score) is:

$z = \frac{({p^{'}}_{A} - {p^{'}}_{B}) - (p_{A} - p_{B})}{\sqrt{p_{c} (1 - p_{c}) (\frac{1}{n_{A}} + \frac{1}{n_{B}})}}$
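The pooled proportion, standard error, and z-score above can be combined in a few lines of code. The sketch below assumes Python's standard library (`statistics.NormalDist`) for the normal distribution; the function name `two_proportion_z_test`, its `tail` argument, and the counts in the usage lines are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_a, n_a, x_b, n_b, tail="two-sided"):
    """Pooled z-test of H0: p_A = p_B.

    tail: "two-sided" (Ha: p_A != p_B), "left" (Ha: p_A < p_B),
          or "right" (Ha: p_A > p_B).
    Returns the pair (z, p_value).
    """
    p_a = x_a / n_a                      # sample proportion for A
    p_b = x_b / n_b                      # sample proportion for B
    p_c = (x_a + x_b) / (n_a + n_b)      # pooled proportion
    se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))
    z = ((p_a - p_b) - 0) / se           # p_A - p_B = 0 under H0

    area_left = NormalDist().cdf(z)      # standard normal area left of z
    if tail == "left":
        p_value = area_left
    elif tail == "right":
        p_value = 1 - area_left
    elif tail == "two-sided":
        p_value = 2 * min(area_left, 1 - area_left)
    else:
        raise ValueError("tail must be 'two-sided', 'left', or 'right'")
    return z, p_value


# Hypothetical counts: 30 successes out of 100 versus 20 out of 100.
z, p_value = two_proportion_z_test(30, 100, 20, 100)
print(round(z, 3), round(p_value, 4))
```

The sketch does not guard against a pooled proportion of exactly 0 or 1, where the standard error is zero and the test is undefined.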

Example 10.8

Problem

Two types of medication for hives are being tested to determine if there is a difference in the proportions of adult patient reactions. Twenty out of a random sample of 200 adults given medication A still had hives 30 minutes after taking the medication. Twelve out of another random sample of 200 adults given medication B still had hives 30 minutes after taking the medication. Test at a 1% level of significance.

Solution

The problem asks for a difference in proportions, making it a test of two proportions.

Let A and B be the subscripts for medication A and medication B, respectively. Then p_A and p_B are the desired population proportions.

Random Variable: P′_A – P′_B = difference in the proportions of adult patients who did not react after 30 minutes to medication A and to medication B.

H₀: p_A = p_B

p_A – p_B = 0

H_a: p_A ≠ p_B

p_A – p_B ≠ 0

The words "is a difference" tell you the test is two-tailed.

Distribution for the test: Since this is a test of two binomial population proportions, the distribution is normal:

$p_{c} = \frac{x_{A} + x_{B}}{n_{A} + n_{B}} = \frac{20 + 12}{200 + 200} = 0.08$ and $1 - p_{c} = 0.92$

${P^{'}}_{A} - {P^{'}}_{B} \sim N [0, \sqrt{(0.08) (0.92) (\frac{1}{200} + \frac{1}{200})}]$

P′_A – P′_B follows an approximate normal distribution.

Calculate the p-value using the normal distribution: p-value = 0.1404.

Estimated proportion for group A: ${p^{'}}_{A} = \frac{x_{A}}{n_{A}} = \frac{20}{200} = 0.1$

Estimated proportion for group B: ${p^{'}}_{B} = \frac{x_{B}}{n_{B}} = \frac{12}{200} = 0.06$

Graph:

Normal distribution curve of the difference in the percentages of adult patients who don't react to medication A and B after 30 minutes. The mean is equal to zero, and the values -0.04, 0, and 0.04 are labeled on the horizontal axis. Two vertical lines extend from -0.04 and 0.04 to the curve. The region to the left of -0.04 and the region to the right of 0.04 are each shaded to represent 1/2(p-value) = 0.0702.

Figure 10.7

The observed difference is p′_A – p′_B = 0.1 – 0.06 = 0.04.

Half the p-value is below –0.04, and half is above 0.04.

Compare α and the p-value: α = 0.01 and the p-value = 0.1404. α < p-value.

Make a decision: Since α < p-value, do not reject H₀.

Conclusion: At a 1% level of significance, from the sample data, there is not sufficient evidence to conclude that there is a difference in the proportions of adult patients who did not react after 30 minutes to medication A and medication B.
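As a check on the reported p-value, the test statistic for this example can be worked out by hand from the values above, with p_A – p_B = 0 under H₀. The intermediate roundings shown are approximate:

```latex
z = \frac{(0.1 - 0.06) - 0}{\sqrt{(0.08)(0.92)\left(\frac{1}{200} + \frac{1}{200}\right)}}
  = \frac{0.04}{\sqrt{0.000736}}
  \approx \frac{0.04}{0.02713}
  \approx 1.474

\text{p-value} = 2 \, P(Z > 1.474) \approx 2(0.0702) = 0.1404
```

The factor of 2 reflects the two-tailed alternative: the area beyond 1.474 in the right tail is matched by an equal area in the left tail.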

Using the TI-83, 83+, 84, 84+ Calculator

Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 20 for x1, 200 for n1, 12 for x2, and 200 for n2. Arrow down to p1: and arrow to not equal p2. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is p = 0.1404 and the test statistic is 1.47. Do the procedure again, but instead of Calculate do Draw.

Try It 10.8

Two types of valves are being tested to determine if there is a difference in pressure tolerances. Fifteen out of a random sample of 100 of Valve A cracked under 4,500 psi. Six out of a random sample of 100 of Valve B cracked under 4,500 psi. Test at a 5% level of significance.

Example 10.9

Problem

A research study was conducted about gender differences regarding the use of seat belts in motor vehicles. The researcher believed that the proportion of women not wearing seat belts is less than the proportion of men not wearing seat belts. The data collected represents a random sample of U.S. adults and is summarized in Table 10.12. Is the proportion of women not wearing seat belts less than the proportion of men not wearing seat belts? Test at a 1% level of significance.

	Men	Women
Does not wear seat belts	183	156
Total number surveyed	2231	2169

Table 10.12

Solution

This is a test of two population proportions. Let M and F be the subscripts for men and women. Then p_M and p_F are the desired population proportions.

Random Variable: p′_F − p′_M = difference in the proportions of women and men who do not wear seat belts.

H₀: p_F = p_M, or equivalently H₀: p_F – p_M = 0

H_a: p_F < p_M, or equivalently H_a: p_F – p_M < 0

The words "less than" tell you the test is left-tailed.

Distribution for the test: Since this is a test of two population proportions, the distribution is normal:

$p_{c} = \frac{x_{F} + x_{M}}{n_{F} + n_{M}} = \frac{156 + 183}{2169 + 2231} = 0.077$
$1 - p_{c} = 0.923$
Therefore,
${p^{'}}_{F} - {p^{'}}_{M} \sim N (0, \sqrt{(0.077) (0.923) (\frac{1}{2169} + \frac{1}{2231})})$
p′_F – p′_M follows an approximate normal distribution.

Calculate the p-value using the normal distribution:
p-value = 0.1045
Estimated proportion for women: 0.0719
Estimated proportion for men: 0.082

Graph:

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.1045.

Figure 10.8

Decision: Since α < p-value, do not reject H₀.

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that the proportion of women not wearing seat belts is less than the proportion of men not wearing seat belts.
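The left-tailed calculation for the seat belt data can also be reproduced in software. This is a minimal sketch using Python's standard library rather than the calculator; the variable names are illustrative, and the counts come from Table 10.12.

```python
from math import sqrt
from statistics import NormalDist

# Counts of people who do not wear seat belts, from Table 10.12.
x_f, n_f = 156, 2169   # women
x_m, n_m = 183, 2231   # men

p_f = x_f / n_f                          # estimated proportion for women
p_m = x_m / n_m                          # estimated proportion for men
p_c = (x_f + x_m) / (n_f + n_m)          # pooled proportion

se = sqrt(p_c * (1 - p_c) * (1 / n_f + 1 / n_m))
z = (p_f - p_m) / se

# Left-tailed test: the p-value is the area to the left of z.
p_value = NormalDist().cdf(z)

alpha = 0.01
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Because the alternative hypothesis is p_F < p_M, the difference is taken as women minus men, so a negative z supports the alternative. The result should agree with the p-value of 0.1045 reported above.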

Try It 10.9

A survey was conducted about tea as a favorite beverage. The data collected is summarized in Table 10.13. Is the proportion of men favoring tea greater than the proportion of women favoring tea? Test at a 1% level of significance.

	Men	Women
Favor tea	16	18
Total surveyed	230	218

Table 10.13

Using the TI-83, 83+, 84, 84+ Calculator

Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 156 for x1, 2169 for n1, 183 for x2, and 2231 for n2. Arrow down to p1: and arrow to less than p2. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is P = 0.1045 and the test statistic is z = -1.256.

Example 10.10

Problem

A marketing firm claims that the proportion of younger adults who own electric vehicles is greater than the proportion of older adults who own electric vehicles. A random sample of U.S. adults was taken, and the results of the survey indicate the following:

Out of a sample of 232 older adults (aged 35 or older), 5% own electric vehicles.
Out of a sample of 1,343 young adults (aged 34 or younger), 10% own electric vehicles.

Test at the 5% level of significance. Is the proportion of younger adults greater than the proportion of older adults with respect to owning electric vehicles?

Solution

This is a test of two population proportions. Let Y and O be the subscripts for younger adults and older adults, respectively. Then p_Y and p_O are the desired population proportions.

Random Variable: p′_Y – p′_O = difference in the proportions of younger and older adults who own electric vehicles.

$H_{0} : p_{Y} = p_{O}$, or equivalently $H_{0} : p_{Y} - p_{O} = 0$

$H_{a} : p_{Y} > p_{O}$, or equivalently $H_{a} : p_{Y} - p_{O} > 0$

The words "greater than" indicate that the test is right-tailed.

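Distribution for the test: The distribution is approximately normal. The numbers of owners are recovered from the percentages: 10% of 1,343 is about 134 younger adults, and 5% of 232 is about 12 older adults.

$p_{c} = \frac{x_{Y} + x_{O}}{n_{Y} + n_{O}} = \frac{134 + 12}{1343 + 232} = 0.0927$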

$1 - p_{c} = 0.9073$

Therefore,

${p^{'}}_{Y} - {p^{'}}_{O} \sim N (0, \sqrt{(0.0927) (0.9073) (\frac{1}{1343} + \frac{1}{232})})$

${p^{'}}_{Y} - {p^{'}}_{O}$ follows an approximate normal distribution.

Calculate the p-value using the normal distribution:
p-value = 0.0077
Estimated proportion for group Y: 0.10
Estimated proportion for group O: 0.05

Graph:

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.0077.

Figure 10.9

Decision: Since  > p-value, reject the $H_{0}$ .

Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that a larger proportion of younger adults own electric vehicles as compared to older adults.
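The p-value in this example depends on whether the test statistic is built from the rounded percentages (0.10 and 0.05) or from the whole-number counts (134 and 12). The sketch below, which assumes Python's standard library and uses illustrative names such as `right_tail`, computes the right-tailed p-value both ways with the same pooled standard error.

```python
from math import sqrt
from statistics import NormalDist

n_y, n_o = 1343, 232     # sample sizes: younger, older
x_y, x_o = 134, 12       # counts used in the pooled proportion above

p_c = (x_y + x_o) / (n_y + n_o)
se = sqrt(p_c * (1 - p_c) * (1 / n_y + 1 / n_o))


def right_tail(diff):
    """Return (z, p-value) for a right-tailed test of the given difference."""
    z = diff / se
    return z, 1 - NormalDist().cdf(z)


# Difference taken from the rounded percentages stated in the problem.
z_rounded, p_rounded = right_tail(0.10 - 0.05)

# Difference taken from the whole-number counts.
z_counts, p_counts = right_tail(x_y / n_y - x_o / n_o)

print(f"rounded percentages: z = {z_rounded:.2f}, p-value = {p_rounded:.4f}")
print(f"whole-number counts: z = {z_counts:.2f}, p-value = {p_counts:.4f}")
```

The rounded percentages lead to the p-value of 0.0077 given above, while the counts give a test statistic near 2.33, as in the calculator steps. Either way the p-value is below α = 0.05, so the decision to reject H₀ does not change.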

Using the TI-83, 83+, 84, 84+ Calculator

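Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 134 for x1, 1343 for n1, 12 for x2, and 232 for n2. Arrow down to p1: and arrow to greater than p2. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is P = 0.0099 and the test statistic is z = 2.33. These differ slightly from the p-value of 0.0077 given earlier because the calculator works from the counts 134 and 12 rather than the rounded proportions 0.10 and 0.05. The decision at the 5% level is the same.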

Try It 10.10

A government researcher is investigating whether there is a difference in the use of helmets by motorcyclists in different geographic regions for those states where use of helmets is required by law. The research shows that for motorcyclists in the northeast U.S., 7,622 out of 113,231 motorcyclists did not wear helmets. In the southeast U.S., 7,439 out of 104,873 motorcyclists did not wear helmets. Test at a 5% significance level. Answer the following questions:

a. Is this a test of two means or two proportions?

b. Which distribution do you use to perform the test?

c. What is the random variable?

d. What are the null and alternative hypotheses? Write the null and alternative hypotheses in symbols.

e. Is this test right-, left-, or two-tailed?

f. What is the p-value?

g. Do you reject or not reject the null hypothesis?

h. At the ___ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ____________.

10.3 Comparing Two Independent Population Proportions
