Statistics

# Solutions

1.

soccer = 12/40 = 0.3;

lacrosse = 8/40 = 0.2

2.

women who play soccer = 8/20 = 0.4;

women who play basketball = 8/20 = 0.4;

women who play lacrosse = 4/20 = 0.2
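
The proportions in answers 1 and 2 are each a count divided by a group total. A minimal sketch of that arithmetic follows; the function name `relative_frequency` and the dictionary layout are illustrative choices, while the counts are the ones stated in the answers.

```python
from fractions import Fraction

def relative_frequency(count, total):
    """Return count / total as an exact fraction."""
    return Fraction(count, total)

# Counts as stated in answer 2: 20 women in total.
women = {"soccer": 8, "basketball": 8, "lacrosse": 4}
total_women = 20

for sport, count in women.items():
    rf = relative_frequency(count, total_women)
    # Convert to a decimal only for display.
    print(f"women who play {sport} = {count}/{total_women} = {float(rf)}")
```

Because the three categories cover all 20 women, the three proportions add up to 1, which is a quick check on the arithmetic.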

3.

patients with the virus

5.

The average length of time (in months) patients live after treatment.

7.

X = the length of time (in months) patients live after treatment

9.

b

11.

a

13.
1. .5242
2. .03 percent
3. 6.86 percent
4. $\frac{823{,}088}{823{,}856}$
5. quantitative discrete
6. quantitative continuous
7. In both years, underwater earthquakes produced massive tsunamis.
8. Answers may vary. Sample answer: A bar graph with one bar for each year, in order, would be best since it would show the change in the number of deaths from year to year. In my presentation, I would point out that the scale of the graph is in thousands, and I would discuss which specific earthquakes were responsible for the greatest numbers of deaths in those years.
15.

systematic

17.

simple random

19.

values for X, such as 3, 4, 11, and so on

21.

No, we do not have enough information to make such a claim.

23.

Take a simple random sample from each group. One way is by assigning a number to each patient and using a random number generator to randomly select patients.
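
One possible way to carry out the approach in answer 23 is sketched below. The group names, the patient ID numbers, and the sample size of 5 are hypothetical placeholders, not values from the exercise; Python's standard `random.sample` plays the role of the random number generator.

```python
import random

def sample_each_group(groups, n_per_group, seed=None):
    """Draw a simple random sample (without replacement) from every group."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return {name: rng.sample(ids, n_per_group) for name, ids in groups.items()}

# Hypothetical setup: each patient has been assigned a number.
groups = {
    "group_a": list(range(1, 51)),    # patients numbered 1 to 50
    "group_b": list(range(51, 101)),  # patients numbered 51 to 100
}

selected = sample_each_group(groups, n_per_group=5, seed=1)
for name, ids in selected.items():
    print(name, sorted(ids))
```

Sampling without replacement guarantees that no patient is selected twice within a group.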

25.

This would be convenience sampling and is not random.

27.

Yes, the sample size of 150 would be large enough to reflect a population of one school.

29.

Even though the specific data support each researcher’s conclusions, the different results suggest that more data need to be collected before the researchers can reach a conclusion.

30.

Answers may vary. Sample answer: A pie graph would be best for showing the percentage of students that fall into each Hours Played category. A bar graph would be more desirable if knowing the total numbers of students in each category is important. I would be sure that the colors used on the two pie graphs are the same for each category and are clearly distinguishable when displayed. The percentages should be legible, and the pie graph should be large enough to show the smaller sections clearly. For the bar graph, I would display the bars in chronological order and make sure that the colors used for each researcher’s data are clearly distinguishable. The numbers and the scale should be legible and clear when the bar graph is displayed.

32.

There is not enough information given to judge if either one is correct or incorrect.

34.

The software program seems to work because the second study shows that more patients improve while using the software than not. Even though the difference is not as large as that in the first study, the results from the second study are likely more reliable and still show improvement.

36.

Yes, because we cannot tell if the improvement was due to the software or the exercise; the data are confounded, and a reliable conclusion cannot be drawn. New studies should be performed.

38.

No, even though the sample is large enough, the fact that the sample consists of volunteers makes it a self-selected sample, which is not reliable.

40.

No, even though the sample is a large portion of the population, two responses are not enough to justify any conclusions. Because the population is so small, it would be better to include everyone in the population to get the most accurate data.

42.
1. ordinal
2. interval
3. nominal
4. nominal
5. ratio
6. ordinal
7. nominal
8. interval
9. ratio
10. interval
11. ratio
12. ordinal
44.
1. Inmates may not feel comfortable refusing participation, or may feel obligated to take advantage of the promised benefits. They may not feel truly free to refuse participation.
2. Parents can provide consent on behalf of their children, but children are not competent to provide consent for themselves.
3. All risks and benefits must be clearly outlined. Study participants must be informed of relevant aspects of the study in order to give appropriate consent.
45.
1. statistical model: The time any journey takes from New York to Florida is variable and depends on traffic and other driving conditions.
2. statistical model: Although trains try to leave on time, the exact time of departure differs slightly from day to day.
3. mathematical model: The distance from your house to school is the same every day and can be precisely determined.
4. statistical model: The temperature of a refrigerator fluctuates as the compressor turns on and off.
5. statistical model: The fill weight of a bag of rice is different for each bag. Manufacturers spend considerable effort to minimize the variance from bag to bag.
47.
1. all children who take ski or snowboard lessons
2. a group of these children
3. the population mean age of children who take their first snowboard lesson
4. the sample mean age of children who take their first snowboard lesson
5. X = the age of one child who takes his or her first ski or snowboard lesson
6. values for X, such as 3, 7, and so on
49.
1. the clients of the insurance companies
2. a group of the clients
3. the mean health costs of the clients
4. the mean health costs of the sample
5. X = the health costs of one client
6. values for X, such as 34, 9, 82, and so on
51.
1. all the clients of this counselor
2. a group of clients of this marriage counselor
3. the proportion of all her clients who stay married
4. the proportion of the sample of the counselor’s clients who stay married
5. X = the number of couples who stay married
6. yes, no
53.
1. all people (maybe in a certain geographic area, such as the United States)
2. a group of the people
3. the proportion of all people who will buy the product
4. the proportion of the sample who will buy the product
5. X = the number of people who will buy it
55.

a

57.

quantitative discrete, 150

59.

qualitative, Oakland A’s

61.

quantitative discrete, 11,234 students

63.

qualitative, Crest

65.

quantitative continuous, 47.3 years

67.

b

69.
1. The survey was conducted using six similar flights.
The survey would not be a true representation of the entire population of air travelers.
Conducting the survey on a holiday weekend will not produce representative results.
2. Conduct the survey during different times of the year.
Conduct the survey using flights to and from various locations.
Conduct the survey on different days of the week.
71.

Answers will vary. Sample answer: You could use a systematic sampling method. Stop the tenth person as they leave one of the buildings on campus at 9:50 in the morning. Then stop the tenth person as they leave a different building on campus at 1:50 in the afternoon.
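
The "every tenth person" rule in this sample answer can be sketched in code. This is only an illustration: the list of 30 numbered people is hypothetical, and the function name `every_kth` is an assumption made for the example.

```python
def every_kth(people, k):
    """Select the k-th, 2k-th, 3k-th, ... element of an ordered list."""
    return people[k - 1::k]

# Hypothetical stream of people leaving a building, numbered in order.
people_leaving = list(range(1, 31))

print(every_kth(people_leaving, 10))
```

Many textbook descriptions of systematic sampling also pick the starting point at random; this sketch keeps the fixed start described in the answer.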

73.

Answers will vary. Sample answer: Many people will not respond to mail surveys. If they do respond to the surveys, you can’t be sure who is responding. In addition, mailing lists can be incomplete.

75.

b

77.

1. convenience
2. cluster
3. stratified
4. systematic
5. simple random

79.
1. qualitative
2. quantitative discrete
3. quantitative discrete
4. qualitative
81.

Causality: The fact that two variables are related does not guarantee that one variable is influencing the other. We cannot assume that crime rate impacts education level or that education level impacts crime rate.

Confounding: There are many factors that define a community other than education level and crime rate. Communities with high crime rates and high education levels may have other lurking variables that distinguish them from communities with lower crime rates and lower education levels. Because we cannot isolate these variables of interest, we cannot draw valid conclusions about the connection between education and crime. Possible lurking variables include police expenditures, unemployment levels, region, average age, and size.

83.
1. Possible reasons: increased use of caller ID, decreased use of landlines, increased use of private numbers, voice mail, privacy managers, hectic nature of personal schedules, decreased willingness to be interviewed
2. When a large number of people refuse to participate, the sample may not have the same characteristics as the population. Perhaps the majority of people willing to participate are doing so because they feel strongly about the subject of the survey.
85.

1.

   | # Flossing per Week | Frequency | Relative Frequency | Cumulative Relative Frequency |
   | --- | --- | --- | --- |
   | 0 | 27 | .4500 | .4500 |
   | 1 | 18 | .3000 | .7500 |
   | 3 | 11 | .1833 | .9333 |
   | 6 | 3 | .0500 | .9833 |
   | 7 | 1 | .0167 | 1 |

   Table 1.44
2. 5.00 percent
3. 93.33 percent
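
The relative and cumulative relative frequencies in Table 1.44 can be reproduced from the frequency column alone. The sketch below uses the frequencies given in the table; the function name `relative_frequency_table` is an illustrative choice.

```python
from itertools import accumulate

def relative_frequency_table(freqs):
    """Return (relative, cumulative relative) frequencies for a list of counts."""
    total = sum(freqs)
    relative = [f / total for f in freqs]
    # Accumulate the integer counts first, then divide, to avoid
    # compounding floating-point rounding error.
    cumulative = [c / total for c in accumulate(freqs)]
    return relative, cumulative

values = [0, 1, 3, 6, 7]         # number of times flossing per week
freqs = [27, 18, 11, 3, 1]       # frequencies from Table 1.44

relative, cumulative = relative_frequency_table(freqs)
for v, f, r, c in zip(values, freqs, relative, cumulative):
    print(f"{v:>2} {f:>3} {r:.4f} {c:.4f}")
```

Multiplying a table entry by 100 gives the matching percentage; for example, .0500 corresponds to 5.00 percent and .9333 to 93.33 percent.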
87.

The sum of the travel times is 1,173.1. Divide the sum by 50 to calculate the mean value: 23.462. Because each state’s travel time was measured to the nearest tenth, round this calculation to the nearest hundredth: 23.46.
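
The calculation in answer 87 can be checked with a few lines of code. The sum and the count of 50 come from the answer; the variable names are illustrative.

```python
total_travel_time = 1173.1   # sum of the travel times, as stated in the answer
n_states = 50                # number of values averaged

mean_travel_time = total_travel_time / n_states

# The data were measured to the nearest tenth, so the mean is reported
# with one more decimal place (the nearest hundredth).
rounded_mean = round(mean_travel_time, 2)

print(f"mean = {mean_travel_time:.3f}, rounded = {rounded_mean:.2f}")
```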

89.

b

91.

Explanatory variable: amount of sleep
Response variable: performance measured in assigned tasks
Treatments: normal sleep and 27 hours of total sleep deprivation
Experimental Units: 19 professional drivers
Lurking variables: none – all drivers participated in both treatments
Random assignment: treatments were assigned in random order; this eliminated the effect of any learning that may take place during the first experimental session
Control/Placebo: completing the experimental session under normal sleep conditions
Blinding: researchers evaluating subjects’ performance must not know which treatment is being applied at the time

93.

You cannot assume that the numbers of complaints reflect the quality of the airlines. The airlines shown with the greatest number of complaints are the ones with the most passengers. You must consider the appropriateness of methods for presenting data; in this case displaying totals is misleading.

94.

He can observe a population of 100 college students on campus. He can collect data about the temperature of their dorm rooms and track how many of them catch a cold. If he uses a survey, the temperature of the dorm rooms can be determined from the survey. He can also ask them to self-report when they catch a cold.

96.

Answers will vary. Sample answer: The sample is not representative of the population of all college textbooks. Two reasons why it is not representative are that he only sampled seven subjects and he only investigated one textbook in each subject. There are several possible sources of bias in the study. The seven subjects that he investigated are all in mathematics and the sciences; there are many subjects in the humanities, social sciences, and other subject areas, for example: literature, art, history, psychology, sociology, business, that he did not investigate at all. It may be that different subject areas exhibit different patterns of textbook availability, but his sample would not detect such results.

He also looked only at the most popular textbook in each of the subjects he investigated. The availability of the most popular textbooks may differ from the availability of other textbooks in one of two ways:

• The most popular textbooks may be more readily available online, because more new copies are printed, and more students nationwide are selling back their used copies.
• The most popular textbooks may be harder to find available online, because more student demand exhausts the supply more quickly.

In reality, many college students do not use the most popular textbook in their subject, and this study gives no useful information about the situation for those less popular textbooks.

He could improve this study by

• expanding the selection of subjects he investigates so that it is more representative of all subjects studied by college students, and
• expanding the selection of textbooks he investigates within each subject to include a mixed representation of both the most popular and less popular textbooks.