Venn Diagrams

A Venn diagram is a picture that represents the outcomes of an experiment. It generally consists of a box that represents the sample space S together with circles or ovals. The circles or ovals represent events. Venn diagrams also help us to convert common English words into mathematical terms that help add precision.

Venn diagrams are named for their inventor, John Venn, a mathematics professor at Cambridge and an Anglican minister. His main work was conducted during the late 1870s and gave rise to a whole branch of mathematics and a new way to approach issues of logic. We will use Venn diagrams to develop the probability rules just covered, including the Addition Rule, the Multiplication Rule, the Complement Rule, Independence, and Conditional Probability.

Example 3.27

Suppose an experiment has the outcomes 1, 2, 3, ... , 12 where each outcome has an equal chance of occurring. Let event A = {1, 2, 3, 4, 5, 6} and event B = {6, 7, 8, 9}. Then A intersect B = A ∩ B = {6} and A union B = A ∪ B = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The Venn diagram is as follows:

A Venn diagram. An oval representing set A contains the values 1, 2, 3, 4, 5, and 6. An oval representing set B also contains the 6, along with 7, 8, and 9. The values 10, 11, and 12 are present but not contained in either set.
Figure 3.6

Figure 3.6 shows the most basic relationship among these numbers. First, the numbers are in groups called sets: set A and set B. Some numbers are in both sets; we say they are in set A and in set B. The English word "and" is inclusive here: it means having the characteristics of both A and B, or in this case, being a part of both A and B. This condition is called the INTERSECTION of the two sets. All members that are part of both sets constitute the intersection of the two sets. The intersection is written as A ∩ B, where ∩ is the mathematical symbol for intersection. The statement A ∩ B is read as "A intersect B." You can remember this by thinking of the intersection of two streets.

There are also those numbers that belong to at least one of the two groups. Such a number does not have to be in BOTH groups; it only has to be in one of the two. These numbers are called the UNION of the two sets, and in this case they are the numbers 1-5 (from set A exclusively), 7-9 (from set B exclusively), and also 6, which is in both sets A and B. The symbol for the UNION is ∪, thus A ∪ B = the numbers 1-9, which excludes the numbers 10, 11, and 12. The values 10, 11, and 12 are part of the universe, but are not in either of the two sets.

Translating the English word "AND" into the mathematical logic symbol ∩, intersection, and the word "OR" into the mathematical symbol ∪, union, provides a very precise way to discuss the issues of probability and logic. The general terminology for the three areas of the Venn diagram in Figure 3.6 is shown in Figure 3.7.
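
As a minimal sketch, Example 3.27 can be reproduced with Python's built-in set type, whose & and | operators correspond to ∩ and ∪. The variable names below are illustrative and are not part of the original example.

```python
from fractions import Fraction

# Sample space and events from Example 3.27
S = set(range(1, 13))          # outcomes 1, 2, ..., 12
A = {1, 2, 3, 4, 5, 6}
B = {6, 7, 8, 9}

intersection = A & B           # A ∩ B: outcomes in both A AND B
union = A | B                  # A ∪ B: outcomes in A OR B (or both)
neither = S - union            # in the universe but in neither set

# With equally likely outcomes, probability is a proportion of counts
p_union = Fraction(len(union), len(S))

print(sorted(intersection))    # [6]
print(sorted(union))           # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sorted(neither))         # [10, 11, 12]
print(p_union)                 # 3/4
```

Because every outcome is equally likely, counting the members of a region of the Venn diagram and dividing by the size of S gives the probability of that region.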

Try It 3.27

Suppose an experiment has outcomes black, white, red, orange, yellow, green, blue, and purple, where each outcome has an equal chance of occurring. Let event C = {green, blue, purple} and event P = {red, yellow, blue}. Then C ∩ P = {blue} and C ∪ P = {green, blue, purple, red, yellow}. Draw a Venn diagram representing this situation.

Example 3.28

Flip two fair coins. Let A = tails on the first coin. Let B = tails on the second coin. Then A = {TT, TH} and B = {TT, HT}. Therefore, A ∩ B = {TT} and A ∪ B = {TH, TT, HT}.

The sample space when you flip two fair coins is S = {HH, HT, TH, TT}. The outcome HH is in NEITHER A NOR B. The Venn diagram is as follows:

This is a venn diagram. An oval representing set A contains Tails + Heads and Tails + Tails. An oval representing set B also contains Tails + Tails, along with Heads + Tails. The universe S contains Heads + Heads, but this value is not contained in either set A or B.
Figure 3.7
Try It 3.28

Roll a fair, six-sided die. Let A = a prime number of dots is rolled. Let B = an odd number of dots is rolled. Then A = {2, 3, 5} and B = {1, 3, 5}. Therefore, A ∩ B = {3, 5} and A ∪ B = {1, 2, 3, 5}. The sample space for rolling a fair die is S = {1, 2, 3, 4, 5, 6}. Draw a Venn diagram representing this situation.

Example 3.29

A person with type O blood and a negative Rh factor (Rh-) can donate blood to any person with any blood type. Four percent of African Americans have type O blood and a negative Rh factor, 5−10% of African Americans have the Rh- factor, and 51% have type O blood.

This is an empty Venn diagram showing two overlapping circles. The left circle is labeled O and the right circle is labeled RH-.
Figure 3.8

The “O” circle represents the African Americans with type O blood. The “Rh-” oval represents the African Americans with the Rh- factor.

We will take the average of 5% and 10% and use 7.5% as the percent of African Americans who have the Rh- factor. Let O = African American with Type O blood and R = African American with Rh- factor.

  1. P(O) = ___________
  2. P(R) = ___________
  3. P(O ∩ R) = ___________
  4. P(O ∪ R) = ___________
  5. In the Venn Diagram, describe the overlapping area using a complete sentence.
  6. In the Venn Diagram, describe the area in the rectangle but outside both the circle and the oval using a complete sentence.
Solution 3.29

1. 0.51; 2. 0.075; 3. 0.04; 4. 0.545; 5. The area represents the African Americans that have type O blood and the Rh- factor. 6. The area represents the African Americans that have neither type O blood nor the Rh- factor.
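
The value 0.545 follows from the addition rule applied to the three given probabilities. A short sketch of that arithmetic is shown here; the variable names are illustrative, and the last quantity (the probability of the region outside both shapes) is derived from the complement rule rather than stated in the solution.

```python
# Values given in Example 3.29 (Rh- taken as the 7.5% midpoint of 5% and 10%)
p_O = 0.51          # P(O): type O blood
p_R = 0.075         # P(R): Rh- factor
p_O_and_R = 0.04    # P(O ∩ R): type O blood AND Rh- factor

# Addition rule: P(O ∪ R) = P(O) + P(R) − P(O ∩ R)
p_O_or_R = p_O + p_R - p_O_and_R

# Complement rule (derived here, not part of the printed solution):
# probability of neither type O blood nor the Rh- factor
p_neither = 1 - p_O_or_R

print(round(p_O_or_R, 3))    # 0.545
print(round(p_neither, 3))   # 0.455
```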

Example 3.30

Forty percent of the students at a local college belong to a club and 50% work part time. Five percent of the students work part time and belong to a club. Draw a Venn diagram showing the relationships. Let C = student belongs to a club and PT = student works part time.

This is a venn diagram with one set containing students in clubs and another set containing students working  part-time. Both sets share students who are members of clubs and also work part-time. The universe is labeled S.
Figure 3.9

If a student is selected at random, find

  • the probability that the student belongs to a club. P(C) = 0.40
  • the probability that the student works part time. P(PT) = 0.50
  • the probability that the student belongs to a club AND works part time. P(C ∩ PT) = 0.05
  • the probability that the student belongs to a club given that the student works part time. P(C|PT) = P(C ∩ PT) / P(PT) = 0.05 / 0.50 = 0.1
  • the probability that the student belongs to a club OR works part time. P(C ∪ PT) = P(C) + P(PT) − P(C ∩ PT) = 0.40 + 0.50 − 0.05 = 0.85

In order to solve Example 3.30 we had to draw upon the concept of conditional probability from the previous section. There we used tree diagrams to track the changes in the probabilities, because the sample space changed as we drew without replacement. In short, conditional probability is the chance that something will happen given that some other event has already happened. Put another way, it is the probability that something will happen conditioned upon the situation that something else is also true. In Example 3.30 the probability P(C|PT) is the conditional probability that the randomly drawn student is a member of the club, conditioned upon the fact that the student also is working part time. This allows us to see the relationship between Venn diagrams and the probability postulates.
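
The calculations in Example 3.30 can be sketched in a few lines of Python. Exact fractions are used to avoid floating-point rounding; the variable names are illustrative only.

```python
from fractions import Fraction

# Probabilities given in Example 3.30
p_C = Fraction(40, 100)         # P(C): student belongs to a club
p_PT = Fraction(50, 100)        # P(PT): student works part time
p_C_and_PT = Fraction(5, 100)   # P(C ∩ PT): both

# Conditional probability: P(C|PT) = P(C ∩ PT) / P(PT)
p_C_given_PT = p_C_and_PT / p_PT

# Addition rule: P(C ∪ PT) = P(C) + P(PT) − P(C ∩ PT)
p_C_or_PT = p_C + p_PT - p_C_and_PT

print(float(p_C_given_PT))   # 0.1
print(float(p_C_or_PT))      # 0.85
```

Dividing by P(PT) restricts attention to the part-time circle of the Venn diagram: of the 50% of students inside it, the 5% in the overlap make up one tenth.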

Try It 3.30

Fifty percent of the workers at a factory work a second job, 25% have a spouse who also works, 5% work a second job and have a spouse who also works. Draw a Venn diagram showing the relationships. Let W = works a second job and S = spouse also works.

Try It 3.30

In a bookstore, the probability that the customer buys a novel is 0.6, and the probability that the customer buys a non-fiction book is 0.4. Suppose that the probability that the customer buys both is 0.2.

  1. Draw a Venn diagram representing the situation.
  2. Find the probability that the customer buys either a novel or a non-fiction book.
  3. In the Venn diagram, describe the overlapping area using a complete sentence.
  4. Suppose that some customers buy only compact disks. Draw an oval in your Venn diagram representing this event.

Example 3.31

A set of 20 German Shepherd dogs is observed. 12 are male, 8 are female, 10 have some brown coloring, and 5 have some white sections of fur. Answer the following using Venn Diagrams.

Draw a Venn diagram simply showing the sets of male and female dogs.

Solution 3.31

The Venn diagram below demonstrates the situation of mutually exclusive events. If a dog cannot be both male and female, then there is no intersection. Being male precludes being female and being female precludes being male: in this case, the events male and female are therefore mutually exclusive. A Venn diagram shows this as two sets with no intersection. The intersection is said to be the null set, written with the mathematical symbol ∅.

Figure 3.10

Draw a second Venn diagram illustrating that 10 of the male dogs have brown coloring.

Solution 3.31

The Venn diagram below shows the overlap between male and brown where the number 10 is placed in it. This represents Male ∩ Brown: both male and brown. This is the intersection of these two characteristics. To get the union of Male and Brown, add the two circled areas and subtract the overlap. In proper terms, Male ∪ Brown = Male + Brown − Male ∩ Brown will give us the number of dogs in the union of these two sets. If we did not subtract the intersection, we would have double counted some of the dogs.

Figure 3.11

Now draw a situation depicting a scenario in which the non-shaded region represents "No white fur and female," or White fur′ ∩ Female. The prime after "White fur" indicates "not white fur." The prime after a set means not in that set, e.g. A′ means not A. Sometimes, the notation used is a line above the letter. For example, Ā = A′.

Solution 3.31
Figure 3.12

The Addition Rule of Probability

We met the addition rule earlier but without the help of Venn diagrams. Venn diagrams help visualize the counting process that is inherent in the calculation of probability. To restate the Addition Rule of Probability:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Remember that probability is simply the proportion of the objects we are interested in relative to the total number of objects. This is why we can see the usefulness of the Venn diagrams. Example 3.31 shows how we can use Venn diagrams to count the number of dogs in the union of brown and male by reminding us to subtract the intersection of brown and male. We can see the effect of this directly on probabilities in the addition rule.

Example 3.32

Let's sample 50 students who are in a statistics class. 20 are freshmen and 30 are sophomores. 15 students get a "B" in the course, and 5 students both get a "B" and are freshmen.

Find the probability of selecting a student who either earns a "B" OR is a freshman. We are translating the word OR to the mathematical symbol ∪ for the addition rule, which is the union of the two sets.

Solution 3.32

We know that there are 50 students in our sample, so we know the denominator of our fraction to give us probability. We need only to find the number of students that meet the characteristics we are interested in, i.e. any freshman and any student who earned a grade of "B." With the Addition Rule of probability, we can skip directly to probabilities.

Let A = the event that the student is a freshman, and let B = the event that the student earns a grade of "B." Below we can see the process for using Venn diagrams to solve this.

P(A) = 20/50 = 0.40, P(B) = 15/50 = 0.30, and P(A ∩ B) = 5/50 = 0.10.

Therefore, P(A ∪ B) = 0.40 + 0.30 − 0.10 = 0.60.

Figure 3.13
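
A brief sketch, using the counts from Example 3.32, shows that counting the students in the union and applying the Addition Rule to probabilities give the same answer. The variable names are illustrative.

```python
from fractions import Fraction

# Counts given in Example 3.32
n_students = 50
n_freshmen = 20      # event A: student is a freshman
n_grade_b = 15       # event B: student earns a "B"
n_both = 5           # A ∩ B: freshmen who earn a "B"

# Route 1: count the union, subtracting the overlap so it is not counted twice
n_union = n_freshmen + n_grade_b - n_both
p_union_from_counts = Fraction(n_union, n_students)

# Route 2: Addition Rule applied directly to probabilities
p_A = Fraction(n_freshmen, n_students)
p_B = Fraction(n_grade_b, n_students)
p_A_and_B = Fraction(n_both, n_students)
p_union = p_A + p_B - p_A_and_B

print(n_union)                          # 30
print(float(p_union))                   # 0.6
print(p_union == p_union_from_counts)   # True
```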

If two events are mutually exclusive, then, like the example where we diagram the male and female dogs, the addition rule is simplified to just P(A ∪ B) = P(A) + P(B) − 0. This is true because, as we saw earlier, the intersection of mutually exclusive events is the null set, ∅. The diagrams below demonstrate this.

Figure 3.14

The Multiplication Rule of Probability

Restating the Multiplication Rule of Probability using the notation of Venn diagrams, we have:

P(A ∩ B) = P(A|B) ⋅ P(B)

The multiplication rule can be modified with a bit of algebra into the following conditional rule. Venn diagrams can then be used to demonstrate the process.

The conditional rule: P(A|B) = P(A ∩ B) / P(B)

Using the same facts from Example 3.32 above, find the probability that a student is a freshman given that the student earned a "B."

P(A|B) = 0.10/0.30 = 1/3
Figure 3.15
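
The conditional rule can be sketched as a small helper function. The function name `conditional` is hypothetical, and the second calculation, P(B|A), is not in the text; it is included only to show that the order of conditioning changes the denominator and therefore the answer.

```python
from fractions import Fraction

# Probabilities from Example 3.32
p_A = Fraction(20, 50)        # A: student is a freshman
p_B = Fraction(15, 50)        # B: student earns a "B"
p_A_and_B = Fraction(5, 50)   # A ∩ B

def conditional(p_joint, p_given):
    """Conditional rule: P(X|Y) = P(X ∩ Y) / P(Y), defined only when P(Y) > 0."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_given

p_A_given_B = conditional(p_A_and_B, p_B)   # freshman, given a "B"
p_B_given_A = conditional(p_A_and_B, p_A)   # a "B", given freshman (illustration only)

print(p_A_given_B)   # 1/3
print(p_B_given_A)   # 1/4
```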

The multiplication rule must also be altered if the two events are independent. Independent events are defined as a situation where the conditional probability is simply the probability of the event of interest. Formally, independence of events is defined as P(A|B) = P(A) or P(B|A) = P(B). When flipping coins, the outcome of the second flip is independent of the outcome of the first flip; coins do not have memory. The Multiplication Rule of Probability for independent events thus becomes:

P(A ∩ B) = P(A) ⋅ P(B)

One easy way to remember this is to consider what we mean by the word "and." We see that the Multiplication Rule has translated the word "and" to the Venn notation for intersection. Therefore, the outcome must meet the two conditions of freshman and grade of "B" in the above example. It is harder, that is, less probable, to meet two conditions at once than to meet just one of them. We can see the logic of the Multiplication Rule of probability in the fact that multiplying two fractions between 0 and 1 gives a result no larger than either fraction.
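
As an illustration of the rule for independent events, the two-coin experiment from Example 3.28 can be enumerated directly. This is a sketch under the assumption of fair coins, so that all four outcomes are equally likely; the helper `prob` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coin flips: HH, HT, TH, TT
S = {"".join(flips) for flips in product("HT", repeat=2)}

A = {s for s in S if s[0] == "T"}   # tails on the first coin
B = {s for s in S if s[1] == "T"}   # tails on the second coin

def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

print(prob(A & B))                         # 1/4
print(prob(A) * prob(B))                   # 1/4
print(prob(A & B) == prob(A) * prob(B))    # True
```

The product 1/2 ⋅ 1/2 = 1/4 is smaller than either factor, which matches the intuition that meeting both conditions is less probable than meeting one.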

Developing the Rules of Probability with Venn diagrams also helps when we wish to calculate probabilities from data arranged in a contingency table.

Example 3.33

Table 3.11 is from a sample of 200 people who were asked how much education they completed. The columns represent the highest education they completed, and the rows separate the individuals by male and female.

          Less than high school grad   High school grad   Some college   College grad   Total
Male                   5                      15               40             60          120
Female                 8                      12               30             30           80
Total                 13                      27               70             90          200
Table 3.11

Now, we can use this table to answer probability questions. The following examples are designed to help understand the format above while connecting the knowledge to both Venn diagrams and the probability rules.

What is the probability that a selected person both finished college and is female?

Solution 3.33

This is a simple task of finding the value where the two characteristics intersect on the table, and then applying the postulate of probability, which states that the probability of an event is the proportion of outcomes that match the event in which we are interested as a proportion of all total possible outcomes.

P(College Grad ∩ Female) = 30/200 = 0.15

What is the probability of selecting either a female or someone who finished college?

Solution 3.33

This task involves the use of the addition rule to solve for this probability.

P(College Grad ∪ Female) = P(F) + P(CG) − P(F ∩ CG)

P(College Grad ∪ Female) = 80/200 + 90/200 − 30/200 = 140/200 = 0.70

What is the probability of selecting a high school graduate if we only select from the group of males?

Solution 3.33

Here we must use the conditional probability rule (the modified multiplication rule) to solve for this probability.

P(HS Grad | Male) = P(HS Grad ∩ Male) / P(Male) = (15/200) / (120/200) = 15/120 = 0.125

Can we conclude that the level of education attained by these 200 people is independent of the gender of the person?

Solution 3.33

There are two ways to approach this test. The first method tests whether the probability of the intersection of two events equals the product of the probabilities of the events separately, remembering that if two events are independent then P(A) ⋅ P(B) = P(A ∩ B). For simplicity's sake, we can use calculated values from above.

Does P(College Grad ∩ Female) = P(CG) ⋅ P(F)?

30/200 ≠ 90/200 ⋅ 80/200 because 0.15 ≠ 0.18.

Therefore, gender and education here are not independent.

The second method is to test if the conditional probability of A given B is equal to the probability of A. Again for simplicity, we can use an already calculated value from above.

Does P(HS Grad | Male) = P(HS Grad)?

15/120 ≠ 27/200 because 0.125 ≠ 0.135.

Therefore, again gender and education here are not independent.
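
The four parts of Example 3.33 can be reproduced from the counts in Table 3.11. The following is a minimal sketch; the helper names (`p_joint`, `p_gender`, `p_level`) and the shortened column labels are illustrative assumptions, not part of the original example.

```python
from fractions import Fraction

# Counts from Table 3.11; columns follow the order of `levels`
levels = ["Less than HS", "HS grad", "Some college", "College grad"]
table = {
    "Male":   [5, 15, 40, 60],
    "Female": [8, 12, 30, 30],
}
total = sum(sum(row) for row in table.values())   # 200 people

def p_joint(gender, level):
    """P(gender ∩ level): one cell divided by the grand total."""
    return Fraction(table[gender][levels.index(level)], total)

def p_gender(gender):
    """P(gender): row total divided by the grand total."""
    return Fraction(sum(table[gender]), total)

def p_level(level):
    """P(level): column total divided by the grand total."""
    i = levels.index(level)
    return Fraction(sum(row[i] for row in table.values()), total)

p_cg_and_f = p_joint("Female", "College grad")
p_cg_or_f = p_gender("Female") + p_level("College grad") - p_cg_and_f
p_hs_given_m = p_joint("Male", "HS grad") / p_gender("Male")

print(float(p_cg_and_f))     # 0.15
print(float(p_cg_or_f))      # 0.7
print(float(p_hs_given_m))   # 0.125

# Independence checks: both comparisons fail, so the events are not independent
print(p_cg_and_f == p_level("College grad") * p_gender("Female"))   # False
print(p_hs_given_m == p_level("HS grad"))                           # False
```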

Citation/Attribution

Want to cite, share, or modify this book? This book is Creative Commons Attribution License 4.0 and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introductory-business-statistics/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introductory-business-statistics/pages/1-introduction
Citation information

© Nov 29, 2017 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License 4.0 license. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.