Another useful probability distribution is the Poisson distribution, or waiting time distribution. This distribution is used to determine how many checkout clerks are needed to keep the waiting time in line within specified levels, how many telephone lines are needed to keep the system from overloading, and in many other practical applications. A modification of the Poisson, the Pascal, invented nearly four centuries ago, is used today by telecommunications companies worldwide for load factors, satellite hookup levels, and Internet capacity problems. The distribution gets its name from Simeon Poisson, who presented it in 1837 as an extension of the binomial distribution, which, as we will see, can be estimated with the Poisson.
There are two main characteristics of a Poisson experiment.
- The Poisson probability distribution gives the probability of a number of events occurring in a fixed interval of time or space if these events happen with a known average rate.
- The events occur independently of the time since the last event. For example, a book editor might be interested in the number of words spelled incorrectly in a particular book. It might be that, on average, there are five words spelled incorrectly in 100 pages. The interval is the 100 pages, and it is assumed that there is no relationship between where the misspellings occur.
- The random variable X = the number of occurrences in the interval of interest.
A bank expects to receive six bad checks per day, on average. What is the probability of the bank getting fewer than five bad checks on any given day? Of interest is the number of checks the bank receives in one day, so the time interval of interest is one day. Let X = the number of bad checks the bank receives in one day. If the bank expects to receive six bad checks per day then the average is six checks per day. Write a mathematical statement for the probability question.
P(x < 5)
You notice that a news reporter says "uh," on average, two times per broadcast. What is the probability that the news reporter says "uh" more than two times per broadcast?
This is a Poisson problem because you are interested in knowing the number of times the news reporter says "uh" during a broadcast.
a. What is the interval of interest?
a. one broadcast measured in minutes
b. What is the average number of times the news reporter says "uh" during one broadcast?
b. 2
c. Let X = ____________. What values does X take on?
c. Let X = the number of times the news reporter says "uh" during one broadcast.
x = 0, 1, 2, 3, ...
d. The probability question is P(______).
d. P(x > 2)
Notation for the Poisson: P = Poisson Probability Distribution Function
X ~ P(μ)
Read this as "X is a random variable with a Poisson distribution." The parameter is μ (or λ); μ (or λ) = the mean for the interval of interest. The mean is the number of occurrences that occur on average during the interval period.
The formula for computing probabilities that are from a Poisson process is:
$$P(x) = \frac{\mu^{x} e^{-\mu}}{x!}$$
where P(x) is the probability of x successes, μ is the expected number of successes based upon historical data, e is the base of the natural logarithm, approximately equal to 2.718, and x is the number of successes per unit, usually per unit of time.
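As a minimal sketch, the Poisson probability calculation can be written in a few lines of Python, assuming the standard form μ^x e^(−μ) / x!. The function names `poisson_pmf` and `poisson_cdf` are illustrative choices, not part of the text. The sketch evaluates the earlier bank example, where μ = 6 and "fewer than five" means x ≤ 4.

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ P(mu): mu**x * exp(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)

def poisson_cdf(x: int, mu: float) -> float:
    """P(X <= x), found by summing the PMF from 0 through x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Bank example: mu = 6 bad checks per day, P(x < 5) = P(x <= 4)
p_fewer_than_five = poisson_cdf(4, 6.0)
print(round(p_fewer_than_five, 4))  # 0.2851
```

Because x takes only whole-number values, a strict inequality such as P(x < 5) is handled by summing up to the next lower integer.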
In order to use the Poisson distribution, certain assumptions must hold. These are: the average rate of successes, μ, is unchanged within the interval; there cannot be simultaneous successes within the interval; and, finally, successes in one interval are independent of successes in any other interval, the same independence assumption made for the binomial distribution.
In a way, the Poisson distribution can be thought of as a clever way to convert a continuous random variable, usually time, into a discrete random variable by breaking up time into discrete independent intervals. This way of thinking about the Poisson helps us understand why it can be used to estimate the probability for the discrete random variable from the binomial distribution. The Poisson is asking for the probability of a number of successes during a period of time while the binomial is asking for the probability of a certain number of successes for a given number of trials.
Leah's answering machine receives about six telephone calls between 8 a.m. and 10 a.m. What is the probability that Leah receives more than one call in the next 15 minutes?
Let X = the number of calls Leah receives in 15 minutes. (The interval of interest is 15 minutes, or 1/4 hour.)
x = 0, 1, 2, 3, ...
If Leah receives, on average, six telephone calls in two hours, and there are eight 15-minute intervals in two hours, then Leah receives
(1/8)(6) = 0.75 calls in 15 minutes, on average. So, μ = 0.75 for this problem.
X ~ P(0.75)
Find P(x > 1). P(x > 1) = 1 − P(x ≤ 1) = 0.1734
The probability that Leah receives more than one telephone call in the next 15 minutes is about 0.1734.
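The two steps in Leah's problem, rescaling the average to the interval of interest and then taking a complement, can be checked with a short Python sketch. The variable names are illustrative, and the calculation assumes the standard Poisson form μ^x e^(−μ) / x!.

```python
import math

calls_per_two_hours = 6
intervals_in_two_hours = 8          # eight 15-minute intervals
mu = calls_per_two_hours / intervals_in_two_hours   # 0.75 calls per 15 minutes

p0 = math.exp(-mu)                  # P(x = 0)
p1 = mu * math.exp(-mu)             # P(x = 1)
p_more_than_one = 1 - (p0 + p1)     # P(x > 1) = 1 - P(x <= 1)

print(round(p_more_than_one, 4))    # 0.1734
```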
Graph of X ~ P(0.75): the y-axis contains the probability of x, where X = the number of calls in 15 minutes.
According to a survey a university professor gets, on average, 7 emails per day. Let X = the number of emails a professor receives per day. The discrete random variable X takes on the values x = 0, 1, 2 …. The random variable X has a Poisson distribution: X ~ P(7). The mean is 7 emails.
- What is the probability that an email user receives exactly 2 emails per day?
- What is the probability that an email user receives at most 2 emails per day?
- What is the standard deviation?
- Standard Deviation = √μ = √7 ≈ 2.65
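The three email questions can be worked through with a short Python sketch. It assumes the standard Poisson form μ^x e^(−μ) / x! and uses the fact that the variance of a Poisson distribution equals its mean, so the standard deviation is √μ. The helper name `poisson_pmf` is an illustrative choice.

```python
import math

mu = 7.0  # average emails per day, from the example

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ P(mu)."""
    return mu**x * math.exp(-mu) / math.factorial(x)

p_exactly_two = poisson_pmf(2, mu)                          # P(x = 2)
p_at_most_two = sum(poisson_pmf(k, mu) for k in range(3))   # P(x <= 2)
std_dev = math.sqrt(mu)                                     # sqrt of the mean

print(f"P(x = 2)  = {p_exactly_two:.4f}")   # 0.0223
print(f"P(x <= 2) = {p_at_most_two:.4f}")   # 0.0296
print(f"std dev   = {std_dev:.4f}")         # 2.6458
```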
Text message users receive or send an average of 41.5 text messages per day.
- How many text messages does a text message user receive or send per hour?
- What is the probability that a text message user receives or sends two messages per hour?
- What is the probability that a text message user receives or sends more than two messages per hour?
- Let X = the number of texts that a user sends or receives in one hour. The average number of texts sent or received per hour is 41.5/24 ≈ 1.7292.
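A brief Python sketch of the text-message problem follows. It converts the daily average to an hourly rate and then evaluates the two probabilities, assuming the standard Poisson form μ^x e^(−μ) / x!. The helper name `poisson_pmf` is illustrative.

```python
import math

texts_per_day = 41.5
mu = texts_per_day / 24   # average texts per hour, about 1.7292

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ P(mu)."""
    return mu**x * math.exp(-mu) / math.factorial(x)

p_two = poisson_pmf(2, mu)                                       # P(x = 2)
p_more_than_two = 1 - sum(poisson_pmf(k, mu) for k in range(3))  # P(x > 2)

print(f"mu       = {mu:.4f}")
print(f"P(x = 2) = {p_two:.4f}")
print(f"P(x > 2) = {p_more_than_two:.4f}")
```

The rate must be expressed per hour before the formula is applied, because the questions ask about messages per hour rather than per day.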
On May 13, 2013, starting at 4:30 PM, the probability of low seismic activity for the next 48 hours in Alaska was reported as about 1.02%. Use this information for the next 200 days to find the probability that there will be low seismic activity in ten of the next 200 days. Use both the binomial and Poisson distributions to calculate the probabilities. Are they close?
Let X = the number of days with low seismic activity.
Using the binomial distribution:
- P(x = 10) ≈ 0.000039, with n = 200 and p = 0.0102
Using the Poisson distribution:
- Calculate μ = np = 200(0.0102) ≈ 2.04
- P(x = 10) ≈ 0.000045
We expect the approximation to be good because n is large (greater than 20) and p is small (less than 0.05). The results are close—both probabilities reported are almost 0.
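The comparison in the seismic example can be reproduced with a short Python sketch that evaluates P(x = 10) under both models. It uses n = 200 and p = 0.0102 from the problem; the function names are illustrative, and the Poisson mean is taken as μ = np.

```python
import math

n, p, x = 200, 0.0102, 10
mu = n * p  # Poisson mean matched to the binomial mean, about 2.04

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial with n trials and success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ P(mu)."""
    return mu**x * math.exp(-mu) / math.factorial(x)

p_binom = binom_pmf(x, n, p)
p_pois = poisson_pmf(x, mu)

print(f"binomial: {p_binom:.6f}")
print(f"Poisson:  {p_pois:.6f}")
print(f"absolute difference: {abs(p_binom - p_pois):.6f}")
```

Both values are tiny because ten low-activity days is far above the expected count of about two.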
Estimating the Binomial Distribution with the Poisson Distribution
We found before that the binomial distribution provided an approximation for the hypergeometric distribution. Now we find that the Poisson distribution can provide an approximation for the binomial. We say that the binomial distribution approaches the Poisson distribution as n gets larger and p gets smaller such that np remains a constant value. There are several rules of thumb for when one can use a Poisson to estimate a binomial. One suggests that np, the mean of the binomial, should be less than 25. Another author suggests that it should be less than 7. And another, noting that the mean and variance of the Poisson are the same, suggests that np and npq, the mean and variance of the binomial, should be greater than 5. There is no one broadly accepted rule of thumb for when one can use the Poisson to estimate the binomial.
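The statement that the binomial approaches the Poisson as n grows with np held constant can be illustrated numerically. The Python sketch below fixes np = 2 (an arbitrary value chosen only for illustration, not taken from the text), increases n, and reports the largest gap between the two probability functions over x = 0, ..., 10. The function names are illustrative.

```python
import math

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial with n trials and success probability p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ P(mu)."""
    return mu**x * math.exp(-mu) / math.factorial(x)

def max_gap(n: int, mu: float) -> float:
    """Largest |binomial - Poisson| over x = 0..10 when p = mu / n."""
    p = mu / n
    return max(abs(binom_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(11))

mu = 2.0  # illustrative constant value of np
gaps = {n: max_gap(n, mu) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(f"n = {n:4d}, p = {mu / n:.4f}, largest gap = {gap:.6f}")
```

The gap shrinks as n increases, which is the sense in which the rules of thumb above trade off a large n against a small p.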
As we move through these probability distributions we are getting to more sophisticated distributions that, in a sense, contain the less sophisticated distributions within them. This proposition has been proven by mathematicians. This gets us to the highest level of sophistication in the next probability distribution which can be used as an approximation to all of those that we have discussed so far. This is the normal distribution.
A survey of 500 seniors in the Price Business School yields the following information. 75% go straight to work after graduation. 15% go on to work on their MBA. 9% stay to get a minor in another program. 1% go on to get a Master's in Finance.
What is the probability that more than 2 seniors go to graduate school for their Master's in finance?
This is clearly a binomial probability distribution problem. The choices are binary when we define the results as "Graduate School in Finance" versus "all other options." The random variable is discrete, and the events are, we could assume, independent. Solving as a binomial problem, we have:
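Adding all three together, P(0) + P(1) + P(2) = 0.12339, so P(x > 2) = 1 − 0.12339 = 0.87661.
Using the Poisson approximation with μ = np = 500(0.01) = 5 gives P(0) + P(1) + P(2) ≈ 0.12465, so P(x > 2) ≈ 1 − 0.12465 = 0.87535.
An approximation that is off by about one one-thousandth is certainly an acceptable approximation.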
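As a check on the graduate school example, the Python sketch below computes P(x > 2) both exactly with the binomial (n = 500, p = 0.01) and approximately with a Poisson whose mean is μ = np = 5. The variable names are illustrative.

```python
import math

n, p = 500, 0.01
mu = n * p  # Poisson mean matched to the binomial mean

# P(x <= 2) under each model, summing x = 0, 1, 2
binom_le_2 = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))
pois_le_2 = sum(mu**k * math.exp(-mu) / math.factorial(k) for k in range(3))

p_binom = 1 - binom_le_2   # exact binomial P(x > 2)
p_pois = 1 - pois_le_2     # Poisson approximation of P(x > 2)

print(f"binomial P(x > 2) = {p_binom:.5f}")   # 0.87661
print(f"Poisson  P(x > 2) = {p_pois:.5f}")    # 0.87535
print(f"difference        = {abs(p_binom - p_pois):.5f}")
```

The two answers differ by roughly 0.001, consistent with the conclusion that the Poisson is an acceptable approximation here.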