The characteristics of a probability distribution or density function (PDF) are as follows:
- Each probability is between zero and one, inclusive (inclusive means to include zero and one).
- The sum of the probabilities is one.
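As a quick check of these two properties, the sketch below validates a small table of probabilities. The function name `is_valid_pmf` and the example values are illustrative assumptions, not taken from the text.

```python
import math

def is_valid_pmf(probs, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to one."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Hypothetical distribution for X = 0, 1, 2, 3
example = [0.1, 0.2, 0.3, 0.4]
print(is_valid_pmf(example))

# These values sum to 1.1, so they do not form a valid distribution
print(is_valid_pmf([0.5, 0.6]))
```

The tolerance is there only because floating-point sums are rarely exactly one.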
4.1 Hypergeometric Distribution
The combinatorial formula gives the number of unique subsets of size x that can be formed from n unique objects, which helps us calculate probabilities. The combinatorial formula is
$$\binom{n}{x} = {}_nC_x = \frac{n!}{x!\,(n-x)!}$$
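For a concrete count, the number of subsets of size x from n objects is n! divided by x!(n − x)!. The short Python sketch below computes it from factorials and compares it with the standard-library function `math.comb`. The helper name `n_choose_x` and the values 5 and 2 are illustrative assumptions.

```python
from math import comb, factorial

def n_choose_x(n, x):
    """Number of unique subsets of size x drawn from n distinct objects."""
    return factorial(n) // (factorial(x) * factorial(n - x))

# Illustrative values: choose 2 objects out of 5
print(n_choose_x(5, 2))  # 10
print(comb(5, 2))        # standard-library equivalent, same result
```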
A hypergeometric experiment is a statistical experiment with the following properties:
- You take samples from two groups.
- You are concerned with a group of interest, called the first group.
- You sample without replacement from the combined groups.
- The picks are not independent, since sampling is without replacement.
The outcomes of a hypergeometric experiment fit a hypergeometric probability distribution. The random variable X = the number of items from the group of interest.
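A hedged sketch of how such probabilities can be computed. It uses the standard hypergeometric form, P(X = x) = C(A, x) · C(N − A, n − x) / C(N, n), where the symbol names are assumptions introduced here: A is the size of the group of interest, N is the size of the combined groups, and n is the sample size. The numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_pmf(x, A, N, n):
    """P(X = x): x items from the group of interest (size A) appear in a
    sample of n drawn without replacement from N items in total."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Hypothetical example: 6 items of interest among 10, sample of 4
A, N, n = 6, 10, 4
probs = [hypergeom_pmf(x, A, N, n) for x in range(n + 1)]

for x, px in enumerate(probs):
    print(x, round(px, 4))

# The probabilities over all possible x sum to one (up to rounding)
print(sum(probs))
```

Each factor is a count of subsets: ways to choose x from the group of interest, ways to choose the remaining n − x from the other group, and ways to choose any n from the combined groups.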
4.2 Binomial Distribution
A statistical experiment can be classified as a binomial experiment if the following conditions are met:
- There are a fixed number of trials, n.
- There are only two possible outcomes, called "success" and "failure," for each trial. The letter p denotes the probability of a success on one trial and q denotes the probability of a failure on one trial, so p + q = 1.
- The n trials are independent and are repeated using identical conditions.
The outcomes of a binomial experiment fit a binomial probability distribution. The random variable X = the number of successes obtained in the n independent trials. The mean of X can be calculated using the formula $\mu = np$, and the standard deviation is given by the formula $\sigma = \sqrt{npq}$.
The formula for the binomial probability density function is
$$P(x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$$
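A minimal Python sketch of a binomial calculation, assuming hypothetical values n = 10 and p = 0.3 that are not from the text. It computes each probability as C(n, x) · p^x · q^(n − x), then compares the mean and standard deviation obtained directly from those probabilities with the shortcut results np and √(npq).

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success prob. p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical experiment: 10 trials, success probability 0.3
n, p = 10, 0.3
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

# Mean and standard deviation computed directly from the distribution
mean = sum(x * px for x, px in enumerate(probs))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))

print(mean, n * p)                            # both should be close to 3
print(sqrt(variance), sqrt(n * p * (1 - p)))  # both close to sqrt(2.1)
```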
4.3 Geometric Distribution
There are three characteristics of a geometric experiment:
- There are one or more Bernoulli trials with all failures except the last one, which is a success.
- In theory, the number of trials could go on forever. There must be at least one trial.
- The probability, p, of a success and the probability, q, of a failure are the same for each trial.
In a geometric experiment, define the discrete random variable X as the number of independent trials until the first success. We say that X has a geometric distribution and write X ~ G(p) where p is the probability of success in a single trial.
The mean of the geometric distribution X ~ G(p) is $\mu = \frac{1}{p}$, where x is the number of trials up to and including the first success in the formula $P(X = x) = (1-p)^{x-1}p$.
An alternative formulation of the geometric distribution asks the question: what is the probability of x failures until the first success? In this formulation the trial that resulted in the first success is not counted. The formula for this presentation of the geometric is:
$$P(X = x) = (1-p)^{x}p$$
The expected value in this form of the geometric distribution is
$$\mu = \frac{1-p}{p}$$
The easiest way to keep these two forms of the geometric distribution straight is to remember that p is the probability of success and (1−p) is the probability of failure. In each formula the exponents simply count the number of successes and the number of failures in the desired outcome of the experiment. The two counts must add up to the number of trials in the experiment.
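The two forms can be compared side by side in a short Python sketch. The function names and the value p = 0.25 are illustrative assumptions. The means are approximated by truncating the infinite sums after 500 terms, which is more than enough for this p.

```python
def geom_pmf_trials(x, p):
    """P(first success occurs on trial x), for x = 1, 2, 3, ..."""
    return (1 - p) ** (x - 1) * p

def geom_pmf_failures(x, p):
    """P(x failures occur before the first success), for x = 0, 1, 2, ..."""
    return (1 - p) ** x * p

p = 0.25  # hypothetical probability of success on a single trial

# Same event described two ways: 3 failures, then a success on trial 4
print(geom_pmf_trials(4, p), geom_pmf_failures(3, p))

# Truncated sums approximating the two expected values
terms = 500
mean_trials = sum(x * geom_pmf_trials(x, p) for x in range(1, terms))
mean_failures = sum(x * geom_pmf_failures(x, p) for x in range(0, terms))

print(mean_trials, 1 / p)          # trials form: close to 1/p
print(mean_failures, (1 - p) / p)  # failures form: close to (1 - p)/p
```

The two means differ by exactly one, the success trial that the second form leaves out.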
4.4 Poisson Distribution
A Poisson probability distribution of a discrete random variable gives the probability of a number of events occurring in a fixed interval of time or space, if these events happen at a known average rate and independently of the time since the last event. The Poisson distribution may be used to approximate the binomial, if the probability of success is "small" (less than or equal to 0.01) and the number of trials is "large" (greater than or equal to 25). Other rules of thumb are also suggested by different authors, but all recognize that the Poisson distribution is the limiting distribution of the binomial as n increases and p approaches zero.
The formula for computing probabilities that are from a Poisson process is:
$$P(x) = \frac{\mu^{x} e^{-\mu}}{x!}$$
where P(x) is the probability of x successes, μ (pronounced mu) is the expected number of successes, e is the base of the natural logarithm, approximately equal to 2.718, and x is the number of successes per unit, usually per unit of time.
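To illustrate both the Poisson formula and its use as an approximation to the binomial, the sketch below compares the two for a hypothetical case with n = 500 trials and p = 0.01, so that μ = np = 5. The function names and the numbers are assumptions chosen to satisfy the rule of thumb given above.

```python
from math import comb, exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson distribution with expected count mu."""
    return mu**x * exp(-mu) / factorial(x)

def binom_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials, success prob. p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical case: many trials, small probability of success
n, p = 500, 0.01
mu = n * p

# Print x, the binomial probability, and the Poisson approximation
for x in range(11):
    print(x, round(binom_pmf(x, n, p), 4), round(poisson_pmf(x, mu), 4))

# Largest gap between the two distributions over x = 0..10
max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(11))
print(max_gap)
```

With p this small, the two columns should agree closely. Rerunning with a larger p, such as 0.2, shows the approximation weakening.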