
Summary of Common Distributions

Stat 305 Spring Semester 2006

Discrete Distributions
This section summarizes important facts about the following discrete random variables: Bernoulli, binomial,
hypergeometric, geometric, negative binomial, and Poisson. Discrete random variables have sample spaces
that are either finite or countably infinite. A good source of information on a wide variety of discrete
distributions can be found at http://mathworld.wolfram.com/topics/DiscreteDistributions.html.

Bernoulli
Sample Space
S = {success, failure}
The interpretation of what constitutes a success or failure depends on the application.
Definition
X(success) = 1
X(failure) = 0
PDF
If 0 < p < 1, then
f_X(x) = P(X = x) = \begin{cases} p, & \text{if } x = 1 \\ 1 - p, & \text{if } x = 0 \end{cases}

Parameters
p = probability of a success
Symbol
X \sim Ber(p)
Expectation
E(X) = p
Variance
Var(X) = p(1 - p)
MGF
M_X(t) = p e^t + (1 - p), \quad -\infty < t < \infty

Binomial
Sample Space
S is the set of all sequences of length n consisting of successes and failures
The interpretation of what constitutes a success or failure depends on the application.

Definition
X(s) is the number of successes in the sequence s
PDF
f_X(x) = P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}, \quad x = 0, 1, \ldots, n

Parameters
n = number of trials of the experiment
p = probability of a success
Symbol
X \sim Bin(n, p)
Expectation
E(X) = np
Variance
Var(X) = np(1 - p)
MGF
M_X(t) = (p e^t + (1 - p))^n, \quad -\infty < t < \infty
Facts
1. If X_1, X_2, ..., X_n are independent Bernoulli random variables, each with parameter p, then X = X_1 + X_2 + \cdots + X_n is a binomial random variable with parameters (n, p). (A simulation sketch of this fact appears below.)
2. If X_1, X_2, ..., X_k are independent binomial random variables with X_i having parameters (n_i, p), then X = X_1 + X_2 + \cdots + X_k is a binomial random variable with parameters (n_1 + n_2 + \cdots + n_k, p). Therefore, the sum of independent binomial random variables with the same parameter p is again binomial.
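The following sketch checks Fact 1 by simulation. It assumes NumPy and SciPy are available; the values of n, p, and the replication count are illustrative choices, not part of the original handout.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, reps = 10, 0.3, 100_000

# Each row holds n independent Bernoulli(p) trials; each row sum is one draw of X.
bernoulli_trials = rng.binomial(1, p, size=(reps, n))
sums = bernoulli_trials.sum(axis=1)

for x in range(n + 1):
    empirical = np.mean(sums == x)
    exact = stats.binom.pmf(x, n, p)
    print(f"x={x:2d}  empirical={empirical:.4f}  Bin(n, p) pmf={exact:.4f}")

The empirical frequencies of the row sums should agree with the Bin(n, p) pmf up to simulation error.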

Hypergeometric
Sample Space
S is the set of all samples consisting of n balls (drawn without
replacement) from a total of r + w balls where r is the
number of red balls and w is the number of white balls
Definition
X(s) is the number of red balls in the sample s
PDF
f_X(x) = P(X = x) = \frac{\binom{r}{x}\binom{w}{n-x}}{\binom{r+w}{n}}, \quad x = \max\{0, n - w\}, \ldots, \min\{n, r\}

Parameters
r = number of red balls
w = number of white balls
n = size of the sample

Symbol
X \sim Hyp(r, w, n)
Expectation
E(X) = \frac{nr}{r+w}
Variance

Var(X) = \frac{nrw}{(r+w)^2} \cdot \frac{r+w-n}{r+w-1} = npq \cdot \frac{T-n}{T-1}, \quad \text{where } T = r+w, \; p = r/T, \; q = w/T
Note that T is the total number of balls, p is the probability of choosing a red ball, and q is the probability of choosing a white ball.
MGF
The moment-generating function for the hypergeometric distribution is complicated and involves what is
called the hypergeometric function, a special function in applied mathematics. (See http://mathworld.
wolfram.com/HypergeometricDistribution.html for details.)
Facts
1. If Y is the binomial random variable whose value is the number of red balls obtained in n independent choices (drawn with replacement), then
   E(Y) = n \frac{r}{r+w} = np \quad \text{and} \quad Var(Y) = n \frac{r}{r+w} \cdot \frac{w}{r+w} = npq.
   Notice that the expectations of X and Y are the same, and the variances differ only by the factor
   \frac{T-n}{T-1}.
   As T \to \infty, that is, as the total number of balls becomes large in relation to a fixed value of n, this factor tends to 1. Therefore, the variance of X converges to the variance of the binomial random variable Y as T \to \infty. What this means is that there is little difference in sampling with or without replacement when T, the total number of balls, is large in comparison to the sample size n. This fact is important in applications to polling. (A numerical check appears below.)
2. The formula for Var(X) depends on the concept of covariance. See p. 253.
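Fact 1 can also be checked numerically. The sketch below assumes SciPy; the values of n, p, and T are illustrative. As T grows with n fixed, the hypergeometric variance approaches the binomial variance npq.

from scipy import stats

n, p = 20, 0.25
print(f"binomial variance npq = {n * p * (1 - p):.4f}")

for T in (40, 100, 1_000, 10_000):
    r = int(p * T)   # number of red balls
    w = T - r        # number of white balls
    # scipy.stats.hypergeom(M, n, N): M = total balls, n = red balls, N = sample size
    var_hyp = stats.hypergeom(T, r, n).var()
    print(f"T = {T:6d}   hypergeometric variance = {var_hyp:.4f}")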

Geometric
Sample Space
S is the set of all sequences consisting of a (possibly empty) string of failures followed by a single success
For example, s, fs, ffs, fffs, and ffffs are outcomes in S. The interpretation of what constitutes a success or failure depends on the application.
Definition
X(t) is the number of failures in the outcome t
For example, X(ffffs) = 4.
PDF
If 0 < p < 1,
f_X(x) = P(X = x) = (1 - p)^x p, \quad x = 0, 1, \ldots
Parameters
p = probability of a success
Symbol
X \sim Geo(p)
Expectation
E(X) = \frac{1-p}{p}
Variance
Var(X) = \frac{1-p}{p^2}
MGF
M_X(t) = \frac{p}{1 - (1-p)e^t}, \quad t < -\ln(1-p)

Negative Binomial
Sample Space
S is the set of all sequences of successes and failures ending
with a success and containing a total of r successes
For example, if r = 3, then fffsfsffffs is an outcome in S. The interpretation of what constitutes a success or failure depends on the application.
Definition
X(t) is the number of failures in the outcome t
For example, X(fffsfsffffs) = 8.

PDF
If 0 < p < 1,

f_X(x) = P(X = x) = \binom{x+r-1}{x} (1-p)^x p^r, \quad x = 0, 1, \ldots

Parameters
p = probability of a success
r = number of successes that need to be obtained

Symbol
X \sim Negbin(r, p)
Expectation
E(X) = \frac{r(1-p)}{p}
Variance
Var(X) = \frac{r(1-p)}{p^2}
MGF
M_X(t) = \left( \frac{p}{1 - (1-p)e^t} \right)^r, \quad t < -\ln(1-p)

Facts
1. If r = 1, the negative binomial is just the geometric. That is, the negative binomial distribution generalizes the geometric distribution.
2. If X_1 is the number of failures obtained before the first success, and for i > 1, the random variable X_i is the number of failures obtained after the (i-1)st success but before the ith success, then X_1, X_2, ..., X_r are independent geometric random variables, each with parameter p. Also, note that X = X_1 + X_2 + \cdots + X_r is the total number of failures obtained before the rth success. Therefore, X is a negative binomial random variable with parameters (r, p). Thus, the sum of independent geometric random variables with the same parameter p is negative binomial. (A simulation sketch of this fact appears below.)
3. If X_1, X_2, ..., X_k are independent negative binomial random variables such that X_i has parameters (r_i, p), then X = X_1 + X_2 + \cdots + X_k is a negative binomial random variable with parameters (r_1 + r_2 + \cdots + r_k, p). Therefore, the sum of independent negative binomial random variables with the same parameter p is negative binomial.
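A simulation sketch of Fact 2, assuming NumPy and SciPy; r, p, and the replication count are illustrative. NumPy's geometric sampler returns the trial number of the first success, so one is subtracted to count failures, matching the convention used here; scipy.stats.nbinom counts failures directly.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
r, p, reps = 3, 0.4, 100_000

# Failures before each of the r successes (rng.geometric counts trials, so subtract 1).
failures = rng.geometric(p, size=(reps, r)) - 1
total_failures = failures.sum(axis=1)   # sum of r independent geometric variables

for x in range(8):
    empirical = np.mean(total_failures == x)
    exact = stats.nbinom.pmf(x, r, p)
    print(f"x={x}  empirical={empirical:.4f}  Negbin(r, p) pmf={exact:.4f}")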

Poisson
Sample Space
S = {0, 1, 2, ...}
Definition
X(s) = s
The value of the random variable is simply the outcome of the experiment.

PDF
For \lambda > 0,
f_X(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, \ldots

Parameters
\lambda = average number of random events occurring in unit time (called the Poisson rate)
Symbol
X \sim Poi(\lambda)
Expectation
E(X) = \lambda
Variance
Var(X) = \lambda
MGF
M_X(t) = \exp(\lambda(e^t - 1)), \quad -\infty < t < \infty
Facts
1. Poisson random variables are most often used to model the number of random events that occur in unit time. (A stream of random events occurring in time is called a Poisson process. See p. 259.) Examples of random events are accidents at an intersection or hurricanes in the Gulf of Mexico. In this case, \lambda is the average number of random events occurring in unit time, and is called the Poisson rate. For example, if the Bears fumble the ball twice on average in a game, then the Poisson rate is \lambda = 2/60 \approx 0.033 fumbles/minute. Therefore, if X is the number of fumbles the Bears commit in a minute of play,
   f_X(x) = \frac{(0.033)^x e^{-0.033}}{x!}, \quad x = 0, 1, \ldots

2. If X_1, X_2, ..., X_k are independent Poisson random variables such that X_i has parameter \lambda_i, then X = X_1 + X_2 + \cdots + X_k is a Poisson random variable with parameter \lambda_1 + \lambda_2 + \cdots + \lambda_k. The sum of independent Poisson random variables is Poisson.
3. The Poisson distribution can be used to approximate the binomial distribution if n is large and p is small. Specifically, if X is a binomial random variable with parameters (n, p) and n is large and p is small, then with \lambda = np,
   f_X(x) = \binom{n}{x} p^x (1-p)^{n-x} \approx \frac{e^{-\lambda} \lambda^x}{x!}.
   (A numerical illustration appears below.)
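A numerical illustration of Fact 3, assuming SciPy; the values n = 1000 and p = 0.002 are illustrative of a "large n, small p" situation.

from scipy import stats

n, p = 1000, 0.002
lam = n * p   # Poisson parameter for the approximation

for x in range(6):
    binom_pmf = stats.binom.pmf(x, n, p)
    pois_pmf = stats.poisson.pmf(x, lam)
    print(f"x={x}  Bin(n, p) = {binom_pmf:.5f}   Poi(np) = {pois_pmf:.5f}")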

Continuous Distributions
This section summarizes important facts about the following continuous random variables: uniform, normal,
gamma, exponential, and beta. Sample spaces of continuous random variables are intervals (either finite or
infinite) on the real line. The value of a continuous random variable is simply the outcome of the experiment.
That is, continuous random variables are the identity function on their respective sample spaces. Therefore,
there is no need to define a continuous random variable. A good source of information on a wide variety of
continuous distributions can be found at http://mathworld.wolfram.com/topics/ContinuousDistributions.html.

Uniform
Sample Space
S = (a, b), a < b
PDF
f_X(x) = \frac{1}{b-a}, \quad a < x < b

Parameters
a = left endpoint
b = right endpoint

Symbol
X \sim U(a, b)
Expectation
E(X) = \frac{a+b}{2}

Variance
Var(X) = \frac{(b-a)^2}{12}

MGF
M_X(t) = \frac{e^{bt} - e^{at}}{t(b-a)}, \quad -\infty < t < \infty

Normal
Sample space
S = (-\infty, \infty)
PDF
If -\infty < \mu < \infty and \sigma > 0,
f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right), \quad -\infty < x < \infty
Parameters
\mu = mean
\sigma^2 = variance

Symbol
X \sim N(\mu, \sigma^2)

Expectation
E(X) = \mu
Variance
Var(X) = \sigma^2
MGF

M_X(t) = \exp\left( \mu t + \frac{1}{2}\sigma^2 t^2 \right), \quad -\infty < t < \infty

Facts
1. If X_1, X_2, ..., X_k are independent normal random variables such that X_i has parameters (\mu_i, \sigma_i^2), and a_1, a_2, ..., a_k, b are real numbers, then X = a_1 X_1 + a_2 X_2 + \cdots + a_k X_k + b is a normal random variable with parameters (a_1\mu_1 + a_2\mu_2 + \cdots + a_k\mu_k + b, \; a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_k^2\sigma_k^2). That is, the mean of X is a_1\mu_1 + a_2\mu_2 + \cdots + a_k\mu_k + b and the variance of X is a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_k^2\sigma_k^2. Therefore, the sum of independent normal random variables is normal.
2. If X_1, X_2, ..., X_n is a random sample of size n from a normal random variable with parameters (\mu, \sigma^2), then \bar{X}_n = (X_1 + X_2 + \cdots + X_n)/n is normal with parameters (\mu, \sigma^2/n).
3. If X is a normal random variable with parameters (\mu, \sigma), then Z = (X - \mu)/\sigma is a standard normal random variable. That is, Z has \mu = 0 and \sigma^2 = 1.
4. If \Phi(x) denotes the cumulative distribution function of the standard normal, then \Phi(x) + \Phi(-x) = 1 for -\infty < x < \infty. This formula can be used to compute probabilities from a standard normal table of values.
5. If X is a normal random variable with parameters (\mu, \sigma), and F_X(x) is the cumulative distribution function of X, then the quantile function F_X^{-1}(p) of X is given by F_X^{-1}(p) = \mu + \sigma \Phi^{-1}(p), where \Phi^{-1}(p) is the quantile function of the standard normal.
6. If X is a normal random variable with parameters (\mu, \sigma), then \Pr(|X - \mu| \le k\sigma) = \Pr(|Z| \le k), where Z denotes the standard normal. That is, the probability that a normal random variable is within k standard deviations of its mean is exactly equal to the probability that the standard normal is within k standard deviations of its mean (since \sigma = 1 for the standard normal). Therefore, the probability that any two normal random variables are within k standard deviations of their respective means is the same.
7. If X is a continuous random variable such that log(X) is normal, then X is said to have a lognormal
distribution.
8. If X_1, X_2, ..., X_n, ... is an infinite sequence of independent and identically distributed random variables, each with mean \mu and variance \sigma^2, then
   \lim_{n \to \infty} \Pr\left( a \le \frac{(X_1 + X_2 + \cdots + X_n) - n\mu}{\sigma\sqrt{n}} \le b \right) = \int_a^b \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^2 \right) dz = \Phi(b) - \Phi(a).
   This result, called the Central Limit Theorem, says that X = X_1 + X_2 + \cdots + X_n is approximately normal with mean n\mu and variance n\sigma^2. As a special case, we have
   \lim_{n \to \infty} \Pr\left( a \le \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le b \right) = \int_a^b \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^2 \right) dz = \Phi(b) - \Phi(a).
   (A simulation sketch appears below.)
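A small simulation sketch of Fact 8, assuming NumPy and SciPy; the exponential population, sample size, and interval endpoints are illustrative choices. The standardized sample mean is compared with the standard normal probability \Phi(b) - \Phi(a).

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 50, 100_000

# Population: exponential with mean 1, so mu = 1 and sigma = 1.
samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))   # standardized sample mean

a, b = -1.0, 2.0
empirical = np.mean((z >= a) & (z <= b))
normal_prob = stats.norm.cdf(b) - stats.norm.cdf(a)
print(f"simulated probability = {empirical:.4f}")
print(f"Phi(b) - Phi(a)       = {normal_prob:.4f}")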

Gamma
Sample space
S = (0, \infty)

PDF
If \alpha, \beta > 0,
f_X(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}, \quad x > 0

Parameters
\alpha = shape parameter
\beta = rate parameter

Symbol
X \sim \Gamma(\alpha, \beta)
Expectation

E(X) = \frac{\alpha}{\beta}
Variance
Var(X) = \frac{\alpha}{\beta^2}
MGF
M_X(t) = \left( \frac{\beta}{\beta - t} \right)^{\alpha}, \quad t < \beta

Facts
1. The gamma pdf is defined in terms of the gamma function, which is traditionally denoted \Gamma(\alpha). The gamma function is an important special function in applied mathematics. For \alpha > 0, \Gamma(\alpha) is defined as
   \Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x} \, dx.
   It is easy to show that \Gamma(1) = 1, and (by using integration by parts) that \Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1). In particular, if n > 1 is an integer, then \Gamma(n) = (n-1)!. Therefore, \Gamma(2) = 1, \Gamma(3) = 2, \Gamma(4) = 6, etc.

2. The gamma distribution is often used to model the time elapsed until the occurrence of the nth random event in a Poisson process. We take \alpha = n and \beta to be the Poisson rate. For example, if the Bears fumble the ball at a Poisson rate of \beta = 0.033 fumbles/minute (about 2 fumbles per game) and X measures the time until the 10th fumble of the season, then
   f_X(x) = \frac{(0.033)^{10}}{\Gamma(10)} x^9 e^{-0.033x}, \quad x > 0.

3. Note that f_X(x) integrates to 1 (a necessary requirement for a pdf) since
   \int_0^\infty f_X(x)\, dx = \int_0^\infty \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}\, dx
   = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty x^{\alpha - 1} e^{-\beta x}\, dx
   = \frac{\beta^\alpha}{\Gamma(\alpha)} \cdot \frac{1}{\beta^\alpha} \int_0^\infty u^{\alpha - 1} e^{-u}\, du \quad \text{(after the substitution } u = \beta x\text{)}
   = \frac{\beta^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{\beta^\alpha}
   = 1.

4. The graph of the gamma distribution depends on the choices made for the parameters \alpha and \beta. The graphs below depict the gamma distribution for \beta = 2 and various values of \alpha.

5. The expectation, variance, and moment-generating function of the gamma distribution are easily derived using the definition of the gamma function. See p. 297 for details.
6. If X_1, X_2, ..., X_k are independent gamma random variables such that X_i has parameters (\alpha_i, \beta) and X = X_1 + X_2 + \cdots + X_k, then X is a gamma random variable with parameters (\alpha_1 + \alpha_2 + \cdots + \alpha_k, \beta). Therefore, the sum of independent gamma random variables with the same \beta is a gamma random variable. (A simulation sketch appears below.)
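A simulation sketch of Fact 6, assuming NumPy and SciPy; the parameter values are illustrative. Note that both libraries parameterize the gamma by shape and scale, so the rate \beta used in this handout corresponds to scale = 1/\beta.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
alphas, beta, reps = [1.5, 2.0, 3.5], 0.5, 100_000

# Sum of independent Gamma(alpha_i, beta) draws (scale = 1/beta in NumPy's convention).
total = sum(rng.gamma(shape=a, scale=1 / beta, size=reps) for a in alphas)

a_sum = sum(alphas)
print(f"mean: simulated {total.mean():.3f}, theoretical {a_sum / beta:.3f}")
print(f"var : simulated {total.var():.3f},  theoretical {a_sum / beta ** 2:.3f}")

# Kolmogorov-Smirnov comparison against the claimed Gamma(sum of alphas, beta) distribution.
ks = stats.kstest(total, stats.gamma(a=a_sum, scale=1 / beta).cdf)
print(f"KS statistic = {ks.statistic:.4f}")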

Exponential
Sample Space
S = (0, \infty)
PDF
If \lambda > 0,
f_X(x) = \lambda e^{-\lambda x}, \quad x > 0.
Parameters
\lambda = rate parameter
Symbol
X \sim Exp(\lambda)
Expectation
E(X) = \frac{1}{\lambda}


Variance
Var(X) = \frac{1}{\lambda^2}

MGF
M_X(t) = \frac{\lambda}{\lambda - t}, \quad t < \lambda

Facts
1. The exponential distribution is a special case of the gamma distribution. If \alpha = 1 and \beta = \lambda in the pdf of the gamma, we obtain
   f_X(x) = \frac{\lambda^1}{\Gamma(1)} x^{1-1} e^{-\lambda x} = \lambda e^{-\lambda x}.
2. The exponential distribution has a very special property commonly called the memoryless property. Specifically, if X is an exponential random variable with parameter \lambda, then for t, h > 0,
   P(X \ge t + h \mid X \ge t) = \frac{P(X \ge t + h)}{P(X \ge t)} = \frac{\int_{t+h}^\infty \lambda e^{-\lambda x}\, dx}{\int_t^\infty \lambda e^{-\lambda x}\, dx} = \frac{e^{-\lambda(t+h)}}{e^{-\lambda t}} = e^{-\lambda h} = P(X \ge h).
   If we think of t as representing time (say) and X as recording the time of arrival of some random event, then what the above equation is saying is that if the event has not occurred after the first t time units have elapsed, the probability of it occurring in the next h time units is the same as if we reset the time back to t = 0. That is, the random variable X "forgets" about the time interval [0, t]. The fact that the event has not happened by time t has no effect on whether it will happen during the next h time units. (A simulation check of this property appears after this list of facts.)

3. The exponential distribution is often used to model the elapsed time before the occurrence of the first event in a Poisson process. For example, if the Bears fumble the ball with a Poisson rate \lambda = 0.033 fumbles/minute and X is the time until the first fumble, then
   f_X(x) = 0.033\, e^{-0.033x}, \quad x > 0.
   In fact, even more is true! Because of the memoryless property, the pdf above also models the time between fumbles. For example, the time that elapses between the 7th and 8th fumbles is given by the same pdf.
4. If X_1, X_2, ..., X_k are independent exponential random variables such that X_i has parameter \lambda and X = X_1 + X_2 + \cdots + X_k, then X is a gamma random variable with parameters (k, \lambda). Therefore, the sum of independent exponential random variables with the same \lambda is a gamma random variable.
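A simulation check of the memoryless property in Fact 2, assuming NumPy; the values of \lambda, t, and h are illustrative.

import numpy as np

rng = np.random.default_rng(4)
lam, t, h, reps = 0.5, 2.0, 1.5, 1_000_000

x = rng.exponential(scale=1 / lam, size=reps)   # NumPy uses scale = 1/lambda

survived_t = x >= t
conditional = np.mean(x[survived_t] >= t + h)   # estimates P(X >= t + h | X >= t)
unconditional = np.mean(x >= h)                 # estimates P(X >= h)
print(f"P(X >= t + h | X >= t) is approximately {conditional:.4f}")
print(f"P(X >= h) is approximately              {unconditional:.4f}")
print(f"exact value exp(-lambda * h)          = {np.exp(-lam * h):.4f}")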

Beta
Sample Space
S = (0, 1)
PDF
If \alpha, \beta > 0,
f_X(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \quad 0 < x < 1.


Parameters
\alpha = first shape parameter
\beta = second shape parameter

Symbol
X \sim Beta(\alpha, \beta)
Expectation
E(X) = \frac{\alpha}{\alpha + \beta}
Variance
Var(X) = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}

MGF
The moment-generating function for the beta distribution is complicated and involves what is called the confluent hypergeometric function, a special function in applied mathematics. (See http://mathworld.wolfram.com/BetaDistribution.html for details.) However, for certain values of \alpha and \beta, the moment-generating function has a simple form. For example, if \alpha = 3 and \beta = 2, then
M_X(t) = \frac{12}{t^4} \left( t^2 e^t - 4te^t + 6e^t - 2t - 6 \right), \quad -\infty < t < \infty.

Facts
1. If \alpha = \beta = 1, then the beta distribution reduces to the uniform distribution on (0, 1).
2. Since beta random variables take values in (0, 1), they are useful in modeling proportions, percentages,
or probabilities. A common use of the beta distribution is to model the unknown probability p of a
success in a Bernoulli trial.
3. Jacobians and the Change of Variables Theorem must be used to show that the pdf of a beta distribution
integrates to 1. (See p. 304.)
4. The parameters \alpha and \beta are shape parameters for the beta distribution.
(a) If \alpha = \beta = 1, the beta distribution reduces to the uniform distribution. (See (a) below.)
(b) If \alpha < 1, the pdf approaches +\infty as x \to 0^+. (See (c) and (f) below.)
(c) If \beta < 1, the pdf approaches +\infty as x \to 1^-. (See (d) and (f) below.)
(d) If \alpha < 1 and \beta < 1, the pdf is U-shaped. (See (b) and (f) below.)
(e) If \alpha < 1 and \beta \ge 1, the pdf is decreasing. (See (c) below.)
(f) If \alpha \ge 1 and \beta < 1, the pdf is increasing. (See (d) below.)
(g) If \alpha > 1 and \beta > 1, the pdf has a single maximum. (See (e) below.)
(h) If \alpha = \beta, the pdf is symmetric about x = 1/2. (See (a) and (b) below.) If \alpha \neq \beta, the pdf is skewed. (See (c) through (f) below.)

[Figure: six panels, labeled (a) through (f), showing beta pdfs for various choices of \alpha and \beta.]
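Since the panels themselves are not reproduced here, the following plotting sketch can regenerate comparable graphs. It assumes NumPy, SciPy, and Matplotlib; the six (\alpha, \beta) pairs are illustrative choices covering the cases in Fact 4, not the exact values used in the original figure.

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Illustrative (alpha, beta) pairs: uniform, U-shaped, decreasing, increasing, single maximum, skewed.
params = [(1, 1), (0.5, 0.5), (0.5, 2), (2, 0.5), (3, 2), (2, 5)]
x = np.linspace(0.01, 0.99, 400)

fig, axes = plt.subplots(2, 3, figsize=(10, 6))
for ax, (a, b) in zip(axes.flat, params):
    ax.plot(x, stats.beta.pdf(x, a, b))
    ax.set_title(f"alpha = {a}, beta = {b}")
plt.tight_layout()
plt.show()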

5. If X_1 and X_2 are independent gamma random variables with parameters (\alpha_1, \beta) and (\alpha_2, \beta) respectively, then it turns out that X_1/(X_1 + X_2) is beta with parameters (\alpha_1, \alpha_2). (This is easy to verify by first letting Y_1 = X_1 + X_2 and Y_2 = X_1/(X_1 + X_2) and then finding the joint distribution of Y_1 and Y_2 using Jacobians. It is then straightforward to find the marginal distribution of Y_2 and verify that it is beta with parameters (\alpha_1, \alpha_2). Of course, Y_1 is gamma with parameters (\alpha_1 + \alpha_2, \beta).) A simulation sketch appears below.
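A simulation sketch of Fact 5, assuming NumPy and SciPy; the parameter values are illustrative.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
a1, a2, beta, reps = 2.0, 5.0, 1.3, 200_000

# Independent gamma draws with the same rate beta (scale = 1/beta in NumPy's convention).
x1 = rng.gamma(shape=a1, scale=1 / beta, size=reps)
x2 = rng.gamma(shape=a2, scale=1 / beta, size=reps)
ratio = x1 / (x1 + x2)

print(f"mean: simulated {ratio.mean():.4f}, Beta(a1, a2) mean {a1 / (a1 + a2):.4f}")
ks = stats.kstest(ratio, stats.beta(a1, a2).cdf)
print(f"KS statistic vs Beta(a1, a2): {ks.statistic:.4f}")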
