
BINOMIAL AND RELATED DISTRIBUTIONS

SUBMITTED BY
OSAMA BIN AJAZ
(std_18154@iobm.edu.pk)

CONTENTS

Abstract
Bernoulli distribution
Binomial distribution
Multinomial distribution
Beta binomial distribution
Correlated binomial distribution
Altham-multiplicative binomial distribution
Neyman C(α) test
Testing the goodness of fit of the binomial distribution
The C(α) test for correlated binomial alternatives
The C(α) test for beta-binomial alternatives
The C(α) test for Altham's multiplicative alternatives
Monte Carlo study
Pitman asymptotic relative efficiencies
References

ABSTRACT
R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derived
tests of the goodness of fit of the binomial distribution using the C(α)
procedure of Neyman (1959); these tests are asymptotically optimal against
the generalized binomial alternatives proposed by Altham (1978) and Kupper &
Haseman (1978). Before turning to the article, I explain the binomial and
related distributions. I have reproduced key parts of the article; readers
interested in the details of the article are advised to consult the
references at the end of this report.

BINOMIAL AND RELATED DISTRIBUTIONS


Bernoulli trial
A Bernoulli trial (named after James Bernoulli, one of the founding fathers of
probability theory) is an experiment with two, and only two, possible
outcomes [2]: for example, female or male, life or death, head or tail,
success or failure. A sequence of Bernoulli trials occurs when a Bernoulli
experiment is performed several independent times, so that the probability of
success, say p, remains the same from trial to trial.

Bernoulli distribution
A random variable X is defined to have a Bernoulli distribution if the discrete
density function of X is given by

f(x) = p^x (1 - p)^(1-x)   for x = 0, 1,
f(x) = 0                   otherwise,

where the parameter p satisfies 0 ≤ p ≤ 1. If X has a Bernoulli distribution,
then

E[X] = p,    Var[X] = pq,    M_X(t) = pe^t + q.
Proof
E[X] = Σ_{x=0}^{1} x p^x (1 - p)^(1-x) = 0·q + 1·p = p

Var[X] = E[X^2] - {E[X]}^2 = 0^2·q + 1^2·p - p^2 = pq

(1 - p is often denoted by q. [1])

M_X(t) = E[e^(tX)] = Σ_{x=0}^{1} e^(tx) p^x (1 - p)^(1-x) = q + pe^t
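These three results can be verified numerically by summing over the two
possible outcomes; the following minimal Python sketch (with an arbitrary
choice of p) does exactly that.

```python
# A minimal check of the Bernoulli moments above by direct enumeration over
# the two outcomes x = 0, 1 (the value of p is an arbitrary choice).
import math

p = 0.2
q = 1.0 - p

def pmf(x: int) -> float:
    """Bernoulli pmf f(x) = p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    return p**x * q**(1 - x)

mean = sum(x * pmf(x) for x in (0, 1))               # should equal p
var = sum(x**2 * pmf(x) for x in (0, 1)) - mean**2   # should equal p*q
t = 0.7
mgf = sum(math.exp(t * x) * pmf(x) for x in (0, 1))  # should equal q + p*e**t

print(mean, p)                   # 0.2 0.2
print(var, p * q)                # ~0.16 0.16
print(mgf, q + p * math.exp(t))  # both ~1.2027
```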

Example 1: Out of millions of instant lottery tickets, suppose that 20% are
winners. If five such tickets are purchased, (0, 0, 0, 1, 0) is a possible
observed sequence in which the fourth ticket is a winner and the other four
are losers. Assuming independence among winning and losing tickets, the
probability of this outcome is (0.8)(0.8)(0.8)(0.2)(0.8) = (0.2)(0.8)^4. [5]
In a sequence of Bernoulli trials, we are often interested in the total number
of successes and not in the order of their occurrence. If we let the random
variable X equal the number of observed successes in n Bernoulli trials, the
possible values of X are 0, 1, 2, ..., n. If x successes occur, where
x = 0, 1, 2, ..., n, then n - x failures occur. The number of ways of selecting
the x positions for the x successes in the n trials is

C(n, x) = n! / (x!(n - x)!).

Since the trials are independent and since the probabilities of success and
failure on each trial are, respectively, p and q = 1 - p, the probability of
each of these ways is p^x (1 - p)^(n-x). Thus f(x), the p.m.f. of X, is the sum
of the probabilities of these mutually exclusive events; that is,

f(x) = C(n, x) p^x (1 - p)^(n-x)   for x = 0, 1, 2, ..., n.

These probabilities are called binomial probabilities, and the random
variable X is said to have a binomial distribution.
A binomial experiment satisfies the following properties:
1. A Bernoulli experiment is performed n times.
2. The trials are independent.
3. The probability of success on each trial is a constant p; the probability
of failure is q=1-p.
4. The random variable X equals the number of successes in the n trials.
A binomial distribution is denoted by the symbol b (n, p) and we say that the
distribution of X is b (n, p). The constants n and p are called the parameters
of the binomial distribution. Thus if we say that the distribution of X is
b(10, 1/5), we mean that X is the number of successes in a random sample of
size n = 10 from a Bernoulli distribution with p = 1/5.
The binomial distribution derives its name from the fact that the (n + 1)
terms in the binomial expansion of (q + p)^n correspond to the various values
of b(x; n, p) for x = 0, 1, 2, ..., n. That is,

(q + p)^n = C(n,0) q^n + C(n,1) p q^(n-1) + C(n,2) p^2 q^(n-2) + ... + C(n,n) p^n.

Since (q + p) = 1, we see that

Σ_{x=0}^{n} b(x; n, p) = 1,

a condition that must hold for any probability distribution.


Example 2: If we want to find the probability of obtaining exactly three 2s
when an ordinary die is tossed 4 times, the probability is

b(3; 4, 1/6) = C(4, 3) (1/6)^3 (5/6).
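As a quick numerical check of Example 2, the sketch below evaluates
b(3; 4, 1/6) from the pmf given earlier and confirms that the four-trial
probabilities sum to one (the helper name binom_pmf is mine, purely for
illustration).

```python
# Example 2 by direct computation: probability of exactly three 2s in four
# tosses of a fair die, b(3; 4, 1/6) = C(4, 3) * (1/6)**3 * (5/6).
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binom_pmf(3, 4, 1/6))                          # 5/324 ~ 0.0154
print(sum(binom_pmf(x, 4, 1/6) for x in range(5)))   # ~1.0, as required
```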

The mean, variance and moment generating function of the binomial distribution
b(x; n, p) are

μ = np,    σ^2 = npq,    M_X(t) = (q + pe^t)^n,

respectively.

Proof

M_X(t) = E[e^(tX)] = Σ_{x=0}^{n} e^(tx) C(n, x) p^x q^(n-x)
       = Σ_{x=0}^{n} C(n, x) (pe^t)^x q^(n-x) = (pe^t + q)^n.

Taking the first derivative, M'_X(t) = npe^t (pe^t + q)^(n-1),

and the second derivative,

M''_X(t) = n(n - 1)(pe^t)^2 (pe^t + q)^(n-2) + npe^t (pe^t + q)^(n-1).

Hence E[X] = M'_X(0) = np, and

Var[X] = E[X^2] - {E[X]}^2 = M''_X(0) - (np)^2 = n(n - 1)p^2 + np - (np)^2 = np(1 - p).
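The same two moments can be recovered numerically from the mgf; the sketch
below (an illustration with arbitrary n, p and step size h) approximates
M'(0) and M''(0) by central finite differences.

```python
# Numerical check that the mgf derivatives at t = 0 give E[X] = np and
# Var[X] = np(1 - p) for the binomial distribution.
from math import exp

def mgf(t: float, n: int, p: float) -> float:
    return (1 - p + p * exp(t))**n        # (q + p*e**t)**n

n, p, h = 12, 0.35, 1e-5
m1 = (mgf(h, n, p) - mgf(-h, n, p)) / (2 * h)                     # ~ M'(0) = E[X]
m2 = (mgf(h, n, p) - 2 * mgf(0.0, n, p) + mgf(-h, n, p)) / h**2   # ~ M''(0) = E[X^2]

print(m1, n * p)                    # ~4.2  4.2
print(m2 - m1**2, n * p * (1 - p))  # ~2.73 2.73
```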

Example 3: If the mgf of a random variable X is M(t) = (2/3 + (1/3)e^t)^5, then
X has a binomial distribution with n = 5 and p = 1/3; that is, the pmf of X is

f(x) = C(5, x) (1/3)^x (2/3)^(5-x),   x = 0, 1, ..., 5.

Here μ = np = 5/3 and σ^2 = np(1 - p) = 10/9.


Note: The binomial distribution reduces to the Bernoulli distribution when
n = 1. Sometimes the Bernoulli distribution is called the point binomial.
Example 4: Let the random variable Y be equal to the number of successes
throughout n independent repetitions of a random experiment with probability
p of success. That is, Y is b(n, p). The ratio Y/n is called the relative
frequency of success. Now recall Chebyshev's inequality, i.e.

P(|X - μ| ≥ ε) ≤ σ^2/ε^2   for all ε > 0.

Applying this result, we have, for all ε > 0, that

P(|Y/n - p| ≥ ε) ≤ Var(Y/n)/ε^2 = p(1 - p)/(nε^2).

Now, for every fixed ε > 0, the right-hand member of the preceding inequality
is close to zero for sufficiently large n. That is,

lim_{n→∞} P(|Y/n - p| ≥ ε) = 0.

Since this is true for every fixed ε > 0, we see, in a certain sense, that the
relative frequency of success is, for large values of n, close to the
probability p of success [3].
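A small simulation illustrates this convergence; in the sketch below the
sample sizes, p and ε are arbitrary choices, and the empirical exceedance
probability is compared with the Chebyshev bound p(1 - p)/(nε^2).

```python
# Simulating Example 4: the relative frequency Y/n concentrates around p as n
# grows, and the (conservative) Chebyshev bound shrinks like 1/n.
import random

random.seed(1)
p, eps, trials = 0.3, 0.05, 2_000

for n in (10, 100, 1_000):
    outside = sum(
        abs(sum(random.random() < p for _ in range(n)) / n - p) >= eps
        for _ in range(trials)
    )
    bound = min(p * (1 - p) / (n * eps**2), 1.0)
    print(n, outside / trials, bound)   # empirical P(|Y/n - p| >= eps) vs bound
```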

Example 5: Let the independent random variables X1, X2, X3 have the same
cdf F(x). Let Y be the middle value of X1, X2, X3. To determine the cdf of Y,
say F_Y(y) = P(Y ≤ y), we note that Y ≤ y if and only if at least two of the
random variables X1, X2, X3 are less than or equal to y. Let us say that the
ith trial is a success if X_i ≤ y, i = 1, 2, 3; here each trial has probability
of success F(y). In this terminology, F_Y(y) = P(Y ≤ y) is then the probability
of at least two successes in three independent trials. Thus
F_Y(y) = C(3, 2) [F(y)]^2 [1 - F(y)] + [F(y)]^3.

If F(x) is a continuous cdf, so that the pdf of X is F'(x) = f(x), then the pdf
of Y is

f_Y(y) = F'_Y(y) = 6 [F(y)] [1 - F(y)] f(y). [4]
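This result is easy to check by simulation when the X_i are standard uniform,
so that F(y) = y; the sketch below compares the empirical cdf of the middle
value with 3F(y)^2 - 2F(y)^3, which is what the expression above simplifies to.

```python
# Monte Carlo check of Example 5 with Uniform(0, 1) variables.
import random

random.seed(2)
y, reps = 0.4, 100_000
hits = sum(
    sorted(random.random() for _ in range(3))[1] <= y   # middle value <= y
    for _ in range(reps)
)

F = y                          # Uniform(0, 1) cdf: F(y) = y
print(hits / reps)             # empirical P(Y <= y)
print(3 * F**2 - 2 * F**3)     # 0.352, the theoretical value
```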
MULTINOMIAL DISTRIBUTION
Recall that in order for an experiment to be binomial, two outcomes are
required for each trial. But if each trial in an experiment has more than two
outcomes, a distribution called the multinomial distribution must be used.
For example, a survey might require the responses "approve", "disapprove",
or "no opinion". In another situation, a person may have a choice of one of
five activities for Friday night, such as a movie, dinner, baseball game,
play, or party. Since these situations have more than two possible outcomes
for each trial, the binomial distribution cannot be used to compute
probabilities.
If X consists of events E1, E2, E3, . . . , Ek, which have corresponding
probabilities p1, p2, p3, . . . , pk of occurring, and X1 is the number of times E1

will occur, X2 is the number of times E2 will occur, X3 is the number of times
E3 will occur, etc., then the probability that X will occur is

P(X) = n! / (X1! X2! X3! ... Xk!) · p1^X1 p2^X2 ... pk^Xk,

where X1 + X2 + X3 + ... + Xk = n and p1 + p2 + p3 + ... + pk = 1.


For illustration, suppose a box contains four white balls, three red balls,
and three blue balls. A ball is selected at random and its color is recorded;
it is then replaced before the next draw. We want to find the probability
that, if five balls are selected, two are white, two are red, and one is blue.
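The multinomial formula answers this directly; the short sketch below uses the
counts and probabilities from the illustration (white 0.4, red 0.3, blue 0.3).

```python
# P(2 white, 2 red, 1 blue) in five draws with replacement, by the
# multinomial formula n!/(x1! x2! x3!) * p1**x1 * p2**x2 * p3**x3.
from math import factorial

counts = (2, 2, 1)
probs = (0.4, 0.3, 0.3)

coef = factorial(sum(counts))
for c in counts:
    coef //= factorial(c)       # multinomial coefficient 5!/(2! 2! 1!) = 30

prob = float(coef)
for c, p in zip(counts, probs):
    prob *= p**c

print(prob)                     # 30 * 0.16 * 0.09 * 0.3 = 0.1296
```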

Beta binomial distribution


The distribution with discrete density function

f(x) = f(x; n, α, β) = C(n, x) · Γ(α + β) Γ(α + x) Γ(β + n - x) / [Γ(α) Γ(β) Γ(α + β + n)] · I_{0,1,...,n}(x),

where n is a nonnegative integer, α > 0 and β > 0, is defined as the beta
binomial distribution.

The beta-binomial distribution has mean

E[X] = nα / (α + β)

and variance

Var[X] = nαβ(n + α + β) / [(α + β)^2 (α + β + 1)].

If α = β = 1, then the beta-binomial distribution reduces to a discrete uniform
distribution over the integers 0, 1, ..., n. [2]
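A short sketch (the values of n, α and β are arbitrary) evaluates this density
with the gamma function and checks the mean and variance formulas, including
the α = β = 1 uniform case.

```python
# Beta-binomial pmf written with gamma functions, plus checks of the mean,
# the variance and the alpha = beta = 1 special case.
from math import comb, gamma

def beta_binom_pmf(x: int, n: int, a: float, b: float) -> float:
    return (comb(n, x) * gamma(a + b) * gamma(a + x) * gamma(b + n - x)
            / (gamma(a) * gamma(b) * gamma(a + b + n)))

n, a, b = 10, 2.0, 3.0
support = range(n + 1)
mean = sum(x * beta_binom_pmf(x, n, a, b) for x in support)
var = sum(x**2 * beta_binom_pmf(x, n, a, b) for x in support) - mean**2

print(mean, n * a / (a + b))                                      # 4.0 4.0
print(var, n * a * b * (n + a + b) / ((a + b)**2 * (a + b + 1)))  # 6.0 6.0
print([round(beta_binom_pmf(x, n, 1.0, 1.0), 4) for x in (0, 5, 10)])  # each ~1/11
```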

Correlated binomial (CB) distribution [4]

This distribution is derived on the assumption that the binary responses of
the fetuses in a litter are not mutually independent. This idea is due to
Bahadur (1961). Retaining only the first-order correlation between the
responses and denoting by θ the covariance between the binary responses of
any two fetuses, the random variable X is such that

P(X = x) = C(n, x) p^x q^(n-x) [1 + θ/(2p^2 q^2) {(x - np)^2 + x(2p - 1) - np^2}],
x = 0, 1, ..., n,

where p is the probability that a fetus is abnormal and q = 1 - p. Note that
for the above equation to be a valid probability distribution, a
data-dependent bound on the parameters has to be imposed; see Kupper and
Haseman (1978). It can be shown that the expectation and variance of the
correlated binomial distribution are np and np(1 - p) + n(n - 1)θ,
respectively. Thus the correlated binomial distribution is a generalization of
the binomial distribution; the CB distribution becomes the binomial
distribution when θ = 0. Altham (1978) derived a further two-parameter
generalized binomial distribution, namely the multiplicative generalized
binomial (MB) distribution.
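The following numerical check assumes the Bahadur/Kupper-Haseman form written
out above (with a small θ kept inside the admissible range); it verifies that
the probabilities sum to one and that the mean and variance are np and
np(1 - p) + n(n - 1)θ.

```python
# Check of the correlated binomial moments, assuming the pmf shown above.
from math import comb

def cb_pmf(x: int, n: int, p: float, theta: float) -> float:
    q = 1.0 - p
    correction = 1.0 + theta / (2 * p**2 * q**2) * (
        (x - n * p)**2 + x * (2 * p - 1) - n * p**2
    )
    return comb(n, x) * p**x * q**(n - x) * correction

n, p, theta = 8, 0.3, 0.01   # theta chosen small enough that all terms stay non-negative
probs = [cb_pmf(x, n, p, theta) for x in range(n + 1)]
mean = sum(x * pr for x, pr in enumerate(probs))
var = sum(x**2 * pr for x, pr in enumerate(probs)) - mean**2

print(sum(probs))                                    # ~1.0
print(mean, n * p)                                   # 2.4 2.4
print(var, n * p * (1 - p) + n * (n - 1) * theta)    # ~2.24 2.24
```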

Altham-multiplicative binomial distribution


The probability mass function of the Altham-multiplicative binomial
distribution is
P(X = x) = C(n, x) p^x (1 - p)^(n-x) a^(x(n-x)) / F(n),   x = 0, 1, 2, ..., n,

where a ≥ 0 and 0 ≤ p ≤ 1.
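A brief sketch evaluates this pmf; I treat F(n) as the normalizing constant
that makes the probabilities sum to one, which is my reading of the formula
rather than something stated explicitly here, and the values of n, p and a are
arbitrary.

```python
# Altham's multiplicative binomial pmf with F(n) taken to be the normalizer.
from math import comb

def altham_pmf(n: int, p: float, a: float):
    weights = [comb(n, x) * p**x * (1 - p)**(n - x) * a**(x * (n - x))
               for x in range(n + 1)]
    F = sum(weights)                       # normalizing constant F(n)
    return [w / F for w in weights]

probs = altham_pmf(n=6, p=0.3, a=0.9)
print(sum(probs))                          # 1.0
print(probs)                               # a < 1 shifts mass toward x = 0 and x = n
print(altham_pmf(6, 0.3, 1.0))             # a = 1 recovers the ordinary binomial
```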

Neyman C(α) test

Neyman (1959) introduced the C(α) test with the consideration that hypothesis
testing problems in applied research often involve several nuisance
parameters. In these composite testing problems, most powerful tests do not
exist, motivating the search for an optimal test procedure that yields the
highest power among the class of tests of the same size. Neyman's local
asymptotic optimality result for the C(α) test employs regularity conditions
inherited from the conditions used by Cramér (1946) for showing consistency of
the maximum likelihood estimator, together with some further restrictions on
the test function that allow the unknown nuisance parameters to be replaced by
root-n consistent estimators. It is the confluence of these Cramér conditions
and the maintained significance level α that gives the C(α) test its name.

TESTING THE GOODNESS OF FIT OF THE BINOMIAL DISTRIBUTION

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derived
tests of the goodness of fit of the binomial distribution using the C(α)
procedure of Neyman (1959); these tests are asymptotically optimal against
the generalized binomial alternatives proposed by Altham (1978) and Kupper &
Haseman (1978) [6].

The C(α) test for correlated binomial alternatives

Consider an experiment in which the responses take the form of proportions,
and let the ith response be given by p_i = x_i/n_i for i = 1, ..., M. Under the
correlated binomial model the log-likelihood function is

L = K + Σ_{i=1}^{M} {x_i log p + (n_i - x_i) log q}
      + Σ_{i=1}^{M} log[1 + θ/(2p^2 q^2) {(x_i - n_i p)^2 + x_i(2p - 1) - n_i p^2}],
where q = 1 - p and K is a constant involving only the observations. A test of
the goodness of fit of the binomial distribution is obtained by testing the
null hypothesis Ho: θ = 0 in the presence of the nuisance parameter p. Moran
(1970) demonstrated that for such problems the C(α) tests proposed by Neyman
(1959) are asymptotically equivalent to tests using maximum likelihood
estimators. In order to derive the C(α) test statistic for Ho: θ = 0 we need
the partial derivatives of L evaluated at θ = 0, denoted S1(p), S2(p) and S3(p)
and given in equations (1), (2) and (3) of the original article.
Under the null hypothesis, the x_i are independent binomial random variables,
and hence it follows from (2) that E{S2(p)} = 0. Neyman (1959) has shown that
when E{S2(p)} = 0 the null hypothesis Ho: θ = 0 can be tested using the
statistic S1(p̂), where p̂ is a root-n consistent estimator of p (Moran, 1970).
Substituting the consistent estimator p̂ = Σ x_i / Σ n_i into (1), and noting
that S2(p̂) = 0, we find that the C(α) test statistic is given by S = S1(p̂).
Since E{S2(p)} = 0, the variance of S(p̂) is given by E{S3(p)}, where the
expectation is taken under Ho: θ = 0. From (3) it follows that

E{S3(p)} = Σ n_i(n_i - 1) / (2p^2 q^2).

Substituting p̂ for p in the variance expression, we find that under the null
hypothesis the statistic

X2c = S^2 / [Σ n_i(n_i - 1) / (2p̂^2 q̂^2)]

will have an asymptotic chi-squared distribution with one degree of freedom.


The statistic X2c is the C(α) test statistic for homogeneity of proportions
which is asymptotically optimal against correlated binomial alternatives.
The binomial variance test for homogeneity is based on the statistic

X2v = Σ (x_i - n_i p̂)^2 / (n_i p̂ q̂),

which has an asymptotic chi-squared distribution with M - 1 degrees of freedom
when θ = 0. It is clear from the above expressions that in the case in which
n_i = n for all i, the C(α) test statistic S is equivalent to the variance test
statistic X2v.

The C(α) test for beta-binomial alternatives

The beta-binomial distribution is a mixture of binomial distributions which
has often been utilized as an alternative to the binomial distribution. Under
the beta-binomial model the log-likelihood function again takes the form
L = K + (terms involving p and θ), where K is a constant involving only the
observations. A test of the goodness of fit of the binomial distribution is
obtained by testing the null hypothesis Ho: θ = 0. The derivation of the C(α)
test statistic using the beta-binomial model is similar to the derivation for
the correlated binomial model, and the optimal statistic is again found to be
the statistic S derived in the last section. Note, however, that in the
beta-binomial model the parameter θ cannot take negative values. The
alternative hypothesis is therefore necessarily one-sided, and hence the C(α)
test is the one-sided test based on the statistic

Z = S / [Σ n_i(n_i - 1) / (2p̂^2 q̂^2)]^(1/2).

Under the null hypothesis Ho: θ = 0, the statistic Z will have an asymptotic
standard normal distribution.

The C(α) test for Altham's multiplicative alternatives

The multiplicative generalization of the binomial distribution provides an
alternative for which the correlated binomial C(α) test is not asymptotically
optimal. The C(α) test for Ho: a = 1, derived from the log-likelihood of the
multiplicative generalized binomial model, is based on a statistic R. Note
that, unlike the correlated binomial C(α) statistic, R is not equivalent to
the variance test statistic in the case n_i = n for all i. The corresponding
statistic X2m will have an asymptotic chi-squared distribution with one degree
of freedom. The test based on X2m is asymptotically optimal against
alternatives given by the multiplicative generalization of the binomial model.

Monte Carlo Study & Asymptotic Relative Efficiencies


In order to compare the different tests of the goodness of fit of the binomial
distribution, we consider the treatment-group data of Kupper & Haseman (1978,
p. 75). The observed proportions were 0/5, 2/5, 1/7, 0/8, 2/8, 3/8, 0/9, 4/9,
1/10 and 6/10. The variance test gives X2v = 19.03 and P = 0.025; the
correlated binomial C(α) test gives X2c = 6.63 and P = 0.01. Thus, for this
example, the correlated binomial C(α) test is more sensitive to the departure
of the observed proportions from a binomial distribution than the other tests
considered.
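Both quoted values can be reproduced from the ten proportions. The sketch
below assumes the usual closed forms of the variance test and of the
correlated binomial C(α) statistic (written as the square of a standardized
statistic Z); with the Kupper & Haseman data it returns 19.03 and 6.63.

```python
# Reproducing X2v and X2c for the Kupper & Haseman (1978) treatment-group data,
# assuming:
#   X2v = sum((x_i - n_i*p)^2 / (n_i*p*q)),                      df = M - 1,
#   X2c = Z^2 with Z = (sum((x_i - n_i*p)^2)/(p*q) - sum(n_i))
#                      / sqrt(2 * sum(n_i*(n_i - 1))),           df = 1,
# where p = sum(x_i)/sum(n_i) is the pooled proportion and q = 1 - p.
from math import sqrt

x = [0, 2, 1, 0, 2, 3, 0, 4, 1, 6]
n = [5, 5, 7, 8, 8, 8, 9, 9, 10, 10]

p = sum(x) / sum(n)
q = 1.0 - p

X2v = sum((xi - ni * p)**2 / (ni * p * q) for xi, ni in zip(x, n))
Z = ((sum((xi - ni * p)**2 for xi, ni in zip(x, n)) / (p * q) - sum(n))
     / sqrt(2 * sum(ni * (ni - 1) for ni in n)))
X2c = Z**2

print(round(X2v, 2))   # 19.03 on M - 1 = 9 degrees of freedom
print(round(X2c, 2))   # 6.63 on 1 degree of freedom
```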

In order to investigate the small-sample distribution of the C(α) tests under
the null hypothesis, a Monte Carlo experiment was performed. Ten binomial
proportions were randomly generated using the unequal sample sizes from the
above example. For each pseudorandom sample of 10 proportions the C(α)
statistics X2c and X2m and the variance test statistic X2v were calculated and
compared to the 10%, 5% and 1% points of their asymptotic null distributions.
The empirical significance levels based on 1500 replications are shown in
Table 1 for underlying binomial probabilities of 0.10, 0.25 and 0.50. For the
cases considered, the empirical significance levels for the correlated
binomial C(α) statistic are significantly lower than the nominal level for the
5% and 10% critical values. The empirical significance levels for the 1%
critical value show no consistent pattern.

Table 1: Empirical significance levels for the C(α) homogeneity tests optimal
against correlated binomial and multiplicative alternatives, and for the
variance test, based on 1500 replications for underlying binomial
probabilities of 0.10, 0.25 and 0.50

                                Binomial probability
                     P = 0.10            P = 0.25            P = 0.50
Nominal level    0.01  0.05  0.10    0.01  0.05  0.10    0.01  0.05  0.10
X2c             0.007 0.019 0.048   0.013 0.035 0.073   0.009 0.034 0.077
X2m             0.010 0.043 0.100   0.012 0.037 0.085   0.009 0.031 0.075
X2v             0.003 0.042 0.082   0.012 0.042 0.097   0.007 0.049 0.108

Table 2: Pitman asymptotic relative efficiencies of the variance test and the
generalized binomial C(α) tests for correlated binomial and multiplicative
alternatives

                              Model under alternative
Test statistic    Correlated binomial    Multiplicative generalized binomial
X2v                      0.95                        0.71
X2c                      1.00                        0.82
X2m                      0.79                        1.00

Table 2 above shows the Pitman asymptotic relative efficiencies of the
variance test and the generalized binomial C(α) tests under correlated
binomial and multiplicative alternatives; it shows that the correlated
binomial C(α) test is more efficient than the variance test against
multiplicative alternatives as well as against correlated binomial
alternatives.

REFERENCES
1. Mood, A. M., Graybill, F. A. and Boes, D. C., Introduction to the Theory of
   Statistics, third edition, McGraw-Hill Series in Probability and Statistics.
2. Casella, G. and Berger, R. L. (2002), Statistical Inference, second edition,
   page 89, Duxbury Advanced Series.
3. Hogg, R. V., McKean, J. W. and Craig, A. T. (2013), Introduction to
   Mathematical Statistics, seventh edition, Pearson Education, Inc.
4. Paul, S. R. (1984), A three-parameter generalization of the binomial
   distribution, Windsor Mathematics Report, February 1984.
5. Hogg, R. V., Tanis, E. A. and Rao, J. M., Probability and Statistical
   Inference, seventh edition, Pearson Education.
6. Tarone, R. E. (1979), Testing the goodness of fit of the binomial
   distribution, Biometrika 66, 585-590.

