
Business Statistics

PGDM(2017-19)
Term-II (Sep-Dec,2017)

Kakali Kanjilal
Associate Professor, Operations
IMI, Delhi
Tests of Hypothesis
- Concepts of hypothesis
- Null and Alternative Hypotheses
- Type I, Type II Errors and Power of the Test
- Tests of hypotheses:
I. Population Mean (Single and Two Population)
a. $\sigma$ known
b. $\sigma$ unknown
II. Population Proportions (Single and Two Population)
III. Population Variance (Single and Two Population)
IV. Tests of Independence
Steps of Hypothesis Testing
Step 1: Develop the null and alternative hypotheses.

Step 2: Specify the level of significance $\alpha$ (the acceptable probability of a Type I error) and hence the confidence level $(1 - \alpha)$ of the test.

Step 3: Collect evidence (sample data), compute the sample statistic, and then the test statistic, such as Z (standard normal variate) or t (Student's t).

Step 4: Decide whether to reject or not reject the null hypothesis using a decision rule:
the p-value approach, or
the critical value approach
Hypothesis

A hypothesis is a statement or assertion about the state of nature (about the true value of an unknown population parameter):
The accused is innocent
$\mu = 100$
Every hypothesis implies its contradiction or alternative:
The accused is guilty
$\mu \neq 100$
Examples of Some Hypothetical Statements
- A new teaching method is developed that is believed to be better than the current method.
- A new drug is developed with the goal of lowering blood pressure more than the existing drug and the result is positive.
- The label on a soft drink bottle states that it contains 67.6 fluid ounces.
- Stock market plunges after Diwali.
- A quality control inspector needs to check if the quantities supplied meet the specification criteria.
Develop Hypotheses

These statements of assertion, claim or belief need to be translated into
a. the statement that you want to test (the testing statement), against
b. an alternative statement, or contradiction of the testing statement.
a is called the null hypothesis, H0, and
b is called the alternative hypothesis, H1.

H0 and H1 are:
Mutually exclusive: Only one can be true.
Exhaustive: Together they cover all possibilities, so one or the
other must be true.
Develop Null and Alternative
Hypotheses
Example:
A new teaching method is developed that is
believed to be better than the current method

Alternative Hypothesis: H1
The new teaching method is better.

Null Hypothesis: H0
The new method is no better than the old method.

Here, the alternative hypothesis is framed as the research hypothesis.
Develop Null and Alternative
Hypotheses
Example:
The label on a soft drink bottle states that it contains on average at least 67.6 fluid ounces.

Null Hypothesis: H0
The label is correct: $\mu \geq 67.6$ ounces.

Alternative Hypothesis: H1
The label is incorrect: $\mu < 67.6$ ounces.

Here, the null hypothesis is framed as an assumption to be challenged.
One tail and Two tailed Tests
The tails of a statistical test are determined by the need for an action.

If action is to be taken if a parameter is greater than some value a, then the alternative hypothesis is that the parameter is greater than a, and the test is a right-tailed test.
H0: $\mu \leq 50$
H1: $\mu > 50$

If action is to be taken if a parameter is less than some value a, then the alternative hypothesis is that the parameter is less than a, and the test is a left-tailed test.
H0: $\mu \geq 50$
H1: $\mu < 50$

If action is to be taken if a parameter is either greater than or less than some value a, then the alternative hypothesis is that the parameter is not equal to a, and the test is a two-tailed test.
H0: $\mu = 50$
H1: $\mu \neq 50$
Summary

The equality part of the hypotheses always appears in the null hypothesis.

In general, a hypothesis test about the value of a population mean $\mu$ must take one of the following three forms (where $\mu_0$ is the hypothesized value of the population mean).

One-tailed (lower-tail):   H0: $\mu \geq \mu_0$    Ha: $\mu < \mu_0$
One-tailed (upper-tail):   H0: $\mu \leq \mu_0$    Ha: $\mu > \mu_0$
Two-tailed:                H0: $\mu = \mu_0$       Ha: $\mu \neq \mu_0$
Possible Outcomes of Hypothesis
Tests
A hypothesis is either true or false, and you may fail to
reject it or you may reject it on the basis of information:
Trial testimony and evidence
Sample data
The limitation of working with a sample is that we cannot be 100% confident about our decisions. But we can be reasonably confident (90%, 95%, 98%, or 99%) if we limit the chance of error to 10%, 5%, 2% or 1% respectively.
Possible Outcomes of Hypothesis
Tests
There are two possible states of nature:
H0 is true
H0 is false
There are two possible decisions:
Fail to reject H0 (proceed as if it is true)
Reject H0 (conclude that it is false)
Possible Outcomes of Hypothesis
Tests

A decision may be correct in two ways:


Fail to reject a true H0
Reject a false H0
A decision may be incorrect in two ways:
Reject a true H0
Fail to reject a false H0
Type I and Type II Errors
Type-I Error: Reject a true H0
A Type-I error is the rejection of H0 when it is true.
The probability of a Type-I error is denoted by $\alpha$.
$\alpha$ is called the level of significance of the test.
$(1 - \alpha)$, the probability of not rejecting H0 when it is true, is the confidence level.

Type-II Error: Fail to reject a false H0
A Type-II error is the failure to reject H0 when it is false.
The probability of a Type-II error is denoted by $\beta$.
$\beta$ is difficult to control, so statisticians avoid the risk of making a Type-II error by saying "do not reject H0" rather than "accept H0".
The complement of $\beta$, $(1 - \beta)$, is the probability of rejecting H0 when it is false; it is called the power of the test.
Type I and Type II Errors
A contingency table illustrates the possible outcomes of a statistical hypothesis test.

                 States of Nature / Population Conditions
Conclusion       H0 True              H0 False
Accept H0        Correct Decision     Type II Error
Reject H0        Type I Error         Correct Decision
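The trade-off between the two error probabilities can be made concrete with a small calculation. The sketch below assumes an upper-tail z test with known $\sigma$; the function name and all numbers (hypothesized mean 100, true mean 105, $\sigma$ = 15, n = 36) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type2_error_upper_tail(mu0, mu_true, sigma, n, alpha):
    """Return (beta, power) for an upper-tail z test of H0: mu <= mu0.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_error = sigma / sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # H0 is rejected when the sample mean exceeds this cut-off.
    x_crit = mu0 + z_alpha * std_error
    # beta = P(fail to reject H0) when the true mean is mu_true.
    beta = NormalDist(mu_true, std_error).cdf(x_crit)
    return beta, 1 - beta

# Hypothetical inputs, not taken from the slides.
beta, power = type2_error_upper_tail(mu0=100, mu_true=105, sigma=15, n=36, alpha=0.05)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Lowering alpha in this sketch pushes the cut-off to the right and so raises beta, which is why the two errors cannot both be minimized for a fixed sample size.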
General Rule to select
Type I error and Type II error
In cases where a Type-I error is more costly, we choose a small value of $\alpha$, such as 1%.

In cases where a Type-II error is more costly, we keep a large value of $\alpha$, such as 10%.

In cases where we are not able to determine which type of error is more costly, or if the costs are roughly equal, or if we do not have much knowledge about the relative costs of the two types of errors, we keep $\alpha$ = 5%.
Test Statistic
The null hypothesis H0 is rejected or not rejected on the basis of a test statistic.

A test statistic is a sample statistic computed from sample data. The value of the test statistic is used in determining whether or not we may reject the null hypothesis.

For a test of the population mean, the z test statistic or the t test statistic is used, depending on whether the population standard deviation $\sigma$ is known.
Test Population Mean
Cases in which the test statistic is Z

$\sigma$ is known and the population is normal.
$\sigma$ is known and the sample size is at least 30. (The population need not be normal; the sampling distribution of $\bar{x}$ approaches normal by the CLT.)

The formula for calculating Z is:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
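As a quick illustration of the Z formula, the sketch below borrows the Glow Toothpaste figures that appear later in these slides (sample mean 6.1, $\sigma$ = 0.2, n = 30, hypothesized mean 6). The function name `z_statistic` is an assumption made for this example.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

z = z_statistic(sample_mean=6.1, mu0=6.0, sigma=0.2, n=30)
print(round(z, 2))  # about 2.74
```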
Test Population Mean
Cases in which the test statistic is t

$\sigma$ is unknown but the sample standard deviation $s$ is known and the population is normal.

The formula for calculating t is:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Decision based on
p-value Approach
p-value:
The p-value is the probability, computed using the test statistic, that measures the support based on the sample for the null hypothesis.
So, the p-value is a kind of credibility rating of H0 in light of the sample evidence. Suppose that, for the same pair of hypotheses H0 and H1, two different samples give p-values of 12% and 2%, with $\alpha = 0.05$. H0 is more credible when the p-value is 12% than when it is 2%. At $\alpha = 0.05$, H0 is rejected for the sample with a p-value of 2% and not rejected for the sample with a p-value of 12%.
Decision based on
p-value Approach

If the p-value is less than or equal to the level of significance $\alpha$, the value of the test statistic is in the rejection region.

Reject H0 if the p-value $\leq \alpha$.

For a one-tailed test, the p-value is the tail area beyond the test statistic. For a two-tailed test, double that tail area to obtain the p-value.
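The p-value rule for a z test statistic can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `p_value_z` and the sample z value are assumptions, not part of the slides.

```python
from statistics import NormalDist

def p_value_z(z, tail):
    """p-value of a z statistic; tail is 'lower', 'upper' or 'two'."""
    cdf = NormalDist().cdf
    if tail == "lower":
        return cdf(z)                    # area to the left of z
    if tail == "upper":
        return 1 - cdf(z)                # area to the right of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))     # double the tail area
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

alpha = 0.05
p = p_value_z(2.1, "two")                # hypothetical test statistic
print(f"p-value = {p:.4f}; reject H0: {p <= alpha}")
```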
Critical Value Approach

For a given level of significance $\alpha$, we can obtain $z_\alpha$ from the normal probability table.

$z_\alpha$ marks the boundary point of the rejection region.

The rejection region is also known as the critical region.

The boundary point is also called the critical point.

Critical Value Approach

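The rejection rule is:

Lower tail test: H0: $\mu \geq \mu_0$, Ha: $\mu < \mu_0$
Reject H0 if $z < -z_\alpha$
Upper tail test: H0: $\mu \leq \mu_0$, Ha: $\mu > \mu_0$
Reject H0 if $z > z_\alpha$
Two tail test: H0: $\mu = \mu_0$, Ha: $\mu \neq \mu_0$
Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.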
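The three rejection rules can be written as one small function. This is a sketch under the assumption that the test statistic is standard normal; `reject_h0` is a hypothetical helper name, and the critical values come from `statistics.NormalDist.inv_cdf` rather than a printed table.

```python
from statistics import NormalDist

def reject_h0(z, alpha, tail):
    """Critical value approach for a z test; tail is 'lower', 'upper' or 'two'."""
    inv_cdf = NormalDist().inv_cdf
    if tail == "lower":
        return z < -inv_cdf(1 - alpha)          # z < -z_alpha
    if tail == "upper":
        return z > inv_cdf(1 - alpha)           # z > z_alpha
    if tail == "two":
        return abs(z) > inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    raise ValueError("tail must be 'lower', 'upper' or 'two'")

# With alpha = 5% the two-tailed critical points are about -1.96 and 1.96.
print(reject_h0(2.0, 0.05, "two"), reject_h0(1.9, 0.05, "two"))
```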
Critical Region for two tailed test
When $\alpha$ = 5%

[Figure: standard normal density f(z). The acceptance region between the critical points $-z_{0.05/2} = -1.96$ and $z_{0.05/2} = 1.96$ has area 95%; the critical region in each tail has area 0.025, giving a total level of 5%.]

Two tailed Tests
Confidence Interval Approach

Select a simple random sample from the population and use the value of the sample mean $\bar{x}$ to develop the confidence interval for the population mean $\mu$.

If the confidence interval contains the hypothesized value $\mu_0$, do not reject H0. Otherwise, reject H0.
Two tailed Tests
Confidence Interval Approach
Glow Toothpaste
The 97% confidence interval for $\mu$ is

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 6.1 \pm 2.17 \left( 0.2 / \sqrt{30} \right) = 6.1 \pm 0.07924$$

or 6.02076 to 6.17924.

Because the hypothesized value for the population mean, $\mu_0 = 6$, is not in this interval, the hypothesis-testing conclusion is that the null hypothesis, H0: $\mu = 6$, can be rejected.
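The Glow Toothpaste interval can be reproduced with a few lines of code. This is a minimal sketch; the function name `z_confidence_interval` is an assumption, and the z value is computed with `NormalDist.inv_cdf` instead of being read from a table, so it differs from 2.17 only in later decimals.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(sample_mean=6.1, sigma=0.2, n=30, confidence=0.97)
mu0 = 6.0
decision = "do not reject H0" if low <= mu0 <= high else "reject H0"
print(f"{low:.5f} to {high:.5f}: {decision}")
```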
Hypothesis Tests About a Population
Mean when $\sigma$ unknown

Test Statistic to be used is

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

This test statistic has a t distribution with n - 1 degrees of freedom.
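A short sketch of the t statistic computed from raw data follows. The sample values and the name `t_statistic` are hypothetical; the resulting t would then be compared with a critical value from a t table with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """Return (t, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    s = stdev(sample)                     # sample standard deviation (n - 1 divisor)
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1

# Hypothetical measurements, not taken from the slides.
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4]
t, df = t_statistic(sample, mu0=10.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```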
Hypothesis Tests About a Population
Mean when $\sigma$ unknown
Rejection Rule: p-Value Approach
Reject H0 if p-value $\leq \alpha$
Rejection Rule: Critical Value Approach
H0: $\mu \geq \mu_0$    Reject H0 if $t < -t_\alpha$

H0: $\mu \leq \mu_0$    Reject H0 if $t > t_\alpha$

H0: $\mu = \mu_0$    Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$

Note: The critical value approach is advisable for the t test statistic, because printed t tables do not give an exact p-value; an approximation or interpolation is required to obtain it.
Some Practical Tips
A sample size of 30 provides good results in most cases.
If the population is skewed or has outliers, a sample size of
50 is preferred.
The smaller the p-value, the greater the evidence
against the null H0 and hence in favour of the alternate
hypothesis H1.
When the p-value is smaller than 0.01, the result is considered to be
very significant.
When the p-value is between 0.01 and 0.05, the result is considered to
be significant.
When the p-value is between 0.05 and 0.10, the result is considered by
some as marginally significant (and by most as not significant).
When the p-value is greater than 0.10, the result is considered not
significant.
Test of Hypothesis
Population Proportion
Null and Alternative Hypotheses
A hypothesis test about the value of a population proportion p must take one of the following three forms (where $p_0$ is the hypothesized value of the population proportion).

One-tailed (left tail):    H0: $p \geq p_0$    Ha: $p < p_0$
One-tailed (right tail):   H0: $p \leq p_0$    Ha: $p > p_0$
Two-tailed:                H0: $p = p_0$       Ha: $p \neq p_0$
Test Statistic

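$$z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}$$

where:
$$\sigma_{\bar{p}} = \sqrt{\frac{p_0 (1 - p_0)}{n}}$$

assuming $np > 5$ and $n(1 - p) > 5$, so that Z approximately follows the standard normal distribution.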


Rejection Rule
p-value approach
Reject H0 if p-value $\leq \alpha$

Critical value approach

Right Tail: H0: $p \leq p_0$    Reject H0 if $z > z_\alpha$

Left Tail: H0: $p \geq p_0$    Reject H0 if $z < -z_\alpha$

Two Tail: H0: $p = p_0$    Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
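The proportion test can be sketched in a few lines. The counts below are hypothetical and `proportion_z_test` is an illustrative name; the sketch returns the z statistic together with the two-tailed p-value.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Return (z, two-tailed p-value) for H0: p = p0."""
    p_bar = successes / n                       # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)         # standard error under H0
    z = (p_bar - p0) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical survey: 67 successes in 120 trials, testing p0 = 0.5.
z, p_value = proportion_z_test(successes=67, n=120, p0=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```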


Test of Hypothesis
Difference of Two Population Means

Population 1: Cognizant stock return
$\mu_1$ = mean return of Cognizant

Population 2: WIPRO stock return
$\mu_2$ = mean return of WIPRO

$\mu_1 - \mu_2$ = difference between the mean returns of the two populations

Simple random sample of $n_1$ stock returns from Cognizant; $\bar{x}_1$ = its sample mean
Simple random sample of $n_2$ stock returns from WIPRO; $\bar{x}_2$ = its sample mean

$\bar{x}_1 - \bar{x}_2$ = point estimate of $\mu_1 - \mu_2$
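Hypothesis Tests
About $\mu_1 - \mu_2$
Possible Situations
I: Difference between two population means is 0
$\mu_1 = \mu_2$
H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 \neq 0$
II: Difference between two population means is less than or equal to 0
$\mu_1 \leq \mu_2$
H0: $\mu_1 - \mu_2 \leq 0$
H1: $\mu_1 - \mu_2 > 0$
III: Difference between two population means is less than or equal to D
$\mu_1 \leq \mu_2 + D$
H0: $\mu_1 - \mu_2 \leq D$
H1: $\mu_1 - \mu_2 > D$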
Hypothesis Tests
About $\mu_1 - \mu_2$
Possible Forms of Null Hypothesis

Left-tailed:    H0: $\mu_1 - \mu_2 \geq D_0$    Ha: $\mu_1 - \mu_2 < D_0$
Right-tailed:   H0: $\mu_1 - \mu_2 \leq D_0$    Ha: $\mu_1 - \mu_2 > D_0$
Two-tailed:     H0: $\mu_1 - \mu_2 = D_0$       Ha: $\mu_1 - \mu_2 \neq D_0$

$\mu_1$ is the mean of population 1 and $\mu_2$ is the mean of population 2.
$D_0$ is the hypothesized difference between the two population means, $\mu_1 - \mu_2$.
$D_0$ can be positive or negative or zero. $D_0 = 0$ is the most common form of the test.
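Test Statistic About $\mu_1 - \mu_2$

Case I: Test Statistic is Z (Standard Normal)
When the population standard deviations $\sigma_1$ and $\sigma_2$ are known.

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

where $D_0$ is the hypothesized value of $\mu_1 - \mu_2$, and the denominator is the standard error (standard deviation) of $(\bar{x}_1 - \bar{x}_2)$.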
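The two-sample Z statistic translates directly into code. The sketch below is illustrative only: the function name `two_sample_z` and the numbers are hypothetical, and both population standard deviations are assumed known.

```python
from math import sqrt

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, d0=0.0):
    """z statistic for H0: mu1 - mu2 = d0 with known sigma1 and sigma2."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - d0) / std_error

# Hypothetical mean returns for two stocks.
z = two_sample_z(mean1=10.0, mean2=8.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)
print(round(z, 3))
```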
Confidence Intervals for the
Difference between Two Population
Means: $\mu_1 - \mu_2$
Case I: Test Statistic is Z (Standard Normal)
When the population standard deviations $\sigma_1$ and $\sigma_2$ are known.

A large-sample $(1 - \alpha)$100% confidence interval for the difference between two population means, $\mu_1 - \mu_2$, using independent random samples:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Test Statistic About $\mu_1 - \mu_2$
Case II: Test Statistic is t (Student's t-Statistic)
When the population standard deviations $\sigma_1$ and $\sigma_2$ are unknown and have to be estimated from two samples

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

where $(\mu_1 - \mu_2)_0$ is the hypothesized difference $D_0$.

Assuming that the population variances $\sigma_1^2$ and $\sigma_2^2$ are equal (even though unknown), the statistic follows Student's t distribution with $(n_1 + n_2 - 2)$ degrees of freedom.
Under this assumption, the two sample variances, $s_1^2$ and $s_2^2$, provide two separate estimators of the common population variance.
Combining the two separate estimates into a pooled estimate $s_p^2$ should give us a better estimate than either sample variance by itself.
Pooled Estimate of the Population
Variance
A pooled estimate of the common population variance, based on a sample variance $s_1^2$ from a sample of size $n_1$ and a sample variance $s_2^2$ from a sample of size $n_2$, is given by:

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

That is, larger weight is given to the variance from the larger sample.

The degrees of freedom associated with this estimator is:
df = $(n_1 + n_2 - 2)$
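The pooled variance and the resulting t statistic can be sketched as below. The function names and the numbers are hypothetical; the sketch assumes equal (unknown) population variances, as stated in the slides.

```python
from math import sqrt

def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Weighted average of two sample variances, weights (n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_t(mean1, mean2, s1_sq, n1, s2_sq, n2, d0=0.0):
    """Return (t, degrees of freedom) for H0: mu1 - mu2 = d0."""
    sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
    t = ((mean1 - mean2) - d0) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical samples: the larger sample (n2 = 21) pulls the pooled value towards 6.
sp_sq = pooled_variance(s1_sq=4.0, n1=11, s2_sq=6.0, n2=21)
t, df = pooled_t(mean1=12.0, mean2=10.0, s1_sq=4.0, n1=5, s2_sq=4.0, n2=5)
print(round(sp_sq, 4), round(t, 4), df)
```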
Confidence Intervals for the
Difference between Two Population
Means: $\mu_1 - \mu_2$
Case II: Test Statistic is t (Student's t-Statistic)
When the population standard deviations $\sigma_1$ and $\sigma_2$ are unknown and have to be estimated from two samples, using the pooled variance $s_p^2$

A $(1 - \alpha)$100% confidence interval for the difference between two population means, $\mu_1 - \mu_2$, using independent random samples and assuming equal population variances:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
Hypothesis Tests
About $\mu_1 - \mu_2$: Matched Samples
Possible Situations
I: Difference between two population means is 0
$\mu_1 - \mu_2 = \mu_d$
H0: $\mu_d = 0$
H1: $\mu_d \neq 0$
II: Difference between two population means is less than or equal to 0
$\mu_d \leq 0$
H0: $\mu_d \leq 0$
H1: $\mu_d > 0$
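Test Statistic
Test statistic for the paired-observations test:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}, \quad df = n - 1$$

where
$\bar{d} = \frac{\sum d_i}{n}$ is the sample average of the differences,
$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$ is the sample standard deviation of the differences,
$n$ is the sample size (the number of pairs), and
$\mu_d$ is the mean of the population of differences under the null hypothesis.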
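A minimal sketch of the paired-observations statistic follows. The before/after values and the name `paired_t` are hypothetical; the differences are taken as first sample minus second sample.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second, mu_d0=0.0):
    """Return (t, degrees of freedom) for matched samples."""
    if len(first) != len(second):
        raise ValueError("matched samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = (mean(diffs) - mu_d0) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical paired measurements.
t, df = paired_t([5, 7, 9], [4, 4, 4])
print(round(t, 3), df)
```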
Confidence Intervals for Paired
Observations

A $(1 - \alpha)$100% confidence interval for the mean difference $\mu_d$:

$$\bar{d} \pm t_{\alpha/2} \frac{s_d}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the value of the t distribution with (n - 1) degrees of freedom that cuts off an area of $\alpha/2$ to its right. When the sample size is large, we may approximate $t_{\alpha/2}$ with $z_{\alpha/2}$.
Hypothesis Tests:
Difference between two Population
Proportions p1 - p2
Possible Hypotheses

Left-tailed:    H0: $p_1 - p_2 \geq 0$    Ha: $p_1 - p_2 < 0$
Right-tailed:   H0: $p_1 - p_2 \leq 0$    Ha: $p_1 - p_2 > 0$
Two-tailed:     H0: $p_1 - p_2 = 0$       Ha: $p_1 - p_2 \neq 0$

Hypothesis Tests:
Difference between two Population
Proportions p1 - p2
Test Statistic
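$$z = \frac{(\bar{p}_1 - \bar{p}_2) - 0}{\sqrt{\bar{p} (1 - \bar{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

where $\bar{p}_1 = \frac{x_1}{n_1}$ is the sample proportion for sample 1
and $\bar{p}_2 = \frac{x_2}{n_2}$ is the sample proportion for sample 2.
$\bar{p}$ stands for a pooled estimator:
$$\bar{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$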
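The two-proportion statistic with the pooled estimator can be sketched as follows. The counts and the name `two_proportion_z` are hypothetical; note that the pooled estimator reduces to total successes over total sample size.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p1 = x1 / n1
    p2 = x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)   # same as (n1*p1 + n2*p2) / (n1 + n2)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

# Hypothetical counts: 60 of 100 versus 40 of 100.
z = two_proportion_z(60, 100, 40, 100)
print(round(z, 3))
```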
Problems
CASES
CASES (using Excel)
SENSEX RETURNS
Test of Hypothesis for Population
Variance:
The Chi-Square ($\chi^2$) Distribution
Confidence intervals for the population variance are based on the chi-square ($\chi^2$) distribution.

If $X_i \sim N(\mu, \sigma^2)$, then
$$\sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^2$$
is a $\chi^2$ variate with (n-1) df.

Or, if $X_i \sim N(\mu, \sigma^2)$, then $(n-1) s^2 / \sigma^2$ follows the $\chi^2$ distribution with (n-1) df.
The Chi-Square ($\chi^2$) Distribution
[Figure: chi-square density curves with 2, 5 and 10 degrees of freedom.]

The chi-square random variable cannot be negative, so it is bound by zero on the left.
The chi-square distribution is skewed to the right.
The chi-square distribution approaches a normal as the degrees of freedom increase.

In sampling from a normal population, the random variable
$$\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$$
has a chi-square distribution with (n - 1) degrees of freedom.
Interval Estimation of $\sigma^2$
We will use the notation $\chi^2_\alpha$ to denote the value for the chi-square distribution that provides an area of $\alpha$ to the right of the stated value.

For example, there is a .95 probability of obtaining a $\chi^2$ (chi-square) value such that

$$\chi^2_{.975} \leq \chi^2 \leq \chi^2_{.025}$$
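Interval Estimation of $\sigma^2$
$$\chi^2_{.975} \leq \frac{(n - 1) s^2}{\sigma^2} \leq \chi^2_{.025}$$

[Figure: chi-square density with area .025 in each tail; 95% of the possible $\chi^2$ values lie between $\chi^2_{.975}$ and $\chi^2_{.025}$.]

The 95% confidence interval for the population variance is:

$$\frac{(n - 1) s^2}{\chi^2_{0.025}} \leq \sigma^2 \leq \frac{(n - 1) s^2}{\chi^2_{0.975}}$$

In general,
$$\frac{(n - 1) s^2}{\chi^2_{\alpha/2}} \leq \sigma^2 \leq \frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$$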
Selected Values from the Chi-Square
Distribution Table
For n - 1 = 10 - 1 = 9 d.f. and $\alpha$ = .05

Degrees                          Area in Upper Tail
of Freedom    .99     .975    .95     .90     .10      .05      .025     .01
5             0.554   0.831   1.145   1.610   9.236    11.070   12.832   15.086
6             0.872   1.237   1.635   2.204   10.645   12.592   14.449   16.812
7             1.239   1.690   2.167   2.833   12.017   14.067   16.013   18.475
8             1.647   2.180   2.733   3.490   13.362   15.507   17.535   20.090
9             2.088   2.700   3.325   4.168   14.684   16.919   19.023   21.666
10            2.558   3.247   3.940   4.865   15.987   18.307   20.483   23.209

In the row for 9 degrees of freedom, our $\chi^2_{0.975}$ value is 2.700 and our $\chi^2_{.025}$ value is 19.023.
Hypothesis Test of $\sigma^2$
Hypotheses

Left tail:    H0: $\sigma^2 \geq \sigma_0^2$    Ha: $\sigma^2 < \sigma_0^2$    Reject H0 if $\chi^2 \leq \chi^2_{(1 - \alpha)}$
Right tail:   H0: $\sigma^2 \leq \sigma_0^2$    Ha: $\sigma^2 > \sigma_0^2$    Reject H0 if $\chi^2 \geq \chi^2_\alpha$
Two tail:     H0: $\sigma^2 = \sigma_0^2$       Ha: $\sigma^2 \neq \sigma_0^2$    Reject H0 if $\chi^2 \leq \chi^2_{(1 - \alpha/2)}$ or $\chi^2 \geq \chi^2_{\alpha/2}$

Test Statistic
$$\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$$
where $\sigma_0^2$ is the hypothesized value for the population variance.

Note: Since the chi-square table only provides the critical values, it cannot be used to calculate exact p-values. As in the case of the t-tables, only a range of possible values can be inferred.
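A two-tailed variance test for a sample of size 10 can be sketched with the two critical values read from the 9 d.f. row of the chi-square table in these slides (2.700 and 19.023). The sample variance and hypothesized variance below are hypothetical, as is the function name.

```python
def chi_square_statistic(sample_variance, n, sigma0_sq):
    """chi-square = (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * sample_variance / sigma0_sq

# Table values for 9 degrees of freedom, alpha = .05 (two-tailed).
CHI2_975_DF9 = 2.700
CHI2_025_DF9 = 19.023

# Hypothetical data: n = 10, s^2 = 0.9, H0: sigma^2 = 0.5.
chi2 = chi_square_statistic(sample_variance=0.9, n=10, sigma0_sq=0.5)
reject = chi2 <= CHI2_975_DF9 or chi2 >= CHI2_025_DF9

# Matching 95% confidence interval for the population variance.
ci_low = (10 - 1) * 0.9 / CHI2_025_DF9
ci_high = (10 - 1) * 0.9 / CHI2_975_DF9
print(f"chi-square = {chi2:.1f}, reject H0: {reject}")
print(f"95% CI for variance: {ci_low:.4f} to {ci_high:.4f}")
```

In this sketch the hypothesized variance 0.5 lies inside the interval, which agrees with the decision not to reject H0.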
Test of Hypothesis for
Two Population Variance:
The F Distribution
The F distribution is the distribution of the ratio of two chi-square
random variables that are independent of each other, each of which
is divided by its own degrees of freedom.

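An F random variable with $n_1 - 1$ and $n_2 - 1$ degrees of freedom:

$$F_{(n_1 - 1,\, n_2 - 1)} = \frac{\chi_1^2 / (n_1 - 1)}{\chi_2^2 / (n_2 - 1)} = \frac{s_1^2}{s_2^2}$$

where the second equality holds under the assumption of equal population variances.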


Test of Hypothesis for
Two Population Variance:
The F Distribution
F Distributions with different Degrees of Freedom

[Figure: density curves f(F) for F(5,6), F(10,15) and F(25,30).]

The F random variable cannot be negative, so it is bound by zero on the left.
The F distribution is skewed to the right.
The F distribution is identified by the number of degrees of freedom in the numerator, $n_1 - 1$, and the number of degrees of freedom in the denominator, $n_2 - 1$.
Critical Point of F Distribution
Critical Points of the F Distribution Cutting Off a Right-Tail Area of 0.05
(k1 = numerator degrees of freedom, k2 = denominator degrees of freedom)

k2 \ k1    1       2       3       4       5       6       7       8       9
1          161.4   199.5   215.7   224.6   230.2   234.0   236.8   238.9   240.5
2          18.51   19.00   19.16   19.25   19.30   19.33   19.35   19.37   19.38
3          10.13   9.55    9.28    9.12    9.01    8.94    8.89    8.85    8.81
4          7.71    6.94    6.59    6.39    6.26    6.16    6.09    6.04    6.00
5          6.61    5.79    5.41    5.19    5.05    4.95    4.88    4.82    4.77
6          5.99    5.14    4.76    4.53    4.39    4.28    4.21    4.15    4.10
7          5.59    4.74    4.35    4.12    3.97    3.87    3.79    3.73    3.68
8          5.32    4.46    4.07    3.84    3.69    3.58    3.50    3.44    3.39
9          5.12    4.26    3.86    3.63    3.48    3.37    3.29    3.23    3.18
10         4.96    4.10    3.71    3.48    3.33    3.22    3.14    3.07    3.02
11         4.84    3.98    3.59    3.36    3.20    3.09    3.01    2.95    2.90
12         4.75    3.89    3.49    3.26    3.11    3.00    2.91    2.85    2.80
13         4.67    3.81    3.41    3.18    3.03    2.92    2.83    2.77    2.71
14         4.60    3.74    3.34    3.11    2.96    2.85    2.76    2.70    2.65
15         4.54    3.68    3.29    3.06    2.90    2.79    2.71    2.64    2.59

[Figure: density f(F) of the F distribution with 7 and 11 degrees of freedom, with the right-tail critical point $F_{0.05} = 3.01$ marked.]

The left-hand critical point to go along with $F_{(k_1, k_2)}$ is given by:
$$\frac{1}{F_{(k_2, k_1)}}$$
where $F_{(k_2, k_1)}$ is the right-hand critical point for an F random variable with the reverse number of degrees of freedom.
Critical Points of the F Distribution:
F(6, 9), $\alpha$ = 0.10

The right-hand critical point read directly from the table of the F distribution is:
$$F_{(6,9)} = 3.37$$

The corresponding left-hand critical point is given by:
$$\frac{1}{F_{(9,6)}} = \frac{1}{4.10} = 0.2439$$

[Figure: density f(F) of the F distribution with 6 and 9 degrees of freedom, with area 0.05 in each tail and 0.90 in between; critical points $F_{0.95} = 1/4.10 = 0.2439$ and $F_{0.05} = 3.37$.]
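The reciprocal rule for the left-hand critical point is easy to check numerically. The sketch below uses the two table values quoted on this slide (3.37 for F(6,9) and 4.10 for F(9,6)); the function names and the sample test statistic are assumptions for illustration.

```python
def left_critical_point(f_right_reversed_df):
    """Left-hand critical point = 1 / right-hand point with reversed d.f."""
    return 1.0 / f_right_reversed_df

def two_tail_f_reject(f_stat, f_left, f_right):
    """Reject H0 when the F statistic falls outside [f_left, f_right]."""
    return f_stat < f_left or f_stat > f_right

f_right = 3.37                        # F_{0.05}(6, 9) from the table
f_left = left_critical_point(4.10)    # 1 / F_{0.05}(9, 6)
print(round(f_left, 4))               # 0.2439

# Hypothetical test statistic of 1.5 falls between the critical points.
print(two_tail_f_reject(1.5, f_left, f_right))
```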
Hypothesis Test for
Two Population Variance
Hypotheses

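One-tailed:   H0: $\sigma_1^2 \leq \sigma_2^2$    Ha: $\sigma_1^2 > \sigma_2^2$
Two-tailed:   H0: $\sigma_1^2 = \sigma_2^2$       Ha: $\sigma_1^2 \neq \sigma_2^2$

Test Statistic
$$F = \frac{s_1^2}{s_2^2}$$

For further details on deriving the left-tail critical value, please read pages 378 to 381 of the text.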
Hypothesis Test for
Two Population Variance
Rejection Rule: one-tail (right tail) test
Reject H0 if $F > F_\alpha$,
where the value of $F_\alpha$ is based on an F distribution with $n_1 - 1$ (numerator) and $n_2 - 1$ (denominator) d.f.

Rejection Rule: two-tail test

Reject H0 if $F > F_{\alpha/2}$
OR
Reject H0 if $F < F_{1 - \alpha/2}$,
where the value of $F_{\alpha/2}$ is based on an F distribution with $n_1 - 1$ (numerator) and $n_2 - 1$ (denominator) d.f.
Example 8-9
The economist wants to test whether or not the event (interceptions and prosecution of insider traders) has decreased the variance of prices of stocks.

Population 1: Before        Population 2: After
$n_1 = 25$                   $n_2 = 24$
$s_1^2 = 9.3$                $s_2^2 = 3.0$

H0: $\sigma_1^2 \leq \sigma_2^2$
H1: $\sigma_1^2 > \sigma_2^2$

$$F_{(n_1 - 1,\, n_2 - 1)} = F_{(24,23)} = \frac{s_1^2}{s_2^2} = \frac{9.3}{3.0} = 3.1$$

Critical points: $F_{0.05}(24,23) = 2.01$ and $F_{0.01}(24,23) = 2.70$.

H0 may be rejected at a 1% level of significance.
Example 8-9: Solution
[Figure: density f(F) of the F distribution with 24 and 23 degrees of freedom, with the critical point $F_{0.01} = 2.7$ and the test statistic 3.1 marked.]

Since the value of the test statistic is above the critical point, even for a level of significance as small as 0.01, the null hypothesis may be rejected, and we may conclude that the variance of stock prices is reduced after the interception and prosecution of inside traders.
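The arithmetic of Example 8-9 can be replayed in a few lines. This is a sketch using the sample variances and the two critical values quoted in the example; the helper name `f_statistic` is an assumption.

```python
def f_statistic(s1_sq, s2_sq):
    """F = s1^2 / s2^2 for the right-tail test of H1: sigma1^2 > sigma2^2."""
    return s1_sq / s2_sq

f_stat = f_statistic(9.3, 3.0)            # before / after sample variances

# Critical points of F(24, 23) quoted in the example.
critical_points = {0.05: 2.01, 0.01: 2.70}
for alpha, f_crit in critical_points.items():
    print(f"alpha = {alpha}: F = {f_stat:.1f} > {f_crit} -> reject H0: {f_stat > f_crit}")
```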
Example 8-9: Solution Using the EXCEL

Decision: Reject the null hypothesis; p-value = 0.0042.


Example 8-10: Testing the Equality of
Variances for Example 8-5

Population 1                 Population 2
$n_1 = 14$                    $n_2 = 9$
$s_1^2 = 0.12^2$              $s_2^2 = 0.11^2$

H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 \neq \sigma_2^2$

$$F_{(n_1 - 1,\, n_2 - 1)} = F_{(13,8)} = \frac{s_1^2}{s_2^2} = \frac{0.12^2}{0.11^2} = 1.19$$

Critical points: $F_{0.05}(13,8) = 3.28$ and $F_{0.10}(13,8) = 2.50$.

H0 may not be rejected at the 10% level of significance.
Example 8-10: Solution

[Figure: density f(F) of the F distribution with 13 and 8 degrees of freedom, with area 0.10 in each tail and 0.80 in between; left critical point $F_{0.90} = 1/2.20 = 0.4545$, right critical point $F_{0.10} = 2.50$, and the test statistic 1.19 marked between them.]

Since the value of the test statistic is between the critical points, even for a 20% level of significance, we cannot reject the null hypothesis. We therefore proceed as if the two population variances are equal.
Template to test for the Difference between Two
Population Variances: Example 8-10

Decision: Do not reject the null hypothesis; p-value = 0.8304; assume equal variances.
Problems
Test of Independence
CASE: UNITECH II
Each home sold by Unitech can be classified according to price and to style. Unitech's manager would like to determine if the price of the home and the style of the home are independent variables.

The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either Rs 30,00,000 or less, or more than Rs 30,00,000.

Price          Colonial   Log   Duplex   Studio
<= 30 Lacs        18        6      19       12
>  30 Lacs        12       14      16        3
Hypothesis Test for
Unitech Example(Situation II)
Hypothesis 2:
Whether the customer's choice of home style is independent of its price, that is, whether price and style are independent.
This is known as the Test of Independence of Attributes (Contingency Table Analysis).

H0: Price of the home is independent of the style of the home that is purchased
H1: Price of the home is not independent of the style of the home that is purchased
Hypothesis Test for
Unitech Example(Situation II)
Test Statistic:
$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
where,
$f_{ij}$ = Observed frequency for the ith and jth attribute
$e_{ij}$ = Expected frequency for the ith and jth attribute
i = 1, 2, ..., m (m rows); j = 1, 2, ..., n (n columns)
N = Sample Size, with
$$\sum_{i=1}^{m} A_i = \sum_{j=1}^{n} B_j = N$$

The test statistic follows the chi-square distribution with degrees of freedom mn - (m+n-1) = mn - m - n + 1 = (m-1)(n-1).
Reject H0 if $\chi^2 \geq \chi^2_\alpha$, $\alpha$ being the level of significance.
Unitech Example
Test of Independence
$A_i$ and $B_j$ are independent if:
$$P(A_i \cap B_j) = P(A_i) P(B_j) = (A_i / N)(B_j / N)$$
$e_{ij}$ = expected frequency, or expected number of observations possessing both the $A_i$ and $B_j$ attributes
$$e_{ij} = N \, P(A_i \cap B_j) = \frac{A_i B_j}{N} = \frac{(\text{Row } i \text{ Total}) (\text{Column } j \text{ Total})}{N}$$

                     Style
Price        Colonial   Log   Duplex   Studio   Total
<=30 lacs       18        6      19       12      55
> 30 lacs       12       14      16        3      45
Total           30       20      35       15     100

In general, an m x n contingency table is given below:

                             Attribute B
Attribute A   B1       B2       ...   Bj       ...   Bn       Total
A1            A1,B1    A1,B2    ...   A1,Bj    ...   A1,Bn    A1
A2            A2,B1    A2,B2    ...   A2,Bj    ...   A2,Bn    A2
...
Ai            Ai,B1    Ai,B2    ...   Ai,Bj    ...   Ai,Bn    Ai
...
Am            Am,B1    Am,B2    ...   Am,Bj    ...   Am,Bn    Am
Total         B1       B2       ...   Bj       ...   Bn       N

In an m x n contingency table, while calculating the expected frequency, the row totals, column totals and grand total remain fixed. The fixation of m row totals and n column totals imposes (m+n) constraints. Again, since
$$\sum_{i=1}^{m} A_i = \sum_{j=1}^{n} B_j = N$$
the total number of independent constraints is (m+n-1). Further, the total number of cell frequencies is m x n. So,
df = Total Obs - Constraints = mn - (m+n-1) = (m-1)(n-1)
Unitech Example
Test of Independence
Observed Frequency
                     Style
Price        Colonial   Log   Duplex   Studio   Total
<=30 lacs       18        6      19       12      55
> 30 lacs       12       14      16        3      45
Total           30       20      35       15     100

Expected frequency for the cell (1,1) = 30*55/100 = 16.5; E(2,1) = 30*45/100 = 13.5

Expected Frequency
                     Style
Price        Colonial   Log   Duplex   Studio   Total
<=30 lacs      16.5      11    19.25     8.25     55
> 30 lacs      13.5       9    15.75     6.75     45
Total          30        20    35       15       100

Chi-square contributions $(O_{ij} - E_{ij})^2 / E_{ij}$; for example, $(O_{11} - E_{11})^2 / E_{11} = (18 - 16.5)^2 / 16.5 = 0.1364$
                     Style
Price        Colonial   Log      Duplex   Studio   Total
<=30 lacs     0.1364    2.2727   0.0032   1.7045   4.1169
> 30 lacs     0.1667    2.7778   0.0040   2.0833   5.0317
Total         0.3030    5.0505   0.0072   3.7879   9.1486

Chi-Square = 9.1486
Degrees of freedom = 3
Alpha = 0.05
Chi-Square (Critical Value) = 7.815
Decision: Reject H0
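The Unitech calculation can be reproduced with a short script. This is a sketch, not the template used in class: the function name `chi_square_independence` is an assumption, and the critical value 7.815 is the table value quoted on the slide for 3 degrees of freedom at alpha = 0.05.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # expected frequency
            chi2 += (f_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Observed homes sold: rows are price bands, columns are Colonial, Log, Duplex, Studio.
observed = [
    [18, 6, 19, 12],   # <= 30 lacs
    [12, 14, 16, 3],   # > 30 lacs
]
chi2, df = chi_square_independence(observed)
CRITICAL_VALUE = 7.815   # chi-square table, 3 d.f., alpha = 0.05
print(f"chi-square = {chi2:.4f}, df = {df}, reject H0: {chi2 >= CRITICAL_VALUE}")
```

The same function applies unchanged to any m x n table of observed counts, provided the matching critical value is looked up for (m-1)(n-1) degrees of freedom.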
CASE:
A Bipartisan Agenda for Change
Question: Should legislative pay be cut for every day the
state budget is late?
Yes No Totals
Democrat 22 14 36
Independent 10 9 19
Republican 39 6 45
Totals 71 29 100

H0: The party affiliation is independent of the response


H1: The party affiliation is NOT independent of the response
