SDA 3E Chapter 5

2007 Pearson Education
Chapter 5: Hypothesis Testing and Statistical Inference

Hypothesis Testing
Hypothesis testing involves drawing inferences about two contrasting propositions (hypotheses) relating to the value of a population parameter, one of which is assumed to be true in the absence of contradictory data.
We seek evidence to determine if the hypothesis can be rejected; if not, we can only assume it to be true but have not statistically proven it true.

Hypothesis Testing Procedure
1. Formulate the hypothesis
2. Select a level of significance, which defines
the risk of drawing an incorrect conclusion
that a true hypothesis is false
3. Determine a decision rule
4. Collect data and calculate a test statistic
5. Apply the decision rule and draw a
conclusion

Hypothesis Formulation
Null hypothesis, H0: a statement that is accepted as correct
Alternative hypothesis, H1: a proposition that must be true if H0 is false
Formulating the correct set of hypotheses depends on the burden of proof: what you wish to prove statistically should be H1
Tests involving a single population parameter are called one-sample tests; tests involving two populations are called two-sample tests.

Types of Hypothesis Tests
One Sample Tests
H0: population parameter ≥ constant vs. H1: population parameter < constant
H0: population parameter ≤ constant vs. H1: population parameter > constant
H0: population parameter = constant vs. H1: population parameter ≠ constant
Two Sample Tests
H0: population parameter (1) - population parameter (2) ≥ 0 vs. H1: population parameter (1) - population parameter (2) < 0
H0: population parameter (1) - population parameter (2) ≤ 0 vs. H1: population parameter (1) - population parameter (2) > 0
H0: population parameter (1) - population parameter (2) = 0 vs. H1: population parameter (1) - population parameter (2) ≠ 0

Four Outcomes
1. The null hypothesis is actually true, and the
test correctly fails to reject it.
2. The null hypothesis is actually false, and the
hypothesis test correctly reaches this
conclusion.
3. The null hypothesis is actually true, but the
hypothesis test incorrectly rejects it (Type I
error).
4. The null hypothesis is actually false, but the
hypothesis test incorrectly fails to reject it
(Type II error).

Quantifying Outcomes
Probability of Type I error (rejecting H0 when it is true) = α = level of significance
Probability of correctly failing to reject H0 = 1 - α = confidence coefficient
Probability of Type II error (failing to reject H0 when it is false) = β
Probability of correctly rejecting H0 when it is false = 1 - β = power of the test

Decision Rules
Compute a test statistic from sample data and compare it to the hypothesized sampling distribution of the test statistic
Divide the sampling distribution into a rejection region and non-rejection region.
If the test statistic falls in the rejection region, reject H0 (concluding that H1 is true); otherwise, fail to reject H0

Rejection Regions

Hypothesis Tests and Spreadsheet Support
Type of Test | Excel/PHStat Procedure
One sample test for mean, σ known | PHStat: One Sample Tests > Z-test for the Mean, Sigma Known
One sample test for mean, σ unknown | PHStat: One Sample Tests > t-test for the Mean, Sigma Unknown
One sample test for proportion | PHStat: One Sample Tests > Z-test for the Proportion
Two sample test for means, σ known | Excel z-test: Two-Sample for Means; PHStat: Two Sample Tests > Z-Test for Differences in Two Means
Two sample test for means, σ unknown, unequal | Excel t-test: Two-Sample Assuming Unequal Variances

Spreadsheet Support (cont'd)
Type of Test | Excel/PHStat Procedure
Two sample test for means, σ unknown, assumed equal | Excel t-test: Two-Sample Assuming Equal Variances; PHStat: Two Sample Tests > t-Test for Differences in Two Means
Paired two sample test for means | Excel t-test: Paired Two-Sample for Means
Two sample test for proportions | PHStat: Two Sample Tests > Z-Test for Differences in Two Proportions
Equality of variances | Excel F-test Two-Sample for Variances; PHStat: Two Sample Tests > F-Test for Differences in Two Variances

One Sample Tests for Means
Standard Deviation Unknown
Example hypothesis
H0: $\mu \ge \mu_0$ versus H1: $\mu < \mu_0$
Test statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Reject H0 if $t < -t_{n-1,\alpha}$

Example
For the Customer Support Survey.xls data, test the
hypotheses
H0: mean response time ≥ 30 minutes
H1: mean response time < 30 minutes
Sample mean = 21.91; sample standard deviation = 19.49; n = 44 observations
Reject H0 because t = -2.75 < -t(43, 0.05) = -1.6811
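The arithmetic in this example can be checked with a short sketch. It uses only the summary statistics quoted above; the function name `one_sample_t` is illustrative, and the critical value is taken from the slide rather than computed.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Summary statistics quoted in the example.
t_stat = one_sample_t(sample_mean=21.91, hypothesized_mean=30.0,
                      sample_sd=19.49, n=44)

# Lower-tail critical value -t(43, 0.05), as quoted in the example.
critical_value = -1.6811

# Lower-tail decision rule: reject H0 when t falls below the critical value.
reject_h0 = t_stat < critical_value
print(round(t_stat, 2), reject_h0)  # -2.75 True
```

The statistic is negative because the sample mean lies below the hypothesized 30 minutes, which is what places it in the lower-tail rejection region.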

PHStat Tool: t-Test for Mean
PHStat menu > One Sample Tests > t-Test for the Mean, Sigma Unknown
Enter null hypothesis and alpha
Enter sample statistics or data range
Choose type of test

Results

Using p-Values
p-value = probability of obtaining a test statistic value equal to or more extreme than that obtained from the sample data when H0 is true
[Figure: p-value regions relative to the test statistic for a lower one-tailed test and a two-tailed test]

One Sample Tests for Proportions
Example hypothesis
H0: $\pi \ge \pi_0$ versus H1: $\pi < \pi_0$
Test statistic:
$$z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}$$
Reject if $z < -z_{\alpha}$

Example
For the Customer Support Survey.xls data, test the hypothesis that the
proportion of overall quality responses in the top two boxes is at least
0.75
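H0: π ≥ 0.75
H1: π < 0.75
Sample proportion = 0.682; n = 44
For a level of significance of 0.05, the critical value of z is -1.645; the test statistic is z = (0.682 - 0.75)/sqrt(0.75(0.25)/44) ≈ -1.04, which is not below -1.645; therefore, we cannot reject the null hypothesis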
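A minimal sketch of this proportion test follows, using the sample proportion, sample size, and critical value quoted above. The function name is illustrative, and the p-value line assumes the usual normal approximation for a lower-tail test.

```python
import math
from statistics import NormalDist

def one_sample_z_proportion(p_hat, pi_0, n):
    """z = (p - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)."""
    standard_error = math.sqrt(pi_0 * (1 - pi_0) / n)
    return (p_hat - pi_0) / standard_error

# Values quoted in the example.
z_stat = one_sample_z_proportion(p_hat=0.682, pi_0=0.75, n=44)
critical_value = -1.645  # lower-tail critical value at alpha = 0.05

# Lower-tail p-value: probability of a z at least this small when H0 is true.
p_value = NormalDist().cdf(z_stat)

reject_h0 = z_stat < critical_value
print(round(z_stat, 2), reject_h0)  # -1.04 False
```

Because the p-value exceeds 0.05, the p-value approach leads to the same conclusion as the critical-value comparison.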

PHStat Tool: One Sample z-Test for Proportions
PHStat > One Sample Tests > z-Tests for the Proportion
Enter null hypothesis, significance level, number of successes, and sample size
Enter type of test

Results

Type II Errors and the Power of a Test
The probability of a Type II error, β, and the power of the test (1 - β) cannot be chosen by the experimenter.
The power of the test depends on the true value of the population mean, the level of confidence used, and the sample size.
A power curve shows (1 - β) as a function of $\mu_1$.
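To make the dependence on the true mean concrete, here is a rough sketch of a power calculation for a lower-tail test of a mean. It assumes a known population standard deviation and a normal sampling distribution, which is a simplification; the values 30, 19.49, and 44 are borrowed from the earlier example purely for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def power_lower_tail_z(mu_0, mu_1, sigma, n, alpha=0.05):
    """Power of a lower-tail z-test of H0: mu >= mu_0 when the true mean is mu_1.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    standard_error = sigma / math.sqrt(n)
    # H0 is rejected when (x_bar - mu_0) / SE < -z_alpha; shift by the true mean.
    return std_normal.cdf(-z_alpha + (mu_0 - mu_1) / standard_error)

# Points on a power curve: power rises as the true mean moves below mu_0.
true_means = [30, 27, 24, 21, 18]
power_curve = [power_lower_tail_z(30, mu_1, sigma=19.49, n=44) for mu_1 in true_means]
for mu_1, power in zip(true_means, power_curve):
    print(mu_1, round(power, 3))
```

When the true mean equals the hypothesized value, the computed power equals the level of significance, which is a convenient sanity check on the formula.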

Example Power Curve

Two Sample Tests for Means
Standard Deviation Known
Example hypothesis
H0: $\mu_1 - \mu_2 \ge 0$ versus H1: $\mu_1 - \mu_2 < 0$
Test statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$$
Reject if $z < -z_{\alpha}$
+

Sigma Unknown and Equal
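Example hypothesis
H0: $\mu_1 - \mu_2 \le 0$ versus H1: $\mu_1 - \mu_2 > 0$
Test statistic (t-distributed with $n_1 + n_2 - 2$ degrees of freedom, since the common variance is estimated from the samples):
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} \cdot \dfrac{n_1 + n_2}{n_1 n_2}}}$$
Reject if $t > t_{n_1 + n_2 - 2,\alpha}$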
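The pooled-variance test statistic for this case can be sketched as follows. The summary statistics are hypothetical values chosen so the arithmetic is easy to follow, and the function name is illustrative.

```python
import math

def pooled_two_sample_statistic(mean_1, mean_2, s_1, s_2, n_1, n_2):
    """Two-sample statistic assuming equal (unknown) population variances.

    Returns the statistic and its degrees of freedom, n_1 + n_2 - 2.
    """
    pooled_variance = ((n_1 - 1) * s_1 ** 2 + (n_2 - 1) * s_2 ** 2) / (n_1 + n_2 - 2)
    standard_error = math.sqrt(pooled_variance * (n_1 + n_2) / (n_1 * n_2))
    return (mean_1 - mean_2) / standard_error, n_1 + n_2 - 2

# Hypothetical summary statistics (not from the source data).
statistic, df = pooled_two_sample_statistic(
    mean_1=10.0, mean_2=8.0, s_1=2.0, s_2=2.0, n_1=8, n_2=8
)
print(statistic, df)  # 2.0 14
```

With equal sample standard deviations of 2 and eight observations per group, the pooled variance is 4 and the standard error is exactly 1, so the statistic equals the difference in means.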

Sigma Unknown and Unequal
Example hypothesis
H0: $\mu_1 - \mu_2 = 0$ versus H1: $\mu_1 - \mu_2 \ne 0$
Test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$$
with
$$df = \frac{\left( s_1^2 / n_1 + s_2^2 / n_2 \right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}}$$
Reject if $t > t_{df,\alpha/2}$ or $t < -t_{df,\alpha/2}$
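A short sketch of the unequal-variance statistic and its approximate degrees of freedom is given below. The inputs are hypothetical summary statistics, and the function name is illustrative.

```python
import math

def unequal_variance_t(mean_1, mean_2, s_1, s_2, n_1, n_2):
    """t statistic and approximate degrees of freedom when variances may differ."""
    v_1 = s_1 ** 2 / n_1  # estimated variance of the first sample mean
    v_2 = s_2 ** 2 / n_2  # estimated variance of the second sample mean
    t = (mean_1 - mean_2) / math.sqrt(v_1 + v_2)
    df = (v_1 + v_2) ** 2 / (v_1 ** 2 / (n_1 - 1) + v_2 ** 2 / (n_2 - 1))
    return t, df

# Hypothetical summary statistics (not from the source data).
t_unequal, df_unequal = unequal_variance_t(
    mean_1=10.0, mean_2=8.0, s_1=2.0, s_2=2.0, n_1=8, n_2=8
)
print(round(t_unequal, 4), round(df_unequal, 4))
```

The degrees of freedom are generally not an integer. In the special case used here (equal sample sizes and equal sample standard deviations) the formula reduces to n1 + n2 - 2 = 14, matching the equal-variance case.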

Excel Data Analysis Tool: Two Sample t-Tests
Tools > Data Analysis > t-test: Two Sample Assuming Unequal Variances, or t-test: Two Sample Assuming Equal Variances
Enter range of data, hypothesized mean difference, and level of significance
Tool allows you to test H0: $\mu_1 - \mu_2 = d$
Output is provided for upper-tail test only
For lower-tail test, change the sign on t Critical one-tail, and subtract P(T<=t) one-tail from 1.0 for correct p-value

PHStat Tool: Two Sample t-Tests
PHStat > Two Sample Tests > t-Test for Differences in Two Means
Test assumes equal variances
Must compute and enter the sample mean, sample standard deviation, and sample size

Comparison of Excel and PHStat Results: Lower-Tail Test

Two Sample Test for Means With Paired Samples
Example hypothesis
H0: average difference $\mu_D = 0$ versus H1: average difference $\mu_D \ne 0$
Test statistic:
$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$
Reject if $t > t_{n-1,\alpha/2}$ or $t < -t_{n-1,\alpha/2}$
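The paired test works on the differences between matched observations, as the sketch below illustrates. The two lists are hypothetical measurements on the same four subjects, and the function name is illustrative.

```python
import math
from statistics import mean, stdev

def paired_t(first, second, hypothesized_diff=0.0):
    """t = (D_bar - mu_D) / (s_D / sqrt(n)) computed on pairwise differences."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = (mean(diffs) - hypothesized_diff) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1  # statistic and degrees of freedom

# Hypothetical paired measurements (not from the source data).
after = [12.0, 14.0, 10.0, 14.0]
before = [10.0, 12.0, 9.0, 11.0]

t_paired, df_paired = paired_t(after, before)
print(round(t_paired, 3), df_paired)  # 4.899 3
```

Only the n differences enter the calculation, which is why the degrees of freedom are n - 1 rather than a function of both sample sizes.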

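Two Sample Tests for Proportions
Example hypothesis
H0: $\pi_1 - \pi_2 = 0$ versus H1: $\pi_1 - \pi_2 \ne 0$
Test statistic:
$$z = \frac{p_1 - p_2}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$
where
$$\bar{p} = \frac{\text{number of successes in both samples}}{n_1 + n_2}$$
Reject if $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$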
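As a sketch of how the pooled proportion enters the statistic, the following uses hypothetical success counts; the function name and the data are illustrative only.

```python
import math
from statistics import NormalDist

def two_proportion_z(successes_1, n_1, successes_2, n_2):
    """z statistic for H0: pi_1 - pi_2 = 0 using the pooled proportion."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    p_bar = (successes_1 + successes_2) / (n_1 + n_2)  # pooled estimate
    standard_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n_1 + 1 / n_2))
    return (p_1 - p_2) / standard_error

# Hypothetical counts (not from the source data): 30 of 50 versus 20 of 50.
z_two_prop = two_proportion_z(30, 50, 20, 50)

# Two-tailed p-value under the standard normal distribution.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z_two_prop)))
print(round(z_two_prop, 2), round(p_two_tailed, 4))
```

Here the pooled proportion is 0.5 and the standard error is 0.1, so the statistic is 2.0, which exceeds the two-tailed critical value of 1.96 at a 0.05 level of significance.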

Confidence Intervals
If a 100(1 - α)% confidence interval contains the hypothesized value, then we would not reject the null hypothesis based on this value with a level of significance α.
Example hypothesis
H0: $\mu \ge \mu_0$ versus H1: $\mu < \mu_0$
If a 100(1 - α)% confidence interval does not contain $\mu_0$, then we can reject H0

F-Test for Differences in Two Variances
Hypothesis
H0: $\sigma_1^2 - \sigma_2^2 = 0$ versus H1: $\sigma_1^2 - \sigma_2^2 \ne 0$
Test statistic:
$$F = \frac{s_1^2}{s_2^2}$$
Assume $s_1^2 > s_2^2$
Reject if $F > F_{\alpha/2,\, n_1 - 1,\, n_2 - 1}$ (see Appendix A.4)
Assumes both samples drawn from normal distributions
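The convention of placing the larger sample variance in the numerator can be sketched as below. The two samples are hypothetical, and the function name is illustrative; the critical value would still come from an F table such as the appendix cited above.

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Ratio of sample variances with the larger variance in the numerator.

    Returns (F, numerator df, denominator df).
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples (not from the source data).
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]   # sample variance 2.5
sample_b = [2.0, 4.0, 6.0, 8.0, 10.0]  # sample variance 10.0

f_value, df_num, df_den = f_statistic(sample_a, sample_b)
print(f_value, df_num, df_den)  # 4.0 4 4
```

Ordering the variances this way keeps F at or above 1, so only the upper-tail critical value is needed.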

Excel Data Analysis Tool: F-Test for Equality of Variances
Tools > Data Analysis > F-test for Equality of Variances
Specify data ranges
Use α/2 for the significance level!
If the variance of Variable 1 is greater than the variance of Variable 2, the output will specify the upper tail; otherwise, you obtain the lower tail information.

PHStat Tool: F-Test for Differences in Variances
PHStat menu > Two Sample Tests > F-test for Differences in Two Variances
Compute and enter sample standard deviations
Enter the significance level α, not α/2 as in Excel

Excel and PHStat Results

Analysis of Variance (ANOVA)
Compare the means of m different groups (factors) to determine if all are equal
H0: $\mu_1 = \mu_2 = \dots = \mu_m$
H1: at least one mean is different from the others

ANOVA Theory
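$n_j$ = number of observations in sample j
SST = total variation in the data
SSB = variation between groups
SSW = variation within groups

SST = SSB + SSW

$$SST = \sum_{j=1}^{m} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$$

$$SSB = \sum_{j=1}^{m} n_j (\bar{X}_j - \bar{X})^2$$

$$SSW = \sum_{j=1}^{m} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Here m is the number of groups, $\bar{X}_j$ is the mean of group j, and $\bar{X}$ is the overall mean.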

ANOVA Test Statistic
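MSB = SSB/(m - 1)
MSW = SSW/(n - m), where n is the total number of observations
Test statistic: F = MSB/MSW
Has an F-distribution with m - 1 and n - m degrees of freedom
Reject H0 if $F > F_{\alpha,\, m-1,\, n-m}$ (the test is upper-tailed, so the full α is placed in the upper tail)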
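The sums of squares and the F statistic can be computed directly from their definitions, as in this sketch. The three groups are hypothetical data chosen for easy arithmetic, and the function name is illustrative.

```python
from statistics import mean

def one_way_anova(groups):
    """Return SST, SSB, SSW, and F = MSB / MSW for a list of groups."""
    all_values = [x for group in groups for x in group]
    n = len(all_values)   # total number of observations
    m = len(groups)       # number of groups
    grand_mean = mean(all_values)

    sst = sum((x - grand_mean) ** 2 for x in all_values)
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)

    msb = ssb / (m - 1)
    msw = ssw / (n - m)
    return {"SST": sst, "SSB": ssb, "SSW": ssw, "F": msb / msw}

# Hypothetical data (not from the source): three groups of three observations.
anova = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print({name: round(value, 4) for name, value in anova.items()})
```

For these data SSB = 26 and SSW = 6, so SST = 32, MSB = 13, MSW = 1, and F = 13 with 2 and 6 degrees of freedom.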

Excel Data Analysis Tool for ANOVA
Tools > Data Analysis > ANOVA: Single Factor

ANOVA Results

ANOVA Assumptions
The m groups or factor levels being studied represent populations whose outcome measures
Are randomly and independently obtained
Are normally distributed
Have equal variances
Violation of these assumptions can affect the true level of significance and power of the test.

Nonparametric Tests
Used when assumptions (usually normality) are violated. Examples:
Wilcoxon rank sum test for testing difference between two medians
Kruskal-Wallis rank test for determining whether multiple populations have equal medians.
Both supported by PHStat

Tukey-Kramer Multiple Comparison Procedure
ANOVA cannot identify which means may differ from the rest
PHStat menu > Multiple Sample Tests > Tukey-Kramer Multiple Comparison Procedure
Enter Q Statistic from Table A.5

Chi-Square Test for Independence
Test whether two categorical variables are independent
H0: the two categorical variables are independent
H1: the two categorical variables are dependent

Example
Is gender independent of holding a CPA in an accounting firm?

Chi-Square Test for Independence
Test statistic
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where $f_o$ = observed frequency and $f_e$ = expected frequency if H0 is true, in the cells of the contingency table
Reject H0 if $\chi^2 > \chi^2_{\alpha,\,(r-1)(c-1)}$
PHStat tool available in Multiple Sample Tests menu

Example
Expected | No CPA | CPA | Total
Female | 6.74 | 7.26 | 14
Male | 6.26 | 6.74 | 13
Total | 13 | 14 | 27
Critical value with α = 0.05 and (2 - 1)(2 - 1) = 1 df is 3.841; therefore, we cannot reject the null hypothesis that the two categorical variables are independent.
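The expected frequencies in this table follow from the row and column totals, as the sketch below shows. The observed counts are not reproduced in the text, so the `observed` table here is hypothetical, chosen only to match the stated totals; the function names are illustrative.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / grand total."""
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all cells of the contingency table."""
    return sum(
        (f_o - f_e) ** 2 / f_e
        for obs_row, exp_row in zip(observed, expected)
        for f_o, f_e in zip(obs_row, exp_row)
    )

# Totals from the example table (rows: Female, Male; columns: No CPA, CPA).
expected = expected_frequencies(row_totals=[14, 13], col_totals=[13, 14])
print([[round(f_e, 2) for f_e in row] for row in expected])
# [[6.74, 7.26], [6.26, 6.74]]

# HYPOTHETICAL observed counts consistent with those totals (not from the source).
observed = [[8, 6], [5, 8]]
chi_square = chi_square_statistic(observed, expected)

critical_value = 3.841  # alpha = 0.05 with 1 degree of freedom, as quoted above
reject_h0 = chi_square > critical_value
```

With these illustrative counts the statistic falls well below 3.841, so independence would not be rejected; the actual conclusion in the example depends on the real observed frequencies.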

PHStat Procedure Results

Design of Experiments
A test or series of tests that enables the experimenter to compare two or more methods to determine which is better, or determine levels of controllable factors to optimize the yield of a process or minimize the variability of a response variable.

Factorial Experiments
All combinations of levels of each factor are considered. With m factors at k levels, there are $k^m$ experiments.
Example: Suppose that temperature and reaction time are thought to be important factors in the percent yield of a chemical process. Currently, the process operates at a temperature of 100 degrees and a 60 minute reaction time. In an effort to reduce costs and improve yield, the plant manager wants to determine if changing the temperature and reaction time will have any significant effect on the percent yield, and if so, to identify the best levels of these factors to optimize the yield.

Designed Experiment
Analyze the effect of two levels of each factor (for instance, temperature at 100 and 125 degrees, and time at 60 and 90 minutes)
The different combinations of levels of each factor are commonly called treatments.

Treatment Combinations
[Figure: 2 x 2 grid of treatment combinations, with each factor at its Low and High level]

Experimental Results

Main Effects
Measures the difference in the response that results from different factor levels
Calculations
Temperature effect = (Average yield at high level) - (Average yield at low level)
= (B + D)/2 - (A + C)/2
= (90.5 + 81)/2 - (84 + 88.5)/2
= 85.75 - 86.25 = -0.5 percent.
Reaction effect = (Average yield at high level) - (Average yield at low level)
= (C + D)/2 - (A + B)/2
= (88.5 + 81)/2 - (84 + 90.5)/2
= 84.75 - 87.25 = -2.5 percent.

Interactions
When the effect of changing one factor depends on the level of other factors.
When interactions are present, we cannot estimate response changes by simply adding main effects; the effect of one factor must be interpreted relative to levels of the other factor.

Interaction Calculations
Take the average response when the factors are both at the high or both at the low levels and subtract the average response when the factors are at opposite levels.
Temperature x Time Interaction
= (Average yield, both factors at same level) - (Average yield, both factors at opposite levels)
= (A + D)/2 - (B + C)/2
= (84 + 81)/2 - (90.5 + 88.5)/2 = -7.0 percent
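The main-effect and interaction calculations for this experiment can be reproduced in a few lines. The yields A through D are the values used in the calculations above; the mapping of labels to factor levels is inferred from those formulas, and the variable names are illustrative.

```python
# Average yields used in the calculations above. From the formulas:
# A = low temperature, low time;   B = high temperature, low time;
# C = low temperature, high time;  D = high temperature, high time.
A, B, C, D = 84.0, 90.5, 88.5, 81.0

# Main effect: average at the factor's high level minus average at its low level.
temperature_effect = (B + D) / 2 - (A + C) / 2
reaction_time_effect = (C + D) / 2 - (A + B) / 2

# Interaction: average with both factors at the same level minus
# average with the factors at opposite levels.
interaction = (A + D) / 2 - (B + C) / 2

print(temperature_effect, reaction_time_effect, interaction)  # -0.5 -2.5 -7.0
```

The interaction is much larger in magnitude than either main effect, which is why the main effects alone give a misleading picture of how yield responds to these two factors.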

Graphical Illustration of
Interactions

Two-Way ANOVA
Method for analyzing variation in a 2-factor experiment
SST = SSA + SSB + SSAB + SSW
where
SST = total sum of squares
SSA = sum of squares due to factor A
SSB = sum of squares due to factor B
SSAB = sum of squares due to interaction
SSW = sum of squares due to random variation (error)

Mean Squares
MSA = SSA/(r - 1)
MSB = SSB/(c - 1)
MSAB = SSAB/[(r - 1)(c - 1)]
MSW = SSW/[rc(k - 1)],
where r and c are the numbers of levels of factors A and B, and k = number of replications of each treatment combination.

Hypothesis Tests
Compute F statistics by dividing each mean square by MSW.
F = MSA/MSW tests the null hypothesis that means for each treatment level of factor A are the same against the alternative hypothesis that not all means are equal.
F = MSB/MSW tests the null hypothesis that means for each treatment level of factor B are the same against the alternative hypothesis that not all means are equal.
F = MSAB/MSW tests the null hypothesis that the interaction between factors A and B is zero against the alternative hypothesis that the interaction is not zero.

Excel Anova: Two-Factor with Replication

Results
Examine p-values for significance
