
ANOVA

ANALYSIS OF VARIANCE
MORA, REGINALD G.
ME-SE

WHY ANOVA?

We can do basic comparisons of two means with an independent-samples t-test (random samples) or a matched-samples t-test (paired samples).

What if we wish to compare the means of more than two populations?

WHAT IS ANOVA?

Analysis of variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means by examining the variances of samples that are taken.

WHAT IS ANOVA?

Allows one to determine whether the differences between the samples are simply due to random error (sampling errors) or whether there are systematic treatment effects that cause the mean in one group to differ from the mean in another.

WHAT IS ANOVA?

It is a collection of statistical models and their associated procedures (such as "variation" among and between groups) used to analyze the differences between group means, developed by R. A. Fisher.
$$H_0: \sigma^2_{\text{between}} = \sigma^2_{\text{within}}$$
$$H_a: \sigma^2_{\text{between}} > \sigma^2_{\text{within}}$$

ASSUMPTIONS OF ANOVA:

All populations involved follow a normal distribution.

All populations have the same variance (or standard deviation).

The samples are randomly selected and independent of one another.

MOTIVATING EXAMPLE
The analysis of variance can be used as an exploratory
tool to explain observations.

A dog show is not a random sampling of the breed: it is typically limited to dogs that are male, adult, pure-bred, and exemplary.

Suppose we wanted to predict the weight of a dog based on a certain set of characteristics of each dog.

SUPPOSE WE WANT TO COMPARE THREE SAMPLE MEANS TO SEE IF A DIFFERENCE EXISTS SOMEWHERE AMONG THEM

What we are asking: Do all three of these means come from a common population?

Is one mean so far away from the others that it is likely not from the same population?

Or are all three so far apart that each likely comes from its own unique population?

REMEMBER, WE ARE NOT ASKING IF THEY ARE EXACTLY EQUAL. WE ARE ASKING IF EACH MEAN LIKELY CAME FROM THE LARGER OVERALL POPULATION.

Variability AMONG/BETWEEN the sample means

"Classical ANOVA for balanced data does three things


at once:
1. As exploratory data analysis, an ANOVA is an organization of an additive data decomposition, and its sums of squares indicate the variance of each component of the decomposition (or, equivalently, each set of terms of a linear model).
2. Comparisons of mean squares, along with F-tests ... allow testing of a nested sequence of models.
3. Closely related to the ANOVA is a linear model fit with coefficient estimates and standard errors."

Additionally:
4. It is computationally elegant and relatively robust against violations
of its assumptions.
5. ANOVA provides industrial strength (multiple sample comparison)
statistical analysis.
6. It has been adapted to the analysis of a variety of experimental
designs.

CLASSES OF MODELS
FIXED-EFFECTS MODELS
RANDOM-EFFECTS MODELS
MIXED-EFFECTS MODELS

FIXED-EFFECTS MODELS
Applies to situations in which the experimenter applies
one or more treatments to the subjects of the
experiment to see if the response variable values
change. This allows the experimenter to estimate the
ranges of response variable values that the treatment
would generate in the population as a whole.

RANDOM-EFFECTS MODELS
Used when the treatments are not fixed. This
occurs when the various factor levels are sampled
from a larger population

MIXED-EFFECTS MODELS
A mixed-effects model contains experimental
factors of both fixed and random-effects types,
with appropriately different interpretations and
analysis for the two types.

Example: teaching experiments could be performed by a university department to find a good introductory textbook, with each text considered a treatment. The fixed-effects model would compare a list of candidate texts. The random-effects model would determine whether important differences exist among a list of randomly selected texts. The mixed-effects model would compare the (fixed) incumbent texts to randomly selected alternatives.

PURPOSE
The reason for doing an ANOVA is to see if there is any
difference between groups on some variable.

PURPOSE
For example, you might have data on student performance in non-assessed tutorial exercises as well as their final grades. You are interested in seeing if tutorial performance is related to final grade. ANOVA allows you to break up the group according to the grade and then see if performance is different across these grades.

TYPES OF ANOVA

1. ONE-WAY
2. TWO-WAY

ONE-WAY BETWEEN GROUPS


The example given above is called a one-way between groups model. You are looking at the differences between the groups. There is only one grouping (final grade) which you are using to define the groups.
This is the simplest version of ANOVA. This type of ANOVA can also be used to compare variables between different groups, for example tutorial performance from different intakes.

ONE-WAY REPEATED MEASURES


A one-way repeated measures ANOVA is used when you have a single group on which you have measured something a few times. For example, you may have a test of understanding of Classes. You give this test at the beginning of the topic, at the end of the topic and then at the end of the subject. You would use a one-way repeated measures ANOVA to see if student performance on the test changed over time.

HOW TO CALCULATE ANOVA


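Treatment 1    Treatment 2    Treatment 3    Treatment 4
y_{1,1}        y_{2,1}        y_{3,1}        y_{4,1}
y_{1,2}        y_{2,2}        y_{3,2}        y_{4,2}
y_{1,3}        y_{2,3}        y_{3,3}        y_{4,3}
y_{1,4}        y_{2,4}        y_{3,4}        y_{4,4}
y_{1,5}        y_{2,5}        y_{3,5}        y_{4,5}
y_{1,6}        y_{2,6}        y_{3,6}        y_{4,6}
y_{1,7}        y_{2,7}        y_{3,7}        y_{4,7}
y_{1,8}        y_{2,8}        y_{3,8}        y_{4,8}
y_{1,9}        y_{2,9}        y_{3,9}        y_{4,9}
y_{1,10}       y_{2,10}       y_{3,10}       y_{4,10}

n = 10 obs./group
k = 4 groups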

HOW TO CALCULATE ANOVA


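The group means:

$$\bar{y}_{1\cdot} = \frac{\sum_{j=1}^{10} y_{1j}}{10} \qquad \bar{y}_{2\cdot} = \frac{\sum_{j=1}^{10} y_{2j}}{10} \qquad \bar{y}_{3\cdot} = \frac{\sum_{j=1}^{10} y_{3j}}{10} \qquad \bar{y}_{4\cdot} = \frac{\sum_{j=1}^{10} y_{4j}}{10}$$

The (within) group variances:

$$\frac{\sum_{j=1}^{10} (y_{1j} - \bar{y}_{1\cdot})^2}{10 - 1} \qquad \frac{\sum_{j=1}^{10} (y_{2j} - \bar{y}_{2\cdot})^2}{10 - 1} \qquad \frac{\sum_{j=1}^{10} (y_{3j} - \bar{y}_{3\cdot})^2}{10 - 1} \qquad \frac{\sum_{j=1}^{10} (y_{4j} - \bar{y}_{4\cdot})^2}{10 - 1}$$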
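As a rough illustration of the two formulas above, the sketch below computes the group means and the (within) group variances with Python's standard `statistics` module. The numbers are the height data from the worked example later in this deck; the variable names (`groups`, `group_means`, `group_variances`) are only illustrative.

```python
from statistics import mean, variance

# Height data (inches) from the worked example in this deck:
# k = 4 treatment groups, n = 10 observations per group.
groups = {
    "Treatment 1": [60, 67, 42, 67, 56, 62, 64, 59, 72, 71],
    "Treatment 2": [50, 52, 43, 67, 67, 59, 67, 64, 63, 65],
    "Treatment 3": [48, 49, 50, 55, 56, 61, 61, 60, 59, 64],
    "Treatment 4": [47, 67, 54, 67, 68, 65, 65, 56, 60, 65],
}

# Group mean: sum of the observations in the group divided by n.
group_means = {name: mean(ys) for name, ys in groups.items()}

# statistics.variance is the sample variance: it divides by n - 1,
# matching the (10 - 1) denominator in the formulas above.
group_variances = {name: variance(ys) for name, ys in groups.items()}

for name in groups:
    print(f"{name}: mean = {group_means[name]:.1f}, "
          f"variance = {group_variances[name]:.2f}")
```

The printed means should agree with the group means quoted in Step 1 of the worked example (62.0, 59.7, 56.3, 61.4).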

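SUM OF SQUARES WITHIN (SSW), OR SUM OF SQUARES ERROR (SSE)

The (within) group variances:

$$\frac{\sum_{j=1}^{10} (y_{1j} - \bar{y}_{1\cdot})^2}{10 - 1} \qquad \frac{\sum_{j=1}^{10} (y_{2j} - \bar{y}_{2\cdot})^2}{10 - 1} \qquad \frac{\sum_{j=1}^{10} (y_{3j} - \bar{y}_{3\cdot})^2}{10 - 1} \qquad \frac{\sum_{j=1}^{10} (y_{4j} - \bar{y}_{4\cdot})^2}{10 - 1}$$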

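Sum of Squares Within (SSW) (or SSE, for chance error):

$$SSW = \sum_{j=1}^{10} (y_{1j} - \bar{y}_{1\cdot})^2 + \sum_{j=1}^{10} (y_{2j} - \bar{y}_{2\cdot})^2 + \sum_{j=1}^{10} (y_{3j} - \bar{y}_{3\cdot})^2 + \sum_{j=1}^{10} (y_{4j} - \bar{y}_{4\cdot})^2 = \sum_{i=1}^{4} \sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\cdot})^2$$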

SUM OF SQUARES BETWEEN (SSB), OR SUM OF SQUARES REGRESSION (SSR)

Overall mean of all 40 observations (grand mean):

$$\bar{y}_{\cdot\cdot} = \frac{\sum_{i=1}^{4} \sum_{j=1}^{10} y_{ij}}{40}$$

Sum of Squares Between (SSB): variability of the group means compared to the grand mean (the variability due to the treatment).

$$SSB = 10 \times \sum_{i=1}^{4} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$$
TOTAL SUM OF SQUARES (SST)

$$TSS = \sum_{i=1}^{4} \sum_{j=1}^{10} (y_{ij} - \bar{y}_{\cdot\cdot})^2$$

Total sum of squares (TSS): squared difference of every observation from the overall mean (the numerator of the variance of Y!).

PARTITIONING OF VARIANCE
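$$\sum_{i=1}^{4} \sum_{j=1}^{10} (y_{ij} - \bar{y}_{i\cdot})^2 + 10 \times \sum_{i=1}^{4} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 = \sum_{i=1}^{4} \sum_{j=1}^{10} (y_{ij} - \bar{y}_{\cdot\cdot})^2$$

SSW + SSB = TSS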
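The partition SSW + SSB = TSS can be checked numerically. The following is a minimal sketch, not a library routine: `sums_of_squares` is a hypothetical helper name, and the toy data set is made up purely for illustration.

```python
def sums_of_squares(groups):
    """Return (SSW, SSB, TSS) for a list of groups of observations."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)

    ssw = 0.0  # squared deviations of observations from their group mean
    ssb = 0.0  # group size times squared deviation of group mean from grand mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ssw += sum((y - group_mean) ** 2 for y in g)
        ssb += len(g) * (group_mean - grand_mean) ** 2

    # squared deviations of every observation from the grand mean
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    return ssw, ssb, tss


# Hypothetical toy data: three groups of three observations each.
toy = [[1, 2, 3], [2, 4, 6], [5, 5, 8]]
ssw, ssb, tss = sums_of_squares(toy)
print(ssw, ssb, tss)  # 16.0 24.0 40.0

# The partition holds up to floating-point rounding.
assert abs((ssw + ssb) - tss) < 1e-9
```

Multiplying by `len(g)` rather than a fixed 10 keeps the sketch usable for other group sizes; with n = 10 per group it reduces to the formula above.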

ANOVA TABLE
Source of variation                 d.f.    Sum of squares    Mean Sum of Squares    F-statistic                    p-value
Between (k groups)                  k-1     SSB               SSB/(k-1)              [SSB/(k-1)] / [SSW/(nk-k)]     Go to F(k-1, nk-k) chart
Within (n individuals per group)    nk-k    SSW               s^2 = SSW/(nk-k)
Total variation                     nk-1    TSS

SSB: sum of squared deviations of group means from grand mean
SSW: sum of squared deviations of observations from their group mean
TSS: sum of squared deviations of observations from grand mean

TSS = SSB + SSW

If $f > f_\alpha$ (the critical value from the F chart), reject the null hypothesis.
If $f < f_\alpha$, fail to reject the null hypothesis.

ANOVA TABLE
Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square            Computed f
Between                k-1                   SSB               MSB = SSB/(k-1)        MSB/MSE
Error                  k(n-1)                SSE               MSE = SSE/[k(n-1)]
Total                  nk-1                  SST

If $f > f_\alpha$, reject the null hypothesis.
If $f < f_\alpha$, fail to reject the null hypothesis.

EXAMPLE
Treatment 1    Treatment 2    Treatment 3    Treatment 4
60 inches      50             48             47
67             52             49             67
42             43             50             54
67             67             55             67
56             67             56             68
62             59             61             65
64             67             61             65
59             64             60             56
72             63             59             60
71             65             64             65

EXAMPLE
Step 1) calculate the
sum of squares between
groups:
Mean for group 1 = 62.0
Mean for group 2 = 59.7
Mean for group 3 = 56.3
Mean for group 4 = 61.4
Grand mean= 59.85


$$SSB = \left[(62 - 59.85)^2 + (59.7 - 59.85)^2 + (56.3 - 59.85)^2 + (61.4 - 59.85)^2\right] \times n_{\text{per group}} = 19.65 \times 10 = 196.5$$


Step 2) calculate the sum of squares within groups:


EXAMPLE
$$SSW = (60-62)^2 + (67-62)^2 + (42-62)^2 + (67-62)^2 + (56-62)^2 + (62-62)^2 + (64-62)^2 + (59-62)^2 + (72-62)^2 + (71-62)^2 + (50-59.7)^2 + (52-59.7)^2 + (43-59.7)^2 + (67-59.7)^2 + (67-59.7)^2 + (59-59.7)^2 + \dots = 2060.6$$

(sum of 40 squared deviations)

STEP 3) FILL IN THE ANOVA TABLE


Source of variation    d.f.    Sum of squares    Mean Sum of Squares    F-statistic
Between                3       196.5             65.5                   1.14
Within                 36      2060.6            57.2
Total                  39      2257.1
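Steps 1 to 3 can be reproduced from the raw heights. The sketch below is one possible way to build the table in plain Python; `one_way_anova` and the dictionary keys are hypothetical names chosen for this illustration, not part of any library.

```python
def one_way_anova(groups):
    """Build a one-way ANOVA table from a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between: group size times squared deviation of each group mean from the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within: squared deviations of observations from their own group mean.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_between, df_within = k - 1, n_total - k
    msb, msw = ssb / df_between, ssw / df_within
    return {
        "between": {"df": df_between, "ss": ssb, "ms": msb},
        "within": {"df": df_within, "ss": ssw, "ms": msw},
        "total": {"df": n_total - 1, "ss": ssb + ssw},
        "f": msb / msw,
    }


# Heights (inches) for the four treatment groups in the example.
heights = [
    [60, 67, 42, 67, 56, 62, 64, 59, 72, 71],
    [50, 52, 43, 67, 67, 59, 67, 64, 63, 65],
    [48, 49, 50, 55, 56, 61, 61, 60, 59, 64],
    [47, 67, 54, 67, 68, 65, 65, 56, 60, 65],
]

table = one_way_anova(heights)
for source in ("between", "within", "total"):
    row = table[source]
    print(f"{source:8s} df={row['df']:2d}  SS={row['ss']:.1f}")
print(f"F = {table['f']:.2f}")  # F = 1.14
```

The p-value is not computed here because Python's standard library has no F distribution; it would be read from an F chart with (3, 36) degrees of freedom, as the table slide suggests.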

INTERPRETATION of ANOVA:
How much of the variance in height is explained by treatment group? This is measured by $R^2$.

COEFFICIENT OF DETERMINATION
$$R^2 = \frac{SSB}{SSB + SSE} = \frac{SSB}{SST}$$

The proportion of variation in the outcome variable (dependent variable) that is explained by the predictor (independent variable).
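Applied to the height example, the ratio can be evaluated directly from the sums of squares in the ANOVA table (SSB = 196.5, SSW = 2060.6). A small sketch, with `r_squared` as a hypothetical helper name:

```python
def r_squared(ssb, sse):
    """Coefficient of determination: share of total variation between groups."""
    return ssb / (ssb + sse)


# Sums of squares taken from the height example's ANOVA table.
r2 = r_squared(196.5, 2060.6)
print(f"R^2 = {r2:.3f}")  # R^2 = 0.087
```

So roughly 9% of the variation in height is associated with treatment group in that example.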

EXAMPLE

Suppose the National Transportation Safety Board (NTSB) wants to examine the safety of compact cars, midsize cars, and full-size cars. It collects a sample of three for each of the treatments (car types). Using the hypothetical data provided below, test whether the mean pressure applied to the driver's head during a crash test is equal for each type of car. Use α = 5%.

(1.) State the null and alternative hypotheses

The null hypothesis for an ANOVA always assumes the population means are equal. Hence, we may write the null hypothesis as:
$H_0: \mu_1 = \mu_2 = \mu_3$ (the mean head pressure is statistically equal across the three types of cars).
Since the null hypothesis assumes all the means are equal, we could reject the null hypothesis if only one mean is not equal. Thus, the alternative hypothesis is:
$H_a$: At least one mean pressure is not statistically equal.

(2.) Calculate the appropriate test statistic

The test statistic in ANOVA is the ratio of the between variation to the within variation in the data. It follows an F distribution.
Total Sum of Squares (SST): the total variation in the data. It is the sum of the between and within variation.
Between Sum of Squares (or Treatment Sum of Squares, SSB): variation in the data between the different samples (or treatments).
Within variation (or Error Sum of Squares, SSE): variation in the data within each individual treatment.

ANOVA TABLE
Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square    Computed f
Between                2                     86049.55          43024.775      25.175
Error                  6                     10254             1709
Total                  8                     96303.55
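The mean squares and the computed f in this table follow from the sums of squares and the degrees of freedom. The sketch below redoes that arithmetic; it assumes k = 3 car types and n = 3 cars per type as stated in the example, and the variable names are illustrative only.

```python
# Sums of squares taken from the ANOVA table for the crash-test example.
ss_between = 86049.55
ss_error = 10254.0

k = 3  # treatments: compact, midsize, full-size
n = 3  # cars sampled per treatment

df_between = k - 1        # 2
df_error = k * (n - 1)    # 6
df_total = n * k - 1      # 8

ms_between = ss_between / df_between  # mean square between
ms_error = ss_error / df_error        # mean square error (MSE)
f_stat = ms_between / ms_error        # computed f

print(f"MS between = {ms_between:.3f}")
print(f"MS error   = {ms_error:.1f}")
print(f"f          = {f_stat:.3f}")
```

The three printed values should match the Mean Square and Computed f columns of the table.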

(5.) Interpretation
Since we rejected the null hypothesis, we are 95% confident (1 − α) that the mean head pressure is not statistically equal for compact, midsize, and full-size cars. However, since only one mean must be different to reject the null, we do not yet know which mean(s) is/are different. In short, an ANOVA test will tell us that at least one mean is different, but an additional test must be conducted to determine which mean(s) is/are different.

Determining which mean(s) is/are different


If you fail to reject the null hypothesis in an ANOVA then you are
done. You know, with some level of confidence, that the treatment
means are statistically equal. However, if you reject the null then
you must conduct a separate test to determine which mean(s)
is/are different.
There are several techniques for testing the differences between means, but the most common test is the Least Significant Difference Test.

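Least Significant Difference (LSD) for a balanced sample:

$$LSD = t_{\alpha/2} \sqrt{\frac{2\,MSE}{r}}$$

where $t_{\alpha/2}$ is the critical t value at the error degrees of freedom, MSE is the mean square error, and r is the number of rows in each treatment.

In the example above, with MSE = 1709 and r = 3, LSD ≈ 82.61.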

Thus, if the absolute value of the difference between any two treatment means is greater than 82.61, we may conclude that they are not statistically equal.
Compact cars vs. midsize cars:
|666.67 − 473.67| = 193. Since 193 > 82.61, mean head pressure is statistically different between compact and midsize cars.
Midsize cars vs. full-size cars:
|473.67 − 447.33| = 26.34. Since 26.34 < 82.61, mean head pressure is statistically equal between midsize and full-size cars.
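The pairwise comparisons can be sketched in a few lines of Python. This is an illustration under stated assumptions: the LSD is taken to be the critical t value times the square root of 2·MSE/r, and the critical value 2.447 (two-sided, α = 5%, 6 error degrees of freedom) is hard-coded from a t table because the standard library has no t distribution. The treatment means are the three quoted above.

```python
from itertools import combinations
from math import sqrt

MSE = 1709.0    # mean square error from the ANOVA table
R = 3           # observations per treatment
T_CRIT = 2.447  # assumed: t table value for alpha/2 = 0.025 with 6 d.f.

# Least significant difference for a balanced design.
lsd = T_CRIT * sqrt(2 * MSE / R)
print(f"LSD is about {lsd:.1f}")

# Treatment means quoted in the comparisons above.
means = {"compact": 666.67, "midsize": 473.67, "full-size": 447.33}

results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    results[(a, b)] = diff > lsd  # True means the pair differs
    verdict = "different" if diff > lsd else "not statistically different"
    print(f"{a} vs. {b}: |difference| = {diff:.2f} -> {verdict}")
```

With unrounded intermediate values the LSD comes out near 82.6, marginally below the 82.61 quoted above, presumably because of rounding in the hand calculation. The sketch also compares compact with full-size cars, a pair the slide does not list.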

EXAMPLE
Susan Sound predicts that students will learn most effectively with a constant background sound, as opposed to an unpredictable sound or no sound at all. She randomly divides twenty-four students into three groups of eight. All students study a passage of text for 30 minutes. Those in group 1 study with background sound at a constant volume. Those in group 2 study with noise that changes volume periodically. Those in group 3 study with no sound at all. After studying, all students take a 10-point multiple-choice test over the material. Their scores follow:

GROUP                test scores
1) constant sound    7 4 6 8 6 6 2 9
2) random sound      5 5 3 4 4 7 2 2
3) no sound          2 4 7 1 2 1 5 5

x1    x1²    x2    x2²    x3    x3²
7     49     5     25     2     4
4     16     5     25     4     16
6     36     3     9      7     49
8     64     4     16     1     1
6     36     4     16     2     4
6     36     7     49     1     1
2     4      2     4      5     25
9     81     2     4      5     25

Σx1 = 48         Σx2 = 32         Σx3 = 27
Σx1² = 322       Σx2² = 148       Σx3² = 125
(Σx1)² = 2304    (Σx2)² = 1024    (Σx3)² = 729
M1 = 6           M2 = 4           M3 = 3.375

Source    SS       df    MS       F
Among     30.08    2     15.04    3.59
Within    87.88    21    4.18

(According to the F probability table with df = (2, 21), F must be at least 3.4668 to reach p < .05, so the F score is statistically significant.)
Interpretation: Susan can conclude that her hypothesis may be supported. The means are as she predicted, in that the constant sound group has the highest score. However, the significant F only indicates that at least two means are significantly different from one another, but she can't know which specific mean pairs significantly differ until she conducts a post-hoc analysis.
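The table above can be rebuilt from the raw test scores using the sums Σx, Σx² and (Σx)² already tabulated. The sketch below uses the usual computational shortcut (each sum of squares minus a common correction term); the variable names are illustrative, and the critical value 3.4668 is the one quoted from the F table above.

```python
scores = {
    "constant sound": [7, 4, 6, 8, 6, 6, 2, 9],
    "random sound": [5, 5, 3, 4, 4, 7, 2, 2],
    "no sound": [2, 4, 7, 1, 2, 1, 5, 5],
}

n_total = sum(len(xs) for xs in scores.values())   # 24 students
grand_sum = sum(sum(xs) for xs in scores.values())  # 48 + 32 + 27
correction = grand_sum ** 2 / n_total               # (sum of all x)^2 / N

# Total: sum of all squared scores minus the correction term.
ss_total = sum(x * x for xs in scores.values() for x in xs) - correction
# Among: sum over groups of (group sum)^2 / group size, minus the correction term.
ss_among = sum(sum(xs) ** 2 / len(xs) for xs in scores.values()) - correction
ss_within = ss_total - ss_among

df_among = len(scores) - 1          # 2
df_within = n_total - len(scores)   # 21
ms_among = ss_among / df_among
ms_within = ss_within / df_within
f_stat = ms_among / ms_within

F_CRIT = 3.4668  # critical F for df = (2, 21) at p < .05, as quoted above

print(f"Among : SS = {ss_among:.2f}, df = {df_among}, MS = {ms_among:.2f}")
print(f"Within: SS = {ss_within:.2f}, df = {df_within}, MS = {ms_within:.2f}")
print(f"F = {f_stat:.2f}, significant at .05: {f_stat > F_CRIT}")
```

The result only says that some difference exists among the three groups; identifying which pairs differ still needs the post-hoc analysis mentioned in the interpretation.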
