ANOVA
Probably the most popular analysis in
psychology
Why?
Ease of implementation
Allows for analysis of several groups at once
Allows analysis of interactions of multiple
independent variables.
Assumptions
As before, if the assumptions of the test are
Null hypothesis
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
H1: not H0
ANOVA will tell us that the means are different in some way
interactions
Probability of type one error goes up with multiple
tests
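A rough sketch of how quickly the Type I error rate grows when every pair of groups gets its own test. It assumes the tests are independent, which pairwise t-tests on the same groups are not, so treat the numbers as an approximation. The function name is illustrative.

```python
from math import comb

def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Number of pairwise comparisons among k groups, and the resulting error rate.
for k in (3, 4, 5):
    m = comb(k, 2)
    print(k, m, round(familywise_error(0.05, m), 3))
# 3 groups ->  3 tests -> 0.143
# 4 groups ->  6 tests -> 0.265
# 5 groups -> 10 tests -> 0.401
```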
Review
With an independent samples t-test we looked to see
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$
Comparison to t-test
In this sense with our t-statistic we have a ratio of the
Total variance
Variance due to our experimental manipulation
Variance due to non-systematic factors
ts and Fs
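With only two groups, the one-way ANOVA F and the pooled-variance t are the same test: F equals t squared. A minimal sketch on made-up scores (the data and function names are illustrative, not from the slides):

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Independent-samples t using the pooled variance estimate."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / (sp2 / n1 + sp2 / n2) ** 0.5

def one_way_f(groups):
    """F = MS treatment / MS error for a list of groups."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_treat = len(groups) - 1
    df_error = len(scores) - len(groups)
    return (ss_treat / df_treat) / (ss_error / df_error)

a = [7, 4, 6, 8, 6]        # hypothetical group 1
b = [5, 5, 3, 4, 4, 7]     # hypothetical group 2
t = pooled_t(a, b)
f = one_way_f([a, b])
print(t ** 2, f)           # the two values agree up to rounding error
```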
Computation
Sums of squares
Treatment
Error
Total (Treatment + Error)
SSTotal = sums of squared deviations of scores about
the grand mean
SSTreat = sums of squares of the deviations of the
means of each group from the grand mean (with
consideration of group N)
SSerror = the rest
or SSTotal - SSTreat
Sums of squared deviations of the scores about
their group mean
$$SS_{Total} = \sum_{j}\sum_{i} (X_{ij} - \bar{X}_{..})^2$$
$$SS_{Treat} = n \sum_{j} (\bar{X}_{j} - \bar{X}_{..})^2$$
$$SS_{error} = \sum_{j}\sum_{i} (X_{ij} - \bar{X}_{j})^2$$
$$SS_{total} = SS_{treatment} + SS_{error}$$
$$SS_{total} = SS_{between\ groups} + SS_{within\ groups}$$
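A short sketch of why the total sum of squares splits this way. Here $n_j$ denotes the size of group $j$ (with equal groups, $n_j = n$); the notation otherwise follows the definitions above.

```latex
\begin{aligned}
X_{ij} - \bar{X}_{..} &= (X_{ij} - \bar{X}_{j}) + (\bar{X}_{j} - \bar{X}_{..}) \\
\sum_{j}\sum_{i} (X_{ij} - \bar{X}_{..})^2
  &= \sum_{j}\sum_{i} (X_{ij} - \bar{X}_{j})^2
   + \sum_{j} n_j (\bar{X}_{j} - \bar{X}_{..})^2
   + 2 \sum_{j} (\bar{X}_{j} - \bar{X}_{..}) \sum_{i} (X_{ij} - \bar{X}_{j}) \\
  &= SS_{error} + SS_{Treat} + 0
\end{aligned}
```

The cross term vanishes because deviations from a group's own mean sum to zero within each group.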
The more the sample means differ, the larger will be the between-samples variation
Example
Ratings for a reality tv show involving former WWF stars, people
1) 18-25 group: 7 4 6 8 6 6 2 9 (Mean = 6, SD = 2.2)
2) 25-45 group: 5 5 3 4 4 7 2 2 (Mean = 4, SD = 1.7)
3) 45+ group: 2 3 2 1 2 1 3 2 (Mean = 2, SD = .76)
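The sums of squares for these ratings can be worked out directly from the definitions. A minimal sketch, assuming the three rows of scores belong to the three age groups in the order listed (the function name is illustrative):

```python
from statistics import mean

# Ratings from the example, one list per age group.
groups = [
    [7, 4, 6, 8, 6, 6, 2, 9],
    [5, 5, 3, 4, 4, 7, 2, 2],
    [2, 3, 2, 1, 2, 1, 3, 2],
]

def sums_of_squares(groups):
    """Return (SS total, SS treatment, SS error) for a list of groups."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_total = sum((x - grand) ** 2 for x in scores)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_treat, ss_error

ss_total, ss_treat, ss_error = sums_of_squares(groups)
print(ss_total, ss_treat, ss_error)  # 122, 64, 58
```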
Now what?
Well, now we just need df and we're set to go.
df for SStreat = k - 1, where k is the number of groups (each
The F Ratio
The F ratio has a sampling distribution like the t did
That is, estimates of F vary depending on exactly which
sample you draw
Again, this sampling distribution has known properties
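A small simulation can make the idea of a sampling distribution concrete. The sketch below draws three groups of eight scores from the same normal population many times, so the null hypothesis is true by construction, and collects the F ratio each time. The group sizes, seed, and function names are illustrative choices; 3.47 is the approximate .05 critical value for F with 2 and 21 df.

```python
import random
from statistics import mean

def f_ratio(groups):
    """F = MS treatment / MS error for a list of groups."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_treat / (len(groups) - 1)) / (ss_error / (len(scores) - len(groups)))

def simulate_null_f(reps=2000, k=3, n=8, seed=1):
    """F ratios from repeated samples in which all groups share one population."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)])
        for _ in range(reps)
    ]

fs = simulate_null_f()
# Share of simulated F values beyond the .05 critical value; should be near .05.
prop_beyond = sum(f > 3.47 for f in fs) / len(fs)
print(round(mean(fs), 2), round(prop_beyond, 3))
```

Each run with a different seed gives slightly different F values, which is exactly the sample-to-sample variation the slide describes.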
Next
Construct an ANOVA table:
Source       df     SS    MS
Treatment     2     64
Error        21     58
Total        23    122
MS refers to the Mean Squares which are found by dividing the SS by their
respective df. Since both of the SS values are summed values they are
influenced by the number of scores that were summed (for example, SStreat used
the sum of only 3 different values (the group means) compared to SSerror, which
used the sum of 24 different values). To eliminate this bias we can calculate the
average sum of squares (known as the mean squares, MS).
Our F ratio (or F statistic) is the ratio of the two MS values.
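The arithmetic for the example can be sketched as follows. The p-value line uses a closed form for the upper tail of F that holds only when the numerator df is 2, which happens to be the case here; it is offered as a convenience for this example, not a general method.

```python
# Sums of squares and df from the ANOVA table for the example.
ss_treat, df_treat = 64, 2
ss_error, df_error = 58, 21

ms_treat = ss_treat / df_treat   # 32.0
ms_error = ss_error / df_error   # about 2.76
f_stat = ms_treat / ms_error     # about 11.59

def f_upper_tail_df1_2(f, df2):
    """P(F >= f) for an F distribution with 2 and df2 degrees of freedom."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

p_value = f_upper_tail_df1_2(f_stat, df_error)  # well below .001
print(round(ms_error, 2), round(f_stat, 2), p_value)
```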
Finally
Source       df     SS    MS       F
Treatment     2     64    32       11.59
Error        21     58    2.76
Total        23    122
Interpretation
There is some statistically significant difference among the group
means.
Measure of the ratio of the variation explained by the
Error
As always p(D|H0)
Interpretation
So they are different in some fashion, what
else do we know?
Nada.
Unequal n
Want equal ns if at all possible
If not, we will have to adjust our formula for SStreat:
$$SS_{treat} = \sum_{j} n_j (\bar{X}_{j} - \bar{X}_{..})^2$$
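A minimal sketch of the unequal-n version, using made-up groups of different sizes. Each squared deviation of a group mean is weighted by that group's own n, and the grand mean is the mean of all scores rather than the mean of the group means. Function and variable names are illustrative.

```python
from statistics import mean

def ss_treat_unequal(groups):
    """SS treatment with each group mean weighted by its own sample size."""
    grand = mean([x for g in groups for x in g])  # mean of all scores
    return sum(len(g) * (mean(g) - grand) ** 2 for g in groups)

# Hypothetical groups with n = 4, 3, and 5.
unequal = [[7, 4, 6, 8], [5, 5, 3], [2, 3, 2, 1, 2]]

all_scores = [x for g in unequal for x in g]
grand_mean = mean(all_scores)
ss_total_u = sum((x - grand_mean) ** 2 for x in all_scores)
ss_error_u = sum((x - mean(g)) ** 2 for g in unequal for x in g)
ss_treat_u = ss_treat_unequal(unequal)

# With the weighted formula the partition still holds.
print(ss_total_u, ss_treat_u + ss_error_u)
```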
Violations of assumptions
When we violate our assumption of
HoV violation
Options:
Kruskal-Wallis
Welch procedure
Brown-Forsythe
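Welch's correction
$$w_k = \frac{n_k}{s_k^2}$$
$$\bar{X}'_{.} = \frac{\sum w_k \bar{X}_k}{\sum w_k}$$
$$F' = \frac{\dfrac{\sum w_k (\bar{X}_k - \bar{X}'_{.})^2}{k - 1}}{1 + \dfrac{2(k - 2)}{k^2 - 1} \sum \left(\dfrac{1}{n_k - 1}\right)\left(1 - \dfrac{w_k}{\sum w_k}\right)^2}$$
$$df' = \frac{k^2 - 1}{3 \sum \left(\dfrac{1}{n_k - 1}\right)\left(1 - \dfrac{w_k}{\sum w_k}\right)^2}$$
(Sums run over the k groups; the numerator df is k - 1.)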
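A sketch of Welch's procedure as it is usually stated: weights $w_k = n_k / s_k^2$, a weighted grand mean, and an adjusted denominator df. The function name is illustrative, and the code assumes every group has a nonzero variance. The data are the three groups of ratings from the earlier example.

```python
from statistics import mean, variance

def welch_anova(groups):
    """Return (F', df1, df2) for Welch's heteroscedastic one-way test."""
    k = len(groups)
    ns = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    ws = [len(g) / variance(g) for g in groups]       # w_k = n_k / s_k^2
    w_sum = sum(ws)
    weighted_mean = sum(w * m for w, m in zip(ws, means)) / w_sum
    # Shared term: sum of (1 / (n_k - 1)) * (1 - w_k / sum(w))^2
    lam = sum((1 - w / w_sum) ** 2 / (n - 1) for w, n in zip(ws, ns))
    numerator = sum(w * (m - weighted_mean) ** 2 for w, m in zip(ws, means)) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    return numerator / denominator, k - 1, (k ** 2 - 1) / (3 * lam)

ratings = [
    [7, 4, 6, 8, 6, 6, 2, 9],
    [5, 5, 3, 4, 4, 7, 2, 2],
    [2, 3, 2, 1, 2, 1, 3, 2],
]
welch_f, welch_df1, welch_df2 = welch_anova(ratings)
print(welch_f, welch_df1, welch_df2)
```

With two groups the correction term drops out, and F' reduces to the square of the unequal-variance t statistic.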
Brown-Forsythe
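$$F^* = \frac{\sum n_k (\bar{X}_k - \bar{X}_{..})^2}{\sum (1 - n_k / N)\, s_k^2}$$
$$df = (k - 1),\ f$$
$$\frac{1}{f} = \sum \frac{c_k^2}{n_k - 1}$$
$$c_k = \frac{(1 - n_k / N)\, s_k^2}{\sum (1 - n_k / N)\, s_k^2}$$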
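A sketch of the Brown-Forsythe statistic following the formulas above. The function name is illustrative; the data are the three groups of ratings from the earlier example.

```python
from statistics import mean, variance

def brown_forsythe(groups):
    """Return (F*, df1, f) for the Brown-Forsythe test of equal means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    numerator = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Each group's variance weighted by (1 - n_k / N)
    terms = [(1 - len(g) / n_total) * variance(g) for g in groups]
    denominator = sum(terms)
    cs = [t / denominator for t in terms]             # c_k
    f_df = 1 / sum(c ** 2 / (len(g) - 1) for c, g in zip(cs, groups))
    return numerator / denominator, k - 1, f_df

ratings = [
    [7, 4, 6, 8, 6, 6, 2, 9],
    [5, 5, 3, 4, 4, 7, 2, 2],
    [2, 3, 2, 1, 2, 1, 3, 2],
]
bf_f, bf_df1, bf_df2 = brown_forsythe(ratings)
print(bf_f, bf_df1, bf_df2)
```

When the group sizes are equal, as they are here, F* matches the ordinary F; only the denominator df changes.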
Example output
Test of Homogeneity of Variances (proportion correct)
Levene Statistic    df1    df2    Sig.
3.817               2      104    .025
Note: violation of HoV
ANOVA (proportion correct)
                  Sum of Squares    df     Mean Square    F        Sig.
Between Groups    .594              2      .297           2.974    .055
Within Groups     10.390            104    .100
Total             10.985            106
Note: regular F
Welch and Brown-Forsythe (proportion correct)
                  Statistic(a)    df1    df2       Sig.
Welch             3.755           2      65.775    .029
Brown-Forsythe    2.964           2      92.818    .057
a. Asymptotically F distributed.

Test Statistics (proportion correct)
Chi-Square     5.474
df             2
Asymp. Sig.    .065
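The Sig. values in this output can be checked by hand because every test here has 2 numerator (or chi-square) df, a case where the upper-tail probabilities have simple closed forms. This is a sanity check under that assumption, not a description of how the statistics package computes them; the function names are illustrative.

```python
from math import exp

def f_sig_df1_2(f, df2):
    """P(F >= f) for F with 2 and df2 degrees of freedom (numerator df = 2 only)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def chi2_sig_df_2(x):
    """P(chi-square >= x) for 2 degrees of freedom."""
    return exp(-x / 2)

p_regular = f_sig_df1_2(2.974, 104)      # reported Sig. = .055
p_welch = f_sig_df1_2(3.755, 65.775)     # reported Sig. = .029
p_bf = f_sig_df1_2(2.964, 92.818)        # reported Sig. = .057
p_chi2 = chi2_sig_df_2(5.474)            # reported Asymp. Sig. = .065

for name, p in [("regular F", p_regular), ("Welch", p_welch),
                ("Brown-Forsythe", p_bf), ("Chi-Square", p_chi2)]:
    print(name, round(p, 3))
```

Only the Welch test falls below .05 here, so the conclusion depends on which procedure is used.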
Violation of normality
When normality is a concern, we can
Gist
Approach one-way ANOVA much as you would a t-test.