
Analysis of Variance

Lecture 11 April 26th, 2011

A. Introduction
When you have more than two groups, a t-test (or its nonparametric equivalent) is no longer applicable. Instead, we use a technique called analysis of variance. This chapter covers analysis-of-variance designs with one or more independent variables, as well as more advanced topics such as interpreting significant interactions and unbalanced designs.

B. One-Way Analysis of Variance


The standard method for comparing three or more groups is called analysis of variance (ANOVA). This method has the advantage of testing whether there are any differences among the groups with a single probability associated with the test. The hypothesis tested is that all groups have the same mean. Before we present an example, note that several assumptions should be met before an analysis of variance is used: the groups must be independent (unless a repeated-measures design is used); the sampling distributions of the sample means must be normally distributed; and the groups should come from populations with equal variances (called homogeneity of variance).

Example:

15 subjects in three treatment groups X, Y, and Z:

Group   Scores
X       700  850  820  640  920
Y       480  460  500  570  580
Z       500  550  480  600  610

The null hypothesis is that mean(X) = mean(Y) = mean(Z). The alternative hypothesis is that the means are not all equal. How do we know whether the means obtained differ because of differences among the reading programs (X, Y, Z) or because of random sampling error? By chance, the five subjects we chose for group X might be faster readers than those chosen for groups Y and Z. We might now ask the question, what causes scores to vary from the grand mean? In this example, there are two possible sources of variation: the first source is the training method (X, Y, or Z); the second source is the fact that individuals are different. Correspondingly, the total sum of squares splits into two pieces:

SUM OF SQUARES total = SUM OF SQUARES between groups + SUM OF SQUARES error (within groups)

F ratio = MEAN SQUARE between groups / MEAN SQUARE error
        = (SS between groups / (k-1)) / (SS error / (N-k)),

where k is the number of groups and N is the total number of subjects. SAS code:
DATA READING;
   INPUT GROUP $ WORDS @@;
DATALINES;
X 700  X 850  X 820  X 640  X 920
Y 480  Y 460  Y 500  Y 570  Y 580
Z 500  Z 550  Z 480  Z 600  Z 610
;
PROC ANOVA DATA=READING;
   TITLE 'ANALYSIS OF READING DATA';
   CLASS GROUP;
   MODEL WORDS=GROUP;
   MEANS GROUP;
RUN;

The ANOVA Procedure
Dependent Variable: words

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2       215613.3333    107806.6667      16.78    0.0003
Error              12        77080.0000      6423.3333
Corrected Total    14       292693.3333
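To make the sums-of-squares arithmetic concrete, here is a small sketch in plain Python (used here only for illustration, since these notes use SAS; the variable names are ours) that reproduces the Model SS, Error SS, and F value in the table above:

```python
# Hand computation of the one-way ANOVA for the reading data.
data = {
    "X": [700, 850, 820, 640, 920],
    "Y": [480, 460, 500, 570, 580],
    "Z": [500, 550, 480, 600, 610],
}
all_scores = [x for scores in data.values() for x in scores]
N = len(all_scores)                 # 15 subjects in total
k = len(data)                       # 3 groups
grand_mean = sum(all_scores) / N

# Group means, then the between-groups and within-groups sums of squares
means = {g: sum(s) / len(s) for g, s in data.items()}
ss_between = sum(len(s) * (means[g] - grand_mean) ** 2 for g, s in data.items())
ss_error = sum((x - means[g]) ** 2 for g, s in data.items() for x in s)

f_ratio = (ss_between / (k - 1)) / (ss_error / (N - k))
print(round(ss_between, 1), round(ss_error, 1), round(f_ratio, 2))
# → 215613.3 77080.0 16.78
```

These match the Model SS, Error SS, and F value in the SAS output, confirming the F-ratio formula above.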

Now that we know the reading methods are different, we want to know what the differences are. Is X better than Y or Z? Are the means of groups Y and Z so close that we cannot consider them different? In general, methods used to find group differences after the null hypothesis has been rejected are called post hoc, or multiple-comparison, tests. These include Duncan's multiple-range test, the Student-Newman-Keuls multiple-range test, the least-significant-difference test, Tukey's studentized range test, Scheffé's multiple-comparison procedure, and others. To request a post hoc test, place the SAS option name for the test you want after a slash (/) on the MEANS statement. The SAS names for the post hoc tests previously listed are DUNCAN, SNK, LSD, TUKEY, and SCHEFFE, respectively. For our example we have:
MEANS GROUP / DUNCAN;
or
MEANS GROUP / SCHEFFE ALPHA=.1;

At the far left is a column labeled Duncan Grouping. Any groups that are not significantly different from one another will have the same letter in the Grouping column.
The ANOVA Procedure
Duncan's Multiple Range Test for words

NOTE: This test controls the Type I comparisonwise error rate, not the
experimentwise error rate.

Alpha                          0.05
Error Degrees of Freedom         12
Error Mean Square          6423.333

Number of Means        2        3
Critical Range     110.4    115.6

Means with the same letter are not significantly different.

Duncan Grouping      Mean    GROUP
              A    786.00    X
              B    548.00    Z
              B    518.00    Y

C. Computing Contrasts

Suppose you want to make some specific comparisons. For example, if method X is a new method and methods Y and Z are more traditional methods, you may decide to compare method X to the mean of method Y and method Z to see if there is a difference between the new and traditional methods. You may also want to compare method Y to method Z to see if there is a difference. These comparisons are called contrasts, planned comparisons, or a priori comparisons. To specify comparisons using SAS software, you need to use PROC GLM (General Linear Model) instead of PROC ANOVA. PROC GLM is similar to PROC ANOVA and uses many of the same options and statements. However, PROC GLM is a more generalized program and can be used to compute contrasts or to analyze unbalanced designs.
PROC GLM DATA=READING;
   TITLE 'ANALYSIS OF READING DATA -- PLANNED COMPARISONS';
   CLASS GROUP;
   MODEL WORDS = GROUP;
   CONTRAST 'X VS. Y AND Z'  GROUP -2  1  1;
   CONTRAST 'Method Y VS. Z' GROUP  0  1 -1;
RUN;

The GLM Procedure

Contrast           DF    Contrast SS    Mean Square    F Value    Pr > F
X VS. Y AND Z       1    213363.3333    213363.3333      33.22    <.0001
METHOD Y VS Z       1      2250.0000      2250.0000       0.35    0.5649
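The contrast sums of squares can be checked by hand with the standard formula SS = L^2 / sum(c_i^2 / n_i), where L = sum(c_i * mean_i) and the c_i are the contrast coefficients. A plain-Python sketch (illustrative only, not part of the SAS program; the group means come from the reading data above):

```python
# Contrast SS = L**2 / sum(c_i**2 / n_i), with L = sum(c_i * group mean_i).
means = {"X": 786.0, "Y": 518.0, "Z": 548.0}   # group means from the reading data
n = 5                                          # subjects per group
mse = 6423.3333                                # error mean square from the ANOVA

def contrast_ss(coeffs):
    L = sum(c * means[g] for g, c in coeffs.items())
    return L ** 2 / sum(c ** 2 / n for c in coeffs.values())

ss1 = contrast_ss({"X": -2, "Y": 1, "Z": 1})   # 'X VS. Y AND Z'
ss2 = contrast_ss({"X": 0, "Y": 1, "Z": -1})   # 'Method Y VS. Z'
print(round(ss1, 1), round(ss2, 1), round(ss1 / mse, 2), round(ss2 / mse, 2))
# → 213363.3 2250.0 33.22 0.35
```

Each contrast has one degree of freedom, so dividing its SS by the error mean square gives the F values 33.22 and 0.35 reported in the GLM output.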

D. Analysis of Variance: Two Independent Variables


Suppose we ran the same experiment for comparing reading methods, but using 15 male and 15 female subjects. In addition to comparing reading-instruction methods, we could compare male versus female reading speeds. Finally, we might want to see if the effects of the reading methods are the same for males and females.

DATA TWOWAY;
   INPUT GROUP $ GENDER $ WORDS @@;
DATALINES;
X M 700  X M 850  X M 820  X M 640  X M 920
Y M 480  Y M 460  Y M 500  Y M 570  Y M 580
Z M 500  Z M 550  Z M 480  Z M 600  Z M 610
X F 900  X F 880  X F 899  X F 780  X F 899
Y F 590  Y F 540  Y F 560  Y F 570  Y F 555
Z F 520  Z F 660  Z F 525  Z F 610  Z F 645
;
PROC ANOVA DATA=TWOWAY;
   TITLE 'ANALYSIS OF READING DATA';
   CLASS GROUP GENDER;
   MODEL WORDS=GROUP | GENDER;
   MEANS GROUP | GENDER / DUNCAN;
RUN;

In this model, the term GROUP | GENDER is shorthand for GROUP GENDER GROUP*GENDER.

Source            DF       Anova SS    Mean Square    F Value    Pr > F
group              2    503215.2667    251607.6333      56.62    <.0001
gender             1     25404.3000     25404.3000       5.72    0.0250
group*gender       2      2816.6000      1408.3000       0.32    0.7314
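The gender line of this table can be verified by hand: the gender comparison pools the scores across the three reading groups, and its sum of squares is sum over genders of n * (gender mean - grand mean)^2. A plain-Python check (illustrative only; the scores are the ones in the DATALINES above):

```python
# Sum of squares for the GENDER main effect: pool the 15 male and
# 15 female scores across the three reading groups.
males = [700, 850, 820, 640, 920, 480, 460, 500, 570, 580,
         500, 550, 480, 600, 610]
females = [900, 880, 899, 780, 899, 590, 540, 560, 570, 555,
           520, 660, 525, 610, 645]
grand_mean = sum(males + females) / (len(males) + len(females))
ss_gender = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                for g in (males, females))
print(round(ss_gender, 1))
# → 25404.3
```

This matches the 25404.3000 in the Anova SS column for gender.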

In a two-way analysis of variance, when we look at GROUP effects, we are comparing GROUP levels without regard to GENDER. That is, when the groups are compared, we combine the data from both genders. Conversely, when we compare males to females, we combine data from the three treatment groups. The term GROUP*GENDER is called an interaction term. If group differences were not the same for males and females, we would expect a significant interaction.

E. Interpreting Significant Interactions

Now consider an example that has a significant interaction term. We have two groups of children. One group is considered normal; the other, hyperactive.
data ritalin;
   do group = 'normal','hyper';
      do drug = 'placebo','ritalin';
         do subj = 1 to 4;
            input activity @;
            output;
         end;
      end;
   end;
datalines;
50 45 55 52   67 60 58 65
70 72 68 75   51 57 48 55
;
proc anova data=ritalin;
   title 'activity study';
   class group drug;
   model activity=group | drug;
   means group | drug;
run;

Source           DF       Anova SS    Mean Square    F Value    Pr > F
group             1    121.0000000    121.0000000       8.00    0.0152
drug              1     42.2500000     42.2500000       2.79    0.1205
group*drug        1    930.2500000    930.2500000      61.50    <.0001

From the ANOVA table above, we notice that there is a strong GROUP*DRUG interaction. When this occurs, we need to be very careful about interpreting any of the main effects (GROUP and DRUG in this example). Instead, we look closely at the means of the interaction cells. The best way to explain a two-way interaction is to take the cell means and plot them.

proc means data=ritalin nway noprint;
   class group drug;
   var activity;
   output out=means mean=M;
run;

symbol1 value=square color=black i=join;
symbol2 v=circle c=black i=join;

proc gplot data=means;
   plot M*drug=group;
run;
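The cell means that PROC MEANS writes to the output data set can also be computed directly; a plain-Python sketch (illustrative only, not part of the SAS program):

```python
# Cell means for the ritalin data -- the four points behind the plot.
data = {
    ("normal", "placebo"): [50, 45, 55, 52],
    ("normal", "ritalin"): [67, 60, 58, 65],
    ("hyper", "placebo"): [70, 72, 68, 75],
    ("hyper", "ritalin"): [51, 57, 48, 55],
}
cell_means = {cell: sum(v) / len(v) for cell, v in data.items()}
for (group, drug), m in cell_means.items():
    print(group, drug, m)
```

Ritalin raises mean activity for the normal group (50.5 to 62.5) and lowers it for the hyperactive group (71.25 to 52.75): the two lines in the plot cross, which is the signature of an interaction.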

[Plot: 'activity study' -- mean activity M (y-axis, 50 to 80) versus drug (placebo, ritalin), one line per group (hyper, normal); the two lines cross.]

The graph shows that normal children increase their activity when given Ritalin, while hyperactive children are calmed by it. When we compared the means of DRUG, we combined the data from normal and hyperactive children; because the two groups respond in opposite directions, their effects tend to cancel, and the average activity with placebo and with Ritalin comes out about the same. A better approach is to compare the means of placebo and Ritalin within the normal and within the hyperactive children. To do this, we can create a new variable named cond with four values, one for each group-drug combination.
data ritalin_new;
   set ritalin;
   cond = group || drug;
run;

proc anova data=ritalin_new;
   title 'one-way anova ritalin study';
   class cond;
   model activity = cond;
   means cond / duncan;
run;

Duncan Grouping      Mean    N    cond
              A    71.250    4    hyper placebo
              B    62.500    4    normal ritalin
              C    52.750    4    hyper ritalin
              C    50.500    4    normal placebo

We can see that placebo differs from Ritalin within both the normal and the hyperactive group, and that when given Ritalin, normal children act differently from hyperactive children.

F. N-Way Factorial Designs


With three independent variables, we have three main effects, three two-way interactions, and one three-way interaction. One usually hopes that the higher-order interactions are not significant since they complicate the interpretation of the main effects and the low-order interactions.
PROC ANOVA DATA=THREEWAY;
   TITLE 'THREE-WAY ANALYSIS OF VARIANCE';
   CLASS GROUP GENDER DOSE;
   MODEL ACTIVITY = GROUP | GENDER | DOSE;
   MEANS GROUP | GENDER | DOSE;
RUN;

G. Unbalanced Designs: PROC GLM

Designs with an unequal number of subjects per cell are called unbalanced designs. For all unbalanced designs (except one-way designs), we cannot use PROC ANOVA; PROC GLM (general linear model) is used instead. The LSMEANS statement produces least-squares (adjusted) means for the main effects, and its PDIFF option computes probabilities for the pairwise differences. Notice that the output contains two sets of values for sums of squares, F values, and probabilities. Notice also the numbered TITLE statements (TITLE2, TITLE3, TITLE4, ...), which write additional title lines.
data pudding;
   input sweet flavor : $9. rating;
datalines;
1 vanilla 9
1 vanilla 7
1 vanilla 8
1 vanilla 7
2 vanilla 8
2 vanilla 7
2 vanilla 8
3 vanilla 6
3 vanilla 5
3 vanilla 7
1 chocolate 9
1 chocolate 9
1 chocolate 7
1 chocolate 7
1 chocolate 8
2 chocolate 8
2 chocolate 7
2 chocolate 6
2 chocolate 8
3 chocolate 4
3 chocolate 5
3 chocolate 6
3 chocolate 4
3 chocolate 4
;
proc glm data=pudding;
   title 'pudding taste evaluation';
   title3 'two-way ANOVA - unbalanced design';
   title5 '---------------------------------';
   class sweet flavor;
   model rating = sweet | flavor;
   means sweet | flavor;
   lsmeans sweet | flavor / pdiff;
run;
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               5       39.96666667     7.99333333       9.36    0.0002
Error              18       15.36666667     0.85370370
Corrected Total    23       55.33333333

Source             DF       Type III SS    Mean Square    F Value    Pr > F
sweet               2       29.77706840    14.88853420      17.44    <.0001
flavor              1        1.56666667     1.56666667       1.84    0.1923
sweet*flavor        2        2.77809264     1.38904632       1.63    0.2241

By checking the p-values, we can see that sweetness is significant in the model, while flavor and the sweet*flavor interaction are not.
sweet    rating LSMEAN    LSMEAN Number
1           7.87500000    1
2           7.45833333    2
3           5.30000000    3

Least Squares Means for effect sweet
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: rating

i/j         1         2         3
1               0.3866    <.0001
2      0.3866              0.0003
3      <.0001    0.0003
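The LS-means for sweetness can be reproduced by hand: in this unbalanced design, each level's least-squares mean is the unweighted average of its two cell means (chocolate and vanilla), not the raw mean of all its ratings. A plain-Python check (illustrative only, data from the pudding example above):

```python
# LS-mean for each sweetness level = unweighted average of its cell means.
cells = {
    (1, "vanilla"): [9, 7, 8, 7],
    (2, "vanilla"): [8, 7, 8],
    (3, "vanilla"): [6, 5, 7],
    (1, "chocolate"): [9, 9, 7, 7, 8],
    (2, "chocolate"): [8, 7, 6, 8],
    (3, "chocolate"): [4, 5, 6, 4, 4],
}
cell_mean = {k: sum(v) / len(v) for k, v in cells.items()}
lsmeans = {s: (cell_mean[(s, "vanilla")] + cell_mean[(s, "chocolate")]) / 2
           for s in (1, 2, 3)}
print({s: round(m, 4) for s, m in lsmeans.items()})
# → {1: 7.875, 2: 7.4583, 3: 5.3}
```

A raw mean would weight each flavor cell by its number of subjects, which is exactly what the least-squares means avoid in an unbalanced design.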

The 3-by-3 matrix above shows all the pairwise multiple comparisons for SWEET. We can see that the differences between sweetness levels 1 and 3 and between levels 2 and 3 are both significant.

flavor        rating LSMEAN    H0: LSMean1=LSMean2  Pr > |t|
chocolate        6.61666667    0.1923
vanilla          7.13888889

Chocolate and vanilla are not significantly different from each other.

sweet    flavor       rating LSMEAN    LSMEAN Number
1        chocolate       8.00000000    1
1        vanilla         7.75000000    2
2        chocolate       7.25000000    3
2        vanilla         7.66666667    4
3        chocolate       4.60000000    5
3        vanilla         6.00000000    6

Least Squares Means for effect sweet*flavor
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: rating

i/j         1         2         3         4         5         6
1               0.6914    0.2419    0.6273    <.0001    0.0083
2      0.6914              0.4540    0.9073    <.0001    0.0233
3      0.2419    0.4540              0.5622    0.0005    0.0934
4      0.6273    0.9073    0.5622              0.0003    0.0404
5      <.0001    <.0001    0.0005    0.0003              0.0526
6      0.0083    0.0233    0.0934    0.0404    0.0526

From the matrix above, we can see which combinations of sweetness and flavor differ significantly. In this example, we can see that people really didn't like sweet chocolate: the sweet=3 chocolate cell has the lowest LS-mean (4.60) and differs significantly from most other cells.
