
NCSSM Statistics Leadership Institute Notes Experimental Design

Randomized Complete Block Design


In the previous section, we analyzed results from a completely randomized design
with a one-way analysis of variance. This design ignored the physical layout of the cages
and the potential effect of the height of cage in which a rabbit was housed. If we wanted
to acknowledge the potential effect of height of the cage on the coagulation time, we
should organize the experiment using a randomized complete block design. One diet of
each type will be used on each of the 4 shelves. The randomization procedure would
assign a number 1-16 to each of the rabbits. Put four slips of paper marked 1, 2, 3, and 4
in a bowl. Select a number at random 1-16 to select a rabbit for Diet A, and pull a
number out of the bowl to select a position on the top row. Repeat three times without
replacement for Diets B, C, and D to complete the assignment to the top row. Follow the
same procedure to assign the other three rows. An example is shown below.
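
The randomization procedure described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `randomize_rcbd`, its parameters, and the `seed` argument are hypothetical and do not come from the notes, and the shuffles stand in for drawing rabbit numbers and slips without replacement.

```python
import random

def randomize_rcbd(n_rabbits=16, diets=("A", "B", "C", "D"), n_shelves=4, seed=None):
    """Assign each rabbit a shelf, a position on that shelf, and a diet,
    so that every diet appears exactly once on every shelf (block)."""
    rng = random.Random(seed)
    rabbits = list(range(1, n_rabbits + 1))
    rng.shuffle(rabbits)  # stands in for drawing rabbit numbers 1-16 without replacement
    layout = []
    for shelf in range(1, n_shelves + 1):
        positions = list(range(1, len(diets) + 1))
        rng.shuffle(positions)  # stands in for pulling the slips 1-4 out of the bowl
        for diet, position in zip(diets, positions):
            layout.append(
                {"shelf": shelf, "position": position, "diet": diet, "rabbit": rabbits.pop()}
            )
    return layout

# The seed is fixed here only so the illustration is reproducible.
layout = randomize_rcbd(seed=1)
for cage in sorted(layout, key=lambda c: (c["shelf"], c["position"])):
    print(cage)
```

Because the diets are randomized separately within each shelf, the shelf acts as the block: every diet is compared within every shelf height.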

This is a randomized block design and should not be analyzed with the one-way
ANOVA. For the sake of illustration, the data collected with this method is the same as
with the completely randomized design. Ordering the data so it is easier to read, we have
the following observations.

                 Diet A   Diet B   Diet C   Diet D   Mean for Shelf
Shelf 1            62       63       68       56         62.25
Shelf 2            60       67       66       62         63.75
Shelf 3            63       71       71       60         66.25
Shelf 4            59       64       67       61         62.75

Mean for Diet      61      66.25     68      59.75       63.75 (Grand Mean)

Two-way ANOVA

The model that includes the blocking variable is

$$Y_{ti} = \mu + \tau_t + \beta_i + \varepsilon_{ti}.$$

By blocking on the shelf position, we hope to increase the power of the test by removing
variability associated with shelf height. This would allow us to detect smaller differences
between treatments.
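Our estimates are

$$y_{ti} = \bar{y}_{\cdot\cdot} + \underbrace{(\bar{y}_{t\cdot} - \bar{y}_{\cdot\cdot})}_{\hat{\tau}_t} + \underbrace{(\bar{y}_{\cdot i} - \bar{y}_{\cdot\cdot})}_{\hat{\beta}_i} + \hat{\varepsilon}_{ti}.$$
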
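Considering the sums of squares, we have

$$\sum_t \sum_i y_{ti}^2 = n\bar{y}_{\cdot\cdot}^2 + \underbrace{\sum_t \sum_i (\bar{y}_{t\cdot} - \bar{y}_{\cdot\cdot})^2}_{\text{Trt SS}} + \underbrace{\sum_t \sum_i (\bar{y}_{\cdot i} - \bar{y}_{\cdot\cdot})^2}_{\text{Block SS}} + \sum_t \sum_i \hat{\varepsilon}_{ti}^2,$$

where $n = 16$ is the total number of observations and every sum runs over all of the observations. The expression $\sum_t \sum_i y_{ti}^2 - n\bar{y}_{\cdot\cdot}^2$ is the Total sum of squares, and the last term is the Error sum of squares, so we have

Total SS = Trt SS + Block SS + Error SS.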

Computing the sums of squares, we have

$$\underbrace{65300 - 65025}_{\sum_t \sum_i y_{ti}^2 \, - \, n\bar{y}_{\cdot\cdot}^2} = \underbrace{191.5}_{\text{Trt SS}} + \underbrace{38}_{\text{Block SS}} + \underbrace{45.5}_{\text{Error SS}}$$
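
As a check on the arithmetic, the decomposition can be recomputed directly from the table of observations. The following is a minimal sketch in plain Python; the variable names are illustrative and not taken from the notes.

```python
# Coagulation times: rows are Shelves 1-4 (blocks), columns are Diets A-D (treatments).
data = [
    [62, 63, 68, 56],
    [60, 67, 66, 62],
    [63, 71, 71, 60],
    [59, 64, 67, 61],
]

n_blocks = len(data)
n_trts = len(data[0])
n = n_blocks * n_trts

grand_mean = sum(sum(row) for row in data) / n
trt_means = [sum(row[t] for row in data) / n_blocks for t in range(n_trts)]
block_means = [sum(row) / n_trts for row in data]

# Total SS: sum of squared observations minus n times the squared grand mean.
total_ss = sum(y ** 2 for row in data for y in row) - n * grand_mean ** 2
# Each treatment mean is shared by n_blocks observations, each block mean by n_trts.
trt_ss = n_blocks * sum((m - grand_mean) ** 2 for m in trt_means)
block_ss = n_trts * sum((m - grand_mean) ** 2 for m in block_means)
# Error SS from the residuals y - (treatment mean) - (block mean) + (grand mean).
error_ss = sum(
    (data[i][t] - trt_means[t] - block_means[i] + grand_mean) ** 2
    for i in range(n_blocks)
    for t in range(n_trts)
)

print(total_ss, trt_ss, block_ss, error_ss)
```

Here the Error SS is computed from the residuals rather than by subtraction, so the agreement of the two sides of the identity is a genuine check.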

The computer output gives the same computed sums of squares.

Response: Coagulation Time

Effect Test
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Diet        3        191.50000        63.833   12.6264   0.0014
Row         3         38.00000        12.667    2.5055   0.1250
Error       9         45.50000        5.0556
C Total    15        275.00000

Diet   Mean        Shelf   Mean
A      61.0000     1       62.2500
B      66.2500     2       63.7500
C      68.0000     3       66.2500
D      59.7500     4       62.7500

From the computer output, we see that there is again a statistically significant
difference in coagulation time among the diets, with a slightly smaller p-value
(p = 0.0014) than in the completely randomized design. The mean square error has been
reduced to 5.0556 and the error degrees of freedom are reduced to 9. In this case,

$$\text{LSD} = 2.262\sqrt{\frac{2 \cdot 5.0556}{4}} = 3.596.$$

Any difference in means greater than 3.6 is considered significant. The means for Diet B
and Diet C are not significantly different, nor are those for Diet A and Diet D. However,
Diets B and C each have larger mean coagulation times than Diets A and D, by more than
the LSD.
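
The pairwise comparisons can be spelled out with a short Python sketch. The critical value 2.262 and the mean square error 5.0556 are taken from the text and the computer output rather than recomputed, and the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

diet_means = {"A": 61.0, "B": 66.25, "C": 68.0, "D": 59.75}
mse = 5.0556    # mean square error from the two-way ANOVA output
t_crit = 2.262  # t critical value quoted in the notes (9 error degrees of freedom)
r = 4           # number of observations behind each diet mean

lsd = t_crit * sqrt(2 * mse / r)
print(f"LSD = {lsd:.3f}")

# A pair of diets is declared different when their means differ by more than the LSD.
significant = {}
for a, b in combinations(diet_means, 2):
    diff = abs(diet_means[a] - diet_means[b])
    significant[(a, b)] = diff > lsd
    print(f"{a} vs {b}: difference = {diff:.2f}, significant = {significant[(a, b)]}")
```

Only the pairs (A, D) and (B, C) fall below the LSD, which matches the grouping of the diets described above.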

There is no evidence that there is a difference in mean coagulation time among
shelves. We did reduce the variation by blocking, but there was no significant shelf
effect. If we were to perform this experiment another time, a completely randomized
design would be appropriate.

Latin Square Design

Suppose we want to block for both the row and column position in the room. In
this case, we need to ensure that each diet appears once in each row and once in each
column. The Latin square allows this constraint to be satisfied. To set up a Latin square,
begin with the standard square shown below.

Now permute the columns. One example is C3, C4, C2, C1. Then permute the rows.
One example is R4, R1, R2, R3. This creates the Latin square below.
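
The permutation step can be sketched in Python. The standard square used here is an assumption: the cyclic 4 x 4 square is one common choice, and it may differ from the square pictured in the notes. The helper names `permute` and `is_latin` are also hypothetical. The column and row orders are the examples given in the text.

```python
# ASSUMPTION: a cyclic 4 x 4 standard square; the notes' own square may differ.
standard = [
    ["A", "B", "C", "D"],
    ["B", "C", "D", "A"],
    ["C", "D", "A", "B"],
    ["D", "A", "B", "C"],
]

col_order = [3, 4, 2, 1]  # C3, C4, C2, C1
row_order = [4, 1, 2, 3]  # R4, R1, R2, R3

def permute(square, row_order, col_order):
    """Reorder the columns first, then the rows, using 1-based labels."""
    cols_permuted = [[row[c - 1] for c in col_order] for row in square]
    return [cols_permuted[r - 1] for r in row_order]

def is_latin(square):
    """Check that every symbol appears exactly once in each row and each column."""
    symbols = set(square[0])
    rows_ok = all(len(row) == len(symbols) and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok

square = permute(standard, row_order, col_order)
for row in square:
    print(" ".join(row))
print("Latin square:", is_latin(square))
```

Permuting whole rows or whole columns only rearranges them, so the once-per-row and once-per-column property of the starting square is preserved.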

Note that each letter appears once in each row and column. Put the integers 1-16 in a
bowl. Select a rabbit and pick a number out of the bowl. That rabbit is assigned the
position in the array indicated and given the diet specific for that position. One possible
configuration is shown below.

Now the model is

$$Y_{tij} = \mu + \tau_t + \beta_i + \gamma_j + \varepsilon_{tij}.$$

By blocking on the shelf position and the column position, we hope to increase the power
of the test by removing variability associated with both row and column positions. We
give up degrees of freedom for reduction in variability.

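Our estimates are

$$y_{tij} = \bar{y}_{\cdot\cdot\cdot} + \underbrace{(\bar{y}_{t\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})}_{\hat{\tau}_t} + \underbrace{(\bar{y}_{\cdot i\cdot} - \bar{y}_{\cdot\cdot\cdot})}_{\hat{\beta}_i} + \underbrace{(\bar{y}_{\cdot\cdot j} - \bar{y}_{\cdot\cdot\cdot})}_{\hat{\gamma}_j} + \hat{\varepsilon}_{tij}.$$
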
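Considering the sums of squares, we have

$$\sum_t \sum_i \sum_j y_{tij}^2 = n\bar{y}_{\cdot\cdot\cdot}^2 + \underbrace{\sum_t \sum_i \sum_j (\bar{y}_{t\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2}_{\text{Trt SS}} + \underbrace{\sum_t \sum_i \sum_j (\bar{y}_{\cdot i\cdot} - \bar{y}_{\cdot\cdot\cdot})^2}_{\text{Shelf SS}} + \underbrace{\sum_t \sum_i \sum_j (\bar{y}_{\cdot\cdot j} - \bar{y}_{\cdot\cdot\cdot})^2}_{\text{Column SS}} + \sum_t \sum_i \sum_j \hat{\varepsilon}_{tij}^2,$$

where every sum runs over the $n = 16$ observations in the square. The expression $\sum_t \sum_i \sum_j y_{tij}^2 - n\bar{y}_{\cdot\cdot\cdot}^2$ is again the Total sum of squares, so we have

Total SS = Trt SS + Shelf SS + Column SS + Error SS.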

The computer output is given below.

Response: Coagulation Time


Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob>F
Diet        3        191.50000        63.833   13.2069   0.0047
Row         3         38.00000        12.667    2.6207   0.1455
Column      3         16.50000         5.5      1.1379   0.4065
Error       6         29.00000        4.8333
Total      15        275.00000

Diet   Mean        Shelf   Mean        Column   Mean
A      61.0000     1       62.2500     I        63.0000
B      66.2500     2       63.7500     II       64.7500
C      68.0000     3       66.2500     III      64.7500
D      59.7500     4       62.7500     IV       62.5000
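
The sums of squares and F ratios in this table can be recovered from the reported means alone. The sketch below is an illustration, not the software's own method: it takes the grand mean (63.75) and the Total SS (275) from the earlier calculation, obtains the Error SS by subtraction, and uses illustrative variable names.

```python
grand_mean = 63.75
total_ss = 275.0
total_df = 15
r = 4  # each diet, shelf, and column mean averages 4 observations

means = {
    "Diet": [61.0, 66.25, 68.0, 59.75],
    "Row": [62.25, 63.75, 66.25, 62.75],
    "Column": [63.0, 64.75, 64.75, 62.5],
}

# Sum of squares for each factor: r times the squared deviations of its means.
ss = {name: r * sum((m - grand_mean) ** 2 for m in vals) for name, vals in means.items()}
df = {name: len(vals) - 1 for name, vals in means.items()}

# Error SS and error degrees of freedom by subtraction from the totals.
error_ss = total_ss - sum(ss.values())
error_df = total_df - sum(df.values())
mse = error_ss / error_df

f_ratio = {name: (ss[name] / df[name]) / mse for name in ss}

for name in ss:
    print(f"{name:<7} SS = {ss[name]:7.2f}  F = {f_ratio[name]:.4f}")
print(f"Error   SS = {error_ss:7.2f}  df = {error_df}  MSE = {mse:.4f}")
```

The column factor removes 16.5 from the error sum of squares but also takes 3 of the 9 error degrees of freedom.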

From the computer output, we see that there is again a statistically significant
difference in coagulation time among the diets, but the p-value is larger than when only
one blocking variable was used (p = 0.0047). The mean square error is now 4.8333 and
the error degrees of freedom are reduced to 6. In this case,

$$\text{LSD} = 2.447\sqrt{\frac{2 \cdot 4.8333}{4}} = 3.804,$$

which is larger than the LSD when only one blocking variable was used. Since we now
need a larger difference in means to consider the difference significant, we can conclude
that adding the second blocking variable does not add power. The reduction in mean
square error is more than offset by the loss of error degrees of freedom. As a result of
blocking on a meaningless variable, we can only distinguish between treatments whose
means are more than 3.8 units apart. Blocking only on shelf allowed us to distinguish
treatments whose means were more than 3.6 units apart.
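
The trade-off between the two designs can be made explicit by splitting the change in the LSD into its two sources. This is a small illustrative sketch; the function `lsd` is a hypothetical helper, and the critical values and mean square errors are the ones quoted in the text.

```python
from math import sqrt

def lsd(t_crit, mse, r=4):
    """Least significant difference for two means of r observations each."""
    return t_crit * sqrt(2 * mse / r)

one_block = lsd(2.262, 5.0556)   # blocking on shelf only (9 error df)
two_blocks = lsd(2.447, 4.8333)  # blocking on shelf and column (6 error df)

# The LSD ratio factors into a critical-value ratio and a standard-error ratio.
t_ratio = 2.447 / 2.262          # penalty from losing error degrees of freedom
se_ratio = sqrt(4.8333 / 5.0556) # gain from the smaller mean square error

print(f"LSD, one blocking variable:  {one_block:.3f}")
print(f"LSD, two blocking variables: {two_blocks:.3f}")
print(f"t ratio = {t_ratio:.3f}, SE ratio = {se_ratio:.3f}, product = {t_ratio * se_ratio:.3f}")
```

The critical value grows by more than the standard error shrinks, so the product exceeds 1 and the LSD increases.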

The completely randomized design has the simplest analysis, and it should be
used if there are no structural factors in the experiment that need to be accounted for.
If the cages were arranged so that all cages were essentially equal relative to outside
influences, there would be no need to block, and the completely randomized design would
be the most efficient. As shown in this example, blocking when there is no need for
blocking reduces the power of the test, since it reduces the error degrees of freedom
without appreciably reducing the variability.
