The Analysis of Variance of Diallel Tables
Author(s): B. I. Hayman
Source: Biometrics, Vol. 10, No. 2 (Jun., 1954), pp. 235-244
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/3001877
Accessed: 13/02/2012 07:03

THE ANALYSIS OF VARIANCE OF DIALLEL TABLES


B. I. HAYMAN

A.R.C. Unit of Biometrical Genetics, Department of Genetics, University of Birmingham

1. Introduction

A diallel cross is the set of $n^2$ possible single crosses and selfs between $n$ homozygous (inbred) lines; it provides a powerful method of investigating the relative genetical properties of these lines. A diallel table is a set of $n^2$ measurements associated with a diallel cross, e.g. measurements from the progeny of a diallel cross, or from later generations obtained by selfing or backcrossing these progeny. A summary of a method of describing the genetical situation generating a diallel table has already appeared (Jinks and Hayman, 1953) and fuller accounts will appear in papers by Jinks and by Hayman. Here an analysis of variance is described which tests additive and dominance effects in diallel tables obtained from the progeny of a diallel cross.

2. Additive systems

A single diallel table will be considered at first, but in practice it is desirable to replicate the experiment to provide estimates of error from the block interactions, because many of even the more complex interactions within the diallel table have a genetical meaning. Suppose that the measured character is controlled by genes at $k$ loci. In the simplest genetical system, with the genes acting independently and additively, the measurement of the progeny of a single cross is the mean of the two parental measurements. Maternal effects may cause differences between the progeny of reciprocal crosses, so we suppose the additive property to hold for the means of reciprocal crosses. Let $y_{rs}$ be the entry in the $r$th row and $s$th column of the diallel table, the common parent of each row being of one sex, and the common parent of each column of the other sex. (Hermaphrodites would be used as male
parents for rows and as female parents for columns). The appropriate statistical model to test for additive variation between the parents, and for maternal effects, is obtained by fitting constants to the table as follows:

$$y_{rs} = m + j_r + j_s + j_{rs} + k_r - k_s + k_{rs}$$

where

$m$ = grand mean,
$j_r$ = mean deviation from the grand mean due to the $r$th parent,
$j_{rs}$ = remaining discrepancy in the $rs$th reciprocal sum,
$2k_r$ = difference between the effects of the $r$th parental line used as male parent and as female parent,
$2k_{rs}$ = remaining discrepancy in the $rs$th reciprocal difference.
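
As a worked step, the constants can be written in terms of the marginal totals of the table. This is only a sketch, and it assumes the usual side conditions, which the text does not state explicitly: $\sum_r j_r = \sum_r k_r = 0$, $j_{rs} = j_{sr}$, $k_{rs} = -k_{sr}$, and $\sum_s j_{rs} = \sum_s k_{rs} = 0$ for every $r$. Here $y_{r.}$ and $y_{.r}$ denote the $r$th row and column totals and $y_{..}$ the grand total.

```latex
\begin{align*}
y_{r.} &= nm + nj_r + nk_r, \qquad y_{.r} = nm + nj_r - nk_r,\\
\hat m &= y_{..}/n^2,\\
\hat j_r &= (y_{r.} + y_{.r})/2n - y_{..}/n^2,\\
\hat k_r &= (y_{r.} - y_{.r})/2n,\\
\hat j_{rs} &= \tfrac12 (y_{rs} + y_{sr}) - \hat m - \hat j_r - \hat j_s,\\
\hat k_{rs} &= \tfrac12 (y_{rs} - y_{sr}) - \hat k_r + \hat k_s.
\end{align*}
```

Under these assumptions $2n \sum_r \hat j_r^2$ and $2n \sum_r \hat k_r^2$ expand to the sums of squares for the constants $j_r$ and $k_r$ respectively.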

Table 1 is the corresponding analysis of variance. A dot indicates summation over all values from 1 to $n$ of the omitted suffix, and the sigmas indicate summation over all values of $r$, or of $r$ and $s$.

TABLE 1.

Item   Constant    Sum of squares                                                               Degrees of freedom
a      $j_r$       $\sum (y_{r.} + y_{.r})^2/2n - 2y_{..}^2/n^2$                                $n - 1$
b      $j_{rs}$    $\sum (y_{rs} + y_{sr})^2/4 - \sum (y_{r.} + y_{.r})^2/2n + y_{..}^2/n^2$    $\frac12 n(n - 1)$
c      $k_r$       $\sum (y_{r.} - y_{.r})^2/2n$                                                $n - 1$
d      $k_{rs}$    $\sum (y_{rs} - y_{sr})^2/4 - \sum (y_{r.} - y_{.r})^2/2n$                   $\frac12 (n - 1)(n - 2)$
Total              $\sum y_{rs}^2 - y_{..}^2/n^2$                                               $n^2 - 1$

The four sums of squares measure
(a) variation between the mean effects of each parental line,
(b) variation in the reciprocal sums not ascribable to (a),
(c) average maternal effects of each parental line,
(d) variation in the reciprocal differences not ascribable to (c).

This analysis was given by Yates (1947) who used (b) as the error against which to test line differences (a), and (d) as the error for maternal effects (c). That is equivalent to analysing separately the row (or column) means of two distinct two-way tables, one containing the sums of measurements from reciprocal single crosses, and the other the differences of reciprocals.
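
The sums of squares of Table 1 translate directly into a few lines of code. The following Python sketch is illustrative only: the function name `table1_sums_of_squares` and the four parental values are hypothetical, and the table is assumed to be a complete $n \times n$ list of lists that includes the selfs. It builds a purely additive table without maternal effects, in which every cross is the mean of its two parents, so that all the variation should fall in (a).

```python
def table1_sums_of_squares(y):
    """Return the Table 1 sums of squares (a), (b), (c), (d) and the total."""
    n = len(y)
    row = [sum(y[r]) for r in range(n)]                        # y_r.
    col = [sum(y[r][s] for r in range(n)) for s in range(n)]   # y_.r
    grand = sum(row)                                           # y_..
    correction = grand ** 2 / n ** 2                           # y_..^2 / n^2
    pairs = [(r, s) for r in range(n) for s in range(n)]       # all r and s

    sum_marg = sum((row[r] + col[r]) ** 2 for r in range(n)) / (2 * n)
    diff_marg = sum((row[r] - col[r]) ** 2 for r in range(n)) / (2 * n)
    sum_recip = sum((y[r][s] + y[s][r]) ** 2 for r, s in pairs) / 4
    diff_recip = sum((y[r][s] - y[s][r]) ** 2 for r, s in pairs) / 4
    total = sum(y[r][s] ** 2 for r, s in pairs) - correction

    return {
        "a": sum_marg - 2 * correction,
        "b": sum_recip - sum_marg + correction,
        "c": diff_marg,
        "d": diff_recip - diff_marg,
        "total": total,
    }


# Hypothetical parental measurements (not data from the paper).
parents = [10.0, 14.0, 20.0, 26.0]
# Purely additive system: each entry is the mean of its two parents.
additive = [[(p + q) / 2 for q in parents] for p in parents]

ss = table1_sums_of_squares(additive)
for item, value in ss.items():
    print(item, value)
```

For this additive table (b), (c) and (d) vanish and (a) equals the total. With real data each sum of squares would be divided by its degrees of freedom from Table 1 before (a) is compared with (b), and (c) with (d).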

3. Dominance

The inclusion of dominance in the genetical system alters the situation radically. Since the deviation of progeny from their parental mean depends on dominance, (b) in Table 1 is a measure of dominance. Hence, in the absence of replication, (d) must be used as the common error against which to test (a), (b) and (c).

To interpret the components (a) and (b) more precisely we introduce a biometrical genetical model similar to Mather's (1949) specification of the effects of a polygenic system. As there are $n$ ($> 2$) homozygous parents in a diallel cross we consider multiple allelic systems and suppose that $m_i$ different alleles occur at the $i$th locus ($i = 1, 2, \ldots, k$) in the set of parents. The genotype at the $i$th locus of any individual may be represented by a pair of integers $(a, b)$ where $a$ and $b = 1, 2, \ldots, m_i$. The whole genotype controlling a character is represented by $k$ pairs of numbers $(a, b)$. In a parent the representation is $k$ pairs of identical numbers $(a, a)$. If the genes at non-homologous loci do not interact, let $d_{abi}$ be the contribution of $(a, b)$ at the $i$th locus to the measurement. Then the measurements of two parents and their $F_1$ are respectively $\sum_i d_{ai}$, $\sum_i d_{bi}$ and $\sum_i d_{abi}$ (writing $d_{ai}$ for $d_{aai}$). In the additive system of section 2, $d_{abi} = \frac12 (d_{ai} + d_{bi})$ but, with interaction between alleles at homologous loci, i.e. dominance, we put $d_{abi} = h_{abi} + \frac12 (d_{ai} + d_{bi})$, $h_{abi}$ being the measure of dominance. Lastly, let $u_{ai}$ ($\sum_a u_{ai} = 1$) be the frequency of allele $a$ at the $i$th locus in the parents. Assuming that the genes at different loci are distributed independently in the parents, we find that the mean squares corresponding to (a) and (b) are

$$2n \sum_i \sum_a u_{ai} \Bigl( \tfrac12 d_{ai} - \tfrac12 \sum_b u_{bi} d_{bi} + \sum_b u_{bi} h_{abi} - \sum_{b,c} u_{bi} u_{ci} h_{bci} \Bigr)^2 + \sigma_e^2$$

and

$$2 \sum_i \sum_{a,b} u_{ai} u_{bi} \Bigl( h_{abi} - \sum_c u_{ci} h_{aci} - \sum_c u_{ci} h_{bci} + \sum_{c,d} u_{ci} u_{di} h_{cdi} \Bigr)^2 + \sigma_e^2 .$$

$\sigma_e^2$ is the variance of entries in the diallel table due to environmental causes and is assumed to be independent of the genetic variation.
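
One way to read the bracket in the mean square (a) is as the deviation, at locus $i$, of the mean of all crosses having allele $a$ in one parent from the overall mean of crosses. The short derivation below uses only the definition $d_{abi} = h_{abi} + \frac12 (d_{ai} + d_{bi})$; weighting the second allele by the parental frequencies $u_{bi}$ is an assumption of this sketch.

```latex
\begin{align*}
\sum_b u_{bi} d_{abi} - \sum_{b,c} u_{bi} u_{ci} d_{bci}
 &= \Bigl( \tfrac12 d_{ai} + \tfrac12 \sum_b u_{bi} d_{bi} + \sum_b u_{bi} h_{abi} \Bigr)
  - \Bigl( \sum_b u_{bi} d_{bi} + \sum_{b,c} u_{bi} u_{ci} h_{bci} \Bigr) \\
 &= \tfrac12 d_{ai} - \tfrac12 \sum_b u_{bi} d_{bi}
  + \sum_b u_{bi} h_{abi} - \sum_{b,c} u_{bi} u_{ci} h_{bci} .
\end{align*}
```

With no dominance (all $h_{abi} = 0$) the expression reduces to half the deviation of $d_{ai}$ from its mean, so that (a) then measures additive variation only.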
Table 3 contains in the second column the corresponding quantities for the two-allele case with $u_{1i} = u_i$, $u_{2i} = v_i$, $u_i - v_i = w_i$, $d_{1i} = d_i$, $d_{2i} = -d_i$ and $h_{12i} = h_i$. This is Mather's (1949, p. 74) notation, and equivalents in terms of his random mating $D$ and $H$ are in the fourth column. The third column contains equivalents in the notation of Jinks and Hayman (1953) with the additional definition $h = 4 \sum_i u_i v_i h_i$. We will continue to discuss the general case but essentially the same conclusions may be drawn from the simpler two-allele system. Since (b) reduces to $\sigma_e^2$ only when all $h_{abi} = 0$, it clearly detects mean square dominance. The other mean square (a), which in section 2 detected additive variation, here detects dominance variation as well, unless the frequencies $u_{ai}$ satisfy the symmetry condition given later. (a) and (b) respectively measure general and specific combining ability differences as defined by Henderson (1952). The mean squares (c) and (d) both estimate $\sigma_e^2$ in the absence of maternal effects.

At this stage biometrical genetics tends to diverge from this simple statistical approach. The obvious estimator of purely additive genetic variation is the variance of the parental measurements, i.e. of the diagonal entries in the diallel table. This is $\sum_i \sum_a u_{ai} (d_{ai} - \sum_b u_{bi} d_{bi})^2$ (Jinks and Hayman $D$ in the two-allele case), whether or not dominance is present, but unfortunately we cannot test its significance by this analysis of variance. Many other interesting statistics exist whose significance is difficult to establish. However, we can extend the linear statistical model of section 2 by fitting constants for the dominance difference between parental mean and progeny mean and for deviations from this due to specific parents. The new corresponding sums of squares will be components of (b) but their meaning may not be clear until they have been expressed in terms of genetical parameters. Let
$$y_{rs} = m + j_r + j_s + l + l_r + l_s + l_{rs} + k_r - k_s + k_{rs} \qquad (r \neq s),$$

$$y_r = m + 2j_r - (n - 1)l - (n - 2)l_r \qquad (\text{for } y_{rr}).$$

The new constants are

$l$ = mean dominance deviation,
$l_r$ = further dominance deviation due to the $r$th parent,
$l_{rs}$ = remaining discrepancy in the $rs$th reciprocal sum.

The sum of squares (b) in Table 1 is replaced by those in Table 2. The third item is more conveniently obtained as a difference.
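TABLE 2.

Item   Constant    Sum of squares                                                                      Degrees of freedom
b1     $l$         $(y_{..} - ny_{.})^2/n^2(n - 1)$                                                    $1$
b2     $l_r$       $\sum (y_{r.} + y_{.r} - ny_r)^2/n(n - 2) - (2y_{..} - ny_{.})^2/n^2(n - 2)$        $n - 1$
b3     $l_{rs}$    $\sum (y_{rs} + y_{sr})^2/4 - \sum y_r^2 - \sum (y_{r.} + y_{.r} - 2y_r)^2/2(n - 2) + (y_{..} - y_{.})^2/(n - 1)(n - 2)$    $\frac12 n(n - 3)$

Here $y_r$ is the parental entry $y_{rr}$ and $y_{.} = \sum_r y_r$ is the parental total.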
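
The partition of (b) into (b1), (b2) and (b3) can be checked numerically. The Python sketch below is illustrative only: the function name `table2_sums_of_squares`, the parental values and the constant dominance deviation are hypothetical, and $n \geq 3$ is assumed so that the divisors $n - 2$ do not vanish. The hypothetical table has the same dominance deviation in every cross, so that only the mean dominance item (b1) should be non-zero.

```python
def table2_sums_of_squares(y):
    """Split (b) of Table 1 into (b1), (b2), (b3) with the Table 2 formulae."""
    n = len(y)
    row = [sum(y[r]) for r in range(n)]                        # y_r.
    col = [sum(y[r][s] for r in range(n)) for s in range(n)]   # y_.r
    grand = sum(row)                                           # y_..
    diag = [y[r][r] for r in range(n)]                         # y_r (selfs)
    parents_total = sum(diag)                                  # y_.
    marg = [row[r] + col[r] for r in range(n)]                 # y_r. + y_.r
    sum_recip = sum((y[r][s] + y[s][r]) ** 2
                    for r in range(n) for s in range(n)) / 4

    b1 = (grand - n * parents_total) ** 2 / (n ** 2 * (n - 1))
    b2 = (sum((marg[r] - n * diag[r]) ** 2 for r in range(n)) / (n * (n - 2))
          - (2 * grand - n * parents_total) ** 2 / (n ** 2 * (n - 2)))
    b3 = (sum_recip - sum(v * v for v in diag)
          - sum((marg[r] - 2 * diag[r]) ** 2 for r in range(n)) / (2 * (n - 2))
          + (grand - parents_total) ** 2 / ((n - 1) * (n - 2)))
    # (b) of Table 1, for comparison with b1 + b2 + b3.
    b = sum_recip - sum(v * v for v in marg) / (2 * n) + grand ** 2 / n ** 2
    return {"b1": b1, "b2": b2, "b3": b3, "b": b}


# Hypothetical parents and a constant dominance deviation (not from the paper).
parents = [10.0, 14.0, 20.0, 26.0]
dominance = 3.0
n = len(parents)
table = [[parents[r] if r == s else (parents[r] + parents[s]) / 2 + dominance
          for s in range(n)] for r in range(n)]

parts = table2_sums_of_squares(table)
print(parts)
```

For this table (b2) and (b3) are zero and (b1) equals (b); for an arbitrary table the three items should still add up to (b), which gives a convenient arithmetic check.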

In terms of the biometrical genetical model the mean square (b1) is

$$n^2 \Bigl( \sum_i \sum_{a,b} u_{ai} u_{bi} h_{abi} \Bigr)^2 / (n - 1) + \sigma_e^2 ,$$

which estimates the square of the mean dominance as expected. Table 3 contains the corresponding mean square for the two-allele case. Since $h_{abi}$ may be either positive or negative this mean dominance may be zero without the mean square dominance vanishing. The mean square (b2) is

$$4n \sum_i \sum_a u_{ai} \Bigl( \sum_b u_{bi} h_{abi} - \sum_{b,c} u_{bi} u_{ci} h_{bci} \Bigr)^2 / (n - 2) + \sigma_e^2 .$$

This reduces to $\sigma_e^2$ either in the absence of dominance or when the gene frequencies satisfy the symmetry relation

$$u_{ai} = \sum_b H_{abi} \Big/ \sum_{a,b} H_{abi} ,$$

where $H_{abi}$ is the cofactor of $h_{abi}$ in the determinant $|h_{abi}|$ ($a, b = 1, 2, \ldots, m_i$). This is also the condition that mean square (a) should detect only additive variation. Illustrative examples of this relation are

(i) When $h_{abi}$ = constant for all $a \neq b$ then $u_{ai} = 1/m_i$, i.e. all alleles at any one locus are equally frequent.
(ii) In the two-allele case $u_i = v_i = \frac12$, which is also obvious from Table 3.
(iii) In the three-allele case $u_{1i} : u_{2i} : u_{3i} = h_{23i}(h_{31i} + h_{12i} - h_{23i}) : h_{31i}(h_{12i} + h_{23i} - h_{31i}) : h_{12i}(h_{23i} + h_{31i} - h_{12i})$.

TABLE 3.

Item   Mean squares with two alleles                            Jinks and Hayman (1953)                  Mather (1949)
a      $2n \sum u_i v_i (d_i - h_i w_i)^2 + \sigma_e^2$         $\frac12 n(D - F + H_1 - H_2) + E$       $\frac12 nD + E$
b      $8 \sum u_i^2 v_i^2 h_i^2 + \sigma_e^2$                  $\frac12 H_2 + E$                        $\frac12 H + E$
b1     $4n^2 (\sum u_i v_i h_i)^2/(n - 1) + \sigma_e^2$         $\frac14 n^2 h^2/(n - 1) + E$
b2     $4n \sum u_i v_i w_i^2 h_i^2/(n - 2) + \sigma_e^2$       $n(H_1 - H_2)/(n - 2) + E$
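
The symmetry relation can be checked numerically for a single locus. The Python sketch below is illustrative only: the helper names and the three dominance values are hypothetical. It computes the frequencies from the cofactors of a three-allele dominance matrix and then evaluates the single-locus sum $\sum_a u_a (\sum_b u_b h_{ab} - \sum_{b,c} u_b u_c h_{bc})^2$ that appears in the mean square (b2), which should vanish at those frequencies.

```python
def cofactor_3x3(h, a, b):
    """Cofactor of h[a][b] in the determinant of a 3 x 3 matrix h."""
    rows = [r for r in range(3) if r != a]
    cols = [c for c in range(3) if c != b]
    minor = (h[rows[0]][cols[0]] * h[rows[1]][cols[1]]
             - h[rows[0]][cols[1]] * h[rows[1]][cols[0]])
    return (-1) ** (a + b) * minor


def symmetric_frequencies(h):
    """Frequencies u_a = sum_b H_ab / sum_ab H_ab for three alleles."""
    row_sums = [sum(cofactor_3x3(h, a, b) for b in range(3)) for a in range(3)]
    total = sum(row_sums)
    return [s / total for s in row_sums]


def b2_genetic_term(u, h):
    """Single-locus sum in (b2): sum_a u_a (array mean of h - overall mean)^2."""
    m = len(u)
    array_mean = [sum(u[b] * h[a][b] for b in range(m)) for a in range(m)]
    overall = sum(u[a] * array_mean[a] for a in range(m))
    return sum(u[a] * (array_mean[a] - overall) ** 2 for a in range(m))


# Hypothetical dominance deviations h12, h23, h31 (not from the paper).
h12, h23, h31 = 3.0, 4.0, 5.0
h = [[0.0, h12, h31],
     [h12, 0.0, h23],
     [h31, h23, 0.0]]

u = symmetric_frequencies(h)
print("frequencies:", u)
print("(b2) term at symmetric frequencies:", b2_genetic_term(u, h))

# Two-allele comparison with unequal frequencies u = 0.75, v = 0.25, h = 1.
two_allele = b2_genetic_term([0.75, 0.25], [[0.0, 1.0], [1.0, 0.0]])
print("(b2) term with two unequal alleles:", two_allele)
```

With these values the frequencies stand in the ratio 16 : 10 : 18, as given by example (iii), and the (b2) term is zero apart from rounding error. In the two-allele comparison the term equals $u v w^2 h^2$, in agreement with the two-allele expression for (b2) in Table 3.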

The mean square (b3) also estimates dominance but has no simple interpretation, though, when the gene frequencies satisfy the symmetry relation, (b1) and (b3) together provide a test of dominance equivalent to (b).

4. Subdividing the experiment

Limitations of labour and equipment may necessitate the performance of the diallel cross in sections in different places or at different times, as in a Drosophila experiment of Durrant and Mather (1954). If a Latin square is superimposed upon the diallel table each letter indicates a set of single crosses which may be performed apart from the other sets. In the analysis of variance the sum of squares for the time or distance effect is computed in the way usual for Latin squares. The letters of the Latin square are orthogonal to its rows and columns so that this sum of squares is independent of (a) and (c) in the analysis of variance of the diallel table; it is not independent of the other components. The analysis of variance thus contains the time component, (a), (c) and a remainder.

By restricting the Latin square, further orthogonal items may be extracted from the above remainder sum of squares. If the Latin square is symmetrical about the main diagonal of the diallel table, i.e. each pair of reciprocal crosses lies in one set, then (d) is also independent of the time effect. When the Latin square has $n$ different letters in the leading diagonal, so that each self lies in a different set, (b1), the measure of mean dominance, is orthogonal to the time effect. When $n$ is odd the Latin square can both be symmetrical and have $n$ different letters in the diagonal and an analysis is possible into the independent sums of squares (a), (b1), (c), (d), time effect and remainder. Unfortunately no test of mean square dominance seems possible. The second square in Table 4 is an example of a restricted 5 × 5 Latin square derived from the first square by simultaneous permutation of the rows and columns at random.

TABLE 4.

A B C D E        A B C D E
B C D E A        B D E C A
C D E A B        C E B A D
D E A B C        D C A E B
E A B C D        E A D B C

5. Worked example

The data used to illustrate the analysis of sections 2 and 3 were kindly supplied by Dr. Jinks. They are the flowering times, in days from a certain date in 1951, of Nicotiana rustica plants from a diallel cross of eight inbred varieties. These plants were grown in two blocks, each containing 64 plots; each cross or self was represented by 10 progeny, grown in two plots of 5, with one plot in each block. This duplication of the experiment provides independent tests of the significance of every one of the components described in the analysis of variance of a single diallel table. The two diallel tables, I and II, in Table 5 contain 10 times the mean flowering time per plot.

TABLE 5.

(rows: ♂ parent; columns: ♀ parent)

I                              1     2     3     4     5     6     7     8   $y_{r.}$
1                            276   156   322   250   162   193   222   152    1733
2                            136   166   164   134   102   150    96    90    1038
3                            246   158   416   213   160   222   128   166    1709
4                            318   132   218   272   138   195   108   124    1505
5                            150   124   164   164   156   158   100   114    1130
6                            182   136   204   216   133   174   112   120    1277
7                            174    86   194   142    86    92    58    94     926
8                            152   128   158   136   126   114    84   142    1040
$y_{.r}$                    1634  1086  1840  1527  1063  1298   908  1002   10358   ($y_{..}$)
$y_{r.} + y_{.r}$           3367  2124  3549  3032  2193  2575  1834  2042   20716   ($2y_{..}$)
$y_{r.} - y_{.r}$             99   -48  -131   -22    67   -21    18    38    1660   ($y_{.}$)
$y_{r.} + y_{.r} - 8y_r$    1159   796   221   856   945  1183  1370   906    7436   ($2y_{..} - 8y_{.}$)
$y_{rs} - y_{sr}$                  -20   -76    68   -12   -11   -48     0
                                          -6    -2    22   -14   -10    38
                                                 5     4   -18    66    -8
                                                      26    21    34    12
                                                           -25   -14    12
                                                                 -20    -6
                                                                       -10

II                             1     2     3     4     5     6     7     8   $y_{r.}$
1                            302   178   274   246   140   204   254   154    1752
2                            142   175   136   128   128   174   116   114    1113
3                            242   174   360   178   140   208   160   154    1616
4                            204   138   206   210   130   192   138   176    1394
5                            180   140   156   146   176   192   104   170    1264
6                            186   146   202   222   150   166   136   176    1384
7                            162   100   162   100    98    84    48   142     896
8                            154   138   140   144   124   112    96   166    1074
$y_{.r}$                    1572  1189  1636  1374  1086  1332  1052  1252   10493   ($y_{..}$)
$y_{r.} + y_{.r}$           3324  2302  3252  2768  2350  2716  1948  2326   20986   ($2y_{..}$)
$y_{r.} - y_{.r}$            180   -76   -20    20   178    52  -156  -178    1603   ($y_{.}$)
$y_{r.} + y_{.r} - 8y_r$     908   902   372  1088   942  1388  1564   998    8162   ($2y_{..} - 8y_{.}$)
$y_{rs} - y_{sr}$                  -36   -32   -42    40   -18   -92     0
                                          38    10    12   -28   -16    24
                                                28    16    -6     2   -14
                                                      16    30   -38   -32
                                                           -42    -6   -46
                                                                 -52   -64
                                                                       -46
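
The arithmetic for a single diallel table can be checked mechanically. The Python sketch below is illustrative only: the function name `intermediate_sums` and the dictionary keys are hypothetical labels. It enters diallel table I of Table 5 (rows $r$, columns $s$) and evaluates the eight intermediate quantities that are tabulated for this table in column I of Table 6.

```python
# Diallel table I of Table 5: 10 x mean flowering time per plot.
TABLE_I = [
    [276, 156, 322, 250, 162, 193, 222, 152],
    [136, 166, 164, 134, 102, 150,  96,  90],
    [246, 158, 416, 213, 160, 222, 128, 166],
    [318, 132, 218, 272, 138, 195, 108, 124],
    [150, 124, 164, 164, 156, 158, 100, 114],
    [182, 136, 204, 216, 133, 174, 112, 120],
    [174,  86, 194, 142,  86,  92,  58,  94],
    [152, 128, 158, 136, 126, 114,  84, 142],
]


def intermediate_sums(y):
    """Intermediate sums of squares for one diallel table."""
    n = len(y)
    row = [sum(line) for line in y]                            # y_r.
    col = [sum(y[r][s] for r in range(n)) for s in range(n)]   # y_.r
    grand = sum(row)                                           # y_..
    parents_total = sum(y[r][r] for r in range(n))             # y_.
    pairs = [(r, s) for r in range(n) for s in range(n)]
    return {
        "sum_y2": sum(y[r][s] ** 2 for r, s in pairs),
        "correction": grand ** 2 / n ** 2,
        "marginal_sums": sum((row[r] + col[r]) ** 2 for r in range(n)) / (2 * n),
        "mean_dominance": (grand - n * parents_total) ** 2 / (n ** 2 * (n - 1)),
        "parental_deviations": sum((row[r] + col[r] - n * y[r][r]) ** 2
                                   for r in range(n)) / (n * (n - 2)),
        "deviation_correction": (2 * grand - n * parents_total) ** 2
                                / (n ** 2 * (n - 2)),
        "marginal_differences": sum((row[r] - col[r]) ** 2
                                    for r in range(n)) / (2 * n),
        "reciprocal_differences": sum((y[r][s] - y[s][r]) ** 2
                                      for r, s in pairs) / 4,
    }


sums_I = intermediate_sums(TABLE_I)
for name, value in sums_I.items():
    print(f"{name:24s}{value:12,.0f}")
```

Rounded to whole numbers, the eight values agree with column I of Table 6. The final sums of squares for this table then follow from the formulae of Tables 1 and 2, for example (c) is the value of `marginal_differences` and (d) is `reciprocal_differences` minus `marginal_differences`.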

The computations should be carefully arranged as in Table 5. Diallel table III contains the sum of corresponding pairs of entries in the first two diallel tables. Beside each of the three diallel tables are the row sums $y_{r.}$ and below them are the column sums $y_{.r}$, the combined row and column sums $y_{r.} + y_{.r}$, the row and column differences $y_{r.} - y_{.r}$, the parental deviations $y_{r.} + y_{.r} - ny_r$ and the full set of differences between reciprocal crosses. The totals of the sets of sub-totals provide simple checks and the values of $y_{..}$ and $2y_{..} - ny_{.}$. The parental totals $y_{.}$ have been placed at the ends of the rows of values of $y_{r.} - y_{.r}$ which, of course, sum to zero.

TABLE 5 (cont.)

III                            1     2     3     4     5     6     7     8   $y_{r.}$
1                            578   334   596   496   302   397   476   306    3485
2                            278   341   300   262   230   324   212   204    2151
3                            488   332   776   391   300   430   288   320    3325
4                            522   270   424   482   268   387   246   300    2899
5                            330   264   320   310   332   350   204   284    2394
6                            368   282   406   438   283   340   248   296    2661
7                            336   186   356   242   184   176   106   236    1822
8                            306   266   298   280   250   226   180   308    2114
$y_{.r}$                    3206  2275  3476  2901  2149  2630  1960  2254   20851   ($y_{..}$)
$y_{r.} + y_{.r}$           6691  4426  6801  5800  4543  5291  3782  4368   41702   ($2y_{..}$)
$y_{r.} - y_{.r}$            279  -124  -151    -2   245    31  -138  -140    3263   ($y_{.}$)
$y_{r.} + y_{.r} - 8y_r$    2067  1698   593  1944  1887  2571  2934  1904   15598   ($2y_{..} - 8y_{.}$)
$y_{rs} - y_{sr}$                  -56  -108    26    28   -29  -140     0
                                          32     8    34   -42   -26    62
                                                33    20   -24    68   -22
                                                      42    51    -4   -20
                                                           -67   -20   -34
                                                                 -72   -70
                                                                       -56

Table 6 contains intermediate sums of squares, computed directly for the first two diallel tables, and halved for the third.

TABLE 6.

                                                      I            II           III
$\sum y_{rs}^2$                                   1,931,932    1,890,133    3,795,662
$y_{..}^2/n^2$                                    1,676,378    1,720,360    3,396,595
$\sum (y_{r.} + y_{.r})^2/2n$                     3,538,105    3,543,103    7,070,907
$(y_{..} - ny_{.})^2/n^2(n - 1)$                     19,058       12,128       30,797
$\sum (y_{r.} + y_{.r} - ny_r)^2/n(n - 2)$          161,432      192,004      350,946
$(2y_{..} - ny_{.})^2/n^2(n - 2)$                   143,995      173,485      316,794
$\sum (y_{r.} - y_{.r})^2/2n$                         2,278        8,086        6,739
$\sum (y_{rs} - y_{sr})^2/4$                         12,168       17,754       19,112

The formulae of Tables 1 and 2, applied to the third intermediate set of sums of squares, provide the final sums of squares (a), (b1), (b2), (b3), (c) and (d) which measure mean effects over the two diallel tables. The excesses of the totals of the two similar final sums of squares for the first two diallel tables over the final sums for the third measure the block interactions or errors of the mean effects. As a check, (b3) and its block interaction can be computed from sums of reciprocal crosses, but we have simply obtained them by difference from the total sums of squares. The sum of squares (B) for the overall block difference is computed in the usual way. Table 7 contains this analysis of variance.

TABLE 7.

Item     Sum of squares     df     Mean square     P
a            277,717         7       39,674        <.001
b1            30,797         1       30,797        <.001
b2            34,153         7        4,879        <.001
b3            37,289        20        1,864        <.001
b            102,238        28        3,651        <.001
c              6,739         7          963        .05-.01
d             12,373        21          589        .20-.10
t            399,067        63
B                142         1          142
Ba            10,016         7        1,431
Bb1              390         1          390
Bb2            1,803         7          258
Bb3            3,241        20          162
Bc             3,625         7          518
Bd             7,185        21          342
Bt            26,260        63          417
Total        425,470       127

(b) is the sum of
(b1), (b2) and (b3), (t) is the sum of the main effects apart from (B), and (Bt) is the sum of the interaction sums of squares. Each error is the interaction with the environment of the corresponding mean effect and, since we would not expect, for example, additive and dominance variation to be influenced to the same extent by the environment, we must generally test each mean effect against its own interaction. However, Bartlett's test for heterogeneity of the six error variances gives $\chi^2 = 6.4$, so that in this case the error variances may be pooled to give (Bt) as a common error variance. Comparison with this provides the significance levels in the last column of Table 7.

The interpretation of the results is straightforward. The significance of (a) shows genetical variation amongst the parents and of (b) dominance at some of the loci. The parental mean is greater than the progeny mean (from (b1)), indicating dominance for early flowering time. The significance of (b2) implies asymmetry in the gene distribution. The two items (c) and (d) show that some maternal effect may be present. Finally, there is no evidence that the difference in environment between the blocks (B) has caused any variation in flowering time.

6. Summary

An analysis of variance of diallel tables is developed which detects both additive genetic variation and dominance deviations. The mean squares are formulated in terms of a biometrical genetical model. Flowering times from a diallel cross of eight inbred varieties of Nicotiana rustica are analysed and the type of genetic variation present described.

7. References
Durrant, A. and Mather, K. Heritable variation in a long inbred line of Drosophila. Genetica, in the press.
Hayman, B. I. The theory and analysis of diallel crosses. Genetics, in the press.
Henderson, C. R. Specific and general combining ability. Heterosis, Iowa State College Press, 352-370, 1952.
Jinks, J. L. and Hayman, B. I. The analysis of diallel crosses. Maize Genetics Cooperation News Letter, 27, 48-54, 1953.
Jinks, J. L. The analysis of heritable variation in a diallel cross of Nicotiana rustica varieties. Genetics, in the press.
Mather, K. Biometrical Genetics. London, Methuen and Co., 1949.
Yates, F. Analysis of data from all possible reciprocal crosses between a set of parental lines. Heredity, 1, 287-301, 1947.