International Biometric Society is collaborating with JSTOR to digitize, preserve and extend
access to Biometrics
This content downloaded from 189.191.173.43 on Tue, 26 Feb 2019 03:20:26 UTC
All use subject to https://about.jstor.org/terms
BIOMETRICS 32, 429-434
June 1976
C. JAMES SCHEIRER
Department of Psychology, State University of New York at Binghamton, Binghamton, New York 13901,
U.S.A.
WILLIAM S. RAY
NATHAN HARE
Department of Psychology, State University of New York at Binghamton, Binghamton, New York 13901
Summary
This paper presents a method for the analysis of ranked data arising from completely random-
ized factorial designs. The procedure, which is an extension of the Kruskal-Wallis ranks test,
allows for the calculation of interaction effects and linear contrasts. A Monte Carlo study of the
convergence of the test and a worked example are presented.
Key Words: Kruskal-Wallis test; H-test; Ranked data; Linear contrasts on ranks; Interaction on ranks.
1. Introduction
2. Procedure
a single score obtained for each subject. For the purpose of this paper it will be assumed
that there are n subjects in each of the k treatment groups so that N = nk. In general the
assumption of equal numbers of observations in each cell is not required for the Kruskal-
Wallis rank test; however, since the purpose of this paper is to present a method of analysis
for data arising from factorial designs and since the assumption of proportional cell fre-
quencies is necessary in that case (unless a more complex analysis is utilized), we proceed
with the assumption of equal numbers of observations in all cells.
Let $X_{ij}$ be the $i$th score in the $j$th group of a single-factor completely randomized design.
Furthermore assume that the scores are distributed continuously so that ranks, $R(X_{ij})$,
can be assigned to the $X_{ij}$ unambiguously. If the null hypothesis of no treatment effects
is true ($\mu_1 = \mu_2 = \cdots = \mu_j = \cdots = \mu_k$) then the expected values of the rank totals,
$R_j = \sum_{i=1}^{n} R(X_{ij})$, are all equal. This null hypothesis may be tested by computing the
Kruskal-Wallis H-statistic
\[ H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n} - 3(N+1), \]
and comparing the obtained value with the value of $\chi^2$ with $k - 1$ degrees of freedom (d.f.)
at the $\alpha$ level.
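The computation of H from raw scores can be illustrated with a minimal sketch. The function names and the sample scores are hypothetical and are not taken from the paper. The sketch assumes equal group sizes and no tied scores, in keeping with the assumptions stated above.

```python
def rank_scores(groups):
    """Rank all scores jointly (1 = smallest); assumes no tied scores."""
    pooled = sorted(x for group in groups for x in group)
    rank_of = {x: r for r, x in enumerate(pooled, start=1)}
    return [[rank_of[x] for x in group] for group in groups]


def kruskal_wallis_h(groups):
    """H for k groups of equal size n (hypothetical helper, equal n assumed)."""
    ranks = rank_scores(groups)
    k = len(groups)
    n = len(groups[0])
    N = n * k
    rank_totals = [sum(r) for r in ranks]  # the R_j of the text
    return 12.0 / (N * (N + 1)) * sum(R * R / n for R in rank_totals) - 3 * (N + 1)


# Hypothetical scores: three groups of three, completely separated.
scores = [[1.2, 2.5, 3.1], [4.0, 4.8, 5.5], [6.3, 7.7, 8.9]]
h = kruskal_wallis_h(scores)
print(round(h, 4))
```

With these made-up scores the rank totals are 6, 15 and 24, so H takes its largest possible value for three groups of three.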
If the $R(X_{ij})$ are treated as the dependent variable in an ANOVA, the algebra of ranks
shows that $H$ is equal to the sum of squares for treatment, calculated on
ranks (RSS), divided by a "mean square" for the total variability (RMS), also calculated
on ranks in the manner of the analysis of variance (ANOVA). That is,
\[ H = \frac{RSS_{\text{treatment}}}{RMS_{\text{total}}}. \]
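The equivalence can be checked numerically. The following sketch uses a hypothetical assignment of the ranks 1 to 9 to three groups of three. It computes H once from the rank-total formula and once as the ratio of the treatment sum of squares to the total mean square, both calculated on ranks. All names are illustrative.

```python
def h_from_formula(rank_groups):
    """H from rank totals, equal group sizes assumed."""
    k = len(rank_groups)
    n = len(rank_groups[0])
    N = n * k
    totals = [sum(g) for g in rank_groups]
    return 12.0 / (N * (N + 1)) * sum(R * R / n for R in totals) - 3 * (N + 1)


def h_from_rank_anova(rank_groups):
    """H as RSS_treatment / RMS_total, with ranks as the dependent variable."""
    k = len(rank_groups)
    n = len(rank_groups[0])
    N = n * k
    all_ranks = [r for g in rank_groups for r in g]
    grand_mean = sum(all_ranks) / N
    rss_treatment = sum(n * (sum(g) / n - grand_mean) ** 2 for g in rank_groups)
    rss_total = sum((r - grand_mean) ** 2 for r in all_ranks)
    rms_total = rss_total / (N - 1)  # equals N(N + 1)/12 for untied ranks
    return rss_treatment / rms_total, rms_total


# Hypothetical ranks for three groups of three subjects.
rank_groups = [[1, 4, 5], [2, 6, 9], [3, 7, 8]]
h_formula = h_from_formula(rank_groups)
h_anova, rms_total = h_from_rank_anova(rank_groups)
print(h_formula, h_anova, rms_total)
```

The two values of H agree, and the total mean square equals N(N + 1)/12 = 7.5 for N = 9.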
2.2 Justification.
The derivation of the Kruskal-Wallis H-test (Kruskal and Wallis [1952]) is based on the
assumption that the rank sums, $R_j$, are the sums of $n$ independent random variables and
therefore, for large $n$, the Central Limit Theorem may be applied. That is, the quantity
\[ \frac{R_j - E(R_j)}{\sqrt{\mathrm{Var}(R_j)}} \]
is approximately distributed as a unit normal deviate, and its square,
\[ \frac{[R_j - E(R_j)]^2}{\mathrm{Var}(R_j)}, \]
is approximately distributed as a chi-square variable with 1 d.f. Because the $R_j$ are not
independent, however, the sum of the $k$
individual chi-square variables is not distributed as a chi-square variable with $k$ d.f. Rather,
Kruskal [1952] has shown that if each term in the sum is multiplied by the correction factor
$(N - n)/N$, then the sum of the $k$ non-independent chi-square variables is a random variable,
$H$, which is distributed asymptotically as a chi-square variable with $k - 1$ d.f.
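The effect of the correction factor can be verified directly. The sketch below assumes the standard null moments of a rank total for untied ranks, E(R_j) = n(N + 1)/2 and Var(R_j) = n(N - n)(N + 1)/12. These moments are not stated in the surviving text. The rank totals are hypothetical.

```python
def corrected_chi_square_sum(rank_totals, n):
    """Sum of the k squared standardized rank totals, each times (N - n)/N."""
    k = len(rank_totals)
    N = n * k
    expected = n * (N + 1) / 2            # assumed null mean of a rank total
    variance = n * (N - n) * (N + 1) / 12  # assumed null variance of a rank total
    return sum((N - n) / N * (R - expected) ** 2 / variance for R in rank_totals)


def h_statistic(rank_totals, n):
    """Kruskal-Wallis H from rank totals, equal group sizes assumed."""
    N = n * len(rank_totals)
    return 12.0 / (N * (N + 1)) * sum(R * R / n for R in rank_totals) - 3 * (N + 1)


# Hypothetical rank totals for three groups of three (ranks 1 to 9).
rank_totals = [10, 17, 18]
corrected = corrected_chi_square_sum(rank_totals, n=3)
h = h_statistic(rank_totals, n=3)
print(corrected, h)
```

Under these assumptions the corrected sum reproduces H exactly, which is the algebraic content of the result attributed to Kruskal [1952].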
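Consider now the linear contrast $T_m = \sum_j c_{mj} R_j$. It follows that $T_m$ is a linear
combination of the rank totals
which, taken over all possible assignments of scores to columns, is approximately distributed
as a normal deviate for large $n$. Furthermore it is clear that $E(\sum_j c_{mj} R_j) = 0$ if the null
hypothesis, $H_0$: $\mu_1 = \cdots = \mu_j = \cdots = \mu_k$, is true, since the $R_j$ then have equal
expectations and the coefficients of a contrast sum to zero.
If the $R_j$ were independent,
\[ \mathrm{Var}\Big(\sum_j c_{mj} R_j\Big) = \sum_j c_{mj}^2 \, \mathrm{Var}(R_j), \]
and therefore
\[ \frac{\big(\sum_j c_{mj} R_j\big)^2}{\dfrac{n(N - n)(N + 1)}{12} \sum_j c_{mj}^2} \]
would be approximately distributed as a chi-square variable with 1 d.f. Multiplying by the
correction factor $(N - n)/N$ gives
\[ H_m = \frac{\big(\sum_j c_{mj} R_j\big)^2}{n \sum_j c_{mj}^2} \cdot \frac{1}{N(N + 1)/12}. \]
Examination of this result shows that the denominator of the second factor of the expression
is exactly $RMS_{\text{total}}$ while the first factor is a linear contrast applied to rank totals ($RSS_m$).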
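A contrast on rank totals can be computed as in the following sketch. The rank totals and the two sets of coefficients (a linear and a quadratic trend over three groups) are hypothetical choices for illustration. The function name is likewise not from the paper.

```python
def contrast_h(rank_totals, coefficients, n):
    """H_m = RSS_m / RMS_total for one contrast on rank totals (equal n)."""
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    N = n * len(rank_totals)
    t_m = sum(c * R for c, R in zip(coefficients, rank_totals))
    rss_m = t_m ** 2 / (n * sum(c * c for c in coefficients))
    rms_total = N * (N + 1) / 12
    return rss_m / rms_total


# Hypothetical rank totals for three groups of three (ranks 1 to 9).
rank_totals = [10, 17, 18]
h_linear = contrast_h(rank_totals, [-1, 0, 1], n=3)
h_quadratic = contrast_h(rank_totals, [1, -2, 1], n=3)

# Overall H from the same rank totals, for comparison.
N = 9
h_overall = 12.0 / (N * (N + 1)) * sum(R * R / 3 for R in rank_totals) - 3 * (N + 1)
print(h_linear, h_quadratic, h_overall)
```

Because the two contrasts are orthogonal and exhaust the k - 1 = 2 d.f., their H values add up to the overall H in this example.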
3. Further Considerations
The statistic $H_m$ is a chi-square variable with a single d.f. In practice, however, effects
with more than a single d.f. may be of interest. For example in a 2 × 3 factorial design, one
may be interested only in the main effects and the interaction effects.
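For a 2 × 3 design of the kind just mentioned, the main effects and the interaction can be obtained from the cell rank totals as in the sketch below. The cell totals are hypothetical (two observations per cell, ranks 1 to 12), and the function is an illustrative helper. The interaction sum of squares is taken as the treatment sum of squares minus the two main-effect sums of squares, and every sum of squares is divided by the total mean square N(N + 1)/12.

```python
def factorial_h(cell_rank_totals, n):
    """H for rows, columns, interaction and treatment in an a x b design.

    cell_rank_totals[i][j] is the rank total of cell (i, j); n is the
    common number of observations per cell (equal n assumed).
    """
    a = len(cell_rank_totals)
    b = len(cell_rank_totals[0])
    N = n * a * b
    correction = (N * (N + 1) / 2) ** 2 / N  # (sum of all ranks)^2 / N
    rms_total = N * (N + 1) / 12
    row_totals = [sum(row) for row in cell_rank_totals]
    col_totals = [sum(row[j] for row in cell_rank_totals) for j in range(b)]
    rss_treat = sum(R * R for row in cell_rank_totals for R in row) / n - correction
    rss_rows = sum(R * R for R in row_totals) / (n * b) - correction
    rss_cols = sum(R * R for R in col_totals) / (n * a) - correction
    rss_inter = rss_treat - rss_rows - rss_cols
    return {
        "rows": rss_rows / rms_total,          # a - 1 d.f.
        "columns": rss_cols / rms_total,       # b - 1 d.f.
        "interaction": rss_inter / rms_total,  # (a - 1)(b - 1) d.f.
        "treatment": rss_treat / rms_total,    # ab - 1 d.f.
    }


# Hypothetical cell rank totals: 2 rows x 3 columns, n = 2 per cell.
cells = [[3, 7, 11], [23, 19, 15]]
h = factorial_h(cells, n=2)
print(h)
```

In this made-up layout the column totals are all equal, so the column effect is zero and the treatment H of 140/13 splits into 108/13 for rows and 32/13 for the interaction.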
Suppose B, an effect with $r$ d.f. ($1 < r < k - 1$), is of interest. $RSS_B$, calculated on the
rank totals, can always be partitioned into $r$ contrast sums of squares. Since each of these
Table 1
PROPORTION OF CASES WITH $H$ OR $H_m > \chi^2$ FOR $\alpha$ AS INDICATED
                                             Mean Proportion
Four Groups of Size Four (Overall)        53.43  32.02  20.41   8.78  3.34   .56  .07
Four Groups of Size Four (Comparison 1)   49.85  33.40  20.13   9.39  4.67  1.39  .63
Four Groups of Size Four (Comparison 2)   50.51  32.98  20.00  10.70  5.12  1.25  .66
Two Groups of Size Eight                  50.44  33.06  20.35  10.94  5.13  1.31  .67
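A Monte Carlo check of the kind summarized in Table 1 can be sketched as follows. This is not the authors' program. It assumes Python's built-in pseudo-random generator, 5000 random assignments of ranks to four groups of four, an arbitrary seed, and the tabled upper 5 per cent point of chi-square with 3 d.f. (7.815).

```python
import random


def h_statistic(rank_totals, n):
    """Kruskal-Wallis H from rank totals, equal group sizes assumed."""
    N = n * len(rank_totals)
    return 12.0 / (N * (N + 1)) * sum(R * R / n for R in rank_totals) - 3 * (N + 1)


def simulate_exceedance(k, n, critical_value, replications, seed):
    """Proportion of random rank assignments with H >= critical_value."""
    rng = random.Random(seed)
    N = k * n
    ranks = list(range(1, N + 1))
    exceed = 0
    for _ in range(replications):
        rng.shuffle(ranks)  # one random assignment of ranks to groups
        totals = [sum(ranks[j * n:(j + 1) * n]) for j in range(k)]
        if h_statistic(totals, n) >= critical_value:
            exceed += 1
    return exceed / replications


# 7.815 is the upper 5% point of chi-square with 3 d.f.
proportion = simulate_exceedance(k=4, n=4, critical_value=7.815,
                                 replications=5000, seed=1976)
print(proportion)
```

With groups this small the chi-square approximation for the overall test would be expected to be somewhat conservative, so the simulated proportion should fall below the nominal .05.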
Table 2
LATENCIES AND ASSOCIATED RANKS AND RANK TOTALS FOR DATA DISCUSSED IN TEXT
BW:   Lo           Med          Hi
      4.36 ( 4)    2.61 (10)    2.95 ( 7)
\[ RSS_{BW_L} = \frac{(368.0 - 196.5)^2}{14 \cdot 2} = 1050.44 \]
\[ RSS_{BW \times ENV} = RSS_{TREAT} - RSS_{ENV} - RSS_{BW} = 997.58 \]
Source                   d.f.      H
BW                         2      7.98
  BW_L                     1      6.98*
  BW_residual              1      1.00
ENV                        1      0.43
BW × ENV                   2      6.63
  BW_L × ENV               1      6.19*
  BW_residual × ENV        1      0.43
Treatment                  5     15.04
*p < .05
4. Example
4.1 An Example.
The following example is from a paper by Hahn, Haber and Fuller [1973]. Forty-two
pairs of mice, selectively bred for brain weight (small, medium and large) were raised under
one of two environmental conditions. After maturation, pairs of subjects from the same
brain weight and environmental condition were paired and their fighting behavior noted.
The 3 X 2 table consists of scores of "seconds of tail rattling per second of fighting,"
a measure of aggressive behavior in mice. Table 2 presents the data and the results of the
analysis performed on these data. It might be noted that the use of ranks allows for
simplified formulae for the calculation of C' and $RMS_{\text{total}}$.
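The arithmetic behind two of the entries in Table 2 can be retraced in a short sketch. It assumes N = 42 scores (one per pair of mice), 14 scores at each brain-weight level, and linear coefficients of -1 and +1 applied to the rank totals 368.0 and 196.5 shown in the table. These assumptions are inferred from the surviving figures rather than stated explicitly in the text.

```python
# Assumed from the example: one score per pair of mice.
N = 42

# Simplified formula for the total mean square on untied ranks.
rms_total = N * (N + 1) / 12  # 42 * 43 / 12 = 150.5

# Linear contrast on brain weight: (368.0 - 196.5)^2 / (n * sum of c^2),
# with n = 14 scores per level and coefficients -1, 0, +1 (assumed).
rss_bw_linear = (368.0 - 196.5) ** 2 / (14 * 2)
h_bw_linear = rss_bw_linear / rms_total

# Interaction sum of squares as reported in Table 2.
rss_bw_by_env = 997.58
h_bw_by_env = rss_bw_by_env / rms_total

print(rms_total, round(rss_bw_linear, 2))
print(round(h_bw_linear, 2), round(h_bw_by_env, 2))
```

Under these assumptions the ratios round to 6.98 and 6.63, in agreement with the H values tabled for BW_L and BW × ENV.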
Acknowledgments
The authors wish to thank Professors Richard Burright and Jane Connor for many
helpful suggestions during the formulation of this paper.
Résumé
This article presents a method of rank analysis for data arising from completely randomized
factorial designs. The procedure, which is an extension of the Kruskal-Wallis rank test,
allows the calculation of interaction effects and linear contrasts. A Monte Carlo study of the
convergence of the test and a worked example are presented.
References
Bradley, J. V. [1968]. Distribution-free Statistical Tests. Prentice Hall, Englewood Cliffs, New Jersey.
Coveyou, R. R. and Macpherson, R. D. [1967]. Fourier analysis of uniform random number generators.
Assoc. Comp. Mach. J. 14, 100-19.
Dunn, O. J. [1964]. Multiple comparisons using rank sums. Technometrics 6, 241-52.
Gore, A. P. [1971]. A nonparametric test based on the U-statistic for interaction in a two-way layout.
(Preliminary Rep.) Ann. Math. Statist. 42, 1486.
Hahn, M. E., Haber, S. B. and Fuller, J. L. [1973]. Differential agonistic behavior in mice selected
for brain weight. Phys. and Behavior 10, 759-62.
Kruskal, W. H. [1952]. A nonparametric test for the several sample problem. Ann. Math. Statist.
23, 525-40.
Kruskal, W. H. and Wallis, W. A. [1952]. Use of ranks in one-criterion variance analysis. J. Amer.
Statist. Assoc. 47, 583-621.
McDonald, B. J. and Thompson, W. A. [1967]. Rank sum multiple comparisons in one- and two-way
classifications. Biometrika 54, 487-98.
Mehra, K. L. and Sen, P. K. [1969]. Conditionally distribution free tests for interactions. Ann. Math.
Statist. 40, 658-64.
Mehra, K. L. and Smith, G. E. J. [1970]. On nonparametric estimation and testing for interactions
in factorial experiments. J. Amer. Statist. Assoc. 65, 1283-96.
Patel, K. M. and Hoel, D. G. [1973]. A nonparametric test for interaction in factorial experiments.
J. Amer. Statist. Assoc. 68, 615-20.
Sherman, E. [1965]. A note on multiple comparisons using rank sums. Technometrics 7, 255-6.
Sills, D. L. (ed). [1959]. International Encyclopedia of the Social Sciences, Vol. 9, Macmillan Co.,
New York.
Steel, R. G. D. [1959]. A multiple comparison rank sum test: Treatments versus control. Biometrics
15, 560-72.
Steel, R. G. D. [1960]. A rank sum test for comparing all pairs of treatments. Technometrics 2, 197-207.