
Yule’s Characteristic K

Abstract

Yule’s characteristic K (1944) is used as a stylistic parameter. It is stated that Yule’s K is independent of sample size. This research paper is an attempt to examine that statement.

Introduction

A word distribution is a type of distribution whose form changes with the sample size.

Word distributions belong to the class of distributions of "multiple happenings or repeated events." Such a distribution is a frequency distribution of frequencies: the variable x, defined as the number of times a word occurs, is itself a frequency.

Hence comparing the works of two authors, or even different works of the same author, becomes very difficult. Something that characterises the word distribution and yet is independent of sample size is therefore necessary.

A characteristic of this type was provided by Yule (1944). It is termed Yule's Characteristic K.

According to Yule, the Characteristic characterises the word distribution and is independent of sample size.

He also says that the conclusion about the independence of K and sample size is purely theoretical, and that the practical student will not be thoroughly convinced unless the Characteristic stands the test of actual trial.

Methodology

For different sample sizes well spread over the text, we computed values of the Characteristic proposed by Yule (1944).

For this study we selected the novel "Tess of the d'Urbervilles" by the famous English author Thomas Hardy.

The novel under study is divided into twenty parts on the basis of number of words. It was decided to take samples of sizes 500, 700, 900, ..., 2900 words. There are thirteen different sample sizes.

From each of the twenty parts, one sample of each size was taken by the following sampling procedure.

i) From the start (first word) of each of the twenty parts, a sample of 500 words is taken. Thus twenty samples of size 500 are obtained.
ii) The next sample of 700 words is obtained starting from the 351st word of each part; that is, the first 350 words are not included in the second sample. Thus twenty samples of size 700 are obtained.
iii) The same procedure is continued until each part is exhausted, that is, until fewer than 100 words remain in each part. This procedure gave rise to thirteen samples of different sizes from each of the twenty parts.

Thus 13 × 20 = 260 samples of different sizes are collected.

Statistical Analysis

A variable X is defined as
X = x: number of times a word occurs.
$f_x$ = number of words occurring x times.
$\sum x f_x$ = total number of words used in the sample.

If
$S_1 = \sum x f_x$
$S_2 = \sum x^2 f_x$
then Yule's Characteristic is defined as
$K = 10^4 \left( \frac{S_2}{S_1^2} - \frac{1}{S_1} \right)$

The data is analysed by using Karl Pearson’s coefficient of correlation and graphs.

Application
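As an illustration of the two computations named under Statistical Analysis, Yule's K and Karl Pearson's coefficient of correlation, a minimal Python sketch follows. The function names `yule_k` and `pearson_r` are hypothetical, and the sketch assumes the standard form of K, 10^4 (S2/S1^2 - 1/S1), applied to a list of already tokenised words. The paper does not say how words were tokenised or which software was used, so this is not the procedure of the study itself.

```python
from collections import Counter
from math import sqrt


def yule_k(tokens):
    """Yule's K for a list of word tokens (assumed form: 1e4 * (S2/S1^2 - 1/S1))."""
    if not tokens:
        raise ValueError("sample must contain at least one word")
    word_counts = Counter(tokens)          # word -> x, number of times it occurs
    f = Counter(word_counts.values())      # x -> f_x, number of words occurring x times
    s1 = sum(x * fx for x, fx in f.items())      # S1 = sum of x * f_x = sample size
    s2 = sum(x * x * fx for x, fx in f.items())  # S2 = sum of x^2 * f_x
    return 1e4 * (s2 / s1 ** 2 - 1 / s1)


def pearson_r(xs, ys):
    """Karl Pearson's product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy sample only; lowercasing and whitespace splitting are assumptions.
    sample = "the cat sat on the mat and the dog sat too".lower().split()
    print(yule_k(sample))
```

In the setting of this paper, `pearson_r` would be called with the 260 sample sizes as `xs` and the corresponding K values as `ys`.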
A bivariate coefficient of correlation was obtained between sample size and Yule’s Characteristic K. The variables are defined as
X = x: sample size, where x = 500, 700, 900, ..., 2900.
Y = y: Yule's Characteristic K.

The following result is obtained:

$r_{xy} = -0.238206$

which shows that the two variables are certainly not independent of each other.

This result puts a question mark on Yule's result that the Characteristic K is independent of sample size. Further, the result shows that the two variables are negatively correlated; that is, as the sample size increases, the value of the Characteristic K decreases.

Graphs G-1 and G-2 reveal a very peculiar characteristic.

G-1 is a graph of Yule’s K plotted against the twenty samples of one size. Each line represents one sample size. For sample sizes 500 to 1900 there are six different lines, easily distinguishable and showing considerable dispersion among them. That is, for sample sizes ranging from 500 to 1900, Yule’s K is not constant and shows quite dispersed values.

Ellegard (1962) stated that a sample containing at least 2000 words is necessary for an accurate calculation of Yule's K. Hence a further attempt was made to find the coefficient of correlation between Yule's K and sample sizes above 2000, which gave the following result:

$r_{xy} = 0$

which shows that Yule's K is not linearly dependent on sample size above 2000 words.

G-2 is a graph of the same type for sample sizes ranging from 2100 to 2900; it shows very little dispersion in K. The five lines are not easily distinguishable. Also, the values of K have decreased with increased sample size, which conforms with the negative correlation found for the full range of sample sizes.

We may say that for increased sample sizes Yule’s K is constant. Rather, if the sample size is greater than 2000, then Yule’s K may be independent of sample size.

Conclusion

i) Yule's K may not be independent of sample size.
ii) Yule's K may remain constant for samples above 2000 words.

Bibliography:
1) Dolezel Lubomir and Bailey Richard, Statistics and Style (1969), American Elsevier Publishing Company Inc., New York.

2) Gustav Herdan, Quantitative Linguistics (1964), Butterworth and Co. (Publishers).

3) Thomas Hardy, Tess of the d’Urbervilles, Library Edition (1952), Macmillan and Co. Ltd., New York.

4) Yule G. Udny, The Statistical Study of Literary Vocabulary (1944), Cambridge University Press.
