Introduction
Word distribution is a typical type of distribution in which the form of the distribution changes with the sample size.

Word distributions belong to the class of distributions of "multiple happenings or repeated events." Such a distribution is a frequency distribution of frequencies: the variable x, defined as the number of times a word occurs, is itself a frequency.

Hence comparing the works of two authors, or even different works of the same author, becomes very difficult. A quantity that characterises the word distribution and yet is independent of the sample size is therefore necessary.

A characteristic of this type was provided by Yule (1944). It is termed Yule's Characteristic K.

According to Yule, the Characteristic characterises the word distribution and is independent of the sample size.

He also says that the conclusion about the independence of K and sample size is purely theoretical, and that the practical student will not be thoroughly convinced unless the Characteristic stands the test of actual trial.

Methodology
We computed values of the Characteristic proposed by Yule (1944) for different sample sizes well spread over the text.

For this study we selected the novel "Tess of the d'Urbervilles" by the famous English author Thomas Hardy.

From each of twenty parts, one sample of each size was taken by the following sampling procedure.
i) From the start (first word) of each of the twenty parts, a sample of 500 words is taken. Thus twenty samples of size 500 are obtained.
ii) The next sample, of 700 words, is taken starting from the 351st word of each part; that is, the first 350 words are not included in the second sample. Thus twenty samples of size 700 are obtained.
iii) The same procedure is continued until each part is exhausted, that is, until fewer than 100 words remain in each part. This procedure gave thirteen samples of different sizes from each of the twenty parts.
Thus 13 * 20 = 260 samples of different sizes are collected.

Statistical Analysis
A variable X is defined as
X = x: number of times a word occurs.
f_x = number of words occurring x times.
∑ x f_x = total number of words used in the sample.
If
S_1 = ∑ x f_x
S_2 = ∑ x^2 f_x
then Yule's Characteristic is defined as
K = 10^4 (S_2 / S_1^2 - 1 / S_1)
The data are analysed using Karl Pearson's coefficient of correlation and graphs.

Application
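As a minimal sketch of the definitions given under Statistical Analysis, the following Python function computes K from a list of word tokens. It assumes the standard form of Yule's Characteristic, K = 10^4 (S_2 - S_1) / S_1^2. The function name `yules_k`, the whitespace tokenisation, and the toy sentence are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def yules_k(words):
    """Return Yule's Characteristic K for a list of word tokens.

    Assumes the standard form K = 10^4 * (S2 - S1) / S1^2.
    """
    if not words:
        raise ValueError("sample must contain at least one word")
    # x: number of times each distinct word occurs in the sample.
    occurrences = Counter(words)
    # f_x: number of distinct words that occur exactly x times.
    freq_of_freq = Counter(occurrences.values())
    # S1 equals the total number of words in the sample.
    s1 = sum(x * fx for x, fx in freq_of_freq.items())
    s2 = sum(x * x * fx for x, fx in freq_of_freq.items())
    return 1e4 * (s2 / s1**2 - 1 / s1)


# Hypothetical toy sample (not from the novel), split on whitespace.
toy_sample = "the cat sat on the mat and the dog sat too".split()
k_toy = yules_k(toy_sample)
```

For the toy sample, S_1 = 11 and S_2 = 19, so K = 10^4 * 8 / 121. In a study of this kind the function would be applied to each of the 260 samples in turn.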
A bivariate coefficient of correlation was obtained between sample size and Yule's Characteristic K. The variables are defined as
X = x: sample size, with x = 500, 700, 900, . . . , 2900.
Y = y: Yule's Characteristic K.
The following result is obtained:
r_xy = −0.238206
which shows that the two variables are certainly not independent of each other.

This result puts a question mark on the result obtained by Yule, that the Characteristic K is independent of the sample size.

Further, the result shows that the two variables are negatively correlated; that is, as the sample size increases, the value of the Characteristic K decreases.

[...] is necessary for an accurate calculation of Yule's K. Hence a further attempt was made to find the coefficient of correlation between Yule's K and sample sizes above 2000, which gave the following result:
r_xy = 0
which shows that Yule's K is not linearly dependent on sample size for samples above 2000 words.

G-2, the graph of the same type for sample sizes ranging from 2100 to 2900, shows very little dispersion in K. The five lines are not easily distinguishable. Also, the values of K decrease with increasing sample size, which conforms with the result of negative correlation.
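The correlation step can be sketched in Python as follows. This is only an illustration of Karl Pearson's coefficient applied to (sample size, K) pairs: the function name `pearson_r` is an assumption, and the K values below are made-up placeholders lying on a straight line, not the values measured in the study.

```python
import math


def pearson_r(xs, ys):
    """Return Karl Pearson's coefficient of correlation between xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# The thirteen sample sizes used in the text: 500, 700, ..., 2900.
sizes = list(range(500, 3000, 200))

# Placeholder K values (hypothetical, perfectly linear in the sample size).
placeholder_k = [100 - 0.01 * s for s in sizes]

r_all = pearson_r(sizes, placeholder_k)

# Restricting to sample sizes above 2000 words, as in the second analysis.
pairs_above_2000 = [(s, k) for s, k in zip(sizes, placeholder_k) if s > 2000]
r_above_2000 = pearson_r(*zip(*pairs_above_2000))
```

With real data, `placeholder_k` would be replaced by the K values computed for each sample, and the same function would serve both the full range and the range above 2000 words.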