
Correlation 29 Unit – 4

CORRELATION

Definition:
According to Croxton and Cowden, “When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship, and expressing it in a brief formula is known as correlation.”

Uses of correlation
Correlation: It refers to the analysis that deals with the association between two or more variables. Thus correlation is a statistical device which helps us in analyzing the co-variation of two or more variables.

Merits of correlation:
• It helps to study the type of association between two variables.
• It helps to know whether the correlation is significant or not.
• If two variables are closely related, we can estimate the value of one variable with the help of the other variable.

1. Positive correlation: The items in the two series move in the same direction, i.e., as one variable increases (or decreases), the other also increases (or decreases).
2. Negative correlation: The two variables change in opposite directions, i.e., as one variable increases, the other decreases, or vice versa.
3. Linear correlation: The amount of change in one variable is at a constant ratio to the change in the other variable.
4. Non-linear correlation: The amount of change in one variable is not at a constant ratio to the change in the other variable. For example, if we double the amount of rainfall, the production of rice or wheat would not necessarily be doubled.
5. Simple correlation: Only two variables are involved in the correlation calculation, like supply and price.
6. Multiple correlation: If we establish a relationship between more than two variables, it is called multiple correlation. When we study the relationship between the yield of wheat per acre and both the amount of rainfall and the amount of fertilizers used, it is an example of multiple correlation.
7. No correlation: When the plotted figures do not show any specific trend, we cannot estimate any type of relation between them, and we conclude that there is no correlation between the two variables.
8. Perfect positive correlation: If the coefficient of correlation is +1, the relationship is said to be perfect positive correlation.
9. Perfect negative correlation: If the coefficient of correlation is −1, the relationship is said to be perfect negative correlation.
10. Probable error: It is a measure used to judge whether the value of “r” is significant or not. P.E. = 0.6745 (1 − r²) / √N
11. Scatter diagram: It is a graphical method of measuring correlation by plotting the values of items on graph paper. It is one of the simplest procedures to judge correlation between two variables. One variable is taken along the X-axis and the other along the Y-axis. The graph is known as a scatter diagram.
12. Merits of scatter diagram:
• It is easy to plot and identify the correlation.
• We can find the value of some independent variables if they are unknown.
• Abnormal values in a sample can be easily detected.

Exercise A

1. Calculate the coefficient of correlation between X and Y from the following data:
X: 1 2 3 4 5 6 7
Y: 2 4 5 3 8 6 7
[Ans: 0.78]

2. Calculate Karl Pearson’s coefficient of correlation from the data given below:
X: 2 4 6 8 10
Y: 12 14 16 18 20
[Ans: +1]

Calculate Karl Pearson’s coefficient of correlation from the data given below:
X: 2 4 6 8 10
Y: 20 18 16 14 12
[Ans: −1]

3. Calculate Karl Pearson’s coefficient of correlation from the data given below:
X: 12 9 8 10 11 13 7
Y: 14 8 6 9 11 12 3
[Ans: 0.95]

4. Calculate the coefficient of correlation between X and Y for the values given below:
X: 2 5 7 9 19 16
Y: 25 27 26 29 34 39
[Ans: 0.89]

Calculate the coefficient of correlation for the following:
X: 128 140 125 121 122 146
Y: 87 96 101 93 99 140
[Ans: 0.70]

5. Calculate Pearson’s coefficient of correlation from the following data, taking 69 and 112 as the assumed averages of X and Y respectively. Also, find the probable error.
X: 78 89 96 69 59 79 68 61
Y: 125 137 156 112 107 136 123 108
[Ans: +0.95; 0.02]

5.B. Find the coefficient of correlation in the following case, taking 67 and 68 as the assumed averages of X and Y respectively:
Height of fathers (X): 65 66 67 67 68 69 71 73
Height of sons (Y): 67 68 64 68 72 70 69 70
[Ans: 0.47]

6. Compute Karl Pearson’s coefficient of correlation in the following series relating to cost of living and wages:
Wages (₹): 100 101 103 102 100 99 97 98 96 95
Cost of living (₹): 98 99 99 97 95 92 95 94 90 91
[Ans: 0.85]

7. Calculate the value of the coefficient of correlation between the price and supply. What is the probable error?
Price: 8 10 15 17 20 22 24 25
Supply: 25 30 32 35 37 40 42 45
[Ans: 0.98; 0.01]

8. Calculate Pearson’s coefficient of correlation between advertisement cost and sales:
Advertisement (₹): 39 65 62 90 82 75 25 98 36 78
Sales (₹): 47 53 58 86 62 68 60 91 51 84
[Ans: 0.78]

9. Find the correlation coefficient between the income and expenditure of a wage earner and comment thereon:
Income (₹): 46 54 56 56 58 60 62
Expenditure (₹): 36 40 44 54 42 58 54
[Ans: 0.77]

Formulae of Correlation

Direct Method (based on deviation from Mean):
    r = ∑xy / √(∑x² × ∑y²)

Direct Method (on values):
    r = (N∑XY − ∑X × ∑Y) / [√(N∑X² − (∑X)²) × √(N∑Y² − (∑Y)²)]

Short cut method (based on deviation from Assumed Mean):
    r = (N∑dx dy − ∑dx × ∑dy) / [√(N∑dx² − (∑dx)²) × √(N∑dy² − (∑dy)²)]

Probable error of measurement:
    PEr = 0.6745 × (1 − r²) / √N

Exercise B

1. Determine Karl Pearson’s coefficient of correlation, when ∑XY = 130, ∑X = 15, ∑Y = 40, ∑X² = 55, ∑Y² = 330 and N = 5 [Ans: 1]

2. From the following data, find the coefficient of correlation as given by Karl Pearson:
∑dx = −5, ∑dy = −10, ∑dx² = 109, ∑dy² = 62, ∑dx dy = 43 and N = 10 [Ans: 0.51]

3. Find Karl Pearson’s coefficient of correlation between the two variables X and Y, when X̅ = 74.5, Y̅ = 125.5, Ax = 69, Ay = 112, σx = 112, σy = 13.07, ∑dx dy = 2176 and N = 8 [Ans: 0.961]

4. From the following data find Pearson’s coefficient of correlation:
∑X = 392, ∑Y = 328, ∑(X − 56)² = 160, ∑(Y − 45)² = 447, ∑(X − 56)(Y − 45) = 200, and N = 7 [Ans: 0.77]

5. From the following data, determine the value of ‘r’: X̅ = 6, Y̅ = 8, ∑X² = 220, ∑Y² = 340, ∑(X + Y)² = 1008 and N = 5 [Ans: −0.57]

6. Find the value of N from the following data:
r = 0.25, ∑xy = 120, σx = 8 and ∑y² = 360 [Ans: 10]

7. Determine the value of r, where the value of P.E. was calculated to be 0.05 based on 16 observations. (Assume 0.6745 = 2/3) [Ans: 0.84]

8. Determine the value of N, where r is 0.70 and its probable error is 0.034 [Ans: 100]

9. The coefficient of correlation between two variates X and Y is 0.48. Their covariance is 36. The variance of X is 16. Find the standard deviation of the Y series. [Ans: 18.75]

10. From the following data, determine Karl Pearson’s coefficient of correlation:
                      X – Variable   Y – Variable
Number of items       10             10
Arithmetic mean       65             66
Standard deviation    23.33          14.91
The sum of the products of the deviations from the mean of both variables is 2704.
[Ans: 0.78]

11. The coefficient of correlation between two variates X and Y is 0.8. Their covariance is 20. The variance of X is 16. Find the standard deviation of the Y series. [Ans: σy = 6.25]

12. From the following data, determine the value of ‘r’: X̅ = 5, Y̅ = 4, ∑X² = 300, ∑Y² = 200, ∑(X + Y)² = 900 and N = 8 [Ans: 0.47]

13. From the following data find Pearson’s coefficient of correlation:
∑X = 140, ∑Y = 150, ∑(X − 10)² = 180, ∑(Y − 15)² = 215, ∑(X − 10)(Y − 15) = 50, and N = 10 [Ans: 0.76]

14. Determine the value of r, where the value of P.E. was calculated to be 0.045 based on 25 observations. (Assume 0.6745 = 2/3) [Ans: 0.8139]

15. Determine the value of N, where r is 0.80 and its probable error is 0.025 [Ans: 94]

16. A computer, while calculating the correlation coefficient between two variables X and Y, obtained the following constants:
∑XY = 508, ∑X = 125, ∑Y = 100, ∑X² = 650, ∑Y² = 460 and N = 25
It was discovered later that certain wrong pairs of data were taken in place of the correct pairs, as follows:
Wrong pairs    Correct pairs
X    Y         X    Y
6    14        8    12
8    6         6    8
Obtain the correct value of the correlation coefficient between X and Y. [Ans: 0.67]

Exercise – C
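As a numerical check on Exercise B, the direct-method (on values) formula and the probable-error formula from the Formulae of Correlation list can be sketched in Python. This is only an illustrative sketch: the function names `pearson_from_sums` and `probable_error` are hypothetical and do not come from the handout, and the inputs are the summary totals quoted in problems 1, 15 and 16.

```python
import math


def pearson_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Direct method (on values), computed from the summary totals.

    Hypothetical helper name, used for illustration only.
    """
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def probable_error(r, n):
    """P.E. = 0.6745 * (1 - r^2) / sqrt(N). Hypothetical helper name."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Exercise B, problem 1: totals as given in the problem.
r1 = pearson_from_sums(5, 15, 40, 55, 330, 130)
print(round(r1, 2))  # 1.0

# Exercise B, problem 16: remove the wrong pairs (6, 14), (8, 6) from the
# totals and add the correct pairs (8, 12), (6, 8).
sum_x = 125 - (6 + 8) + (8 + 6)
sum_y = 100 - (14 + 6) + (12 + 8)
sum_x2 = 650 - (6 ** 2 + 8 ** 2) + (8 ** 2 + 6 ** 2)
sum_y2 = 460 - (14 ** 2 + 6 ** 2) + (12 ** 2 + 8 ** 2)
sum_xy = 508 - (6 * 14 + 8 * 6) + (8 * 12 + 6 * 8)
r16 = pearson_from_sums(25, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(round(r16, 2))  # 0.67

# Exercise B, problem 15: r = 0.80 with N = 94 gives a P.E. close to 0.025.
print(round(probable_error(0.80, 94), 3))  # 0.025
```

Working from totals rather than raw observations mirrors how the Exercise B problems are stated, so the same function covers the corrected-totals case in problem 16 without any change.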
10. From the table given below calculate the coefficient of correlation between the ages of husbands and wives:
Age of Wives / Age of Husband: 20 − 30, 30 − 40, 40 − 50, 50 − 60, 60 − 70
15 − 25: 5 9 3
25 − 35: 10 25 2
35 − 45: 1 12 2
45 − 55: 4 16 5
55 − 65: 4 2
[Ans: r = 0.7962]

11. Find the coefficient of correlation between the age and the sum assured from the following table:
Age        10000   20000   30000   40000   50000
20 − 30    4       6       3       7       1
30 − 40    2       8       15      7       1
40 − 50    3       9       12      6       2
50 − 60    8       4       2       -       -
Total      17      27      32      20      4
[Ans: r = 0.26]

12. Find the coefficient of correlation between the marks obtained by sixty candidates at an examination in two subjects, Economics and Statistics, from the data given below:
Stats      5 − 15   15 − 25   25 − 35   35 − 45
0 − 10     1        1         -         -
10 − 20    3        6         5         1
20 − 30    1        8         9         2
30 − 40    -        3         9         3
40 − 50    -        -         4         4
Total      5        18        27        10
[Ans: r = 0.5329]

Spearman’s Rank Correlation
This method is a development over Karl Pearson’s method of correlation in that:
i. It does not need the quantitative expression of the data, and
ii. It does not assume that the population under study is normally distributed.
This method was introduced by the British psychologist Charles Edward Spearman in 1904. Under this method, correlation is measured on the basis of the ranks rather than the original values of the variables. For this, the values of the two variables are first converted into ranks in a particular order, i.e., the ranks may be assigned to the different values either in ascending or in descending order.

Special Features of Rank Correlation
i. The value of this coefficient of correlation lies between +1 and −1.
ii. The sum of the differences between the corresponding ranks is zero, i.e., ∑d = 0.
iii. It is independent of the nature of the distribution from which the sample data are collected for calculation of the coefficient.
iv. It is calculated on the basis of the ranks of the individual items rather than their actual values.
v. Its result equals the result of Karl Pearson’s coefficient of correlation unless there is repetition of any rank. This is because Spearman’s correlation is nothing more than Pearson’s coefficient of correlation between the ranks.

    R = 1 − 6∑D² / [N(N² − 1)]

In case of tied ranks:

    R = 1 − 6[∑D² + ∑(t³ − t)/12] / [N(N² − 1)]

Practical Problems:

1. 10 students were given tests in English and Mathematics. Their marks are given below:
No.   1 2 3 4 5 6 7 8 9 10
Eng.  78 40 50 55 52 49 60 54 59 58
Math  70 60 60 75 69 55 70 65 65 61

2. The coefficient of rank correlation between the debenture prices and share prices of a company was 0.8. If the sum of the squares of the differences in ranks was 33, find the value of n. [n = 10]

3. 10 students got the following percentage of marks in Mathematics and Statistics.
No.   1 2 3 4 5 6 7 8 9 10
Math  78 36 98 25 75 82 90 62 65 39
Stat  84 51 91 60 68 62 86 58 53 47
Calculate the correlation coefficient. [Ans: R = 0.82]

4. Find the rank correlation coefficient of the following data:
A  115 109 112 87 98 120 98 100 98 118
B  75 73 85 70 76 82 65 73 68 80
[Ans: R = 0.73]

5. The coefficient of rank correlation between the marks in Statistics and Mathematics obtained by a certain group of students is 2/3 and the sum of the squares of the differences in ranks is 55. Find the number of students in the group. [n = 10]
6. The coefficient of rank correlation of the marks obtained by 10 students in Mathematics and Statistics was found to be 0.5. It was then detected that the difference in ranks in the two subjects for one particular student was wrongly taken to be 3 in place of 7. What should be the correct rank correlation coefficient? [Ans: R = 0.2576]

7. The ranks of the same 15 students in two subjects A and B are given below, the two numbers within the brackets denoting the ranks of the same student in A and B respectively:
(1,10); (2,7); (3,2); (4,6); (5,4); (6,8); (7,3); (8,1); (9,11); (10,15); (11,9); (12,5); (13,14); (14,12); (15,13).
Use Spearman’s formula to find the rank correlation coefficient. [Ans: R = 0.51]

8. The coefficient of rank correlation of the marks obtained by 10 students in Statistics and Accountancy was found to be 0.2. It was later discovered that the difference in ranks in the two subjects obtained by one of the students was wrongly taken as 9 instead of 7. Find the correct value of the coefficient of rank correlation. [R = 0.394]

Multiple Choice Questions:
1. The relationship between the number of devotees in a temple and production of wheat in India is:
• Positive correlation
• Negative correlation
• Spurious correlation
• Perfect correlation
2. When an increase in one variable corresponds to an increase in the other variable, the correlation is said to be
• Positive correlation
• Negative correlation
• Linear correlation
• Useless correlation
3. In moderate degree correlation the value of r
• Lies between ±0.75 and ±1
• Lies between ±0.25 and ±0.75
• Is zero
• Is +1 or −1
4. The value of Karl Pearson’s coefficient of correlation lies between
• ±0.75 to ±1
• 0 and 1
• −1 and +1
• −1 and 0
5. The coefficient of correlation between X and Y is 0.48. If X increases, then Y
• Increases
• Decreases
• Zero
• None of these
6. If the coefficient of correlation is 0.35, then what will be the degree of correlation?
• Zero
• Moderate
• Positive
• Negative
7. If r > P.E., correlation is
• Certain
• Uncertain
• Zero
• None of these
8. The square of the correlation coefficient is called
• Coefficient of determination
• Coefficient of correlation
• Coefficient of variation
• None of these

Explain the following sentences in one word
1. A change in the value of one variable is accompanied by a change in the value of another variable.
2. The cause and effect relationship between two variables
3. The relationship between any two variables only is studied
4. The relationship between any two out of three or more variables is studied, ignoring the effect of the other related variables
5. The relationship between any three or more variables is studied at a time
6. With the increase in the value of one variable, the value of the other variable increases
7. With the increase in the value of one variable, the value of the other variable decreases
8. When the value of the correlation is either +1 or −1
9. When the data relating to correlation, plotted on graph paper, give rise to a straight line
10. The ratio of the covariance between the two variables to the product of their standard deviations
11. The square of Pearson’s coefficient of correlation
12. One minus the square of the coefficient of correlation
13. The value of the coefficient of correlation lies between ______
14. When r = 0, correlation is _______
15. The coefficient of correlation between x and y is −0.48. If x increases, how does y behave?
16. If the coefficient of correlation is 0.35, then what will be the degree of correlation?
17. If r > P.E., correlation is _______
18. What kind of measure of the relationship between two variables is Karl Pearson’s coefficient of correlation?
19. The square of the correlation coefficient is called the coefficient of _______
20. What are the limiting values of Karl Pearson’s coefficient of correlation?
[Ans: Correlation; Causation; Simple correlation; Partial correlation; Multiple correlation; Positive correlation; Negative correlation; Perfect correlation; Linear correlation; Coefficient of correlation; Coefficient of determination; Coefficient of non-determination; ±1; Absent; Decreases; Moderate; Certain; Quantitative; Determination; ±1]

Fill in the Blanks:
1. When r = +1, correlation is ______
2. When r = −1, correlation is ______
3. When r = 0, correlation is ______
4. The coefficient of correlation is significant when r is more than _____
5. The coefficient of correlation of the population lies between _____
6. Probable error is never _____
7. ∑xy / N represents _______
[Ans: Perfect positive; Perfect negative; Absent; 6 times of P.E.; r ± PE; Negative; Covariance]
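The rank correlation formula given under Spearman’s Rank Correlation, including the tie correction ∑(t³ − t)/12, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper names `average_ranks`, `tie_correction` and `spearman_r` are hypothetical, ranks are assigned in ascending order, and tied values receive the mean of the ranks they occupy. The data are those of Practical Problem 4.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the mean rank.

    Hypothetical helper name, used for illustration only.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def tie_correction(values):
    """Sum of (t^3 - t) / 12 over every group of t tied values."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum((t ** 3 - t) / 12 for t in counts.values() if t > 1)


def spearman_r(x, y):
    """R = 1 - 6 * (sum D^2 + tie corrections) / (N * (N^2 - 1))."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


# Practical Problem 4: 98 occurs three times in A and 73 twice in B.
a = [115, 109, 112, 87, 98, 120, 98, 100, 98, 118]
b = [75, 73, 85, 70, 76, 82, 65, 73, 68, 80]
print(round(spearman_r(a, b), 2))  # 0.73
```

When no value repeats, the correction term is zero and the function reduces to the untied formula, so the same call also handles data that are already ranks, such as the pairs in Practical Problem 7.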