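$$r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}$$

or

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$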
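The Pearson formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `pearson_r` is hypothetical, and the sample data are the absences and final grades from Example 3 later in these notes.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed from the raw sums (hypothetical helper name)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # The denominator is zero when either variable is constant; r is undefined then.
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Data from Example 3 in these notes: absences (x) and final grade (y).
absences = [8, 2, 5, 12, 15, 9, 6]
grades = [78, 92, 90, 58, 43, 74, 81]
print(round(pearson_r(absences, grades), 3))  # -0.975
```

The negative sign indicates that final grades tend to fall as absences rise.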
sanizah@tmsk.uitm.edu.my
QMT412 Pn. Sanizah's Notes 02/05/2013
Example 2
Refer to Example 1. Compute Pearson coefficient
of correlation and interpret the result.
$n$ = ________    $\sum x$ = ________    $\sum y$ = ________
$\sum xy$ = ________    $\sum x^2$ = ________    $\sum y^2$ = ________
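$$r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}$$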
The Spearman rank correlation coefficient
Spearman's rank correlation coefficient is a measure of association
between two variables that are measured on at least an ordinal scale
(so it is suitable for qualitative data).
It can also be applied to quantitative data, but the values must first
be ranked; the coefficient is then calculated from these ranks.

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where:
d = difference between the two ranks in each pair
n = number of pairs of observations
NOTE: Be careful with tied observations.
How to calculate Spearman's rank
correlation coefficient?
1. List each set of scores in a column.
2. Rank the two sets of scores.
3. Place the appropriate rank beside each score.
4. Head a column d and determine the difference in rank for
each pair of scores.
(Note: The sum of the d column should always be 0.)
5. Square each number in the d column and sum the
values ($\sum d^2$).
6. Use the formula to calculate the correlation coefficient.
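The steps above can be sketched in Python. This is an illustrative sketch only: the helper names `average_ranks` and `spearman_rs` are hypothetical, tied scores are given their average rank (a common convention, assumed here), and the $\sum d^2$ formula is only approximate when ties are present. The ranks used are those of the five students in Example 5 of these notes.

```python
def average_ranks(scores):
    """Steps 2-3: rank scores from 1 (smallest); tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(ranks_x, ranks_y):
    """Steps 4-6: differences d, sum of d squared, then the formula."""
    if len(ranks_x) != len(ranks_y) or len(ranks_x) < 2:
        raise ValueError("need two equal-length rank lists with at least 2 pairs")
    n = len(ranks_x)
    d = [rx - ry for rx, ry in zip(ranks_x, ranks_y)]
    sum_d2 = sum(v * v for v in d)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks of students A-E in statistics and computer programming (Example 5).
statistics = [1, 2, 3, 4, 5]
computer = [3, 1, 4, 2, 5]
print(spearman_rs(statistics, computer))  # 0.5
```

When the data are raw scores rather than ranks, `average_ranks` would be applied to each column first, using the same ranking direction for both.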
Refer to Example 5, pg. 140
             Subject
Student   Statistics   Computer   d   $d^2$
A         1            3
B         2            1
C         3            4
D         4            2
E         5            5
Five students A, B, C, D and E are ranked in two subjects, statistics and
computer programming, with the following results.
Calculate Spearman's rank correlation coefficient.
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
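A worked sketch for the five students, taking d as the statistics rank minus the computer programming rank (the direction is an assumption; it does not affect $d^2$):

$$d = -2,\ 1,\ -1,\ 2,\ 0 \qquad \sum d = 0 \qquad \sum d^2 = 4 + 1 + 1 + 4 + 0 = 10$$

$$r_s = 1 - \frac{6(10)}{5(5^2 - 1)} = 1 - \frac{60}{120} = 0.5$$

A value of 0.5 suggests a moderate positive association between the two sets of ranks.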
$$a = \bar{y} - b\bar{x} = \frac{\sum y}{n} - b\left(\frac{\sum x}{n}\right)$$
Example 3: Application
Absences (x)   Final Grade (y)
8              78
2              92
5              90
12             58
15             43
9              74
6              81
[Figure: scatter plot of Final Grade (y) against Absences (x)]
Calculate a and b.
Write the equation of the
line of regression with
x = number of absences
and y = final grade.
The line of regression is:
No.     x     y     xy     $x^2$   $y^2$
1       8     78    624    64      6084
2       2     92    184    4       8464
3       5     90    450    25      8100
4       12    58    696    144     3364
5       15    43    645    225     1849
6       9     74    666    81      5476
7       6     81    486    36      6561
Total   57    516   3751   579     39898
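As a worked check, the column totals ($n = 7$, $\sum x = 57$, $\sum y = 516$, $\sum xy = 3751$, $\sum x^2 = 579$) can be substituted into the usual least-squares slope formula. The slope formula itself does not survive in these notes, so its form here is an assumption based on the standard result; the intercept uses $a = \bar{y} - b\bar{x}$.

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{7(3751) - (57)(516)}{7(579) - (57)^2} = \frac{-3155}{804} \approx -3.924$$

$$a = \frac{\sum y}{n} - b\left(\frac{\sum x}{n}\right) = \frac{516}{7} - (-3.924)\left(\frac{57}{7}\right) \approx 73.714 + 31.953 = 105.667$$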
The Line of Regression
[Figure: Final Grade (y) against Absences (x), with the line of regression]
The line of regression is: y = -3.924x + 105.667
Note that the point $(\bar{x}, \bar{y})$ = (8.143, 73.714) is on the line.
Predicting y Values
The regression line can be used to predict values of y
for values of x falling within the range of the data.
The regression equation for number of times absent and final
grade is:
y = -3.924x + 105.667
Use this equation to predict the expected grade for a student with
(a) 3 absences (b) 12 absences
(a) y = -3.924(3) + 105.667 = 93.895
(b) y = -3.924(12) + 105.667 = 58.579
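The restriction to x values within the range of the data can be made explicit in code. The sketch below is illustrative only: `predict_grade` is a hypothetical name, the coefficients are the rounded values from the notes, and the range limits 2 and 15 are the smallest and largest absences in the Example 3 data.

```python
A = 105.667            # intercept a, as rounded in the notes
B = -3.924             # slope b, as rounded in the notes
X_MIN, X_MAX = 2, 15   # observed range of absences in Example 3


def predict_grade(absences):
    """Predict the final grade, refusing to extrapolate outside the data range."""
    if not X_MIN <= absences <= X_MAX:
        raise ValueError(
            f"{absences} absences is outside the observed range {X_MIN}-{X_MAX}"
        )
    return A + B * absences


print(round(predict_grade(3), 3))   # 93.895
print(round(predict_grade(12), 3))  # 58.579
```

Raising an error for out-of-range inputs is one possible design; a warning would also be reasonable, since the fitted line says nothing reliable about, say, 30 absences.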
Coefficient of Determination
The coefficient of determination, $r^2$, measures the
strength of the association and is the ratio of explained
variation in y to the total variation in y.

$$r^2 = \frac{\text{explained variation}}{\text{total variation}} = (\text{correlation coefficient})^2$$

Interpretation: the proportion of the variation in
y that is explained by the variation in x.
Recall Example 3
The correlation coefficient of number of times absent and final
grade is r = -0.975. The coefficient of determination is
$r^2 = (-0.975)^2 = 0.9506$.
Interpretation: About 95.06% of the variation in final grades can be
explained by the number of times a student is absent.
Note: The other 4.94% is unexplained and can be due to sampling
error or other variables such as intelligence, amount of time
studied, etc.
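The ratio definition can be checked directly on the Example 3 data. The following sketch (function name `r_squared` is hypothetical) fits the least-squares line and then divides the explained variation by the total variation, rather than squaring r.

```python
def r_squared(xs, ys):
    """Explained variation / total variation for a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope and intercept from deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    explained = sum((p - mean_y) ** 2 for p in predicted)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total


absences = [8, 2, 5, 12, 15, 9, 6]
grades = [78, 92, 90, 58, 43, 74, 81]
print(round(r_squared(absences, grades), 4))  # 0.9502
```

The unrounded ratio comes out at about 0.9502; the 0.9506 quoted in the notes comes from squaring r after rounding it to three decimal places.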