
Learning R

Laboratory Exercise-III
BPS651 Research Methodology
R.S.Rajput, Assistant Professor (Computer Science)
Laboratory Instructor

BPS651

Department of Mathematics, Statistics &


Computer Sc.

Laboratory Exercise -III


Correlation

Regression
Analysis of Variance


Correlation
Correlation is used to test for a relationship between two numerical variables or two ranked (ordinal) variables.

Correlation is a bivariate analysis that measures the strength of association between two variables. The value of the correlation coefficient varies between +1 and -1.
Usually, in statistics, we measure three types of correlation:

Pearson correlation
Kendall rank correlation
Spearman correlation


Pearson Correlation
Pearson r correlation is widely used in statistics to measure the degree of the relationship between linearly related variables. For the Pearson r correlation, both variables should be normally distributed.
The following formula is used to calculate the Pearson r correlation:

$$ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} $$

Where:
r = Pearson r correlation coefficient
N = number of values in each data set
Σxy = sum of the products of paired scores
Σx = sum of x scores
Σy = sum of y scores
Σx² = sum of squared x scores
Σy² = sum of squared y scores
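
As a rough check, the sums defined above can be computed directly in R and compared with the built-in cor() function. This is only a sketch: it borrows the protein and fat intake data from Exercise 10, and the variable names (num, den, r_manual, r_builtin) are chosen here for illustration.

```r
# Data from Exercise 10: protein intake (x) and fat intake (y), in gm
x <- c(56, 47, 33, 39, 42, 38, 46, 47, 38, 32)
y <- c(56, 83, 49, 52, 65, 52, 56, 48, 59, 70)

N <- length(x)  # number of paired values

# Numerator: N * sum(xy) - sum(x) * sum(y)
num <- N * sum(x * y) - sum(x) * sum(y)

# Denominator: sqrt([N * sum(x^2) - (sum x)^2] * [N * sum(y^2) - (sum y)^2])
den <- sqrt((N * sum(x^2) - sum(x)^2) * (N * sum(y^2) - sum(y)^2))

r_manual  <- num / den
r_builtin <- cor(x, y, method = "pearson")

# Both values should agree up to floating-point rounding
print(c(manual = r_manual, builtin = r_builtin))
```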


Kendall rank Correlation


Kendall rank correlation is a non-parametric test that measures the strength of dependence between two variables. If we consider two samples, a and b, where each sample size is n, the total number of pairings of a with b is n(n-1)/2. The following formula is used to calculate the value of Kendall rank correlation:

$$ \tau = \frac{N_c - N_d}{\tfrac{1}{2}\,n(n-1)} $$

Where:
Nc = number of concordant pairs
Nd = number of discordant pairs
Concordant: ordered in the same way
Discordant: ordered differently
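
The counting of concordant and discordant pairs can be sketched in R as follows. The five data points are made up for illustration and contain no ties; with tied values, cor() applies a tie correction and would no longer match this simple count.

```r
# Illustrative data (not from the exercises), chosen to have no ties
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

n   <- length(x)
idx <- combn(n, 2)  # every pair of observations (i, j) with i < j

# A pair is concordant when x and y differ in the same direction,
# discordant when they differ in opposite directions
s <- sign(x[idx[1, ]] - x[idx[2, ]]) * sign(y[idx[1, ]] - y[idx[2, ]])

Nc <- sum(s > 0)  # number of concordant pairs
Nd <- sum(s < 0)  # number of discordant pairs

tau_manual  <- (Nc - Nd) / (n * (n - 1) / 2)
tau_builtin <- cor(x, y, method = "kendall")

print(c(Nc = Nc, Nd = Nd, manual = tau_manual, builtin = tau_builtin))
```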


Spearman Correlation
Spearman rank correlation is a non-parametric test that is used to measure the degree of association between two variables. The Spearman rank correlation test makes no assumptions about the distribution of the data and is the appropriate correlation analysis when the variables are measured on a scale that is at least ordinal.
The following formula is used to calculate the Spearman rank correlation:

$$ \rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} $$

Where:
ρ = Spearman rank correlation
di = the difference between the ranks of corresponding values xi and yi
n = number of values in each data set
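
A minimal sketch of the rank-difference calculation in R, using made-up data without ties. When ties are present, cor() works from average ranks and can differ from this shortcut.

```r
# Illustrative data (not from the exercises), chosen to have no ties
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

n <- length(x)
d <- rank(x) - rank(y)  # difference between the ranks of paired values

# 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_manual  <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
rho_builtin <- cor(x, y, method = "spearman")

print(c(manual = rho_manual, builtin = rho_builtin))
```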


The cor( ) function


The cor() function produces correlations. A simplified format is
cor(x, use=, method=)

where

x: Matrix or data frame
use: Specifies the handling of missing data. Options are all.obs (assumes no missing data; missing data will produce an error), complete.obs (listwise deletion), and pairwise.complete.obs (pairwise deletion)
method: Specifies the type of correlation. Options are pearson, spearman or kendall.
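
The use and method arguments can be tried out with a short sketch like the one below. It reuses the Exercise 10 data; the data frame names (df, df_na) and the missing value placed in row 3 are assumptions made only for this illustration.

```r
# Exercise 10 data arranged as a data frame
df <- data.frame(
  protein = c(56, 47, 33, 39, 42, 38, 46, 47, 38, 32),
  fat     = c(56, 83, 49, 52, 65, 52, 56, 48, 59, 70)
)

# The method argument selects the type of correlation
r_pearson  <- cor(df, method = "pearson")
r_spearman <- cor(df, method = "spearman")
r_kendall  <- cor(df, method = "kendall")

# Introduce one missing value to see how the use argument behaves
df_na <- df
df_na$fat[3] <- NA

r_default  <- cor(df_na)                                 # default use = "everything": result is NA
r_complete <- cor(df_na, use = "complete.obs")           # listwise deletion
r_pairwise <- cor(df_na, use = "pairwise.complete.obs")  # pairwise deletion

print(r_pearson)
print(r_complete)
```

With only two columns, listwise and pairwise deletion drop the same row, so the last two results coincide; they can differ when there are three or more variables with missing values in different rows.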

Visualizing Correlations
plot(), abline(), lowess(), lines(), pairs()
plot(): the basic plotting function; plots the (x, y) points

> plot(x, y)
pairs(): creates scatterplot matrices
> pairs(y ~ x)
lines(): joins the points with line segments
> lines(x, y, col="black")
abline(): draws the regression line of y on x
> abline(lm(y ~ x), h=0, v=0, col="red")
lowess(): computes a smoothed (lowess) curve, which is drawn with lines()
> lines(lowess(x, y), col="blue")
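
Put together, these calls might be used as in the following sketch. It reuses the Exercise 10 data, and the axis labels and object names (fit, smooth) are assumptions for illustration.

```r
# Exercise 10 data: protein intake (x) and fat intake (y), in gm
x <- c(56, 47, 33, 39, 42, 38, 46, 47, 38, 32)
y <- c(56, 83, 49, 52, 65, 52, 56, 48, 59, 70)

# Scatter plot of the (x, y) points
plot(x, y, xlab = "Protein intake (gm)", ylab = "Fat intake (gm)")

# Regression line of y on x, drawn in red
fit <- lm(y ~ x)
abline(fit, col = "red")

# Lowess smoothed curve, drawn in blue on the same plot
smooth <- lowess(x, y)
lines(smooth, col = "blue")

# Scatter plot matrix of both variables (opens a new plot)
pairs(data.frame(protein = x, fat = y))
```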


Exercise 10
Protein intake X and fat intake Y (in gm) for ten old women are given as
X 56,47,33,39,42,38,46,47,38,32
Y 56,83,49,52,65,52,56,48,59,70
Calculate the correlation coefficient (Pearson)
Draw a scatter plot matrix and a scatter plot


Exercise 11
Find the correlation coefficient (Pearson) between the sales and expenses from the data given below:

Firm:                  1   2   3   4   5   6   7   8   9  10
Sales (Rs Lakhs):     50  50  55  60  65  65  65  60  60  50
Expenses (Rs Lakhs):  11  13  14  16  16  15  15  14  13  13

Draw a scatter plot matrix and a scatter plot


Simple Linear Regression


A simple linear regression model that describes the relationship between two variables x and y can be expressed by the following equation. The numbers α and β are called parameters, and ε is the error term.

$$ y = \alpha + \beta x + \varepsilon $$
Estimated Simple Regression Equation

Coefficient of Determination
Significance Test for Linear Regression
Confidence Interval for Linear Regression
Prediction Interval for Linear Regression

Residual Plot
Standardized Residual
Normal Probability Plot of Residuals

Simple Linear Regression cont.


Estimated Simple Regression Equation

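> w = lm(y ~ x)
Where
x: the independent (predictor) variable
y: the dependent (response) variable
w: the fitted linear model object returned by lm()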
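
A possible walk-through of the regression topics listed earlier, using lm() on the village data from the paddy cultivation exercise (geographical area and area under paddy, in hectares). The variable names (area, paddy, new_village and so on) are chosen here for illustration, and the sketch is not the only way to obtain these quantities.

```r
# Geographical area (x) and area under paddy (y), in hectares, for 15 villages
area  <- c(103, 106, 120, 120, 100, 151, 160, 155, 136, 178, 196, 140, 160, 166, 112)
paddy <- c(41, 33, 87, 78, 35, 81, 90, 85, 70, 100, 102, 70, 82, 85, 50)

# Estimated simple regression equation of y on x
w <- lm(paddy ~ area)
print(coef(w))  # intercept and slope

# Coefficient of determination
r2 <- summary(w)$r.squared

# Significance test for the regression: coefficient table with t tests
print(coef(summary(w)))

# Confidence and prediction intervals at a geographical area of 136 hectares
new_village <- data.frame(area = 136)
conf_int <- predict(w, new_village, interval = "confidence")
pred_int <- predict(w, new_village, interval = "prediction")
print(conf_int)
print(pred_int)

# Residuals and standardized residuals
res     <- resid(w)
std_res <- rstandard(w)

# Residual plot and normal probability plot of the standardized residuals
plot(area, res, ylab = "Residuals")
abline(h = 0)
qqnorm(std_res)
qqline(std_res)
```

The prediction interval is wider than the confidence interval at the same x, because it also accounts for the variation of an individual observation around the fitted line.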


Multiple Linear Regressions


Estimated Multiple Regression Equation

Multiple Coefficient of Determination


Adjusted Coefficient of Determination
Significance Test for MLR

Confidence Interval for MLR


Prediction Interval for MLR
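
The topics above can be explored with lm() and more than one predictor. The sketch below assumes R's built-in stackloss data set, which is not part of this handout; the values in new_obs are arbitrary illustration values.

```r
# Built-in data set: stack.loss explained by three predictors
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)

# Estimated multiple regression equation (intercept and three slopes)
print(coef(fit))

s <- summary(fit)
r2     <- s$r.squared      # multiple coefficient of determination
adj_r2 <- s$adj.r.squared  # adjusted coefficient of determination

# Significance tests: t test per coefficient, overall F statistic
print(coef(s))
print(s$fstatistic)

# Confidence and prediction intervals for one hypothetical observation
new_obs  <- data.frame(Air.Flow = 72, Water.Temp = 20, Acid.Conc. = 85)
conf_int <- predict(fit, new_obs, interval = "confidence")
pred_int <- predict(fit, new_obs, interval = "prediction")
print(conf_int)
print(pred_int)
```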


Logistic Regression
Estimated Logistic Regression Equation

Significance Test for Logistic Regression
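
A logistic regression can be fitted with glm() and a binomial family. The sketch below assumes R's built-in mtcars data set (not part of this handout), with transmission type am as the 0/1 response; the values in new_car are arbitrary illustration values.

```r
# Built-in data set: am (0 = automatic, 1 = manual) modelled by hp and wt
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Estimated logistic regression equation: coefficients on the log-odds scale
print(coef(fit))

# Significance test: z test for each coefficient
print(coef(summary(fit)))

# Estimated probability of a manual transmission for one hypothetical car
new_car <- data.frame(hp = 120, wt = 2.8)
p <- predict(fit, new_car, type = "response")
print(p)
```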


Exercise 11
Geographical area x and area under paddy cultivation y (in hectares) for 15 villages of a district are given below
X 103,106,120,120,100,151,160,155,136,178,196,140,160,166,112
Y 041,033,087,078,035,081,090,085,070,100,102,070,082,085,050

Calculate the correlation coefficient
Calculate the regression equation of y on x
Estimate the area under paddy cultivation for a village whose geographical area is 136 hectares



Exercise 12
Calculate the correlation coefficient between marks obtained in the 1st prefinal and 2nd prefinal examinations on the basis of the following data collected for a sample of 12 students
I 12,14,9.5,10.5,8,11.5,10,14,8,9.5,11,12
II 11.5,13.5,12,14,7,14,8,12.5,6.5,10,9,12
Calculate correlation coefficients
Calculate regression equation of y on x


Exercise 10
Protein intake X and fat intake Y (in gm) for ten old women are given as
X 56,47,33,39,42,38,46,47,38,32
Y 56,83,49,52,65,52,56,48,59,70
Calculate the correlation coefficient
Calculate the regression equation of y on x
Estimate the fat intake of a woman whose protein intake is 38 gm.


Exercise 13
Twelve students obtained the following percentages of marks in physics (X) and statistics (Y). For the data given below, calculate:
Correlation coefficient
Linear regression equation of y on x
Predicted value and residual value for x=80
X 73,42,88,38,68,75,80,54,64,48,35,37
Y 73,48,86,58,65,60,76,54,50,38,32,30

Exercise 14


Exercise 15


Exercise 16


Exercise 17


Exercise 18


Exercise 19


Exercise 20


Analysis of Variance


Completely Randomized Design


Randomized Block Design


Factorial Design


Laboratory Exercise -IV


T Test

Chi square test


Non-parametric Methods
