
Expected values, covariance, and correlation

Introduction to Bivariate Regression

Review
Mean
Mode
Median
Freq
Variance
Standard deviation

Is the perception that the majority of Russians believe the same way you do related to how often you discuss politics with friends?

Is this a causal relationship?


[Diagram: the two variables, "Majority of Russians believe the same" and "Discussions of politics with friends", shown in both orders]

When it comes to politics, how close do you think your opinions are to the opinions of the majority of Russians? Very close, rather close, not very close, or not close at all?
majrcl  How close your opinions to the opinions of the majority of Russians about politics

                                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid     1.00 not close at all            3        .9             1.1                  1.1
          2.00 not very close             74      22.7            28.1                 29.3
          3.00 rather close              178      54.6            67.7                 97.0
          4.00 very close                  8       2.5             3.0                100.0
          Total                          263      80.7           100.0
Missing   System                          63      19.3
Total                                    326     100.0

freq vars = majrcl / stats = mean stddev var.

How often do you do the following: discuss political questions with friends, neighbors, or coworkers? Almost never, a few times a year, a few times a month, a few times a week, or practically every day?
discfrnd  How often do you discuss political questions with friends, neighbors

                                       Frequency   Percent   Valid Percent   Cumulative Percent
Valid     1.00 Almost never                   78      23.9            24.6                 24.6
          2.00 A few times a year             41      12.6            12.9                 37.5
          3.00 A few times a month            97      29.8            30.6                 68.1
          4.00 A few times a week             77      23.6            24.3                 92.4
          5.00 Practically every day          24       7.4             7.6                100.0
          Total                              317      97.2           100.0
Missing   8.00 Refuse                          1        .3
          9.00 Unsure                          8       2.5
          Total                                9       2.8
Total                                        326     100.0

freq vars = discfrnd / stats = mean stddev var.

Review standard deviation and variance


Variance: for each unit or observation, take the distance from the mean and square it; then add up the squared distances and divide by the number of units.

Standard deviation: the square root of the variance. Because variance is in squared units, it is hard to interpret directly. The standard deviation can be understood in terms of the original measurement unit.

Calculating variance and standard deviations


unit value   mean     distance   squared distance
4            2.9375    1.06      1.13
3            2.9375    0.06      0.00
4            2.9375    1.06      1.13
1            2.9375   -1.94      3.75
1            2.9375   -1.94      3.75
4            2.9375    1.06      1.13
3            2.9375    0.06      0.00
3            2.9375    0.06      0.00
3            2.9375    0.06      0.00
2            2.9375   -0.94      0.88
3            2.9375    0.06      0.00
3            2.9375    0.06      0.00
4            2.9375    1.06      1.13
5            2.9375    2.06      4.25
1            2.9375   -1.94      3.75
3            2.9375    0.06      0.00

Variance             1.31
Standard deviation   1.143938
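The calculation in the table can be reproduced in a few lines of Python. This is a minimal sketch: the 16 values are copied from the table, and the variable names are illustrative. It also computes the version that divides by n - 1, which appears to be what the SPSS Descriptive Statistics output for discfrnd reports (1.396 and 1.18145).

```python
import math

# The 16 values from the worked table (discfrnd).
values = [4, 3, 4, 1, 1, 4, 3, 3, 3, 2, 3, 3, 4, 5, 1, 3]

n = len(values)
mean = sum(values) / n

# Squared distance of each value from the mean.
squared_distances = [(v - mean) ** 2 for v in values]

# Dividing by n, as in the definition on the slide.
variance_n = sum(squared_distances) / n
sd_n = math.sqrt(variance_n)

# Dividing by n - 1 (the sample formula).
variance_n1 = sum(squared_distances) / (n - 1)
sd_n1 = math.sqrt(variance_n1)

print("mean:", mean)
print("divide by n:    ", round(variance_n, 2), round(sd_n, 6))
print("divide by n - 1:", round(variance_n1, 3), round(sd_n1, 5))
```

The two versions differ only in the denominator, so the gap between them shrinks as the number of units grows.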

Review: Units, mean, variance and standard deviation


majrcl   discfrnd
2.00     4.00
2.00     3.00
.        4.00
.        1.00
2.00     1.00
3.00     4.00
3.00     3.00
3.00     3.00
.        3.00
2.00     2.00
.        3.00
3.00     3.00
3.00     4.00
3.00     5.00
3.00     1.00
3.00     3.00

Descriptive Statistics

                                                                     N    Mean     Std. Deviation   Variance
majrcl    How close your opinions to the opinions of the
          majority of Russians about politics                       12   2.6667           .49237       .242
discfrnd  How often do you discuss political questions
          with friends, neighbors                                   16   2.9375          1.18145      1.396
Valid N (listwise)                                                  12

Expected value v. probability


If our population set of numbers is 1, 1, 3, 3, 17, then the expected value is 5, even though P(5) = 0.

Suppose we know that E(X) = 5 and that y = 5 + 7x. What is E(Y)?
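Both points can be checked numerically. The sketch below assumes every number in the population is equally likely, and it applies y = 5 + 7x to the same five numbers purely for illustration; the variable names are not from the slides.

```python
# Population from the slide; each number is assumed equally likely.
population = [1, 1, 3, 3, 17]

# Expected value of X: the mean of the population.
e_x = sum(population) / len(population)

# Probability of drawing exactly 5: no member of the population equals 5.
p_5 = population.count(5) / len(population)

# Apply y = 5 + 7x to every member, then take the mean.
y_values = [5 + 7 * x for x in population]
e_y = sum(y_values) / len(y_values)

print("E(X) =", e_x)
print("P(5) =", p_5)
print("E(Y) =", e_y, "which equals 5 + 7 * E(X) =", 5 + 7 * e_x)
```

Because y is a linear function of x, E(Y) can be found from E(X) alone: E(Y) = 5 + 7 * E(X) = 40.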

Expected values
Statistics
majrcl  How close your opinions to the opinions of the majority of Russians about politics

N   Valid            263
    Missing           63
Mean              2.7262
Std. Deviation    .53249
Variance            .284

What is the expected value of majrcl? What is the range? Mode? Why are there 63 missing?

Statistics
discfrnd  How often do you discuss political questions with friends, neighbors

N   Valid             317
    Missing             9
Mean               2.7729
Std. Deviation    1.26996
Variance            1.613

What is the expected value of discfrnd? Why are the standard deviation and variance so high?

Crosstab
majrcl  How close your opinions to the opinions of the majority of Russians about politics
* discfrnd  How often do you discuss political questions with friends, neighbors
Crosstabulation

Each cell shows the count, with the % within discfrnd in parentheses.

                          discfrnd
majrcl                    1.00 Almost never   2.00 A few times a year   3.00 A few times a month   4.00 A few times a week   5.00 Practically every day   Total
1.00 not close at all      1 (1.6%)            0 (.0%)                   1 (1.3%)                   1 (1.4%)                  0 (.0%)                       3 (1.2%)
2.00 not very close       28 (44.4%)           8 (34.8%)                21 (27.3%)                 14 (19.2%)                 2 (9.5%)                     73 (28.4%)
3.00 rather close         33 (52.4%)          15 (65.2%)                52 (67.5%)                 56 (76.7%)                17 (81.0%)                   173 (67.3%)
4.00 very close            1 (1.6%)            0 (.0%)                   3 (3.9%)                   2 (2.7%)                  2 (9.5%)                      8 (3.1%)
Total                     63 (100.0%)         23 (100.0%)               77 (100.0%)                73 (100.0%)               21 (100.0%)                  257 (100.0%)
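The "% within discfrnd" figures are column percentages: each count divided by the total for its discfrnd category. A minimal Python sketch of that arithmetic follows; the counts are copied from the crosstab, and the dictionary layout and names are illustrative only.

```python
# Counts from the crosstab, one list per discfrnd category.
# Order within each list: not close at all, not very close,
# rather close, very close.
counts = {
    "Almost never": [1, 28, 33, 1],
    "A few times a year": [0, 8, 15, 0],
    "A few times a month": [1, 21, 52, 3],
    "A few times a week": [1, 14, 56, 2],
    "Practically every day": [0, 2, 17, 2],
}

# Column percentage: count divided by the column total, times 100.
column_percent = {}
for category, column in counts.items():
    total = sum(column)
    column_percent[category] = [round(100 * c / total, 1) for c in column]

for category, percents in column_percent.items():
    print(category, percents)
```

Reading across the "rather close" row (the third entry in each list), the percentage rises from 52.4 among those who almost never discuss politics to 81.0 among those who discuss it practically every day.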

Causation
Time ordering
Covariation

Co-variation from variation?


$\frac{\sum_i (x_i - \bar{x})^2}{n}$

The average squared distance between each x value and the mean of x.

Equivalently, $\frac{\sum_i (x_i - \bar{x})(x_i - \bar{x})}{n}$

Covariation?

$\mathrm{cov}(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
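The step from variation to covariation can be made concrete: the covariance of a variable with itself is its variance. The sketch below is a minimal illustration using the n - 1 denominator; the function name `covariance` is hypothetical, and the data are the 16 discfrnd values from the review, for which the variance was reported as 1.396.

```python
def covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]
    return sum(products) / (n - 1)


# The 16 discfrnd values from the review section.
discfrnd = [4, 3, 4, 1, 1, 4, 3, 3, 3, 2, 3, 3, 4, 5, 1, 3]

# Covariance of a variable with itself is its variance.
variance_of_discfrnd = covariance(discfrnd, discfrnd)
print(round(variance_of_discfrnd, 3))
```

Replacing the second copy of x with a different variable y is the only change needed to go from variance to covariance.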

Covariation
Covariance can take any value from negative infinity to positive infinity.

Intuitive explanation
$\mathrm{cov}(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

When x and y are high at the same time, and x and y are low at the same time, the covariance is positive. Both values are above their means, or both are below, so the products being added together are positive.

Plot showing positive covariance


[Plot: urban % and female literacy, with reference lines at the mean of each variable]

Intuitive explanation
$\mathrm{cov}(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

When x is low when y is high and vice versa, the covariance is negative. One value is above its mean while the other is below its mean, so the products being added together are negative.

Plot showing negative covariance


[Plot: calorie intake and infant mortality, with reference lines at the mean of each variable]

Intuitive explanation
$\mathrm{cov}(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

When, about half of the time, x and y are high at the same time and low at the same time, and about half of the time x is low when y is high and vice versa, the covariance is about 0. High positive numbers are added to high negative numbers, and they cancel out.
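The three cases can be seen side by side by printing the individual products before they are added up. The data below are made up purely for illustration (they are not from the survey or the plots), and the helper name `deviation_products` is hypothetical.

```python
def deviation_products(x, y):
    """Product of the deviations from the means, one per observation."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    return [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]


def covariance(x, y):
    """Sum of the deviation products divided by n - 1."""
    return sum(deviation_products(x, y)) / (len(x) - 1)


# Hypothetical data chosen so the means are whole numbers.
x = [1, 2, 3, 4, 5]
y_together = [2, 4, 6, 8, 10]   # high when x is high
y_opposite = [10, 8, 6, 4, 2]   # high when x is low
y_mixed = [4, 1, 5, 1, 4]       # half together, half opposite

for label, y in [("together", y_together),
                 ("opposite", y_opposite),
                 ("mixed", y_mixed)]:
    print(label, deviation_products(x, y), covariance(x, y))
```

In the mixed case the positive and negative products are the same size, so they cancel exactly; with real data the sum would only be close to 0.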

Plot showing no covariance


[Plot: GDP and crop production, with reference lines at the mean of each variable]

Covariance is a function of
Variance (standard deviation) of x

Variance (standard deviation) of y


Relationship between x and y

How can you compare a covariance of 132 and 134,847?


134,847 could reflect a high variance of x, a high variance of y, a high variance of both variables, or a strong relationship between x and y.


Not that helpful?
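A small experiment shows why the raw number is hard to compare: changing the measurement unit of x changes the covariance even though the relationship between x and y is the same. The data and names below are hypothetical, chosen only to illustrate the point.

```python
def covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical data: x measured in thousands, y in arbitrary units.
x_thousands = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The same x values expressed in single units instead of thousands.
x_units = [xi * 1000 for xi in x_thousands]

cov_small = covariance(x_thousands, y)
cov_large = covariance(x_units, y)

print(cov_small)  # covariance with x in thousands
print(cov_large)  # covariance with x in single units
```

The second covariance is 1,000 times the first, purely because the variance of x grew with the change of unit.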

How can you change the covariance to a number that tells you only the magnitude of the relationship between x and y?

Divide by the standard deviation of x * the standard deviation of y

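Correlation: $r = \frac{\mathrm{cov}(x, y)}{s_x s_y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)}{s_x s_y}$, where $s_x$ and $s_y$ are the standard deviations of x and y.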

Pearson r ranges from -1 to +1
Weak correlation = .1
Moderate correlation = .4
Strong correlation = .7
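Dividing the covariance by the two standard deviations can be sketched as follows. This is an illustration under stated assumptions: the data are hypothetical, the function names are not from the slides, and the n - 1 denominator is used for both the covariance and the standard deviations.

```python
import math


def covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)

# Rescaling x changes the covariance but should leave r unchanged.
r_rescaled = pearson_r([xi * 1000 for xi in x], y)

print(round(r, 4), round(r_rescaled, 4))
```

Because the standard deviations grow by the same factor as the covariance when a variable is rescaled, r depends only on the strength and direction of the relationship.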
