Prof. Thistleton
Lecture 25
Sections from Text and Homework Problems: Read 5.3; Problems 30, 31, 32
Topics from Syllabus: Correlation Results
Harvard Lectures: Lecture 21

Review and Looking Ahead

What do we know about joint (typically pairwise, so far) distributions? We know about:
- Joint and marginal distributions, for both discrete and continuous random variables
- Conditional distributions, conditional expectation, and the total expectation theorem
- Covariance
We are about to review another example of a covariance calculation, the one you computed in the last lecture with face cards and spades. But first, let's explore some theory. I'll prove these in the continuous case; you should take out a blank sheet of paper and work them for the discrete case.

E[X + Y] = E[X] + E[Y]

If X and Y are independent, E[XY] = E[X] E[Y]

If X and Y are independent, Cov(X, Y) = 0
Just apply some Calc III now. We will distribute, and then switch the order of integration, pulling the marginal densities through the integral as we go:

E[X + Y] = ∫∫ (x + y) f(x, y) dx dy
         = ∫∫ x f(x, y) dy dx + ∫∫ y f(x, y) dx dy
         = ∫ x f_X(x) dx + ∫ y f_Y(y) dy
         = E[X] + E[Y]
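The same fact is easy to check numerically in the discrete case. A minimal sketch with a small made-up joint pmf (these numbers are not from the lecture); note that linearity holds even though this particular pair is not independent:

```python
# Illustrative check: E[X + Y] = E[X] + E[Y] for a hypothetical discrete pmf.
joint = {  # p(x, y) for a made-up pair (X, Y); probabilities sum to 1
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

E_sum = sum((x + y) * p for (x, y), p in joint.items())  # E[X + Y]
E_X = sum(x * p for (x, y), p in joint.items())          # E[X]
E_Y = sum(y * p for (x, y), p in joint.items())          # E[Y]

print(E_sum, E_X + E_Y)  # the two values agree
```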
Play the same game. Just remember that independence allows us to factor the joint distribution as the product of the marginals, and then pull the constant through the integral:

E[XY] = ∫∫ xy f(x, y) dx dy
      = ∫∫ xy f_X(x) f_Y(y) dx dy
      = ∫ x f_X(x) dx · ∫ y f_Y(y) dy
      = E[X] E[Y]

Really, you just multiply and pull constants through the expected value operator:

Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)]
          = E[XY - μ_Y X - μ_X Y + μ_X μ_Y]
          = E[XY] - μ_X μ_Y
          = E[X] E[Y] - μ_X μ_Y
          = 0
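A quick numerical sketch of this result, using a made-up pair of marginals (not from the lecture) and a joint pmf constructed to be independent:

```python
# Illustrative check: for an independent pair, Cov(X, Y) = E[XY] - E[X]E[Y] = 0.
pX = {0: 0.25, 1: 0.75}          # hypothetical marginal of X
pY = {1: 0.5, 2: 0.3, 3: 0.2}    # hypothetical marginal of Y
# Independence: the joint pmf factors as the product of the marginals.
joint = {(x, y): pX[x] * pY[y] for x in pX for y in pY}

E_XY = sum(x * y * p for (x, y), p in joint.items())
E_X = sum(x * p for x, p in pX.items())
E_Y = sum(y * p for y, p in pY.items())
cov = E_XY - E_X * E_Y
print(cov)  # ~0 (zero up to floating point)
```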
Show that the converse is not true. Here we go: if we find one counterexample, we are done. Luckily, you calculated the covariance in the last lecture for faces and spades. As a reminder:
Draw two cards; let X = number of spades and Y = number of face cards. The joint pmf (counts over the C(52,2) = 1326 equally likely hands):

                        X = number of spades
                          0          1          2
Y = number of    0    435/1326   300/1326    45/1326
face cards       1    270/1326   180/1326    30/1326
                 2     36/1326    27/1326     3/1326
For the marginal expectations you can apply the definition or recall that X and Y are individually hypergeometric. In either case, E[X] = 1/2 and E[Y] = 6/13. (A quick aside: the first time I thought up this example, I did my calculation with a hand calculator and had a terrible time. The calculation is very sensitive to rounding, so stay in fractions.) Looping around the table,

E[XY] = (1)(1)(180/1326) + (1)(2)(27/1326) + (2)(1)(30/1326) + (2)(2)(3/1326) = 306/1326 = 3/13

so that

Cov(X, Y) = E[XY] - E[X] E[Y] = 3/13 - (1/2)(6/13) = 0

The covariance is zero, yet X and Y are certainly not independent.
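You can check the whole calculation exactly by brute force. A sketch, assuming the standard deck composition (3 spade face cards, 10 other spades, 9 other face cards, 30 remaining cards) and a two-card draw:

```python
from fractions import Fraction
from itertools import combinations

# Tag each of the 52 cards as (is_spade, is_face).
deck = [(1, 1)] * 3 + [(1, 0)] * 10 + [(0, 1)] * 9 + [(0, 0)] * 30

n_hands = 0
sum_x = sum_y = sum_xy = 0
for (s1, f1), (s2, f2) in combinations(deck, 2):  # all C(52,2) = 1326 hands
    x, y = s1 + s2, f1 + f2       # spades and face cards in this hand
    n_hands += 1
    sum_x += x
    sum_y += y
    sum_xy += x * y

E_X = Fraction(sum_x, n_hands)    # 1/2
E_Y = Fraction(sum_y, n_hands)    # 6/13
cov = Fraction(sum_xy, n_hands) - E_X * E_Y
print(E_X, E_Y, cov)              # 1/2 6/13 0
```

Tabulating the joint pmf the same way also confirms the dependence: P(X=2, Y=2) = 3/1326, while P(X=2) P(Y=2) = (78/1326)(66/1326), which is not the same number, so zero covariance here does not mean independence.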
This must be important, since I put it in a box: independent random variables are uncorrelated, but uncorrelated random variables need not be independent. An especially important special case is the multivariate normal distribution; in that case, uncorrelated is synonymous with independent. More on that will follow.

Derive a formula for the variance of a linear combination of random variables. That is, find Var(aX + bY). Writing E[aX + bY] = a μ_X + b μ_Y and regrouping as convenient,

Var(aX + bY) = E[(aX + bY - a μ_X - b μ_Y)^2]
             = E[(a(X - μ_X) + b(Y - μ_Y))^2]
             = a^2 E[(X - μ_X)^2] + b^2 E[(Y - μ_Y)^2] + 2ab E[(X - μ_X)(Y - μ_Y)]
             = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)

This will be a crucial result over the next several lectures.
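The identity can be verified numerically. A minimal sketch, with a made-up joint pmf and arbitrary constants a and b (none of these numbers come from the lecture):

```python
# Illustrative check of Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
joint = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.4}  # made-up pmf
a, b = 2.0, -3.0  # arbitrary constants

def E(g):
    """Expectation of g(x, y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

var_X = E(lambda x, y: x**2) - E(lambda x, y: x)**2
var_Y = E(lambda x, y: y**2) - E(lambda x, y: y)**2
cov = E(lambda x, y: x*y) - E(lambda x, y: x) * E(lambda x, y: y)

lhs = E(lambda x, y: (a*x + b*y)**2) - E(lambda x, y: a*x + b*y)**2
rhs = a**2 * var_X + b**2 * var_Y + 2*a*b*cov
print(lhs, rhs)  # the two sides agree
```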
As an interesting special case, when Cov(X, Y) = 0,

Var(aX + bY) = a^2 Var(X) + b^2 Var(Y)

This extends to the following. If X_1, X_2, ..., X_n are independent,

Var(X_1 + X_2 + ... + X_n) = Var(X_1) + Var(X_2) + ... + Var(X_n)

So, if we take an average of n independent random variables, each with variance σ^2,

Var(X̄) = Var((1/n) Σ X_i) = (1/n^2) Σ Var(X_i) = σ^2 / n
Look at that denominator: this means that the variability of an average decreases dramatically as the sample size increases. This is why we trust a sample of size 100 much more than a sample of size 10.

We have now seen several results concerning joint, marginal, and conditional distributions. We have also seen that a measure of linear relation between random variables is the covariance. There are other ways to measure dependency, such as mutual information, but a remarkable theory may be built upon covariances, especially when working with multivariate distributions. We take a moment here, before moving on to continuous distributions, to define the correlation between two random variables. This is useful because we feel that the strength of the relationship should not depend upon whether we measure in inches, feet, or miles.
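The σ^2/n shrinkage of an average is easy to see by simulation. A sketch (not part of the notes; the uniform population and repetition count are arbitrary choices) comparing samples of size 10 and size 100:

```python
import random

# The sample mean of n draws has variance sigma^2 / n. Population here is
# uniform on [0, 1], so sigma^2 = 1/12 and Var(mean of n) should be 1/(12n).
random.seed(0)

def var_of_means(n, reps=20000):
    """Empirical variance of the mean of n uniform(0,1) draws."""
    means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]
    m = sum(means) / reps
    return sum((x - m) ** 2 for x in means) / reps

v10, v100 = var_of_means(10), var_of_means(100)
print(v10, v100)  # roughly 1/120 and 1/1200
```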
We can define the correlation coefficient as the covariance between the standardized random variables:

ρ_XY = E[ ((X - μ_X)/σ_X) ((Y - μ_Y)/σ_Y) ] = Cov(X, Y) / (σ_X σ_Y)
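A tiny numerical illustration of why standardizing helps (the heights and weights below are made up): converting inches to feet leaves ρ unchanged.

```python
# Population-style correlation coefficient (divide by n throughout).
def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return cov / (sx * sy)

inches = [60, 62, 65, 70, 72]        # made-up heights
weights = [115, 120, 140, 155, 170]  # made-up weights
feet = [h / 12 for h in inches]      # same heights, different units

print(corr(inches, weights), corr(feet, weights))  # identical values
```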
Obviously, if X and Y are independent, then the correlation between them is zero. People like the correlation coefficient because, among other things, it makes the degree of linear relationship easy to understand. In particular, we can show that

-1 ≤ ρ_XY ≤ 1
To see how a correlation might be unity, consider the following ideas. First, supposing X and Y are random variables such that Y = aX + b, show (trivially) that μ_Y = a μ_X + b (we have seen this before; this is just a reminder). Then, consider the product

(X - μ_X)(Y - μ_Y) = a (X - μ_X)^2

so that Cov(X, Y) = a Var(X). Relate the variances of X and Y and show that σ_Y = |a| σ_X. Finally, we have that

ρ_XY = Cov(X, Y) / (σ_X σ_Y) = a Var(X) / (|a| σ_X^2) = a / |a|

so ρ_XY = +1 when a > 0 and ρ_XY = -1 when a < 0.
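This can be seen numerically as well. A sketch (the values of X and the constants a and b are made up for illustration):

```python
# When Y = aX + b exactly, rho is +1 or -1, matching the sign of a.
def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return cov / (sx * sy)

xs = [1.0, 2.0, 4.0, 7.0]                # arbitrary sample of X values
print(corr(xs, [3 * x + 1 for x in xs]))   # 1.0, since a = 3 > 0
print(corr(xs, [-2 * x + 5 for x in xs]))  # -1.0, since a = -2 < 0
```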
We will soon be considering statistics, many of which are built from sums of independent random variables. Take a moment to compute the variance of the sum:

Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)

In the special case that X and Y are independent,

Var(X + Y) = Var(X) + Var(Y)
Finally, if