
STAT 3843A Fall 2012 Assignment 2 Solutions

1. For any sampling scheme, $\pi_i$ is the probability that unit $i$ is selected. The Horvitz-Thompson estimator of the population total $t$ is $\hat{t} = \sum_{i \in S} y_i/\pi_i$.

(a) Show that the SRS estimate $\hat{t}_S = N\bar{y}$ is a Horvitz-Thompson estimator by computing $\pi_i$ and writing it in the form above.

For SRS we know $\pi_i = n/N$. Thus

\[ \hat{t}_S = N\bar{y} = \frac{N}{n} \sum_{i \in S} y_i = \sum_{i \in S} \frac{y_i}{n/N} = \sum_{i \in S} \frac{y_i}{\pi_i} \]

as required.
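The identity in (a) can be checked numerically. The following is a minimal sketch, not part of the original solution: the function name `horvitz_thompson_total` and the sample values are hypothetical, chosen only to illustrate that the two forms agree when $\pi_i = n/N$.

```python
from fractions import Fraction

def horvitz_thompson_total(sample_values, inclusion_probs):
    """Sum of y_i / pi_i over the sampled units."""
    return sum(Fraction(y) / p for y, p in zip(sample_values, inclusion_probs))

# Hypothetical illustration: N = 6 units, an SRS of n = 4 with these y-values.
N, n = 6, 4
sample = [2, 3, 6, 7]
pi = [Fraction(n, N)] * n               # pi_i = n/N for every unit under SRS

t_ht = horvitz_thompson_total(sample, pi)
t_srs = N * Fraction(sum(sample), n)    # N * ybar

print(t_ht, t_srs)                      # the two totals should agree
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.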

(b) Show that the stratified sampling estimator $\hat{t} = \sum_{h=1}^{H} N_h \bar{y}_h$ is a Horvitz-Thompson estimator by computing $\pi_i$ and writing it in the form above.

For stratified sampling, $\pi_i$ is $n_h/N_h$ when item $i$ is in stratum $h$. Thus the argument that this is a Horvitz-Thompson estimator is the same as the one above, but with $n$ and $N$ replaced by $n_h$ and $N_h$ respectively.

(c) Suppose that the population is $y_1 = 1$ and $y_i = 0$ for $i = 2, \ldots, N$. Suppose we have some sampling scheme (not necessarily SRS or stratified sampling), and we estimate $t$ by $\hat{t} = \sum_{i \in S} k_i y_i$, where the $k_i$ are constants determined independently of the $y_i$ values. Show that to be unbiased we must have $k_1 = 1/\pi_1$.

In this population the total is 1, so we must have $E(\hat{t}) = 1$ for an unbiased estimator. For any sample that does not include $y_1$, $\hat{t}$ will be zero, while any sample that does include it will give $\hat{t} = k_1$. The probability of the latter case is $\pi_1$, so the expectation is $\pi_1 k_1$, and for this to be 1 we must have $k_1 = 1/\pi_1$.
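The expectation argument in (c) can be illustrated by brute force. This is only a sketch under stated assumptions: the design is taken to be an SRS with hypothetical sizes N = 5 and n = 2 (the result itself holds for any scheme), and the helper `expected_estimate` is introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations

N, n = 5, 2                       # hypothetical sizes, chosen for illustration
y = [1] + [0] * (N - 1)           # y_1 = 1, all other y_i = 0
samples = list(combinations(range(N), n))   # SRS: all samples equally likely

# pi_1 = proportion of samples that contain unit 1 (index 0 here)
pi_1 = Fraction(sum(0 in s for s in samples), len(samples))

def expected_estimate(k):
    """E(t_hat) for t_hat = sum over i in S of k[i] * y[i] under this design."""
    total = sum(sum(k[i] * y[i] for i in s) for s in samples)
    return Fraction(total) / len(samples)

k_ht = [1 / pi_1] * N             # k_1 = 1 / pi_1
k_other = [Fraction(2)] * N       # an arbitrary different constant

print(pi_1, expected_estimate(k_ht), expected_estimate(k_other))
```

With `k_ht` the expectation equals the true total of 1. With any other value of $k_1$ the expectation is $\pi_1 k_1 \neq 1$.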

(d) (For a bonus point): Show that in a general population where the $y_i$ values are unspecified, the Horvitz-Thompson estimator is the only unbiased estimator in the form of $\hat{t} = \sum_{i \in S} k_i y_i$ with the $k_i$ values fixed in advance.

Write $\hat{t}$ as $\sum_{i=1}^{N} k_i y_i Z_i$, where $Z_i$ is 1 if unit $i$ is in the sample and 0 otherwise, so that $E(Z_i) = \pi_i$. We see that for unbiasedness, we must have $\sum_{i=1}^{N} k_i y_i \pi_i = \sum_{i=1}^{N} y_i$. This has to work regardless of the $y_i$ values. If we set them as above, we see $k_1 = 1/\pi_1$. If we put $y_2 = 1$ and the rest zero, we see $k_2 = 1/\pi_2$. Repeat for all $y_i$, and we find the H-T estimator is the only unbiased estimator of this type.

2. (Lohr Ch 3) Consider a population of 6 students. Suppose we know the test scores of the students to be equal to the last 6 digits of your student id number. (For example, if your id is 250766362, the scores would be $y_1 = 7$, $y_2 = 6$, $y_3 = 6$, $y_4 = 3$, $y_5 = 6$, $y_6 = 2$.)

(a) Find the mean $\bar{y}_U$ and $S^2$ for the population.

The mean is 5 and $S^2$ is 4.

(b) How many SRSs of size 4 are possible?

There are $\binom{6}{4} = 15$ possible samples.
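The answers to (a) and (b) can be reproduced with the Python standard library. This is a minimal sketch using the example id from the question. The variable names are illustrative, and `statistics.variance` is used because it divides by $N - 1$, which matches the definition of $S^2$.

```python
from math import comb
from statistics import mean, variance

scores = [7, 6, 6, 3, 6, 2]       # example id 250766362 from the question

y_bar_U = mean(scores)            # population mean
S2 = variance(scores)             # divides by N - 1, as in the definition of S^2
n_samples = comb(len(scores), 4)  # number of SRSs of size 4

print(y_bar_U, S2, n_samples)
```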

(c) List all possible SRSs. For each one, find the sample mean. Find the sample variance of those means. Compare to $V(\bar{y}_S)$ obtained using the standard formula. Explain any difference. NB: In the sample data, the samples {1, 2, 3, 4} and {1, 3, 4, 5} both contain the same $y_i$ values, but they are different samples.

First we put the data in order from smallest to largest; this isn't needed for this question, but helps with the next:

2 3 6 6 6 7

The samples and statistics are:

S= 1 2 3 4   yvals= 2 3 6 6   mean= 4.25
S= 1 2 3 5   yvals= 2 3 6 6   mean= 4.25
S= 1 2 3 6   yvals= 2 3 6 7   mean= 4.5
S= 1 2 4 5   yvals= 2 3 6 6   mean= 4.25
S= 1 2 4 6   yvals= 2 3 6 7   mean= 4.5
S= 1 2 5 6   yvals= 2 3 6 7   mean= 4.5
S= 1 3 4 5   yvals= 2 6 6 6   mean= 5
S= 1 3 4 6   yvals= 2 6 6 7   mean= 5.25
S= 1 3 5 6   yvals= 2 6 6 7   mean= 5.25
S= 1 4 5 6   yvals= 2 6 6 7   mean= 5.25
S= 2 3 4 5   yvals= 3 6 6 6   mean= 5.25
S= 2 3 4 6   yvals= 3 6 6 7   mean= 5.5
S= 2 3 5 6   yvals= 3 6 6 7   mean= 5.5
S= 2 4 5 6   yvals= 3 6 6 7   mean= 5.5
S= 3 4 5 6   yvals= 6 6 6 7   mean= 6.25
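The listing of samples and means can be generated rather than written out by hand. The sketch below is an illustration, not the author's original code: it enumerates every SRS of size 4 from the sorted scores, then computes the variance of the means with both divisors alongside the standard formula $(1/n)(1 - n/N)S^2$.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

y = [2, 3, 6, 6, 6, 7]            # sorted scores, units 1..6
N, n = len(y), 4

# Every SRS of size n, as 1-based unit labels, with its sample mean.
samples = list(combinations(range(1, N + 1), n))
means = [mean(y[i - 1] for i in s) for s in samples]

for s, m in zip(samples, means):
    print(s, [y[i - 1] for i in s], m)

var_div_14 = variance(means)      # divides by (number of samples - 1) = 14
var_div_15 = pvariance(means)     # divides by the number of samples = 15
formula = (1 / n) * (1 - n / N) * variance(y)

print(round(var_div_14, 5), round(var_div_15, 5), round(formula, 5))
```

Dividing by 14 gives 0.35714, while dividing by 15 agrees with the formula value of 0.33333.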

The sample variance of those means is 0.35714. The standard formula is $V(\bar{y}) = (1/n)(1 - n/N)S^2 = 0.33333$. These are different because the sample variance of the 15 means divides the sum of squared deviations by 14, while the standard formula gives the true variance over the 15 equally likely samples, which corresponds to dividing by 15.

(d) Now let stratum 1 consist of the three lowest digits, and stratum 2 consist of the three highest digits. (For the example above, stratum 1 would be 2, 3, 6 and stratum 2 would be 6, 6, 7.) How many stratified random samples of size 4 are possible in which 2 students are selected from each stratum?

There are $\binom{3}{2}^2 = 9$ stratified samples.

(e) List the possible stratified random samples. Which samples from the SRS cannot occur with the stratified design?

S= 1 2 4 5   yvals= 2 3 6 6   mean= 4.25
S= 1 2 4 6   yvals= 2 3 6 7   mean= 4.5
S= 1 2 5 6   yvals= 2 3 6 7   mean= 4.5
S= 1 3 4 5   yvals= 2 6 6 6   mean= 5
S= 1 3 4 6   yvals= 2 6 6 7   mean= 5.25
S= 1 3 5 6   yvals= 2 6 6 7   mean= 5.25
S= 2 3 4 5   yvals= 3 6 6 6   mean= 5.25
S= 2 3 4 6   yvals= 3 6 6 7   mean= 5.5
S= 2 3 5 6   yvals= 3 6 6 7   mean= 5.5
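The stratified samples, and the SRS samples that the stratified design rules out, can be enumerated directly. This is a hedged sketch with illustrative variable names. It assumes units 1 to 3 form stratum 1 and units 4 to 6 form stratum 2, following the sorted example data. Because $N_h = 3$ and $n_h = 2$ in both strata, the stratified mean of each sample equals its plain sample mean.

```python
from itertools import combinations, product
from statistics import mean, pvariance, variance

y = [2, 3, 6, 6, 6, 7]            # sorted scores, units 1..6
stratum1, stratum2 = (1, 2, 3), (4, 5, 6)

# Stratified samples: 2 units from each stratum.
stratified = [a + b for a, b in product(combinations(stratum1, 2),
                                        combinations(stratum2, 2))]
# SRS samples of size 4 that the stratified design cannot produce.
excluded = [s for s in combinations(range(1, 7), 4) if s not in stratified]

# Equal N_h and n_h in both strata, so y_bar_str is the plain sample mean.
means = [mean(y[i - 1] for i in s) for s in stratified]

print(len(stratified), excluded)
print(round(variance(means), 5), round(pvariance(means), 5))
```

The excluded samples are the six that take three units from a single stratum. The variance of the nine means is 0.21875 when dividing by 8, and 0.19444 when dividing by 9.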

(f) Find $\bar{y}_{str}$ for each possible stratified random sample. Find $V(\bar{y}_{str})$, and compare it to $V(\bar{y}_S)$ from SRS.

See above for the means. Their sample variance is 0.21875.

Hand in this assignment on paper in class.
