
SCHOOL OF ECONOMICS

ECON 2206 INTRODUCTORY ECONOMETRICS


FINAL EXAMINATION
SESSION 1, 2008

1. TIME ALLOWED - 2 Hours.


2. READING TIME - 10 Minutes
3. THIS EXAMINATION PAPER HAS 9 PAGES
4. TOTAL NUMBER OF QUESTIONS - 6.
5. ANSWER ALL QUESTIONS.
6. ALL QUESTIONS ARE OF EQUAL VALUE
7. TOTAL MARKS AVAILABLE FOR THIS EXAMINATION - 60.
8. THE MARKS AWARDED TO EACH PART OF A QUESTION ARE INDICATED.
9. CANDIDATES MAY BRING THEIR OWN CALCULATORS TO THE EXAM
10. STATISTICAL TABLES ARE PROVIDED AT THE END OF THE EXAM PAPER
11. ALL ANSWERS MUST BE WRITTEN IN PEN. PENCILS MAY BE USED ONLY FOR DRAWING,
SKETCHING OR GRAPHICAL WORK.
12. THIS PAPER MAY BE RETAINED BY THE CANDIDATE

Please see over

ANSWER ALL SIX QUESTIONS


REMINDER: When performing statistical tests, always state the null and alternative hypotheses, the test statistic and its distribution under the null hypothesis, the level of significance
and the conclusion of the test.

Question 1. (10 Marks).


(i) What are the distinctive features of panel data compared to cross-section data ? Explain. (2 marks).

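(ii) In the model:

\[
\mathit{price} = \beta_0 + \beta_1\,\mathit{area} + \beta_2\,\mathit{bdrms} + \beta_3\,\mathit{area}\cdot\mathit{bdrms} + u
\]

what is the partial effect of bdrms on \(\widehat{\mathit{price}}\) ? (2 marks).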

(iii) What is the p-value for a test statistic ? SHAZAM reported the p-value for a test was 0.022; should I
therefore reject the null hypothesis at the 1% level of significance ? (2 marks)

(iv) Consider the following regression model explaining child birthweight (bwght):
\[
\mathit{bwght} = \beta_0 + \beta_1\,\mathit{cigs} + \beta_2\,\mathit{faminc} + u
\]
where cigs = average number of cigarettes the mother smoked per day during pregnancy and faminc = family
income. In the sample of data I do not have a measure of family income. One option is to find a proxy
variable for family income. What are the desirable properties of a proxy variable ? Suggest one possible
proxy for family income. (2 marks)

(v) What is meant by the sampling distribution of an estimator ? What is known about the sampling
distribution of the OLS estimator under the first 4 Gauss-Markov assumptions ? (2 marks)


Question 2. (10 Marks in total)


The variable smokes is a binary variable equal to one if a person smokes, and zero otherwise. The estimates
for the linear probability model explaining smokes are:
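\[
\begin{array}{rllllll}
\widehat{\mathit{smokes}} = & 0.301 & -\,0.093\,\log(\mathit{price}) & -\,0.029\,\log(\mathit{income}) & -\,0.031\,\mathit{educ} & +\,0.020\,\mathit{age} & -\,0.00024\,\mathit{age}^2 \\
 & (0.132) & \phantom{-\,}(0.184) & \phantom{-\,}(0.009) & \phantom{-\,}(0.006) & \phantom{+\,}(0.006) & \phantom{-\,}(0.00006) \\
 & [0.087] & \phantom{-\,}[0.187] & \phantom{-\,}[0.032] & \phantom{-\,}[0.004] & \phantom{+\,}[0.006] & \phantom{-\,}[0.00005]
\end{array}
\]
n = 220, \(R^2 = 0.213\)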
where price is the price of cigarettes, income is monthly income, educ is years of education, and age is the
person's age measured in years. The usual OLS standard errors are reported in parentheses (.), and the
heteroskedasticity-robust standard errors are reported in brackets [.].
(i) What is the interpretation of the coefficient on educ ? (2 marks)
(ii) What is heteroskedasticity and what are the consequences of heteroskedasticity for:
(a) the estimation of the regression model by OLS, and
(b) conducting hypothesis tests with the usual t and F test statistics ?
(3 marks in total)
(iii) Are there any important differences between the two sets of standard errors reported above ? Does this
provide evidence of whether heteroskedasticity is present ? Briefly explain. (3 marks)
(iv) What is the partial effect of age on the probability of smoking ? At what age does this partial effect reach
a maximum ? (2 marks)


Question 3. (10 Marks in total)


We are interested in analysing the effect of different house characteristics on the sale price of a house,
and propose the following population model:
\[
\mathit{price} = \beta_0 + \beta_1\,\mathit{area} + \beta_2\,\mathit{bdrms} + \beta_3\,\mathit{bthrms} + u \qquad (3.1)
\]

where price is the house price (in $100,000), area is the floor area of the house (in square metres), bdrms is
the number of bedrooms and bthrms is the number of bathrooms.

Estimates for the model based on a sample of 108 observations from Sydney are:
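\[
\begin{array}{rllll}
\widehat{\mathit{price}} = & 3.2481 & +\,0.2013\,\mathit{area} & +\,0.2363\,\mathit{bdrms} & +\,0.3207\,\mathit{bthrms} \\
 & (0.3945) & \phantom{+\,}(0.0093) & \phantom{+\,}(0.2092) & \phantom{+\,}(0.0395)
\end{array}
\qquad (3.2)
\]
n = 108, \(R^2 = 0.3301\)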

(i) What is the interpretation of the coefficient on bdrms in Model (3.2) ? (2 marks)

(ii) Test the hypothesis that the coefficient on bdrms is equal to 0. Use a two-sided alternative and a 10%
level of significance. (2 marks)

(iii) We are concerned that the model may be misspecified due to the omission of non-linear terms in either
area, bdrms or bthrms. Outline the steps involved in performing the RESET test, and state the null and
alternative hypotheses of the RESET test. (4 marks)

(iv) Do the model estimates in (3.2) seem reasonable - or do the results provide evidence that it is a poor
model ? Explain. (2 marks)


Question 4. (10 Marks in total).


The following model
\[
\log(\mathit{price}) = \beta_0 + \beta_1 \log(\mathit{nox}) + \beta_2 \log(\mathit{dist}) + \beta_3\,\mathit{rooms} + u \qquad (4.1)
\]

relates median house prices in the community (price) to a range of community characteristics: nox which is
the amount of nitrous oxide in the air (a measure of air pollution), dist which is the average distance of the
community from major employment centres, and rooms which is the average number of rooms in houses in
the community.
(i) What is the interpretation of the coefficient \(\beta_1\) ? (2 Marks)

Based on a sample of 620 communities in a large city, the following estimation results were obtained:
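\[
\begin{array}{rllll}
\widehat{\log(\mathit{price})} = & 13.08 & -\,0.454\,\log(\mathit{nox}) & -\,0.121\,\log(\mathit{dist}) & +\,0.255\,\mathit{rooms} \\
 & (0.52) & \phantom{-\,}(0.317) & \phantom{-\,}(0.083) & \phantom{+\,}(0.019)
\end{array}
\qquad (4.2)
\]
n = 620, \(R^2 = 0.251\)
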

(ii) Test the overall explanatory power of the model (that is, test the null hypothesis
\(H_0: \beta_1 = 0, \beta_2 = 0, \beta_3 = 0\)) using a 1% significance level. (3 Marks)
(iii) Test that the two features of the local community - log(nox) and log(dist) - are jointly insignificant and
should be excluded from the model. Use a 1% significance level. The \(R^2\) statistic for the model with these
restrictions imposed is \(R^2 = 0.125\). What do you conclude ? (3 marks)

(iv) Do you think that this model captures the causal effect of nox on house prices ? Explain. (2 Marks)

NOTE: The F-test statistic is given by the formula:

\[
F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)} = \frac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n - k - 1)}
\]

where SSR is the sum of squared residuals, q is the number of restrictions, and ur and r stand for unrestricted
and restricted models, respectively.
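
As an illustration only, the two forms of the F statistic in the note above can be sketched in Python. The function names are hypothetical, the inputs are made-up numbers that do not come from any question in this paper, and k is assumed to be the number of regressors in the unrestricted model. The R-squared form assumes both models have the same dependent variable.

```python
def f_stat_from_ssr(ssr_r, ssr_ur, n, k, q):
    """F statistic from restricted and unrestricted sums of squared residuals."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))


def f_stat_from_r2(r2_ur, r2_r, n, k, q):
    """Equivalent R-squared form (same dependent variable in both models)."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))


# Hypothetical inputs, chosen only so the arithmetic is easy to follow.
n, k, q = 104, 3, 2          # observations, regressors (assumed meaning of k), restrictions
sst = 200.0                  # made-up total sum of squares
r2_ur, r2_r = 0.5, 0.4       # made-up unrestricted and restricted R-squared
ssr_ur = sst * (1 - r2_ur)   # SSR = SST * (1 - R^2)
ssr_r = sst * (1 - r2_r)

f_r2 = f_stat_from_r2(r2_ur, r2_r, n, k, q)
f_ssr = f_stat_from_ssr(ssr_r, ssr_ur, n, k, q)
print(round(f_r2, 4), round(f_ssr, 4))
```

With these made-up inputs both forms give an F statistic of about 10, which would then be compared with the critical value for q and n - k - 1 degrees of freedom.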


Question 5. (10 Marks in total).


To study the impact of the Australian government's Baby Bonus policy on the fertility rate of Australians,
the following model was estimated using annual data for the period 1928 to 2007:
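\[
\begin{array}{rlllll}
\widehat{\mathit{fr}}_t = & 124.09 & +\,0.114\,\mathit{Bonus}_t & -\,35.88\,\mathit{ww2}_t & -\,2.53\,t & +\,0.120\,t^2 \\
 & (4.36) & \phantom{+\,}(0.040) & \phantom{-\,}(5.71) & \phantom{-\,}(0.39) & \phantom{+\,}(0.005)
\end{array}
\qquad (5.1)
\]
n = 80, \(R^2 = 0.698\)
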

The variable fr is the fertility rate (number of children born to every 1000 women of child-bearing age),
Bonus is the economic variable of interest, which measures the real dollar value of the Baby Bonus (a
government payment to a woman for each child she gives birth to; the value of Bonus ranges from $0 to $5000,
with an average of $100 over the sample period), and ww2 is a dummy variable equal to one during the years of
the Second World War. The model includes a quadratic time trend, with the variable t indexing the year of
each observation (ranging from a value of 1 for the observation for 1928 to a value of 80 for 2007).
(i) What is the interpretation of the coefficient on Bonus ? (2 marks)

(ii) Construct a 90% confidence interval for the effect of Bonus on \(\widehat{\mathit{fr}}_t\). (2 marks).
(iii) What is the purpose of including the terms t and \(t^2\) in model (5.1) ? What are the potential problems
of not including such terms ? (2 marks).

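(iv) To allow for the possibility that the fertility rate may respond to Bonus with a lag, the following Finite
Distributed Lag model was estimated:
\[
\begin{array}{rlllllll}
\widehat{\mathit{fr}}_t = & 95.87 & +\,0.035\,\mathit{Bonus}_t & +\,0.072\,\mathit{Bonus}_{t-1} & +\,0.024\,\mathit{Bonus}_{t-2} & -\,35.37\,\mathit{ww2}_t & -\,2.21\,t & +\,0.090\,t^2 \\
 & (3.28) & \phantom{+\,}(0.126) & \phantom{+\,}(0.016) & \phantom{+\,}(0.011) & \phantom{-\,}(10.73) & \phantom{-\,}(0.61) & \phantom{+\,}(0.006)
\end{array}
\qquad (5.2)
\]
n = 80, \(R^2 = 0.811\)
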

What is the estimated Long Run Propensity (LRP) of Bonus on \(\widehat{\mathit{fr}}\) in this model ? What is the interpretation
of the LRP ? (2 marks).
(v) There is not sufficient information reported for model (5.2) above to test whether the LRP is statistically
significant. Represent the general form of model (5.2) as
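\[
\mathit{fr}_t = \alpha_0 + \delta_0\,\mathit{Bonus}_t + \delta_1\,\mathit{Bonus}_{t-1} + \delta_2\,\mathit{Bonus}_{t-2} + \beta_1\,\mathit{ww2}_t + u \qquad (5.3)
\]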

and derive the transformed equation you would need to estimate to obtain a direct estimate of the LRP and,
most importantly, its standard error. Which coefficient in the transformed model represents the LRP ?
(2 marks).


Question 6. (10 Marks in total).


We are interested in analysing wage discrimination against women, and how it has changed over time.
We have independent samples of data for the Australian workforce in 1990 and 2000. The two cross-sections
are pooled and the following model was estimated:
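\[
\log(\mathit{wage}) = \beta_0 + \beta_1\,\mathit{y2000} + \beta_2\,\mathit{educ} + \beta_3\,\mathit{female} + \beta_4\,\mathit{female}\cdot\mathit{y2000} + u \qquad (6.1)
\]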

where y2000 is a dummy variable equal to 1 if the observation is from the Year 2000 sample (and is equal
to 0 otherwise), educ is years of education and female is a dummy variable equal to 1 if the observation is
for a woman (and is equal to 0 for males). The estimates obtained from our pooled sample are:
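\[
\begin{array}{rlllll}
\widehat{\log(\mathit{wage})} = & 0.9764 & +\,0.3716\,\mathit{y2000} & +\,0.0653\,\mathit{educ} & -\,0.3022\,\mathit{female} & +\,0.1401\,\mathit{female}\cdot\mathit{y2000} \\
 & (0.679) & \phantom{+\,}(0.0279) & \phantom{+\,}(0.0051) & \phantom{-\,}(0.0282) & \phantom{+\,}(0.0361)
\end{array}
\qquad (6.2)
\]
n = 1084, \(R^2 = 0.3554\), SSR = 224.84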

(i) What is the interpretation of the coefficient on female in model (6.1) ? (2 marks).
(ii) What is the expected log(wage) for a female with 16 years of education in 1990 ? (2 marks).
(iii) From the estimates in (6.2), what is the estimate of the exact percentage difference in expected wages
between 1990 and 2000 ? (2 marks).
(iv) From the estimates in (6.2), calculate an unbiased estimate of the variance of the error term, \(Var(u) = \sigma^2\).
(2 marks).

(v) From the estimation results in (6.2), would you conclude there is discrimination against women in the
labour market ? Explain the reasons for your conclusion. (2 marks).

End of Paper

Table 1. Critical Values of the t Distribution

Rows: Degrees of Freedom (df). Columns: Significance Level.

1-Tailed:     0.10     0.05    0.025     0.01    0.005
2-Tailed:     0.20     0.10     0.05     0.02     0.01
  df
   1         3.078    6.314   12.706   31.821   63.656
   2         1.886    2.920    4.303    6.965    9.925
   3         1.638    2.353    3.182    4.541    5.841
   4         1.533    2.132    2.776    3.747    4.604
   5         1.476    2.015    2.571    3.365    4.032
   6         1.440    1.943    2.447    3.143    3.707
   7         1.415    1.895    2.365    2.998    3.499
   8         1.397    1.860    2.306    2.896    3.355
   9         1.383    1.833    2.262    2.821    3.250
  10         1.372    1.812    2.228    2.764    3.169
  11         1.363    1.796    2.201    2.718    3.106
  12         1.356    1.782    2.179    2.681    3.055
  13         1.350    1.771    2.160    2.650    3.012
  14         1.345    1.761    2.145    2.624    2.977
  15         1.341    1.753    2.131    2.602    2.947
  16         1.337    1.746    2.120    2.583    2.921
  17         1.333    1.740    2.110    2.567    2.898
  18         1.330    1.734    2.101    2.552    2.878
  19         1.328    1.729    2.093    2.539    2.861
  20         1.325    1.725    2.086    2.528    2.845
  21         1.323    1.721    2.080    2.518    2.831
  22         1.321    1.717    2.074    2.508    2.819
  23         1.319    1.714    2.069    2.500    2.807
  24         1.318    1.711    2.064    2.492    2.797
  25         1.316    1.708    2.060    2.485    2.787
  26         1.315    1.706    2.056    2.479    2.779
  27         1.314    1.703    2.052    2.473    2.771
  28         1.313    1.701    2.048    2.467    2.763
  29         1.311    1.699    2.045    2.462    2.756
  30         1.310    1.697    2.042    2.457    2.750
  40         1.303    1.684    2.021    2.423    2.704
  60         1.296    1.671    2.000    2.390    2.660
  90         1.291    1.662    1.987    2.368    2.632
 120         1.289    1.658    1.980    2.358    2.617
   ∞         1.282    1.645    1.960    2.326    2.576

Example: The 1% critical value for a one-tailed test with 25 df is 2.485. The 5% critical value for a two-tailed
test with large (>120) df is 1.960.

Table 2. 1% Critical Values of the F Distribution

Rows: Denominator Degrees of Freedom. Columns: Numerator Degrees of Freedom.

  df      1      2      3      4      5      6      7      8      9     10
  10  10.04   7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
  11   9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
  12   9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
  13   9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
  14   8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
  15   8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
  16   8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
  17   8.40   6.11   5.19   4.67   4.34   4.10   3.93   3.79   3.68   3.59
  18   8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
  19   8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
  20   8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
  21   8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
  22   7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
  23   7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
  24   7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
  25   7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
  26   7.72   5.53   4.64   4.14   3.82   3.59   3.42   3.29   3.18   3.09
  27   7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
  28   7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
  29   7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
  30   7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
  40   7.31   5.18   4.31   3.83   3.51   3.29   3.12   2.99   2.89   2.80
  60   7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82   2.72   2.63
  90   6.93   4.85   4.01   3.53   3.23   3.01   2.84   2.72   2.61   2.52
 120   6.85   4.79   3.95   3.48   3.17   2.96   2.79   2.66   2.56   2.47
   ∞   6.63   4.61   3.78   3.32   3.02   2.80   2.64   2.51   2.41   2.32

Example: The 1% critical value for numerator df = 3 and denominator df = 60 is 4.13.
