Download the two datasets jobtrain.dta and mlda.dta from the Moodle e-learning platform
(folder: Exercise 6) to your computer, and save them on the X: drive.
Open Stata, go to the do-file editor and set up a new do-file called econometrics ex6.
Difference-in-difference regression
1. Load the data jobtrain.dta. The dataset contains information on real earnings of workers in
the years 1974 and 1976. Some of the workers have participated in a job training program in
1975 (participation in the training program is indicated by the dummy variable train). We are
interested in the effect of the job training program on real earnings and want to measure it using
a difference-in-difference approach.
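As a starting point, the setup and loading steps might look like the following do-file sketch. The working directory is an assumption (the files are taken to sit directly on the X: drive), so adjust the path to wherever the datasets were actually saved.

```stata
* Sketch of a do-file header for Exercise 6 (the path is an assumption)
clear all
set more off
cd "X:/"                    // folder where the datasets were saved
use "jobtrain.dta", clear   // load the job training data
describe                    // check the actual variable names and labels
summarize                   // quick look at the ranges of the variables
```

Running describe first is useful here because the regression commands depend on the exact variable names stored in the dataset.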
2. Consider the following regression (year76t is a dummy equal to one in year 1976 and zero otherwise):
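$$\text{realearnings}_{it} = \beta_0 + \beta_1\,\text{year76}_t + \beta_2\,\text{train}_i + \beta_3\,[\text{train}_i \times \text{year76}_t] + \varepsilon_{it} \qquad (1)$$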
Which coefficient reflects the difference-in-difference effect? How do you interpret it?
3. Use the regression above to estimate the effect of the job training program on real earnings. Interpret
the effect of the training program quantitatively and comment on its statistical significance.
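One way to estimate the regression in Stata is sketched below. It assumes the variables in jobtrain.dta are named realearnings, train and year76; these names are taken from the equation, not from the dataset, so check them with describe. The use of robust standard errors is also a choice made for this sketch, not something the exercise prescribes.

```stata
* Sketch only: variable names realearnings, train, year76 are assumed
use "jobtrain.dta", clear

* i.train##i.year76 expands to both main effects plus their interaction
regress realearnings i.train##i.year76, vce(robust)
```

With factor-variable notation, the interaction appears in the output as 1.train#1.year76, next to the two main effects.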
4. Use the commands margins and marginsplot to compare real earnings in both groups before
and after treatment.
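A minimal sketch of this step follows, again assuming the variable names realearnings, train and year76. The regression is repeated because margins works on the most recent estimation results, which need to come from a factor-variable specification.

```stata
* Sketch only: variable names realearnings, train, year76 are assumed
regress realearnings i.train##i.year76, vce(robust)

* Predicted mean earnings for each group-by-year cell
margins train#year76

* Plot the four cell means with the year on the horizontal axis
marginsplot, xdimension(year76)
```

The plot then shows one line per group, so the change from 1974 to 1976 can be compared across trained and untrained workers.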
(3)
Interpret the estimates and explain whether they can be given a causal interpretation. Is there
omitted variable bias?
4. Plot the estimation results of the relationship between the running variable and the outcome and
interpret the graph.
5. Explain why RD tools do not necessarily produce reliable causal estimates when nonlinear trends
are neglected.
6. Modify the simple regression above by (a) including a quadratic term in the running variable:
$$M_a = \alpha + \rho D_a + \gamma_1 (a - a_0) + \gamma_2 (a - a_0)^2 + \varepsilon_a, \qquad (4)$$
(5)
Graph all three models and explain which one has the best fit. Is there a formal test for model
fit? What is the difference between specifications (a) and (b)?
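The quadratic specification (a) could be set up along the following lines. Everything dataset-specific here is an assumption: the names mrate (outcome) and age (running variable) are hypothetical placeholders, and the cutoff of 21 is inferred from the file name mlda.dta and is not stated in the exercise.

```stata
* Sketch only: mrate and age are hypothetical names; cutoff 21 is assumed
use "mlda.dta", clear

generate agec  = age - 21                       // running variable centered at the cutoff
generate D     = (age >= 21) if !missing(age)   // treatment dummy
generate agec2 = agec^2                         // quadratic term

regress mrate D agec agec2, vce(robust)

* Fitted values, plotted separately on each side of the cutoff
predict fit_quad, xb
twoway (scatter mrate age)                      ///
       (line fit_quad age if D == 0, sort)      ///
       (line fit_quad age if D == 1, sort),     ///
       xline(21) legend(off)
```

Centering the running variable at the cutoff means the coefficient on D can be read directly as the estimated jump at the threshold.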