Department of Economics

Harvard University

Economics 1123
Fall 2014

Problem Set #1
An Empirical Investigation of the Determinants of Economic Growth
Due: Tuesday September 16, 2014
This problem set considers one of the big questions in economics: What are the determinants of economic growth? One theory
holds that trade spurs growth: an economy that is open to foreign trade and to foreign investment can introduce new industries, so
workers can get better-paying industrial jobs instead of low-paying agricultural jobs. Additionally, the theory of human capital
predicts that countries with a better educated work force will have a higher rate of productivity and therefore have a higher
growth rate. In this problem set, you will quantify some basic relationships between growth, trade, physical capital, human
capital, and political stability. The data set, growthdata.dta, is described at the end of this problem set.
1. Estimate a regression of growth on tradeshr and write out the regression in equation form, with heteroskedasticity-robust
standard errors below the respective regression coefficients. [STATA HINT: use the command, reg growth tradeshr, robust]
a) Explain in words what the coefficient on tradeshr means. Is the numerical value of your estimate large or small in an
economic (real-world) sense?

a. The coefficient on tradeshr (2.306434) means that a one-unit increase in
tradeshr (trade rising by an amount equal to 100% of GDP) is associated
with average annual growth that is higher by 2.306434 percentage points.
The numerical estimate is large in an economic sense because a one-unit
increase in tradeshr is associated with an increase in growth of over an
entire standard deviation.
b) What is the R² of this regression? What does this mean?

b. The R-squared is 0.1237. This tells us that 12.37 percent of the
variation in growth is explained by the variation in tradeshr. The rest is
unexplained by the model.
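c) Compute the correlation coefficient between growth and tradeshr, and compare its square to the R². How are the
correlation coefficient and the R² related?

c. The correlation coefficient is 0.3517. Its square, 0.1237, is the same
as the R-squared value. In a regression with a single regressor, the
R-squared equals the square of the correlation coefficient between the
dependent variable and the regressor.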
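
The relation in part c can be checked by hand from the two numbers quoted in the answers. The following is a minimal Python sketch that uses only those rounded values; the variable names are illustrative and are not part of the data set.

```python
# Values as reported in the answers to parts b and c (rounded in the source).
corr_growth_tradeshr = 0.3517
r_squared_reported = 0.1237

# With a single regressor, R-squared equals the squared sample correlation.
corr_squared = corr_growth_tradeshr ** 2

print(f"squared correlation = {corr_squared:.4f}")
print(f"reported R-squared  = {r_squared_reported:.4f}")
```

Any small gap between the two printed values would come from rounding in the reported figures, not from the relation itself.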


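d) What is the value of the root mean square error (RMSE) of the regression? What does this mean?

d. The RMSE is 1.79. This is the estimated standard deviation of the
regression error: the typical distance between actual growth and the
growth predicted by the regression line is about 1.79 percentage points.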
e) Report the 95% confidence interval for β1, the slope of the population regression line.

e. From the regression of growth on tradeshr, the 95% confidence interval is 0.9809608 to 3.631907.

f) The STATA output provides three ways to test the hypothesis that the slope coefficient is statistically significantly
different from zero at the 5% significance level. State these three ways and the result of the test.

f. The three ways are the t-statistic, the p-value, and the confidence interval. The t-statistic is 3.48, which is greater
than 1.96, so we reject the null hypothesis that the slope is zero at the 5% significance level. The p-value is 0.001,
which is less than 0.05, so we again reject the null hypothesis at the 5% significance level. The 95% confidence interval
does not contain zero, so by this test as well the slope is statistically significantly different from zero at the 5%
significance level.

g) Reestimate the regression using non-robust standard errors. Which standard error is larger: heteroskedasticity-robust or
non-robust? Which is more reliable? Briefly explain.

g. The robust SE is 0.6632868. The nonrobust SE is 0.773485. The nonrobust standard error is larger. The robust standard
error is more reliable because it is valid whether the errors are heteroskedastic or homoskedastic, whereas the nonrobust
standard error is valid only if the errors are homoskedastic.
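
The t-statistic and confidence interval quoted above can be reproduced from the reported slope and robust standard error. The sketch below is a hedged arithmetic check in Python, not a re-estimation: it uses only the numbers given in the answers, and the variable names are illustrative.

```python
# Values as reported in the answers (robust regression of growth on tradeshr).
slope = 2.306434
robust_se = 0.6632868
ci_low, ci_high = 0.9809608, 3.631907

# t-statistic for the null hypothesis that the slope is zero.
t_stat = slope / robust_se

# A symmetric interval is centered on the estimate; its half-width divided by
# the standard error recovers the critical value that was used to build it.
ci_center = (ci_low + ci_high) / 2
implied_critical_value = (ci_high - ci_low) / 2 / robust_se

# Two of the three tests from part f, applied to the reported numbers.
rejects_by_t = abs(t_stat) > 1.96
rejects_by_ci = not (ci_low <= 0 <= ci_high)

print(f"t-statistic            = {t_stat:.2f}")
print(f"CI center              = {ci_center:.6f}")
print(f"implied critical value = {implied_critical_value:.3f}")
print(f"reject H0 (t, CI)      = {rejects_by_t}, {rejects_by_ci}")
```

The implied critical value comes out slightly above 1.96, which is consistent with the interval having been built from a t distribution and not from the normal approximation; that reading is an inference from the reported numbers.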

2. There is an outlier in the data set. Provide a scatterplot of growth vs. tradeshr. [STATA HINT: use the command,
twoway scatter growth tradeshr] Rerun the regression (use heteroskedasticity-robust standard errors) dropping
the outlier. [STATA HINT: use the command, reg growth tradeshr if tradeshr<1.5, robust; think
about why this command works]
a) Does dropping the outlier make a qualitative difference to your results? Explain.
b) What is the outlier observation? Considering the economics of the relation you are investigating, in your judgment
should that outlier be omitted from the regression? (You might need to do a bit of research about that outlier to answer
this question properly.)

a. Yes. With the outlier dropped, the results of the three tests (t-statistic, p-value, and confidence interval) indicate
that the slope is not statistically significant at the 5% level. We fail to reject the null hypothesis that the slope is zero.
b. The outlier is Malta (tradeshr=1.99, growth=6.65). It should be omitted because it is an island nation (with lots of
freight transport) that brings a substantial amount of goods into the country that then immediately leave the country.
No other country in the sample behaves this way, so Malta's trade share is not representative and the observation
should be omitted.

3. Estimate the regression of growth against tradeshr, school60, and oil (excluding the outlier). What is the coefficient on oil?
Explain why you obtained this result.

4. Table 2 presents the results of two regressions, one in each column. The results for regression (1) are already reported;
check that you can produce the regression (1) results in STATA using the commands:
reg growth tradeshr school60 if tradeshr<1.5, robust
display "Adjusted R-squared = " e(r2_a)
test tradeshr school60
(The second command reports the adjusted R², which is stored in memory but by default is not reported when you use
the , robust option.) Now estimate regression (2), in which tradeshr, school60, and capstock60 are regressors, and fill in
the values (including the F-test results). You may either handwrite or type in the entries.

5. Use Table 2 to answer the following questions.
a) Write the regression in column (1) in equation form, with the standard error below the respective regression
coefficient, and include the adjusted R² and sample size.
b) Explain in words what the coefficient on school60 means in regression (1).
c) Economic theory predicts that tradeshr, school60, and capstock60 all are determinants of economic growth. Use
regression (2) to test the hypothesis (at the 5% significance level) that the coefficients on these three economic
variables are all zero, against the alternative that at least one coefficient is nonzero.
d) The neoclassical theory of human capital suggests that countries with more human capital (that is, a better educated
work force) will have a higher rate of productivity and therefore have a higher growth rate. Is this prediction borne
out in the regression results? Explain.
e) Explain why the coefficient on school60 and its standard error are so different in regressions (1) and (2).

DATA DESCRIPTION, FILE: growthdata.dta

The data are a cross-sectional sample of n = 65 non-Communist countries, excluding economies for which oil accounts for at
least half of exports in 1960 (i.e. for which oil = 1).
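
One consequence of this sampling rule is that oil takes the same value for every country in the file. The sketch below illustrates, in Python, why a regressor with no variation cannot have its own coefficient estimated alongside an intercept. The list of zeros is an assumption built from the description above (n = 65, all with oil = 0); it is not read from growthdata.dta.

```python
# Assumption: every sampled country has oil = 0, as the description states.
n_obs = 65
oil = [0] * n_obs

# A slope coefficient is identified from deviations of the regressor around
# its mean. If all deviations are zero, there is nothing to estimate from.
oil_mean = sum(oil) / n_obs
sum_sq_dev = sum((x - oil_mean) ** 2 for x in oil)
has_variation = sum_sq_dev > 0

print(f"mean of oil               = {oil_mean}")
print(f"sum of squared deviations = {sum_sq_dev}")
print(f"oil varies in the sample  = {has_variation}")
```

A constant regressor is perfectly collinear with the intercept, so its coefficient cannot be separated from the intercept's. How a given package reports this is not shown here.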
Table 1
Definitions of Selected Variables in growthdata.dta

Variable      Definition
growth        Average annual percentage growth of real (i.e. inflation-adjusted) per capita (i.e. divided by
              population) Gross Domestic Product (GDP) from 1960 to 1995.
tradeshr      The average share of trade in the economy from 1960 to 1995, measured as the sum of exports
              plus imports, divided by GDP; that is, the average value of (X + M)/GDP from 1960 to 1995,
              where X = exports and M = imports (both X and M are positive).
school60      Average years of schooling in total population in 1960.
capstock60    Capital stock, in thousands of US$, per capita, in 1960 (the capital stock is the value of all fixed
              structures and machines).
revc          Average annual number of revolutions, insurrections (successful or not) and coups d'état in that
              country from 1960 to 1995.
civil         Index of civil liberties, on a scale of 1 (most civil liberties) to 10 (none).
oil           = 1 if oil accounted for at least half of exports in 1960.
              = 0 otherwise.

Economics 1123, Problem Set 1
Table 2
Growth Regression Results
Dependent variable: Growth

Regressor                              (1)         (2)
tradeshare                             1.90
                                       (0.866)
school60                               0.243
                                       (0.076)
capstock60                             __
Intercept                              -0.122
                                       (0.691)

F-statistics testing the hypothesis that the population coefficients on the
indicated regressors are all zero:
tradeshare, school60                   6.40
                                       (.003)
tradeshare, school60, capstock60       __

Regression summary statistics
R²                                     0.160
Adjusted R²                            0.133
Regression RMSE                        1.691
n                                      64

Notes: Heteroskedasticity-robust standard errors are given in parentheses under estimated coefficients, and p-values are given in
parentheses under F- statistics. The F-statistics are heteroskedasticity-robust. The regression results exclude data on the outlier
observation (for which tradeshr > 1.5).
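
The two R2-type entries for regression (1) in Table 2, 0.160 and 0.133, can be related through the usual degrees-of-freedom adjustment. The Python sketch below is a hedged check: it assumes that 0.160 is the R², that 0.133 is the adjusted R², and that regression (1) has two regressors plus an intercept. The variable names are illustrative.

```python
# Values as reported for regression (1) in Table 2 (rounded in the source).
r_squared = 0.160
adj_r_squared_reported = 0.133
n_obs = 64
n_regressors = 2  # tradeshare and school60; the intercept is counted separately

# Adjusted R-squared penalizes R-squared for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)

print(f"computed adjusted R-squared = {adj_r_squared:.4f}")
print(f"reported adjusted R-squared = {adj_r_squared_reported:.3f}")
```

The computed value agrees with the reported one to within the rounding of the three-decimal R², which supports reading the smaller entry as the adjusted R².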
