
Application 4.1
Data on U.S. gasoline consumption for the years 1953 to 2004 are given in Table F2.2. Note that the
consumption data appear as total expenditure. To obtain the per capita quantity variable, divide
GASEXP by (GASP × Pop). The other variables do not need transformation.

a. Compute the multiple regression of per capita consumption of gasoline on per capita
income, the price of gasoline, all of the other prices and a time trend. Report all
results. Do the signs of the estimates agree with your expectations?
. use "D:\JONY\PhD in International Economics\Advanced Econometrics\Lap 1\F2-2.dta", clear

. gen gp=1000000*gasexp/(pop*gasp)
. gen t= year-1952
. reg gp t income gasp pnc puc ppt pd pn ps
Source | SS df MS Number of obs = 52
-------------+------------------------------ F( 9, 42) = 530.82
Model | 56.7083042 9 6.30092268 Prob > F = 0.0000
Residual | .49854905 42 .011870215 R-squared = 0.9913
-------------+------------------------------ Adj R-squared = 0.9894
Total | 57.2068532 51 1.121703 Root MSE = .10895
------------------------------------------------------------------------------
gp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
t | .0725037 .0141828 5.11 0.000 .0438816 .1011257
income | .0002157 .0000518 4.17 0.000 .0001113 .0003202
gasp | -.0110838 .0039781 -2.79 0.008 -.019112 -.0030557
pnc | .0005774 .0128441 0.04 0.964 -.0253432 .0264979
puc | -.0058746 .0048703 -1.21 0.234 -.0157033 .0039541
ppt | .0069073 .0048361 1.43 0.161 -.0028524 .016667
pd | .0012289 .0118818 0.10 0.918 -.0227495 .0252072
pn | .0126905 .012598 1.01 0.320 -.0127333 .0381142
ps | -.0280278 .0079962 -3.51 0.001 -.0441649 -.0118907
_cons | 1.105878 .5693784 1.94 0.059 -.0431744 2.25493
------------------------------------------------------------------------------
From the regression results, most of the coefficient signs agree with our expectations. The
coefficients on income, pnc, ppt, pd and pn, as well as the constant, are positive, while those
on gasp, puc and ps are negative.
The positive coefficient on income indicates that with more income, people can spend more on
gasoline. If pnc increases, consumers will buy fewer new cars. If used cars use more gasoline
than new ones, then the rise in pnc would lead to higher gasoline consumption than otherwise,
as consumers will prefer buying used cars.
The negative sign on puc indicates that a fall in the price of used cars leads many people to
buy used cars, which are less fuel efficient, and hence increases gasoline consumption. A rise
in ppt also increases gasoline expenditure through the substitutability of public
transportation with private cars. Since automobiles are a big part of durables, the positive
sign on pd is consistent with this reasoning.
b. Test the hypothesis that, at least in regard to the demand for gasoline, consumers
do not differentiate between changes in the prices of new and used cars.
The following test examines the hypothesis that, at least in regard to the demand for
gasoline, consumers do not differentiate between changes in the prices of new and used cars:
. test pnc=puc
( 1) pnc - puc = 0
F( 1, 42) = 0.24
Prob > F = 0.6233
From the result we conclude that there is no significant evidence against the null hypothesis
that consumers do not differentiate between changes in the prices of new and used cars.

c. Estimate the own price elasticity of demand, the income elasticity, and the cross
price elasticity with respect to changes in the price of public transportation. Do the
computations at the 2004 point in the data.
Estimates of the own price elasticity of demand, the income elasticity, and the cross price
elasticity with respect to changes in the price of public transportation are as follows:
. mfx, at(52 27113 123.901 133.9 133.3 209.1 114.8 172.2 222.8) eyex
Elasticities after regress
y = Fitted values (predict)
= 6.1726972
------------------------------------------------------------------------------
variable | ey/ex Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
t | .6107851 .12038 5.07 0.000 .374843 .846727 52
income | .9476598 .2263 4.19 0.000 .504127 1.39119 27113
gasp | -.2224796 .08093 -2.75 0.006 -.381102 -.063857 123.901
pnc | .0125245 .2786 0.04 0.964 -.533521 .55857 133.9
puc | -.1268632 .10488 -1.21 0.226 -.332432 .078706 133.3
ppt | .2339837 .16441 1.42 0.155 -.08826 .556228 209.1
pd | .0228545 .22098 0.10 0.918 -.410256 .455965 114.8
pn | .3540265 .35281 1.00 0.316 -.337474 1.04553 172.2
ps | -1.011648 .29332 -3.45 0.001 -1.58654 -.436759 222.8
------------------------------------------------------------------------------

. quietly scalar bgasp=_b[gasp]


. quietly scalar bincome=_b[income]
. quietly scalar bppt=_b[ppt]
. quietly scalar gp2004=gp[52]
. quietly scalar gasp2004=gasp[52]
. quietly scalar income2004=income[52]
. quietly scalar ppt2004=ppt[52]
. scalar ep=bgasp*(gasp2004/gp2004)
. scalar ei=bincome*(income2004/gp2004)
. scalar ce=bppt*(ppt2004/gp2004)
. scalar list ep ei ce
ep= -.22279148
ei= .9489883
ce= .23431171
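In a linear model, the elasticity at a point is e = b·x/y. A quick Python check, using the coefficients and 2004 values from the output above and the fitted value 6.1726972 reported by `mfx` (small discrepancies with the scalar computation reflect rounding of the printed coefficients):

```python
def elasticity(b, x, y):
    """Point elasticity implied by a linear model: e = b * x / y."""
    return b * x / y

y_hat_2004 = 6.1726972            # fitted per capita consumption at the 2004 point

ep = elasticity(-0.0110838, 123.901, y_hat_2004)   # own price, gasp
ei = elasticity(0.0002157, 27113.0, y_hat_2004)    # income
ce = elasticity(0.0069073, 209.1, y_hat_2004)      # cross price, ppt
# ep is about -0.222, ei about 0.947, ce about 0.234, matching the mfx output
```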
d. Re-estimate the regression in logarithms so that the coefficients are direct estimates
of the elasticities. (Do not use the log of the time trend.) How do your estimates
compare with the results in the previous question? Which specification do you
prefer?
Re-estimating the regression in logarithms:
. gen lgp=log(gp)

. gen lincome=log(income)

. gen lgasp=log(gasp)

. gen lpnc=log(pnc)

. gen lpuc=log(puc)

. gen lppt=log(ppt)

. gen lpd=log(pd)

. gen lpn=log(pn)

. gen lps=log(ps)

. reg lgp t lincome lgasp lpnc lpuc lppt lpd lpn lps

Source | SS df MS Number of obs = 52


-------------+------------------------------ F( 9, 42) = 351.33
Model | 2.87044868 9 .318938742 Prob > F = 0.0000
Residual | .038128217 42 .000907815 R-squared = 0.9869
-------------+------------------------------ Adj R-squared = 0.9841
Total | 2.9085769 51 .05703092 Root MSE = .03013

------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
t | .0379721 .0075137 5.05 0.000 .0228088 .0531354
lincome | .9929907 .2503763 3.97 0.000 .4877109 1.49827
lgasp | .0605177 .0540101 1.12 0.269 -.0484792 .1695146
lpnc | -.1547138 .2669637 -0.58 0.565 -.6934683 .3840408
lpuc | -.4890899 .0851996 -5.74 0.000 -.6610297 -.31715
lppt | .0192726 .136449 0.14 0.888 -.2560926 .2946378
lpd | 1.732055 .2598871 6.66 0.000 1.207582 2.256529
lpn | -.7295301 .2650689 -2.75 0.009 -1.264461 -.1945995
lps | -.8679929 .3529119 -2.46 0.018 -1.580198 -.1557878
_cons | -7.287192 2.520568 -2.89 0.006 -12.37391 -2.200479
------------------------------------------------------------------------------
The above estimates differ from the regression results without logarithms. First, the
Analysis of Variance (ANOVA) results differ: the model Sum of Squares (SS) is 2.8704 for
the logarithmic regression versus 56.7083 for the previous regression, and the residual SS
is 0.0381 versus 0.4985. Both R2 and adjusted R2 differ, but not by much: 0.9869 and
0.9841 for the logarithmic model versus 0.9913 and 0.9894 for the previous regression.
There are also differences in the coefficient signs. In the first regression income, pnc,
ppt, pd, pn and the constant were positive while gasp, puc and ps were negative. In this
regression the coefficients on income, gasp, ppt and pd are positive, while those on pnc,
puc, pn, ps and the constant are negative. I prefer the logarithmic specification because
its coefficients are direct estimates of the elasticities and it captures most of the
expected signs.
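That the slope in a log-log regression is a direct elasticity estimate can be verified on synthetic constant-elasticity data (a sketch; the numbers are made up, not the gasoline series):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
price = np.exp(rng.normal(size=n))                      # synthetic price series
true_elasticity = -0.8
# Constant-elasticity demand q = A * p^e with multiplicative noise
q = 2.0 * price**true_elasticity * np.exp(0.05 * rng.normal(size=n))

# Regressing log q on log p recovers e directly, with no point of evaluation
X = np.column_stack([np.ones(n), np.log(price)])
b = np.linalg.lstsq(X, np.log(q), rcond=None)[0]
# b[1] should be close to -0.8
```

This is why the log specification needs no `mfx`-style evaluation point, unlike part c.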

e. Compute the simple correlations of the price variables. Would you conclude that
multicollinearity is a “problem” for the regression in part a or part d?
Computing the simple correlations of the price variables:
. cor pnc puc ppt pd pn ps
(obs=52)
| pnc puc ppt pd pn ps
-------------+------------------------------------------------------
pnc | 1.0000
puc | 0.9939 1.0000
ppt | 0.9807 0.9824 1.0000
pd | 0.9933 0.9878 0.9585 1.0000
pn | 0.9885 0.9822 0.9899 0.9773 1.0000
ps | 0.9785 0.9769 0.9975 0.9563 0.9936 1.0000

. cor lgasp lpnc lpuc lppt lpd lpn lps


(obs=52)
| lgasp lpnc lpuc lppt lpd lpn lps
-------------+---------------------------------------------------------------
lgasp | 1.0000
lpnc | 0.9667 1.0000
lpuc | 0.9674 0.9940 1.0000
lppt | 0.9665 0.9891 0.9910 1.0000
lpd | 0.9776 0.9932 0.9945 0.9864 1.0000
lpn | 0.9839 0.9900 0.9902 0.9942 0.9923 1.0000
lps | 0.9742 0.9902 0.9912 0.9985 0.9886 0.9979 1.0000

All of the simple correlations among the price variables exceed 0.95, in both levels and
logs. Multicollinearity is therefore severe in the regressions of both part a and part d:
it inflates the standard errors of the individual price coefficients even though the
overall fit is very high.

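Near-unit correlations like these can be summarized with the condition number of the standardized regressor matrix, a standard multicollinearity diagnostic. A sketch on synthetic trending "price indices" (illustrative data, not Table F2.2):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 52
trend = np.arange(n, dtype=float)
# Four synthetic indices that all follow the same trend, like the prices above
prices = np.column_stack([trend + rng.normal(scale=2.0, size=n) for _ in range(4)])

corr = np.corrcoef(prices, rowvar=False)        # off-diagonal entries near 1

# Condition number of the standardized columns: large values signal trouble
Z = (prices - prices.mean(axis=0)) / prices.std(axis=0)
cond = np.linalg.cond(Z)
```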
f. Notice that the price index for gasoline is normalized to 100 in 2000, whereas the
other price indices are anchored at 1983 (roughly). If you were to renormalize the
indices so that they were all 100.00 in 2004, then how would the results of the
regression in part a change? How would the results of the regression in part d
change?
In the linear case, renormalizing a price index multiplies the variable by a constant k;
the corresponding coefficient is simply divided by k, so the product x×b, and hence the
fit and the fitted values, would be unchanged.
In the log-linear case, since log(k×x) = log(k) + log(x), the normalization would simply
shift the constant term. The price coefficients would be unchanged.
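Both claims are easy to verify numerically (a sketch on synthetic data; the names and values are illustrative): rescaling a regressor by k divides its slope by k in levels, and in logs only shifts the intercept by slope×log(k).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 52
x = 50.0 + 10.0 * rng.random(size=n)           # a synthetic price index
y = 1.0 + 0.3 * x + rng.normal(scale=0.5, size=n)

def fit(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

k = 100.0 / x[-1]                              # renormalize so the last value is 100

# Levels: the slope scales by 1/k and the intercept (and fit) are unchanged
b = fit(np.column_stack([np.ones(n), x]), y)
b_k = fit(np.column_stack([np.ones(n), k * x]), y)

# Logs: the slope is unchanged; the intercept shifts by -slope * log(k)
b_log = fit(np.column_stack([np.ones(n), np.log(x)]), np.log(y))
b_log_k = fit(np.column_stack([np.ones(n), np.log(k * x)]), np.log(y))
```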
g. This exercise is based on the model that you estimated in part d. We are interested
in investigating the change in the gasoline market that occurred in 1973. First,
compute the average values of log of per capita gasoline consumption in the years
1953-1973 and 1974-2004 and report the values and the difference. If we divide the
sample into these two groups of observations, then we can decompose the change in
the expected value of the log of consumption into a change attributable to change in
the regressors and a change attributable to a change in the model coefficients as
shown in the section 4.5.3 using the Oaxaca-Blinder approach described there,
compute the decomposition by partitioning the sample and computing separate
regressions. Using your results, compute a confidence interval for the part of the
change that can be attributed to structural change in the market, that is, change in
the regression coefficients.
. mean lgp if year<1974
Mean estimation Number of obs = 21

--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
lgp | 1.334769 .04365 1.243717 1.425822
--------------------------------------------------------------

. scalar ybar0=r(b)
. mean lgp if year>1973

Mean estimation Number of obs = 31


--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
lgp | 1.730146 .012755 1.704097 1.756195
--------------------------------------------------------------

. scalar ybar1=r(b)

. gen cons=1
. mean t lincome lgasp lpnc lpuc lppt lpd lpn lps cons if year<1974

Mean estimation Number of obs = 21

--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
t | 11 1.354006 8.175592 13.82441
lincome | 9.307999 .0382598 9.22819 9.387807
lgasp | 2.973497 .0218159 2.92799 3.019005
lpnc | 3.919241 .0123054 3.893572 3.944909
lpuc | 3.319498 .0322032 3.252324 3.386673
lppt | 3.220735 .0564034 3.10308 3.338391
lpd | 3.682407 .0184103 3.644004 3.72081
lpn | 3.539391 .0300972 3.476609 3.602173
lps | 3.276916 .0479733 3.176846 3.376987
cons | 1 0 . .
--------------------------------------------------------------
. matrix x0=e(b)

. mean t lincome lgasp lpnc lpuc lppt lpd lpn lps cons if year>1973
Mean estimation Number of obs = 31
--------------------------------------------------------------
| Mean Std. Err. [95% Conf. Interval]
-------------+------------------------------------------------
t | 37 1.632993 33.66498 40.33502
lincome | 9.918829 .031305 9.854896 9.982762
lgasp | 4.2413 .0585685 4.121687 4.360913
lpnc | 4.692742 .0483805 4.593936 4.791548
lpuc | 4.637867 .0770827 4.480443 4.795291
lppt | 4.765984 .0959479 4.570032 4.961936
lpd | 4.616158 .0479135 4.518305 4.71401
lpn | 4.709391 .0602159 4.586413 4.832368
lps | 4.783979 .0866703 4.606975 4.960984
cons | 1 0 . .
--------------------------------------------------------------
. matrix x1=e(b)

. regress lgp t lincome lgasp lpnc lpuc lppt lpd lpn lps if year<1974
Source | SS df MS Number of obs = 21
-------------+------------------------------ F( 9, 11) = 584.71
Model | .798567151 9 .088729683 Prob > F = 0.0000
Residual | .001669259 11 .000151751 R-squared = 0.9979
-------------+------------------------------ Adj R-squared = 0.9962
Total | .800236411 20 .040011821 Root MSE = .01232

------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
t | .0090829 .0184495 0.49 0.632 -.0315243 .0496901
lincome | .6648792 .2234255 2.98 0.013 .1731229 1.156636
lgasp | -.202044 .4191071 -0.48 0.639 -1.124492 .7204044
lpnc | .5912882 .3024403 1.96 0.076 -.0743785 1.256955
lpuc | -.294078 .1420679 -2.07 0.063 -.6067674 .0186114
lppt | -.3584459 .3104284 -1.15 0.273 -1.041694 .3248024
lpd | -.1022751 1.072572 -0.10 0.926 -2.462991 2.258441
lpn | -.0383662 .5780072 -0.07 0.948 -1.310551 1.233819
lps | .7541618 .7414295 1.02 0.331 -.8777136 2.386037
_cons | -6.498724 1.860316 -3.49 0.005 -10.59325 -2.404197
------------------------------------------------------------------------------
. matrix b0=e(b)
. matrix var0=e(v)
. regress lgp t lincome lgasp lpnc lpuc lppt lpd lpn lps if year>1973

Source | SS df MS Number of obs = 31


-------------+------------------------------ F( 9, 21) = 104.38
Model | .147993657 9 .01644374 Prob > F = 0.0000
Residual | .00330819 21 .000157533 R-squared = 0.9781
-------------+------------------------------ Adj R-squared = 0.9688
Total | .151301846 30 .005043395 Root MSE = .01255
------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
t | .0007693 .0052182 0.15 0.884 -.0100827 .0116212
lincome | .5181589 .1498427 3.46 0.002 .2065439 .8297739
lgasp | -.0770111 .0501662 -1.54 0.140 -.1813374 .0273152
lpnc | .6158313 .2687583 2.29 0.032 .0569178 1.174745
lpuc | .2402007 .0938617 2.56 0.018 .0450045 .4353969
lppt | -.1616701 .0748211 -2.16 0.042 -.3172691 -.0060711
lpd | -.6564543 .3175261 -2.07 0.051 -1.316786 .0038775
lpn | .2370631 .2603476 0.91 0.373 -.3043593 .7784855
lps | -.2148074 .1814829 -1.18 0.250 -.5922217 .1626069
_cons | -3.40315 1.449386 -2.35 0.029 -6.417314 -.3889868
------------------------------------------------------------------------------
. matrix b1=e(b)

. matrix var1=e(v)

. matrix dy_dx=b1*x1'-b1*x0'

. matrix dy_db=b1*x0'-b0*x0'

. matrix vtotal=var0+var1

. matrix vdb=x0*vtotal*x0'

. scalar dy_dxs=dy_dx[1,1]

. scalar dy_dbs=dy_db[1,1]

. scalar vdbs=vdb[1,1]

. display "dybar=" ybar1-ybar0


dybar=.39537652

. display "dy_dx=" dy_dxs


dy_dx=.12274279

. display "dy_db=" dy_dbs


dy_db=.27263373

. display "the lower limit is " dy_dbs-1.96*sqrt(vdbs) " and the upper limit is " dy_dbs+1.96*sqrt(vdbs)
the lower limit is .1848467 and the upper limit is .3604207

Therefore,
DYBAR= .395377
DY_DX=.12274279
DY_DB=.27263373
LOWER=.18484676
UPPER=.3604207
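As a check on the arithmetic, the two pieces of the Oaxaca-Blinder decomposition must sum to the difference in subperiod means, because each OLS fit passes through its own sample means (ȳg = bg′x̄g). The sketch below reuses the reported scalars; the standard error is the one implied by the reported interval, not recomputed from the data:

```python
# Values reported in the Stata output above
ybar0, ybar1 = 1.334769, 1.730146      # subperiod means of lgp
dy_dx = 0.12274279                     # b1'(x1 - x0): change due to regressors
dy_db = 0.27263373                     # (b1 - b0)'x0: change due to coefficients

dybar = ybar1 - ybar0                  # total change, about 0.3954
# Identity check: dy_dx + dy_db equals dybar up to rounding

se_db = 0.0447893                      # sqrt(x0'(V0 + V1)x0), implied by the output
lower = dy_db - 1.96 * se_db           # about 0.1848
upper = dy_db + 1.96 * se_db           # about 0.3604
```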
Application 5.3
a. Testing the hypothesis that the three aggregate price indices are not significant
determinants of the demand for gasoline:
. reg lgp t lincome lgasp lpnc lpuc lppt lpd lpn lps

Source | SS df MS Number of obs = 52


-------------+------------------------------ F( 9, 42) = 351.33
Model | 2.87044868 9 .318938742 Prob > F = 0.0000
Residual | .038128217 42 .000907815 R-squared = 0.9869
-------------+------------------------------ Adj R-squared = 0.9841
Total | 2.9085769 51 .05703092 Root MSE = .03013

------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
t | .0379721 .0075137 5.05 0.000 .0228088 .0531354
lincome | .9929907 .2503763 3.97 0.000 .4877109 1.49827
lgasp | .0605177 .0540101 1.12 0.269 -.0484792 .1695146
lpnc | -.1547138 .2669637 -0.58 0.565 -.6934683 .3840408
lpuc | -.4890899 .0851996 -5.74 0.000 -.6610297 -.31715
lppt | .0192726 .136449 0.14 0.888 -.2560926 .2946378
lpd | 1.732055 .2598871 6.66 0.000 1.207582 2.256529
lpn | -.7295301 .2650689 -2.75 0.009 -1.264461 -.1945995
lps | -.8679929 .3529119 -2.46 0.018 -1.580198 -.1557878
_cons | -7.287192 2.520568 -2.89 0.006 -12.37391 -2.200479
------------------------------------------------------------------------------

. test lpd lpn lps


( 1) lpd = 0
( 2) lpn = 0
( 3) lps = 0
F( 3, 42) = 23.25
Prob > F = 0.0000
. *The F statistic is 23.25. The 5% critical value from the F(3, 42) table is 2.83,
so we reject the hypothesis.

b. It would be quite cumbersome to estimate and examine the loss of fit because the
restricted model is quite nonlinear, so we test the restrictions using the
unrestricted model. For this problem, let

f = [γnc − γuc,  γnc·δs − γpt·δd]′.

Differentiating with respect to the entire parameter vector θ = (θt, βy, γg, γnc, γuc, γpt, δd, δn, δs, α)′ gives

G = [∂f1/∂θ′]   [0  0  0  1   −1   0    0    0  0   0]
    [∂f2/∂θ′] = [0  0  0  δs   0  −δd  −γpt  0  γnc 0]

The parameter estimates are thus f = [-.17399, .10091]′. The covariance matrix to use for the
tests is G s2(X′X)-1 G′.
The statistic for the joint test is as follows:
χ2 = f′[G s2(X′X)-1 G′]-1 f = .4772
This is less than the 5% critical value for a chi-squared with two degrees of freedom
(5.99), so we would not reject the joint hypothesis. For the individual hypotheses, we
need only compute the equivalent of a t ratio for each element of f. Thus,

z1 = -.6053 and z2 = .2898


Neither is large, so neither hypothesis would be rejected. Given the earlier result, this was to
be expected.
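The delta-method computation behind part b can be sketched generically: given the restriction values f, the Jacobian G, and a coefficient covariance V, the Wald statistic is f′[GVG′]⁻¹f. The numbers below are made up for illustration, not the gasoline estimates:

```python
import numpy as np

def wald_nonlinear(f, G, V):
    """Delta-method Wald statistic for H0: f(theta) = 0.

    f: restriction values at the estimates, shape (J,)
    G: Jacobian df/dtheta' at the estimates, shape (J, K)
    V: estimated covariance of theta-hat, shape (K, K)
    """
    W = G @ V @ G.T                          # delta-method covariance of f
    return float(f @ np.linalg.inv(W) @ f)   # chi-squared, J degrees of freedom

# Illustrative estimates for (g_nc, g_uc, g_pt, d_d):
theta = np.array([0.6, -0.5, 0.2, 1.7])
V = 0.04 * np.eye(4)

# H0: g_nc = g_uc and g_nc * d_d = g_pt * g_uc (a made-up pair of restrictions)
f = np.array([theta[0] - theta[1],
              theta[0] * theta[3] - theta[2] * theta[1]])
G = np.array([[1.0, -1.0, 0.0, 0.0],
              [theta[3], -theta[2], -theta[1], theta[0]]])

chi2 = wald_nonlinear(f, G, V)
# Compare chi2 with the chi-squared(2) critical value (5.99 at 5%)
```

This is the same statistic Stata's `testnl` reports, except that `testnl` divides by the number of restrictions and uses an F reference distribution.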

c. Testing hypothesis separately and jointly:

. reg lgp t lincome lgasp lpnc lpuc lppt lpd lpn lps

Source | SS df MS Number of obs = 52


-------------+------------------------------ F( 9, 42) = 351.33
Model | 2.87044868 9 .318938742 Prob > F = 0.0000
Residual | .038128217 42 .000907815 R-squared = 0.9869
-------------+------------------------------ Adj R-squared = 0.9841
Total | 2.9085769 51 .05703092 Root MSE = .03013

------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
t | .0379721 .0075137 5.05 0.000 .0228088 .0531354
lincome | .9929907 .2503763 3.97 0.000 .4877109 1.49827
lgasp | .0605177 .0540101 1.12 0.269 -.0484792 .1695146
lpnc | -.1547138 .2669637 -0.58 0.565 -.6934683 .3840408
lpuc | -.4890899 .0851996 -5.74 0.000 -.6610297 -.31715
lppt | .0192726 .136449 0.14 0.888 -.2560926 .2946378
lpd | 1.732055 .2598871 6.66 0.000 1.207582 2.256529
lpn | -.7295301 .2650689 -2.75 0.009 -1.264461 -.1945995
lps | -.8679929 .3529119 -2.46 0.018 -1.580198 -.1557878
_cons | -7.287192 2.520568 -2.89 0.006 -12.37391 -2.200479
------------------------------------------------------------------------------

. *test restriction (1) separately; you can use either of the following two commands
. lincom lpnc - lpuc

( 1)  lpnc - lpuc = 0

------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | .3343761 .2874485 1.16 0.251 -.2457184 .9144706
------------------------------------------------------------------------------
. test lpnc = lpuc
(1) lpnc - lpuc = 0

F (1, 42) = 1.35


Prob > F = 0.2513
. *test restriction (2) separately; you can use either of the following two commands
. nlcom _b[lpnc]*_b[lps] - _b[lppt]*_b[lpd]

_nl_1: _b[lpnc]*_b[lps] - _b[lppt]*_b[lpd]

------------------------------------------------------------------------------
lgp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_nl_1| .1009092 .3482736 0.29 0.773 -.6019353 .8037538
------------------------------------------------------------------------------

. testnl _b[lpnc]*_b[lps] = _b[lppt]*_b[lpd]

(1) _b[lpnc]*_b[lps] = _b[lppt]*_b[lpd]

F(1, 42) = 0.08


Prob > F = 0.7734

. *test the restriction jointly


. testnl (_b[lpnc] = _b[lpuc]) (_b[lpnc]*_b[lps] = _b[lppt]*_b[lpd])

(1) _b[lpnc] = _b[lpuc]

(2) _b[lpnc]*_b[lps] = _b[lppt]*_b[lpd]

F(2, 42) = 2.81


Prob > F= 0.0714
