A Brief Review of Panel Data Analysis1)

2)

Contents
1. Introduction to Panel Data

2. Static Linear Regression Models

3. Dynamic Linear Models

4. Nonlinear Models : Limited Dependent Variables Models

5. KEEP

1) , ,
, .
2) , shkang@sungshin.ac.kr

6th KEEP Conference

1. Introduction to Panel Data


1.1 Properties of Panel Data
Data Structure
- $N$ individuals, $i = 1, \dots, N$
- $T$ time periods, $t = 1, \dots, T$, for each $i$
Types of Data
- large $N$ and small $T$ (most labor/micro-social-economic data)
- small $N$ and large $T$ (macroeconomic data on the G7)
- both $N$ and $T$ large (farm production)
Balanced vs. Unbalanced Data
- balanced : for any $i$, there are $T$ observations
- unbalanced : $T$ may differ over $i$ ($T_i$)
Unbalanced data can be used for linear regression models, but they impose some
limitations on the analysis of nonlinear models such as probit or logit.
Benefits of Panel Data
- Can control for unobservable individual heterogeneity (=> ignoring it is an
incorrect specification that leads to biased and inconsistent estimators)
- Rich information about cross-sectional variations and dynamics
- Can avoid problems in time series data, e.g. multicollinearity, aggregation
bias and nonstationarity
- Can identify individual and time effects which can not be identified by pure
cross-sectional or time series data
Limitations of Panel Data
- Large parts of panel data are unbalanced
- Panel attrition
- Measurement errors
- Most existing estimation techniques are for panel data with a short time horizon

1.2 Analysis of Covariance


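Initial Descriptive Formulation

$$y_{it} = \alpha_{it} + \beta_{it}' x_{it} + u_{it},$$

where $i = 1, \dots, N$, $t = 1, \dots, T$, and $x_{it}$ is a $K \times 1$ vector of regressors.
=> With $NT$ degrees of freedom (observations), there are $NT(K+1)$ parameters to be estimated.
Workable Formulation
The most general formulation is (nonidentical intercepts, nonidentical slopes)

<1-1>  $y_{it} = \alpha_i + \beta_i' x_{it} + u_{it}$

$H_1$ : Regression slope coefficients are identical, and the intercepts are not.

<1-2>  $y_{it} = \alpha_i + \beta' x_{it} + u_{it}$

$H_2$ : Regression intercepts are the same, and the slope coefficients are not.

$y_{it} = \alpha + \beta_i' x_{it} + u_{it}$  (=> This is seldom meaningful)

$H_3$ : Both slopes and intercepts are the same (=> pooled regression)

<1-3>  $y_{it} = \alpha + \beta' x_{it} + u_{it}$

Re-formulation of $H_1$ : <1-1> subject to $(N-1)K$ linear restrictions.
$H_1 : \beta_1 = \beta_2 = \dots = \beta_N$
Re-formulation of $H_3$ : <1-1> subject to $(N-1)(K+1)$ linear restrictions.
$H_3 : \alpha_1 = \alpha_2 = \dots = \alpha_N,\ \beta_1 = \beta_2 = \dots = \beta_N$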
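Tests of Hypotheses
Applying the OLS procedure to <1-1> (separate intercepts and slopes for each individual), we get

$$\hat\beta_i = W_{xx,i}^{-1} W_{xy,i}, \qquad \hat\alpha_i = \bar y_i - \hat\beta_i' \bar x_i,$$

where

$$W_{xx,i} = \sum_{t=1}^{T} (x_{it} - \bar x_i)(x_{it} - \bar x_i)', \quad W_{xy,i} = \sum_{t=1}^{T} (x_{it} - \bar x_i)(y_{it} - \bar y_i), \quad W_{yy,i} = \sum_{t=1}^{T} (y_{it} - \bar y_i)^2,$$

with $\bar y_i = \frac{1}{T}\sum_t y_{it}$ and $\bar x_i = \frac{1}{T}\sum_t x_{it}$.
Applying the OLS procedure to <1-2> (individual intercepts, common slopes), we get

$$\hat\beta_w = W_{xx}^{-1} W_{xy}, \qquad \hat\alpha_i = \bar y_i - \hat\beta_w' \bar x_i,$$

where $W_{xx} = \sum_i W_{xx,i}$, $W_{xy} = \sum_i W_{xy,i}$ and $W_{yy} = \sum_i W_{yy,i}$.
Applying the OLS procedure to <1-3> (common intercept and slopes), we get

$$\hat\beta = T_{xx}^{-1} T_{xy}, \qquad \hat\alpha = \bar y - \hat\beta' \bar x,$$

where $T_{xx} = \sum_i \sum_t (x_{it} - \bar x)(x_{it} - \bar x)'$, $T_{xy} = \sum_i \sum_t (x_{it} - \bar x)(y_{it} - \bar y)$, $T_{yy} = \sum_i \sum_t (y_{it} - \bar y)^2$, and $\bar y$, $\bar x$ are the overall means.
To test $H_3$ (common intercepts and slopes), use the following F-statistic with $((N-1)(K+1),\ NT - N(K+1))$ degrees of freedom under the assumption of normality.

$$F_3 = \frac{(S_3 - S_1)/[(N-1)(K+1)]}{S_1/[NT - N(K+1)]}$$

where

$$S_1 = \sum_{i=1}^{N} \left( W_{yy,i} - W_{xy,i}' W_{xx,i}^{-1} W_{xy,i} \right)$$ (=> sum of the residual sums of squares from the individual regressions <1-1>)

$$S_3 = T_{yy} - T_{xy}' T_{xx}^{-1} T_{xy}$$ (=> residual sum of squares from <1-3>)

If $H_3$ is accepted, we pool the data and estimate the single equation <1-3>. If $H_3$ is rejected, then go on to test $H_1$ (common slopes, different intercepts).
To test $H_1$, use the following F-statistic with $((N-1)K,\ NT - N(K+1))$ degrees of freedom.

$$F_1 = \frac{(S_2 - S_1)/[(N-1)K]}{S_1/[NT - N(K+1)]}$$

where

$$S_2 = W_{yy} - W_{xy}' W_{xx}^{-1} W_{xy}$$ (=> the residual sum of squares from <1-2>).

If $H_1$ is rejected, use <1-1>. If $H_1$ is accepted, use <1-2>.

Note : This procedure can also be applied to time-specific effect models.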
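
The two F-tests above share one computation: the scaled difference between a restricted and an unrestricted residual sum of squares. The sketch below illustrates it in Python; the function name `pooling_f_stat` and all numerical values are hypothetical, and the residual sums of squares (from <1-1>, <1-2> and <1-3>) are assumed to have been computed beforehand.

```python
def pooling_f_stat(s_restricted, s_unrestricted, n_restrictions, df_unrestricted):
    """F statistic comparing a restricted and an unrestricted residual sum of squares."""
    numerator = (s_restricted - s_unrestricted) / n_restrictions
    denominator = s_unrestricted / df_unrestricted
    return numerator / denominator

# Hypothetical illustration: N = 10 individuals, T = 20 periods, K = 2 regressors.
N, T, K = 10, 20, 2
S1, S2, S3 = 150.0, 160.0, 240.0      # made-up residual sums of squares
df1 = N * T - N * (K + 1)             # degrees of freedom of the unrestricted model <1-1>

# Test of common intercepts and slopes: (N-1)(K+1) restrictions.
F3 = pooling_f_stat(S3, S1, (N - 1) * (K + 1), df1)
# Test of common slopes only: (N-1)K restrictions.
F1 = pooling_f_stat(S2, S1, (N - 1) * K, df1)
print(F3, F1)
```

Each statistic would then be compared with the critical value of the F distribution with the corresponding degrees of freedom.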


2. Static Linear Regression Models


2.1 Brief Introduction3)
The fundamental advantage of a panel data set over a cross section is that it
allows the researcher great flexibility in modelling differences in behavior
across individuals. The basic framework is

$$y_{it} = x_{it}'\beta + z_i'\alpha + \varepsilon_{it}.$$

There are $K$ regressors in $x_{it}$, not including a constant term. The
heterogeneity, or individual effect, is $z_i'\alpha$, where $z_i$ contains a constant term and
a set of individual variables which may be observed, such as race, sex, and location,
or unobserved, such as family-specific characteristics or individual heterogeneity
in skill or preferences, all of which are taken to be constant over time.
Case 1 : Pooled Regression
If $z_i$ contains only a constant term, then OLS provides consistent and efficient
estimates of the common $\alpha$ and the slope vector $\beta$.
Case 2 : Fixed Effects
If $z_i$ is unobserved, but correlated with $x_{it}$, then the least squares estimator of
$\beta$ will be biased and inconsistent as a consequence of an omitted variable.
However, in this instance, the model

$$y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it},$$

where $\alpha_i = z_i'\alpha$ embodies all the observable effects, specifies an estimable
conditional mean. This fixed effects approach takes $\alpha_i$ to be an individual-specific
constant term in the regression model.
Case 3 : Random Effects
If the unobserved individual heterogeneity, however formulated, can be assumed to
be uncorrelated with the included variables, then the model may be formulated as

$$y_{it} = x_{it}'\beta + E[z_i'\alpha] + \{z_i'\alpha - E[z_i'\alpha]\} + \varepsilon_{it} = x_{it}'\beta + \alpha + u_i + \varepsilon_{it},$$

that is, as a linear regression model with a compound disturbance that may be
consistently, albeit inefficiently, estimated by least squares. This random effects
approach specifies that $u_i$ is an individual-specific random element, similar to
$\varepsilon_{it}$ except that for each individual, there is but a single draw that enters the
regression identically in each period.
3) Greene(2003) p.285

2.2 Fixed Effects Model
Formulation
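The fixed effects model treats $\alpha_i$ as fixed.
<2-1>  $y_{it} = \alpha_i + \beta' x_{it} + u_{it}$,  $i = 1, \dots, N$,  $t = 1, \dots, T$
This fixed effects (FE) model is also called the least squares dummy-variable (LSDV)
model and the analysis of covariance model.
Estimation
Equation <2-1> can be rewritten in vector form.

$$y_i = e\,\alpha_i + X_i \beta + u_i, \qquad i = 1, \dots, N,$$

where

$y_i = (y_{i1}, \dots, y_{iT})'$,
$X_i = (x_{i1}, \dots, x_{iT})'$ is $T \times K$,
$e = (1, \dots, 1)'$ is $T \times 1$,
$u_i = (u_{i1}, \dots, u_{iT})'$,

$E(u_i) = 0$, $E(u_i u_i') = \sigma_u^2 I_T$, $E(u_i u_j') = 0$ if $i \neq j$.

By minimizing the residual sum of squares $\sum_i u_i' u_i$, we get the following LS estimators.

$$\hat\beta_{CV} = \left[ \sum_{i=1}^{N} \sum_{t=1}^{T} (x_{it} - \bar x_i)(x_{it} - \bar x_i)' \right]^{-1} \left[ \sum_{i=1}^{N} \sum_{t=1}^{T} (x_{it} - \bar x_i)(y_{it} - \bar y_i) \right], \qquad \hat\alpha_i = \bar y_i - \hat\beta_{CV}' \bar x_i$$

$\hat\beta_{CV}$ is called the "Within Estimator", the "Covariance Estimator (CV)", the
"Fixed Effect (FE) Estimator" and the "Least Squares Dummy Variable (LSDV) Estimator". Note that
$\hat\beta_{CV}$ can also be re-written as

$$\hat\beta_{CV} = \left[ \sum_{i=1}^{N} X_i' Q X_i \right]^{-1} \left[ \sum_{i=1}^{N} X_i' Q y_i \right],$$

where $Q = I_T - \frac{1}{T} e e'$.
Properties of CV estimator
In fact, $\hat\beta_{CV}$ is BLUE ($\hat\beta_{CV}$ is an OLS estimator).
Remarks
1. Individual effects are differenced away.
2. The coefficients of time-invariant regressors cannot be estimated.
3. OLS of $y_{it}$ on $x_{it}$ and $z_i$ (time-invariant regressors) and the between estimator
(OLS of $\bar y_i$ on $\bar x_i$ and $z_i$) are biased and inconsistent.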
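
As a hedged illustration of the within (CV) estimator described above, the following NumPy sketch demeans each individual's data and runs OLS on the deviations. The array layout (`y` of shape N x T, `X` of shape N x T x K), the function name and the toy data are assumptions made for this example only; the toy panel is noise-free so that the slopes and individual intercepts are recovered exactly.

```python
import numpy as np

def within_estimator(y, X):
    """Within (CV) estimator. y: (N, T) array, X: (N, T, K) array. Returns K slopes."""
    y_dev = y - y.mean(axis=1, keepdims=True)   # y_it - ybar_i
    X_dev = X - X.mean(axis=1, keepdims=True)   # x_it - xbar_i
    K = X.shape[2]
    Xd = X_dev.reshape(-1, K)
    yd = y_dev.reshape(-1)
    # OLS on the demeaned data; the individual effects have been swept out.
    return np.linalg.solve(Xd.T @ Xd, Xd.T @ yd)

# Noise-free toy panel (hypothetical values): y_it = alpha_i + x_it' beta.
rng = np.random.default_rng(0)
N, T = 5, 4
X = rng.normal(size=(N, T, 2))
alpha = np.arange(N, dtype=float) * 10.0
beta_true = np.array([1.5, -2.0])
y = alpha[:, None] + X @ beta_true

beta_hat = within_estimator(y, X)
# Individual effects are recovered from the individual means.
alpha_hat = y.mean(axis=1) - X.mean(axis=1) @ beta_hat
print(beta_hat, alpha_hat)
```

Adding a time-invariant column to `X` would make the demeaned column identically zero and the system singular, which is the computational counterpart of Remark 2.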

2.3 Random Effects Model


Formulation
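Equation <2-1> can be re-written as

$$y_{it} = \mu + \beta' x_{it} + v_{it},$$

where $v_{it} = \alpha_i + u_{it}$.

This model is called the "Random Effect Model", the "Variance Component
Model" or the "One-way Error Component Model".
Estimation
The error term has two components, where it is often assumed that

$E(\alpha_i) = E(u_{it}) = 0$, $E(\alpha_i u_{jt}) = 0$,
$E(\alpha_i \alpha_j) = \sigma_\alpha^2$ if $i = j$, $0$ if $i \neq j$,
$E(u_{it} u_{js}) = \sigma_u^2$ if $i = j$ and $t = s$, $0$ otherwise,
$E(\alpha_i x_{it}') = E(u_{it} x_{it}') = 0$
(=> no covariance between regressors and error term!!!)
It can be shown that

$$E(v_i v_i') = \sigma_u^2 I_T + \sigma_\alpha^2 e e' = V.$$

For example, for $T = 3$ (homoscedasticity, but non-zero covariance),

$$V = \begin{pmatrix} \sigma_u^2 + \sigma_\alpha^2 & \sigma_\alpha^2 & \sigma_\alpha^2 \\ \sigma_\alpha^2 & \sigma_u^2 + \sigma_\alpha^2 & \sigma_\alpha^2 \\ \sigma_\alpha^2 & \sigma_\alpha^2 & \sigma_u^2 + \sigma_\alpha^2 \end{pmatrix}$$
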

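CV Estimator
(1) $\hat\beta_{CV}$ continues to be an unbiased and consistent estimator of $\beta$ in this model
when the $\alpha_i$ are random.
(2) $\hat\beta_{CV}$ is BLUE if the $\alpha_i$ are fixed, but $\hat\beta_{CV}$ is not BLUE if the $\alpha_i$ are random,
because $\hat\beta_{CV}$ does not use the between-group variation when it is OK to do
so. => We would expect that a GLS estimator, which accounts for the
correlation in $v_{it}$ over time for a given individual $i$, would be more efficient.
GLS Estimator
Let $\psi = \sigma_u^2 / (\sigma_u^2 + T\sigma_\alpha^2)$. Then the GLS estimators are

<2-2>
$$\hat\beta_{GLS} = \left[ \frac{1}{T}\sum_{i=1}^{N} X_i' Q X_i + \psi \sum_{i=1}^{N} (\bar x_i - \bar x)(\bar x_i - \bar x)' \right]^{-1} \left[ \frac{1}{T}\sum_{i=1}^{N} X_i' Q y_i + \psi \sum_{i=1}^{N} (\bar x_i - \bar x)(\bar y_i - \bar y) \right], \qquad \hat\mu_{GLS} = \bar y - \hat\beta_{GLS}' \bar x,$$

where $Q = I_T - \frac{1}{T} e e'$. This GLS estimator is BLUE.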


Remarks
1. (Alternative Way of Derivation) First transform the data by subtracting
a fraction $(1 - \psi^{1/2})$ of the individual means $\bar y_i$ and $\bar x_i$ from their corresponding
$y_{it}$ and $x_{it}$, then regress $y_{it} - (1 - \psi^{1/2})\bar y_i$ on a constant and
$x_{it} - (1 - \psi^{1/2})\bar x_i$, where $\psi = \sigma_u^2 / (\sigma_u^2 + T\sigma_\alpha^2)$.
Notice that this GLS estimator becomes the same as CV when $T \to \infty$, since then $\psi \to 0$.
This can also be seen in <2-2>.
2. (Feasible GLS) If the variance components $\sigma_u^2$ and $\sigma_\alpha^2$ are unknown, we can
use two-step estimation. In the first step, we estimate the variance
components using some consistent estimators. In the second step, we
substitute their estimated values in <2-2>. When the sample size is large (in
the sense that $N \to \infty$ or $T \to \infty$), the two-step GLS estimator will have the
same asymptotic efficiency as the GLS with known variance components.
Even for a moderate sample size (for $T \geq 3$, $N - (K+1) \geq 9$; for $T = 2$,
$N - (K+1) \geq 10$), the two-step procedure is more efficient than the
covariance estimator.
3. (Other Possible Estimation Methods)
OLS of $y_{it}$ on $x_{it}$ and a constant is consistent as $N \to \infty$. The between estimator is
unbiased and consistent as $N \to \infty$. These two, and CV, are inefficient.
4. MLE of the random effects model is similar to GLS. MLE behaves well in
finite samples.
5. As we can see in 2.3, treating $\alpha_i$ as random is an intermediate solution
between ignoring between-group variation (CV estimator or Within Estimator)
and treating it as the same as within-group variation (OLS estimator).
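
To make the "intermediate solution" in Remark 5 concrete, the sketch below computes the fraction of the individual mean that is subtracted in the GLS transformation of Remark 1. It assumes the usual definitions $\psi = \sigma_u^2/(\sigma_u^2 + T\sigma_\alpha^2)$ and $\theta = 1 - \psi^{1/2}$; the function name and the variance values are hypothetical.

```python
import math

def gls_weights(sigma_u2, sigma_alpha2, T):
    """Return (psi, theta) for the one-way random effects GLS transformation."""
    psi = sigma_u2 / (sigma_u2 + T * sigma_alpha2)
    theta = 1.0 - math.sqrt(psi)   # fraction of the individual mean subtracted
    return psi, theta

# theta = 0 reproduces pooled OLS (no demeaning); theta -> 1 approaches the
# within (CV) transformation, which happens as T grows.
for T in (2, 10, 100, 10000):
    psi, theta = gls_weights(sigma_u2=1.0, sigma_alpha2=1.0, T=T)
    print(T, round(psi, 4), round(theta, 4))
```

With $\sigma_\alpha^2 = 0$ the weight $\theta$ is zero for any $T$, so the transformation leaves the data unchanged and GLS coincides with pooled OLS.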
2.4 Within and Between Estimator
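We can formulate a pooled regression model in three ways.
<2-3>  $y_{it} = \mu + \beta' x_{it} + v_{it}$

<2-4>  $y_{it} - \bar y_i = \beta'(x_{it} - \bar x_i) + (v_{it} - \bar v_i)$

<2-5>  $\bar y_i = \mu + \beta' \bar x_i + \bar v_i$

Applying OLS procedures to <2-3>, <2-4> and <2-5>, we get the usual OLS
estimator $\hat\beta_{OLS}$, the within estimator $\hat\beta_{w}$ (the CV estimator) and the between estimator $\hat\beta_{b}$,
respectively.
With some algebra, we can show that the OLS estimator is a matrix weighted
average of the within and between estimators.

$$\hat\beta_{OLS} = F \hat\beta_b + (I_K - F)\hat\beta_w,$$

where $F = (W_{xx} + B_{xx})^{-1} B_{xx}$, $W_{xx} = \sum_i \sum_t (x_{it} - \bar x_i)(x_{it} - \bar x_i)'$ and $B_{xx} = T \sum_i (\bar x_i - \bar x)(\bar x_i - \bar x)'$.

It can also be shown that the GLS estimator is a matrix weighted average of
the within and between estimators.

$$\hat\beta_{GLS} = \Delta \hat\beta_b + (I_K - \Delta)\hat\beta_w,$$

where $\Delta = \psi (W_{xx} + \psi B_{xx})^{-1} B_{xx}$ and $\psi = \sigma_u^2/(\sigma_u^2 + T\sigma_\alpha^2)$. Note that if $\psi \to 0$, then $\Delta \to 0$ and
$\hat\beta_{GLS} \to \hat\beta_w$; if $\psi = 1$, then $\Delta = F$ and GLS coincides with OLS.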

2.5 Fixed or Random ?


Whether to treat the individual effects as fixed or random makes no difference
when $T$ is large, because the CV estimator and the GLS estimator become the
same estimator. When $T$ is finite and $N$ is large, whether to treat the effects
as fixed or random is not an easy question to answer. It can make a surprising
amount of difference in the estimates of the parameters. In fact, when only a few
observations are available for different individuals over time, it is exceptionally
important to make the best use of the lesser amount of information over time
for the efficient estimation of the common behavioral relationship (Hsiao, p. 42).
The fixed effects model is a reasonable approach when we can be confident
that the differences between units can be viewed as parametric shifts of the
regression function. This model might be viewed as applying to the
cross-sectional units in the study, not to additional ones outside the sample.
For example, an inter-country comparison may well include the full set
of countries for which it is reasonable to assume that the model is
constant. In other settings, it might be more appropriate to view
individual specific constant terms as randomly distributed if we
believed that sampled cross-sectional units were drawn from a large
population. It would certainly be the case for longitudinal data sets
like PSID or KLIPS (Greene, p. 567; he notes that this distinction is not
hard and fast: it is purely heuristic).
From a purely practical standpoint, the dummy variable approach is costly in
terms of degrees of freedom, and in a wide, longitudinal data set, the random
effects model has some intuitive virtue. On the other hand, the fixed effects
approach has one considerable virtue. There is no justification for treating the
individual effects as uncorrelated with the other regressors, as is assumed in the
random effects model. The random effects treatment, therefore, may suffer from
inconsistency due to omitted variables (Greene, p.576).
The fixed effects model is viewed as one in which investigators make inferences
conditional on the effects that are in the sample. The random effects model is
viewed as one in which investigators make unconditional or marginal inferences
with respect to the population of all effects. There is no real distinction in
the "nature (of the effect)". It is up to the investigators to decide whether to
make inference with respect to the population characteristics or only with
respect to the effects that are in the sample. ..... The situation to which a model
applies and the inferences based on it are the deciding factors in determining
whether we should treat effects as fixed or random. When inferences are
going to be confined to the effects in the model, the effects are more
appropriately considered fixed. When inferences will be made about a
population of effects from which those in the data are considered to be
a random sample, then the effects should be considered random
(Greene(2003), p.43).
Remark : Conditional or Unconditional Inference? - An Econometric
Approach
Unconditional Inference (=> random effect approach)
Advantage : There are only a finite number of parameters in the likelihood function as
$N \to \infty$, and so it often produces efficient inference.
Disadvantage : The conditional density of $y_i$ given $x_i$ depends on the correct
specification of the conditional density of $\alpha_i$ given $x_i$. Incorrect specification
leads to biased and inconsistent estimators of the slope coefficients.
Conditional Inference (=> fixed effect approach)
Advantage : No need to specify the conditional density of $\alpha_i$ given $x_i$.
Disadvantage : There may be a loss in efficiency because of lost degrees of
freedom (this is called the incidental parameter problem).
If $\hat\beta$ is not asymptotically independent of $\hat\alpha_i$, then inference on $\beta$ conditional
on $\hat\alpha_i$, which is not consistently estimated, will not be consistent. In the
static linear model, the CV estimator of $\beta$ is independent of $\alpha_i$; thus we
obtain a consistent estimator of $\beta$. This is not the case in non-linear
models.

2.6 Hausman Test


To test the orthogonality of the random effects and the regressors, we can use
the Hausman test.
Basic Idea : Under the hypothesis of no correlation, both CV and GLS are
consistent, but CV is inefficient, whereas under the alternative, CV is
consistent, but GLS is not.
Test Statistic :

$$H = (\hat\beta_{CV} - \hat\beta_{GLS})' \, \hat\Psi^{-1} (\hat\beta_{CV} - \hat\beta_{GLS}),$$

where $\hat\Psi = \widehat{Var}(\hat\beta_{CV}) - \widehat{Var}(\hat\beta_{GLS})$. Under the null hypothesis, this test statistic
is asymptotically distributed as chi-squared with $K$ degrees of freedom.
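
A minimal NumPy sketch of the Hausman statistic, i.e. the quadratic form in the difference between the CV and GLS slope estimates, weighted by the inverse of the difference of their covariance matrices. The function name and the numbers are hypothetical; in practice the inputs would come from fitted fixed effects and random effects models.

```python
import numpy as np

def hausman_statistic(b_cv, b_gls, V_cv, V_gls):
    """Return (H, degrees of freedom) for the CV-versus-GLS comparison."""
    d = np.asarray(b_cv, dtype=float) - np.asarray(b_gls, dtype=float)
    Psi = np.asarray(V_cv, dtype=float) - np.asarray(V_gls, dtype=float)
    H = float(d @ np.linalg.solve(Psi, d))   # d' Psi^{-1} d
    return H, d.size

# Made-up estimates for K = 2 slope coefficients.
H, df = hausman_statistic(
    b_cv=[1.0, 2.0],
    b_gls=[0.8, 2.1],
    V_cv=np.diag([0.05, 0.02]),
    V_gls=np.diag([0.01, 0.01]),
)
print(H, df)
```

The value of `H` would be compared with a chi-squared critical value with `df` degrees of freedom. In finite samples the estimated covariance difference need not be positive definite, which this sketch does not guard against.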
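Remark : Breusch and Pagan Test for Random Effect
Breusch and Pagan have devised a Lagrange multiplier test for the random
effects model based on the OLS residuals $e_{it}$. For

$H_0 : \sigma_\alpha^2 = 0$ against $H_1 : \sigma_\alpha^2 \neq 0$,
the test statistic is

$$LM = \frac{NT}{2(T-1)} \left[ \frac{\sum_{i=1}^{N} \left( \sum_{t=1}^{T} e_{it} \right)^2}{\sum_{i=1}^{N} \sum_{t=1}^{T} e_{it}^2} - 1 \right]^2,$$

which is asymptotically distributed as chi-squared with one degree of freedom under $H_0$.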
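
The Breusch and Pagan statistic can be computed directly from a balanced $N \times T$ array of pooled OLS residuals. The sketch below assumes the standard one-way form $LM = \frac{NT}{2(T-1)}\left[\sum_i(\sum_t e_{it})^2 / \sum_i\sum_t e_{it}^2 - 1\right]^2$; the function name and the residual values are made up for illustration.

```python
import numpy as np

def breusch_pagan_lm(e):
    """One-way Breusch-Pagan LM statistic from an (N, T) array of OLS residuals."""
    e = np.asarray(e, dtype=float)
    N, T = e.shape
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return N * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

# Residuals with a strong individual component: each row keeps the same sign.
lm = breusch_pagan_lm([[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]])
print(lm)
```

Residuals that share a sign within each individual push the ratio above one and produce a large statistic, which is evidence against $\sigma_\alpha^2 = 0$.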

2.7 Mundlak's Formulation


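There are reasons to believe that in many circumstances $\alpha_i$ and $x_{it}$ are indeed
correlated (ex : in a farm's production function). Mundlak(1978) suggests that we
approximate the conditional expectation of $\alpha_i$ given $x_i$ by a linear function.

<2-6>  $\alpha_i = \sum_{t=1}^{T} x_{it}' a_t + \omega_i$, where $\omega_i \sim N(0, \sigma_\omega^2)$

A simple approximation of <2-6> is to let

<2-7>  $\alpha_i = \bar x_i' a + \omega_i$

Note that $a$ equals zero iff the explanatory variables are uncorrelated with the
effects. In this model, we obtain the GLS of ($\mu$, $\beta$, $a$).

$$\hat\beta^{*}_{GLS} = \hat\beta_{CV}, \qquad \hat a^{*}_{GLS} = \hat\beta_{b} - \hat\beta_{CV}, \qquad \hat\mu^{*}_{GLS} = \bar y - \bar x' \hat\beta_{b},$$

where $\hat\beta_b$ is the between estimator.
In this case, GLS is the same as CV. (This is true only in static linear models.)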
2.8 Hausman Taylor Estimator
Hausman and Taylor(1981) proposed an IV-type estimator for panel data random
effects models in which some of the covariates are correlated with the
unobserved individual-level random effect.

$$y_{it} = x_{it}'\beta + z_i'\gamma + \alpha_i + u_{it}$$

If $\alpha_i$ is correlated with $x_{it}$ and $z_i$, OLS and GLS are both inconsistent. If we use the
within-groups estimator (CV), then (1) all time-invariant variables are eliminated
by the (within) transformation, so that $\gamma$ cannot be estimated, and (2) under
certain circumstances, the within-groups estimator is not efficient since it
ignores variation across individuals in the sample 4). The first problem is
generally the more serious one in some applications (for example, the coefficient of
schooling in the wage equation).
Hausman and Taylor(1981) use assumptions about the correlation between
($x_{it}$, $z_i$) and $\alpha_i$. If we are willing to assume that certain variables among the $x_{it}$
and $z_i$ are uncorrelated with $\alpha_i$, then conditions may hold such that all the $\beta$'s
and $\gamma$'s may be consistently and efficiently estimated. Intuitively, the variables
in $x_{it}$ which are uncorrelated with $\alpha_i$ can serve two functions because of their
variation across both individuals and time: (i) using deviations from individual
means, they produce unbiased estimates of the $\beta$'s, and (ii) using the individual
means, they provide valid instruments for the variables of $z_i$ that are correlated
with $\alpha_i$.
Their estimator is basically 2SLS on the following equation using ($\tilde X$, $\bar X_1$, $Z_1$) as the
set of instruments 5).
<2-8>  $\Omega^{-1/2} y_{it} = \Omega^{-1/2} x_{it}'\beta + \Omega^{-1/2} z_i'\gamma + \Omega^{-1/2}(\alpha_i + u_{it})$
where $X = (X_1, X_2)$ and $Z = (Z_1, Z_2)$, $X_1$ is $NT \times k_1$, $X_2$ is $NT \times k_2$, $Z_1$ is $NT \times g_1$, $Z_2$ is
$NT \times g_2$, $X_1$ and $Z_1$ are assumed uncorrelated with $\alpha_i$ while $X_2$ and $Z_2$ may be correlated with $\alpha_i$,
and $\Omega$ is the covariance matrix of the composite error $\alpha_i + u_{it}$.
(1) If $k_1 < g_2$, then the equation is not identified. In this case,
$\hat\beta_{HT} = \hat\beta_{CV}$, and $\hat\gamma_{HT}$ does not exist.
(2) If $k_1 = g_2$, then the equation is just identified. In this case,
$\hat\beta_{HT} = \hat\beta_{CV}$, and $\hat\gamma_{HT}$ is given by <2-8>.
(3) If $k_1 > g_2$, then the equation is over-identified. In this case, $\hat\beta_{HT}$ is
more efficient than the Within estimator.
4) Hausman and Taylor(1981), pp.1377-1378

2.9 Both Individual and Time Specific Effects


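Fixed Effects Model

$$y_{it} = \mu + \alpha_i + \lambda_t + \beta' x_{it} + u_{it},$$

where $i = 1, \dots, N$, $t = 1, \dots, T$,

and where $\sum_{i} \alpha_i = 0$, and $\sum_{t} \lambda_t = 0$.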

5) $\tilde X$ denotes the deviations from the individual means.


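Least squares estimators of the slopes are obtained by regression of
($y_{it} - \bar y_{i\cdot} - \bar y_{\cdot t} + \bar y$) on ($x_{it} - \bar x_{i\cdot} - \bar x_{\cdot t} + \bar x$), where
$\bar y_{i\cdot} = \frac{1}{T}\sum_t y_{it}$, $\bar y_{\cdot t} = \frac{1}{N}\sum_i y_{it}$, $\bar y = \frac{1}{NT}\sum_i \sum_t y_{it}$ (and similarly for $x$), i.e.

$$\hat\beta_{CV} = \left[ \sum_i \sum_t (x_{it} - \bar x_{i\cdot} - \bar x_{\cdot t} + \bar x)(x_{it} - \bar x_{i\cdot} - \bar x_{\cdot t} + \bar x)' \right]^{-1} \left[ \sum_i \sum_t (x_{it} - \bar x_{i\cdot} - \bar x_{\cdot t} + \bar x)(y_{it} - \bar y_{i\cdot} - \bar y_{\cdot t} + \bar y) \right]$$

Random Effects Model

If the individual effects $\alpha_i$ and the time effects $\lambda_t$ are random and uncorrelated with $x_{it}$, with the following assumptions,

$E(\alpha_i \alpha_j) = \sigma_\alpha^2$ if $i = j$, 0 if $i \neq j$
$E(\lambda_t \lambda_s) = \sigma_\lambda^2$ if $t = s$, 0 if $t \neq s$
$E(u_{it} u_{js}) = \sigma_u^2$ if $i = j$ & $t = s$, 0 otherwise

then the BLUE is the GLS estimator, which uses the following weighting matrix.

$$V = \sigma_u^2 I_{NT} + \sigma_\alpha^2 (I_N \otimes e_T e_T') + \sigma_\lambda^2 (e_N e_N' \otimes I_T).$$

Tests for individual and time effects : Breusch and Pagan Test
This is a Lagrange multiplier test for

$$H_0 : \sigma_\alpha^2 = \sigma_\lambda^2 = 0.$$

We can use the OLS estimates of the parameters to obtain the OLS residuals $e_{it}$.

$$LM = LM_1 + LM_2 \sim \chi^2(2) \text{ under } H_0,$$

where

$$LM_1 = \frac{NT}{2(T-1)} \left[ \frac{\sum_i (\sum_t e_{it})^2}{\sum_i \sum_t e_{it}^2} - 1 \right]^2, \qquad LM_2 = \frac{NT}{2(N-1)} \left[ \frac{\sum_t (\sum_i e_{it})^2}{\sum_i \sum_t e_{it}^2} - 1 \right]^2.$$

Furthermore, to test $H_0 : \sigma_\alpha^2 = 0$, you can use $LM_1 \sim \chi^2(1)$, and to test $H_0 : \sigma_\lambda^2 = 0$, you
can use $LM_2 \sim \chi^2(1)$. You can also use an ANOVA test.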
2.10 Unbalanced Panels
Only a simple modification is needed in the fixed effects model: a modification of the
degrees of freedom in the computation of the estimate of $\sigma_u^2$, of $Var(\hat\beta_{CV})$
and of the F-statistic, and a modification of the sample size in the computation of the individual and
time effect means.
In the random effects model, the variance matrix is no longer $I_N \otimes V$ because the
diagonal blocks in the variance matrix are of different sizes. There is also
heteroscedasticity because $\psi_i$ depends on $T_i$. The estimation is straightforward.
Use the GLS with a minor modification of $\psi$ in <2-2>. Use $\psi_i$ instead of $\psi$, where
$\psi_i = \sigma_u^2 / (\sigma_u^2 + T_i \sigma_\alpha^2)$.

3. Dynamic Linear Models


3.1 Inconsistency of CV estimator
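The CV estimator is inconsistent for a dynamic panel data model with
individual effects, whether the effects are fixed or random. In the
following model,

<3-1>  $y_{it} = \gamma y_{i,t-1} + \alpha_i + u_{it}$,

where $|\gamma| < 1$, $i = 1, \dots, N$, $t = 1, \dots, T$, and $y_{i0}$ is observable. The CV estimator is

$$\hat\gamma_{CV} = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - \bar y_i)(y_{i,t-1} - \bar y_{i,-1})}{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{i,t-1} - \bar y_{i,-1})^2},$$

where $\bar y_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$ and $\bar y_{i,-1} = \frac{1}{T}\sum_{t=1}^{T} y_{i,t-1}$.

It can be shown that

$$\underset{N \to \infty}{\text{plim}} (\hat\gamma_{CV} - \gamma) = -\frac{1+\gamma}{T-1} \left( 1 - \frac{1}{T}\frac{1-\gamma^T}{1-\gamma} \right) \left\{ 1 - \frac{2\gamma}{(1-\gamma)(T-1)} \left[ 1 - \frac{1-\gamma^T}{T(1-\gamma)} \right] \right\}^{-1}.$$

This bias is caused by having to eliminate the unknown individual effects $\alpha_i$ from
each observation, which creates a correlation of order $(1/T)$ between the
explanatory variables and the residuals in the CV-transformed
model $(y_{it} - \bar y_i) = \gamma (y_{i,t-1} - \bar y_{i,-1}) + (u_{it} - \bar u_i)$. (In the random effects case, the lagged dependent variable is correlated with the compound
disturbance in the model since the same $\alpha_i$ enters the equation for every
observation in group $i$.) Also, the OLS estimator in the random effects model overestimates
$\gamma$ when $\gamma > 0$ (for the exact formula of the bias, see Hsiao, p.73).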
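
To give a feel for the size of this inconsistency, the sketch below evaluates the standard large-$N$ bias formula of the CV estimator in the pure AR(1) panel model (the Nickell bias, as reported for example in Hsiao) for a few values of $T$. The function name is hypothetical and the formula is assumed to be the one intended in the text.

```python
def nickell_bias(gamma, T):
    """Asymptotic (N -> infinity) bias of the CV estimator of gamma in <3-1>."""
    a = (1.0 - gamma ** T) / (T * (1.0 - gamma))
    numerator = -(1.0 + gamma) / (T - 1) * (1.0 - a)
    denominator = 1.0 - 2.0 * gamma / ((1.0 - gamma) * (T - 1)) * (1.0 - a)
    return numerator / denominator

# The bias is negative and shrinks roughly like 1/T as T grows.
for T in (2, 5, 10, 50):
    print(T, round(nickell_bias(0.5, T), 4))
```

For $T = 2$ the expression collapses to $-(1+\gamma)/2$, so with $\gamma = 0.5$ the CV estimator converges to $-0.25$ rather than $0.5$.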
For the estimation of dynamic models, we can use the MLE (Maximum Likelihood
Estimator), the GLS (Generalized Least Squares Estimator), the IV (Instrumental
Variable Estimator) and the GMM (Generalized Method of Moments). The statistical
properties of the MLE and GLS depend on the assumption about the initial conditions $y_{i0}$, but
those of the IV or the GMM do not.
3.2 Estimation
There are various ways such as MLE, GLS, IV, GMM. MLE with mistaken
choices of initial conditions yields estimator that is not asymptotically equivalent
to the correct one, and hence may not be consistent.
Simple Example
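IV type estimators do not suffer from the choice of initial conditions. Let me
briefly show the basic idea of the IV. From the equation <3-1>, taking first differences removes $\alpha_i$:
$\Delta y_{it} = \gamma \Delta y_{i,t-1} + \Delta u_{it}$, where $\Delta y_{it} = y_{it} - y_{i,t-1}$ and $\Delta u_{it} = u_{it} - u_{i,t-1}$.
The conditions for a valid IV are that the IV should be uncorrelated with the error
term and correlated with the regressors. Note that

$$E(y_{i,t-2}\, \Delta u_{it}) = 0, \qquad E(y_{i,t-2}\, \Delta y_{i,t-1}) \neq 0.$$

Hence we can estimate $\gamma$ by an IV estimator using $y_{i,t-2}$. Note that $\Delta y_{i,t-2} = y_{i,t-2} - y_{i,t-3}$
can also be a valid IV. Thus these are valid IVs. Two consistent estimators of
$\gamma$ are :

$$\hat\gamma_{IV,1} = \frac{\sum_i \sum_t \Delta y_{it}\, y_{i,t-2}}{\sum_i \sum_t \Delta y_{i,t-1}\, y_{i,t-2}}, \qquad \hat\gamma_{IV,2} = \frac{\sum_i \sum_t \Delta y_{it}\, \Delta y_{i,t-2}}{\sum_i \sum_t \Delta y_{i,t-1}\, \Delta y_{i,t-2}}.$$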
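
The following simulation sketch contrasts the within (CV) estimator with a simple IV estimator that uses the level $y_{i,t-2}$ as an instrument for $\Delta y_{i,t-1}$ in the first-differenced equation. All function names, the data-generating values ($\gamma = 0.5$, standard normal effects and errors) and the panel dimensions are assumptions chosen for illustration.

```python
import numpy as np

def simulate_ar1_panel(N, T, gamma, rng, burn_in=50):
    """Simulate y_it = gamma*y_{i,t-1} + alpha_i + u_it; columns are y_i0, ..., y_iT."""
    alpha = rng.normal(size=N)
    y = alpha / (1.0 - gamma)          # start at the individual long-run mean
    path = []
    for t in range(burn_in + T + 1):
        y = gamma * y + alpha + rng.normal(size=N)
        if t >= burn_in:
            path.append(y)
    return np.column_stack(path)

def iv_estimator_levels(y):
    """IV estimator of gamma using y_{i,t-2} as instrument for the lagged difference."""
    dy = y[:, 2:] - y[:, 1:-1]         # Delta y_it
    dy_lag = y[:, 1:-1] - y[:, :-2]    # Delta y_{i,t-1}
    z = y[:, :-2]                      # instrument y_{i,t-2}
    return (z * dy).sum() / (z * dy_lag).sum()

def within_estimator_ar1(y):
    """CV estimator of gamma from individually demeaned y_it and y_{i,t-1}."""
    cur, lag = y[:, 1:], y[:, :-1]
    cur_dev = cur - cur.mean(axis=1, keepdims=True)
    lag_dev = lag - lag.mean(axis=1, keepdims=True)
    return (cur_dev * lag_dev).sum() / (lag_dev ** 2).sum()

rng = np.random.default_rng(12345)
y = simulate_ar1_panel(N=20000, T=6, gamma=0.5, rng=rng)
gamma_iv = iv_estimator_levels(y)
gamma_cv = within_estimator_ar1(y)
print(gamma_iv, gamma_cv)
```

With a large $N$ and a short $T$, the IV estimate should lie close to the true value while the within estimate stays well below it, in line with the inconsistency discussed in 3.1.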

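We can also get the following moment conditions.

$E(y_{i,t-j}\, \Delta u_{it}) = 0$,  $j = 2, \dots, t$,  $t = 2, \dots, T$

$E(\Delta y_{i,t-j}\, \Delta u_{it}) = 0$,  $j = 2, \dots, t-1$,  $t = 3, \dots, T$

We can use the GMM by using these moment conditions.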


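Arellano and Bond GMM estimator
When exogenous regressors are also included, so that $y_{it} = \gamma y_{i,t-1} + \beta' x_{it} + \alpha_i + u_{it}$, the Arellano and Bond
GMM estimator takes the following form, where $\theta = (\gamma, \beta')'$.

$$\hat\theta_{GMM} = \left[ \left( \sum_i \Delta Z_i' W_i \right) \hat\Psi^{-1} \left( \sum_i W_i' \Delta Z_i \right) \right]^{-1} \left( \sum_i \Delta Z_i' W_i \right) \hat\Psi^{-1} \left( \sum_i W_i' \Delta y_i \right),$$

where $\Delta y_i = (\Delta y_{i2}, \dots, \Delta y_{iT})'$, $\Delta Z_i = (\Delta y_{i,-1}, \Delta X_i)$,
$W_i$ is the block-diagonal instrument matrix whose block for period $t$ is $(y_{i0}, \dots, y_{i,t-2}, x_{i1}', \dots, x_{iT}')$ when the $x_{it}$ are strictly exogenous,
$\hat\Psi = \sum_i W_i' G W_i$, and $G$ is the $(T-1) \times (T-1)$ matrix with 2 on the main diagonal, $-1$ on the first off-diagonals and 0 elsewhere.

GMM is consistent and asymptotically normally distributed if the $u_{it}$ are i.i.d.
across $i$, whether $\alpha_i$ is fixed or random. For more GMM estimators, see
Arellano and Bover(1995), Ahn and Schmidt(1995), Blundell and Bond(1998).

For various estimation methods for dynamic panel data models, see Baltagi(2005).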

4. Nonlinear Models : Limited Dependent Variables Models6)


4.1 Logit and Probit Models
Fixed Effect Logit Model
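Consider the fixed effect panel model $y^*_{it} = x_{it}'\beta + \alpha_i + u_{it}$, with

$$\Pr(y_{it} = 1) = \Pr(y^*_{it} > 0) = \frac{\exp(\alpha_i + x_{it}'\beta)}{1 + \exp(\alpha_i + x_{it}'\beta)}.$$

This model suffers from the incidental parameters problem (as $N$ gets large, the
number of parameters to be estimated also gets large). Unlike in the linear models,
this incidental parameters problem is not solved by a linear transformation.
The usual way around this incidental parameters problem is to find a minimal
sufficient statistic for $\alpha_i$. Chamberlain(1980) finds that $\sum_{t=1}^{T} y_{it}$ is a minimal
sufficient statistic for $\alpha_i$. Therefore he suggests maximizing the conditional
likelihood function to obtain the conditional logit estimates of $\beta$.
For the logit model, the contribution to the log-likelihood of the $i$-th person is

$$\ln L_i = \alpha_i \sum_t y_{it} + \beta' \sum_t x_{it} y_{it} - \sum_t \ln\left[ 1 + \exp(\alpha_i + x_{it}'\beta) \right].$$

The conditional likelihood is

$$L_c = \prod_{i=1}^{N} \Pr\left( y_{i1}, \dots, y_{iT} \,\Big|\, \sum_t y_{it} \right),$$

or

$$\ln L_c = \sum_{i=1}^{N} \ln \frac{\exp\left( \sum_t y_{it} x_{it}'\beta \right)}{\sum_{d \in B_i} \exp\left( \sum_t d_t x_{it}'\beta \right)}, \qquad B_i = \left\{ d : d_t \in \{0, 1\},\ \sum_t d_t = \sum_t y_{it} \right\}.$$

For example, if $T = 2$, $\sum_t y_{it}$ can be 0, 1 or 2. If it is 0, then both $y_{i1}$ and $y_{i2}$ are 0 and
$\Pr(y_{i1} = 0, y_{i2} = 0 \mid \sum_t y_{it} = 0) = 1$; similarly, $\Pr(y_{i1} = 1, y_{i2} = 1 \mid \sum_t y_{it} = 2) = 1$.
These add nothing to the conditional log-likelihood because log(1)=0. Only
observations for which $y_{i1} + y_{i2} = 1$ matter in the conditional log-likelihood,
i.e. $(y_{i1}, y_{i2}) = (0, 1)$ or $(1, 0)$.
You can show that

$$\Pr\left( y_{i1} = 0, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1 \right) = \frac{\exp\left[ (x_{i2} - x_{i1})'\beta \right]}{1 + \exp\left[ (x_{i2} - x_{i1})'\beta \right]}.$$

Note that this probability does not depend on $\alpha_i$. The conditional logit
log-likelihood when $T = 2$ is

$$\ln L_c = \sum_{i :\, y_{i1} + y_{i2} = 1} \left\{ d_i \ln F\left[ (x_{i2} - x_{i1})'\beta \right] + (1 - d_i) \ln\left( 1 - F\left[ (x_{i2} - x_{i1})'\beta \right] \right) \right\},$$

where $F(z) = e^z/(1 + e^z)$ and $d_i = 1$ if $(y_{i1}, y_{i2}) = (0, 1)$, $d_i = 0$ if $(y_{i1}, y_{i2}) = (1, 0)$.
Consistent and efficient estimates of $\beta$ are obtained by maximizing this w.r.t. $\beta$.
6) Baltagi(2005), Ch11.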
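
For the two-period case, the conditional log-likelihood only involves the "switchers" (individuals whose outcome changes between the two periods) and the difference of their regressors. The NumPy sketch below evaluates that log-likelihood; the function name, array layout and toy data are assumptions made for illustration, and maximization over $\beta$ is left to any numerical optimizer.

```python
import numpy as np

def conditional_logit_loglik_T2(beta, y, X):
    """Conditional logit log-likelihood for T = 2.

    y: (N, 2) array of 0/1 outcomes; X: (N, 2, K) array of regressors.
    """
    beta = np.asarray(beta, dtype=float)
    y = np.asarray(y)
    X = np.asarray(X, dtype=float)
    switchers = y.sum(axis=1) == 1            # only y_i1 + y_i2 = 1 contributes
    dX = X[switchers, 1, :] - X[switchers, 0, :]
    d = y[switchers, 1]                       # 1 for (0,1), 0 for (1,0)
    index = dX @ beta
    # log P(0,1 | sum = 1) = index - log(1 + e^index)
    # log P(1,0 | sum = 1) =       - log(1 + e^index)
    return float(np.sum(d * index - np.logaddexp(0.0, index)))

# Toy data: persons 0 and 1 are switchers, persons 2 and 3 drop out.
y = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])
X = np.array([[[0.0], [2.0]], [[1.0], [1.0]], [[0.0], [5.0]], [[3.0], [0.0]]])
ll_at_zero = conditional_logit_loglik_T2([0.0], y, X)
ll_at_one = conditional_logit_loglik_T2([1.0], y, X)
print(ll_at_zero, ll_at_one)
```

The individual effects never enter the function, which mirrors the fact that the conditional probability does not depend on them.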


Remarks
Chamberlain(1980) showed that the conditional ML method can also be
extended to the multinomial logit model as well as to log-linear models.
In contrast to the fixed effect logit model, the conditional likelihood approach
does not yield computational simplifications for the fixed effect probit model,
since the fixed effects do not cancel out. This implies that all $N$ fixed
effects must be estimated as part of the estimation procedure. Furthermore,
since the estimates of the fixed effects are inconsistent with small $T$, the
fixed effect probit model gives inconsistent estimates for $\beta$ as well.
Random Effect Probit Models 7)
With the random effect model, the composite error term $v_{it} = \alpha_i + u_{it}$ is correlated over
time for a given individual even if $u_{it}$ is IID. With the logit model, where the errors are assumed to
have a logistic distribution, we need to use the multivariate logistic distribution,
whereas with the probit model, we need to use the multivariate normal distribution.
The multivariate logistic distribution has the disadvantage that the correlations
are all constrained to be 1/2. Though some generalizations are possible, the
multivariate logistic distribution does not permit much flexibility. Hence when
we consider random effect models, we usually use random effect probit models.
The first application of the random effect probit model is that of Heckman and
Willis (1976). Consider the random effect panel model, with

$$y^*_{it} = x_{it}'\beta + v_{it}, \qquad y_{it} = 1 \text{ if } y^*_{it} > 0, \quad y_{it} = 0 \text{ otherwise}.$$

In this case $v_{it} = \alpha_i + u_{it}$, where $\alpha_i \sim IIN(0, \sigma_\alpha^2)$, $u_{it} \sim IIN(0, \sigma_u^2)$, and $\alpha_i$ and $u_{it}$ are
mutually independent as well as independent of $x_{it}$. Also define $\sigma^2 = \sigma_\alpha^2 + \sigma_u^2$
and $\rho = \sigma_\alpha^2/\sigma^2$, the correlation between $v_{it}$ and $v_{is}$ for $t \neq s$.

Conditional on $\alpha_i$, the $y_{it}$ are independent over $t$, with

$$\Pr(y_{it} = 1 \mid \alpha_i) = \Phi\left( \frac{x_{it}'\beta + \alpha_i}{\sigma_u} \right).$$

The joint density of the $y_{it}$ is therefore

$$f(y_{i1}, \dots, y_{iT}) = \int_{-\infty}^{\infty} \prod_{t=1}^{T} \Phi\left[ (2y_{it} - 1)\, \frac{x_{it}'\beta + \alpha}{\sigma_u} \right] \frac{1}{\sigma_\alpha}\, \phi\left( \frac{\alpha}{\sigma_\alpha} \right) d\alpha,$$

where $\Phi$ is the cdf and $\phi$ the density of the standard normal. Thus the expression is reduced
to the evaluation of the expressions $\Phi(\cdot)$ and a single integral, for which good
approximations are available. This can be evaluated using the Gaussian quadrature
procedure suggested by Butler and Moffitt(1982). Alternatively, one can use
Chamberlain's minimum distance random effect probit estimators or recently
developed GMM estimators.
For the dynamic panel logit model, one can use Honoré and Kyriazidou(2000)'s
conditional maximum likelihood estimator, which is asymptotically normally
distributed with a convergence rate slower than $\sqrt{N}$.
7) Maddala(1987)
4.2 Selection Bias Model
Consider the following one-way error component model,

$$y_{it} = x_{it}'\beta + \alpha_i + u_{it},$$

where $y_{it}$ is observed if a latent variable $d^*_{it} > 0$. This latent variable is given by

$$d^*_{it} = z_{it}'\gamma + \eta_i + \nu_{it}.$$

In order to get a consistent estimator of $\beta$, a generalization of Heckman's(1979)
selectivity bias correction procedure from cross-sectional to panel data
can be employed.
4.3 Censored and Truncated Models
Fixed Effects Tobit Models
For random samples, one may observe the number of hours worked only if the individual
is employed. This sample is censored in that the hours worked are reported
as zero if the individual does not work, and the regression model is known as
the Tobit model. Heckman and MaCurdy(1980) consider a fixed effect Tobit
model to estimate a life cycle model of female labor supply. They argue that
the individual effects have a specific meaning in a life cycle model and therefore
cannot be assumed independent of the $x_{it}$. Hence a fixed effects rather than
random effects specification is appropriate. For this fixed effect Tobit model,

$$y^*_{it} = x_{it}'\beta + \alpha_i + u_{it}, \qquad u_{it} \sim IIN(0, \sigma^2),$$

where $y_{it} = y^*_{it}$ if $y^*_{it} > 0$, and $y_{it} = 0$ otherwise.
Let $d_{it} = 1$ if $y^*_{it} > 0$, and $d_{it} = 0$ otherwise. The log-likelihood function is

$$\ln L = \sum_{i,t} (1 - d_{it}) \ln \Phi\left( -\frac{x_{it}'\beta + \alpha_i}{\sigma} \right) + \sum_{i,t} d_{it} \left\{ \ln \phi\left( \frac{y_{it} - x_{it}'\beta - \alpha_i}{\sigma} \right) - \ln \sigma \right\}.$$

Unlike the case of the linear model, in this model it is not possible to devise
estimators of $\beta$ and $\sigma^2$ that are not functions of the fixed effects $\alpha_i$. Since the
number of observations per individual is fixed ($T$ is usually fixed and usually
small), it is not possible to consistently estimate the fixed effects, and this
inconsistency carries through to the estimates of $\beta$ and $\sigma^2$. Heckman and
MaCurdy(1980) suggest maximizing the log-likelihood function using an iterative
method.
Random Effects Tobit Models
There are very few applications of random effect tobit models with panel
data. Heckman and MaCurdy argued against it in their application on the grounds
that the $\alpha_i$ were expected to be correlated with the explanatory variables.
An example of a random effects tobit model with self selection is the paper by
Hausman and Wise(1979).

$$y_{it} = x_{it}'\beta + \alpha_i + u_{it},$$

The problem is that $y_{it}$ is observed only if an index of attrition $A_{it} = 1$, where
$A_{it}$ is defined by

$$A_{it} = 1 \text{ if } A^*_{it} = \rho\, y_{it} + x_{it}'\theta + w_{it}'\xi + \varepsilon_{it} \geq 0, \qquad A_{it} = 0 \text{ otherwise}.$$

A common procedure is to discard observations for which $A_{it}$ is zero. Hausman
and Wise argue that this is an incorrect procedure if the probability of observing
$y_{it}$ varies with its value and other variables. The detailed procedure differs
case by case.

Remarks
Honoré(1992) suggested trimmed least absolute deviations and trimmed least
squares estimators for truncated and censored regression models with fixed
effects. These are semiparametric estimators with no distributional
assumptions necessary on the error terms. Honoré(1992) also suggests a GMM
type estimator with symmetric error terms.
Kyriazidou (1999) studies the panel data sample selection model, also known
as the Type 2 Tobit model. Honoré(1993) considers dynamic Tobit models
with fixed effects, and Arellano, Bover and Labeaga (1999) consider a linear
autoregressive model for a latent variable which is only partly observed due
to a selection mechanism.

5. KEEP
5.1 OLS in each wave
. reg grd_t1 gen1 en_a1 en_c1 self1 sat1 exth1 shr1 peduc1 lov1 smk1 hres pexp

      Source |       SS       df       MS              Number of obs =     533
-------------+------------------------------           F( 12,   520) =   16.70
       Model |  384.367088    12  32.0305907           Prob > F      =  0.0000
    Residual |  997.321467   520   1.9179259           R-squared     =  0.2782
-------------+------------------------------           Adj R-squared =  0.2615
       Total |  1381.68856   532  2.59715894           Root MSE      =  1.3849

------------------------------------------------------------------------------
      grd_t1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        gen1 |   .0126798   .1226856     0.10   0.918    -.2283405    .2537001
       en_a1 |    .133906   .0945573     1.42   0.157    -.0518552    .3196672
       en_c1 |  -.8044529   .1142812    -7.04   0.000    -1.028963   -.5799432
       self1 |  -.0228769   .0078208    -2.93   0.004    -.0382412   -.0075127
        sat1 |  -.0524515   .0760839    -0.69   0.491     -.201921    .0970181
       exth1 |   .0336624   .0141471     2.38   0.018     .0058699    .0614549
        shr1 |   .0197865   .0546787     0.36   0.718    -.0876317    .1272047
      peduc1 |  -7.65e-07   2.41e-07    -3.17   0.002    -1.24e-06   -2.92e-07
        lov1 |   .0906444    .204857     0.44   0.658    -.3118046    .4930935
        smk1 |   .7305463   .3433152     2.13   0.034      .056091    1.405001
        hres |  -.0958527   .1500278    -0.64   0.523    -.3905878    .1988823
        pexp |  -.5941723    .135848    -4.37   0.000    -.8610507   -.3272939
       _cons |   7.182453   .5777541    12.43   0.000     6.047434    8.317472
------------------------------------------------------------------------------

. reg grd_t1 gen1 en_a1 en_c1 self1 sat1 exth1 shr1 peduc1 lov1 smk1 fwrk1 mwrk1 hcon1

      Source |       SS       df       MS              Number of obs =     533
-------------+------------------------------           F( 13,   519) =   15.00
       Model |  377.407042    13   29.031311           Prob > F      =  0.0000
    Residual |  1004.28151   519  1.93503182           R-squared     =  0.2731
-------------+------------------------------           Adj R-squared =  0.2549
       Total |  1381.68856   532  2.59715894           Root MSE      =  1.3911

------------------------------------------------------------------------------
      grd_t1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        gen1 |  -.0011525   .1220838    -0.01   0.992    -.2409916    .2386866
       en_a1 |   .1149741   .0950275     1.21   0.227    -.0717118    .3016599
       en_c1 |  -.8244831   .1148491    -7.18   0.000    -1.050109   -.5988569
       self1 |  -.0224025   .0078461    -2.86   0.004    -.0378166   -.0069884
        sat1 |  -.0502676   .0764872    -0.66   0.511    -.2005301     .099995
       exth1 |   .0321401    .014303     2.25   0.025     .0040411     .060239
        shr1 |   .0319335   .0547692     0.58   0.560    -.0756631    .1395301
      peduc1 |  -7.84e-07   2.41e-07    -3.25   0.001    -1.26e-06   -3.11e-07
        lov1 |   .0022821     .20656     0.01   0.991    -.4035144    .4080787
        smk1 |   .6848195   .3461619     1.98   0.048     .0047688     1.36487
       fwrk1 |  -.2680606   .2107549    -1.27   0.204    -.6820982     .145977
       mwrk1 |    .154289   .1218647     1.27   0.206    -.0851198    .3936978
       hcon1 |  -.0017108   .0005518    -3.10   0.002    -.0027949   -.0006267
       _cons |   7.293196   .6006542    12.14   0.000     6.113183    8.473208
------------------------------------------------------------------------------

. reg grd_t2 gen2 en_a2 en_c2 self2 sat2 exth2 shr2 peduc2 lov2 smk2 hres pexp

      Source |       SS       df       MS              Number of obs =     967
-------------+------------------------------           F( 12,   954) =   36.31
       Model |  713.264828    12  59.4387356           Prob > F      =  0.0000
    Residual |  1561.63073   954  1.63692948           R-squared     =  0.3135
-------------+------------------------------           Adj R-squared =  0.3049
       Total |  2274.89555   966  2.35496434           Root MSE      =  1.2794

------------------------------------------------------------------------------
      grd_t2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        gen2 |   .0360556   .0845417     0.43   0.670    -.1298535    .2019647
       en_a2 |   .0854255   .0588753     1.45   0.147    -.0301146    .2009655
       en_c2 |   -.890975   .0828781   -10.75   0.000    -1.053619   -.7283306
       self2 |  -.0215798   .0043615    -4.95   0.000     -.030139   -.0130206
        sat2 |  -.0690045   .0496597    -1.39   0.165    -.1664595    .0284504
       exth2 |   .0216905   .0111695     1.94   0.052    -.0002291      .04361
        shr2 |   .0869328    .039729     2.19   0.029     .0089665    .1648991
      peduc2 |  -9.28e-07   1.65e-07    -5.62   0.000    -1.25e-06   -6.04e-07
        lov2 |   .0687802   .1358742     0.51   0.613    -.1978666    .3354269
        smk2 |   1.068731    .220991     4.84   0.000     .6350463    1.502415
        hres |   .2163555   .1171963     1.85   0.065    -.0136368    .4463479
        pexp |  -.2370459   .0920911    -2.57   0.010    -.4177703   -.0563215
       _cons |   6.343655   .4541473    13.97   0.000     5.452412    7.234898
------------------------------------------------------------------------------

. reg grd_t2 gen2 en_a2 en_c2 self2 sat2 exth2 shr2 peduc2 lov2 smk2 fwrk2 mwrk2 hcon2

      Source |       SS       df       MS              Number of obs =     967
-------------+------------------------------           F( 13,   953) =   32.84
       Model |  703.854779    13  54.1426753           Prob > F      =  0.0000
    Residual |  1571.04077   953  1.64852127           R-squared     =  0.3094
-------------+------------------------------           Adj R-squared =  0.3000
       Total |  2274.89555   966  2.35496434           Root MSE      =  1.2839

------------------------------------------------------------------------------
      grd_t2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        gen2 |   .0537621    .084074     0.64   0.523    -.1112294    .2187537
       en_a2 |   .0708094   .0589595     1.20   0.230    -.0448961     .186515
       en_c2 |   -.891394   .0832127   -10.71   0.000    -1.054695   -.7280928
       self2 |   -.021093   .0043767    -4.82   0.000    -.0296821    -.012504
        sat2 |  -.0665774    .050079    -1.33   0.184    -.1648552    .0317005
       exth2 |   .0203982   .0112521     1.81   0.070    -.0016835      .04248
        shr2 |   .1066232   .0396718     2.69   0.007      .028769    .1844774
      peduc2 |  -9.91e-07   1.67e-07    -5.94   0.000    -1.32e-06   -6.63e-07
        lov2 |   .0742037   .1365007     0.54   0.587    -.1936728    .3420803
        smk2 |   1.042403   .2220916     4.69   0.000     .6065584    1.478248
       fwrk2 |  -.2728239   .1486557    -1.84   0.067    -.5645542    .0189064
       mwrk2 |  -.0252347   .0845075    -0.30   0.765    -.1910771    .1406076
       hcon2 |   .0005876    .000394     1.49   0.136    -.0001856    .0013608
       _cons |   6.295281   .4739686    13.28   0.000     5.365138    7.225424
-----------------------------------------------------------------------------.
. reg grd_t3 gen3 en_a3 en_c3 self3 sat3 exth3 shr3 peduc3 lov3 smk3 hres pexp

      Source |       SS       df       MS              Number of obs =     962
-------------+------------------------------           F( 12,   949) =   26.90
       Model |  531.409911    12  44.2841593           Prob > F      =  0.0000
    Residual |  1562.08697   949  1.64603474           R-squared     =  0.2538
-------------+------------------------------           Adj R-squared =  0.2444
       Total |  2093.49688   961  2.17845669           Root MSE      =   1.283

------------------------------------------------------------------------------
      grd_t3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        gen3 |   .0848069   .0849833     1.00   0.319      -.08197    .2515838
       en_a3 |   .0212483   .0579626     0.37   0.714    -.0925014     .134998
       en_c3 |  -.7933162   .0793672   -10.00   0.000    -.9490718   -.6375607
       self3 |  -.0178432   .0029858    -5.98   0.000    -.0237029   -.0119836
        sat3 |  -.0401919    .050095    -0.80   0.423    -.1385017    .0581178
       exth3 |    .032359   .0091774     3.53   0.000     .0143486    .0503693
        shr3 |  -.0061049    .037768    -0.16   0.872    -.0802233    .0680135
      peduc3 |  -3.70e-07   1.41e-07    -2.62   0.009    -6.47e-07   -9.32e-08
        lov3 |   .2296575   .1311601     1.75   0.080    -.0277398    .4870548
        smk3 |   .5133184   .1944777     2.64   0.008     .1316623    .8949745
        hres |   .1449583   .1186728     1.22   0.222    -.0879331    .3778498
        pexp |  -.1161795   .0919011    -1.26   0.206    -.2965324    .0641733
       _cons |   6.366633   .4140872    15.38   0.000     5.554001    7.179266
------------------------------------------------------------------------------

. reg grd_t3 gen3 en_a3 en_c3 self3 sat3 exth3 shr3 peduc3 lov3 smk3 fwrk3 mwrk3
    hcon3

      Source |       SS       df       MS              Number of obs =     962
-------------+------------------------------           F( 13,   948) =   25.07
       Model |  535.594162    13  41.1995509           Prob > F      =  0.0000
    Residual |  1557.90272   948   1.6433573           R-squared     =  0.2558
-------------+------------------------------           Adj R-squared =  0.2456
       Total |  2093.49688   961  2.17845669           Root MSE      =  1.2819

------------------------------------------------------------------------------
      grd_t3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        gen3 |   .1015382   .0843004     1.20   0.229    -.0638987    .2669751
       en_a3 |   .0191078   .0579052     0.33   0.741    -.0945293    .1327449
       en_c3 |  -.7789672   .0792262    -9.83   0.000    -.9344462   -.6234881
       self3 |  -.0171472   .0029583    -5.80   0.000    -.0229528   -.0113415
        sat3 |  -.0279292    .050322    -0.56   0.579    -.1266846    .0708262
       exth3 |   .0332262   .0091576     3.63   0.000     .0152548    .0511976
        shr3 |   .0020828   .0374836     0.06   0.956    -.0714777    .0756433
      peduc3 |  -2.75e-07   1.41e-07    -1.94   0.052    -5.52e-07    2.52e-09
        lov3 |   .2409529   .1312045     1.84   0.067    -.0165319    .4984377
        smk3 |   .5199689   .1942941     2.68   0.008     .1386727     .901265
       fwrk3 |  -.2399303   .1491641    -1.61   0.108    -.5326602    .0527996
       mwrk3 |   .0823882   .0849323     0.97   0.332    -.0842888    .2490652
       hcon3 |  -.0003649   .0003472    -1.05   0.293    -.0010463    .0003164
       _cons |   6.379911   .4290074    14.87   0.000     5.537997    7.221825
------------------------------------------------------------------------------
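
The summary statistics in the header of each regression table follow directly from the ANOVA block. As a rough check, the sketch below recomputes them for the last grd_t3 regression using only numbers printed in that table; nothing here is estimated anew, and the adjusted R-squared line assumes the usual definition based on the residual and total mean squares.

```stata
* Recompute header statistics from the ANOVA block (values copied from the table)
display "F(13,948)     = " %6.2f (535.594162/13)/(1557.90272/948)
display "R-squared     = " %6.4f 535.594162/2093.49688
display "Adj R-squared = " %6.4f 1 - (1557.90272/948)/(2093.49688/961)
display "Root MSE      = " %6.4f sqrt(1557.90272/948)
* t statistic for en_c3 is the coefficient divided by its standard error
display "t(en_c3)      = " %6.2f -.7789672/.0792262
```

The displayed values should agree, up to rounding, with the F, R-squared, Adj R-squared, Root MSE and t entries reported above.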

5.2 Transforming Wide-Form Data into Long-Form Data


Data In Wide Form
self1

sat1

peduc1

self2

sat2

peduc2

self3

sat3

peduc3

1003

bysid

320000

200000

200000

1004

18

300000

30

100000

1005

1007

15

150000

20

30000

1008

15

1011

10

1012

300000

50000

1014

200000

18

60000

1015

25

1016

14

1018

80000

1025

26

250000

15

160000

42

600000

1026

23

800000

30

700000

28

620000

Data in Long Form


   bysid   _j   gen   self   grd   sat   exth    peduc
    1003    1                                   320000
    1003    2     1      1     5     3     19   200000
    1003    3     1      6     4     3      9   200000
    1004    1     1      5    -5     3      3        0
    1004    2     1     18     3     4    3.5   300000
    1004    3     1     30     3     4    1.5   100000
    1005    1     1      1     8     4     14        0
    1005    2     1      1     5     2     15        0
    1005    3     1      0     5     3     30        0
    1007    1     1     15    -5     4     13   150000
    1007    2     1     20     4     2      8    30000
    1007    3     1      2     5     4     12        0
    1008    1     1     15    -5     4     14        0
    1008    2     1      5     2     3     18        0
    1008    3     1      3     2     5      8        0

Converting wide to long


. reshape long grd grd_t cap en_a en_b en_c self sat exth peduc lov smk fwrk finc
    mwrk minc hcon peduch htyp res res_a res_b res_c res_d shr, i(bysid)
(note: j = 1 2 3)
(note: en_b3 not found)
(note: htyp3 not found)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      969   ->    2907
Number of variables                  85   ->      38
j variable (3 values)                     ->   _j
xij variables:
                         grd1 grd2 grd3   ->   grd
                   grd_t1 grd_t2 grd_t3   ->   grd_t
                         cap1 cap2 cap3   ->   cap
                      en_a1 en_a2 en_a3   ->   en_a
                      en_b1 en_b2 en_b3   ->   en_b
                      en_c1 en_c2 en_c3   ->   en_c
                      self1 self2 self3   ->   self
                         sat1 sat2 sat3   ->   sat
                      exth1 exth2 exth3   ->   exth
                   peduc1 peduc2 peduc3   ->   peduc
                         lov1 lov2 lov3   ->   lov
                         smk1 smk2 smk3   ->   smk
                      fwrk1 fwrk2 fwrk3   ->   fwrk
                      finc1 finc2 finc3   ->   finc
                      mwrk1 mwrk2 mwrk3   ->   mwrk
                      minc1 minc2 minc3   ->   minc
                      hcon1 hcon2 hcon3   ->   hcon
                peduch1 peduch2 peduch3   ->   peduch
                      htyp1 htyp2 htyp3   ->   htyp
                         res1 res2 res3   ->   res
                   res_a1 res_a2 res_a3   ->   res_a
                   res_b1 res_b2 res_b3   ->   res_b
                   res_c1 res_c2 res_c3   ->   res_c
                   res_d1 res_d2 res_d3   ->   res_d
                         shr1 shr2 shr3   ->   shr
-----------------------------------------------------------------------------
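
The same transformation can be tried on a toy data set. The sketch below is illustrative only: the two students and their values are hypothetical, and naming the wave index `year` through `j(year)` is an assumption made here (the log above omits `j()`, so Stata calls the index `_j`; `rename _j year` would give the same result). `xtset` is the panel declaration in recent Stata releases; older releases use `tsset bysid year`.

```stata
* Toy wide-form data: one row per student, one column per variable and wave
clear
input bysid self1 self2 self3 peduc1 peduc2 peduc3
1  5 18 30      0 300000 100000
2 15 20  2 150000  30000      0
end

* Long form: one row per student and wave, wave index stored in year
reshape long self peduc, i(bysid) j(year)
list, sepby(bysid)

* Declare the panel structure so that the xt commands can be used
xtset bysid year
```

After `reshape long`, the two wide rows become six long rows, three per student.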

Basic Transition Matrix Example


. xttrans grd_t

           |                             grd_t
     grd_t |      1      2      3      4      5      6      7      8      9 |  Total
-----------+----------------------------------------------------------------+-------
         1 |  57.14  35.71   4.76   0.00   0.00   0.00   2.38   0.00   0.00 | 100.00
         2 |   7.91  54.68  28.06   6.47   1.44   1.44   0.00   0.00   0.00 | 100.00
         3 |   0.69  14.78  59.11  21.31   2.41   0.69   1.03   0.00   0.00 | 100.00
         4 |   0.00   3.01  21.30  52.63  18.55   3.01   1.00   0.50   0.00 | 100.00
         5 |   0.00   0.60   8.36  30.45  44.78  12.84   2.09   0.60   0.30 | 100.00
         6 |   0.61   0.61   5.49  13.41  36.59  34.15   7.93   0.00   1.22 | 100.00
         7 |   0.00   1.54   0.00   7.69  23.08  35.38  21.54  10.77   0.00 | 100.00
         8 |   0.00   2.38   2.38   7.14  16.67  23.81  11.90  33.33   2.38 | 100.00
         9 |   0.00   0.00   6.67   0.00   6.67   0.00  20.00  26.67  40.00 | 100.00
-----------+----------------------------------------------------------------+-------
     Total |   2.55  10.12  22.59  27.68  21.18   9.92   3.35   1.94   0.67 | 100.00

. xttrans grd_t, freq

           |                             grd_t
     grd_t |      1      2      3      4      5      6      7      8      9 |  Total
-----------+----------------------------------------------------------------+-------
         1 |     24     15      2      0      0      0      1      0      0 |     42
           |  57.14  35.71   4.76   0.00   0.00   0.00   2.38   0.00   0.00 | 100.00
-----------+----------------------------------------------------------------+-------
         2 |     11     76     39      9      2      2      0      0      0 |    139
           |   7.91  54.68  28.06   6.47   1.44   1.44   0.00   0.00   0.00 | 100.00
-----------+----------------------------------------------------------------+-------
         3 |      2     43    172     62      7      2      3      0      0 |    291
           |   0.69  14.78  59.11  21.31   2.41   0.69   1.03   0.00   0.00 | 100.00
-----------+----------------------------------------------------------------+-------
         4 |      0     12     85    210     74     12      4      2      0 |    399
           |   0.00   3.01  21.30  52.63  18.55   3.01   1.00   0.50   0.00 | 100.00
-----------+----------------------------------------------------------------+-------
         5 |      0      2     28    102    150     43      7      2      1 |    335
           |   0.00   0.60   8.36  30.45  44.78  12.84   2.09   0.60   0.30 | 100.00
-----------+----------------------------------------------------------------+-------
         6 |      1      1      9     22     60     56     13      0      2 |    164
           |   0.61   0.61   5.49  13.41  36.59  34.15   7.93   0.00   1.22 | 100.00
-----------+----------------------------------------------------------------+-------
         7 |      0      1      0      5     15     23     14      7      0 |     65
           |   0.00   1.54   0.00   7.69  23.08  35.38  21.54  10.77   0.00 | 100.00
-----------+----------------------------------------------------------------+-------
         8 |      0      1      1      3      7     10      5     14      1 |     42
           |   0.00   2.38   2.38   7.14  16.67  23.81  11.90  33.33   2.38 | 100.00
-----------+----------------------------------------------------------------+-------
         9 |      0      0      1      0      1      0      3      4      6 |     15
           |   0.00   0.00   6.67   0.00   6.67   0.00  20.00  26.67  40.00 | 100.00
-----------+----------------------------------------------------------------+-------
     Total |     38    151    337    413    316    148     50     29     10 |  1,492
           |   2.55  10.12  22.59  27.68  21.18   9.92   3.35   1.94   0.67 | 100.00
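
Each row of the transition matrix gives, for students in a given grd_t category in one wave, the share found in each category in the next wave. A minimal sketch of how the entries arise is given below; the three-student panel and the variable names `id`, `year` and `g` are hypothetical, and the last line simply recomputes the first cell of the table above from its printed frequencies.

```stata
* Toy panel: category g observed for three units over three waves
clear
input id year g
1 1 1
1 2 1
1 3 2
2 1 2
2 2 2
2 3 1
3 1 1
3 2 2
3 3 2
end
xtset id year

* Frequencies and row percentages of wave-to-wave transitions
xttrans g, freq

* First cell of the table above: 24 of the 42 transitions out of category 1
display %5.2f 100*24/42
```

The final `display` should reproduce the 57.14 reported for the (1,1) cell.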

5.3 Fixed Effects vs Random Effects Model


(1) Student Characteristics Only
OLS
. reg grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp

      Source |       SS       df       MS              Number of obs =    2462
-------------+------------------------------           F( 12,  2449) =   79.33
       Model |  1656.30924    12   138.02577           Prob > F      =  0.0000
    Residual |  4261.17248  2449  1.73996426           R-squared     =  0.2799
-------------+------------------------------           Adj R-squared =  0.2764
       Total |  5917.48172  2461  2.40450293           Root MSE      =  1.3191

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        en_a |   -.107651   .0278706    -3.86   0.000    -.1623033   -.0529986
        en_c |  -.7981425   .0504476   -15.82   0.000    -.8970668   -.6992181
        self |  -.0200988   .0023378    -8.60   0.000     -.024683   -.0155146
         sat |  -.0359951   .0321609    -1.12   0.263    -.0990605    .0270703
        exth |   .0315091   .0063623     4.95   0.000     .0190331    .0439852
         shr |   .0432594    .024337     1.78   0.076    -.0044638    .0909827
       peduc |  -6.39e-07   9.91e-08    -6.45   0.000    -8.33e-07   -4.45e-07
         lov |   .1419757   .0865507     1.64   0.101    -.0277444    .3116959
         smk |   .7058129   .1358232     5.20   0.000     .4394727    .9721532
         gen |   .0473607   .0544266     0.87   0.384    -.0593663    .1540877
        hres |   .1282344   .0732342     1.75   0.080     -.015373    .2718418
        pexp |  -.2557146   .0592004    -4.32   0.000    -.3718027   -.1396265
       _cons |   6.768923   .2670453    25.35   0.000     6.245265    7.292581
------------------------------------------------------------------------------

Fixed Effects Model


. xtreg grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp, fe

Fixed-effects (within) regression               Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

R-sq:  within  = 0.0568                         Obs per group: min =         1
       between = 0.3170                                        avg =       2.5
       overall = 0.2244                                        max =         3

                                                F(9,1484)          =      9.94
corr(u_i, Xb)  = 0.3656                         Prob > F           =    0.0000

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        en_a |   -.136822   .0220913    -6.19   0.000    -.1801555   -.0934885
        en_c |  -.1994922    .048994    -4.07   0.000    -.2955971   -.1033873
        self |  -.0051754   .0020049    -2.58   0.010    -.0091081   -.0012427
         sat |   -.024137   .0329818    -0.73   0.464    -.0888329    .0405589
        exth |   .0065626   .0054578     1.20   0.229    -.0041432    .0172683
         shr |   .0245194   .0214191     1.14   0.252    -.0174954    .0665343
       peduc |  -1.10e-07   1.01e-07    -1.09   0.278    -3.08e-07    8.86e-08
         lov |  -.0533141   .0841841    -0.63   0.527    -.2184466    .1118183
         smk |   .2204073   .1472682     1.50   0.135    -.0684687    .5092834
         gen |  (dropped)
        hres |  (dropped)
        pexp |  (dropped)
       _cons |    5.07543    .244363    20.77   0.000     4.596096    5.554764
-------------+----------------------------------------------------------------
     sigma_u |  1.3162529
     sigma_e |  .80251711
         rho |   .7290054   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(968, 1484) =     5.30           Prob > F = 0.0000

Random Effects Model


. xtreg grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp, re

Random-effects GLS regression                   Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

R-sq:  within  = 0.0485                         Obs per group: min =         1
       between = 0.3492                                        avg =       2.5
       overall = 0.2711                                        max =         3

Random effects u_i ~ Gaussian                   Wald chi2(12)      =    422.36
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        en_a |  -.1297934   .0214687    -6.05   0.000    -.1718713   -.0877155
        en_c |  -.4786417   .0440305   -10.87   0.000    -.5649398   -.3923436
        self |  -.0114053   .0018917    -6.03   0.000     -.015113   -.0076976
         sat |  -.0432793   .0292355    -1.48   0.139    -.1005797    .0140211
        exth |    .018881   .0051661     3.65   0.000     .0087557    .0290063
         shr |   .0415967   .0201382     2.07   0.039     .0021266    .0810667
       peduc |  -3.85e-07   8.92e-08    -4.31   0.000    -5.60e-07   -2.10e-07
         lov |   .0437019   .0761776     0.57   0.566    -.1056034    .1930072
         smk |   .5149588   .1272706     4.05   0.000      .265513    .7644047
         gen |    .072456   .0764778     0.95   0.343    -.0774377    .2223498
        hres |   .0234364   .1031735     0.23   0.820    -.1787801    .2256528
        pexp |  -.3116644   .0826708    -3.77   0.000    -.4736962   -.1496325
       _cons |     5.9421   .2355998    25.22   0.000     5.480332    6.403867
-------------+----------------------------------------------------------------
     sigma_u |  1.0049162
     sigma_e |  .80251711
         rho |  .61059468   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Hausman Test/Breusch and Pagan LM test


. hausman fixed_1

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |    fixed_1         .          Difference          S.E.
-------------+----------------------------------------------------------------
        en_a |    -.136822    -.1297934       -.0070285        .0052077
        en_c |   -.1994922    -.4786417        .2791495         .021488
        self |   -.0051754    -.0114053        .0062299         .000664
         sat |    -.024137    -.0432793        .0191423        .0152672
        exth |    .0065626      .018881       -.0123184        .0017604
         shr |    .0245194     .0415967       -.0170772        .0072959
       peduc |   -1.10e-07    -3.85e-07        2.75e-07        4.77e-08
         lov |   -.0533141     .0437019        -.097016        .0358321
         smk |    .2204073     .5149588       -.2945515        .0740954
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(8) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      242.83
                Prob>chi2 =      0.0000

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects:

        grd_t[bysid,t] = Xb + u[bysid] + e[bysid,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                   grd_t |   2.404503       1.550646
                       e |   .6440337       .8025171
                       u |   1.009857       1.004916

        Test:   Var(u) = 0
                              chi2(1) =   578.68
                          Prob > chi2 =     0.0000
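
The log above calls `hausman fixed_1` without showing where `fixed_1` was stored; presumably an `est store fixed_1` followed the fixed-effects fit, as the `est store fixed_2` line does for the second specification. The sketch below walks through the whole sequence on simulated data. Everything in it is an assumption made for illustration: the variable names, the sample size and the data-generating process (in which `x` is correlated with the individual effect `u`, so the Hausman test would be expected to reject). It assumes a Stata release that provides `rnormal()` and `xtset`; older releases use `invnorm(uniform())` and `tsset`.

```stata
* Simulated panel: 200 units, 3 waves, regressor correlated with the unit effect
clear
set seed 12345
set obs 200
generate id = _n
generate u = rnormal()
expand 3
bysort id: generate year = _n
generate x = rnormal() + 0.5*u
generate y = 1 + 0.5*x + u + rnormal()
xtset id year

* Fixed effects first, and keep the results under a name
xtreg y x, fe
estimates store fixed_sim

* Random effects, then the Breusch-Pagan LM test for Var(u) = 0
xtreg y x, re
xttest0

* Hausman test: stored FE results against the current (RE) results
hausman fixed_sim .

* p-value implied by the chi2(8) = 242.83 reported in the log above
display chi2tail(8, 242.83)
```

In the reported results both tests reject at any conventional level: the LM test favours random effects over pooled OLS, and the Hausman test favours fixed effects over random effects.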

(2) Student & Household


OLS
. reg grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp fwrk mwrk
    hcon

      Source |       SS       df       MS              Number of obs =    2462
-------------+------------------------------           F( 15,  2446) =   64.50
       Model |  1677.21312    15  111.814208           Prob > F      =  0.0000
    Residual |   4240.2686  2446  1.73355217           R-squared     =  0.2834
-------------+------------------------------           Adj R-squared =  0.2790
       Total |  5917.48172  2461  2.40450293           Root MSE      =  1.3166

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        en_a |  -.1090899   .0278408    -3.92   0.000    -.1636839    -.054496
        en_c |  -.7944365   .0504039   -15.76   0.000    -.8932752   -.6955978
        self |  -.0199506   .0023353    -8.54   0.000    -.0245299   -.0153713
         sat |   -.027326    .032204    -0.85   0.396    -.0904759     .035824
        exth |   .0306683   .0063593     4.82   0.000     .0181982    .0431385
         shr |   .0433538   .0243166     1.78   0.075    -.0043294     .091037
       peduc |  -5.80e-07   1.03e-07    -5.65   0.000    -7.81e-07   -3.78e-07
         lov |   .1463585   .0864112     1.69   0.090    -.0230882    .3158052
         smk |   .6961962   .1356061     5.13   0.000     .4302816    .9621108
         gen |   .0504104   .0543712     0.93   0.354    -.0562078    .1570287
        hres |   .1558863   .0742932     2.10   0.036     .0102022    .3015703
        pexp |  -.2407737   .0593998    -4.05   0.000    -.3572527   -.1242947
        fwrk |  -.2489616   .0947112    -2.63   0.009     -.434684   -.0632391
        mwrk |   .0579389   .0540515     1.07   0.284    -.0480526    .1639304
        hcon |  -.0003149   .0002405    -1.31   0.191    -.0007866    .0001568
       _cons |   6.961375   .2795562    24.90   0.000     6.413184    7.509567
------------------------------------------------------------------------------

Fixed Effects Model


. xtreg grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp fwrk mwrk
    hcon, fe

Fixed-effects (within) regression               Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

R-sq:  within  = 0.0598                         Obs per group: min =         1
       between = 0.2755                                        avg =       2.5
       overall = 0.1918                                        max =         3

                                                F(12,1481)         =      7.85
corr(u_i, Xb)  = 0.3225                         Prob > F           =    0.0000

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        en_a |  -.1361749   .0221464    -6.15   0.000    -.1796166   -.0927332
        en_c |  -.1968455   .0490265    -4.02   0.000    -.2930143   -.1006768
        self |  -.0052408   .0020042    -2.61   0.009    -.0091721   -.0013094
         sat |  -.0232757   .0329694    -0.71   0.480    -.0879474    .0413961
        exth |   .0068599   .0054802     1.25   0.211    -.0038898    .0176096
         shr |   .0239398   .0214631     1.12   0.265    -.0181615     .066041
       peduc |  -1.13e-07   1.01e-07    -1.12   0.263    -3.11e-07    8.51e-08
         lov |  -.0481124   .0841805    -0.57   0.568    -.2132381    .1170133
         smk |   .2087753   .1474838     1.42   0.157    -.0805241    .4980747
         gen |  (dropped)
        hres |  (dropped)
        pexp |  (dropped)
        fwrk |   .1136457   .1090994     1.04   0.298      -.10036    .3276515
        mwrk |   .0052799   .0577552     0.09   0.927    -.1080107    .1185706
        hcon |   .0003943   .0002282     1.73   0.084    -.0000533     .000842
       _cons |   4.881918   .2683315    18.19   0.000     4.355568    5.408268
-------------+----------------------------------------------------------------
     sigma_u |   1.327694
     sigma_e |  .80207485
         rho |  .73262737   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(968, 1481) =     5.28           Prob > F = 0.0000

. est store fixed_2

Random Effects Model


. xtreg grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp fwrk mwrk
    hcon, re

Random-effects GLS regression                   Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

R-sq:  within  = 0.0476                         Obs per group: min =         1
       between = 0.3523                                        avg =       2.5
       overall = 0.2737                                        max =         3

Random effects u_i ~ Gaussian                   Wald chi2(15)      =    425.04
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        en_a |  -.1298275   .0215172    -6.03   0.000    -.1720005   -.0876545
        en_c |  -.4776755   .0440816   -10.84   0.000    -.5640738   -.3912772
        self |  -.0113867    .001893    -6.02   0.000    -.0150969   -.0076765
         sat |  -.0415663    .029267    -1.42   0.156    -.0989285     .015796
        exth |   .0189948   .0051816     3.67   0.000      .008839    .0291506
         shr |   .0430826    .020177     2.14   0.033     .0035363    .0826289
       peduc |  -3.81e-07   8.99e-08    -4.24   0.000    -5.57e-07   -2.05e-07
         lov |    .042773   .0762029     0.56   0.575    -.1065819    .1921279
         smk |   .5187465   .1273168     4.07   0.000     .2692101    .7682828
         gen |   .0743039   .0764892     0.97   0.331    -.0756123      .22422
        hres |   .0276469   .1041531     0.27   0.791    -.1764893    .2317832
        pexp |  -.3080506   .0829389    -3.71   0.000    -.4706079   -.1454932
        fwrk |  -.1149316   .0919315    -1.25   0.211    -.2951141    .0652509
        mwrk |   .0415202   .0504134     0.82   0.410    -.0572883    .1403287
        hcon |   .0000381   .0002067     0.18   0.854    -.0003671    .0004433
       _cons |   5.991255   .2502423    23.94   0.000     5.500789    6.481721
-------------+----------------------------------------------------------------
     sigma_u |  1.0035708
     sigma_e |  .80207485
         rho |  .61021965   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Hausman Test/Breusch and Pagan LM test


. hausman fixed_2

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |    fixed_2         .          Difference          S.E.
-------------+----------------------------------------------------------------
        en_a |   -.1361749    -.1298275       -.0063474        .0052414
        en_c |   -.1968455    -.4776755          .28083        .0214572
        self |   -.0052408    -.0113867        .0061459        .0006582
         sat |   -.0232757    -.0415663        .0182906        .0151798
        exth |    .0068599     .0189948       -.0121349         .001784
         shr |    .0239398     .0430826       -.0191428        .0073178
       peduc |   -1.13e-07    -3.81e-07        2.68e-07        4.61e-08
         lov |   -.0481124      .042773       -.0908854        .0357698
         smk |    .2087753     .5187465       -.3099711         .074444
        fwrk |    .1136457    -.1149316        .2285773        .0587475
        mwrk |    .0052799     .0415202       -.0362403        .0281806
        hcon |    .0003943     .0000381        .0003563        .0000967
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                 chi2(11) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      256.67
                Prob>chi2 =      0.0000

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects:

        grd_t[bysid,t] = Xb + u[bysid] + e[bysid,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                   grd_t |   2.404503       1.550646
                       e |   .6433241       .8020748
                       u |   1.007154       1.003571

        Test:   Var(u) = 0
                              chi2(1) =   568.47
                          Prob > chi2 =     0.0000

5.4 Hausman and Taylor IV Estimator


. xthtaylor grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp,
    endog(en_c self sat exth shr lov smk gen)

Hausman-Taylor estimation                       Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

                                                Obs per group: min =         1
                                                               avg =       2.5
                                                               max =         3

Random effects u_i ~ i.i.d.                     Wald chi2(12)      =    108.97
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
TVexogenous  |
        en_a |  -.1344562   .0208356    -6.45   0.000    -.1752933   -.0936191
       peduc |  -1.21e-07   9.54e-08    -1.27   0.204    -3.08e-07    6.58e-08
TVendogenous |
        en_c |  -.1983433   .0463643    -4.28   0.000    -.2892157   -.1074708
        self |  -.0052174   .0018973    -2.75   0.006    -.0089361   -.0014987
         sat |  -.0237138   .0312152    -0.76   0.447    -.0848945     .037467
        exth |   .0066545   .0051653     1.29   0.198    -.0034692    .0167783
         shr |   .0245534   .0202728     1.21   0.226    -.0151806    .0642874
         lov |  -.0542038   .0796763    -0.68   0.496    -.2103665    .1019589
         smk |   .2182307    .139378     1.57   0.117    -.0549451    .4914066
TIexogenous  |
        hres |  -1.038758   .6150578    -1.69   0.091    -2.244249    .1667332
        pexp |  -.5782443   .3017485    -1.92   0.055    -1.169661     .013172
TIendogenous |
         gen |    6.44911    3.32476     1.94   0.052    -.0673004    12.96552
             |
       _cons |   2.520782   1.438919     1.75   0.080     -.299448    5.341011
-------------+----------------------------------------------------------------
     sigma_u |  3.9392815
     sigma_e |  .80009461
         rho |  .96038199   (fraction of variance due to u_i)
------------------------------------------------------------------------------
note:  TV refers to time varying; TI refers to time invariant.
.
.
. xthtaylor grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp fwrk
    mwrk hcon, endog(en_c self sat exth shr lov smk gen)

Hausman-Taylor estimation                       Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

                                                Obs per group: min =         1
                                                               avg =       2.5
                                                               max =         3

Random effects u_i ~ i.i.d.                     Wald chi2(15)      =    122.43
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
       grd_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
TVexogenous  |
        en_a |  -.1324138   .0205077    -6.46   0.000    -.1726081   -.0922195
       peduc |  -1.56e-07   9.29e-08    -1.68   0.093    -3.38e-07    2.59e-08
        fwrk |     .06309   .0992422     0.64   0.525    -.1314212    .2576012
        mwrk |   .0163642   .0522932     0.31   0.754    -.0861285    .1188569
        hcon |   .0003509   .0002088     1.68   0.093    -.0000583    .0007602
TVendogenous |
        en_c |  -.1980364   .0456607    -4.34   0.000    -.2875297   -.1085431
        self |  -.0053307   .0018679    -2.85   0.004    -.0089917   -.0016698
         sat |  -.0253489   .0307021    -0.83   0.409    -.0855239    .0348261
        exth |   .0073118   .0051051     1.43   0.152    -.0026941    .0173176
         shr |    .024786   .0200054     1.24   0.215    -.0144238    .0639957
         lov |  -.0512957   .0784605    -0.65   0.513    -.2050754     .102484
         smk |   .2174626   .1373529     1.58   0.113    -.0517442    .4866693
TIexogenous  |
        hres |  -.6853114   .3819939    -1.79   0.073    -1.434006    .0633828
        pexp |  -.4774397   .2057894    -2.32   0.020    -.8807794   -.0740999
TIendogenous |
         gen |   4.055227   1.915909     2.12   0.034     .3001139    7.810339
             |
       _cons |   3.391832   .8587276     3.95   0.000     1.708756    5.074907
-------------+----------------------------------------------------------------
     sigma_u |  2.8355606
     sigma_e |    .798845
         rho |  .92646781   (fraction of variance due to u_i)
------------------------------------------------------------------------------
note:  TV refers to time varying; TI refers to time invariant.

5.5 Dynamic Linear Model


. xtabond grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp

Arellano-Bond dynamic panel-data estimation     Number of obs      =       528
Group variable (i): bysid                       Number of groups   =       528

                                                Wald chi2(10)      =     15.37

Time variable (t): year                         Obs per group: min =         1
                                                               avg =         1
                                                               max =         1

One-step results
------------------------------------------------------------------------------
     D.grd_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
grd_t        |
         LD. |   .1917622   .0844768     2.27   0.023     .0261907    .3573338
en_a         |
         D1. |   .0677208   .0636624     1.06   0.287    -.0570553    .1924968
en_c         |
         D1. |  -.2399974   .0855215    -2.81   0.005    -.4076165   -.0723784
self         |
         D1. |  -.0032522   .0032166    -1.01   0.312    -.0095566    .0030521
sat          |
         D1. |   .0112536   .0583583     0.19   0.847    -.1031266    .1256338
exth         |
         D1. |    .006254    .009651     0.65   0.517    -.0126617    .0251697
shr          |
         D1. |   .0013944   .0362387     0.04   0.969    -.0696322    .0724209
peduc        |
         D1. |   1.20e-07   1.85e-07     0.65   0.515    -2.42e-07    4.83e-07
lov          |
         D1. |   .0147333   .1442567     0.10   0.919    -.2680047    .2974713
smk          |
         D1. |  -.2184055    .232481    -0.94   0.347    -.6740598    .2372488
gen          |
         D1. |  (dropped)
hres         |
         D1. |  (dropped)
pexp         |
         D1. |  (dropped)
       _cons |  -.0311326   .0600442    -0.52   0.604     -.148817    .0865518
------------------------------------------------------------------------------

. xtabond grd_t en_a en_c self sat exth shr peduc lov smk gen hres pexp fwrk
    mwrk hcon

Arellano-Bond dynamic panel-data estimation     Number of obs      =       528
Group variable (i): bysid                       Number of groups   =       528

                                                Wald chi2(13)      =     17.25

Time variable (t): year                         Obs per group: min =         1
                                                               avg =         1
                                                               max =         1

One-step results
------------------------------------------------------------------------------
     D.grd_t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
grd_t        |
         LD. |   .1955437   .0839857     2.33   0.020     .0309349    .3601526
en_a         |
         D1. |   .0669117   .0641226     1.04   0.297    -.0587663    .1925897
en_c         |
         D1. |  -.2400576   .0859551    -2.79   0.005    -.4085266   -.0715886
self         |
         D1. |  -.0031353   .0032312    -0.97   0.332    -.0094682    .0031977
sat          |
         D1. |   .0137275   .0585889     0.23   0.815    -.1011046    .1285596
exth         |
         D1. |   .0066103   .0097066     0.68   0.496    -.0124143    .0256349
shr          |
         D1. |  -.0026892   .0366128    -0.07   0.941     -.074449    .0690706
peduc        |
         D1. |   1.09e-07   1.86e-07     0.58   0.559    -2.56e-07    4.73e-07
lov          |
         D1. |   .0168986   .1448085     0.12   0.907    -.2669208    .3007179
smk          |
         D1. |  -.2246414   .2332566    -0.96   0.336    -.6818159     .232533
gen          |
         D1. |  (dropped)
hres         |
         D1. |  (dropped)
pexp         |
         D1. |  (dropped)
fwrk         |
         D1. |   .1419691   .1947345     0.73   0.466    -.2397036    .5236419
mwrk         |
         D1. |   .1181656    .105405     1.12   0.262    -.0884245    .3247556
hcon         |
         D1. |  -.0000227   .0003524    -0.06   0.949    -.0007135    .0006681
       _cons |   -.037347   .0606389    -0.62   0.538    -.1561971    .0815031
------------------------------------------------------------------------------
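
The Arellano-Bond tables report exactly one observation per group. With three waves this is what one would expect: the equation is estimated in first differences and includes the lagged difference of the dependent variable (the `LD.` row), which is only defined from the third wave onward. The sketch below illustrates the point with a single hypothetical unit; the variable names and values are made up for the example.

```stata
* One hypothetical unit observed in three waves
clear
input id year y
1 1 5
1 2 4
1 3 6
end
tsset id year

* First difference and lagged first difference of y
generate dy  = D.y
generate ldy = LD.y
list

* Only rows where both dy and ldy are non-missing can enter the regression
count if !missing(dy, ldy)
```

Only the third wave has both the difference and its lag available, so each complete three-wave panel member contributes a single row. The time-invariant regressors (gen, hres, pexp) difference out, which is consistent with the (dropped) entries above.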

5.6 Logit Models


(1) Cross-sectional logit model
. logit cap1 grd_t1 gen en_c1 self1 peduc1 pexp lov1

Logistic regression                               Number of obs   =        533
                                                  LR chi2(7)      =      33.51
                                                  Prob > chi2     =     0.0000
Log likelihood = -127.79022                       Pseudo R2       =     0.1159

------------------------------------------------------------------------------
        cap1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t1 |  -.5415932   .1376394    -3.93   0.000    -.8113615   -.2718248
         gen |  -.2468722   .3407271    -0.72   0.469     -.914685    .4209406
       en_c1 |   .3139416   .3220469     0.97   0.330    -.3172586    .9451419
       self1 |   .0117223   .0181223     0.65   0.518    -.0237966    .0472413
      peduc1 |   2.75e-08   5.65e-07     0.05   0.961    -1.08e-06    1.14e-06
        pexp |   .1541836   .4251223     0.36   0.717    -.6790408     .987408
        lov1 |   1.311146   .4575086     2.87   0.004     .4144459    2.207846
       _cons |  -1.497059   1.292382    -1.16   0.247    -4.030082    1.035964
------------------------------------------------------------------------------

. logit cap1 grd_t1 gen en_c1 self1 peduc1 pexp lov1 fwrk1 mwrk1 hcon1

Logistic regression                               Number of obs   =        533
                                                  LR chi2(10)     =      33.68
                                                  Prob > chi2     =     0.0002
Log likelihood = -127.70161                       Pseudo R2       =     0.1165

------------------------------------------------------------------------------
        cap1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t1 |  -.5309681   .1404224    -3.78   0.000     -.806191   -.2557452
         gen |  -.2521301    .341473    -0.74   0.460    -.9214049    .4171446
       en_c1 |   .3261254   .3236669     1.01   0.314    -.3082499    .9605008
       self1 |   .0118159   .0181746     0.65   0.516    -.0238056    .0474374
      peduc1 |  -2.68e-08   6.01e-07    -0.04   0.964    -1.20e-06    1.15e-06
        pexp |   .1452165   .4267171     0.34   0.734    -.6911337    .9815668
        lov1 |   1.318075   .4593078     2.87   0.004     .4178487    2.218302
       fwrk1 |  -.0210745   .6560481    -0.03   0.974    -1.306905    1.264756
       mwrk1 |   .0487024   .3413627     0.14   0.887    -.6203561     .717761
       hcon1 |   .0005545   .0014215     0.39   0.696    -.0022316    .0033406
       _cons |  -1.681235   1.490729    -1.13   0.259    -4.603011     1.24054
-----------------------------------------------------------------------------.
. logit cap2 grd_t2 gen en_c2 self2 peduc2 pexp lov2

Logistic regression                               Number of obs   =        967
                                                  LR chi2(7)      =      48.38
                                                  Prob > chi2     =     0.0000
Log likelihood =  -232.0831                       Pseudo R2       =     0.0944

------------------------------------------------------------------------------
        cap2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t2 |  -.5863882   .1101888    -5.32   0.000    -.8023542   -.3704222
         gen |  -.2147409   .2554635    -0.84   0.401    -.7154402    .2859584
       en_c2 |  -.2036763    .254177    -0.80   0.423    -.7018541    .2945016
       self2 |   .0180086   .0113568     1.59   0.113    -.0042504    .0402675
      peduc2 |   2.83e-07   4.11e-07     0.69   0.491    -5.22e-07    1.09e-06
        pexp |  -.0401663    .293321    -0.14   0.891     -.615065    .5347323
        lov2 |   .5512945   .3921929     1.41   0.160    -.2173895    1.319979
       _cons |  -.0634553   .9684829    -0.07   0.948    -1.961647    1.834736
------------------------------------------------------------------------------

. logit cap2 grd_t2 gen en_c2 self2 peduc2 pexp lov2 fwrk2 mwrk2 hcon2

Logistic regression                               Number of obs   =        967
                                                  LR chi2(10)     =      50.16
                                                  Prob > chi2     =     0.0000
Log likelihood =  -231.1939                       Pseudo R2       =     0.0979

------------------------------------------------------------------------------
        cap2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t2 |  -.5839264    .110293    -5.29   0.000    -.8000967   -.3677562
         gen |   -.217421   .2561885    -0.85   0.396    -.7195413    .2846993
       en_c2 |   -.218767   .2555049    -0.86   0.392    -.7195474    .2820133
       self2 |   .0192423   .0113934     1.69   0.091    -.0030882    .0415729
      peduc2 |   4.92e-07   4.56e-07     1.08   0.280    -4.01e-07    1.38e-06
        pexp |   -.034632    .293957    -0.12   0.906    -.6107772    .5415131
        lov2 |   .5493892   .3939479     1.39   0.163    -.2227345    1.321513
       fwrk2 |  -.1465971   .4742243    -0.31   0.757     -1.07606    .7828655
       mwrk2 |  -.1720786   .2595972    -0.66   0.507    -.6808797    .3367224
       hcon2 |  -.0012038   .0011825    -1.02   0.309    -.0035215    .0011138
       _cons |   .3545729   1.093378     0.32   0.746    -1.788408    2.497554
-----------------------------------------------------------------------------.
. logit cap3 grd_t3 gen en_c3 self3 peduc3 pexp lov3

Logistic regression                               Number of obs   =        962
                                                  LR chi2(7)      =      67.32
                                                  Prob > chi2     =     0.0000
Log likelihood = -472.41637                       Pseudo R2       =     0.0665

------------------------------------------------------------------------------
        cap3 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t3 |  -.3954613   .0679485    -5.82   0.000    -.5286379   -.2622847
         gen |   .0328329   .1621146     0.20   0.840    -.2849058    .3505716
       en_c3 |   .2740853    .155537     1.76   0.078    -.0307616    .5789323
       self3 |   .0004217   .0054691     0.08   0.939    -.0102976     .011141
      peduc3 |   3.27e-07   2.43e-07     1.35   0.178    -1.49e-07    8.03e-07
        pexp |   .0390463    .181456     0.22   0.830    -.3166009    .3946934
        lov3 |   .4037768   .2429875     1.66   0.097      -.07247    .8800236
       _cons |  -.7609527   .6123723    -1.24   0.214     -1.96118    .4392749
------------------------------------------------------------------------------

. logit cap3 grd_t3 gen en_c3 self3 peduc3 pexp lov3 fwrk3 mwrk3 hcon3

Logistic regression                               Number of obs   =        962
                                                  LR chi2(10)     =      71.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -470.15243                       Pseudo R2       =     0.0710

------------------------------------------------------------------------------
        cap3 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t3 |  -.3884637   .0682104    -5.70   0.000    -.5221536   -.2547739
         gen |   .0264026   .1630921     0.16   0.871    -.2932521    .3460573
       en_c3 |   .2642249   .1563653     1.69   0.091    -.0422454    .5706952
       self3 |  -.0002937   .0054971    -0.05   0.957    -.0110678    .0104805
      peduc3 |   1.91e-07   2.57e-07     0.74   0.458    -3.13e-07    6.94e-07
        pexp |  -.0042702   .1830117    -0.02   0.981    -.3629666    .3544261
        lov3 |   .3895058   .2433826     1.60   0.110    -.0875153    .8665268
       fwrk3 |   .0920612    .319815     0.29   0.773    -.5347646     .718887
       mwrk3 |   .1396861   .1667157     0.84   0.402    -.1870708    .4664429
       hcon3 |   .0012056   .0006524     1.85   0.065    -.0000731    .0024842
       _cons |   -1.10138   .6923175    -1.59   0.112    -2.458297    .2555375
------------------------------------------------------------------------------
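
The two cap3 logits are nested and use the same 962 observations, so the three added regressors (fwrk3, mwrk3, hcon3) can be tested jointly with a likelihood-ratio test. The following Python sketch is an illustration added here, not a computation from the original session. It uses the closed-form upper tail of the chi-square distribution with 3 degrees of freedom so that only the standard library is needed; in Stata the same test would normally be run with `estimates store` followed by `lrtest`.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Log likelihoods copied from the two cap3 logit outputs.
ll_restricted = -472.41637    # 7 regressors
ll_unrestricted = -470.15243  # adds fwrk3, mwrk3, hcon3

lr_stat = 2.0 * (ll_unrestricted - ll_restricted)
p_value = chi2_sf_df3(lr_stat)
print(f"LR = {lr_stat:.2f}, p = {p_value:.3f}")
```

The statistic is about 4.53, the same as the difference between the two reported LR chi2 values (71.84 - 67.32), and the p-value is roughly 0.21, so the added variables are not jointly significant at conventional levels.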


(2) Panel Logit Model


Random Effect Logit Model
. xtlogit cap grd_t gen en_c self peduc pexp lov

Random-effects logistic regression              Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       2.5
                                                               max =         3

                                                Wald chi2(7)       =    102.41
Log likelihood = -832.93481                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         cap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |  -.5385898   .0693589    -7.77   0.000    -.6745308   -.4026489
         gen |  -.0812851   .1851478    -0.44   0.661     -.444168    .2815979
        en_c |  -.0399018   .1531278    -0.26   0.794    -.3400267    .2602231
        self |   .0194503   .0061209     3.18   0.001     .0074535    .0314471
       peduc |   3.42e-07   2.65e-07     1.29   0.196    -1.77e-07    8.61e-07
        pexp |  -.0631149   .2076906    -0.30   0.761    -.4701811    .3439512
         lov |   .8487006   .2480845     3.42   0.001     .3624638    1.334937
       _cons |  -.8042374   .6139304    -1.31   0.190    -2.007519    .3990441
-------------+----------------------------------------------------------------
    /lnsig2u |   .8859762   .1238883                      .6431596    1.128793
-------------+----------------------------------------------------------------
     sigma_u |   1.557354   .0964689                      1.379305    1.758386
         rho |   .4243671   .0302634                       .366401    .4844913
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    95.22 Prob >= chibar2 = 0.000

. xtlogit cap grd_t gen en_c self peduc pexp lov fwrk mwrk hcon

Random-effects logistic regression              Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       2.5
                                                               max =         3

                                                Wald chi2(10)      =    104.97
Log likelihood = -831.30573                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         cap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |  -.5371949   .0695552    -7.72   0.000    -.6735206   -.4008692
         gen |  -.0845841   .1855445    -0.46   0.648    -.4482447    .2790764
        en_c |  -.0351099   .1534229    -0.23   0.819    -.3358133    .2655935
        self |   .0188251   .0061387     3.07   0.002     .0067934    .0308568
       peduc |   2.63e-07   2.76e-07     0.95   0.341    -2.79e-07    8.04e-07
        pexp |  -.0915595   .2087929    -0.44   0.661    -.5007861    .3176672
         lov |   .8426203   .2486474     3.39   0.001     .3552804     1.32996
        fwrk |   .0745231   .3184044     0.23   0.815    -.5495381    .6985844
        mwrk |   .2104419   .1712786     1.23   0.219    -.1252581    .5461418
        hcon |   .0008322   .0006729     1.24   0.216    -.0004866    .0021511
       _cons |  -1.133944   .6946214    -1.63   0.103    -2.495377    .2274892
-------------+----------------------------------------------------------------
    /lnsig2u |   .8884498    .123559                      .6462786    1.130621
-------------+----------------------------------------------------------------
     sigma_u |   1.559281   .0963316                      1.381458    1.759994
         rho |   .4249715   .0301942                      .3671254    .4849479
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =    95.61 Prob >= chibar2 = 0.000
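
The ancillary rows of the random-effects logit output are linked by simple identities: sigma_u = exp(lnsig2u / 2), and rho is the share of the total latent variance due to the individual effect, where the idiosyncratic logistic error has variance pi^2 / 3. The Python sketch below is an added illustration (the function name is an assumption, not Stata terminology) that reproduces the reported values from /lnsig2u alone.

```python
import math

LOGISTIC_VARIANCE = math.pi ** 2 / 3.0  # variance of the standard logistic error


def rho_from_lnsig2u(lnsig2u: float, error_variance: float) -> float:
    """Share of latent-error variance attributable to the individual effect u_i."""
    sigma2_u = math.exp(lnsig2u)
    return sigma2_u / (sigma2_u + error_variance)


# /lnsig2u values copied from the two random-effects logit outputs.
rho_model1 = rho_from_lnsig2u(0.8859762, LOGISTIC_VARIANCE)  # 7 regressors
rho_model2 = rho_from_lnsig2u(0.8884498, LOGISTIC_VARIANCE)  # 10 regressors
sigma_u_model1 = math.sqrt(math.exp(0.8859762))

print(round(sigma_u_model1, 4), round(rho_model1, 4), round(rho_model2, 4))
```

Both rho values come out near 0.42, matching the tables: about 42 percent of the latent variance is attributed to unobserved individual heterogeneity, which is consistent with the strong rejection of rho = 0 in the likelihood-ratio test.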

Fixed Effect Logit Model


. xtlogit cap grd_t gen en_c self peduc pexp lov, fe
note: multiple positive outcomes within groups encountered.
note: 783 groups (1986 obs) dropped due to all positive or
      all negative outcomes.
note: gen omitted due to no within-group variance.
note: pexp omitted due to no within-group variance.

Conditional fixed-effects logistic regression   Number of obs      =       476
Group variable (i): bysid                       Number of groups   =       186

                                                Obs per group: min =         2
                                                               avg =       2.6
                                                               max =         3

                                                LR chi2(5)         =     27.64
Log likelihood = -157.27525                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         cap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |   -.235965   .1262152    -1.87   0.062    -.4833422    .0114121
        en_c |  -.4316349   .2256863    -1.91   0.056     -.873972    .0107021
        self |   .0283472   .0091226     3.11   0.002     .0104672    .0462273
       peduc |   5.08e-07   4.24e-07     1.20   0.231    -3.24e-07    1.34e-06
         lov |    1.32447   .4403175     3.01   0.003     .4614631    2.187476
------------------------------------------------------------------------------


. xtlogit cap grd_t gen en_c self peduc pexp lov fwrk mwrk hcon, fe
note: multiple positive outcomes within groups encountered.
note: 783 groups (1986 obs) dropped due to all positive or
      all negative outcomes.
note: gen omitted due to no within-group variance.
note: pexp omitted due to no within-group variance.

Conditional fixed-effects logistic regression   Number of obs      =       476
Group variable (i): bysid                       Number of groups   =       186

                                                Obs per group: min =         2
                                                               avg =       2.6
                                                               max =         3

                                                LR chi2(8)         =     31.11
Log likelihood = -155.54104                     Prob > chi2        =    0.0001

------------------------------------------------------------------------------
         cap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |  -.2444615   .1279864    -1.91   0.056    -.4953103    .0063873
        en_c |   -.418799   .2272978    -1.84   0.065    -.8642945    .0266966
        self |   .0280334   .0092571     3.03   0.002     .0098899    .0461769
       peduc |   5.50e-07   4.24e-07     1.30   0.194    -2.80e-07    1.38e-06
         lov |   1.366901   .4507265     3.03   0.002     .4834936    2.250309
        fwrk |   .2951742   .5295349     0.56   0.577    -.7426951    1.333043
        mwrk |   .5092987   .2859697     1.78   0.075    -.0511917    1.069789
        hcon |  -.0001831   .0009532    -0.19   0.848    -.0020514    .0016851
------------------------------------------------------------------------------
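
The notes printed by the fixed-effects (conditional) logit explain the much smaller estimation sample: a group contributes to the conditional likelihood only if its outcome changes at least once, so groups with all zeros or all ones are dropped, and time-invariant regressors such as gen and pexp cannot be identified. The Python sketch below illustrates that selection rule on a hypothetical toy panel (the data and the function name are made up for illustration) and then repeats the bookkeeping from the notes.

```python
from collections import defaultdict


def informative_groups(records):
    """Return the groups whose binary outcome varies within the group.

    records: iterable of (group_id, outcome) pairs with outcome in {0, 1}.
    Groups with all-zero or all-one outcomes are excluded, since they carry
    no information in the conditional likelihood.
    """
    outcomes = defaultdict(list)
    for group_id, y in records:
        outcomes[group_id].append(y)
    return {g: ys for g, ys in outcomes.items() if 0 < sum(ys) < len(ys)}


# Hypothetical toy panel: only group "c" switches outcome over time.
toy_panel = [("a", 0), ("a", 0), ("a", 0),
             ("b", 1), ("b", 1),
             ("c", 0), ("c", 1), ("c", 1),
             ("d", 1)]
kept = informative_groups(toy_panel)
print(kept)

# Bookkeeping implied by the notes in the xtlogit, fe output.
groups_left = 969 - 783   # groups remaining after the drop
obs_left = 2462 - 1986    # observations remaining after the drop
print(groups_left, obs_left)
```

The remaining 186 groups and 476 observations match the header of the fixed-effects output, so the fixed-effects estimates rest on roughly a fifth of the individuals used by the random-effects model.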

Random Effect Probit Model


. xtprobit cap grd_t gen en_c self peduc pexp lov

Random-effects probit regression                Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       2.5
                                                               max =         3

                                                Wald chi2(7)       =     95.28
Log likelihood = -828.72406                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         cap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |  -.3019551   .0405052    -7.45   0.000    -.3813439   -.2225663
         gen |  -.0383474   .1141064    -0.34   0.737    -.2619919    .1852971
        en_c |  -.0195823   .0906545    -0.22   0.829    -.1972618    .1580971
        self |   .0119719   .0036553     3.28   0.001     .0048075    .0191362
       peduc |   2.18e-07   1.60e-07     1.36   0.174    -9.65e-08    5.32e-07
        pexp |  -.0448952   .1272214    -0.35   0.724    -.2942445    .2044542
         lov |   .5096063   .1503664     3.39   0.001     .2148935    .8043192
       _cons |  -.6191192   .3627304    -1.71   0.088    -1.330058    .0918193
-------------+----------------------------------------------------------------
    /lnsig2u |   .0830638   .1461173                     -.2033208    .3694484
-------------+----------------------------------------------------------------
     sigma_u |   1.042406   .0761568                      .9033363    1.202887
         rho |    .520754   .0364664                      .4493442    .5913257
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =   110.69 Prob >= chibar2 = 0.000

. xtprobit cap grd_t gen en_c self peduc pexp lov fwrk mwrk hcon

Random-effects probit regression                Number of obs      =      2462
Group variable (i): bysid                       Number of groups   =       969

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       2.5
                                                               max =         3

                                                Wald chi2(10)      =     97.65
Log likelihood = -827.03467                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         cap |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |  -.3014566    .040651    -7.42   0.000     -.381131   -.2217822
         gen |  -.0398643   .1144461    -0.35   0.728    -.2641745    .1844459
        en_c |  -.0172767   .0908542    -0.19   0.849    -.1953477    .1607942
        self |   .0115506   .0036726     3.15   0.002     .0043525    .0187486
       peduc |   1.78e-07   1.66e-07     1.07   0.285    -1.48e-07    5.04e-07
        pexp |  -.0623195   .1280928    -0.49   0.627    -.3133768    .1887378
         lov |   .5077915   .1508099     3.37   0.001     .2122095    .8033735
        fwrk |   .0492487   .1878422     0.26   0.793    -.3189153    .4174127
        mwrk |   .1370676   .1026576     1.34   0.182    -.0641376    .3382728
        hcon |   .0004721   .0004036     1.17   0.242    -.0003189    .0012631
       _cons |  -.8212392    .410698    -2.00   0.046    -1.626192    -.016286
-------------+----------------------------------------------------------------
    /lnsig2u |   .0884781   .1454779                     -.1966532    .3736095
-------------+----------------------------------------------------------------
     sigma_u |   1.045232   .0760291                      .9063528    1.205392
         rho |   .5221051   .0362984                      .4509945    .5923309
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =   111.15 Prob >= chibar2 = 0.000
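
Two features of the random-effects probit output can be checked by hand. First, the probit idiosyncratic error is normalized to variance 1, so rho = sigma_u^2 / (sigma_u^2 + 1). Second, logit and probit coefficients are on different scales; a commonly cited rule of thumb puts logit coefficients at roughly 1.6 to 1.8 times their probit counterparts. The Python sketch below is an added illustration using numbers copied from the seven-regressor random-effects logit and probit tables; the rule of thumb is an approximation, not an exact identity.

```python
import math

# rho for the random-effects probit (7 regressors): error variance is 1.
sigma2_u = math.exp(0.0830638)          # exp(/lnsig2u)
rho_probit = sigma2_u / (sigma2_u + 1.0)

# Coefficients copied from the 7-regressor random-effects logit and probit.
logit_coef = {"grd_t": -0.5385898, "self": 0.0194503, "lov": 0.8487006}
probit_coef = {"grd_t": -0.3019551, "self": 0.0119719, "lov": 0.5096063}

# Ratio of logit to probit coefficient for each significant regressor.
ratios = {name: logit_coef[name] / probit_coef[name] for name in logit_coef}

print(round(rho_probit, 4))
for name, ratio in ratios.items():
    print(f"{name}: logit/probit = {ratio:.2f}")
```

The computed rho reproduces the reported .520754, and the three ratios fall between 1.6 and 1.8, so the logit and probit fits tell the same substantive story once the scale difference is taken into account.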


5.5 Tobit Models


(1) Cross-sectional Model
. tobit peduc1 grd_t1 gen self1 fwrk1 mwrk1 hcon1,ll(0)

Tobit regression                                  Number of obs   =        533
                                                  LR chi2(6)      =      99.38
                                                  Prob > chi2     =     0.0000
Log likelihood = -4449.1666                       Pseudo R2       =     0.0110

------------------------------------------------------------------------------
      peduc1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t1 |  -69132.15   12550.72    -5.51   0.000    -93787.74   -44476.57
         gen |  -33386.33   36096.23    -0.92   0.355    -104296.5    37523.83
       self1 |   3247.652   2182.806     1.49   0.137    -1040.417    7535.722
       fwrk1 |   38637.71   66284.14     0.58   0.560    -91575.87    168851.3
       mwrk1 |   6921.358   36296.71     0.19   0.849    -64382.64    78225.36
       hcon1 |   941.3275   154.9461     6.08   0.000     636.9397    1245.715
       _cons |   137099.9   98821.38     1.39   0.166    -57032.33    331232.1
-------------+----------------------------------------------------------------
      /sigma |     379133   16669.88                      346385.4    411880.6
------------------------------------------------------------------------------
  Obs. summary:        232  left-censored observations at peduc1<=0
                       301     uncensored observations
                         0 right-censored observations

. tobit peduc2 grd_t2 gen self2 fwrk2 mwrk2 hcon2,ll(0)

Tobit regression                                  Number of obs   =        967
                                                  LR chi2(6)      =     208.45
                                                  Prob > chi2     =     0.0000
Log likelihood = -7899.439                        Pseudo R2       =     0.0130

------------------------------------------------------------------------------
      peduc2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t2 |  -74463.66   9711.971    -7.67   0.000    -93522.78   -55404.54
         gen |  -29872.76   27029.12    -1.11   0.269    -82915.67    23170.15
       self2 |   2938.235   1337.816     2.20   0.028     312.8568    5563.613
       fwrk2 |   42909.65   51017.45     0.84   0.401    -57208.81    143028.1
       mwrk2 |  -49075.08   27348.75    -1.79   0.073    -102745.2    4595.077
       hcon2 |   1175.964   115.2732    10.20   0.000     949.7475     1402.18
       _cons |   113167.7   71256.48     1.59   0.113     -26668.5      253004
-------------+----------------------------------------------------------------
      /sigma |   380236.8   12663.46                      355385.6      405088
------------------------------------------------------------------------------
  Obs. summary:        434  left-censored observations at peduc2<=0
                       533     uncensored observations
                         0 right-censored observations

. tobit peduc3 grd_t3 gen self3 fwrk3 mwrk3 hcon2,ll(0)

Tobit regression                                  Number of obs   =        962
                                                  LR chi2(6)      =     141.84
                                                  Prob > chi2     =     0.0000
Log likelihood = -8007.1336                       Pseudo R2       =     0.0088

------------------------------------------------------------------------------
      peduc3 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      grd_t3 |  -49704.54   11718.28    -4.24   0.000    -72701.07   -26708.01
         gen |  -28248.58   32109.92    -0.88   0.379    -91262.64    34765.49
       self3 |   5113.882   1101.812     4.64   0.000     2951.633    7276.132
       fwrk3 |   95647.69   60960.53     1.57   0.117    -23984.21    215279.6
       mwrk3 |  -62685.28   32489.47    -1.93   0.054    -126444.2    1073.629
       hcon2 |    1062.52   141.6674     7.50   0.000     784.5053    1340.535
       _cons |  -77208.26   84013.85    -0.92   0.358    -242081.1    87664.61
-------------+----------------------------------------------------------------
      /sigma |   450255.2   14820.22                      421171.3    479339.1
------------------------------------------------------------------------------
  Obs. summary:        427  left-censored observations at peduc3<=0
                       535     uncensored observations
                         0 right-censored observations

(2) Random Effect Tobit Model


. tobit peduc grd_t gen self fwrk mwrk hcon,ll(0)

Tobit regression                                  Number of obs   =       2462
                                                  LR chi2(6)      =     437.02
                                                  Prob > chi2     =     0.0000
Log likelihood = -20366.663                       Pseudo R2       =     0.0106

------------------------------------------------------------------------------
       peduc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |  -61049.76   6428.978    -9.50   0.000    -73656.54   -48442.98
         gen |  -30716.54   18150.66    -1.69   0.091    -66308.72    4875.641
        self |   3738.761   750.4671     4.98   0.000     2267.147    5210.374
        fwrk |   58701.93    34119.7     1.72   0.085    -8204.417    125608.3
        mwrk |  -39923.77   18304.39    -2.18   0.029    -75817.41   -4030.137
        hcon |   1058.405   74.00697    14.30   0.000     913.2827    1203.528
       _cons |    35328.7   47730.29     0.74   0.459    -58267.08    128924.5
-------------+----------------------------------------------------------------
      /sigma |   408602.8   8441.719                      392049.2    425156.4
------------------------------------------------------------------------------
  Obs. summary:       1093  left-censored observations at peduc<=0
                      1369     uncensored observations
                         0 right-censored observations

Remark : OLS
. reg peduc grd_t gen self fwrk mwrk hcon

      Source |       SS       df       MS              Number of obs =    2462
-------------+------------------------------           F(  6,  2455) =   81.32
       Model |  3.4964e+13     6  5.8274e+12           Prob > F      =  0.0000
    Residual |  1.7593e+14  2455  7.1660e+10           R-squared     =  0.1658
-------------+------------------------------           Adj R-squared =  0.1638
       Total |  2.1089e+14  2461  8.5693e+10           Root MSE      = 2.7e+05

------------------------------------------------------------------------------
       peduc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grd_t |   -29097.3   3698.868    -7.87   0.000    -36350.52   -21844.08
         gen |  -6568.833   10811.98    -0.61   0.544    -27770.38    14632.71
        self |   1943.641   464.3358     4.19   0.000     1033.111    2854.172
        fwrk |    15383.7   19213.28     0.80   0.423    -22292.22    53059.62
        mwrk |  -30321.02   10942.85    -2.77   0.006     -51779.2   -8862.846
        hcon |   756.9652   45.48597    16.64   0.000     667.7703      846.16
       _cons |   155825.2   27827.12     5.60   0.000     101258.1    210392.2
------------------------------------------------------------------------------
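
The OLS remark is easiest to read next to the pooled tobit. With censoring at zero, the tobit coefficient measures the effect on the latent variable, while the effect on the observed expected outcome is the coefficient multiplied by Phi(x'b/sigma), a factor between 0 and 1. The Python sketch below is a rough illustration added here, not the author's computation: it replaces Phi(x'b/sigma) by the sample share of uncensored observations, which is only a crude proxy, and compares the rescaled tobit slopes with the OLS slopes.

```python
# Counts copied from the pooled tobit "Obs. summary".
n_left_censored = 1093
n_uncensored = 1369
share_uncensored = n_uncensored / (n_left_censored + n_uncensored)

# Slope coefficients copied from the pooled tobit and OLS outputs.
tobit_coef = {"grd_t": -61049.76, "gen": -30716.54, "self": 3738.761,
              "fwrk": 58701.93, "mwrk": -39923.77, "hcon": 1058.405}
ols_coef = {"grd_t": -29097.3, "gen": -6568.833, "self": 1943.641,
            "fwrk": 15383.7, "mwrk": -30321.02, "hcon": 756.9652}

# Crude approximation: use the uncensored share in place of Phi(x'b/sigma).
scaled_tobit = {name: share_uncensored * b for name, b in tobit_coef.items()}

print(f"share uncensored = {share_uncensored:.3f}")
for name in tobit_coef:
    print(f"{name:6s} tobit={tobit_coef[name]:12.2f} "
          f"scaled={scaled_tobit[name]:12.2f} ols={ols_coef[name]:12.2f}")
```

Every OLS slope is smaller in absolute value than the corresponding tobit slope, which is the attenuation one expects when about 44 percent of the observations are censored at zero; the rescaled tobit slopes are of the same order of magnitude as the OLS ones, though the match is loose because the proxy ignores how Phi varies across observations.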
