
Lecture 11

Models for Panel Data


11.1 Introduction

In this chapter we consider pooling time-series and cross-sectional data. Two
well-known examples of panel data in the U.S. are:
1. Panel Study of Income Dynamics (PSID)
2. National Longitudinal Survey (NLS)

Advantages of using panel data:
1. A richer source of variation, which allows more efficient estimation of the parameters
2. The ability to control for individual heterogeneity
3. More information than a single cross-section or a single time series
4. The ability to study dynamic behavior, e.g., the duration of unemployment

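11.2 Panel Data Models

\[ y_{it} = \alpha + x_{it}'\beta + w_{it} \]

where $i = 1, \ldots, n$ and $t = 1, \ldots, T$. Here $\alpha$ is a scalar, $\beta$ is $K \times 1$, $x_{it}$ is the $it$th observation on the $K$ explanatory variables, and $w_{it}$ is the disturbance.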

LECTURE 11 PANEL DATA

Error Components Specification:

\[ w_{it} = u_i + \varepsilon_{it} \]

where the $u_i$ are cross-section-specific components and the $\varepsilon_{it}$ are remainder effects.

11.3 Fixed Effects Model

If the $u_i$ are thought of as fixed parameters, $\alpha_i$, to be estimated, then we have

\[ y_{it} = \alpha + x_{it}'\beta + \sum_{j=2}^{n} \alpha_j D_j + \varepsilon_{it} \tag{11.1} \]

where $D_j$ is a dummy variable equal to 1 for observations on the $j$th individual and 0 otherwise. The dummy for the first individual is omitted to avoid the dummy variable trap: with an intercept in the model, the full set of $n$ dummies would sum to the constant regressor. Running OLS on equation (11.1) gives the least squares dummy variable (LSDV) estimator.

If equation (11.1) is the true model, LSDV is BLUE as long as $\varepsilon_{it}$ is the standard i.i.d. disturbance with mean 0 and covariance matrix $\sigma_\varepsilon^2 I_{nT}$.
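As a rough illustration of the LSDV regression in (11.1), the sketch below simulates a small panel and runs OLS with an intercept, the regressors, and $n - 1$ individual dummies. The panel dimensions, slope values, individual effects, and noise level are all hypothetical choices made for this example, not values from the lecture. The last few lines show the dummy variable trap numerically: with all $n$ dummies plus an intercept, the regressor matrix loses one rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, K = 5, 20, 2                               # hypothetical panel dimensions
beta_true = np.array([1.5, -0.7])                # hypothetical slopes
effects = np.array([0.0, 1.0, -1.0, 0.5, 2.0])   # hypothetical individual effects

ids = np.repeat(np.arange(n), T)                 # individual index of each stacked row
X = rng.normal(size=(n * T, K))
y = 2.0 + X @ beta_true + effects[ids] + 0.1 * rng.normal(size=n * T)

# Dummies D_2, ..., D_n (individuals 1..n-1 in 0-based indexing); D_1 is omitted.
D = (ids[:, None] == np.arange(1, n)[None, :]).astype(float)
Z = np.column_stack([np.ones(n * T), X, D])      # intercept, x_it, dummies

coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
b_lsdv = coef[1:1 + K]                           # LSDV slope estimates

# Dummy variable trap: including D_1 as well makes the columns linearly dependent.
D_all = (ids[:, None] == np.arange(n)[None, :]).astype(float)
Z_trap = np.column_stack([np.ones(n * T), X, D_all])
rank_trap = np.linalg.matrix_rank(Z_trap)

print("LSDV slopes:", b_lsdv)
print("columns vs. rank with all dummies:", Z_trap.shape[1], rank_trap)
```

With the first dummy omitted, the intercept absorbs the effect of individual 1 and each dummy coefficient measures the difference from that individual.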

11.3.1 Testing for fixed effects

One could test the joint significance of these dummies, i.e., $H_0: \alpha_2 = \alpha_3 = \cdots = \alpha_n = 0$, by performing an $F$-test. Under $H_0$, the model becomes the pooled regression

\[ y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}. \]

Denote the sum of squared residuals from the pooled regression by $SSR_{pooled}$ and that from the LSDV regression (11.1) by $SSR_{LSDV}$. Under $H_0$,

\[ F = \frac{(SSR_{pooled} - SSR_{LSDV})/(n-1)}{SSR_{LSDV}/(nT - n - K)} \sim F_{n-1,\; n(T-1)-K} \]
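The $F$ statistic above can be computed from two OLS fits. The following is a minimal sketch on simulated data; the helper `ssr` and all numerical values (panel size, effects, noise level) are hypothetical and chosen only so that the individual effects are clearly present. The resulting value would be compared with a critical value of the $F_{n-1,\,n(T-1)-K}$ distribution.

```python
import numpy as np

def ssr(Z, y):
    """Sum of squared OLS residuals from regressing y on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return float(resid @ resid)

rng = np.random.default_rng(0)
n, T, K = 5, 20, 2                               # hypothetical panel dimensions
effects = np.array([0.0, 1.0, -1.0, 0.5, 2.0])   # hypothetical individual effects
ids = np.repeat(np.arange(n), T)
X = rng.normal(size=(n * T, K))
y = 2.0 + X @ np.array([1.5, -0.7]) + effects[ids] + 0.5 * rng.normal(size=n * T)

const = np.ones((n * T, 1))
D = (ids[:, None] == np.arange(1, n)[None, :]).astype(float)   # D_2, ..., D_n

ssr_pooled = ssr(np.hstack([const, X]), y)       # restricted model (no dummies)
ssr_lsdv = ssr(np.hstack([const, X, D]), y)      # unrestricted model (11.1)

df1, df2 = n - 1, n * T - n - K                  # numerator and denominator d.f.
F = ((ssr_pooled - ssr_lsdv) / df1) / (ssr_lsdv / df2)
print(f"F({df1}, {df2}) = {F:.2f}")
```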

11.4 Random Effects Model

The fixed effects model has many parameters (one for each individual), and the resulting loss of degrees of freedom can be avoided if the $u_i$ can be assumed random.
© Yin-Feng Gau 2002, ECONOMETRICS

Assume $u_i \sim \text{i.i.d.}(0, \sigma_u^2)$, $\varepsilon_{it} \sim \text{i.i.d.}(0, \sigma_\varepsilon^2)$, and the $u_i$ are independent of the $\varepsilon_{it}$. In addition, the $x_{it}$ are independent of the $u_i$ and $\varepsilon_{it}$ for all $i$ and $t$. The random effects model is an appropriate specification if we are drawing $n$ individuals randomly from a large population.

The random effects specification implies a homoskedastic variance, $\mathrm{Var}(w_{it}) = \sigma_u^2 + \sigma_\varepsilon^2$ for all $i$ and $t$, and serial correlation over time only between the disturbances of the same individual:

\[ \mathrm{Cov}(w_{it}, w_{js}) = \begin{cases} \sigma_u^2 + \sigma_\varepsilon^2 & \text{for } i = j,\ t = s \\ \sigma_u^2 & \text{for } i = j,\ t \neq s \\ 0 & \text{otherwise.} \end{cases} \]

This also means that the correlation coefficient between $w_{it}$ and $w_{js}$ is

\[ \rho = \mathrm{Corr}(w_{it}, w_{js}) = \begin{cases} 1 & \text{for } i = j,\ t = s \\ \sigma_u^2 / (\sigma_u^2 + \sigma_\varepsilon^2) & \text{for } i = j,\ t \neq s \\ 0 & \text{otherwise.} \end{cases} \]

Under the random effects model, GLS based on the true variance components is BLUE, and the feasible GLS estimators are asymptotically efficient as either $n \to \infty$ or $T \to \infty$.
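The notes do not spell out how the GLS estimator is computed. One standard route, assumed here rather than taken from the lecture, is to subtract a fraction $\theta = 1 - \sigma_\varepsilon / \sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}$ of each individual's time mean from every variable (including the constant) and then run OLS. The sketch below uses hypothetical variance components that are treated as known, builds the block-diagonal covariance matrix implied by the formulas above, and checks that explicit GLS and the quasi-demeaned OLS give the same coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 50, 4                                   # hypothetical panel dimensions
sigma_u, sigma_e = 1.0, 0.5                    # hypothetical, treated as known
ids = np.repeat(np.arange(n), T)               # rows stacked individual by individual

x = rng.normal(size=n * T)
u = sigma_u * rng.normal(size=n)
y = 1.0 + 2.0 * x + u[ids] + sigma_e * rng.normal(size=n * T)
Z = np.column_stack([np.ones(n * T), x])       # constant and one regressor

# Covariance of one individual's T disturbances: sigma_e^2 I_T + sigma_u^2 11'.
Sigma = sigma_e**2 * np.eye(T) + sigma_u**2 * np.ones((T, T))
rho = Sigma[0, 1] / Sigma[0, 0]                # correlation for t != s

# Explicit GLS with the block-diagonal covariance matrix I_n (x) Sigma.
Omega_inv = np.kron(np.eye(n), np.linalg.inv(Sigma))
b_gls = np.linalg.solve(Z.T @ Omega_inv @ Z, Z.T @ Omega_inv @ y)

# Same estimator via quasi-demeaning followed by OLS.
theta = 1.0 - sigma_e / np.sqrt(sigma_e**2 + T * sigma_u**2)

def quasi_demean(a):
    """Subtract theta times the individual time mean from each observation."""
    a3 = a.reshape(n, T, -1)
    return (a3 - theta * a3.mean(axis=1, keepdims=True)).reshape(a.shape)

b_qd, *_ = np.linalg.lstsq(quasi_demean(Z), quasi_demean(y), rcond=None)

print("rho:", rho)
print("GLS:", b_gls, "quasi-demeaned OLS:", b_qd)
```

In practice the variance components are unknown and must be estimated first, which gives the feasible GLS estimator mentioned above.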

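11.4.1 Testing for Random Effects

Breusch and Pagan (1980) derived a Lagrange multiplier (LM) test for the random effects model based on the OLS residuals $e_{it}$ from the pooled regression. For

\[ H_0: \sigma_u^2 = 0 \quad (\text{or } \mathrm{Corr}[w_{it}, w_{is}] = 0 \text{ for } t \neq s) \]
\[ H_1: \sigma_u^2 \neq 0 \]

the test statistic is

\[ LM = \frac{nT}{2(T-1)} \left[ \frac{\sum_{i=1}^{n} \left( \sum_{t=1}^{T} e_{it} \right)^2}{\sum_{i=1}^{n} \sum_{t=1}^{T} e_{it}^2} - 1 \right]^2 \sim \chi^2(1) \]

under $H_0$.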
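The LM statistic needs only the pooled OLS residuals. A minimal sketch follows; the function name `breusch_pagan_lm` and the simulated data are hypothetical, and the residuals are assumed to be stacked individual by individual with $T$ observations each. The result would be compared with a $\chi^2(1)$ critical value (3.84 at the 5% level).

```python
import numpy as np

def breusch_pagan_lm(e, n, T):
    """LM statistic from pooled OLS residuals stacked individual by individual."""
    e = np.asarray(e, dtype=float).reshape(n, T)
    # Ratio of sum_i (sum_t e_it)^2 to sum_i sum_t e_it^2.
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return n * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

rng = np.random.default_rng(2)
n, T = 50, 4                                   # hypothetical panel dimensions
ids = np.repeat(np.arange(n), T)
x = rng.normal(size=n * T)
u = rng.normal(size=n)                         # random effects with sigma_u = 1
y = 1.0 + 2.0 * x + u[ids] + 0.5 * rng.normal(size=n * T)

# Pooled OLS residuals.
Z = np.column_stack([np.ones(n * T), x])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
e = y - Z @ coef

lm = breusch_pagan_lm(e, n, T)
print(f"LM = {lm:.2f}")
```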

11.4.2 Hausman Test for Fixed or Random Effects

Hausman (1978) derived a test based on the idea that under the hypothesis of no correlation between the individual effects and the regressors, both OLS in the LSDV model and GLS are consistent, but OLS is inefficient, whereas under the alternative, OLS is consistent, but GLS is not. Therefore, under the null hypothesis, the two estimates should not differ systematically, and a test can be based on the difference.

To test the difference, we need the covariance matrix of the difference vector, $[b - \hat\beta]$, where $b$ is the OLS estimator in the LSDV model and $\hat\beta$ is the GLS estimator:

\[ \mathrm{Var}[b - \hat\beta] = \mathrm{Var}[b] + \mathrm{Var}[\hat\beta] - \mathrm{Cov}[b, \hat\beta] - \mathrm{Cov}[b, \hat\beta]' \]

Hausman's essential result is that the covariance of an efficient estimator with its difference from an inefficient estimator is zero, which implies that

\[ \mathrm{Cov}[(b - \hat\beta), \hat\beta] = \mathrm{Cov}[b, \hat\beta] - \mathrm{Var}[\hat\beta] = 0 \]

or

\[ \mathrm{Cov}[b, \hat\beta] = \mathrm{Var}[\hat\beta]. \]

Denote

\[ \Psi = \mathrm{Var}[b - \hat\beta] = \mathrm{Var}[b] - \mathrm{Var}[\hat\beta]. \]

The chi-squared test is based on the Wald criterion:

\[ W = \chi^2[K] = [b - \hat\beta]' \hat\Psi^{-1} [b - \hat\beta] \]

For $\hat\Psi$, we use the estimated covariance matrix of the slope estimator in the LSDV model and the estimated covariance matrix in the random effects model, excluding the constant term.
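Given the two slope estimates and their estimated covariance matrices, the Wald statistic is a single quadratic form. The sketch below is illustrative only: the function name `hausman` and the numbers passed to it are made up for this example and do not come from any real data set.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Wald statistic W = d' Psi^{-1} d with d = b_fe - b_re, Psi = V_fe - V_re.

    Inputs are the slope estimates (constant excluded) and their estimated
    covariance matrices from the LSDV and random effects models.
    Returns the statistic and its degrees of freedom.
    """
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    Psi = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    W = float(d @ np.linalg.solve(Psi, d))
    return W, d.size

# Hypothetical estimates for K = 2 slopes.
b_fe = [1.0, 2.0]
b_re = [0.9, 2.1]
V_fe = np.diag([0.05, 0.05])
V_re = np.diag([0.03, 0.04])

W, df = hausman(b_fe, b_re, V_fe, V_re)
# Here W = 0.1**2 / 0.02 + 0.1**2 / 0.01 = 1.5, with 2 degrees of freedom.
print(f"W = {W:.3f} with {df} degrees of freedom")
```

In finite samples the estimated $\hat\Psi$ need not be positive definite, in which case the statistic computed this way can be negative and should be interpreted with care.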

More issues:
1. Heteroskedasticity and robust covariance estimation: Greene (2003), Section 13.5
2. Autocorrelation: Greene (2003), Section 13.6

References
Baltagi, B. H., 1998, Econometrics, Springer. Chapter 12.
Hsiao, C., 1986, Analysis of Panel Data, Cambridge University Press.
Greene, W. H., 2003, Econometric Analysis, 5th ed., Prentice Hall. Chapter 13.
Gujarati, D. N., 2003, Basic Econometrics, 4th ed., McGraw-Hill. Chapter 16.