MULTICOLLINEARITY:
Agenda:
The nature of multicollinearity
Practical consequences
Detection
Remedial measures to alleviate the problem
REASONS:
Data collection process
Constraints on the model or on the population being sampled
Model specification
An over-determined model
Perfect multicollinearity is the case when two or more independent variables have an exact linear relationship.
Near (imperfect) multicollinearity is the case when two or more independent variables have a less-than-perfect linear relationship.
X_{1i} = \frac{\lambda_2}{\lambda_1} X_{2i} + \frac{\lambda_3}{\lambda_1} X_{3i} + \cdots + \frac{\lambda_k}{\lambda_1} X_{ki} + \frac{1}{\lambda_1} e_i
With perfect collinearity, OLS cannot produce unique estimates. Minimizing the residual sum of squares

\min \sum \hat{u}_i^2 = \sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_{2i} - \hat{\beta}_3 X_{3i}\right)^2

gives

\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}_2 - \hat{\beta}_3 \bar{X}_3

If X_{2i} = \lambda X_{3i}, substituting into the solution for \hat{\beta}_2 yields

\hat{\beta}_2 = \frac{\left(\sum y_i x_{3i}\right)\lambda \sum x_{3i}^2 - \left(\sum y_i x_{3i}\right)\lambda \sum x_{3i}^2}{\lambda^2 \left(\sum x_{3i}^2\right)^2 - \lambda^2 \left(\sum x_{3i}^2\right)^2} = \frac{0}{0}

which is indeterminate.
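The 0/0 result can also be seen numerically: when one regressor is an exact multiple of another, the matrix X'X is singular, so the normal equations have no unique solution. A minimal numpy sketch (the data values are made up for illustration):

```python
import numpy as np

# Design matrix with an intercept, X2, and X3 = 2 * X2 (perfect collinearity).
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x3 = 2.0 * x2                      # lambda = 2: exact linear dependence
X = np.column_stack([np.ones_like(x2), x2, x3])

XtX = X.T @ X
rank = np.linalg.matrix_rank(XtX)  # 2, not 3: X'X is singular
print(rank)
# Because X'X cannot be inverted, OLS has no unique solution --
# the same indeterminacy as the 0/0 expression above.
```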
OLS ESTIMATION
\operatorname{var}(\hat{\beta}_1) = \left[\frac{1}{n} + \frac{\bar{X}_2^2 \sum x_{3i}^2 + \bar{X}_3^2 \sum x_{2i}^2 - 2\bar{X}_2 \bar{X}_3 \sum x_{2i} x_{3i}}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2}\right] \sigma^2

\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum x_{2i}^2 \left(1 - r_{2,3}^2\right)}, \qquad \operatorname{var}(\hat{\beta}_3) = \frac{\sigma^2}{\sum x_{3i}^2 \left(1 - r_{2,3}^2\right)}
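The factor 1/(1 − r²₂,₃) in these variances is the variance inflation factor (VIF): it multiplies the variance that β̂₂ would have in the absence of collinearity. A quick check with some illustrative correlation values:

```python
# VIF implied by the variance formula above:
# var(beta2_hat) = [sigma^2 / sum(x2^2)] * 1 / (1 - r^2)
def vif(r: float) -> float:
    """Factor by which collinearity (correlation r) inflates var(beta_hat)."""
    return 1.0 / (1.0 - r ** 2)

print(vif(0.0))   # 1.0  -> no inflation
print(vif(0.9))   # ~5.26
print(vif(0.99))  # ~50.25
```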
As the degree of collinearity approaches one, the variances of the coefficients approach infinity. Thus the presence of high collinearity inflates the standard errors of the estimates.
PRACTICAL CONSEQUENCES
OLS is still BLUE, but large variances and covariances make precise estimation difficult.
Large variances produce wide confidence intervals, so hypotheses are accepted too readily.
t-statistics tend to be statistically insignificant.
Although the t-statistics are low, R-square can be very high.
The estimators and their variances are very sensitive to small changes in the data.
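A small simulation (simulated data, numpy only) illustrates the first point: with the same error variance σ², the standard error of β̂₂ is far larger when X₂ and X₃ are nearly collinear than when they are unrelated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

x2 = rng.normal(size=n)
x3_ortho = rng.normal(size=n)               # essentially unrelated to x2
x3_coll = x2 + 0.05 * rng.normal(size=n)    # nearly collinear with x2

def slope_se(x2, x3, sigma=1.0):
    """Standard error of beta2_hat from var(b) = sigma^2 * (X'X)^{-1}."""
    X = np.column_stack([np.ones_like(x2), x2, x3])
    cov = sigma ** 2 * np.linalg.inv(X.T @ X)
    return np.sqrt(cov[1, 1])

se_low = slope_se(x2, x3_ortho)
se_high = slope_se(x2, x3_coll)
print(se_low, se_high)   # the collinear design gives a far larger SE
```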
[Figure: variance inflation factor (VIF, vertical axis, 0 to 60) versus the correlation between regressors (horizontal axis, 0 to 1.2); the VIF rises sharply as the correlation approaches 1.]
H_0: \beta_2 = \beta_3 = \cdots = \beta_k = 0
H_a: not all slope coefficients are simultaneously zero

F = \frac{n-k}{k-1} \cdot \frac{ESS}{RSS} = \frac{n-k}{k-1} \cdot \frac{R^2}{1-R^2} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}

Due to the high R-square, the F value will be large and rejection of H_0 will be easy, even when the individual t-statistics are insignificant.
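The identity above lets us compute the overall F statistic directly from R². A short sketch (the values of R², k, and n are illustrative):

```python
def f_from_r2(r2: float, k: int, n: int) -> float:
    """Overall F statistic from R^2, using F = [R^2/(k-1)] / [(1-R^2)/(n-k)].

    k is the number of parameters including the intercept; n is the sample size.
    """
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))

# With a high R^2 the joint F test rejects easily even if t-stats are low:
print(f_from_r2(0.9, k=4, n=30))   # approximately 78
```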
DETECTION
Multicollinearity is a question of degree. It is a feature of the sample, not of the population.
How to detect:
High R-square but few significant t-statistics
High correlation coefficients among the independent variables
Auxiliary regressions
High VIF (variance inflation factor)
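The second symptom, high pairwise correlations, can be checked directly from the correlation matrix of the regressors. A sketch with simulated data (x2 is constructed to be collinear with x1):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.3 * rng.normal(size=n)   # built to be collinear with x1
x3 = rng.normal(size=n)                      # unrelated regressor

# Pairwise correlations among regressors: entries near +/-1 flag collinearity.
corr = np.corrcoef(np.vstack([x1, x2, x3]))
print(np.round(corr, 2))
```

Note that high pairwise correlations are sufficient but not necessary evidence; collinearity can involve more than two variables, which is what the auxiliary regressions below catch.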
AUXILIARY REGRESSION
H_0: the variable X_i is not collinear with the other regressors
X_i = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 X_3 + \cdots + \alpha_k X_k, \qquad R^2 \equiv R_i^2
Run a regression in which one X is the dependent variable and the remaining Xs are the regressors, and obtain its R-square, R_i^2. Here k is the number of explanatory variables including the intercept, and n is the sample size. If the F statistic for this auxiliary regression exceeds the critical F value, the variable X_i is collinear with the others. Rule of thumb: if the R-square of an auxiliary regression is higher than the overall R-square, multicollinearity may be troublesome.
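The auxiliary-regression procedure can be sketched with plain numpy least squares: regress column i on the remaining columns, compute R_i^2, and note that 1/(1 − R_i^2) is exactly the VIF for that variable. The data below are simulated, with x2 built to be collinear with x1:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # strongly collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def aux_r2(X: np.ndarray, i: int) -> float:
    """R^2 of the auxiliary regression of column i on the other columns."""
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])   # add an intercept
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1.0 - resid.var() / y.var()

r2_1 = aux_r2(X, 0)
print(r2_1, 1.0 / (1.0 - r2_1))   # high R_i^2 -> high VIF for x1
```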
WHAT TO DO ?
Do nothing.
Combine cross-sectional and time-series data (pooling).
Transform the variables (differencing, ratio transformations).
Obtain additional data or new observations.
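Differencing can help when the collinearity comes from a shared trend: two trending series are nearly perfectly correlated in levels, but their first differences need not be. A simulated sketch (numpy only; the common trend and noise are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(100.0)
x = t + rng.normal(size=100)    # two series sharing a common linear trend
z = t + rng.normal(size=100)

corr_levels = np.corrcoef(x, z)[0, 1]
corr_diffs = np.corrcoef(np.diff(x), np.diff(z))[0, 1]
print(corr_levels, corr_diffs)  # near 1 in levels, near 0 after differencing
```

The trade-off, not shown here, is that differencing changes the model being estimated and can induce serial correlation in the transformed error term.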
READING