Multiple Regression
Outline
3.1. Definitions
3.1.1 Multiple Regression Model
3.1.2 Population regression function
3.1.3 Sample regression function
3.1 Definitions
3.1.1 Multiple regression model
Review of Chapter 2:
- The error u arises because of factors, or variables, that influence Y but are not included in the regression function.
- The key assumption 3 (that all other factors affecting y are uncorrelated with x) is often unrealistic, which makes it difficult to draw ceteris paribus conclusions about how x affects y.
3.1.1 Multiple regression model
• The multiple regression model is more amenable to ceteris paribus analysis.
• Adding more factors to the model allows more of the variation in y to be explained, giving a better model for predicting the dependent variable.
Example: comparing two models
consum = β1 + β2·income + β3·asset + ui (1)
consum = β1 + β2·income + ui (2)
We know that the simple regression estimate β̂2 from (2) does not usually equal the multiple regression estimate β̂2 from (1). There are two distinct cases where the two estimates are identical:
(i) the OLS estimate of β3 in (1) is zero, i.e., asset has no partial effect on consum;
(ii) income and asset are uncorrelated in the sample.
A numerical sketch of case (ii) follows.
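Below is a minimal sketch with synthetic data in which income and asset are generated independently, so the simple and multiple regression estimates of β2 nearly coincide (all numbers are invented for illustration):

```python
import numpy as np

# Synthetic data: income and asset drawn independently, so case (ii) applies
rng = np.random.default_rng(0)
income = rng.normal(10, 2, 500)
asset = rng.normal(5, 1, 500)
consum = 1 + 0.6 * income + 0.2 * asset + rng.normal(0, 1, 500)

# Model (1): consum on income and asset; model (2): consum on income only
X1 = np.column_stack([np.ones(500), income, asset])
X2 = np.column_stack([np.ones(500), income])
b_mult, *_ = np.linalg.lstsq(X1, consum, rcond=None)
b_simp, *_ = np.linalg.lstsq(X2, consum, rcond=None)
print(b_mult[1], b_simp[1])  # nearly equal: income and asset are (almost) uncorrelated
```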
The MRM can accommodate fairly general functional relationships:
consum = β1 + β2·income + β3·income² + ui
The MRM is the most widely used vehicle for empirical analysis in economics and other social sciences:
Wage = f(educ, exper)
Q = f(K, L)
3.1.2 Population regression function
Yi = β1 + β2X2i + … + βkXki + ui
• Y: the dependent variable (criterion)
• X2, …, Xk: two or more independent variables (predictors)
• ui: the stochastic disturbance term
• Sample size: at least 50 observations (and at least 10 times as many cases as independent variables) is advisable
• β1 is the intercept
• βk measures the change in Y with respect to Xk, holding the other factors fixed
3.1.3 The Sample Regression Function (SRF)
• Population regression function (two-variable case): E(Y|Xi) = f(Xi) = β1 + β2Xi
• Sample regression function: Ŷi = β̂1 + β̂2Xi
3.2.1 OLS Estimators
OLS chooses β̂1, β̂2, β̂3 to minimize the residual sum of squares:
Σûi² = Σ(Yi − β̂1 − β̂2X2i − β̂3X3i)² → min
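A minimal numpy sketch of this minimization (the data are hypothetical, invented purely for illustration; np.linalg.lstsq returns the coefficients that minimize the residual sum of squares):

```python
import numpy as np

# Hypothetical data for the three-variable model
Y  = np.array([10., 12., 15., 14., 18., 20., 17., 16., 19., 21.])
X2 = np.array([ 2.,  3.,  4.,  4.,  5.,  6.,  5.,  4.,  6.,  7.])
X3 = np.array([ 1.,  2.,  2.,  3.,  3.,  4.,  4.,  3.,  5.,  5.])

# Design matrix: a column of ones for the intercept, then the regressors
X = np.column_stack([np.ones_like(X2), X2, X3])

# Least squares minimizes sum(u_hat^2) = sum((Y - X @ beta)^2)
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # [beta1_hat, beta2_hat, beta3_hat]
```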
3.2.1 OLS Estimators
• Set the partial derivatives of Σûi² with respect to β̂1, β̂2, β̂3 equal to zero.
• If we denote the deviations from sample means
yi = Yi − Ȳ, x2i = X2i − X̄2, x3i = X3i − X̄3
then
Σx2i² = ΣX2i² − nX̄2²
Σx3i² = ΣX3i² − nX̄3²
Σyi² = ΣYi² − nȲ²
Σx2ix3i = ΣX2iX3i − nX̄2X̄3
Σyix2i = ΣYiX2i − nȲX̄2
Σyix3i = ΣYiX3i − nȲX̄3
3.2.1 OLS Estimators
• We obtain:
β̂2 = (Σyix2i·Σx3i² − Σyix3i·Σx2ix3i) / (Σx2i²·Σx3i² − (Σx2ix3i)²)
β̂3 = (Σyix3i·Σx2i² − Σyix2i·Σx2ix3i) / (Σx2i²·Σx3i² − (Σx2ix3i)²)
β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3
3.2.1 OLS Estimators
• Worked example. We are given (n = 10):
ΣYi = 160   ΣYi² = 2616
ΣX2i = 50   ΣX2i² = 274
ΣX3i = 60   ΣX3i² = 390
Ȳ = 16     ΣYiX2i = 835
X̄2 = 5     ΣYiX3i = 920
X̄3 = 6     ΣX2iX3i = 274
3.2.1 OLS Estimators
• In deviation form:
Σyi² = ΣYi² − nȲ² = 56
Σx2i² = ΣX2i² − nX̄2² = 24
Σx3i² = ΣX3i² − nX̄3² = 30
Σyix2i = ΣYiX2i − nȲX̄2 = 35
Σyix3i = ΣYiX3i − nȲX̄3 = −40
Σx2ix3i = ΣX2iX3i − nX̄2X̄3 = −26
(The last two are negative: 920 − 10·16·6 = −40 and 274 − 10·5·6 = −26.)
3.2.1 OLS Estimators
• Substituting into the formulas above:
β̂2 = 0.2272
β̂3 = −1.1363
β̂1 = 21.6818
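As a check, these estimates can be reproduced with plain arithmetic from the sums given above (no data beyond the slide's sums are needed):

```python
# Reproduce the worked example from the raw sums (n = 10)
n = 10
sum_Y, sum_Y2 = 160, 2616
sum_X2, sum_X2sq = 50, 274
sum_X3, sum_X3sq = 60, 390
sum_YX2, sum_YX3, sum_X2X3 = 835, 920, 274

Ybar, X2bar, X3bar = sum_Y / n, sum_X2 / n, sum_X3 / n

# Deviation-form sums
Sx2x2 = sum_X2sq - n * X2bar**2       # 24
Sx3x3 = sum_X3sq - n * X3bar**2       # 30
Syx2  = sum_YX2 - n * Ybar * X2bar    # 35
Syx3  = sum_YX3 - n * Ybar * X3bar    # -40
Sx2x3 = sum_X2X3 - n * X2bar * X3bar  # -26

D  = Sx2x2 * Sx3x3 - Sx2x3**2           # 44
b2 = (Syx2 * Sx3x3 - Syx3 * Sx2x3) / D  # 0.2273
b3 = (Syx3 * Sx2x2 - Syx2 * Sx2x3) / D  # -1.1364
b1 = Ybar - b2 * X2bar - b3 * X3bar     # 21.6818
print(b1, b2, b3)
```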
3.2.1 OLS Estimators
Var(β̂1) = [1/n + (X̄2²Σx3i² + X̄3²Σx2i² − 2X̄2X̄3Σx2ix3i) / (Σx2i²Σx3i² − (Σx2ix3i)²)]·σ²
Var(β̂2) = Σx3i²·σ² / (Σx2i²Σx3i² − (Σx2ix3i)²)
Var(β̂3) = Σx2i²·σ² / (Σx2i²Σx3i² − (Σx2ix3i)²)
3.2.1 OLS Estimators
• or, equivalently,
Var(β̂2) = σ² / [Σx2i²(1 − r23²)],  se(β̂2) = √Var(β̂2)
Var(β̂3) = σ² / [Σx3i²(1 − r23²)],  se(β̂3) = √Var(β̂3)
where r23 is the sample correlation coefficient between X2 and X3, and σ² is estimated by σ̂² = Σûi² / (n − 3).
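Continuing the numpy sketch from the start of 3.2.1 (same X, Y, beta_hat), the error variance and standard errors can be estimated via the matrix form Var(β̂) = σ²(X′X)⁻¹, which is equivalent to the formulas above:

```python
# Continues the earlier sketch: assumes X, Y, beta_hat are defined there
resid = Y - X @ beta_hat
n, k = X.shape                        # k = 3 parameters here, so df = n - 3
sigma2_hat = resid @ resid / (n - k)  # unbiased estimator of sigma^2

# Var(beta_hat) = sigma^2 * (X'X)^(-1); standard errors = sqrt of the diagonal
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
se_beta = np.sqrt(np.diag(cov_beta))
print(se_beta)
```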
Example: EViews output
• Model: wage = f(educ, exper)
[EViews regression output shown on slide]
3.2.2 The Three-Variable Model: Notation and Assumptions
Assumptions for Yi = β1 + β2X2i + β3X3i + ui:
1. Linear regression model, i.e., linear in the parameters.
2. X values are fixed in repeated sampling; X is assumed to be non-stochastic.
3. Zero mean value of the disturbance ui: E(ui|X2i, X3i) = 0. This implies zero covariance between ui and each X variable: cov(ui, X2i) = cov(ui, X3i) = 0.
4. Homoscedasticity, or constant variance of ui: Var(ui) = σ².
5. No serial correlation between the disturbances: Cov(ui, uj) = 0, i ≠ j.
6. The number of observations n must be greater than the number of parameters to be estimated.
7. Variability in X values: the X values in a given sample must not all be the same.
8. No specification bias; the model is correctly specified.
9. No exact collinearity between the X variables.
Assumption 3: zero mean value of the disturbance ui: E(ui|X2i, X3i) = 0.
This assumption can fail if:
- the functional relationship between the explained and explanatory variables is misspecified, or
- an important factor that is correlated with any of x1, x2, …, xk is omitted.
When the assumptions hold, OLS is unbiased: E(β̂j) = βj. The OLS residuals also satisfy Σûi = 0 by construction.
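A small Monte Carlo sketch (entirely synthetic data) illustrating E(β̂j) = βj when the error is generated independently of X:

```python
import numpy as np

# Monte Carlo: with E(u|X) = 0, OLS estimates center on the true coefficients
rng = np.random.default_rng(1)
beta_true = np.array([1.0, 0.5, -0.3])
X = np.column_stack([np.ones(200), rng.normal(size=200), rng.normal(size=200)])

draws = []
for _ in range(2000):
    y = X @ beta_true + rng.normal(size=200)  # error drawn independently of X
    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    draws.append(b_hat)
print(np.mean(draws, axis=0))  # approximately [1.0, 0.5, -0.3]
```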
• An unbiased estimator of σ² = E(ui²):
σ̂² = Σûi² / (n − k)
• RSS/σ² follows a χ² distribution with df = number of observations − number of estimated parameters = n − k.
• The positive square root σ̂ is called the standard error of the regression (SER) (or Root MSE). SER is an estimator of the standard deviation of the error term.
3.2.3 Unbiasedness and efficiency properties
Var(β̂j) = σ² / [TSSj(1 − Rj²)]
• where TSSj = Σ(xij − x̄j)² is the total sample variation in xj, and Rj² is the R-squared from regressing xj on all other independent variables (and including an intercept).
• Since σ² is unknown, we replace it with its estimator σ̂². The standard error is
se(β̂j) = σ̂ / [TSSj(1 − Rj²)]^1/2
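Continuing the numpy sketch (same X2, X3, sigma2_hat), se(β̂2) computed from TSS2(1 − R2²) matches the matrix-based standard error:

```python
# Continues the earlier sketch: se(beta2_hat) via sigma^2 / [TSS_2 (1 - R_2^2)]
TSS2 = np.sum((X2 - X2.mean()) ** 2)

# R_2^2: R-squared from regressing X2 on the other regressor X3 (with intercept)
Z = np.column_stack([np.ones_like(X3), X3])
g, *_ = np.linalg.lstsq(Z, X2, rcond=None)
e = X2 - Z @ g
R2_2 = 1 - (e @ e) / TSS2

se_b2 = np.sqrt(sigma2_hat / (TSS2 * (1 - R2_2)))
print(se_b2)  # equals se_beta[1] from the matrix formula
```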
3.3 Measure of fit: the coefficient of determination R²
• The total sum of squares (TSS):
TSS = Σyi² = Σ(Yi − Ȳ)² = ΣYi² − nȲ²
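Continuing the numpy sketch (same Y and resid), R² follows from the standard decomposition R² = 1 − RSS/TSS:

```python
# Continues the earlier sketch: coefficient of determination
TSS = np.sum((Y - Y.mean()) ** 2)
RSS = resid @ resid
R_squared = 1 - RSS / TSS
print(R_squared)
```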
Example: goodness of fit
• Determinants of college GPA: a regression of college GPA on hsGPA and ACT.
[Regression output shown on slide]
Output interpretation
• hsGPA and ACT together explain about 17.6% of the variation in college GPA for this sample of students.
• There are many other factors, including family background, personality, quality of high school education, and affinity for college, that contribute to a student's college performance.
3.3 Measure of fit
• Note that R² lies between 0 and 1.
o If it is 1, the fitted regression line explains 100 percent of the variation in Y.
o If it is 0, the model does not explain any of the variation in Y.
• The fit of the model is said to be "better" the closer R² is to 1.
• As the number of regressors increases, R² almost invariably increases and never decreases.
R² and the adjusted R²
• An alternative coefficient of determination:
R̄² = 1 − (RSS/(n − k)) / (TSS/(n − 1))
or, equivalently,
R̄² = 1 − (1 − R²)·(n − 1)/(n − k)
where k = the number of parameters in the model including the intercept term.
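A one-function sketch of the second formula (the sample size in the call is hypothetical; the 0.176 echoes the college GPA example above):

```python
def adj_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2; k counts all parameters including the intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

print(adj_r2(0.176, 100, 3))  # hypothetical n = 100, two regressors plus intercept
```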
R² and the adjusted R²
• It is good practice to use the adjusted R² rather than R², because R² tends to give an overly optimistic picture of the fit of the regression, particularly when the number of explanatory variables is not very small compared with the number of observations.
The game of maximizing adjusted R²
• Sometimes researchers play the game of maximizing adjusted R², that is, choosing the model that gives the highest adjusted R². This can be dangerous.
• In regression analysis, our objective is not to obtain a high adjusted R² per se, but rather to obtain dependable estimates of the true population regression coefficients and draw statistical inferences about them.
• Researchers should be more concerned about the logical or theoretical relevance of the explanatory variables to the dependent variable and their statistical significance.
Comparing Coefficients of Determination R²
Review: partial correlation coefficients
• r12,3 = partial correlation coefficient between Y and X2, holding X3 constant.
• r13,2 = partial correlation coefficient between Y and X3, holding X2 constant.
• r23,1 = partial correlation coefficient between X2 and X3, holding Y constant.
These are called first-order correlation coefficients (the order = the number of secondary subscripts).
Example: partial correlation coefficients
• Y = crop yield, X2 = rainfall, X3 = temperature.
Assume r12 = 0, so there is no simple association between crop yield and rainfall. If r13 is positive and r23 is negative, then r12,3 will be positive: holding temperature constant, there is a positive association between yield and rainfall. Since temperature X3 affects both yield Y and rainfall X2, we need to remove the influence of the nuisance variable temperature.
• In Stata: pwcorr Y X2 X3
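The first-order partial correlation can be computed from the three simple correlations via the standard formula r12,3 = (r12 − r13·r23) / √((1 − r13²)(1 − r23²)) (a textbook result, not stated on the slide; the r13 and r23 values below are hypothetical):

```python
import math

def partial_corr(r12: float, r13: float, r23: float) -> float:
    """First-order partial correlation r12,3: Y vs X2, holding X3 constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13**2) * (1 - r23**2))

# Crop-yield example: r12 = 0, r13 > 0, r23 < 0  =>  r12,3 > 0
print(partial_corr(0.0, 0.5, -0.4))  # positive, as the slide argues
```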
More on Functional Form
The Cobb–Douglas Production Function
• The Cobb–Douglas production function: Yi = β1·X2i^β2·X3i^β3·e^ui
• Taking logarithms gives ln Yi = ln β1 + β2 ln X2i + β3 ln X3i + ui, which is linear in the parameters and can therefore be estimated by OLS; β2 and β3 are the elasticities of output with respect to the two inputs.
More on Functional Form
Polynomial Regression Models
• Geometrically, the MC curve depicted in Figure 7.1 represents a parabola. Mathematically, the parabola is represented by the following equation:
Y = β0 + β1X + β2X² (7.10.1)
which is called a quadratic function.
• The general kth-degree polynomial regression may be written as
Yi = β0 + β1Xi + β2Xi² + ··· + βkXi^k + ui (7.10.3)
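A quadratic specification stays linear in the parameters, so the same OLS machinery applies once X² is added as a regressor. A sketch with synthetic parabola-shaped data (all values invented):

```python
import numpy as np

# Synthetic data following a parabola plus noise
rng = np.random.default_rng(7)
X = np.linspace(1.0, 10.0, 50)
Y = 5 + 2 * X - 0.15 * X**2 + rng.normal(0.0, 0.5, 50)

# Include X^2 as an extra column in the design matrix
P = np.column_stack([np.ones_like(X), X, X**2])
b, *_ = np.linalg.lstsq(P, Y, rcond=None)
print(b)                    # approximately [5, 2, -0.15]
print(np.polyfit(X, Y, 2))  # same fit; polyfit lists the highest degree first
```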
More on Functional Form
Polynomial Regression Models
EXAMPLE 7.4 Estimating the Total Cost Function