Regression
Learning Objectives
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Ordinary Least Squares
4. Compute Regression Coefficients
5. Predict Response Variable
6. Interpret Computer Output
Models
1. Representation of Some Phenomenon
2. Mathematical Model Is a Mathematical
Expression of Some Phenomenon
3. Often Describe Relationships between
Variables
4. Types
Deterministic Models
Probabilistic Models
Deterministic Models
1. Hypothesize Exact Relationships
2. Suitable When Prediction Error is
Negligible
3. Example: Force Is Exactly
Mass Times Acceleration
F = ma
Probabilistic Models
1. Hypothesize 2 Components
Deterministic
Random Error
$Y = 10X + \varepsilon$
Random Error May Be Due to Factors
Other Than Advertising
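A minimal sketch of the two components, assuming Y is sales and X is advertising. The normally distributed error, its standard deviation (`noise_sd`), and the seed are illustrative assumptions, not values from the slides.

```python
import random

def simulate_sales(advertising, noise_sd=2.0, seed=0):
    """Return Y = 10*X + random error for each X.

    noise_sd and seed are hypothetical choices made only for illustration.
    """
    rng = random.Random(seed)
    return [10 * x + rng.gauss(0.0, noise_sd) for x in advertising]

advertising = [1, 2, 3, 4, 5]
deterministic = [10 * x for x in advertising]   # deterministic component
sales = simulate_sales(advertising)             # deterministic + random error
errors = [y - d for y, d in zip(sales, deterministic)]

for x, d, y, e in zip(advertising, deterministic, sales, errors):
    print(f"X={x}  10X={d}  Y={y:.2f}  error={e:+.2f}")
```

Setting `noise_sd=0.0` recovers the deterministic model exactly.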
Types of Probabilistic Models
Probabilistic Models
    Regression Models
    Correlation Models
    Other Models
Regression Models
1. Answer What Is the Relationship
Between the Variables?
2. Equation Used: Dependent (Response) Variable Is What Is to Be Predicted
Regression Modeling
Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of
Random Error Term
4. Evaluate Model
5. Use Model for Prediction & Estimation
Model Specification
Specifying the
Model
1. Define Variables
Model Specification
Is Based on Theory
1.
2.
3.
4.
Thinking Challenge: Which Is More Logical?
[Figure: four candidate plots of Sales (vertical axis) against Advertising (horizontal axis)]
Types of Regression Models
Regression Models
    Simple (1 Explanatory Variable)
        Linear
        NonLinear
    Multiple (2+ Explanatory Variables)
        Linear
        NonLinear
Linear Regression
Model
Linear Equations
$Y = mX + b$
$m$ = Slope = (Change in Y) / (Change in X)
$b$ = Y-intercept
Linear Regression Model
1. Relationship Between Variables Is a Linear Function
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$Y_i$ = Dependent (Response) Variable
$X_i$ = Independent (Explanatory) Variable
$\beta_0$ = Population Y-Intercept
$\beta_1$ = Population Slope
$\varepsilon_i$ = Random Error
Population & Sample Regression Models
Population: Unknown Relationship $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Random Sample: $Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$
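Population Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ (Observed Value)
$\varepsilon_i$ = Random Error
$E(Y) = \beta_0 + \beta_1 X_i$
[Figure: population regression line of Y against X, with observed values scattered around it]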
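Sample Linear Regression Model
$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\varepsilon}_i$ (Observed Value)
$\hat{\varepsilon}_i$ = Random Error
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
[Figure: sample regression line of Y against X, showing observed values and an unsampled observation]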
Estimating Parameters:
Least Squares Method
Scattergram
1. Plot of All (Xi, Yi) Pairs
2. Suggests How Well Model Will Fit
[Figure: scattergram of Y against X; both axes run from 0 to 60]
Thinking Challenge
How would you draw a line through the
points? How do you determine which line
fits best?
[Figure: scattergram of Y against X; both axes run from 0 to 60]
Least Squares
1. Best Fit Means Differences Between Actual Y Values & Predicted Y Values Are a Minimum
2. LS Minimizes the Sum of the Squared Differences
$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$
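Least Squares Graphically
LS minimizes $\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \hat{\varepsilon}_3^2 + \hat{\varepsilon}_4^2$
$Y_2 = \hat{\beta}_0 + \hat{\beta}_1 X_2 + \hat{\varepsilon}_2$
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
[Figure: fitted line of Y against X with the four residuals $\hat{\varepsilon}_1$ to $\hat{\varepsilon}_4$ marked]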
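Coefficient Equations
Prediction Equation
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
Sample Slope
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} X_i Y_i - \dfrac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}}{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}$
Sample Y-intercept
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$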
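The coefficient equations can be sketched in code as follows. The function name `least_squares` and the four data points are hypothetical, chosen so that the points lie exactly on Y = 1 + 2X and the arithmetic is easy to check by hand.

```python
def least_squares(x, y):
    """Return (intercept, slope) from the computational formulas.

    slope     = (sum XY - (sum X)(sum Y)/n) / (sum X^2 - (sum X)^2/n)
    intercept = mean(Y) - slope * mean(X)
    """
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope

# Hypothetical data lying exactly on Y = 1 + 2X
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
b0, b1 = least_squares(x, y)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

Because these points have no random error, the fitted line reproduces them exactly; with real data the residuals would be nonzero.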
Computation Table
$X_i$         $Y_i$         $X_i^2$         $Y_i^2$         $X_iY_i$
$X_1$         $Y_1$         $X_1^2$         $Y_1^2$         $X_1Y_1$
$X_2$         $Y_2$         $X_2^2$         $Y_2^2$         $X_2Y_2$
:             :             :               :               :
$X_n$         $Y_n$         $X_n^2$         $Y_n^2$         $X_nY_n$
$\sum X_i$    $\sum Y_i$    $\sum X_i^2$    $\sum Y_i^2$    $\sum X_iY_i$
Interpretation of Coefficients
1. Slope ($\hat{\beta}_1$)
Estimated Y Changes by $\hat{\beta}_1$ for Each 1 Unit Increase in X
If $\hat{\beta}_1 = 2$, then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising (X)
2. Y-Intercept ($\hat{\beta}_0$)
Estimated Average Value of Y When X = 0
Parameter Estimation Example
You're a marketing analyst for Hasbro Toys. You gather the following data:
Ad $    Sales (Units)
1       1
2       1
3       2
4       2
5       4
What is the relationship between sales & advertising?
Scattergram: Sales vs. Advertising
[Figure: scattergram of Sales (vertical axis, 0 to 4) against Advertising (horizontal axis)]
Parameter Estimation Solution Table
$X_i$    $Y_i$    $X_i^2$    $Y_i^2$    $X_iY_i$
1        1        1          1          1
2        1        4          1          2
3        2        9          4          6
4        2        16         4          8
5        4        25         16         20
15       10       55         26         37     (column sums)
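Parameter Estimation Solution
$\hat{\beta}_1 = \dfrac{\sum X_iY_i - \dfrac{(\sum X_i)(\sum Y_i)}{n}}{\sum X_i^2 - \dfrac{(\sum X_i)^2}{n}} = \dfrac{37 - \dfrac{(15)(10)}{5}}{55 - \dfrac{(15)^2}{5}} = 0.70$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 2 - (0.70)(3) = -0.10$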
Coefficient Interpretation Solution
1. Slope ($\hat{\beta}_1$)
Sales Volume (Y) Is Expected to Increase by .7 Units for Each 1 Dollar Increase in Advertising (X)
2. Y-Intercept ($\hat{\beta}_0$)
Average Value of Sales Volume (Y) Is -.10 Units When Advertising (X) Is 0
Parameter Estimation Computer Output
Parameter Estimates
                    Parameter    Standard    T for H0:
Variable    DF      Estimate     Error       Param=0      Prob>|T|
INTERCEP    1       -0.1000      0.6350      -0.157       0.8849
ADVERT      1        0.7000      0.1914       3.656       0.0354
Parameter Estimate column is $\hat{\beta}_k$: INTERCEP is $\hat{\beta}_0$, ADVERT is $\hat{\beta}_1$
Parameter Estimation Thinking Challenge
You're an economist for the county cooperative. You gather the following data:
Fertilizer (lb.)    Yield (lb.)
4                   3.0
6                   5.5
10                  6.5
12                  9.0
What is the relationship between fertilizer & crop yield?
Scattergram: Crop Yield vs. Fertilizer*
[Figure: scattergram of Yield (lb.) (vertical axis, 0 to 10) against Fertilizer (lb.) (horizontal axis, 0 to 15)]
Parameter Estimation Solution Table*
$X_i$    $Y_i$    $X_i^2$    $Y_i^2$    $X_iY_i$
4        3.0      16         9.00       12
6        5.5      36         30.25      33
10       6.5      100        42.25      65
12       9.0      144        81.00      108
32       24.0     296        162.50     218     (column sums)
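Parameter Estimation Solution*
$\hat{\beta}_1 = \dfrac{\sum X_iY_i - \dfrac{(\sum X_i)(\sum Y_i)}{n}}{\sum X_i^2 - \dfrac{(\sum X_i)^2}{n}} = \dfrac{218 - \dfrac{(32)(24)}{4}}{296 - \dfrac{(32)^2}{4}} = 0.65$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 6 - (0.65)(8) = 0.80$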
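As a cross-check on the fertilizer solution, the slope can also be computed in the algebraically equivalent deviation form, $\hat{\beta}_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2$. This form is not shown on the slides; the sketch below uses it only to confirm the result, and the variable names are illustrative.

```python
# Fertilizer and yield data from the Thinking Challenge
fertilizer = [4, 6, 10, 12]
crop_yield = [3.0, 5.5, 6.5, 9.0]

n = len(fertilizer)
x_bar = sum(fertilizer) / n
y_bar = sum(crop_yield) / n

# Deviation form: sums of cross-products and squares about the means
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fertilizer, crop_yield))
s_xx = sum((x - x_bar) ** 2 for x in fertilizer)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Under this fit, yield is expected to rise by about 0.65 lb. for each additional lb. of fertilizer.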
Coefficient Interpretation Solution*
1. Slope ($\hat{\beta}_1$)
Crop Yield (Y) Is Expected to Increase by .65 lb. for Each 1 lb. Increase in Fertilizer (X)
2. Y-Intercept ($\hat{\beta}_0$)
Average Crop Yield (Y) Is Expected to Be .8 lb. When Fertilizer (X) Is 0
Probability Distribution
of Random Error
Linear Regression
Assumptions
1. Mean of Probability Distribution of
Error Is 0
2. Probability Distribution of Error Has
Constant Variance
3. Probability Distribution of Error is
Normal
4. Errors Are Independent
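Error Probability Distribution
[Figure: probability distributions of the error, $f(\varepsilon)$, drawn around the regression line of Y against X at several X values]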
Random Error Variation
1. Variation of Actual Y from Predicted Y
2. Measured by Standard Error of Regression Model
3. Affects Parameter Significance & Prediction Accuracy
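A minimal sketch of the standard error of the regression model for the Hasbro data. The formula $S = \sqrt{SSE / (n - 2)}$ is the usual definition for simple linear regression and is an assumption here, since the slides do not print it; the result agrees with the Root MSE of 0.60553 reported in the computer output.

```python
import math

# Hasbro data and fitted coefficients from the slides
ads = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7

# Residual = actual Y minus predicted Y
residuals = [y - (b0 + b1 * x) for x, y in zip(ads, sales)]
sse = sum(e * e for e in residuals)          # unexplained sum of squares

n = len(ads)
s = math.sqrt(sse / (n - 2))                 # assumed formula: sqrt(SSE / (n - 2))
print(f"SSE = {sse:.2f}, S = {s:.5f}")
```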
Measures of Variation in Regression
1. Total Sum of Squares ($SS_{yy}$)
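Variation Measures
Total Sum of Squares: $(Y_i - \bar{Y})^2$
Unexplained Sum of Squares: $(Y_i - \hat{Y}_i)^2$
Explained Sum of Squares: $(\hat{Y}_i - \bar{Y})^2$
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
[Figure: fitted line of Y against X showing these three deviations for one observation at $X_i$]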
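Coefficient of Determination
1. Proportion of Variation Explained by Relationship Between X & Y
$0 \le r^2 \le 1$
$r^2 = \dfrac{\text{Explained Variation}}{\text{Total Variation}} = \dfrac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 - \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$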
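The ratio of explained to total variation can be sketched for the Hasbro data as below. The coefficients -0.1 and 0.7 come from the slides; the variable names are illustrative.

```python
# Hasbro data and fitted coefficients from the slides
ads = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7

y_bar = sum(sales) / len(sales)
predicted = [b0 + b1 * x for x in ads]

ss_total = sum((y - y_bar) ** 2 for y in sales)                      # total variation
ss_unexplained = sum((y - p) ** 2 for y, p in zip(sales, predicted)) # unexplained
ss_explained = ss_total - ss_unexplained                             # explained

r_squared = ss_explained / ss_total
print(f"total = {ss_total:.2f}, unexplained = {ss_unexplained:.2f}, r^2 = {r_squared:.4f}")
```

The result matches the R-square of 0.8167 in the computer output: about 82% of the variation in sales is explained by advertising.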
Coefficient of Determination Examples
[Figure: four scattergrams of Y against X, labeled $r^2 = 1$, $r^2 = 1$, $r^2 = .8$, and $r^2 = 0$]
Coefficient of Determination Example
You're a marketing analyst for Hasbro Toys. You find $\hat{\beta}_0 = -0.1$ & $\hat{\beta}_1 = 0.7$.
Ad $    Sales (Units)
1       1
2       1
3       2
4       2
5       4
Interpret a coefficient of determination of 0.8167.
Computer Output
Root MSE     0.60553     R-square    0.8167
Dep Mean     2.00000     Adj R-sq    0.7556
C.V.        30.27650
R-square is $r^2$
Test of Slope Coefficient
1. Shows If There Is a Linear Relationship Between X & Y
2. Involves Population Slope $\beta_1$
3. Hypotheses
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \ne 0$
Sampling Distribution of Sample Slopes
All Possible Sample Slopes
Sample 1: 2.5
Sample 2: 1.6
Sample 3: 1.8
Sample 4: 2.1
:
Very large number of sample slopes
[Figure: Sample 1 Line and Sample 2 Line plotted with the Population Line, and the sampling distribution of $\hat{\beta}_1$ with standard error $S_{\hat{\beta}_1}$]
Slope Coefficient Test Statistic
$t_{n-2} = \dfrac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}}$
where
$S_{\hat{\beta}_1} = \dfrac{S}{\sqrt{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}}$
Test of Slope Coefficient Example
You're a marketing analyst for Hasbro Toys. You find $\hat{\beta}_0 = -.1$, $\hat{\beta}_1 = .7$ & $S = .60553$.
Ad $    Sales (Units)
1       1
2       1
3       2
4       2
5       4
Is the relationship significant at the .05 level?
Solution Table
$X_i$    $Y_i$    $X_i^2$    $Y_i^2$    $X_iY_i$
1        1        1          1          1
2        1        4          1          2
3        2        9          4          6
4        2        16         4          8
5        4        25         16         20
15       10       55         26         37     (column sums)
Test of Slope Parameter Solution
$H_0$: $\beta_1 = 0$
$H_a$: $\beta_1 \ne 0$
$\alpha = .05$
df = 5 - 2 = 3
Critical Value(s): $\pm 3.1824$ (Reject in Either Tail, .025 Each)
Test Statistic:
$t = \dfrac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}} = \dfrac{0.70 - 0}{0.1915} = 3.656$
Decision: Reject at $\alpha = .05$
Conclusion: There is evidence of a relationship
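Test Statistic Solution
$t_{n-2} = \dfrac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}} = \dfrac{0.70 - 0}{0.1915} = 3.656$
where
$S_{\hat{\beta}_1} = \dfrac{S}{\sqrt{\sum_{i=1}^{n} X_i^2 - \dfrac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}} = \dfrac{0.60553}{\sqrt{55 - \dfrac{(15)^2}{5}}} = 0.1915$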
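The test statistic arithmetic can be reproduced with a short sketch. All inputs (S, the column sums, and the critical value 3.1824 for 3 degrees of freedom) are taken from the slides; the critical value is hard-coded rather than looked up from a t distribution.

```python
import math

# Inputs from the slides
s = 0.60553            # standard error of the regression model
b1 = 0.70              # sample slope
sum_x, sum_x2, n = 15, 55, 5
t_critical = 3.1824    # two-tailed, alpha = .05, df = n - 2 = 3 (value from the slide)

# Standard error of the slope, then t for H0: beta_1 = 0
se_b1 = s / math.sqrt(sum_x2 - sum_x ** 2 / n)
t_stat = (b1 - 0) / se_b1

reject_h0 = abs(t_stat) > t_critical
print(f"S_b1 = {se_b1:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Since the statistic exceeds the critical value, the sketch reaches the same decision as the slide: reject at the .05 level.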
Test of Slope Parameter Computer Output
Parameter Estimates
                    Parameter    Standard    T for H0:
Variable    DF      Estimate     Error       Param=0      Prob>|T|
INTERCEP    1       -0.1000      0.6350      -0.157       0.8849
ADVERT      1        0.7000      0.1914       3.656       0.0354
Parameter Estimate is $\hat{\beta}_k$; Standard Error is $S_{\hat{\beta}_k}$; T is $t = \hat{\beta}_k / S_{\hat{\beta}_k}$; Prob>|T| is the P-Value
Prediction With
Regression Models
1. Types of Predictions
Point Estimates
Interval Estimates
2. What Is Predicted
What Is Predicted
[Figure: Y against X at $X = X_P$, showing an individual Y, the mean Y on the line $E(Y) = \beta_0 + \beta_1 X$, and the prediction $\hat{Y}$]
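Confidence Interval Estimate of Mean Y
$\hat{Y} - t_{n-2,\,\alpha/2}\, S_{\hat{Y}} \le E(Y) \le \hat{Y} + t_{n-2,\,\alpha/2}\, S_{\hat{Y}}$
where
$S_{\hat{Y}} = S \sqrt{\dfrac{1}{n} + \dfrac{(X_p - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$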
Factors Affecting
Interval Width
1. Level of Confidence ($1 - \alpha$)
3. Sample Size
[Figure: Y against X, marking $\bar{Y}$ and two X values, $X_1$ and $X_2$]
Confidence Interval Estimate Example
You're a marketing analyst for Hasbro Toys. You find $\hat{\beta}_0 = -.1$, $\hat{\beta}_1 = .7$ & $S = .60553$.
Ad $    Sales (Units)
1       1
2       1
3       2
4       2
5       4
Estimate the mean sales when advertising is 4 dollars at the .05 level.
Solution Table
$X_i$    $Y_i$    $X_i^2$    $Y_i^2$    $X_iY_i$
1        1        1          1          1
2        1        4          1          2
3        2        9          4          6
4        2        16         4          8
5        4        25         16         20
15       10       55         26         37     (column sums)
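Confidence Interval Estimate Solution
$\hat{Y} - t_{n-2,\,\alpha/2}\, S_{\hat{Y}} \le E(Y) \le \hat{Y} + t_{n-2,\,\alpha/2}\, S_{\hat{Y}}$
X to be predicted: $X_p = 4$
$\hat{Y} = -0.1 + (0.7)(4) = 2.7$
$S_{\hat{Y}} = .60553 \sqrt{\dfrac{1}{5} + \dfrac{(4 - 3)^2}{10}} = 0.3316$
$2.7 - (3.1824)(0.3316) \le E(Y) \le 2.7 + (3.1824)(0.3316)$
$1.645 \le E(Y) \le 3.755$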
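The confidence interval arithmetic can be checked with the sketch below. The function name `mean_response_interval` is hypothetical; the critical value 3.1824 (df = 3, two-tailed .05) is taken from the test-of-slope slide rather than computed.

```python
import math

def mean_response_interval(x_p, x, b0, b1, s, t_crit):
    """Confidence interval for E(Y) at X = x_p in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    y_hat = b0 + b1 * x_p
    se_mean = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)
    return y_hat, se_mean, y_hat - t_crit * se_mean, y_hat + t_crit * se_mean

ads = [1, 2, 3, 4, 5]
y_hat, se_mean, lower, upper = mean_response_interval(
    x_p=4, x=ads, b0=-0.1, b1=0.7, s=0.60553, t_crit=3.1824
)
print(f"Y-hat = {y_hat:.1f}, S_Yhat = {se_mean:.4f}, interval = ({lower:.3f}, {upper:.3f})")
```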
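Prediction Interval of Individual Response
$\hat{Y} - t_{n-2,\,\alpha/2}\, S_{(Y - \hat{Y})} \le Y_P \le \hat{Y} + t_{n-2,\,\alpha/2}\, S_{(Y - \hat{Y})}$
where
$S_{(Y - \hat{Y})} = S \sqrt{1 + \dfrac{1}{n} + \dfrac{(X_p - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
Note! The 1 Under the Square Root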
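To see the effect of the extra 1 under the square root, the sketch below computes the prediction interval for an individual response at X = 4 in the Hasbro example and compares its half-width with that of the confidence interval for the mean. The slides do not work this case numerically, so the printed numbers are an illustration; the critical value 3.1824 is the df = 3 value from the slides.

```python
import math

# Hasbro inputs from the slides
ads = [1, 2, 3, 4, 5]
b0, b1, s = -0.1, 0.7, 0.60553
t_crit = 3.1824        # two-tailed, alpha = .05, df = 3 (value from the slide)
x_p = 4

n = len(ads)
x_bar = sum(ads) / n
s_xx = sum((x - x_bar) ** 2 for x in ads)
leverage = 1 / n + (x_p - x_bar) ** 2 / s_xx

y_hat = b0 + b1 * x_p
se_mean = s * math.sqrt(leverage)          # for E(Y)
se_pred = s * math.sqrt(1 + leverage)      # for an individual Y: note the extra 1

pred_lower = y_hat - t_crit * se_pred
pred_upper = y_hat + t_crit * se_pred
print(f"prediction interval = ({pred_lower:.3f}, {pred_upper:.3f})")
print(f"half-widths: mean = {t_crit * se_mean:.3f}, individual = {t_crit * se_pred:.3f}")
```

The prediction interval is always wider than the confidence interval at the same X, because an individual response carries its own random error in addition to the uncertainty in the fitted line.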
[Figure: at $X_P$, the individual Y we're trying to predict, the expected (mean) Y on the line $E(Y) = \beta_0 + \beta_1 X$, and the prediction $\hat{Y}$]
Interval Estimate Computer Output
Obs    Dep Var SALES
1      1.000
2      1.000
3      2.000
4      2.000
5      4.000
[Output columns annotated on the slide: Predicted Y when X = 4, $S_{\hat{Y}}$, Confidence Interval, Prediction Interval]
Hyperbolic Interval Bands
[Figure: interval bands around the fitted line of Y against X, marking $\bar{X}$ and $X_P$]
Correlation Models
1. Answer How Strong Is the Linear
Relationship Between 2 Variables?
2. Coefficient of Correlation Used
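Sample Coefficient of Correlation
1. Pearson Product Moment Coefficient of Correlation, r:
$r = \pm\sqrt{\text{Coefficient of Determination}} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$
r Takes the Sign of the Slope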
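A minimal sketch of the Pearson formula applied to the Hasbro advertising and sales data from the earlier examples. The function name `pearson_r` is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson product moment coefficient of correlation."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

ads = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]
r = pearson_r(ads, sales)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

Squaring r recovers the coefficient of determination of 0.8167 from the computer output, and r is positive because the fitted slope is positive.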
Coefficient of Correlation Values
-1.0: Perfect Negative Correlation
-.5
0: No Correlation
+.5
+1.0: Perfect Positive Correlation
From 0 Toward -1.0: Increasing degree of negative correlation
From 0 Toward +1.0: Increasing degree of positive correlation
Coefficient of Correlation Examples
[Figure: four scattergrams of Y against X, labeled r = 1, r = -1, r = .89, and r = 0]
Test of Coefficient of Correlation
1. Shows If There Is a Linear Relationship
Conclusion
1. Described the Linear Regression Model
2. Stated the Regression Modeling Steps
3. Explained Ordinary Least Squares
4. Computed Regression Coefficients
5. Predicted Response Variable
6. Interpreted Computer Output