MATH2831/2931 Linear Models/ Higher Linear Models.: August 2, 2013

Week 2 Lecture 3 - Last lecture:
Review of hypothesis testing: two-sided and one-sided tests.
Hypothesis testing for $\beta_0$ and $\beta_1$.
Example 1: zinc concentrations in plants.
Example 2: sales and advertising data.
Week 2 Lecture 3 - This lecture:
The analysis of variance table
Inference on $\beta_1$: some further examples on testing and ANOVA.
Confidence intervals for the mean.
Prediction intervals.
Week 2 Lecture 3 - The ANOVA table
So far we have taken the model with a single independent variable to be the true model, i.e. we presume that $\mu_{y|x}$ is related to x linearly.
NOTE 1: Prediction of the response will be poor in situations in which there are several independent variables, each affecting the response.
NOTE 2: Prediction of the response will be poor in situations in which the presumption that $\mu_{y|x}$ is related to x linearly is false in the range of variables considered.
Here we use ANOVA to analyze the quality of the estimated regression line.
If the true unknown model is linear in more than one variable x, i.e.
$$\mu_{y|x} = \beta_0 + \beta_1 x_1 + \beta_2 x_2,$$
then the standard least squares estimate derived so far,
$$b_1 = \frac{S_{xy}}{S_{xx}},$$
which is calculated considering only $x_1$, is a biased estimate of $\beta_1$.
The bias will be a function of the additional coefficient $\beta_2$.
We need a methodology to assess the quality of our regression line!
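A small numerical sketch of the bias claim. It assumes fixed (non-random) predictors and uses the standard omitted-variable result $E(b_1) = \beta_1 + \beta_2 S_{x_1 x_2}/S_{x_1 x_1}$, which is not derived in these slides. The data and coefficient values are hypothetical, and the responses are generated without noise so that $b_1$ equals its expected value exactly.

```python
def s_xy(u, v):
    """Corrected sum of cross-products: sum((u_i - ubar) * (v_i - vbar))."""
    ubar = sum(u) / len(u)
    vbar = sum(v) / len(v)
    return sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))

# Hypothetical coefficients and predictor values (illustration only).
beta0, beta1, beta2 = 1.0, 2.0, 3.0
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]  # correlated with x1

# Noise-free responses from the two-variable model.
mean_y = [beta0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

# Slope from the simple regression on x1 alone.
b1 = s_xy(x1, mean_y) / s_xy(x1, x1)

# Bias term predicted by the omitted-variable formula.
bias = beta2 * s_xy(x1, x2) / s_xy(x1, x1)

print(f"b1 = {b1:.4f}, beta1 = {beta1:.4f}, beta1 + bias = {beta1 + bias:.4f}")
```

The bias vanishes only when $\beta_2 = 0$ or when $x_1$ and $x_2$ are uncorrelated in the sample.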
Analysis of the quality of an estimated regression line can be handled by an ANOVA approach.
ANOVA procedure: considers the total variation in the dependent variable as subdivided into meaningful components, which are then observed and treated in a systematic fashion.
(Recall Week 1 Lecture 3: partitioning of variance.)
Fundamental identity:
$$SS_{total} = SS_{reg} + SS_{res}$$
Regression sum of squares (variation explained by the fit):
$$SS_{reg} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.$$
Residual sum of squares (variation unexplained by the fit):
$$SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$$
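The identity can be checked numerically. The sketch below fits the least squares line to a small hypothetical data set and computes the three sums of squares, taking $SS_{total} = \sum (y_i - \bar{y})^2$ as in the partitioning of variance. The function names are illustrative, not from the course.

```python
def fit_line(x, y):
    """Least squares intercept b0 and slope b1 = Sxy / Sxx."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

def sums_of_squares(x, y):
    """Return (SS_total, SS_reg, SS_res) for the fitted line."""
    b0, b1 = fit_line(x, y)
    ybar = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    ss_total = sum((yi - ybar) ** 2 for yi in y)
    ss_reg = sum((fi - ybar) ** 2 for fi in fitted)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ss_total, ss_reg, ss_res

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ss_total, ss_reg, ss_res = sums_of_squares(x, y)
print(f"SS_total = {ss_total:.4f}, SS_reg + SS_res = {ss_reg + ss_res:.4f}")
```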

Question: Why is testing $H_0: \beta_1 = 0$ of particular interest?
Answer: It helps answer the question of whether the predictor is useful for explaining the response.
t-test: the statistic for testing $H_0: \beta_1 = 0$ was
$$T = \frac{b_1}{\hat{\sigma}/\sqrt{S_{xx}}}.$$
This statistic (T) has a t-distribution with $n-2$ degrees of freedom under $H_0$.
From our distribution results (Week 2 Lecture 1) we can also derive that, under $H_0$,
$$F = T^2 = \frac{b_1^2 S_{xx}}{\hat{\sigma}^2}$$
has an F distribution with 1 and $n-2$ degrees of freedom.

Theorem: One can express the regression sum of squares $SS_{reg}$ as a function of the least squares estimate $b_1$ as follows:
$$SS_{reg} = b_1^2 S_{xx}.$$
With $SS_{reg} = b_1^2 S_{xx}$ and $\hat{\sigma}^2 = SS_{res}/(n-2)$, the above statistic
$$F = \frac{b_1^2 S_{xx}}{\hat{\sigma}^2}$$
can be written as
$$F = \frac{SS_{reg}/1}{SS_{res}/(n-2)}$$
(ratio of variation explained by the model to scaled residual variation).

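Proof:
$$
\begin{aligned}
SS_{reg} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \\
&= \sum_{i=1}^{n} (b_0 + b_1 x_i - \bar{y})^2 \\
&= \sum_{i=1}^{n} (\bar{y} - b_1 \bar{x} + b_1 x_i - \bar{y})^2 \\
&= \sum_{i=1}^{n} b_1^2 (x_i - \bar{x})^2 \\
&= b_1^2 S_{xx}.
\end{aligned}
$$
So
$$F = \frac{SS_{reg}/1}{SS_{res}/(n-2)}$$
has an $F_{1,n-2}$ distribution under $H_0: \beta_1 = 0$, as claimed.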
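As a numerical check of the theorem and of $F = T^2$, the sketch below computes both statistics on a small hypothetical data set, using $\hat{\sigma}^2 = SS_{res}/(n-2)$. The data are made up for illustration.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least squares fit.
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Sums of squares and the residual variance estimate.
fitted = [b0 + b1 * xi for xi in x]
ss_reg = sum((fi - ybar) ** 2 for fi in fitted)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sigma_hat2 = ss_res / (n - 2)

# t statistic for H0: beta_1 = 0 and the ANOVA F statistic.
t_stat = b1 / math.sqrt(sigma_hat2 / sxx)
f_stat = (ss_reg / 1) / (ss_res / (n - 2))

print(f"SS_reg = {ss_reg:.6f}, b1^2 * Sxx = {b1 ** 2 * sxx:.6f}")
print(f"F = {f_stat:.4f}, T^2 = {t_stat ** 2:.4f}")
```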

With F as test statistic we obtain a test of
$$H_0: \beta_1 = 0 \quad \text{against} \quad H_1: \beta_1 \neq 0$$
at significance level $\alpha$ by using the critical region
$$F > F_{\alpha;1,n-2}.$$
Computation of the p-value:
$$p = \Pr(F \geq f \mid \beta_1 = 0),$$
where f is the observed value of F (given $\beta_1 = 0$, $F \sim F_{1,n-2}$).
Source       Sum of Squares   Degrees of freedom   Mean Square                                   F
Regression   $SS_{reg}$       1                    $MS_{reg} = SS_{reg}/1$                       $MS_{reg}/\hat{\sigma}^2$
Residual     $SS_{res}$       $n-2$                $MS_{res} = SS_{res}/(n-2) = \hat{\sigma}^2$
Total        $SS_{total}$     $n-1$
When the null hypothesis is rejected, i.e. the computed F-statistic exceeds the critical value $f_\alpha(1, n-2)$, the conclusion is: there is a significant amount of variation in the response accounted for by the postulated model (simple linear regression).
NOTE: the t-test allows testing against both two-sided and one-sided alternative hypotheses, whereas the ANOVA F-test is restricted to testing against the two-sided alternative.
Week 2 Lecture 3 - Example 1: market model of stock returns
The monthly rate of return on a stock (R) is linearly related to the monthly return on the overall stock market ($R_m$):
$$R = \beta_0 + \beta_1 R_m + \varepsilon$$
$R_m$ is taken to be the monthly rate of return on some major stock market index.
RECALL:
Coefficient $\beta_1$ is called the beta coefficient of the stock.
$\beta_1 > 1$ indicates the stock's rate of return is more sensitive to the overall market than average.
$\beta_1 < 1$: less sensitive than average.
Estimate $\beta_1$: is it significantly different from 1?
[Figure: scatter plot of Host International returns (y-axis) versus overall market returns (x-axis) with fitted regression line.]
Fitted line is $\hat{R} = 0.14 + 1.60 R_m$, with $\hat{\sigma} = 9.27$, $S_{xx} = 1117.90$.

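RECALL: a $100(1-\alpha)$ percent confidence interval for $\beta_1$ is
$$\left( b_1 - t_{\alpha/2;n-2} \frac{\hat{\sigma}}{\sqrt{S_{xx}}},\; b_1 + t_{\alpha/2;n-2} \frac{\hat{\sigma}}{\sqrt{S_{xx}}} \right).$$
95% confidence interval for $\beta_1$:
$$\left( 1.60 - 2.002 \frac{9.27}{\sqrt{1117.90}},\; 1.60 + 2.002 \frac{9.27}{\sqrt{1117.90}} \right) = (1.04, 2.16),$$
which doesn't contain 1.
A value of 1 for the slope does not seem plausible based on the data.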
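The interval arithmetic can be reproduced with a few lines of code. This is a minimal sketch that plugs in the values quoted on the slide (slope 1.60, residual standard deviation 9.27, $S_{xx} = 1117.90$, critical value 2.002); the function name is illustrative.

```python
import math

def slope_confidence_interval(b1, sigma_hat, sxx, t_crit):
    """Return (lower, upper) for b1 -/+ t_crit * sigma_hat / sqrt(Sxx)."""
    half_width = t_crit * sigma_hat / math.sqrt(sxx)
    return b1 - half_width, b1 + half_width

# Values taken from the market model example.
lower, upper = slope_confidence_interval(b1=1.60, sigma_hat=9.27,
                                         sxx=1117.90, t_crit=2.002)
contains_one = lower <= 1.0 <= upper

print(f"95% CI for beta_1: ({lower:.2f}, {upper:.2f})")
print(f"Interval contains 1: {contains_one}")
```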

Hypothesis testing equivalent:
$$H_0: \beta_1 = 1 \quad \text{versus} \quad H_1: \beta_1 \neq 1$$
Test statistic:
$$T = \frac{b_1 - 1}{\hat{\sigma}/\sqrt{S_{xx}}} = \frac{1.60 - 1}{9.27/\sqrt{1117.90}} = 2.16.$$
So if $T \sim t_{58}$, the p-value for the test is
$$p = \Pr(|T| \geq 2.16) = 2\Pr(T \geq 2.16) = 0.0349,$$
so that we reject $H_0: \beta_1 = 1$ at the 5% level.
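A sketch of the same test in code. Since the standard library has no t distribution, the tail probability is obtained here by integrating the t density numerically with Simpson's rule; this is an illustrative substitute for statistical tables or software, not the method used in the lecture. The computed p-value should land close to the quoted 0.0349, with small differences due to rounding of T.

```python
import math

def t_pdf(u, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

def t_upper_tail(t, df, steps=2000):
    """Pr(T >= t) for t >= 0, via composite Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3

# Values taken from the market model example (58 degrees of freedom).
b1, sigma_hat, sxx, df = 1.60, 9.27, 1117.90, 58

t_stat = (b1 - 1) / (sigma_hat / math.sqrt(sxx))
p_value = 2 * t_upper_tail(abs(t_stat), df)

print(f"T = {t_stat:.2f}, two-sided p-value = {p_value:.4f}")
print("Reject H0 at the 5% level" if p_value < 0.05 else "Do not reject H0")
```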
Week 2 Lecture 3 - Example 2: Risk assessment from financial reports
DATA Collection:
Investors are interested in the riskiness of a stock.
Want company financial reports to provide information helpful for assessing risk.
Seven accounting-determined measures of risk (available from a company's financial reports):
Dividend payout, current ratio, asset size, asset growth, leverage, variability in earnings, covariability in earnings.
These were computed for 25 well-known stocks based on annual reports.
Experiment
Data sent to a random sample of 500 financial analysts, of which 209 responded.
Mean rating assigned by the 209 analysts recorded for each of the 25 stocks.
Mean rating by analysts taken as a reasonable surrogate of market risk for each stock.
AIM: Want to predict market risk from accounting measures: response is market risk, predictors are accounting measures (multiple).
Week 2 Lecture 3 - Example 2: Risk assessment example
[Figure: estimated market risk (y-axis) versus log(asset size) (x-axis).]
Fitted line is $\hat{y} = 8.143 - 0.412x$ and $R^2 = 0.21$.
Week 2 Lecture 3 - Prediction in the simple linear regression model
The reason for building a simple linear regression model is often to predict a new response value when the value of the predictor is known.
Example: risk assessment data
Riskiness of a stock rated by 209 financial analysts (response).
Predictors are various accounting-determined measures of risk.
Aim: Simple linear regression model for risk with asset size as predictor.
Outcome: For a company not assessed by the financial analysts, we can determine asset size from company reports and predict risk using the fitted model.
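As a minimal sketch, prediction with the fitted line from the risk assessment example ($\hat{y} = 8.143 - 0.412x$, with x the log of asset size) amounts to a single evaluation. The input value below is hypothetical, and the base of the logarithm is not stated in the slides, so the predictor is passed in already on the log scale.

```python
def predict_risk(log_asset_size, b0=8.143, b1=-0.412):
    """Predicted market risk from the fitted line y_hat = b0 + b1 * x."""
    return b0 + b1 * log_asset_size

# Hypothetical company with log(asset size) equal to 5.0.
x0 = 5.0
print(f"Predicted risk at x0 = {x0}: {predict_risk(x0):.3f}")
```

Because the slope is negative, larger companies receive lower predicted risk; with $R^2 = 0.21$ the point prediction alone carries considerable uncertainty.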
Week 2 Lecture 3 - Confidence intervals for the mean and prediction intervals
Prediction of a new response value when the predictor is $x_0$:
$$\hat{y}(x_0) = b_0 + b_1 x_0.$$
True conditional mean of the response at $x_0$:
$$\beta_0 + \beta_1 x_0.$$
New response value $y_0$ when the predictor is $x_0$: write
$$y_0 = \beta_0 + \beta_1 x_0 + \varepsilon_0,$$
with $\varepsilon_0$ independent of previous responses, normal with mean zero and variance $\sigma^2$.
Want to find a confidence interval for the conditional mean, and an interval which covers $y_0$ with specified confidence (prediction interval).
Week 2 Lecture 3 - Confidence and Prediction Intervals
The confidence interval for the conditional mean will reflect our uncertainty due to estimating $\beta_0$, $\beta_1$.
The prediction interval will reflect our uncertainty due to estimating $\beta_0$, $\beta_1$, and the level of variation of the responses about the conditional mean (captured by our estimate of $\sigma^2$).
First we'll define a statistic which can be used for constructing a confidence interval for the conditional mean at $x_0$.
Consider $\hat{y}(x_0) = b_0 + b_1 x_0$.
$\hat{y}(x_0)$ is a Gaussian random variable.
$E(\hat{y}(x_0)) = E(b_0 + b_1 x_0) = \beta_0 + \beta_1 x_0$.
$\mathrm{Var}(\hat{y}(x_0)) = \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right)$.
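Week 2 Lecture 3 - Confidence interval for the mean
$$
\begin{aligned}
\mathrm{Var}(\hat{y}(x_0)) &= \mathrm{Var}(b_0 + b_1 x_0) \\
&= \mathrm{Var}(b_0) + x_0^2 \mathrm{Var}(b_1) + 2 x_0 \mathrm{Cov}(b_0, b_1) \\
&= \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right) + x_0^2 \frac{\sigma^2}{S_{xx}} - 2 x_0 \frac{\bar{x} \sigma^2}{S_{xx}} \\
&= \sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right). \qquad (1)
\end{aligned}
$$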
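The algebra in (1) can be sanity-checked numerically. The sketch below plugs hypothetical values of $\sigma^2$, the design points and $x_0$ into the expressions for $\mathrm{Var}(b_0)$, $\mathrm{Var}(b_1)$ and $\mathrm{Cov}(b_0, b_1)$ used in the derivation and compares the result with the closed form.

```python
# Hypothetical error variance, design points and prediction point.
sigma2 = 4.0
x = [1.0, 2.0, 4.0, 7.0, 11.0]
x0 = 5.0

n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

# Variances and covariance of the least squares estimates.
var_b0 = sigma2 * (1 / n + xbar ** 2 / sxx)
var_b1 = sigma2 / sxx
cov_b0_b1 = -sigma2 * xbar / sxx

# Variance of b0 + b1 * x0 expanded term by term, and the closed form (1).
expanded = var_b0 + x0 ** 2 * var_b1 + 2 * x0 * cov_b0_b1
closed_form = sigma2 * (1 / n + (x0 - xbar) ** 2 / sxx)

print(f"expanded = {expanded:.6f}, closed form = {closed_form:.6f}")
```

The closed form makes clear that the variance is smallest at $x_0 = \bar{x}$ and grows as $x_0$ moves away from the centre of the data.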
$\hat{y}(x_0)$ is normally distributed, so
$$\frac{\hat{y}(x_0) - \beta_0 - \beta_1 x_0}{\sigma \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}} \sim N(0, 1).$$
As in previous lectures,
$$\frac{(n-2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}$$
independently of $\hat{y}(x_0)$ (since $\hat{y}(x_0)$ is a linear combination of $b_0$, $b_1$, both independent of $\hat{\sigma}^2$).
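We have
$$\frac{\hat{y}(x_0) - \beta_0 - \beta_1 x_0}{\hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}} \sim t_{n-2}.$$
Write $t_{\alpha/2,n-2}$ for the upper $100 \times \alpha/2$ percentage point of the t distribution with $n-2$ degrees of freedom. Then
$$\Pr\left( -t_{\alpha/2,n-2} \leq \frac{\hat{y}(x_0) - \beta_0 - \beta_1 x_0}{\hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}} \leq t_{\alpha/2,n-2} \right) = 1 - \alpha.$$
Confidence interval for $\beta_0 + \beta_1 x_0$:
$$\hat{y}(x_0) \pm t_{\alpha/2,n-2}\, \hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}.$$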
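A sketch of the interval computation on hypothetical data. The confidence interval follows the formula above. The prediction interval uses the standard form with an extra 1 under the square root, reflecting the variance of the new error term; that formula is not derived in this excerpt and is included here as an assumption. The critical value 3.182 is the upper 2.5% point of the t distribution with 3 degrees of freedom, supplied by hand.

```python
import math

def mean_ci_and_pi(x, y, x0, t_crit):
    """Return (y_hat, confidence interval, prediction interval) at x0."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    # Residual standard deviation with n - 2 degrees of freedom.
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(ss_res / (n - 2))

    y_hat = b0 + b1 * x0
    scale = 1 / n + (x0 - xbar) ** 2 / sxx
    half_ci = t_crit * sigma_hat * math.sqrt(scale)
    # Assumed standard prediction interval: adds 1 for the new error term.
    half_pi = t_crit * sigma_hat * math.sqrt(1 + scale)
    return (y_hat,
            (y_hat - half_ci, y_hat + half_ci),
            (y_hat - half_pi, y_hat + half_pi))

# Hypothetical data, for illustration only (n = 5, so n - 2 = 3).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

y_mid, ci_mid, pi_mid = mean_ci_and_pi(x, y, x0=3.0, t_crit=3.182)
y_edge, ci_edge, pi_edge = mean_ci_and_pi(x, y, x0=5.0, t_crit=3.182)

print(f"x0 = 3: CI = ({ci_mid[0]:.3f}, {ci_mid[1]:.3f}), "
      f"PI = ({pi_mid[0]:.3f}, {pi_mid[1]:.3f})")
print(f"x0 = 5: CI = ({ci_edge[0]:.3f}, {ci_edge[1]:.3f}), "
      f"PI = ({pi_edge[0]:.3f}, {pi_edge[1]:.3f})")
```

Both intervals are narrowest at $x_0 = \bar{x}$, and the prediction interval is always wider than the confidence interval at the same $x_0$.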
Week 2 Lecture 2 - Learning Expectations.
Understand the quantities within the analysis of variance table.
Be able to use the ANOVA table to answer questions regarding the suitability of a postulated statistical model.
Be able to formulate and evaluate a confidence interval for the mean.
Be able to formulate and evaluate a prediction interval.