
Analysis of Variance and Design of Experiments - I

MODULE II
LECTURE - 4

GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE


Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Regression model for the general linear hypothesis


Let $Y_1, Y_2, \ldots, Y_n$ be a sequence of $n$ independent random variables associated with responses. Then we can write it as

$$E(Y_i) = \sum_{j=1}^{p} \beta_j x_{ij}, \quad i = 1, 2, \ldots, n,$$

$$\mathrm{Var}(Y_i) = \sigma^2.$$

This is the linear model in the expectation form where $\beta_1, \beta_2, \ldots, \beta_p$ are the unknown parameters and the $x_{ij}$'s are the known values of the independent covariates $X_1, X_2, \ldots, X_p$. Alternatively, the linear model can be expressed as

$$Y_i = \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$

where the $\varepsilon_i$'s are independently and identically distributed random error components with mean 0 and variance $\sigma^2$, i.e.,

$$E(\varepsilon_i) = 0, \quad \mathrm{Var}(\varepsilon_i) = \sigma^2 \quad \text{and} \quad \mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0 \; (i \neq j).$$

In matrix notation, the linear model can be expressed as

$$Y = X\beta + \varepsilon$$

where $Y = (Y_1, Y_2, \ldots, Y_n)'$ is an $n \times 1$ vector of observations on the response variable,

the matrix
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$
is the $n \times p$ matrix of $n$ observations on the $p$ independent covariates $X_1, X_2, \ldots, X_p$,

$\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is a $p \times 1$ vector of unknown regression parameters (or regression coefficients) $\beta_1, \beta_2, \ldots, \beta_p$ associated with $X_1, X_2, \ldots, X_p$, respectively, and

$\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$ is an $n \times 1$ vector of random errors or disturbances.


We assume that $E(\varepsilon) = 0$, the covariance matrix $V(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I_n$, and $\mathrm{rank}(X) = p$. In the context of analysis of variance and design of experiments,

the matrix $X$ is termed the design matrix; the unknown $\beta_1, \beta_2, \ldots, \beta_p$ are termed effects; the covariates $X_1, X_2, \ldots, X_p$ are counter variables or indicator variables, where $x_{ij}$ counts the number of times the effect $\beta_j$ occurs in the $i$th observation $x_i$.

$x_{ij}$ mostly takes the value 1 or 0, but not always.

The value $x_{ij} = 1$ indicates the presence of the effect $\beta_j$ in the $i$th observation and $x_{ij} = 0$ indicates its absence.

Note that in the linear regression model, the covariates are usually continuous variables. When some of the covariates are counter variables and the rest are continuous variables, the model is called a mixed model and is used in the analysis of covariance.
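As a quick illustration of the matrix form $Y = X\beta + \varepsilon$, the following is a minimal Python/NumPy sketch (the data values, dimensions and seed are hypothetical and only for illustration). It simulates such a model and computes the standard least-squares estimate $\hat{\beta} = (X'X)^{-1}X'Y$, which exists here because $\mathrm{rank}(X) = p$.

```python
import numpy as np

# Hypothetical data: n = 6 observations on p = 2 covariates (full column rank).
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [2.0, 1.0],
              [2.0, 4.0],
              [3.0, 2.0],
              [3.0, 5.0]])

beta_true = np.array([1.5, -0.5])        # unknown parameters (used only to simulate)
rng = np.random.default_rng(0)
eps = rng.normal(0.0, 1.0, size=6)       # errors with mean 0 and variance sigma^2 = 1
Y = X @ beta_true + eps                  # Y = X beta + eps

# Least squares: beta_hat = (X'X)^{-1} X'Y (lstsq avoids forming the inverse explicitly).
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)
```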

Relationship between the regression model and analysis of variance model


The same linear model is used in linear regression analysis as well as in the analysis of variance, so it is important to understand the role of the linear model in both contexts. Consider the multiple linear model

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p + \varepsilon.$$

In the case of the analysis of variance model, the one-way classification considers only one covariate, the two-way classification model considers two covariates, the three-way classification model considers three covariates, and so on. If $\beta$, $\gamma$ and $\delta$ denote the effects associated with the covariates $X$, $Z$ and $W$, which are counter variables, then

One-way model: $Y = \mu + \beta X + \varepsilon$
Two-way model: $Y = \mu + \beta X + \gamma Z + \varepsilon$
Three-way model: $Y = \mu + \beta X + \gamma Z + \delta W + \varepsilon$

and so on.

Consider an example of agricultural yield. The study variable denotes the yield, which depends on various covariates $X_1, X_2, \ldots, X_p$. In the case of regression analysis, the covariates $X_1, X_2, \ldots, X_p$ are different variables like temperature, quantity of fertilizer, amount of irrigation, etc.

Now consider the case of the one-way model and try to understand its interpretation in terms of the multiple regression model. The covariate $X$ is now measured at different levels, e.g., if $X$ is the quantity of fertilizer, then suppose there are $p$ possible values, say 1 Kg., 2 Kg., ..., $p$ Kg.; then $X_1, X_2, \ldots, X_p$ denote these $p$ values in the following way. The linear model can now be expressed as
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p + \varepsilon$$
by defining

$$X_1 = \begin{cases} 1 & \text{if effect of 1 Kg. fertilizer is present} \\ 0 & \text{if effect of 1 Kg. fertilizer is absent} \end{cases}$$
$$X_2 = \begin{cases} 1 & \text{if effect of 2 Kg. fertilizer is present} \\ 0 & \text{if effect of 2 Kg. fertilizer is absent} \end{cases}$$
$$\vdots$$
$$X_p = \begin{cases} 1 & \text{if effect of } p \text{ Kg. fertilizer is present} \\ 0 & \text{if effect of } p \text{ Kg. fertilizer is absent.} \end{cases}$$

If the effect of 1 Kg. of fertilizer is present, then the other effects will obviously be absent and the linear model is expressible as
$$Y = \beta_0 + \beta_1 (X_1 = 1) + \beta_2 (X_2 = 0) + \ldots + \beta_p (X_p = 0) + \varepsilon = \beta_0 + \beta_1 + \varepsilon.$$

If the effect of 2 Kg. of fertilizer is present, then
$$Y = \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 1) + \ldots + \beta_p (X_p = 0) + \varepsilon = \beta_0 + \beta_2 + \varepsilon.$$

If the effect of $p$ Kg. of fertilizer is present, then
$$Y = \beta_0 + \beta_1 (X_1 = 0) + \beta_2 (X_2 = 0) + \ldots + \beta_p (X_p = 1) + \varepsilon = \beta_0 + \beta_p + \varepsilon$$
and so on.

If the experiment with 1 Kg. of fertilizer is repeated $n_1$ times, then $n_1$ observations on the response variable are recorded, which can be represented as

$$Y_{11} = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 0 + \varepsilon_{11}$$
$$Y_{12} = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 0 + \varepsilon_{12}$$
$$\vdots$$
$$Y_{1n_1} = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 0 + \varepsilon_{1n_1}.$$
If $X_2 = 1$ is repeated $n_2$ times, then on the same lines $n_2$ observations on the response variable are recorded, which can be represented as
$$Y_{21} = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \ldots + \beta_p \cdot 0 + \varepsilon_{21}$$
$$Y_{22} = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \ldots + \beta_p \cdot 0 + \varepsilon_{22}$$
$$\vdots$$
$$Y_{2n_2} = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \ldots + \beta_p \cdot 0 + \varepsilon_{2n_2}.$$

The experiment is continued, and if $X_p = 1$ is repeated $n_p$ times, then on the same lines
$$Y_{p1} = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 1 + \varepsilon_{p1}$$
$$Y_{p2} = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 1 + \varepsilon_{p2}$$
$$\vdots$$
$$Y_{pn_p} = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 1 + \varepsilon_{pn_p}.$$

All these $n_1, n_2, \ldots, n_p$ observations can be represented together as


$$\begin{pmatrix} y_{11} \\ y_{12} \\ \vdots \\ y_{1n_1} \\ y_{21} \\ y_{22} \\ \vdots \\ y_{2n_2} \\ \vdots \\ y_{p1} \\ y_{p2} \\ \vdots \\ y_{pn_p} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & \cdots & 0 \\ 1 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{pmatrix} + \begin{pmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \vdots \\ \varepsilon_{1n_1} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \vdots \\ \varepsilon_{2n_2} \\ \vdots \\ \varepsilon_{p1} \\ \varepsilon_{p2} \\ \vdots \\ \varepsilon_{pn_p} \end{pmatrix}$$

or

$$Y = X\beta + \varepsilon.$$
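To make the structure of this design matrix concrete, here is a small Python/NumPy sketch (the function name and the replication numbers are hypothetical) that stacks the rows exactly as displayed above: a column of ones for $\beta_0$ followed by one indicator column per fertilizer level, with $n_j$ identical rows for level $j$.

```python
import numpy as np

def one_way_design(n_per_level):
    """Design matrix for the one-way model: intercept column plus one indicator
    column per level, with n_j replicate rows for level j (as displayed above)."""
    p = len(n_per_level)
    blocks = []
    for j, n_j in enumerate(n_per_level):
        block = np.zeros((n_j, p + 1))
        block[:, 0] = 1.0          # column corresponding to beta_0
        block[:, j + 1] = 1.0      # indicator of the j-th fertilizer level
        blocks.append(block)
    return np.vstack(blocks)

# Hypothetical replication: n1 = 3, n2 = 2, n3 = 4 observations for p = 3 levels.
X = one_way_design([3, 2, 4])
print(X.shape)   # (9, 4): 9 observations; columns for beta_0, beta_1, beta_2, beta_3
print(X)
```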

In the two-way analysis of variance model, there are two covariates and the linear model is expressible as

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p + \gamma_1 Z_1 + \gamma_2 Z_2 + \ldots + \gamma_q Z_q + \varepsilon$$

where $X_1, X_2, \ldots, X_p$ denote, e.g., the $p$ levels of quantity of fertilizer, say 1 Kg., 2 Kg., ..., $p$ Kg., and $Z_1, Z_2, \ldots, Z_q$ denote, e.g., the $q$ levels of irrigation, say 10 Cms., 20 Cms., ..., $10q$ Cms. The levels $X_1, X_2, \ldots, X_p$, $Z_1, Z_2, \ldots, Z_q$ are defined as counter variables indicating the presence or absence of the effect, as in the earlier case. If the effects of $X_1$ and $Z_1$ are present, i.e., 1 Kg. of fertilizer and 10 Cms. of irrigation are used, then the linear model is written as

$$Y = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \ldots + \beta_p \cdot 0 + \gamma_1 \cdot 1 + \gamma_2 \cdot 0 + \ldots + \gamma_q \cdot 0 + \varepsilon = \beta_0 + \beta_1 + \gamma_1 + \varepsilon.$$


If $X_2 = 1$ and $Z_2 = 1$ are used, then the model is $Y = \beta_0 + \beta_2 + \gamma_2 + \varepsilon$. The design matrix can be written accordingly, as in the one-way analysis of variance case. The three-way analysis of variance model is
$$Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p + \gamma_1 Z_1 + \ldots + \gamma_q Z_q + \delta_1 W_1 + \ldots + \delta_r W_r + \varepsilon.$$
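The rows of the two-way design matrix can be sketched in the same way. The helper below is hypothetical (Python/NumPy): it encodes a single observation by the fertilizer level and the irrigation level that are present, and for $X_2 = 1$, $Z_2 = 1$ it reproduces $E(Y) = \beta_0 + \beta_2 + \gamma_2$.

```python
import numpy as np

def two_way_row(p, q, fert_level, irrig_level):
    """One design-matrix row for the two-way model: intercept, p fertilizer
    indicators (beta's) and q irrigation indicators (gamma's)."""
    row = np.zeros(1 + p + q)
    row[0] = 1.0                  # beta_0
    row[fert_level] = 1.0         # X_j = 1 for the fertilizer level used (1..p)
    row[p + irrig_level] = 1.0    # Z_k = 1 for the irrigation level used (1..q)
    return row

# Hypothetical example: p = 3 fertilizer levels, q = 2 irrigation levels;
# the observation uses 2 Kg. of fertilizer (X_2 = 1) and 20 Cms. of irrigation (Z_2 = 1).
print(two_way_row(3, 2, fert_level=2, irrig_level=2))
# [1. 0. 1. 0. 0. 1.]  ->  E(Y) = beta_0 + beta_2 + gamma_2
```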

The regression parameters $\beta$'s can be fixed or random.

If all $\beta$'s are unknown constants, they are called the parameters of the model and the model is called a fixed-effects model, or model I. The objective in this case is to make inferences about the parameters and the error variance $\sigma^2$.

If for some $j$, $x_{ij} = 1$ for all $i = 1, 2, \ldots, n$, then $\beta_j$ is termed an additive constant. In this case, $\beta_j$ occurs with every observation, and so it is also called the general mean effect.

If all $\beta$'s are observable random variables except the additive constant, then the linear model is termed a random-effects model, model II, or variance components model. The objective in this case is to make inferences about the variances of the $\beta$'s, i.e., $\sigma^2_{\beta_1}, \sigma^2_{\beta_2}, \ldots, \sigma^2_{\beta_p}$, the error variance $\sigma^2$, and/or certain functions of them.

If some parameters are fixed and some are random variables, then the model is called a mixed-effects model or model III. In the mixed-effects model, at least one $\beta$ is a constant and at least one $\beta$ is a random variable. The objective is to make inferences about the fixed-effect parameters, the variances of the random effects, and the error variance $\sigma^2$.
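To illustrate the distinction between the models, the following Python/NumPy sketch simulates data from a random-effects model (model II) under simplifying, hypothetical assumptions (a single factor whose random effects share one common variance). Under model II the targets of inference are the variance components, not the realised values of the individual effects.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical variance components for a simplified model II.
mu = 10.0            # additive constant (general mean effect), kept fixed
sigma_beta2 = 4.0    # common variance of the random effects
sigma2 = 1.0         # error variance
p, n_per_level = 5, 20

# Each level's effect is a random draw; every observation at that level shares it.
beta = rng.normal(0.0, np.sqrt(sigma_beta2), size=p)
y = np.concatenate([mu + b + rng.normal(0.0, np.sqrt(sigma2), size=n_per_level)
                    for b in beta])

# Inference under model II targets sigma_beta2 and sigma2, not the individual beta's.
print(y.shape)   # (100,)
```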
