Uploaded by 薛小芮

purdue stat 511


Slide 1

[…], say $I$, groups (populations), one often uses an analysis of variance model, or ANOVA.

For the $I$ populations, we use $\mu_1, \mu_2, \ldots, \mu_I$ and $\sigma_1, \sigma_2, \ldots, \sigma_I$ to denote their respective means and standard deviations. Similarly, the sample mean, sample standard deviation, and sample size of the $i$th population are denoted by $\bar{x}_i$, $s_i$, and $J_i$.

Of most interest are the comparisons between the $\mu_i$'s.

[…] grown in 12 open-top chambers, which are subject to 4 treatments, 3 each, with O3 and SO2 present/absent. The total yield was measured for each chamber.

             Sulfur Dioxide
Ozone      Absent    Present
Absent     1.52      1.49
           1.85      1.55
           1.39      1.21
Present    1.15      0.65
           1.30      0.76
           1.57      0.69

Slide 2

For $J_i$'s large, by CLT,
$$\bar{X}_i \sim N(\mu_i, \sigma_i^2 / J_i),$$
and $s_i^2$ are reliable estimates of $\sigma_i^2$. For $J_i$'s small, one assumes normality and $\sigma_1^2 = \cdots = \sigma_I^2 = \sigma^2$.

[…]
$$\bar{x}_i = \frac{1}{J_i} \sum_{j=1}^{J_i} x_{ij},$$
where $x_{ij}$ is the $j$th observation in the $i$th group. The grand mean is
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{I} \sum_{j=1}^{J_i} x_{ij},$$
where $n = \sum_{i=1}^{I} J_i$ is the total number of observations in the $I$ groups.

trt   $J_i$   $\sum_j x_{ij}$   $\bar{x}_i$
1     3       4.76              1.5867
2     3       4.25              1.4167
3     3       4.02              1.3400
4     3       2.10              0.7000

[…] observations is $\sum_i \sum_j x_{ij} = 15.13$, so the grand mean is $\bar{x} = 15.13/12 = 1.2608$.

The $J_i$'s here are all equal so $\bar{x}$ is the mean of $\bar{x}_i$'s. This would not be the case for $J_i$'s unequal.

C. Gu Spring 2016

STAT 511 ANOVA and Regression 2

Slide 3

[…] $\sigma_1^2 = \cdots = \sigma_I^2 = \sigma^2$, one would like to estimate the common variance $\sigma^2$ using all available information. Such information is contained in the sum of squared errors,
$$\mathrm{SSE} = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (x_{ij} - \bar{x}_i)^2 = \sum_{i=1}^{I} (J_i - 1) s_i^2.$$
The pooled variance estimate is given by
$$s_p^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n - I},$$
where $n - I = \sum_{i=1}^{I} (J_i - 1)$.

trt   $\sum_j (x_{ij} - \bar{x}_i)^2$   $s_i^2$
1     .112467                           .056233
2     .065867                           .032933
3     .090600                           .045300
4     .006200                           .003100

SSE is $\sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = .275133$, and MSE is $s_p^2 = .275133/(12 - 4) = .034392$.

For $J_i$'s all equal, $s_p^2 = \sum_i s_i^2 / I$. In general, $s_p^2$ is a weighted mean of $s_i^2$ with weights $(J_i - 1)$.
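The pooled variance on this slide can be checked numerically. The following is a minimal sketch using only the Python standard library; the assignment of treatment numbers 1 to 4 to the four cells of the Slide 1 table is inferred from the treatment totals on Slide 2, and the variable names are illustrative.

```python
from statistics import variance

# Yields from Slide 1, grouped by treatment.
# The mapping of trt 1..4 to table cells is inferred from the totals
# 4.76, 4.25, 4.02, 2.10 on Slide 2 (an assumption of this sketch).
groups = [
    [1.52, 1.85, 1.39],  # trt 1: O3 absent,  SO2 absent
    [1.49, 1.55, 1.21],  # trt 2: O3 absent,  SO2 present
    [1.15, 1.30, 1.57],  # trt 3: O3 present, SO2 absent
    [0.65, 0.76, 0.69],  # trt 4: O3 present, SO2 present
]

# Sample variances s_i^2 (divisor J_i - 1).
s2 = [variance(g) for g in groups]

# SSE = sum_i (J_i - 1) s_i^2 and n - I = sum_i (J_i - 1).
sse = sum((len(g) - 1) * v for g, v in zip(groups, s2))
df_error = sum(len(g) - 1 for g in groups)

# Pooled variance estimate s_p^2 = MSE = SSE / (n - I).
mse = sse / df_error

print([round(v, 6) for v in s2])
print(round(sse, 6), round(mse, 6))
```

Because all $J_i$ equal 3 here, `mse` also equals the plain average of the four $s_i^2$ values.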

Slide 4

[…] groups, one calculates the sum of squares for treatments,
$$\mathrm{SSTr} = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (\bar{x}_i - \bar{x})^2 = \sum_{i=1}^{I} J_i (\bar{x}_i - \bar{x})^2.$$
It can be shown that
$$\sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 + \sum_i \sum_j (\bar{x}_i - \bar{x})^2,$$
where $\mathrm{SST} = \sum_i \sum_j (x_{ij} - \bar{x})^2$.

For $I = 2$, it can be shown that
$$\mathrm{SSTr} = \frac{(\bar{x}_1 - \bar{x}_2)^2}{\frac{1}{J_1} + \frac{1}{J_2}}.$$

SSTr is given by $\sum_i 3(\bar{x}_i - \bar{x})^2 = 1.353758$, and SST is given by $\sum_i \sum_j (x_{ij} - \bar{x})^2 = 1.628892$. It is easy to verify that SST = SSTr + SSE.

If one ignores the grouping, then the sample variance of the $n$ observations is
$$s^2 = \frac{1}{n - 1} \mathrm{SST}.$$


Slide 5

[…] degrees of freedom $n - I$ and $n - 1$. Similarly, SSTr has df $I - 1$. Note that
$$n - 1 = (n - I) + (I - 1).$$
Dividing SS by the corresponding df, one gets a mean square (MS). An ANOVA table summarizes all the information.

Src     SS     df      MS
Trt     SSTr   I - 1   SSTr/(I - 1)
Error   SSE    n - I   SSE/(n - I)
Total   SST    n - 1

MSE is an unbiased estimate of $\sigma^2$. For $\mu_i$'s all equal, MSTr is also an unbiased estimate of $\sigma^2$. When $\mu_i$'s are not all equal, MSTr tends to be larger.

To test the hypotheses
$$H_0: \mu_1 = \cdots = \mu_I \quad \text{vs.} \quad H_a: \text{o.w.},$$
calculate
$$f = \frac{\mathrm{MSTr}}{\mathrm{MSE}},$$
and reject $H_0$ when $f > F_{\alpha, \nu_1, \nu_2}$, where $\nu_1 = I - 1$ and $\nu_2 = n - I$.

Slide 6

F-Distribution

[…] and $Z_j \sim N(0, 1)$, $j = 1, \ldots, n$, independent. The distribution of
$$\frac{\sum_{i=1}^{m} Y_i^2 / m}{\sum_{j=1}^{n} Z_j^2 / n}$$
is called an F-distribution with numerator degrees of freedom $m$ and denominator degrees of freedom $n$.

[Figure: densities of F(3,8) and F(8,3), plotted over 0 to 6.]

[…] ANOVA table is given by

Src     SS       df   MS
Trt     1.3538   3    .4513
Error   0.2751   8    .0344
Total   1.6289   11

It is easy to calculate
$$f = \frac{.4513}{.0344} = 13.12,$$
[…] significance level.

[…] qf(.95,3,8).
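The ANOVA table above can be reproduced directly from the Slide 1 data. The sketch below is illustrative only: it uses plain Python, the variable names are not from the slides, and the grouping of observations into treatments is inferred from the treatment totals on Slide 2.

```python
# Yields from Slide 1, one list per treatment (grouping inferred from Slide 2).
groups = [
    [1.52, 1.85, 1.39],
    [1.49, 1.55, 1.21],
    [1.15, 1.30, 1.57],
    [0.65, 0.76, 0.69],
]

I = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total number of observations
grand_mean = sum(sum(g) for g in groups) / n
means = [sum(g) / len(g) for g in groups]

# Sums of squares as defined on Slides 3 and 4.
sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

# Mean squares and the F statistic.
mstr = sstr / (I - 1)
mse = sse / (n - I)
f = mstr / mse

print(f"Trt    {sstr:.4f}  {I - 1:2d}  {mstr:.4f}")
print(f"Error  {sse:.4f}  {n - I:2d}  {mse:.4f}")
print(f"Total  {sst:.4f}  {n - 1:2d}")
print(f"f = {f:.2f}")
```

The value of `f` would then be compared with the critical value returned by the R call `qf(.95,3,8)` mentioned on the slide.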


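Slide 7

[…] With $I = 2$ groups,
$$f = \frac{\mathrm{MSTr}}{\mathrm{MSE}} = \frac{(\bar{x}_1 - \bar{x}_2)^2}{s_p^2 \left( \frac{1}{J_1} + \frac{1}{J_2} \right)}.$$
Reject $H_0$ when $f > F_{\alpha, 1, n-2}$.

Compare this with the t-test for $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \neq \mu_2$,
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{J_1} + \frac{1}{J_2}}},$$
with a rejection region $|t| > t_{\alpha/2, n-2}$. We notice that $f = t^2$. Actually, one also has $F_{\alpha, 1, \nu} = t^2_{\alpha/2, \nu}$, so the F-test is equivalent to the t-test we learned earlier.

Since SST = SSTr + SSE, one only needs to calculate two of the three terms.
$$\mathrm{SST} = \sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j x_{ij}^2 - \frac{(\sum_i \sum_j x_{ij})^2}{n},$$
$$\mathrm{SSTr} = \sum_i \sum_j (\bar{x}_i - \bar{x})^2 = \sum_i \frac{(\sum_j x_{ij})^2}{J_i} - \frac{(\sum_i \sum_j x_{ij})^2}{n},$$
$$\mathrm{SSE} = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j x_{ij}^2 - \sum_i \frac{(\sum_j x_{ij})^2}{J_i}.$$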

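Slide 8

Sample                1     2     3
                      12    8     6
                      10    5     2
                            3     4
                            4
$J_i$                 2     4     3
$\sum_j x_{ij}$       22    20    12
$\sum_j x_{ij}^2$     244   114   56
$\bar{x}_i$           11    5     4

$n = 9$, $\sum_i \sum_j x_{ij} = 54$, $\bar{x} = 6$.

$$\mathrm{SSE} = 244 + 114 + 56 - \left( \frac{22^2}{2} + \frac{20^2}{4} + \frac{12^2}{3} \right) = 24,$$
$$\mathrm{SSTr} = \left( \frac{22^2}{2} + \frac{20^2}{4} + \frac{12^2}{3} \right) - \frac{54^2}{9} = 66.$$

Since $f = \frac{66/2}{24/6} = 8.25$ and $F_{.05,2,6} = 5.14$, we reject $H_0: \mu_1 = \mu_2 = \mu_3$ at the 5% significance level.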
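The shortcut formulas from Slide 7 applied to this small data set can be sketched as follows. The code is a plain-Python illustration with made-up variable names; it only restates the arithmetic shown above.

```python
# The three samples from the table on this slide.
samples = [[12, 10], [8, 5, 3, 4], [6, 2, 4]]

I = len(samples)
n = sum(len(s) for s in samples)                          # 9
total = sum(sum(s) for s in samples)                      # 54
sum_sq = sum(x * x for s in samples for x in s)           # 244 + 114 + 56
group_term = sum(sum(s) ** 2 / len(s) for s in samples)   # 22^2/2 + 20^2/4 + 12^2/3
correction = total ** 2 / n                               # 54^2/9

# Shortcut formulas: only two of the three need to be computed,
# the third follows from SST = SSTr + SSE.
sst = sum_sq - correction
sstr = group_term - correction
sse = sum_sq - group_term

f = (sstr / (I - 1)) / (sse / (n - I))
print(sst, sstr, sse, f)  # prints 90.0 66.0 24.0 8.25
```

Since 8.25 exceeds the tabulated $F_{.05,2,6} = 5.14$ quoted on the slide, the conclusion matches.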


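Slide 9

[…] are derived from the fact that
$$\bar{X}_i \sim N(\mu_i, \sigma^2 / J_i).$$
A $(1 - \alpha)100\%$ CI for $\mu_i$ is
$$\bar{x}_i \pm t_{\alpha/2, \nu} \sqrt{\frac{s_p^2}{J_i}},$$
where $\nu = n - I$. A $(1 - \alpha)100\%$ CI for $\mu_1 - \mu_2$ is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, \nu} \sqrt{s_p^2 \left( \frac{1}{J_1} + \frac{1}{J_2} \right)}.$$
Tests for hypotheses concerning these parameters can be similarly constructed.

[…] $\bar{x}_1 = 1.5867$, $\bar{x}_2 = 1.4167$, $s_p^2 = .0344 = .1855^2$, $J_1 = J_2 = 3$, $\nu = 8$.

A 95% CI for $\mu_1$ is $1.5867 \pm 2.306 \sqrt{.0344/3}$, or $(1.340, 1.834)$, where $t_{.025,8} = 2.306$.

A 95% CI for $\mu_1 - \mu_2$ is $.17 \pm 2.306(.1855)\sqrt{2/3}$, or $(-.179, .519)$. One would accept $H_0: \mu_1 = \mu_2$ at the 5% level.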

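Slide 10

[…] $\theta = c_1 \mu_1 + \cdots + c_I \mu_I$, is to be estimated by
$$\hat{\theta} = c_1 \bar{x}_1 + \cdots + c_I \bar{x}_I,$$
with a standard error
$$s_{\hat{\theta}} = s_p \sqrt{\frac{c_1^2}{J_1} + \cdots + \frac{c_I^2}{J_I}}.$$
When $c_1, \ldots, c_I$ add to zero, $\sum_i c_i = 0$, such a $\theta$ is called a contrast. In applications, contrasts are often of the most interest.

[…] contrast of interest is
$$\theta = (\mu_1 - \mu_2) - (\mu_3 - \mu_4).$$
$\theta = 0$ implies no interaction between O3 and SO2.

The estimate is given by
$$\hat{\theta} = \bar{x}_1 - \bar{x}_2 - \bar{x}_3 + \bar{x}_4 = -.47,$$
with a standard error
$$s_{\hat{\theta}} = .1855 \sqrt{4/3} = .2142.$$
A 95% CI for $\theta$ is $-.47 \pm 2.306(.2142)$, or $(-.964, .024)$. One would conclude $\theta = 0$ at the 5% level.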
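The contrast calculation on this slide can be sketched in a few lines. This is an illustration under stated assumptions: the group means, $s_p = .1855$, and $t_{.025,8} = 2.306$ are copied from Slides 2 and 9 rather than recomputed, the symbol $\theta$ and the helper name `contrast_ci` are choices made for this example.

```python
from math import sqrt

def contrast_ci(c, means, sizes, sp, t_crit):
    """Estimate, standard error, and CI for theta = sum_i c_i * mu_i."""
    if sum(c) != 0:
        raise ValueError("coefficients of a contrast must add to zero")
    estimate = sum(ci * m for ci, m in zip(c, means))
    se = sp * sqrt(sum(ci ** 2 / j for ci, j in zip(c, sizes)))
    return estimate, se, (estimate - t_crit * se, estimate + t_crit * se)

# Values quoted on Slides 2 and 9 (rounded as on the slides).
means = [1.5867, 1.4167, 1.3400, 0.7000]
sizes = [3, 3, 3, 3]
sp = 0.1855        # pooled standard deviation
t_crit = 2.306     # t_{.025, 8} as given on Slide 9

# Interaction contrast (mu_1 - mu_2) - (mu_3 - mu_4).
estimate, se, (lower, upper) = contrast_ci([1, -1, -1, 1], means, sizes, sp, t_crit)
print(round(estimate, 2), round(se, 4), round(lower, 3), round(upper, 3))
```

The interval contains zero, which is the basis for the conclusion stated on the slide.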


Slide 11

[…] for the area $A$ and radius $r$ of a circle; or (ii) $y = \frac{5}{9}(x - 32)$ for thermometer readings $x$ °F and $y$ °C.

Statistical relations: Variables tend to vary together, but there is no deterministic coupling. Among examples are (i) ages of married couples; and (ii) lengths and weights of snakes.

[Figure: two panels, area versus radius and weight (gm) versus length (cm).]

Slide 12

[…] of father-son pairs, Galton found, in the late 19th century, that for fathers taller than average, the average height of their sons is between their height and the average. Ditto for fathers shorter than average.

[Figure: scatter plot.]

$$Y = \beta_0 + \beta_1 x + \epsilon$$

$Y$: response or dependent var.
$x$: predictor or indep. var.
$\epsilon$: noise or random error

$Y$ varies randomly given $x$. The distribution of $Y$ varies systematically with […]
$$\mu_{Y \cdot x} = \beta_0 + \beta_1 x.$$


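Slide 13

$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$
It is usually assumed that $\epsilon_i \sim N(0, \sigma^2)$.

[…] $(x_i, y_i)$, and estimates model parameters $\beta_0$, $\beta_1$, and $\sigma^2$.

$\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ is a strong assumption. The normality assumption can sometimes be weakened to $\mu_{\epsilon_i} = 0$ and $\sigma^2_{\epsilon_i} = \sigma^2$.

[…] $Y = 12 + 8x + \epsilon$, where $\epsilon \sim N(0, 9)$. Since $Y \mid x = 1 \sim N(20, 9)$, one has
$$P(Y < 17 \mid x = 1) = P\left( Z < \frac{17 - 20}{3} \right) = .1587.$$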
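The probability in this example can be checked with the standard library's `statistics.NormalDist`. This is a small sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# Model from the example: Y = 12 + 8x + eps, eps ~ N(0, 9), so sd = 3.
beta0, beta1, sigma = 12, 8, 3

x = 1
mean_y = beta0 + beta1 * x           # conditional mean of Y given x = 1

# P(Y < 17 | x = 1) under the normal model.
p = NormalDist(mu=mean_y, sigma=sigma).cdf(17)

# The same probability via standardization, P(Z < (17 - 20)/3).
p_z = NormalDist().cdf((17 - mean_y) / sigma)

print(mean_y, round(p, 4), round(p_z, 4))
```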

Slide 14

[…] were caught and measured. The lengths and weights are listed on the left and plotted below.

Length   Weight
60       136
69       198
66       194
64       140
54       93
67       172
59       116
65       174
63       145

[Figure: weight (gm) versus length (cm).]


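Slide 15

[…]
$$Q = \sum_{i=1}^{n} (y_i - (\beta_0 + \beta_1 x_i))^2,$$
[…] least squares (LS) estimates of $(\beta_0, \beta_1)$,
$$b_1 = \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad b_0 = \hat{\beta}_0 = \bar{y} - b_1 \bar{x},$$
where
$$S_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}), \qquad S_{xx} = \sum_i (x_i - \bar{x})^2.$$

[…] female snakes. The LS estimate of regression […]

[Figure: weight (gm) versus length (cm), with the lines Y = -301 + 7.19X (Q = 1093.7) and Y = -227 + 6X (Q = 1347).]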

Slide 16

[…]edly) estimated by the fitted regression function
$$\hat{\mu}_{Y \cdot x} = \hat{Y} = b_0 + b_1 x.$$
At the data points, one has the fitted values (y-hat)
$$\hat{y}_i = b_0 + b_1 x_i,$$
and the residuals
$$e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i).$$
The fitted values and residuals satisfy
$$\sum_{i=1}^{n} \hat{y}_i = \sum_{i=1}^{n} y_i, \qquad \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} x_i e_i = 0.$$

The lengths and weights of female snakes.

x    y     $\hat{y}$   e
60   136   130.4       5.6
69   198   195.2       2.8
66   194   173.6       20.4
64   140   159.2       -19.2
54   93    87.3        5.7
67   172   180.8       -8.8
59   116   123.2       -7.2
65   174   166.4       7.6
63   145   152.0       -7.0


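Slide 17

Estimation of $\sigma^2$

[…] $Y_i = \mu + \epsilon_i$, where $\mu_{\epsilon_i} = 0$ and $\sigma^2_{\epsilon_i} = \sigma^2$. The estimate $\hat{y}_i = \hat{\mu} = \bar{y}$ actually minimizes
$$Q = \sum_{i=1}^{n} (y_i - \mu)^2.$$
An unbiased estimate of $\sigma^2$ is
$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 1} = \frac{\sum_{i=1}^{n} e_i^2}{n - 1},$$
where $\hat{y}_i$ contains one parameter.

[…] residual sum of squares
$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2,$$
and use
$$s^2 = \frac{\mathrm{SSE}}{n - 2} = \frac{\sum_i (y_i - \hat{y}_i)^2}{n - 2}.$$
Unbiasedness: $\mu_{s^2} = \sigma^2$.

To calculate $s^2$, use
$$\mathrm{SSE} = S_{yy} - \frac{S_{xy}^2}{S_{xx}},$$
where $S_{yy} = \sum_i (y_i - \bar{y})^2$.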

Slide 18

Details of Calculation

$$S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}, \qquad S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}.$$

$\sum x_i = 567$, $\sum x_i^2 = 35893$, $\sum y_i = 1368$, $\sum y_i^2 = 217926$, $\sum x_i y_i = 87421$.

Then calculate
$\bar{x} = 567/9 = 63$, $\bar{y} = 1368/9 = 152$,
$S_{xx} = 35893 - 567^2/9 = 172$,
$S_{yy} = 217926 - 1368^2/9 = 9990$,
$S_{xy} = 87421 - 567(1368)/9 = 1237$,
$b_1 = 1237/172 = 7.19$,
$b_0 = 152 - 7.19(63) = -301$.

SSE is given by $9990 - 1237^2/172 = 1093.7$, so $\sigma^2$ is estimated by $s^2 = 1093.7/(9 - 2) = 156.24$.
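The calculation on this slide can be reproduced from the raw data on Slide 14. The sketch below uses only plain Python and illustrative variable names; it follows the computational formulas above rather than any particular library routine.

```python
# Lengths (cm) and weights (gm) from Slide 14.
lengths = [60, 69, 66, 64, 54, 67, 59, 65, 63]
weights = [136, 198, 194, 140, 93, 172, 116, 174, 145]
n = len(lengths)

# Raw sums used by the computational formulas.
sum_x = sum(lengths)
sum_y = sum(weights)
sum_x2 = sum(x * x for x in lengths)
sum_y2 = sum(y * y for y in weights)
sum_xy = sum(x * y for x, y in zip(lengths, weights))

# Corrected sums of squares and cross products.
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

# Least squares estimates.
b1 = sxy / sxx
b0 = sum_y / n - b1 * sum_x / n

# Residual sum of squares and the estimate of sigma^2.
sse = syy - sxy ** 2 / sxx
s2 = sse / (n - 2)

print(sxx, syy, sxy)
print(round(b1, 2), round(b0), round(sse, 1), round(s2, 2))
```

Working with unrounded `b1` gives an intercept near -301.09, which rounds to the -301 shown on the slide.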


Slide 19

Inferences Concerning $\beta_1$

[…] where $\sigma^2_{b_1} = \sigma^2 / S_{xx}$ is to be estimated by
$$s^2_{b_1} = \frac{s^2}{S_{xx}}.$$
The inferences are based on
$$\frac{b_1 - \beta_1}{s_{b_1}} \sim t_{n-2}.$$
For example, a $(1 - \alpha)100\%$ CI for $\beta_1$ is given by
$$b_1 \pm t_{\alpha/2, n-2} \, s_{b_1}.$$

[…] $s_{b_1} = \sqrt{156.24/172} = .953$. A 95% CI for $\beta_1$ is given by $7.19 \pm 2.365(.953)$, where $t_{.025,7} = 2.365$.

To test the hypotheses $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, we calculate
$$t = \frac{7.19 - 0}{.953} = 7.545,$$
and reject $H_0$ even at the 1% level, as $|t| > 3.499 = t_{.005,7}$.
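The slope inference can be sketched from the summary numbers on Slides 18 and 19. In this illustration the critical values 2.365 and 3.499 are copied from the slide rather than computed, and the variable names are not from the original.

```python
from math import sqrt

# Summary statistics from Slide 18.
b1 = 7.19
s2 = 156.24
sxx = 172

# Critical values quoted on this slide (t with 7 degrees of freedom).
t_025 = 2.365
t_005 = 3.499

# Estimated standard error of the slope.
s_b1 = sqrt(s2 / sxx)

# 95% confidence interval for beta_1.
ci = (b1 - t_025 * s_b1, b1 + t_025 * s_b1)

# t statistic for H0: beta_1 = 0.
t = (b1 - 0) / s_b1
reject_at_1_percent = abs(t) > t_005

print(round(s_b1, 3), [round(v, 2) for v in ci], round(t, 2), reject_at_1_percent)
```

The statistic differs slightly from the slide's 7.545 in the third decimal because the slide divides by the rounded value .953.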

Slide 20

Analysis of Variance

[…]
$$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i),$$
[…] random. It can be shown that
$$\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2.$$
[…] information.

Source   SS    df      MS                  f
Model    SSR   1       MSR = SSR/1         MSR/MSE
Resid    SSE   n - 2   s^2 = SSE/(n - 2)
Total    SST   n - 1

The lengths and weights of female snakes.

Source   SS       df   MS       F
Model    8896.3   1    8896.3   56.94
Resid    1093.7   7    156.24
Total    9990.0   8


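Slide 21

F-Test for $\beta_1 = 0$

[…]
$$\mu_{\mathrm{MSR}} = \sigma^2 + \beta_1^2 S_{xx}, \qquad \mu_{\mathrm{MSE}} = \sigma^2.$$
[…]
$$f = \frac{\mathrm{MSR}}{\mathrm{MSE}} \sim F_{1, n-2}.$$
These lead to the F-test for $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, which rejects $H_0$ when $f > F_{\alpha, 1, n-2}$.

The F- and t-tests are equivalent:
$$\frac{\mathrm{MSR}}{\mathrm{MSE}} = f = t^2 = \left( \frac{b_1}{s_{b_1}} \right)^2.$$

[…] female snakes. Since
$$f = \frac{8896.3}{156.24} = 56.94, \qquad F_{.01,1,7} = 12.246,$$
we reject $H_0: \beta_1 = 0$ at the 1% level. This is equivalent to the t-test on Slide 19. Note that […]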

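Slide 22

Inferences Concerning $\beta_0$

$$b_0 \sim N(\beta_0, \sigma^2_{b_0}),$$
where
$$\sigma^2_{b_0} = \sigma^2 \left\{ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right\}$$
is to be estimated by
$$s^2_{b_0} = s^2 \left\{ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right\}.$$
[…]
$$\frac{b_0 - \beta_0}{s_{b_0}} \sim t_{n-2}.$$

[…] snakes, $\beta_0$ has no meaning.

Consider $Y = 15 + 5X + \epsilon$, where $\epsilon \sim N(0, 4)$. Given $x_i = 8, 8.1, \ldots, 10$, simulate $Y_i$ and estimate the regression function.

[Figure: simulated data and fitted lines, plotted over x from 0 to 10.]

For $|\bar{x}|$ large, $\beta_0$ is hard to estimate, or to interpret.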
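Why $\beta_0$ is hard to estimate in this design can be seen without simulating, by evaluating the variance formula for $b_0$ directly. The sketch below assumes that the slide's design notation 8(.1)10 means $x$ runs from 8 to 10 in steps of 0.1 (21 points), and uses $\sigma^2 = 4$ from the stated model; variable names are illustrative.

```python
from math import sqrt

# Design points: 8, 8.1, ..., 10 (reading "8(.1)10" as start(step)end).
xs = [8 + 0.1 * k for k in range(21)]
n = len(xs)
sigma2 = 4                      # error variance in Y = 15 + 5X + eps

xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)

# Theoretical variances of the LS estimates.
var_b0 = sigma2 * (1 / n + xbar ** 2 / sxx)
var_b1 = sigma2 / sxx

# Variance of the fitted mean at the centre of the data, for comparison.
var_at_xbar = sigma2 / n

print(round(xbar, 2), round(sxx, 2))
print(round(sqrt(var_b0), 2), round(sqrt(var_b1), 2), round(sqrt(var_at_xbar), 2))
```

With $\bar{x} = 9$ far from zero relative to the spread of the $x_i$, the standard deviation of $b_0$ is roughly 6.5, about fifteen times that of the fitted mean at $\bar{x}$.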


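Slide 23

Inferences Concerning $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$

$$\hat{Y} \sim N(\beta_0 + \beta_1 x, \sigma^2_{\hat{Y}}),$$
where $\hat{Y} = b_0 + b_1 x$, and
$$\sigma^2_{\hat{Y}} = \sigma^2 \left\{ \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}} \right\}$$
is to be estimated by
$$s^2_{\hat{Y}} = s^2 \left\{ \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}} \right\}.$$
$$\frac{\hat{Y} - (\beta_0 + \beta_1 x)}{s_{\hat{Y}}} \sim t_{n-2}.$$
For $|x - \bar{x}|$ large, $\beta_0 + \beta_1 x$ is hard to estimate.

[…] female snakes. We are to estimate the average weight of snakes of length 60 cm.
$$\hat{Y} = -301 + 7.19(60) = 130.4,$$
$$s^2_{\hat{Y}} = 156.24 \left\{ \frac{1}{9} + \frac{(60 - 63)^2}{172} \right\},$$
so a 95% CI for $\beta_0 + \beta_1 \cdot 60$ is $130.4 \pm 2.365(5.053)$, or $(118.45, 142.35)$.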

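Slide 24

[…] $Y = \beta_0 + \beta_1 x + \epsilon$, one has to allow for the variability of $\epsilon$.

With $\beta_0$, $\beta_1$, and $\sigma^2$ known, the prediction interval
$$(\beta_0 + \beta_1 x) \pm z_{\alpha/2} \, \sigma$$
covers $Y$ with probability $1 - \alpha$. […] $b_0 + b_1 x$, we use
$$\hat{Y} \pm t_{\alpha/2, n-2} \sqrt{s^2 + s^2_{\hat{Y}}},$$
where the variances of $\hat{Y}$ and $\epsilon$ are estimated by $s^2_{\hat{Y}}$ and $s^2$.

[…] female snakes. We are to predict the weight of a snake of length 60 cm.

$\hat{Y} = 130.4$, $s^2 = 156.24$, $s^2_{\hat{Y}} = 25.535$.

A 95% prediction interval is $130.4 \pm 2.365 \sqrt{156.24 + 25.535}$, or $(98.51, 162.29)$. This is wider than the CI for $\beta_0 + \beta_1 \cdot 60$.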
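The confidence interval for the mean response and the prediction interval at a length of 60 cm can be compared side by side. This sketch takes the rounded summary values from Slides 18 and 19 as given, including the tabulated $t_{.025,7} = 2.365$; the helper name `intervals_at` is hypothetical.

```python
from math import sqrt

# Rounded summary values from Slides 18 and 19.
n, xbar, sxx = 9, 63, 172
b0, b1 = -301, 7.19
s2 = 156.24
t_crit = 2.365   # t_{.025, 7} as quoted on the slides

def intervals_at(x0):
    """Return (fit, CI for the mean response, prediction interval) at x0."""
    fit = b0 + b1 * x0
    # Estimated variance of the fitted value Y-hat at x0.
    s2_fit = s2 * (1 / n + (x0 - xbar) ** 2 / sxx)
    half_ci = t_crit * sqrt(s2_fit)
    # A new observation adds the error variance s^2.
    half_pi = t_crit * sqrt(s2 + s2_fit)
    return fit, (fit - half_ci, fit + half_ci), (fit - half_pi, fit + half_pi)

fit, ci, pi = intervals_at(60)
print(round(fit, 1))
print([round(v, 2) for v in ci])
print([round(v, 2) for v in pi])
```

The prediction interval is always the wider of the two, because it adds $s^2$ under the square root.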


Slide 25

$R^2$, Correlation

The coefficient of determination, or $R^2$,
$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}},$$
measures the amount of variation […]

The coefficient of correlation,
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}},$$
measures the linear association between $X$ and $Y$.

$0 \le R^2 \le 1$. $-1 \le r \le 1$. $R^2 = r^2$.

[…]
$$R^2 = \frac{8896.3}{9990} = .891, \qquad r = \frac{1237}{\sqrt{172(9990)}} = .944.$$

[Figure: scatter plots.]
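The two summary measures can be computed from the corrected sums on Slide 18. The sketch below is illustrative; it uses the identity $\mathrm{SSR} = S_{xy}^2 / S_{xx}$, which follows from $\mathrm{SSE} = S_{yy} - S_{xy}^2 / S_{xx}$ and $\mathrm{SST} = S_{yy}$.

```python
from math import sqrt

# Corrected sums of squares and cross products from Slide 18.
sxx, syy, sxy = 172, 9990, 1237

# Decomposition SST = SSR + SSE with SST = Syy.
sst = syy
ssr = sxy ** 2 / sxx
sse = sst - ssr

# Coefficient of determination, two equivalent forms.
r2 = ssr / sst
r2_alt = 1 - sse / sst

# Coefficient of correlation; its sign is the sign of Sxy.
r = sxy / sqrt(sxx * syy)

print(round(ssr, 1), round(sse, 1))
print(round(r2, 3), round(r, 3))
```

Squaring `r` recovers `r2` up to floating-point error, which is the identity $R^2 = r^2$ stated on the slide.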
