Uploaded by Devendren Sathasivam

two-sample t-test to more than two samples. (Further methods in Chapter 8 of Business Statistics)

As an example: using the 2-sample t-test we have tested whether there was any difference between the size of invoices in a company's Leeds and Bradford stores. Using analysis of variance we can simultaneously investigate invoices from as many towns as we wish, assuming that sufficient data are available.

Problem: Why can't we just carry out repeated t-tests on pairs of the variables?

If many independent tests are carried out pairwise, the probability of being correct for the combined results is greatly reduced. For example, if we compare the average marks of two students at the end of a semester to see if their mean scores are significantly different we would have, at a 5% level, 0.95 probability of being correct. Comparing more students:

Students   Pairwise tests   P(all correct)        P(at least one incorrect)
2          1                0.95                  0.05
3          3                $0.95^{3} = 0.857$    0.143
n          n(n-1)/2         $0.95^{n(n-1)/2}$     $1 - 0.95^{n(n-1)/2}$
10         45               $0.95^{45} = 0.10$    0.90
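The figures in this table can be checked with a few lines of Python. This is only a sketch: the function name `p_all_correct` is our own, and it relies on the assumption stated above that the pairwise tests are independent.

```python
from math import comb


def p_all_correct(students: int, level: float = 0.05) -> float:
    """Probability that every pairwise test is correct, assuming independent tests."""
    tests = comb(students, 2)  # n(n-1)/2 pairwise comparisons
    return (1 - level) ** tests


for n in (2, 3, 10):
    p = p_all_correct(n)
    print(f"{n} students, {comb(n, 2)} tests: "
          f"P(all correct) = {p:.3f}, P(at least one incorrect) = {1 - p:.3f}")
```

With ten students the chance that all 45 comparisons are correct falls to about 0.10, which is why a single overall test is preferred.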

Solution: We therefore need methods of analysis that allow the variation between all n means to be tested simultaneously, giving an overall probability of 0.95 of being correct at the 5% level. This type of analysis is referred to as Analysis of Variance, or 'ANOVA' for short.

In general, Analysis of Variance compares the variation between groups with the variation within samples by analysing their variances.

One-way ANOVA: Is there any difference between the average sales at various departmental stores within a company?

Two-way ANOVA: Is there any difference between the average sales at various stores within a company and/or the types of department? The overall variation is split 'two ways'.

One-way ANOVA

Total variation (SST) is split into:

    Variation between the groups, i.e. between the group means (SSG)

    Variation not due to difference between the group means (SSE)

Two-way ANOVA

Total variation (SST) is split into:

    Variation between the groups, i.e. between the group means (SSG)

    Variation not due to difference between the main group means (SSE1), which is split further into:

        Variation between the block means, i.e. second group means (SSBl)

        Variation not due to difference between either set of group means (SSE)

where SST = Total Sum of Squares; SSG = Treatment Sum of Squares between the groups; SSBl = Blocks Sum of Squares; SSE = Sum of Squares of Errors.

(At this stage just think of 'sums of squares' as being a measure of variation.)

The method of measuring this variation is variance, which is standard deviation squared.

Total variance = between groups variance + variance due to the errors

It follows that:

Total sum of squares (SST) = Sum of squares between the groups (SSG) + Sum of squares due to the errors (SSE)

If we find any two of the three sums of squares then the other can be found by difference. In practice we calculate SST and SSG and then find SSE by difference.

Since the method is much easier to understand with a numerical example, it will be explained in stages using theory and a numerical example simultaneously.

Example 1

One important factor in selecting software for word processing and database management systems is the time required to learn how to use a particular system. In order to evaluate three database management systems, a firm devised a test to see how many training hours were needed for five of its word processing operators to become proficient in each of the three systems.

System A    16   19   14   13   18   hours
System B    16   17   13   12   17   hours
System C    24   22   19   18   22   hours

Using a 5% significance level, is there any difference between the training time needed for the three systems?

In this case the 'groups' are the three database management systems. These account for some, but not all, of the total variance. The remainder, which is not explained by the difference between them, is the residual variance, referred to as that due to the errors.

Total variance = between systems variance + variance due to the errors.

It follows that:

Total sum of squares (SST) = Sum of squares between systems (SSSys) + Sum of squares of errors (SSE)

The 'square' for each case is $(x - \bar{x})^2$, where $x$ is the value for that case and $\bar{x}$ is the mean. The 'total sum of squares' is therefore $\sum (x - \bar{x})^2$. The classical method for calculating this sum is to tabulate the values; subtract the mean from each value; square the results; and finally sum the squares. The use of a statistical calculator is preferable!

In the lecture on summary statistics we saw that the standard deviation is calculated by:

$$\sigma_n = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \quad \text{or} \quad \sigma_{n-1} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

so that $\sum (x - \bar{x})^2 = n\sigma_n^2 = (n - 1)\sigma_{n-1}^2$.

Both methods give exactly the same value for the total sum of squares.

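TotalSS: Input all the data individually and output the values for $n$, $\bar{x}$ and $\sigma_n$ from the calculator in SD mode. Use these values to calculate $\sigma_n^2$ and $n\sigma_n^2$.

$n$    $\bar{x}$   $\sigma_n$   $\sigma_n^2$   $n\sigma_n^2$
15     17.33       3.419        11.69          175.3 = SS Total

SSSys: Calculate $n$ and $\bar{x}$ for each of the systems separately, then input the system means as frequency data (n = 5).

             System A   System B   System C
$n$          5          5          5
$\bar{x}$    16         15         21

SS for Systems

$n$    $\bar{x}$   $\sigma_n$   $\sigma_n^2$   $n\sigma_n^2$
15     17.33       2.625        6.889          103.3 = SSSys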
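For readers without a statistical calculator, the same sums of squares can be obtained directly from the definition $\sum (x - \bar{x})^2$. The following is a minimal sketch using only the Example 1 data; the variable names (`hours`, `ss_sys` and so on) are our own choices.

```python
# Training hours from Example 1, one list per system.
hours = {
    "A": [16, 19, 14, 13, 18],
    "B": [16, 17, 13, 12, 17],
    "C": [24, 22, 19, 18, 22],
}


def mean(values):
    return sum(values) / len(values)


all_hours = [h for system in hours.values() for h in system]
grand_mean = mean(all_hours)

# Total sum of squares: squared deviation of every value from the grand mean.
sst = sum((h - grand_mean) ** 2 for h in all_hours)

# Between-systems sum of squares: squared deviation of each system mean
# from the grand mean, weighted by the number of values in that system.
ss_sys = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in hours.values())

# Error sum of squares, found by difference.
sse = sst - ss_sys

print(f"SST = {sst:.1f}, SSSys = {ss_sys:.1f}, SSE = {sse:.1f}")
```

The printed values agree with the calculator results above to one decimal place.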

Below is the general format of an analysis of variance table. If you find it helpful then make use of it; otherwise just work with the numbers, as in the worked example that follows.

General ANOVA Table (for k groups, total sample size N)

Source           S.S.   d.f.                        M.S.S.                 F
Between groups   SSG    k - 1                       MSG = SSG / (k - 1)    F = MSG / MSE
Errors           SSE    (N - 1) - (k - 1) = N - k   MSE = SSE / (N - k)
Total            SST    N - 1

Method

Fill in the total sum of squares, SST, and the between groups sum of squares, SSG, after calculation; find the sum of squares due to the errors, SSE, by difference.

The degrees of freedom, d.f., for the total and the groups are one less than the total number of values and the number of groups respectively; find the error degrees of freedom by difference.

The mean sum of squares, M.S.S., is found in each case by dividing the sum of squares, SS, by the corresponding degrees of freedom.

The test statistic, F, is the ratio of the mean sum of squares due to the differences between the group means to that due to the errors.

Source            S.S.    d.f.          M.S.S.             F
Between systems   103.3   3 - 1 = 2     103.3/2 = 51.65    51.65/6.00 = 8.61
Errors            72.0    14 - 2 = 12   72.0/12 = 6.00
Total             175.3   15 - 1 = 14
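The steps of the Method can be sketched as a small function that fills in the table from SST, SSG, the number of groups k and the total sample size N. The function name `anova_table` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
def anova_table(sst: float, ssg: float, k: int, n_total: int) -> dict:
    """Fill in a one-way ANOVA table from SST and SSG."""
    sse = sst - ssg                   # error sum of squares by difference
    df_groups = k - 1                 # one less than the number of groups
    df_total = n_total - 1            # one less than the number of values
    df_error = df_total - df_groups   # error d.f. by difference
    msg = ssg / df_groups             # mean sum of squares between groups
    mse = sse / df_error              # mean sum of squares of errors
    return {
        "sse": sse,
        "df_groups": df_groups,
        "df_error": df_error,
        "df_total": df_total,
        "msg": msg,
        "mse": mse,
        "f": msg / mse,               # test statistic
    }


# Example 1: SST = 175.3, SSSys = 103.3, three systems, fifteen values.
table = anova_table(sst=175.3, ssg=103.3, k=3, n_total=15)
print(f"MSG = {table['msg']:.2f}, MSE = {table['mse']:.2f}, F = {table['f']:.2f}")
```

Using the rounded sums of squares from the table, this reproduces F = 8.61 on 2 and 12 degrees of freedom.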

The methodology for this hypothesis test is similar to that described last week.

The null hypothesis, H0, is that all the group means are equal. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ etc.

The alternative hypothesis, H1, is that at least two of the group means are different.

The significance level is as stated, or 5% by default.

The critical value is from the F-tables, $F_{\alpha}(\nu_1, \nu_2)$, with the two degrees of freedom from the groups, $\nu_1$, and the errors, $\nu_2$.

The test statistic is the F-value calculated from the sample in the ANOVA table.

The conclusion is reached by comparing the test statistic with the critical value and rejecting the null hypothesis if the test statistic is the larger of the two.

Example 1 (cont.)

$H_0: \mu_A = \mu_B = \mu_C$

Critical value: $F_{0.05}(2, 12) = 3.89$ (degrees of freedom from 'between systems' and 'errors').

Test statistic: 8.61

Conclusion: T.S. > C.V. so reject H0. There is a difference between the mean learning times for at least two of the three database management systems.

We can calculate a critical difference, CD, which depends on the MSE, the sample sizes and the significance level, such that any difference between means which exceeds the CD is significant and any less than it is not.

The critical difference formula is:

$$CD = t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

where t has the error degrees of freedom and one tail, and MSE is taken from the ANOVA table.

Example 1 (cont.)

$$CD = t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = 1.78 \sqrt{6.00 \left( \frac{1}{5} + \frac{1}{5} \right)} = 2.76$$

System C takes significantly longer to learn than Systems A and B which are similar.
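The comparison of each pair of system means with the critical difference can be sketched as follows. The values t = 1.78, MSE = 6.00 and the means 16, 15 and 21 are taken from Example 1; the function name `critical_difference` is our own.

```python
from itertools import combinations
from math import sqrt


def critical_difference(t: float, mse: float, n1: int, n2: int) -> float:
    """CD = t * sqrt(MSE * (1/n1 + 1/n2))."""
    return t * sqrt(mse * (1 / n1 + 1 / n2))


means = {"A": 16.0, "B": 15.0, "C": 21.0}   # system means from Example 1
cd = critical_difference(t=1.78, mse=6.00, n1=5, n2=5)

# A pair of means differs significantly when the gap exceeds the CD.
significant = {
    (a, b): abs(means[a] - means[b]) > cd
    for a, b in combinations(means, 2)
}

print(f"CD = {cd:.2f}")
for (a, b), flag in significant.items():
    print(f"{a} vs {b}: difference {abs(means[a] - means[b]):.1f}, significant: {flag}")
```

Only the comparisons involving System C exceed the critical difference, in line with the conclusion above.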

Two-way ANOVA

In the above example it might have been reasonable to suggest that the five Operators have different learning speeds and were therefore responsible for some of the variation in the time needed to master the three Systems. By extending the analysis from one-way ANOVA to two-way ANOVA we can find out whether Operator variability is a significant factor or whether the differences found previously were just due to the Systems.

Example 2

Operators   System A   System B   System C
1           16         16         24
2           19         17         22
3           14         13         19
4           13         12         18
5           18         17         22

Again we ask the same question: using a 5% level, is there any difference between the training time for the three systems? We can use the Operator variation just to explain some of the unexplained error, thereby reducing it (a 'blocked' design), or we can consider it in a similar manner to the System variation in the last example in order to see if there is a difference between the Operators.

In the first case the 'groups' are the three database management systems and the 'blocks' being used to reduce the error are the different operators, who themselves may differ in speed of learning. In the second we have a second set of groups: the Operators.

Total variance = between systems variance + between operators variance + variance of errors.

So

Total sum of squares (SST) = Sum of squares between systems (SSSys) + Sum of squares between operators (SSOps) + Sum of squares of errors (SSE)

In 2-way ANOVA we find SST, SSSys and SSOps and then find SSE by difference:

SSE = SST - SSSys - SSOps

We already have SST and SSSys from 1-way ANOVA but still need to find SSOps.

Operators   System A   System B   System C   Means
1           16         16         24         18.67
2           19         17         22         19.33
3           14         13         19         15.33
4           13         12         18         14.33
5           18         17         22         19.00
Means       16.00      15.00      21.00      17.33

SSOps: Inputting the Operator means as frequency data (n = 3) gives:

$n = 15$, $\bar{x} = 17.33$, $\sigma_n = 2.078$ and $n\sigma_n^2 = 64.7$ = SSOps

Source              S.S.    d.f.          M.S.S.             F
Between Systems     103.3   3 - 1 = 2     103.3/2 = 51.65    51.65/0.91 = 56.76
Between Operators   64.7    5 - 1 = 4     64.7/4 = 16.18     16.18/0.91 = 17.78
Errors              7.3     14 - 6 = 8    7.3/8 = 0.91
Total               175.3   15 - 1 = 14
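The whole two-way table can also be computed from the raw Example 2 data. The sketch below is one possible way of doing so; it assumes one observation per Operator and System, and the variable names are our own.

```python
# Rows are Operators 1 to 5; columns are Systems A, B, C (Example 2).
hours = [
    [16, 16, 24],
    [19, 17, 22],
    [14, 13, 19],
    [13, 12, 18],
    [18, 17, 22],
]

n_ops, n_sys = len(hours), len(hours[0])
n_total = n_ops * n_sys
grand_mean = sum(map(sum, hours)) / n_total

op_means = [sum(row) / n_sys for row in hours]          # one mean per Operator
sys_means = [sum(col) / n_ops for col in zip(*hours)]   # one mean per System

sst = sum((x - grand_mean) ** 2 for row in hours for x in row)
ss_sys = n_ops * sum((m - grand_mean) ** 2 for m in sys_means)
ss_ops = n_sys * sum((m - grand_mean) ** 2 for m in op_means)
sse = sst - ss_sys - ss_ops                             # by difference

df_sys, df_ops = n_sys - 1, n_ops - 1
df_error = (n_total - 1) - df_sys - df_ops
mse = sse / df_error
f_sys = (ss_sys / df_sys) / mse
f_ops = (ss_ops / df_ops) / mse

print(f"SSOps = {ss_ops:.1f}, SSE = {sse:.1f}, MSE = {mse:.2f}")
print(f"F (Systems) = {f_sys:.2f}, F (Operators) = {f_ops:.2f}")
```

Working without intermediate rounding gives F values of about 56.4 and 17.6, slightly below the 56.76 and 17.78 in the table, which divides by MSE rounded to 0.91. The difference does not affect the conclusions.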

$H_0: \mu_A = \mu_B = \mu_C$

Critical value: $F_{0.05}(2, 8) = 4.46$ (from Table 4)

Test statistic: 56.76 (Notice how the test statistic has increased with the use of the more powerful two-way ANOVA.)

Conclusion: T.S. > C.V. so reject H0. There is a difference between at least two of the mean times needed for training on the different systems.

Using CD of 1.12 (see overhead): C takes significantly longer to learn than A and/or B.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$

Critical value: $F_{0.05}(4, 8) = 3.84$ (from Table 4)

Test statistic: 17.78

Conclusion: T.S. > C.V. so reject H0. There is a difference between at least two of the Operators in the mean time needed for learning the systems.

Using CD of 1.45 calculated as previously (see overhead): Operators 3 and 4 are significantly quicker learners than Operators 1, 2 and 5.
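The two critical differences quoted here can be reproduced with the CD formula given earlier. Note that the t value is not stated in these notes: the sketch assumes t = 1.86, the one-tailed 5% value for 8 error degrees of freedom. The MSE of 0.91 and the Operator means come from Example 2, and the names used are our own.

```python
from itertools import combinations
from math import sqrt

T_CRIT = 1.86   # assumed: one-tailed 5% t value for 8 error d.f.
MSE = 0.91      # from the two-way ANOVA table


def critical_difference(t: float, mse: float, n1: int, n2: int) -> float:
    """CD = t * sqrt(MSE * (1/n1 + 1/n2))."""
    return t * sqrt(mse * (1 / n1 + 1 / n2))


# Each System mean is based on 5 values, each Operator mean on 3.
cd_sys = critical_difference(T_CRIT, MSE, 5, 5)
cd_ops = critical_difference(T_CRIT, MSE, 3, 3)

op_means = {1: 18.67, 2: 19.33, 3: 15.33, 4: 14.33, 5: 19.00}

# Operator pairs whose means differ by more than the critical difference.
different = [
    (a, b)
    for a, b in combinations(op_means, 2)
    if abs(op_means[a] - op_means[b]) > cd_ops
]

print(f"CD (Systems) = {cd_sys:.2f}, CD (Operators) = {cd_ops:.2f}")
print("Significantly different Operator pairs:", different)
```

Every significant pair contains one of Operators 3 and 4 and one of Operators 1, 2 and 5, which matches the grouping stated above.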

Table 4

Critical values of F, α = 5%
(columns: ν1, degrees of freedom for the groups; rows: ν2, degrees of freedom for the errors)

ν2 \ ν1   1        2        3        4        5        6        7        8        9
1         161.40   199.50   215.70   224.60   230.20   234.00   236.80   238.90   240.50
2         18.51    19.00    19.16    19.25    19.30    19.33    19.35    19.37    19.38
3         10.13    9.55     9.28     9.12     9.01     8.94     8.89     8.85     8.81
4         7.71     6.94     6.59     6.39     6.26     6.16     6.09     6.04     6.00
5         6.61     5.79     5.41     5.19     5.05     4.95     4.88     4.82     4.77
6         5.99     5.14     4.76     4.53     4.39     4.28     4.21     4.15     4.10
7         5.59     4.74     4.35     4.12     3.97     3.87     3.79     3.73     3.68
8         5.32     4.46     4.07     3.84     3.69     3.58     3.50     3.44     3.39
9         5.12     4.26     3.86     3.63     3.48     3.37     3.29     3.23     3.18
10        4.96     4.10     3.71     3.48     3.33     3.22     3.14     3.07     3.02
11        4.84     3.98     3.59     3.36     3.20     3.09     3.01     2.95     2.90
12        4.75     3.89     3.49     3.26     3.11     3.00     2.91     2.85     2.80
13        4.67     3.81     3.41     3.18     3.03     2.92     2.83     2.77     2.71
14        4.60     3.74     3.34     3.11     2.96     2.85     2.76     2.70     2.65
15        4.54     3.68     3.29     3.06     2.90     2.79     2.71     2.64     2.59
16        4.49     3.63     3.24     3.01     2.85     2.74     2.66     2.59     2.54
17        4.45     3.59     3.20     2.96     2.81     2.70     2.61     2.55     2.49
18        4.41     3.55     3.16     2.93     2.77     2.66     2.58     2.51     2.46
19        4.38     3.52     3.13     2.90     2.74     2.63     2.54     2.48     2.42
20        4.35     3.49     3.10     2.87     2.71     2.60     2.51     2.45     2.39
21        4.32     3.47     3.07     2.84     2.68     2.57     2.49     2.42     2.37
22        4.30     3.44     3.05     2.82     2.66     2.55     2.46     2.40     2.34
23        4.28     3.42     3.03     2.80     2.64     2.53     2.44     2.37     2.32
24        4.26     3.40     3.01     2.78     2.62     2.51     2.42     2.36     2.30
25        4.24     3.39     2.99     2.76     2.60     2.49     2.40     2.34     2.28
26        4.23     3.37     2.98     2.74     2.59     2.47     2.39     2.32     2.27
27        4.21     3.35     2.96     2.73     2.57     2.46     2.37     2.31     2.25
28        4.20     3.34     2.95     2.71     2.56     2.45     2.36     2.29     2.24
29        4.18     3.33     2.93     2.70     2.55     2.43     2.35     2.28     2.22
30        4.17     3.32     2.92     2.69     2.53     2.42     2.33     2.27     2.21
40        4.08     3.23     2.84     2.61     2.45     2.34     2.25     2.18     2.12
60        4.00     3.15     2.76     2.53     2.37     2.25     2.17     2.10     2.04
120       3.92     3.07     2.68     2.45     2.29     2.17     2.09     2.02     1.96
∞         3.84     3.00     2.60     2.37     2.21     2.10     2.01     1.94     1.88
