
Experimental Designs and Analysis of Variance (ANOVA)

One-Way ANOVA
Two-Way ANOVA
Introduction
• You are the production manager of the
Perfect Parachute Company. Parachutes
are woven in your factory using a synthetic
fiber purchased from one of four different
suppliers. For obvious reasons, one of the
most important quality characteristics of a
parachute is its strength. You need to
decide whether the synthetic fibers from
your four suppliers result in parachutes of
equal strength.
Introduction
• An experiment was conducted to determine if
any significant differences exist in the strength of
parachutes woven from synthetic fibers obtained
from the different suppliers. Five parachutes
were woven for each group: suppliers 1, 2, 3, and
4. The strength of the parachutes is measured
by placing them in a tensile-strength testing
device that pulls on both ends of a parachute until it
tears apart. The amount of force required to tear
the parachute is measured on a tensile-strength
scale, where the larger the value, the stronger the
parachute.
Terminology
• Response variable is the variable of interest to be
measured in the experiment (the dependent variable)
• Factors in a designed experiment play the role of independent
variables in regression analysis
– Factors can be quantitative or qualitative
• Levels are the values of the factor used in the
experiment
– Levels also can be quantitative (numerical) or qualitative
(nonnumerical)
• Treatments of an experiment are the factor-level
combinations used

Cont...
• Experimental unit is the object on which the
response and factors are observed or measured
• A designed experiment is one for which the analyst
controls the specification of the treatments and
the method of assigning the experimental units
to each treatment
• An observational experiment is one for which the
analyst simply observes the treatments and the
response on a sample of experimental units

What is ANOVA?
• ANOVA is used to compare the means of
the groups
• The term "analysis of variance" may seem
inappropriate because the objective is
to analyze differences among the group
means
• However, through analyzing the variation both among and
within the groups, conclusions can be made
about possible differences in group means
The F Distribution
• The F Distribution was named to honor Sir
Ronald Fisher (one of the founders of
modern-day Statistics)
• The F Distribution is used to:
– Test whether two samples are from
populations having equal variances
– Compare several population means
simultaneously
→ Analysis of Variance (ANOVA)
• The F distribution is used to test the hypothesis
that two population variances are equal.
– Assumptions:
• The populations are normally distributed
• The populations are independent
• The population means need not be equal
• The larger sample variance is placed in the
numerator, hence the F ratio is always at least 1

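The test described on this slide can be written out as a formula. The block below is a sketch in notation that is assumed here rather than taken from the slides: $s_1^2$ is the larger sample variance, $s_2^2$ the smaller, and $n_1$, $n_2$ are the corresponding sample sizes.

```latex
\[
H_0:\ \sigma_1^2 = \sigma_2^2
\qquad
H_1:\ \sigma_1^2 \neq \sigma_2^2
\]
\[
F = \frac{s_1^2}{s_2^2},
\qquad
df_1 = n_1 - 1,
\quad
df_2 = n_2 - 1
\]
```

Because $s_1^2 \ge s_2^2$ by construction, the ratio cannot fall below 1, and only the upper tail of the F distribution is consulted.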
Characteristics of F-Distribution
• There is a “family” of F distributions
– A particular member of the family is determined by
two parameters: the numerator degrees of freedom
(df1) and the denominator degrees of freedom (df2).
• The F distribution is continuous
– It can assume an infinite number of values between 0
and plus infinity
• The F distribution cannot be negative
• The F distribution is positively skewed
– As both degrees of freedom increase, the distribution
approaches a normal distribution
• The F distribution is asymptotic
– As F increases, the curve approaches the horizontal axis but never touches it

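These characteristics follow from how an F random variable is built. As a sketch (the symbols $U_1$ and $U_2$ are introduced here for illustration), if $U_1$ and $U_2$ are independent chi-square variables with $df_1$ and $df_2$ degrees of freedom, then:

```latex
\[
F = \frac{U_1 / df_1}{U_2 / df_2} \sim F(df_1, df_2),
\qquad
F \ge 0,
\qquad
E[F] = \frac{df_2}{df_2 - 2} \quad (df_2 > 2)
\]
```

A ratio of two non-negative quantities cannot be negative, which is why the distribution starts at 0 and has a long right tail.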
ANOVA Assumptions
• The F distribution is also used for testing
the equality of more than two population
means using a technique called analysis
of variance (ANOVA).
• The main reasons for using ANOVA:
– ANOVA allows us to compare the means
simultaneously
– ANOVA avoids the buildup of Type I error (α)

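The buildup of Type I error can be illustrated with a short calculation. The sketch below assumes the pairwise tests are independent, which is a simplification (tests that share a group are not independent), and the function name `familywise_error` is only illustrative.

```python
from math import comb


def familywise_error(num_groups: int, alpha: float) -> float:
    """Chance of at least one false rejection across all pairwise tests.

    Assumes, as a simplification, that the pairwise tests are independent.
    """
    num_tests = comb(num_groups, 2)  # number of pairwise comparisons
    return 1 - (1 - alpha) ** num_tests


# Four groups compared two at a time, each test at alpha = 0.05
rate = familywise_error(4, 0.05)
print(f"{comb(4, 2)} tests, familywise error = {rate:.4f}")
```

With four groups there are six pairwise tests, so under this assumption the chance of at least one false rejection is far above the nominal 5%, while a single ANOVA F test keeps it at α.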
Cont…
• ANOVA requires the following conditions:
– The populations being sampled are normally
distributed.
– The populations have equal standard deviations.
– The samples are randomly selected and are
independent
– Data must be at least interval-scale.
• Types of ANOVA:
– One-Way (One-Factor) ANOVA
– Multi-Way (Multi-Factor) ANOVA
→ Two-Way ANOVA
1. The Completely Randomized Design:
One Way ANOVA (single factor)
• The completely randomized design has only
one factor with several groups (such as type of
tire, marketing strategy, brand of drug, etc.).
• Factor = treatment, group = level
• Example:
– baking temperature may have several numerical
levels (300°C, 350°C, 400°C)
– or categorical levels (supplier 1, 2, 3, 4)

Partitioning The Total Variation
• SST = SSA + SSW

Source of variation                            Sum of squares   d.f.
Total variation                                SST              n - 1
Among-group variation (treatment effects)      SSA              c - 1
Within-group variation (experimental error)    SSW              n - c

Total Variation
(Sum of Squares Total = SST)

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{\bar{X}} \right)^2$$

• Where
$\bar{\bar{X}} = \dfrac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n}$ = overall or grand mean
$X_{ij}$ = ith observation in group j
$n_j$ = number of observations in group j
$n$ = total number of observations in all groups
combined (that is, $n = n_1 + n_2 + n_3 + \dots + n_c$)
$c$ = number of groups

Sum of Squares Among Groups
(SSA)
• Among-Group Variation

$$SSA = \sum_{j=1}^{c} n_j \left( \bar{X}_j - \bar{\bar{X}} \right)^2$$

where $c$ = number of groups
$n_j$ = number of observations in group j
$\bar{X}_j$ = sample mean of group j
$\bar{\bar{X}}$ = overall or grand mean

Sum of Squares Within Groups
(SSW)
• Within-Group Variation

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right)^2$$

• Where $X_{ij}$ = ith observation in group j
$\bar{X}_j$ = sample mean of group j

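The partition SST = SSA + SSW follows from splitting each deviation from the grand mean into a within-group part and an among-group part. A short derivation, writing $\bar{X}_j$ for the mean of group j and $\bar{\bar{X}}$ for the grand mean:

```latex
\begin{align*}
X_{ij} - \bar{\bar{X}}
  &= \left( X_{ij} - \bar{X}_j \right) + \left( \bar{X}_j - \bar{\bar{X}} \right) \\
\sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{\bar{X}} \right)^2
  &= \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right)^2
   + \sum_{j=1}^{c} n_j \left( \bar{X}_j - \bar{\bar{X}} \right)^2
   + 2 \sum_{j=1}^{c} \left( \bar{X}_j - \bar{\bar{X}} \right)
       \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right) \\
  &= SSW + SSA + 0
\end{align*}
```

The cross term vanishes because the deviations from a group's own mean sum to zero within every group.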
ANOVA Summary Table
Source          D.F.    S.S.   M.S. (Variance)       F
Among groups    c - 1   SSA    MSA = SSA/(c - 1)     F = MSA/MSW
Within groups   n - c   SSW    MSW = SSW/(n - c)
Total           n - 1   SST    MST = SST/(n - 1)

The ANOVA Test
• The Null Hypothesis (Ho) : the population
means are the same or no differences in the
population means
– H0:µ1=µ2=…=µc
• The Alternative Hypothesis (H1) : at least one
of the means is different.
– H1:not all µj are equal (where j=1,2,3,…,c)
• The Test Statistic:

$$F = \frac{\text{Estimate of the population variance based on the variation among the sample means}}{\text{Estimate of the population variance based on the variation within the samples}} = \frac{MSA}{MSW}$$

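Regions of Rejection and Non-rejection
When Using ANOVA to Test H0

• Reject H0 if F > FU; otherwise do not reject H0

[Figure: F distribution starting at 0. The region of non-rejection, with area (1-α), lies to the left of the critical value FU; the region of rejection, with area α, lies to the right of FU.]
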
One Way ANOVA
             TREATMENT
        1       2       3
        X1.1    X1.2    X1.3
        X2.1    X2.2    X2.3
        X3.1    X3.2    X3.3
        X4.1    X4.2    X4.3
Tj      T1      T2      T3      T
X̄j      X̄1      X̄2      X̄3      X̄
Tj: total of treatment j; T: grand total of all observations (T = ΣX)
X̄j: mean of treatment j; X̄: overall mean (grand mean)
Example: Tensile Strength of Parachutes by Supplier
Parachute   Supplier 1   Supplier 2   Supplier 3   Supplier 4
1           18.5         26.3         20.6         25.4
2           24.0         25.3         25.2         19.9
3           17.2         24.0         20.8         22.6
4           19.9         21.2         24.7         17.5
5           18.0         24.5         22.9         20.4
Mean        19.52        24.26        22.84        21.16
Answer
• Step 1
H0: there is no difference in mean tensile
strength among the four suppliers
– H0: µ1= µ2= µ3= µ4
H1: at least one of the suppliers differs with
respect to the mean tensile strength
– H1: not all the means are equal
• Step 2
– Level of significance (LOS) = 5%
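• Step 3
– With c - 1 = 3 and n - c = 16 degrees of freedom and α = 0.05, the critical value is 3.24
– Reject H0 if F > 3.24

[Figure: F distribution. The area of non-rejection (0.95) lies between 0 and the critical value 3.24; the area of rejection (0.05) lies to the right of 3.24. The computed F = 3.46 falls in the area of rejection.]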
• Step 4

$$\bar{\bar{X}} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij}}{n} = \frac{438.9}{20} = 21.945$$

$$SSA = \sum_{j=1}^{c} n_j \left( \bar{X}_j - \bar{\bar{X}} \right)^2$$

• SSA = (5)(19.52-21.945)^2 + (5)(24.26-21.945)^2 + (5)(22.84-21.945)^2 + (5)(21.16-21.945)^2
= 63.2855

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right)^2$$

• SSW = (18.5-19.52)^2 + … + (18-19.52)^2 + (26.3-24.26)^2 + … + (24.5-24.26)^2
+ (20.6-22.84)^2 + … + (22.9-22.84)^2 + (25.4-21.16)^2 + … + (20.4-21.16)^2
= 97.504

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{\bar{X}} \right)^2$$

• SST = (18.5-21.945)^2 + (24-21.945)^2 + … + (20.4-21.945)^2
= 160.7895

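• c = 4, n = 20

$$MSA = \frac{SSA}{c-1} = \frac{63.2855}{4-1} = 21.095$$

$$MSW = \frac{SSW}{n-c} = \frac{97.504}{20-4} = 6.094$$

$$F = \frac{MSA}{MSW} = \frac{21.095}{6.094} = 3.46$$

• Decision: F = 3.46 is greater than the critical value 3.24, so H0 is rejected. At the 5% level there is evidence that at least one supplier differs in mean tensile strength.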
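The hand calculation for the parachute example can be checked with a few lines of code. This is a minimal sketch using only the Python standard library; the function name `one_way_anova` and the variable names are illustrative, and the critical value 3.24 is taken from the slide rather than computed.

```python
# Tensile strength data from the example, one list per supplier
strength = {
    1: [18.5, 24.0, 17.2, 19.9, 18.0],
    2: [26.3, 25.3, 24.0, 21.2, 24.5],
    3: [20.6, 25.2, 20.8, 24.7, 22.9],
    4: [25.4, 19.9, 22.6, 17.5, 20.4],
}


def one_way_anova(groups):
    """Return (SSA, SSW, SST, MSA, MSW, F) for a list of samples."""
    all_values = [x for g in groups for x in g]
    n, c = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    # Among-group, within-group and total sums of squares
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    msa, msw = ssa / (c - 1), ssw / (n - c)
    return ssa, ssw, sst, msa, msw, msa / msw


ssa, ssw, sst, msa, msw, f_stat = one_way_anova(list(strength.values()))
F_CRITICAL = 3.24  # table value for alpha = 0.05 quoted on the slide

print(f"SSA={ssa:.4f} SSW={ssw:.3f} SST={sst:.4f}")
print(f"MSA={msa:.3f} MSW={msw:.3f} F={f_stat:.2f}")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```

The sums of squares agree with the values worked out by hand (63.2855, 97.504 and 160.7895), and SSA + SSW adds up to SST.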
SPSS output for the fertilizer example

ANOVA
Dependent variable: HASIL (yield)

                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   992.000          2    496.000       49.600   .000
Within Groups    90.000           9    10.000
Total            1082.000         11

Guidelines for selecting a multiple
comparisons method in ANOVA

Method       Treatment sample sizes   Types of comparisons
Tukey        Equal                    Pairwise
Bonferroni   Equal or unequal         Pairwise or general contrasts (number of contrasts known)
Scheffé      Equal or unequal         General contrasts

Note:
• For equal sample sizes and pairwise comparisons, Tukey's method will yield
simultaneous confidence intervals with the smallest width, and the Bonferroni
intervals will have smaller widths than the Scheffé intervals

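Of the three methods, the Bonferroni adjustment is the simplest to compute by hand: the overall significance level is divided by the number of comparisons, which must be known in advance. A minimal sketch follows; the function name `bonferroni_alpha` is illustrative, and Tukey's and Scheffé's methods need additional distribution tables that are not covered here.

```python
from math import comb


def bonferroni_alpha(num_comparisons: int, familywise_alpha: float) -> float:
    """Per-comparison significance level under the Bonferroni adjustment."""
    return familywise_alpha / num_comparisons


# All pairwise comparisons among four groups at an overall alpha of 0.05
m = comb(4, 2)
per_test = bonferroni_alpha(m, 0.05)
print(f"{m} comparisons, each tested at alpha = {per_test:.5f}")
```

Each of the six pairwise comparisons is then judged at the smaller level, which keeps the overall chance of a false rejection at or below 0.05.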
Example
• The following are rice yields (quintals) from 12 paddy plots
treated with 3 different types of fertilizer. Each
type of fertilizer was applied to 4 plots.
Is there a difference in yield among the three
types of fertilizer? Use α = 5%

Type of Fertilizer
A     B     C
55    66    47
54    76    51
59    67    46
56    71    48
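The fertilizer data are small enough to check against the SPSS output directly. The sketch below uses only the Python standard library; the variable names are illustrative.

```python
from statistics import fmean

# Yields (quintals) for the four plots under each fertilizer type
yields = {
    "A": [55, 54, 59, 56],
    "B": [66, 76, 67, 71],
    "C": [47, 51, 46, 48],
}

grand_mean = fmean([x for plots in yields.values() for x in plots])

# Between-groups and within-groups sums of squares
between = sum(len(v) * (fmean(v) - grand_mean) ** 2 for v in yields.values())
within = sum((x - fmean(v)) ** 2 for v in yields.values() for x in v)

df_between = len(yields) - 1
df_within = sum(len(v) for v in yields.values()) - len(yields)
f_stat = (between / df_between) / (within / df_within)

print(f"Between: SS={between:.1f}, df={df_between}")
print(f"Within:  SS={within:.1f}, df={df_within}")
print(f"F = {f_stat:.1f}")
```

The sums of squares (992 and 90), the degrees of freedom (2 and 9) and the F value (49.6) match the Between Groups and Within Groups rows of the SPSS table.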
Example
• Test whether there is a significant difference between the variances
of the exam scores of Class A and Class B.

Exam Scores
Class A   Class B
52        59
67        60
56        61
45        51
70        56
54        63
64        57
65        65

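The F ratio for this example can be computed as follows. This is a sketch using the standard library only: it puts the larger sample variance in the numerator and leaves the comparison with the critical value to an F table, since the slide does not state a significance level. The variable names are illustrative.

```python
from statistics import variance

class_a = [52, 67, 56, 45, 70, 54, 64, 65]
class_b = [59, 60, 61, 51, 56, 63, 57, 65]

# Sample variances (divisor n - 1)
var_a, var_b = variance(class_a), variance(class_b)

# The larger sample variance goes in the numerator
(num_var, num_n), (den_var, den_n) = sorted(
    [(var_a, len(class_a)), (var_b, len(class_b))], reverse=True
)
f_ratio = num_var / den_var
df1, df2 = num_n - 1, den_n - 1

print(f"s2_A = {var_a:.3f}, s2_B = {var_b:.3f}")
print(f"F = {f_ratio:.3f} with d.f. ({df1}, {df2})")
```

H0 (equal variances) is rejected if this ratio exceeds the table value of F for the chosen α with (7, 7) degrees of freedom.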
Exercise
• The following are the times (minutes) needed to
complete one practice problem from each of 4 courses.
A random sample of 5 students was taken for each
course. Test whether there is a significant
difference at LOS 0.05. (Work it out using the
ANOVA table.)

Macroeconomics   Microeconomics   Mathematics   Statistics
18               20               20            22
21               22               24            24
20               23               25            23
25               21               28            25
26               24               28            25
2. The Factorial Design: Two Way ANOVA

• A two-factor factorial design is one in
which two factors are
simultaneously evaluated
• Partitioning the total variation:
SST = SSA + SSB + SSAB + SSE

Summary of Two Way ANOVA

Source of variation    Sum of squares   d.f.
Total variation        SST              n - 1
Factor A variation     SSA              r - 1
Factor B variation     SSB              c - 1
Interaction            SSAB             (r - 1)(c - 1)
Random variation       SSE              rc(n' - 1)

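• Total Variation: represents the total variation among all
the observations around the grand mean

$$SST = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} \left( X_{ijk} - \bar{\bar{X}} \right)^2$$

• Factor A Variation: represents the differences among the
means of the various levels of factor A and the grand
mean

$$SSA = c\,n' \sum_{i=1}^{r} \left( \bar{X}_{i..} - \bar{\bar{X}} \right)^2$$

where r = number of levels of factor A, c = number of levels of factor B,
and n' = number of observations in each cell
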
• Factor B Variation: represents the differences among the
means of the various levels of factor B and the grand
mean.

$$SSB = r\,n' \sum_{j=1}^{c} \left( \bar{X}_{.j.} - \bar{\bar{X}} \right)^2$$

• Interaction Factor: represents the interacting effect of
specific combinations of factor A and factor B.

$$SSAB = n' \sum_{i=1}^{r} \sum_{j=1}^{c} \left( \bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{\bar{X}} \right)^2$$

• Random error: represents the differences
among the observations within each cell and the
corresponding cell mean.

$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n'} \left( X_{ijk} - \bar{X}_{ij.} \right)^2$$

Summary Table
Source of Variation   D.F.             SS     MS (Variance)                  F
A                     r - 1            SSA    MSA = SSA/(r - 1)              F = MSA/MSE
B                     c - 1            SSB    MSB = SSB/(c - 1)              F = MSB/MSE
AB                    (r - 1)(c - 1)   SSAB   MSAB = SSAB/[(r - 1)(c - 1)]   F = MSAB/MSE
Error                 rc(n' - 1)       SSE    MSE = SSE/[rc(n' - 1)]
Total                 n - 1            SST

Three distinct tests
• To test the hypothesis of no difference due to
factor A
H0: µ1.. = µ2.. = … = µr..
H1: Not all µi.. are equal

The null hypothesis is rejected at the α level of significance if

$$F = \frac{MSA}{MSE} > F_U$$

• To test the hypothesis of no difference due to
factor B
H0: µ.1. = µ.2. = … = µ.c.
H1: Not all µ.j. are equal

The null hypothesis is rejected at the α level of significance if

$$F = \frac{MSB}{MSE} > F_U$$

• To test the hypothesis of no interaction of factors
A and B
– H0: the interaction of A and B is equal to zero
– H1: the interaction of A and B is not equal to zero
– The null hypothesis is rejected at the α level of
significance if

$$F = \frac{MSAB}{MSE} > F_U$$

In each test, $F_U$ is the upper-tail critical value of the F distribution whose numerator d.f. are those of the effect being tested and whose denominator d.f. are rc(n' - 1).

• In an experiment on machine life, two factors are suspected
of affecting machine life: heat treatment and ring size.
Using a 5% LOS, do these two factors have an effect
on machine life?

           Heat treatment
Ring       low       high
small      12, 24    26, 16
large      18, 28    101, 113
Output SPSS
Tests of Between-Subjects Effects

Dependent Variable: LIFE

Source                Type III Sum of Squares   df   Mean Square   F         Sig.
Corrected Model       11205.500(a)              3    3735.167      61.232    .001
Intercept             14280.500                 1    14280.500     234.107   .000
RING_OSC              4140.500                  1    4140.500      67.877    .001
HEAT_TMT              3784.500                  1    3784.500      62.041    .001
RING_OSC * HEAT_TMT   3280.500                  1    3280.500      53.779    .002
Error                 244.000                   4    61.000
Total                 25730.000                 8
Corrected Total       11449.500                 7
a. R Squared = .979 (Adjusted R Squared = .963)

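The sums of squares in this output can be reproduced from the two-way ANOVA formulas. The sketch below uses only the Python standard library and takes ring as factor A (rows) and heat treatment as factor B (columns); that assignment and all variable names are choices made for illustration.

```python
from statistics import fmean

# life[ring][heat] holds the replicate observations in each cell
life = {
    "small": {"low": [12, 24], "high": [26, 16]},
    "large": {"low": [18, 28], "high": [101, 113]},
}
rings = list(life)
heats = list(life["small"])
r, c = len(rings), len(heats)
n_rep = len(life["small"]["low"])  # observations per cell (n')

grand = fmean([x for a in rings for b in heats for x in life[a][b]])
ring_mean = {a: fmean([x for b in heats for x in life[a][b]]) for a in rings}
heat_mean = {b: fmean([x for a in rings for x in life[a][b]]) for b in heats}
cell_mean = {(a, b): fmean(life[a][b]) for a in rings for b in heats}

# Sums of squares for factor A, factor B, interaction and error
ss_ring = c * n_rep * sum((ring_mean[a] - grand) ** 2 for a in rings)
ss_heat = r * n_rep * sum((heat_mean[b] - grand) ** 2 for b in heats)
ss_inter = n_rep * sum(
    (cell_mean[a, b] - ring_mean[a] - heat_mean[b] + grand) ** 2
    for a in rings for b in heats
)
ss_error = sum(
    (x - cell_mean[a, b]) ** 2 for a in rings for b in heats for x in life[a][b]
)

mse = ss_error / (r * c * (n_rep - 1))
f_ring = (ss_ring / (r - 1)) / mse
f_heat = (ss_heat / (c - 1)) / mse
f_inter = (ss_inter / ((r - 1) * (c - 1))) / mse

print(f"Ring:        SS={ss_ring:.1f}  F={f_ring:.3f}")
print(f"Heat:        SS={ss_heat:.1f}  F={f_heat:.3f}")
print(f"Interaction: SS={ss_inter:.1f}  F={f_inter:.3f}")
print(f"Error:       SS={ss_error:.1f}  MSE={mse:.1f}")
```

The results agree with the RING_OSC, HEAT_TMT, RING_OSC * HEAT_TMT and Error rows of the SPSS table.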
Example
• The following are the times (minutes) needed to
complete one practice problem from each of 4 courses. A random
sample of 5 students was taken for each course. Test
whether there are significant differences in completion time
among the four courses and among the students.

Student   Macroeconomics   Microeconomics   Mathematics   Statistics
A         18               20               20            22
B         21               22               24            24
C         20               23               25            23
D         25               21               28            25
E         26               24               28            25
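This example has only one observation per student-course cell, so the interaction cannot be separated from random error. The sketch below therefore assumes the analysis intended is a two-way ANOVA without replication, in which the residual variation serves as the error term; that reading, and the variable names, are assumptions. The comparison with critical values is left to an F table because the slide does not state α.

```python
from statistics import fmean

# One row per student; the four values follow the course order of the table
minutes = {
    "A": [18, 20, 20, 22],
    "B": [21, 22, 24, 24],
    "C": [20, 23, 25, 23],
    "D": [25, 21, 28, 25],
    "E": [26, 24, 28, 25],
}
r = len(minutes)                       # number of students (rows)
c = len(next(iter(minutes.values())))  # number of courses (columns)

grand = fmean([x for row in minutes.values() for x in row])
student_mean = [fmean(row) for row in minutes.values()]
course_mean = [fmean([row[j] for row in minutes.values()]) for j in range(c)]

ss_total = sum((x - grand) ** 2 for row in minutes.values() for x in row)
ss_students = c * sum((m - grand) ** 2 for m in student_mean)
ss_courses = r * sum((m - grand) ** 2 for m in course_mean)
ss_error = ss_total - ss_students - ss_courses  # residual used as error

mse = ss_error / ((r - 1) * (c - 1))
f_students = (ss_students / (r - 1)) / mse
f_courses = (ss_courses / (c - 1)) / mse

print(f"Courses:  SS={ss_courses:.1f}  df={c - 1}  F={f_courses:.3f}")
print(f"Students: SS={ss_students:.1f}  df={r - 1}  F={f_students:.3f}")
print(f"Error:    SS={ss_error:.1f}  df={(r - 1) * (c - 1)}")
```

Each F value is compared with the table value for its own numerator d.f. (3 for courses, 4 for students) and 12 denominator d.f.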
Exercise
• Test at α = 0.1 whether there is a difference in
production output (kg) among the three machines and among the five
employees.

           Machine
Employee   I     II    III
A          21    17    31
B          27    25    28
C          29    20    32
D          23    15    30
E          25    23    24
