Minitab provides four methods for selecting the “best” subset of predictors:
Considering all possible regressions
Forward selection
Backward elimination
Stepwise regression
Row   X1   X2   X3   X4      X5
  1    7   26    6   60    78.5
  2    1   29   15   52    74.3
  3   11   56    8   20   104.3
  4   11   31    8   47    87.6
  5    7   52    6   33    95.9
  6   11   55    9   22   109.2
  7    3   71   17    6   102.7
  8    1   31   22   44    72.5
  9    2   54   18   22    93.1
 10   21   47    4   26   115.9
 11    1   40   23   34    83.8
 12   11   66    9   12   113.3
 13   10   68    8   12   109.4
4.1
ALL POSSIBLE REGRESSIONS Introduction
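Subset models are compared through the standardized total mean square error of the
fitted values, Jp = (1/σ²) Σ mse(ŷi), summed over the n observations,
where mse(ŷi) is the mean square error for each fitted value. Better subsets have
smaller values of Jp, and one of the simplest estimates of Jp is Mallows’ Cp, which
for a model with p parameters can be expressed as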
\[
\begin{aligned}
C_p &= \frac{RSS_p}{\hat{\sigma}^2} + 2p - n \\
    &= \frac{RSS_p - RSS_{k'}}{\hat{\sigma}^2} + p - (k' - p) \\
    &= (k' - p)(F_p - 1) + p
\end{aligned}
\]
where σ̂² is the error variance estimate from the full model (which has k′ parameters)
and Fp is the F statistic for testing that the predictors left out of the subset model
but included in the full model have zero coefficients.
Cp is a measure of the differences in fitting errors between the full model and a
particular subset model. For the full model Ck′ = k′, and Mallows suggests that
good models have Cp ≃ p.
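As a rough check of these expressions, take the subset X1, X2 (p = 3 counting the constant) and use the S values that Minitab reports for this data set: S = 2.4063 for the subset and S = 2.4460 for the full model (k′ = 5, n = 13). The only extra step assumed here is the usual relation S² = RSS/(n − p); the figures are rounded.

```latex
\begin{aligned}
RSS_p &= (n - p)\,S_p^2 = 10 \times 2.4063^2 \approx 57.90, \qquad
\hat{\sigma}^2 = 2.4460^2 \approx 5.983, \qquad
RSS_{k'} = 8 \times 5.983 \approx 47.86 \\
C_p &= \frac{57.90}{5.983} + 2(3) - 13 \approx 9.68 - 7 = 2.68 \\
F_p &= \frac{57.90 - 47.86}{(5 - 3) \times 5.983} \approx 0.839, \qquad
(k' - p)(F_p - 1) + p \approx 2(0.839 - 1) + 3 = 2.68
\end{aligned}
```

Both forms give the same value, which agrees with the C-p of 2.7 that Minitab lists for this subset.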
Using the appropriate commands or menu options, Minitab will produce a list giving
the values of R², S and Cp for the various subsets of predictor variables. A subset
model is chosen on the basis of a value of Cp close to p, a large value of R² and a
small value of S.
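The listing that Minitab produces can be approximated outside Minitab. The following is a minimal sketch in plain Python, not Minitab's own algorithm: it fits every subset of X1 to X4 by ordinary least squares and reports R², S and Cp. The helper names (`solve`, `rss`, `all_subsets`) are hypothetical, and X5 is taken as the response.

```python
from itertools import combinations
from math import sqrt

# Data table from the start of these notes: (X1, X2, X3, X4, X5), X5 is the response.
DATA = [
    (7, 26, 6, 60, 78.5), (1, 29, 15, 52, 74.3), (11, 56, 8, 20, 104.3),
    (11, 31, 8, 47, 87.6), (7, 52, 6, 33, 95.9), (11, 55, 9, 22, 109.2),
    (3, 71, 17, 6, 102.7), (1, 31, 22, 44, 72.5), (2, 54, 18, 22, 93.1),
    (21, 47, 4, 26, 115.9), (1, 40, 23, 34, 83.8), (11, 66, 9, 12, 113.3),
    (10, 68, 8, 12, 109.4),
]
NAMES = ["X1", "X2", "X3", "X4"]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(predictors):
    """Residual sum of squares of X5 on a constant plus the given columns (0-based)."""
    rows = [[1.0] + [float(d[j]) for j in predictors] for d in DATA]
    y = [d[4] for d in DATA]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def all_subsets():
    """Return {subset names: (R-Sq in percent, S, Cp)} for every non-empty subset."""
    n = len(DATA)
    sst = rss([])                                 # constant-only model = total SS
    k_full = len(NAMES) + 1                       # parameters in the full model
    sigma2 = rss(list(range(4))) / (n - k_full)   # error variance from the full model
    table = {}
    for size in range(1, len(NAMES) + 1):
        for cols in combinations(range(len(NAMES)), size):
            p = size + 1                          # parameters, counting the constant
            r = rss(list(cols))
            table[tuple(NAMES[j] for j in cols)] = (
                100 * (1 - r / sst), sqrt(r / (n - p)), r / sigma2 + 2 * p - n)
    return table


TABLE = all_subsets()
for names, (r2, s, cp) in TABLE.items():
    print(f"{' '.join(names):<12} R-Sq={r2:5.1f}  S={s:7.4f}  Cp={cp:6.1f}")
```

The normal equations are solved directly to keep the sketch free of third-party libraries. For larger or more ill-conditioned problems a QR-based least squares routine would be the safer choice.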
4.2
ALL POSSIBLE REGRESSIONS Minitab output
Variables        R²
X4               67.5
X2               66.6
X1               53.4
X3               28.6
X1 X2            97.9
X1 X4            97.2
X3 X4            93.5
X2 X3            84.7
X2 X4            68.0
X1 X3            54.8
X1 X2 X4         98.234
X1 X2 X3         98.228
X1 X3 X4         98.1
X2 X3 X4         97.3
X1 X2 X3 X4      98.2
Response is X5
                                            X X X X
Vars  R-Sq  R-Sq(adj)  Mallows C-p       S  1 2 3 4
   1  67.5       64.5        138.7  8.9639        X
   1  66.6       63.6        142.5  9.0771    X
   2  97.9       97.4          2.7  2.4063  X X
   2  97.2       96.7          5.5  2.7343  X     X
   3  98.2       97.6          3.0  2.3087  X X   X
   3  98.2       97.6          3.0  2.3121  X X X
   4  98.2       97.4          5.0  2.4460  X X X X
4.3
FORWARD SELECTION Introduction
The forward selection method begins by selecting as the first predictor the one which
has the highest correlation, in absolute value, with the response. If this predictor
is significant, the next predictor added is the one, among those remaining, which has
the highest partial correlation with the response, provided that it has a significant
sequential sum of squares in the regression. Further predictors are added to the model
in the same way, and the process stops when the sequential sum of squares of the next
candidate predictor is no longer significant.
For a sequential sum of squares to be considered significant, its p-value must be less
than a specified significance level, which in Minitab is the value of Alpha to enter
in the Stepwise regression menu. Minitab sets this value to 0.25 by default.
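The procedure just described can be sketched in plain Python. This is an illustrative reading of the rules above, not Minitab's implementation: the helper names are hypothetical, the candidate with the largest drop in residual sum of squares is taken as the one with the highest partial correlation, and the p-value of the one-degree-of-freedom partial F test is obtained by integrating the t density numerically.

```python
from math import gamma, pi, sqrt

# Data table from the start of these notes: (X1, X2, X3, X4, X5), X5 is the response.
DATA = [
    (7, 26, 6, 60, 78.5), (1, 29, 15, 52, 74.3), (11, 56, 8, 20, 104.3),
    (11, 31, 8, 47, 87.6), (7, 52, 6, 33, 95.9), (11, 55, 9, 22, 109.2),
    (3, 71, 17, 6, 102.7), (1, 31, 22, 44, 72.5), (2, 54, 18, 22, 93.1),
    (21, 47, 4, 26, 115.9), (1, 40, 23, 34, 83.8), (11, 66, 9, 12, 113.3),
    (10, 68, 8, 12, 109.4),
]
NAMES = ["X1", "X2", "X3", "X4"]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(predictors):
    """Residual sum of squares of X5 on a constant plus the given columns (0-based)."""
    rows = [[1.0] + [float(d[j]) for j in predictors] for d in DATA]
    y = [d[4] for d in DATA]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def t_two_sided_p(t, df, steps=4000):
    """Two-sided p-value of a t statistic, using Simpson's rule on the t density."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    upper = abs(t)
    h = upper / steps
    area = c + c * (1 + upper * upper / df) ** (-(df + 1) / 2)
    for i in range(1, steps):
        u = i * h
        area += (4 if i % 2 else 2) * c * (1 + u * u / df) ** (-(df + 1) / 2)
    return max(0.0, 1 - 2 * area * h / 3)


def forward_selection(alpha_to_enter=0.25):
    """Add predictors one at a time while the newest one is significant."""
    n = len(DATA)
    chosen = []
    while len(chosen) < len(NAMES):
        rss_now = rss(chosen)
        candidates = [j for j in range(len(NAMES)) if j not in chosen]
        # Largest drop in RSS = largest sequential SS among the candidates.
        best = min(candidates, key=lambda j: rss(chosen + [j]))
        rss_new = rss(chosen + [best])
        df = n - (len(chosen) + 2)          # error df after adding the candidate
        f_stat = max(0.0, rss_now - rss_new) / (rss_new / df)
        if t_two_sided_p(sqrt(f_stat), df) >= alpha_to_enter:
            break                            # candidate not significant: stop
        chosen.append(best)
    return [NAMES[j] for j in chosen]


print(forward_selection())
```

Because each step adds a single predictor, the partial F statistic has one numerator degree of freedom and equals the square of that predictor's t statistic, which is why a t-based p-value is sufficient here.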
4.4
FORWARD SELECTION Minitab output
Correlations (Pearson)
         X1      X2      X3      X4
X2    0.229
X3   -0.824  -0.139
X4   -0.245  -0.973   0.030
X5    0.731   0.816  -0.535  -0.821
Regression Analysis
Analysis of Variance
Source       DF      SS      MS      F      P
Regression    1  1831.9  1831.9  22.80  0.000
Error        11   883.9    80.4
Total        12  2715.8
4.5
FORWARD SELECTION Minitab output
Step           1       2       3
Constant  117.57  103.10   71.65
X1                  1.44    1.45
T-Value            10.40   12.41
P-Value            0.000   0.000
X2                          0.42
T-Value                     2.24
P-Value                    0.052
4.6
BACKWARD ELIMINATION Introduction
In backward elimination, all predictors are placed in the model at the start, and
then progressively removed. At each stage, every predictor in the model is considered
as if it were the last predictor to be fitted, and the one with the smallest such
sequential sum of squares is removed, provided that this sum of squares is not
significant. The process stops when all predictors remaining in the model, if each
were the last predictor fitted, have a significant sequential sum of squares.
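A minimal Python sketch of these removal rules follows. It is an illustration rather than Minitab's implementation: the helper names are hypothetical, and the significance level `alpha_to_remove=0.10` is an assumed value, since these notes do not state the level used for removal.

```python
from math import gamma, pi, sqrt

# Data table from the start of these notes: (X1, X2, X3, X4, X5), X5 is the response.
DATA = [
    (7, 26, 6, 60, 78.5), (1, 29, 15, 52, 74.3), (11, 56, 8, 20, 104.3),
    (11, 31, 8, 47, 87.6), (7, 52, 6, 33, 95.9), (11, 55, 9, 22, 109.2),
    (3, 71, 17, 6, 102.7), (1, 31, 22, 44, 72.5), (2, 54, 18, 22, 93.1),
    (21, 47, 4, 26, 115.9), (1, 40, 23, 34, 83.8), (11, 66, 9, 12, 113.3),
    (10, 68, 8, 12, 109.4),
]
NAMES = ["X1", "X2", "X3", "X4"]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(predictors):
    """Residual sum of squares of X5 on a constant plus the given columns (0-based)."""
    rows = [[1.0] + [float(d[j]) for j in predictors] for d in DATA]
    y = [d[4] for d in DATA]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def t_two_sided_p(t, df, steps=4000):
    """Two-sided p-value of a t statistic, using Simpson's rule on the t density."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    upper = abs(t)
    h = upper / steps
    area = c + c * (1 + upper * upper / df) ** (-(df + 1) / 2)
    for i in range(1, steps):
        u = i * h
        area += (4 if i % 2 else 2) * c * (1 + u * u / df) ** (-(df + 1) / 2)
    return max(0.0, 1 - 2 * area * h / 3)


def backward_elimination(alpha_to_remove=0.10):  # 0.10 is an assumed level
    """Start from the full model and drop the weakest non-significant predictor."""
    n = len(DATA)
    kept = list(range(len(NAMES)))
    while kept:
        rss_now = rss(kept)
        df = n - (len(kept) + 1)             # error df of the current model
        # Smallest rise in RSS on removal = smallest SS when fitted last.
        weakest = min(kept, key=lambda j: rss([i for i in kept if i != j]))
        rss_without = rss([i for i in kept if i != weakest])
        f_stat = max(0.0, rss_without - rss_now) / (rss_now / df)
        if t_two_sided_p(sqrt(f_stat), df) < alpha_to_remove:
            break                            # every remaining predictor is significant
        kept.remove(weakest)
    return [NAMES[j] for j in kept]


print(backward_elimination())
```

Only the weakest predictor needs to be tested at each stage: if its fitted-last sum of squares is significant, so are those of all the other predictors in the model.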
4.7
BACKWARD ELIMINATION Minitab output
Step           1       2       3
Constant   62.41   71.65   52.58
X3          0.10
T-Value     0.14
P-Value    0.896
X4         -0.14   -0.24
T-Value    -0.20   -1.37
P-Value    0.844   0.205
4.8
STEPWISE Introduction and Minitab output
Step           1       2       3       4
Constant  117.57  103.10   71.65   52.58
X2                         0.416   0.662
T-Value                     2.24   14.44
P-Value                    0.052   0.000
4.9