The D2star Table Exposed - Gruska

Stat Expos
The d-two-star tables uncovered
Gregory F. Gruska, The Third Generation Inc

with input from the MSA workgroup
David Benham, DaimlerChrysler Corporation
Peter Cvetkovski, Ford Motor Company
Michael Down, General Motors Corporation

Abstract

From the beginning of the development and implementation of Statistical Quality Control
(SQC), now known as Statistical Process Control (SPC), the Range has been used to
develop an estimate of the process standard deviation. Since the Range is a biased
estimate of the standard deviation, correction factors have to be used to transform the
average Range to an estimate of the process standard deviation. This paper will discuss
the development of the $d_2^*$ tables, which contain the necessary correction factors.

Warning

This paper is rated (xi) since it contains Greek letters, mathematical symbols, and
statistical terminology. Individuals with no statistical background or training should
proceed with caution. Professional statistical guidance is recommended.

Warning

This paper is intended to serve as additional guidelines for the analysis of measurement systems.
www.aiag.org/publications/quality/msa3.html
The d-two-star tables uncovered

From the beginning of the development and implementation of Statistical Quality Control
(SQC), now known as Statistical Process Control (SPC), the range has been used to
develop an estimate of the process standard deviation. Although the range provides a less
efficient estimate of the population standard deviation, it was widely used during the
first five decades of SPC because it was easy to calculate and because inexpensive
computers and calculators capable of calculating the standard deviation were not available.

Because of its wide use, the distribution of the range in random samples from a normal
distribution has been studied by noted statisticians such as David, Grubbs, Weaver,
Patnaik, Hartley, Pearson, and Duncan. The difficulty with reading their papers is that
they do not use a consistent notation. This paper will use the notation of
Quality Control and Industrial Statistics¹ by Acheson Duncan because of its prominence
in the quality field.

The above authors have shown that the distribution of the range in random samples from
a normal distribution:
• Depends on the sample size
• Is independent of the population mean
• Is dependent on the population standard deviation
• Further, the relative efficiency of the range as an estimator of the standard deviation
  decreases as the sample size increases².
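These properties can be checked with a small simulation. The sketch below is illustrative only and is not taken from the cited papers; the function name `mean_range`, the seed, and the number of replications are assumptions chosen for the example. Reusing the same seed means each run draws the same standardized values, so the effect of changing the mean or the standard deviation is isolated.

```python
import random

def mean_range(m, mu, sigma, reps=20000, seed=1):
    """Monte Carlo average of the range over `reps` normal samples of size m."""
    rng = random.Random(seed)  # fixed seed: same underlying z-values on every call
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(m)]
        total += max(sample) - min(sample)
    return total / reps

base = mean_range(5, mu=0.0, sigma=1.0)
shifted = mean_range(5, mu=100.0, sigma=1.0)  # only the population mean changes
scaled = mean_range(5, mu=0.0, sigma=3.0)     # only the standard deviation changes
larger = mean_range(10, mu=0.0, sigma=1.0)    # only the sample size changes

print(f"m=5,  mu=0,   sigma=1: {base:.4f}")
print(f"m=5,  mu=100, sigma=1: {shifted:.4f}")
print(f"m=5,  mu=0,   sigma=3: {scaled:.4f}")
print(f"m=10, mu=0,   sigma=1: {larger:.4f}")
```

The shifted run should agree with the base run, the scaled run should be three times the base run, and the run with m = 10 should give a larger average range than the run with m = 5.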

Unfortunately, a simple form of the exact distribution of the (mean) range cannot be
developed except for the trivial case of two samples of two observations each. However,
Patnaik (1950) did develop a useful approximation to this distribution, which is used
here.

Approximation to the Distribution of the Mean Range

Let $x_1, x_2, \ldots, x_m$ denote a random sample of size m from a normal population having
mean $\mu$ and standard deviation $\sigma$. The range of this sample is

$$\text{Range} = R_m = \max(x_1, x_2, \ldots, x_m) - \min(x_1, x_2, \ldots, x_m)$$

If there are g such independent samples, each with a sample size of m, the mean of the g
ranges is denoted in this paper by $\bar{R}_{g,m}$.

Let $W_m$ denote the range of the standardized (z) values. That is,

$$W_m = \max(z_1, z_2, \ldots, z_m) - \min(z_1, z_2, \ldots, z_m) \quad \text{where} \quad z_i = \frac{x_i - \mu}{\sigma}$$

¹ Quality Control and Industrial Statistics, 5th edition, by Acheson Duncan, McGraw-Hill, 1986.
² A generally acceptable rule is that the range should not be used when the sample size exceeds 20. In these
cases it is preferable to divide the sample into a number of groups and consider the average range over all
the groups. A subgroup size of seven or eight provides the most efficient estimation.

Then the probability integral of $W_m$ can be expressed as

$$P(W_m) = m \int_{-\infty}^{\infty} f(z) \left\{ \int_{z}^{z+W_m} f(u)\,du \right\}^{m-1} dz$$

where $f(x)$ is the normal frequency function:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$$

The moments for the probability integral of $W_m$ have been calculated to 5 decimal places
using numerical quadrature by Hartley and Pearson (1951). The following table has been
extracted from their work.

Sample size   Mean = $d_2$   Var = $V_m$
     2          1.12838       0.72676
     3          1.69257       0.78922
     4          2.05875       0.77407
     5          2.32593       0.74661
     6          2.53441       0.71916
     7          2.70436       0.69424
     8          2.84720       0.67213
     9          2.97003       0.65262
    10          3.07751       0.63531
    11          3.17287       0.61984
    12          3.25846       0.60601
    13          3.33598       0.59353
    14          3.40676       0.58217
    15          3.47193       0.57186
    16          3.53198       0.56237
    17          3.58788       0.55363
    18          3.64006       0.54554
    19          3.68896       0.53802
    20          3.73495       0.53097
Table 1: Mean and variance of
the distribution of ranges in normal samples
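The entries of Table 1 can be reproduced approximately from the probability integral given above. The following is a minimal sketch using composite Simpson's rule; it is not the quadrature scheme used by Hartley and Pearson, and the function names, integration limits, and grid sizes are assumptions chosen for illustration. It uses the identities E(W) = ∫(1 − P(w)) dw and E(W²) = ∫2w(1 − P(w)) dw for a non-negative variable.

```python
import math

def pdf(x):
    """Standard normal frequency function f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal distribution function, the integral of f up to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson_weights(n):
    """Composite Simpson weights 1, 4, 2, ..., 4, 1 for n (even) intervals."""
    return [1 if i in (0, n) else 4 if i % 2 else 2 for i in range(n + 1)]

def range_cdf(w, m, lo=-8.0, hi=8.0, n=400):
    """P(W_m <= w) = m * integral of f(z) * [F(z + w) - F(z)]^(m - 1) dz."""
    h = (hi - lo) / n
    total = 0.0
    for i, wt in enumerate(simpson_weights(n)):
        z = lo + i * h
        total += wt * pdf(z) * (cdf(z + w) - cdf(z)) ** (m - 1)
    return m * total * h / 3.0

def range_moments(m, w_max=12.0, n=300):
    """Return (mean, variance) of W_m computed from the tail 1 - P(W_m <= w)."""
    h = w_max / n
    first = second = 0.0
    for i, wt in enumerate(simpson_weights(n)):
        w = i * h
        tail = 1.0 - range_cdf(w, m)
        first += wt * tail              # integrand for E(W)
        second += wt * 2.0 * w * tail   # integrand for E(W^2)
    mean = first * h / 3.0
    return mean, second * h / 3.0 - mean * mean

for m in (2, 5):
    d2, v = range_moments(m)
    print(m, round(d2, 5), round(v, 5))
```

The printed values can be compared with the rows for m = 2 and m = 5 in Table 1; small differences in the last decimal place are possible with this simple grid.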

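The range is a biased estimate of the standard deviation, and a correction factor ($d_2$) has
to be used to transform the average range to an estimate of the process standard deviation.
I.e.,

$$E(W_m) = \frac{E(R_m)}{\sigma} = d_2$$

So

$$\hat{\sigma} = \frac{\bar{R}_m}{d_2}$$

where $\bar{R}_m$ is the average range of samples of size m.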
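As a worked illustration of this correction, the sketch below estimates the standard deviation from a handful of subgroups. The data values and the function name `sigma_from_ranges` are hypothetical, and the $d_2$ values are copied from Table 1. It uses $d_2$ exactly as in the passage above.

```python
# d2 values from Table 1, keyed by subgroup size m
D2 = {2: 1.12838, 3: 1.69257, 4: 2.05875, 5: 2.32593}

def sigma_from_ranges(subgroups):
    """Estimate sigma as (average subgroup range) / d2 for equal-size subgroups."""
    m = len(subgroups[0])
    if any(len(s) != m for s in subgroups):
        raise ValueError("all subgroups must have the same size m")
    ranges = [max(s) - min(s) for s in subgroups]
    r_bar = sum(ranges) / len(ranges)  # average range
    return r_bar / D2[m]

# Hypothetical measurements: three subgroups of size m = 3
data = [[10.1, 9.8, 10.3], [9.9, 10.4, 10.0], [10.2, 10.2, 9.7]]
sigma_hat = sigma_from_ranges(data)
print(f"estimated sigma = {sigma_hat:.4f}")
```

Each subgroup here has a range of 0.5, so the estimate is 0.5 divided by the tabled $d_2$ for m = 3.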
For the distribution of $\bar{R}_{g,m}$, things are not so simple. However, based on the work by
Pearson (1926), Patnaik selected the $\chi$ distribution as a reasonably accurate
representation of the distribution of $\bar{R}_{g,m}$.

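The first two moments of $\bar{R}_{g,m}$ are related to those of $R_m$ by

$$E\left(\frac{\bar{R}_{g,m}}{\sigma}\right) = d_2$$

$$\operatorname{var}\left(\frac{\bar{R}_{g,m}}{\sigma}\right) = \frac{1}{g}\operatorname{var}\left(\frac{R_m}{\sigma}\right) = \frac{1}{g}V_m = V_{g,m}$$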

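Relating these two moments with those of $c\chi/\sqrt{\nu}$, where $\chi$ has $\nu$ degrees of freedom,
yields:

$$d_2 = c\,\sqrt{\frac{2}{\nu}}\;\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)}$$

$$V_{g,m} = \frac{c^2}{\nu}\left[\nu - 2\left(\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)}\right)^2\right]$$

where $c = d_2^*$.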

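The $\Gamma$-functions can be expanded by Stirling's formula and the resulting equations
simplified and used to solve for $d_2^*$ and $\nu$.

$$(d_2^*)^2 = d_2^2 + \frac{V_m}{g}$$

$$\nu = A^{-1} + \frac{1}{4} - \frac{3}{16}A + \frac{3}{64}A^2 \quad \text{where} \quad A = \frac{2V_{g,m}}{d_2^2} = \frac{2V_m}{g\,d_2^2}$$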

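These are the formulae used to generate the $d_2^*$ table in Appendix C of the MSA Manual,
3rd edition.
The constant difference (C.D.) given in the table is calculated by

$$\text{C.D.} = \frac{d_2^2}{2V_m}$$

To a first approximation, this is the amount by which $\nu$ increases for each additional
range included in the average.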
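The formulae above can be turned into a few lines of code. This is a sketch under stated assumptions rather than the procedure used to build the MSA table: the function names are hypothetical, only three rows of Table 1 are included, and the series for $\nu$ is written with A = 2V_m/(g d_2²) and with the signs that follow from the Stirling expansion.

```python
import math

# (d2, V_m) pairs copied from Table 1, keyed by subgroup size m
TABLE1 = {2: (1.12838, 0.72676), 3: (1.69257, 0.78922), 8: (2.8472, 0.67213)}

def d2_star(g, m):
    """d2* from (d2*)^2 = d2^2 + V_m / g."""
    d2, v = TABLE1[m]
    return math.sqrt(d2 * d2 + v / g)

def constant_difference(m):
    """C.D. = d2^2 / (2 V_m); depends on m only."""
    d2, v = TABLE1[m]
    return d2 * d2 / (2.0 * v)

def degrees_of_freedom(g, m):
    """Approximate nu from the series in A = 2 V_m / (g d2^2) (assumed form)."""
    d2, v = TABLE1[m]
    a = 2.0 * v / (g * d2 * d2)
    return 1.0 / a + 0.25 - 3.0 * a / 16.0 + 3.0 * a * a / 64.0

print(round(d2_star(5, 2), 5))             # compare with the tabled d2* for g=5, m=2
print(round(d2_star(20, 8), 5))            # compare with the tabled d2* for g=20, m=8
print(round(degrees_of_freedom(20, 8), 1)) # compare with the tabled nu for g=20, m=8
print(round(constant_difference(8), 4))    # compare with the tabled C.D. for m=8
```

Since the leading term of the series is 1/A = g × C.D., the degrees of freedom grow almost linearly in g, which is why a single constant difference per subgroup size is enough to extend the table.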

Using the $d_2^*$ Table
Whither go g and m?

The thing that tends to confuse people most when they first use the $d_2^*$ table is what
the values of g and m should be. The best thing is to bring it down to basics:
• How many ranges are used to calculate the average range? = g
• How many pieces were used to determine each range? = m

In the MSA 3rd edition example for the Range Method there are five ranges used to calculate
the average range; hence g = 5. Each range was the difference of samples of size two, so
m = 2. From the table the value of $d_2^*$ for g = 5 (= number of parts) and m = 2 (= number
of appraisers) is 1.19105, or simply 1.19 (unless you are enamored with decimals).

In the GRR example with a 3, 10, 3 setup, the average range used for repeatability
calculations has g = 30 (= number of parts * number of appraisers) range values used in
the calculations. Each range is based on a sample of m = 3 (= number of trials). Going to
the table we cannot find g = 30; the largest g is 20. We make the assumption that 30 is
sufficiently large and use the $d_2$ value of 1.69257 for $d_2^*$ in the calculation of

$$K_1 = \frac{1}{d_2^*} = \frac{1}{1.69257} = 0.59081751419439077852023845394873 \approx 0.5908$$
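The bookkeeping for g and m in this example can be sketched as follows. The function name `repeatability_layout` is hypothetical, and reading the "3, 10, 3" setup as 3 appraisers, 10 parts, and 3 trials is an assumption. For comparison only, the sketch also evaluates $(d_2^*)^2 = d_2^2 + V_m/g$ directly at g = 30 instead of substituting $d_2$.

```python
import math

D2_M3, V_M3 = 1.69257, 0.78922  # Table 1 row for m = 3

def repeatability_layout(appraisers, parts, trials):
    """Return (g, m): one range per part-appraiser pair, each over `trials` readings."""
    return parts * appraisers, trials

g, m = repeatability_layout(appraisers=3, parts=10, trials=3)

k1_paper = 1.0 / D2_M3                            # d2 used in place of d2*
d2_star_30 = math.sqrt(D2_M3 ** 2 + V_M3 / g)     # formula value at g = 30
k1_formula = 1.0 / d2_star_30

print(f"g = {g}, m = {m}")
print(f"K1 using d2      : {k1_paper:.4f}")
print(f"K1 using formula : {k1_formula:.4f}")
```

The two values of K1 differ by less than half a percent, which gives a sense of how much is given up by treating g = 30 as sufficiently large.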

C.D.s and dfs
The constant difference term is used to determine the degrees of freedom value ($\nu$) when
the number of samples (g) exceeds the tabled values (i.e., g > 20).

Example:
Find $d_2^*$ and $\nu$ for g = 22 and m = 8.
From the table we have $d_2^*$ = 2.85310 and $\nu$ = 120.9 for g = 20, m = 8, with
$d_2$ = 2.8472 and C.D. = 6.0305.

For g = 22 take $d_2^*$ = 2.853, since 22 is closer to 20 than to infinity.
$\nu$ = 120.9 + 2(6.0305) = 132.961, or 133.0

Yes, there is some fudging, but remember these are only approximations.
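The arithmetic of the example can be written as a one-line extrapolation. The function name `extrapolate_df` is hypothetical; the numbers are the ones quoted in the example.

```python
def extrapolate_df(nu_tabled, g_tabled, g, constant_difference):
    """Extend nu beyond the table by adding C.D. once for each extra range."""
    return nu_tabled + (g - g_tabled) * constant_difference

# Values quoted in the example: nu = 120.9 at g = 20, m = 8, with C.D. = 6.0305
nu_22 = extrapolate_df(nu_tabled=120.9, g_tabled=20, g=22, constant_difference=6.0305)
print(f"nu for g = 22, m = 8: {nu_22:.3f}")
```

The same call works for any g above the last tabled row, provided the tabled $\nu$ and C.D. for the relevant m are supplied.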

Gregory F. Gruska, a Fellow of the American Society for Quality (ASQ), is the principal consultant in
performance excellence for Omnex, LLC, an engineering and management services firm. Greg has been
involved in the development of theory and software and has co-authored over 60 books and papers in
statistical theory and applications and quality management.

References

Duncan, A. (1986). Quality Control and Industrial Statistics, 5th edition, McGraw-Hill,
New York.
David, H. A. (1951). Further Applications of Range to the Analysis of Variance,
Biometrika, 38, 393.
Florin, H. (1950). Communications of the Royal Finnish Academy (Science Series), 12, 6.
Grubbs, F. E. and Weaver, C. L. (1947). The Best Estimate of Population Standard
Deviation Based on Group Ranges, JASA, 42, 224.
Hartley, H. O. and Pearson, E. S. (1951). Moment Constants for the Distribution of
Range in Normal Samples, Biometrika, 38, 463.
Patnaik, P. B. (1950). The Use of Mean Range as an Estimator of Variance in Statistical
Tests, Biometrika, 37, 78.
Pearson, E. S. (1951). Some Notes on the Use of Range, Biometrika, 38, 88.
Pearson, E. S. and Hartley, H. O. (eds.) (1976). Biometrika Tables for Statisticians,
Griffin and Co., London.
