
Thoratec

Workshop in
Applied Statistics for
QA/QC, Mfg, and R+D
Part 3 of 3:
Advanced Applications
Instructor : John Zorich
www.JOHNZORICH.COM JOHNZORICH@YAHOO.COM

Part 3 was designed for students who have
taken Part 1 and Part 2 of these workshops,
or who have had a college-level statistics course.
John Zorich's Qualifications:
20 years as a "regular" employee in the medical device
industry (R&D, Mfg, Quality)
ASQ Certified Quality Engineer (since 1996)
Statistical consultant+instructor (since 1999) for many
companies, including Siemens Medical, Boston
Scientific, Stryker, and Novellus
Instructor in applied statistics for Ohlone College (CA),
Pacific Polytechnic Institute (CA), and KEMA/DEKRA
Past instructor in applied statistics for UC Santa Cruz
Extension, ASQ Silicon Valley Biomedical Group, & TUV .
Publisher of 9 commercial, formally validated, statistical
application Excel spreadsheets that have been purchased
by over 80 companies, world wide. Applications include:
Reliability, Normality Tests & Normality Transformations,
Sampling Plans, SPC, Gage R&R, and Power.
You're invited to connect with me on LinkedIn.
Self-teaching & Reference Texts
RECOMMENDED by John Zorich
Dovich: Quality Engineering Statistics
Dovich: Reliability Statistics
Juran: Juran's Quality [Control] Handbook
Natrella: Experimental Statistics (<< recently re-published)
NIST Engineering Statistics Internet Handbook, found at
http://www.itl.nist.gov/div898/handbook/index.htm
Pyzdek: Quality Engineering Bible
Taylor: Guide to Acceptance Sampling
Tobias & Trindade, Applied Reliability
Wheeler: Understanding Statistical Process Control
Zimmerman: Statistical Quality Control Using Excel
AIAG: Statistical Process Control (SPC)
http://www.home.agilent.com/agilent/application.jspx?nid=-34791.0.00&cc=US&lc=eng
Main Topics in Today's Workshop
Reliability Plotting
Statistical Analysis of Gages
QC Sampling Plans
Statistical Process Control (SPC)
(SPC) Process Capability Indices

This is a lot to cover in 1 day, but
your studying the "Student" files at home, and
Instructor accessibility by email, complete the course.
Reliability Plotting
(review of topic from part 2 of this course)
Definitions of Failure and Reliability
In many of the slides in this section of the class, the words
" Failure " and " Reliability " are used.
By "Failure" is meant that an individual component or product
has been put on-test or under inspection and has either not
passed specification or has literally failed (e.g., broke,
separated, or burst -- it may have passed spec but then been
taken past spec, until it eventually failed) --- which meaning is
intended is obvious (or should be !!) in each situation.
"Failure Rate" refers to the % of a lot or sample that has failed
in testing, so far (that is, up to a given stress level).
By "Reliability" is meant the % of the lot that does not exhibit
"failure" (Reliability = 100% minus the Failure Rate), AT OR
BELOW A SPECIFIC STRESS LEVEL.
Reliability Plotting

Typically, reliability data is not linear.
Methods of extrapolating curved lines are not
recommended because they are "not well
understood" (Pyzdek), and their mathematics
are not widely discussed (e.g., not in Juran's
Q-Handbook).
[Figure: % Cumulative ( = 1 - Reliability ) vs. DATA, with the Specification marked]
(review of topic from part 2 of this course)
Definition of " F "
Reliability textbooks provide various transformations of the %
Cumulative values, so that all data, even the "100%" point, can
be plotted onto the Y-axis.
In textbooks on Reliability Statistics, the transformation
suggested is typically the " F " value. To calculate " F " for a given
set of data, first sort all the values, then give each value a "rank"
number ("rank" of the value with the lowest magnitude = 1, next =
2, and so on).
A commonly used formula for F is ...
F = Median Rank = ( Rank - 0.3 ) / ( SampleSize + 0.4 )
A more accurate and theoretically justified calculation (per one
of the authors of Applied Reliability) is (using Excel)...
F = BETAINV ( 0.5 , Rank , SampleSize - Rank + 1 )
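The two median-rank formulas above can be sketched in Python as follows. This is only an illustration, not the workshop's spreadsheet: the function names are hypothetical. The exact value is found by bisection on the binomial upper tail, which solves the same equation as BETAINV ( 0.5 , Rank , SampleSize - Rank + 1 ).

```python
from math import comb

def median_rank_benard(rank, n):
    """Approximate median rank: (rank - 0.3) / (n + 0.4)."""
    return (rank - 0.3) / (n + 0.4)

def median_rank_exact(rank, n, tol=1e-12):
    """Exact median rank, the value BETAINV(0.5, rank, n - rank + 1) returns.

    Uses the identity: beta CDF at F with parameters (rank, n - rank + 1)
    equals P(at least `rank` successes in n trials with success chance F).
    """
    def upper_tail(f):
        return sum(comb(n, k) * f**k * (1 - f)**(n - k)
                   for k in range(rank, n + 1))

    lo, hi = 0.0, 1.0
    while hi - lo > tol:          # upper_tail() increases with f
        mid = (lo + hi) / 2
        if upper_tail(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example: sample size 10 (ranks 1..10), printed as percentages
n = 10
for rank in range(1, n + 1):
    print(rank,
          round(100 * median_rank_benard(rank, n), 1),
          round(100 * median_rank_exact(rank, n), 1))
```

For a sample size of 10, the approximate formula gives 6.7% and 16.3% for ranks 1 and 2.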
Reliability Plotting
requires a transformation that gives a straight line (by Linear
Regression) that can be extrapolated to the Specification value.
The "confidence limit" (the hollow triangle) on the
extrapolated point (the solid triangle) can be calculated by
a simple but long formula found in advanced texts.
It is provided in modified form on the next slide.
[Figure: Transformed "F" vs. Transformed DATA, showing the Transformed Specification and the Transformed "F" at 95% Confidence]
Formula for calculating the 1-sided confidence limit on the
plotted Y-value at a single point on a linear regression line
(i.e., the transformed Y-value for hollow triangle on previous slide)
(note: The generalized linear regression equation is... Yei = a + b Xi )
= YSL +/- t x See x [ ( 1 / N ) + ( ( XSL - Xavg )^2 ) / ( Sum ( ( Xi - Xavg )^2 ) ) ]^0.5
where...
YSL = Y-axis transformed F value corresponding to the Specification Limit
(i.e., transformed Y-value for solid triangle on the chart on previous slide)
+/- = use + if out-of-specification is below the spec limit; otherwise use -
t = one-sided, t-Table value at alpha = 1 - Confidence, and df = N - 2
using Excel, t = TINV ( 2 * ( 1 - Confidence ) , ( N - 2 ) )
See = Std Error of Estimate = [ ( Sum ( ( Yei - Yi ) ^2 ) ) / ( N - 2 ) ] ^0.5
Yi = transformed plotted F value corresponding to a plotted Xi
N = number of X,Y points plotted on the chart (not the same as sample size)
XSL = transformed Specification Limit
Xavg = average of the transformed X values of the plotted X,Y points
(in some cases, this is not the average of the transformed raw data)
(do not include the specification limit in this average)
Xi = each of the transformed X values of the plotted X,Y points
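The confidence-limit formula above can be sketched in Python as follows. The function name and the example points are hypothetical. The t value is passed in from a t-table, because Python's standard library has no t-distribution; 2.353 is the one-sided 95% value for df = 3.

```python
from math import sqrt

def regression_conf_limit(xs, ys, x_sl, t_value, sign=+1):
    """Return (Y_SL, confidence limit on Y_SL) for a fitted straight line.

    xs, ys  : transformed X,Y points plotted on the chart
    x_sl    : transformed Specification Limit
    t_value : one-sided t-table value for df = N - 2
    sign    : +1 or -1, chosen per the "+/-" rule in the text
    """
    n = len(xs)
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    sxx = sum((x - x_avg) ** 2 for x in xs)
    b = sum((x - x_avg) * (y - y_avg) for x, y in zip(xs, ys)) / sxx
    a = y_avg - b * x_avg
    # Std Error of Estimate: residuals of fitted Yei = a + b*Xi against Yi
    see = sqrt(sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    y_sl = a + b * x_sl
    half_width = t_value * see * sqrt(1 / n + (x_sl - x_avg) ** 2 / sxx)
    return y_sl, y_sl + sign * half_width

# Hypothetical transformed points, extrapolated to a transformed spec of 0.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]
print(regression_conf_limit(xs, ys, x_sl=0.0, t_value=2.353, sign=+1))
```

The further XSL lies from Xavg, the wider the limit becomes, which is why long extrapolations carry more uncertainty.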
Examples of useful X-axis Transformations
( using Excel Formulas )
In the formulas below, A, B, D, and E are constants (negative or positive,
whole numbers or fractional) chosen to help linearize the reliability plot.
= 1/X
= SQRT ( X )
= ASINH ( SQRT ( X ) )
= SQRT ( X + A )
= ((X^B)-1)/B << this is called the Box-Cox Transformation
= LN ( X + D )
= 1/(X+E)
The following can be used only with X values between 0 and 1:
= LN ( X / ( 1 - X ) ) << this is called the Logit Transformation
= ASIN ( SQRT ( X ) )
= 0.5 * LN ( ( 1 + X ) / ( 1 - X ) ) << this is called the Fisher Transformation
Examples of useful Y-axis Transformations
( using Excel Formulas )

In the formulas below, F = the calculated Median Rank and
C = the user-chosen shape parameter constant
** = formula has been standardized by setting other
shape parameters to a value of 1.000
= NORMSINV ( F ) << this is the Normal (Z-table) transformation
= NORMSINV [ 1 - ( 1 - F ) ^ ( 1 / C ) ] << Power Normal
= EXP ( NORMSINV ( F ) ) << Three Parameter LogNormal **
= EXP ( NORMSINV [ 1 - ( 1 - F ) ^ ( 1 / C ) ] ) << Power LogNormal **
= LN ( LN ( 1 / ( 1 - F ) ) ) << Smallest Extreme Value
= LN ( 1 / ( 1 - F ) ) ^ ( 1 / C ) << Weibull ** = Exponential when C = 1
= LN ( F / ( 1 - F ) ) << Logistic
= LN ( 1 / ( LN ( 1 / F ) ) ) << Largest Extreme Value
= TAN ( PI() * ( F - 0.5 ) ) << Cauchy distribution
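Several of the Y-axis transformations above can be sketched in Python as follows. The function names are hypothetical. NormalDist().inv_cdf from the standard library plays the role of Excel's NORMSINV, and F must lie strictly between 0 and 1.

```python
from math import log, tan, pi
from statistics import NormalDist

_norm_s_inv = NormalDist().inv_cdf   # stands in for Excel's NORMSINV

def y_normal(f):
    return _norm_s_inv(f)

def y_power_normal(f, c):
    return _norm_s_inv(1 - (1 - f) ** (1 / c))

def y_smallest_extreme_value(f):
    return log(log(1 / (1 - f)))

def y_weibull(f, c):
    # Same precedence as the Excel formula: (LN(...)) ^ (1/C)
    return log(1 / (1 - f)) ** (1 / c)

def y_logistic(f):
    return log(f / (1 - f))

def y_largest_extreme_value(f):
    return log(1 / log(1 / f))

def y_cauchy(f):
    return tan(pi * (f - 0.5))

# Example: transform a few median ranks for plotting
for f in (0.067, 0.5, 0.933):
    print(f, round(y_normal(f), 3), round(y_logistic(f), 3),
          round(y_smallest_extreme_value(f), 3))
```

Each function returns the Y value to plot against the (possibly transformed) X value; the pairing that gives the straightest line is the one to extrapolate.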
Reliability Plotting
Reliability plotting allows calculation of confidence and
reliability based on either...
-- small sample sizes
-- unfinished experiments
-- data that can't be normalized
-- data from 2 different populations
-- data with many duplicates
as we shall see next...
(Callout on slide: "Not possible with K-tables")
[Figure: Transformed "F" vs. Transformed DATA]
Actual data from presenter's client...
continued from previous slide...
In reliability statistics textbooks, a plot like this, or one
that is not even as straight as this, is sometimes shown as an
example of a "Normal" distribution; but...
even tho this data does pass the best tests for
Normality (Anderson-Darling A2*, Cramer-von Mises W2*,
and Shapiro-Francia W' ), with test p-values all > 0.425, ...
and even tho the correlation coefficient is very high...
this plot is slightly curved; and therefore this data is not truly
normal (it is almost Normal).
Is "almost" good enough?
[Figure caption: This is the Excel equivalent of a Normal Probability Plot
(data is Normal if it shows as a straight line on this plot).]
continued from previous slide...

The "inverse" ( = 1 / X ) transformation gives a much
straighter line on "Normal Probability Plotting" paper, and so
the distribution is "Inverse Normal" rather than "Normal"
Using the 12-pt data set from the previous slide...
Using Reliability Plotting to extrapolate transformed data to the
transformed spec ( 1 / X = 1 / 5.5 = 0.1818 ), we have this result:
Z(F) = 4.15 = 0.002% failure
rate = 99.998% reliability at 95%
confidence (solid triangle is the
extrapolated value; the hollow
triangle is the upper 1-tailed 95%
confidence limit). By comparison,
Normal K-tables yielded slightly
less than 99.9 % reliability.
[Figure: Y-axis = Z(F)]
In John Zorich's view,
Reliability Plotting is more
accurate than Normal
K-tables. In this case, we
obtained a "better" result;
but the reverse may occur
on a different data set.
Reliability Plotting: EXACT vs. INTERVAL

If you have replicate measurements in your data set
(e.g., 4 data points each = 0.35), and if you plot
each of the individual exact data points, the
resulting line may be inappropriate. The reason it
may be inappropriate is that "Linear Regression"
(which is the mathematical tool used in Reliability
Plotting) does not perform well when replicates are
present (especially when the replicates are near
one or the other of the ends of the straight line).
Instead of using your individual "exact" data, it may
be better to pool identical values (or very similar
values) into groups ("intervals"), the way you do for
a histogram. Then calculate the % cumulative for
each of the cumulative groups.
Reliability Plotting: EXACT vs. INTERVAL
EXACT %F
0.35 6.7%
0.35 16.3%
0.35 26.0%
0.35 35.6%
0.68 45.2%
1.22 54.8%
1.56 64.4%
1.91 74.0%
2.17 83.7%
Reliability Plotting: EXACT vs. INTERVAL
(%F and %Cumulative are calculated differently)

EXACT   %F       INTERVAL   %Cumulative
0.35    6.7%     0.35       40%
0.35    16.3%    0.68       50%
0.35    26.0%    1.22       60%
0.35    35.6%    1.56       70%
0.68    45.2%    1.91       80%
1.22    54.8%    2.18       99.9999..%
1.56    64.4%
1.91    74.0%
2.17    83.7%
In cases such as this, plotting INTERVAL values will produce a
more accurate line than plotting EXACT values, but you may need to
"censor" the largest value.
See next slide for this data plotted.
Reliability Plotting: EXACT vs. INTERVAL
(continued from previous slide)
Burst strength ( actual data !!)

60 devices tested; the minimum spec was 0.40 psi.
The (sorted) raw data was...
Burst strength
Using Reliability Plotting.xls, using Z(F) vs. X(untransformed)
(this is equal to Normal Probability Plotting paper):
Because this data does NOT form a straight line on NPP
paper, it's not valid to use K-factor tables.
Burst strength
Also, notice that the data includes
many replicate values...
Burst strength
...and notice that on a basic cumulative plot
( = F(untransformed) vs. X(untransformed) )
the data seem to include 2 different populations:
A single population would look like a smooth "S" curve.
This one has a break or corner in it, indicating a dual population.
To use Reliability Plotting, must "censor" these data (they appear
as shoulder on a line chart -- see next slide).
(continued from previous slide)

Mixtures of distributions appear as
bi-modal frequency distributions, or as a single mode with a
shoulder, like this:
[Figure: Frequency Distribution, with the "Shoulder" marked (the data that
must be "censored", to use Reliability Plotting)]
Burst strength
Here is how to convert the ( n = 60 ) data, with
its many replicate values, into interval data:
48 -- 2 = 2/60 = 3.3 %
49 -- 1 = 3/60 = 5.0 %
51 -- 1 = 4/60 = 6.7 %
52 -- 1 = 5/60 = 8.3 %
53 -- 4 = 9/60 = 15.0 %
55 -- 1
56 -- 4
57 -- 1
58 -- 1
59 -- 3
60 -- 1
61 -- 7
etc.
For "Interval" plots, must use % cumulative, not "F".
Do NOT plot zero values (e.g., do not plot 54 -- 0).
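The exact-to-interval conversion above can be sketched in Python as follows. The function name is hypothetical, and only the first few value/count pairs from the slide are used.

```python
def interval_cumulative(counts, n_total):
    """Convert value -> count pairs into (value, cumulative count, % cumulative).

    Values with a count of zero are skipped, since they are not plotted.
    """
    rows = []
    running = 0
    for value in sorted(counts):
        if counts[value] == 0:
            continue
        running += counts[value]
        rows.append((value, running, 100 * running / n_total))
    return rows

# First rows of the n = 60 burst-strength data (54 had no observations)
counts = {48: 2, 49: 1, 51: 1, 52: 1, 53: 4, 54: 0}
for value, cum_count, pct in interval_cumulative(counts, n_total=60):
    print(f"{value} -- {cum_count}/60 = {pct:.1f} %")
```

Note that the divisor stays at the full sample size (60) even when some values are later censored from the plot.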
Burst strength
If convert from "exact" to "interval" AND censor (i.e., do not
plot) values above 0.8 psi, then...
95% confidence at 99.8707 % Reliability
using Log(X) vs. Z(%cum) (i.e., LogNormal)
Important !! This pairing IS the best straight line but does NOT
have the highest CC.
Statistical Analysis
of Gages
Calibration, Metrology, &
Measurement Uncertainty
Regulatory Requirements...
ISO 9001 & 13485 "The organization shall determine the
monitoring and measurement to be undertaken and the
monitoring and measuring devices needed to
provide evidence of conformity of product to determined
requirements."
MDD: Annex II, V, + VI "Application of the quality system must
ensure that the products conform to the provisions of this
Directive which apply to them at every stage, from design to
final inspection....It shall include in particular an adequate
description of...the test equipment used; it must be possible to
trace back the calibration of the test equipment
adequately."
Question: What does that word "adequately" mean?
The combination of calibration records AND the process of
choosing calibrated equipment must "provide evidence of
conformity". If the wrong instrument is chosen, it provides no
evidence of conformity, even if it is calibrated.
Vocabulary
ACCURACY is defined using the mean of several
measurements. Subtracting that mean from the "true
value" gives the Inaccuracy or Bias. Divide Inaccuracy
by the "true value", then multiply by 100, to yield the
"% Inaccuracy".
Commonly, accuracy is assessed by taking only a
single measurement; and if that measurement is within
the tolerance allowed, then the instrument is said to be
within tolerance. Calibration vendors use N = 1, i.e., a
single measurement, unless explicitly told not to.
That "N=1" may NOT be a good thing,
because accuracy cannot be determined
accurately without taking multiple readings !!
Vocabulary
PRECISION (also called repeatability, especially in
the specification section of measurement equipment
owner's manuals) is assessed by taking several
measurements of the same item, and calculating
their standard deviation (= the Imprecision).
Divide Imprecision by the "true value", then multiply
by 100, to yield the "% Imprecision".
Typically, calibration vendors do not check for
precision, unless you explicitly tell them to.
This may NOT be a good thing, if product
pass/fail decisions are based upon a single
measurement (precision tells you how
reliable a single measurement is) !!
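The Inaccuracy and Imprecision calculations defined above can be sketched in Python as follows. The function name and the five readings are hypothetical, made up only to show the arithmetic.

```python
from statistics import mean, stdev

def accuracy_and_precision(readings, true_value):
    """Return Inaccuracy (bias), Imprecision (std dev), and both as % of true value."""
    inaccuracy = true_value - mean(readings)   # per the definition in the text
    imprecision = stdev(readings)              # sample standard deviation
    return {
        "inaccuracy": inaccuracy,
        "pct_inaccuracy": 100 * inaccuracy / true_value,
        "imprecision": imprecision,
        "pct_imprecision": 100 * imprecision / true_value,
    }

# Hypothetical: five repeated readings of a standard whose true value is 10.0
result = accuracy_and_precision([9.8, 10.1, 9.9, 10.0, 9.7], true_value=10.0)
for name, value in result.items():
    print(name, round(value, 3))
```

With a single reading (N = 1) the standard deviation cannot be computed at all, which is the point the slides make about vendors who take only one measurement.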
Vocabulary
RESOLUTION is equivalent to the number of digits you can
read on an instrument, e.g.
an instrument that can output this measurement
2.71634 inches
has "higher resolution" than one that can output only this
2.716 inches
Unfortunately, there is no reliable relationship between
resolution and accuracy and/or precision !!
However, when there is no other info available about
measurement uncertainty, the standard deviation of repeated
measurements is estimated as follows:
The width of the smallest readable unit (e.g., 0.00001
and 0.001 in the examples above) divided by 3.464
(3.464 = the square root of 12)
NOTE: some micrometers read only 0 or 5 in the last digit, in
which case, the smallest readable unit is not 1 as in
Uncertainty

There is always some uncertainty as to the degree
with which a sample represents the population from
which it was drawn.
That type of uncertainty cannot be reduced by
anything other than larger sample sizes.
In addition to that uncertainty, there is
uncertainty caused by the measurement process
itself.
The science of "Metrology" aims to quantify,
control, and reduce the uncertainty caused by
the measurement process.
Some Tools of Metrology are...
Calibration
Assesses & corrects accuracy & precision
Gage R&R
Assesses precision only
Gage Correlation
Assesses accuracy only, vs. another device/person
Gage Linearity & Bias
Assesses accuracy only, vs. a gold std
Gage Stability (i.e., instability) Studies
Assesses systematic drift in accuracy over time
(Not discussed in this seminar)
Uncertainty Budgets
Summarizes available measurement uncertainty info,
and then suggests how QC specs should be modified.
Calibration concerns

Before setting a design specification, a company should
decide on a desired relationship between specification
tolerance & measurement equipment accuracy, e.g. ...
Product tolerance specification = target +/- 4 units
Equipment calibrated accuracy = nominal +/- 1 unit
This situation above ( a ratio of 4 : 1 ) is generally
considered acceptable, since it mimics the practice that
ISO 17025 mandates for Calibration companies.
Calibration concerns
SOME MEDICAL DEVICE COMPANIES UNKNOWINGLY HAVE
RATIOS OF 1 : 1 , FOR EXAMPLE...
Product tolerance specification = 100 +/- 4 units
Equipment calibrated accuracy = 100 +/- 4 units
EXAMPLE OF WORST-CASE SCENARIO WITH THAT RATIO
Calibration data: NIST-traceable standard of 100 reads 96
(= meets requirement, but equipment is reading 4 units low).
Equipment is then used to measure product;
if result is 104, it passes spec, but "true" value is really 108.
Thus, product should be rejected but instead is passed.
Gage R&R
A "Gage R&R" study quantitates measurement uncertainty
that is due to the combination of the instrument & users.
Typically, the output is the width of the interval that includes
the middle 99% of the "Normal" distribution of individual
measurements. Let's call that the "Uncertainty Interval".
In an R&R study, we primarily identify the...
Total Variation uncertainty interval (from all causes)
Repeatability uncertainty interval (this is uncertainty
caused by the inability of a measurement instrument
[i.e., the gage] to produce the same measurement result
when used repeatedly to measure the identical part)
Reproducibility uncertainty interval (caused by the
inability of different users of the same gage to produce the
same result when measuring the identical part)
This is an example of a data input table for a simple Gage R&R
study (a complicated one involves more than one gage).
Typically, analysis of this data requires a computer program
capable of the quite-involved Gage R&R calculations.
This shows that gages+people (i.e., repeatability+reproducibility)
cause variation that consumes 12.98 / 40 = 32.5 % of the QC
Spec Interval. In this case, re-training and/or standardizing
personnel practices will help only a little to decrease that %.
To decrease variation significantly, we need to buy better gages
(gages caused most of this R&R variation, i.e., 32.2 % vs. 4.3 % ).
Gage R&R using Excel's Data Analysis Add-in Option,
on the DATA tab: ANOVA: Two Factor With Replication
Reproducibility (99%) =
5.15 x StdDev( 4.63, 4.42, 4.65 )
Repeatability (99%) = 5.15 x sqrt [ ( 578.76 - 57.87 - 0.94 ) / ( 89 - 9 - 2 ) ]
Gage R&R (99%) = sqrt ( Reproducibility^2 + Repeatability^2 )
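The three calculations above can be sketched in Python using the numbers shown on the slide. The variable names are hypothetical, and the slide does not label what the three averaged values or the individual sums of squares represent, so the names below are assumptions. The subtraction signs are read into the Repeatability formula, where the extracted text shows only gaps.

```python
from math import sqrt
from statistics import stdev

K_99 = 5.15   # multiplier for the interval holding the middle 99% of a Normal distribution

# Values taken from the slide; what each one represents is assumed, not stated there
group_values = [4.63, 4.42, 4.65]
ss_total, ss_rows, ss_columns = 578.76, 57.87, 0.94
df_total, df_rows, df_columns = 89, 9, 2

reproducibility = K_99 * stdev(group_values)
repeatability = K_99 * sqrt((ss_total - ss_rows - ss_columns)
                            / (df_total - df_rows - df_columns))
gage_rr = sqrt(reproducibility ** 2 + repeatability ** 2)

print("Reproducibility (99%):", round(reproducibility, 2))
print("Repeatability (99%):  ", round(repeatability, 2))
print("Gage R&R (99%):       ", round(gage_rr, 2))
```

Because the two intervals are combined as a root-sum-of-squares, the larger of the two dominates the Gage R&R result.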
Gage Correlation
A "Gage Correlation" study typically is used to compare
measurements of identical parts by 2 different companies ---
for example, by the Supplier of the part, and their Customer.
One practical use is to validate that the Supplier gets the "same"
answer as the Customer, and thus the Customer justifies using
Supplier-provided QC data rather than the Customer having itself
to perform QC (the part could then justifiably go "dock to stock").
In a Gage Correlation study, we identify the...
Linear Regression relationship between the measurements
by the 2 companies
Correlation Coefficient for that linear regression
Offset Values that could be used to "correct" any identified
differences in measurement between the 2 companies.
Such a study could also evaluate R&D vs. Pilot Production, or
Pilot Production vs. Manufacturing, in the same company.
Gage Correlation
This is an example of a data
input table for a simple
Gage Correlation study
(a complicated one would
involve 3 or more gages).
In this case, each "Set #" is
a unique part to be
measured by "Gage # 1"
at one company
(or department), and
"Gage # 2" at the other
company (or department).
Altho not identical, the
measurements from the 2
companies look to be
linearly related and to be
highly correlated.
For Gage #1 to read like Gage #2,
multiply each Gage #1 result by
0.998 and then subtract 4.7732
from that result
(based on equation shown on
previous slide).
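A Gage Correlation calculation of this kind can be sketched in Python as follows. The function names and the paired readings are hypothetical; only the slope 0.998 and the offset 4.7732 come from the slide.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares line y = slope * x + intercept, plus the correlation coefficient."""
    n = len(x)
    x_avg, y_avg = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_avg) ** 2 for xi in x)
    syy = sum((yi - y_avg) ** 2 for yi in y)
    sxy = sum((xi - x_avg) * (yi - y_avg) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_avg - slope * x_avg
    return slope, intercept, sxy / sqrt(sxx * syy)

def gage1_as_gage2(reading, slope=0.998, offset=4.7732):
    """Apply the slide's correction: multiply by the slope, then subtract the offset."""
    return slope * reading - offset

# Hypothetical paired readings of the same parts ("Set #") by the two gages
gage1 = [100.0, 120.0, 140.0, 160.0, 180.0]
gage2 = [95.1, 114.9, 135.0, 154.8, 175.0]
print(fit_line(gage1, gage2))
print(gage1_as_gage2(100.0))
```

In practice the fitted slope and intercept from the study would replace the default values in gage1_as_gage2.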
Gage Linearity
A "Gage Linearity" study typically is used to evaluate
performance over a wide range of values (e.g., over the entire
range of values that the gage is capable of measuring).
In a Gage Linearity study, we...
Use "gold standards" or any parts for which we believe we
know the "true" answers very accurately
Make repeated measurements of the gold standards, using
a single on-test gage
Graph the "error" (a.k.a., "bias") for each measurement
(i.e., how far off each measurement is from the "true" value)
Determine if the error is statistically significantly different from
0.000 (i.e., if there is no error, then there is no bias)
If the gage is found to have (statistically) "no" bias thruout its
tested range, we say the gage has acceptable "linearity" in
that range.
Gage Linearity
This is an example of a data input table for a Gage Linearity
study. The "gold standards" could be very accurate gage blocks.
Consider the curved lines to be 95% confidence limits on the
sample result avgs, in the measurement range of 2 thru 10
(formulas for such limits are found in advanced stat books).
Because the horizontal "Y = 0.000 Bias" line is fully contained
by the solid curved confidence interval lines, we conclude that
this gage has acceptable linearity in the range of 2 thru 10.
[Figure: Bias ( = error ) over the measurement range, with the Y = 0.000 Bias line and the confidence limits]
Gage Bias
A "Gage Bias" study is, in effect, a one-point Gage Linearity
Study, in which is used either a gold standard ("Reference")
calibrator or a gold standard ("Reference") gage. The difference
between the on-test gage and either the gold-standard gage or
the gold-standard calibrator is considered the "bias".

The virtue of a Gage Bias
study is that it is simple &
quick --- it uses one person,
one (on-test) gage, and one
part measured several
times at one sitting. It's
analogous to a one-point
calibration.
See output of this study, on
the next slide >>
Gage Bias

Upper & Lower conf. limits of sample mean,
calculated with t-tables per any intro stat book
Gage Linearity vs. Bias vs. Calibration
[Figure: Y-axis = Value out-putted by On-Test gage; X-axis = True Value (= calibration standard or reference gage)]
Uncertainty is known in this range only if all 3 levels are evaluated or calibrated.
Uncertainty is known here only if this 1 level is evaluated or calibrated;
1-point calibration is OK if QC, Mfg, R&D, etc. measure only there.
In either case, nothing is known of uncertainty for any point in
this range --- 0.000 is a given (a "tare point"), not a
calibration pt.
Uncertainty Budget
Estimate the standard deviation of uncertainty for each of
the uncertainty sources for which you have information, e.g.,
for a 99% interval Gage R&R...
StdDev = GageR&R interval divided by 5.15
for a calibration tolerance (e.g., "Mfg's specs")...
StdDev = tolerance interval divided by 4.00
for uncertainty in the calibration calibrator (typical)...
StdDev = calibration tolerance divided by 16.000
other (e.g., gage instability)...??
Square each StdDev, sum them, take square root of sum.
Multiply that by factor for interval you wish to calculate (e.g.,
for 99%, factor = 5.15; for 95%, factor= 3.92 ); the result is
called the "Expanded Uncertainty Interval"
Divide the width of product's spec interval by that interval.
There is general agreement in industry, based on ISO &
NIST recommendations, that if that ratio is less than 4.00,
for a 95% interval, then the measurement equipment or
measurement process is NOT suitable for given product.
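The Uncertainty Budget steps above can be sketched in Python as follows. The function names are hypothetical. The 12.98 Gage R&R interval and the spec interval of 40 come from the earlier Gage R&R slide, while the calibration tolerance interval of 2.0 is an invented value. Applying that same tolerance to the "calibrator" line (divided by 16) is an assumption about what the slide intends.

```python
from math import sqrt

def expanded_uncertainty(grr_interval_99, cal_tolerance_interval, factor=3.92):
    """Combine the uncertainty sources and return the Expanded Uncertainty Interval.

    factor = 3.92 gives a 95% interval; factor = 5.15 gives a 99% interval.
    """
    std_devs = [
        grr_interval_99 / 5.15,         # 99% Gage R&R interval
        cal_tolerance_interval / 4.00,  # calibration tolerance (e.g., "Mfg's specs")
        cal_tolerance_interval / 16.0,  # uncertainty in the calibration calibrator
    ]
    combined = sqrt(sum(s ** 2 for s in std_devs))  # square, sum, square root
    return factor * combined

def spec_to_uncertainty_ratio(spec_width, expanded_interval):
    return spec_width / expanded_interval

interval_95 = expanded_uncertainty(grr_interval_99=12.98, cal_tolerance_interval=2.0)
ratio = spec_to_uncertainty_ratio(40.0, interval_95)
print(round(interval_95, 2), round(ratio, 2),
      "suitable" if ratio >= 4.00 else "NOT suitable")
```

Other sources (e.g., gage instability) would simply add more entries to the std_devs list before the root-sum-of-squares step.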
Graphical Summary of the Problem of &
Solution to Measurement Uncertainty
Design specification range
95% or 99% Expanded Uncertainty interval (based upon whatever is
included in the "Uncertainty Budget")
This guard-banded specification range has the advantage that
any single measurement that falls within it is guaranteed to fall
within the design specification range ( 95% or 99% probability).
Without "guard banding", the actual range being used to
pass/fail measurement results is this "expanded specification".
What is acceptable, if the goal is to...
"provide evidence of conformity of product to
determined requirements" per ISO 9001 & 13485 ?

There is no regulation or official guidance document that
discusses uncertainty budgets, expanded specifications, and
guard-banding (e.g., ISO 14969 says only that the
documented procedure should include details of equipment
type, unique identification, location, frequency of checks,
check method, and acceptance criteria).
Therefore, ISO, CE, & FDA auditors have no firm basis on
which to force companies to implement metrology policies.
Therefore, it is up to the company whether or not its product
is QC'd vs. the expanded specification interval (which is
always wider than the design-based specification interval).
Classic
QC Sampling Plans
(and their alternatives)
Standards & Regulations
ISO 9001:2008 + ISO 13485:2003 8.1:
"[Mfg] shall...implement...analysis...processes needed to
demonstrate [product/process] conformity....This shall
include determination of applicable...statistical techniques".
FDA's "GMP" (21CFR820.250) (re: medical devices):
"Sampling plans...shall be...based on a valid statistical
rationale... Each manufacturer shall...ensure that sampling
methods are adequate for their intended use."
FDA's "Medical Device Quality Systems Manual"
"...all sampling plans have a built-in risk of accepting a bad
lot. This sampling risk is typically determined in quantitative
terms by deriving the 'operating characteristic curve'
[which]...can be used to determine the risk a sampling plan
presents. A manufacturer should be aware of the risks the
chosen plan presents....A manufacturer shall be prepared to
demonstrate the statistical rationale for any sampling plan
used."
US Dept of Defense MIL-STD 1916
Basic Types of Sampling Plans
In an attribute sampling plan, "quality" is
measured by the observed % of the sample that
meets specification.
In a variables sampling plan, "quality" is
measured by the estimated % of the population
that meets specification (based upon Sample
Mean & either Sample Range or Std Deviation, &
assuming data Normality; see statement of normality
requirement, in ANSI/ASQC Z1.9-1993, pp. 2-3).
Only attribute sampling plans are discussed in this class,
because they are currently still the dominant ones used
in the medical device industry (see next slide).
Information collected by John Zorich:
Virtually 100% of U.S. medical device companies use
AQL Attribute sampling plans for their IQC inspections
( IQC = Incoming or Receiving Quality Control )
less than 1% use a "variables" sampling plan
less than 1% use an LQL sampling plan.
That conclusion is based upon John Zorich's history of...
full-time quality-system (& statistical) consulting, 1999-2013
working halftime as an auditor for European ISO / Notified-Body
registrars, TUV and KEMA/DEKRA, 2000-2013
performing more than 500 quality-system audits at more
than 200 medical-device companies in USA, 1999-2013
Attribute Sampling Plans
An attribute sampling plan is a written procedure for...
choosing a fraction of an incoming lot
(the fraction = the sample)
deciding on the acceptability of the entire lot
based on the observed quality of the sample (the
lot "passes" if the number of defects or defective
parts is not more than the " C " = "acceptance
number" that is allowed by the plan)
Sampling-plan-use involves a RISK of approving a
bad lot (a risk to end-user customer, possibly).
Attribute Sampling Plans
AQL stands for Acceptable Quality Level or "Acceptance
Quality Limit". The %AQL of an AQL sampling plan is the
product quality ( = lots having that % defective) which the
sampling plan will approve almost all the time (there is no
generally accepted numerical definition of %AQL).
%AQL = " I am happy with AQL% defective "

LQL stands for Limiting Quality Level or "Lower Quality
Limit". The %LQL of an LQL sampling plan is the product
quality ( = lots having that % defective) which the sampling
plan will reject almost all the time (there is no generally
accepted numerical definition of %LQL).
%LQL = " I'm not happy with LQL% defective"
99% of U.S. med device companies that
do IQC inspection use one of these two
plans:
ANSI/ASQC-Z1.4 = ISO 2859-1 = MilStd105E
AQL attribute sampling plan, widely used because of its
explicit endorsement by the FDA, in its Medical Device
Quality Systems Manual: "[Sampling] Plans should be
developed by qualified mathematicians or statisticians, or be
taken from established standards such as ANSI Z1.4"
The plan's stated purpose "is not intended as a procedure
for estimating lot quality or for segregating lots"...but rather
to "induce a supplier to maintain a process average...[and to
control ] consumer's risk...."
Squeglia's "Zero Acceptance Number Sampling Plans"
AQL attribute sampling plan, widely used in industry,
because of its smaller sample sizes & implicit endorsement
by ASQ (it's published by the official ASQ Quality Press).
Classic (AQL Attribute)
QC Sampling Plans
(are they worth the effort?)
What people say about why they use
traditional AQL sampling plans is...
"FDA / ISO auditors won't ask any challenging questions."
That is true, for field auditors (= untrained in statistics).
PMA / CE auditors and their staff statisticians are much
more statistically savvy, and have been known to ask you
to justify your sampling plans, based on risk analysis, for
critical parts (e.g., implant components).

"Such plans provide statistical assurance that...


1. suppliers provide consistently high quality product,
2. we are not accepting low quality product, &
3. our Parts Storeroom has a known quality level."

Let's now examine those 3 claims...


ANSI Z1.4
" Zero Acceptance Number Sampling Plans "
( by N. L. Squeglia, 4th ed.)
Attribute Sampling Plans
For lots of a given part #, when inspected using a given
sampling plan, the % of lots (not the % of parts) that meet
specification is called the "Pass Rate".

The Pass Rate for a sampling plan is always...
( #1 ) high for good lots ( = have low % defectives)
( #2 ) low for bad lots ( = have high % defectives)
( #3 ) intermediate for lots of intermediate quality.
Attribute Sampling Plans

The manner in which lot quality and lot size affect the
Pass Rate is described by 2 types of...
Operating Characteristic curves = OC curves
In this presentation, those 2 types of curves are called...
% Defective OC Curves
and
Lot Size OC Curves
examples of each are shown on upcoming slides...
Predicting Pass Rates
This is the typical OC CURVE found in text books, i.e., Lot %
Defective vs. Pass Rate
[Figure: % Defective OC curve for a 4% AQL sampling plan (ANSI Z1.4);
N = 1000, n = 80, c = 7]
OC curves "describe the long run behavior of a sampling plan.
They do not tell the user what can be said about a
particular lot that has just been accepted or rejected."
D. J. Wheeler, 2006 in EMP III, pg. 152
Predicting Pass Rates

[Figure: % Defective OC curve for a 4% AQL, C=0 sampling plan
(Squeglia's 4th edition); N = 1000, n = 15, c = 0]
How can such variations in "4% AQL" pass rates ensure that
"Suppliers provide consistently high quality product "?
Which "4% AQL" plan should be used?
In order to focus on consumer risk (per FDA & Squeglia),
need to focus on this LQL point.
LQL sampling plans focus on % defective that will be rejected
almost all the time.
[Figure: OC curves for N = 1000, n = 80 or 15, c = 7 or 0]
Attribute Sampling Plans
Probability of Acceptance of a Single Lot from a
Sequence of Lots from a Stable Process
(MSExcel function)
=binomdist(C,S,F,True)
C = Number of defectives allowed in the sample
S = Sample size
F = Fraction of lot that is defective
True = tells the program to add up the probabilities for
0, 1, 2, 3,.... thru to C.
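The BINOMDIST calculation above can be sketched in Python as follows, using only the standard library. The function name is hypothetical. The two plans compared are the "4% AQL" plans shown on the OC-curve slides ( n = 80, c = 7 and n = 15, c = 0 ).

```python
from math import comb

def pass_rate(c, s, f):
    """Probability of acceptance; same quantity as =binomdist(C,S,F,True).

    c = number of defectives allowed in the sample
    s = sample size
    f = fraction of the lot that is defective
    """
    return sum(comb(s, k) * f**k * (1 - f)**(s - k) for k in range(c + 1))

# Points on the % Defective OC curves of the two "4% AQL" plans
print("% defective   n=80,c=7   n=15,c=0")
for pct in (1, 2, 4, 5, 10, 15, 20):
    f = pct / 100
    print(f"{pct:>11}   {100 * pass_rate(7, 80, f):8.1f}   {100 * pass_rate(0, 15, f):8.1f}")
```

Plotting pass rate against % defective for a range of f values reproduces the shape of a % Defective OC curve.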

(continued on next slide)


(free) "Self-made Sampling Plans.xls"
Control of Sampling Plan's Consumer Risks

It is NOT possible to use an AQL% to explain a "valid
statistical rationale" for a sampling plan.
The only way to achieve a "valid statistical rationale" for
classic sampling plans is to...
review the Risk Management documents (e.g., FMEA),
to determine if "IQC" processes have been identified as
being a "mitigation"; if they have, then...
choose a sampling plan whose LQL supports Risk-Management
statements such as... "In IQC, mitigation will
involve using a sampling plan that ensures that
component lots that are 1% or more defective are
rejected approximately 90% or more of the time".
If Risk Management docs do not identify IQC inspection as
mitigating a product or process risk, then it is reasonable to
conclude that IQC does not pose any risk to the end-user
(e.g., patient or doctor); in that case, only "business risks" are
important, and therefore any sampling plan is "valid".
Suppose that your Risk Management docs state that
"consumer risk" is not acceptable if component lots are
more than 5% defective, & claim that the IQC AQL attribute
sampling plan shown below provides "mitigation" so that
such lots are rejected 95% of time. Is that claim true?

[Figure: % Defective OC curve for a 4% AQL sampling plan;
N = 1000, n = 80, c = 7]

NOT TRUE, because in order to ensure that "We are not
accepting low quality product", a 4 or 5% LQL sampling plan
should be used, not this 4% AQL one.

However, even with LQL plans, we don't know exactly what
level of "statistical confidence" we can claim for a given
lot being inspected.
Do AQL plans control consumer risk
consistently?

ASQC-Z1.4, general, level II, single, normal, 4% AQL

Lot Size   Sample Size   "C"   Pass Rate if lot is 5% Defective
20         3             0     85 %
100        20            2     95 %
1,000      80            7     96 %

This shows a consistent approval rate of about 90%;
but "5% defective" is at the AQL top of the OC curve
(where "supplier risk" is controlled), & so is irrelevant to
control of "consumer risk".
Do AQL plans control consumer risk
consistently?

ASQC-Z1.4, general, level II, single, normal, 4% AQL

Lot Size   Sample Size   "C"   Pass Rate if lot is 15% Defective
20         3             0     60 %
100        20            2     38 %
1,000      80            7     6 %

This shows an inconsistent approval rate at the LQL bottom
of the OC curve (where "consumer risk" is controlled).
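The pass rates in the two tables above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the method used to build the slides: it assumes each lot contains exactly (lot size x % defective) defective parts, rounded to a whole number, and that the sample is drawn without replacement (a hypergeometric model). The function name is hypothetical, and results may differ slightly from the slide values.

```python
from math import comb

def pass_rate_finite_lot(lot_size, n, c, fraction_defective):
    """Chance that a sample of n parts, drawn without replacement from a
    lot of lot_size parts, contains c or fewer defectives."""
    defectives = round(lot_size * fraction_defective)  # assumed whole number of bad parts
    good = lot_size - defectives
    ways_to_pass = sum(comb(defectives, k) * comb(good, n - k) for k in range(c + 1))
    return ways_to_pass / comb(lot_size, n)

# The three 4% AQL plans from the tables: (lot size, sample size, C)
plans = [(20, 3, 0), (100, 20, 2), (1000, 80, 7)]
for fraction in (0.05, 0.15):
    for lot_size, n, c in plans:
        rate = pass_rate_finite_lot(lot_size, n, c, fraction)
        print(f"{fraction:.0%} defective, N={lot_size}, n={n}, c={c}: {rate:.0%}")
```

Under these assumptions the small-lot plan (N = 20) gives 85% at 5% defective and about 60% at 15% defective, in line with the tables.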
Do AQL plans control consumer risk
consistently?
All three of these "4% AQL" plans have the same high Pass Rate
when the lot is low % defective, but differ greatly at high
% defective.

[Figure: % Defective OC Curves for ASQC-Z1.4, general,
level II, single, normal, 4% AQL]
Do AQL plans control consumer risk
consistently?
ANOTHER KIND OF OC CURVE

[Figure: Lot Size OC Curves for ASQC-Z1.4, general, level II,
single, normal, 4% AQL; one curve per level of Lot Quality]
Do AQL plans control consumer risk
consistently?
Conclusion:
Z1.4 and C=0 AQL sampling plans do not control consumer risk
consistently, unless Lot Size is controlled.

[Figure: Lot Size OC Curves for v4 ASQC-C=0, Single Sample,
4% AQL; one curve per level of Lot Quality]
Important lesson from 2 previous slides:

When the Receiving-QC Inspection Pass Rate
changes dramatically (increasing or decreasing),
you should not come to any conclusion about the
cause (e.g. "Supplier is doing much better!" or
"Supplier is doing much worse!"), until you examine
the "Lot Size OC Curve" versus the size of the lots
that have been received before and after that Pass
Rate changed dramatically.
Such an examination may reveal that the dramatic
"change" is a false impression, and that it is due to a
change in size of lots received, not due to a change
in the quality of the lots received!
Arbitrariness(?) of Sampling Plans

** = ASQC-Z1.4, general, level II, single, normal, 4% AQL

Lot Size   Sample Size   "C"    Pass Rate if lot is 5% Defective
20         3**           0**    85 %**
100        20**          2**    95 %**
1,000      80            6      90 %
1,000      80**          7**    96 %**
1,000      80            8      99 %
Arbitrariness(?) of Sampling Plans

** = ASQC-Z1.4, general, level II, single, normal, 4% AQL

Lot Size   Sample Size   "C"    Pass Rate if lot is 5% Defective
1,000      125           10     96 %
1,000      80**          7**    96 %**
1,000      40            4      96 %

During World War II (when these sampling plans first
became common), one possible use for the larger-than-
needed sample sizes was for discrimination between
mediocre lots and excellent lots (both of which pass QC).
Arbitrariness(?) of Sampling Plans

All 3 plans from the previous slide have the same high Pass
Rate when lot is 5% defective, but differ greatly at high
% defective.

[Figure: % Defective OC curves for lot size of 1000, each
labeled with its Sample Size and the "C" for that sample size;
the middle line is ASQC Z1.4 4% AQL]
" Zero Acceptance Number Sampling Plans "
( by N. L. Squeglia, 5th ed.)

This table, from the 5th edition, has many sample-size
changes (shown circled) compared to the 4th edition.
In some cases, the sample size has increased dramatically
(e.g., if Lot Size = 100 and AQL = 1.5, then... Sample Size is
now 19 instead of 12, which is a 58% increase, and...
the pass rate for a 1.5% defective lot of that size drops from
83% in the 4th edition to 75% in the 5th edition).
How much defective product is in
your approved-parts Storeroom?
The % of defective product that is in your Approved-parts
Storeroom is a function of the...
quality of lots received
sampling plan used
lot size received (as we saw on previous slides)

A relevant term that defines that % is...
AOQ (Average Outgoing Quality).

AOQ is the resulting average % defective in the
Approved Storeroom, assuming that LotSize, SampleSize,
C value, and received Lot%Defective all remain constant.

If the received Lot%Defective varies from lot to lot, then the
potential AOQ varies lot to lot. The worst AOQ possible is
the AOQL (Average Outgoing Quality Limit).
AOQ can be easily calculated using a classic formula
found in any sampling-plan textbook, but AOQL is typically
available in tables in the back of published Sampling Plans.
Squeglia "C=0" (4th ed.)

This means that if a 1.0 % AQL Sampling plan is used, and if
the parts Supplier consistently sends lots that are about 6 to
7% defective, and if Lot Size is consistently 91 to 150, then
Approved Stores will consistently contain about 2.6%
defective of that part.

Therefore, unless only a specific small range of lot sizes
(e.g., 91--150) is allowed to be purchased, AOQL (and
possibly storeroom quality) varies lot-to-lot!
% Defective in the Approved Storeroom is
affected by IQC & Supplier-Control practices
The classic formula for AOQ assumes that good parts are
used to replace all defective parts encountered either in any
sample or in a 100% inspection of a rejected lot, before that lot
is approved and moved into the Approved Storeroom.
Based upon John Zorich's experience auditing more than 200
US medical device manufacturing companies from 1999 to
2014, no company follows those classic instructions.
Instead, virtually all companies do NOT replace defective parts
with good parts, but rather return defective parts to the
Supplier for credit on future shipments of normal lot sizes.
If N=100, n=16, C=1, & received Lot%Defective = 10%, then...
AOQ = 4.20% using the classic formula
AOQ = 4.92% when good parts do NOT replace defectives.
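The classic AOQ formula mentioned above is, in textbook form, AOQ = Pa x p x (N - n) / N, where Pa is the probability of accepting the lot and p is the received fraction defective. The sketch below applies it to the example on this slide. It assumes a hypergeometric Pa (sampling without replacement from a lot holding exactly N x p defectives); that choice and the function names are assumptions, not taken from the slides. The second figure (4.92%, no replacement of defectives) is not reproduced here because its formula is not given in the source.

```python
from math import comb

def lot_pass_rate(lot_size, n, c, defectives):
    """Probability that a sample of n parts from the lot has c or fewer defectives."""
    good = lot_size - defectives
    ways = sum(comb(defectives, k) * comb(good, n - k) for k in range(c + 1))
    return ways / comb(lot_size, n)

def aoq_classic(lot_size, n, c, fraction_defective):
    """Classic AOQ: defectives found in samples and in rejected lots are
    all replaced with good parts before the lot goes to the storeroom."""
    defectives = round(lot_size * fraction_defective)  # assumed whole number
    pa = lot_pass_rate(lot_size, n, c, defectives)
    return pa * fraction_defective * (lot_size - n) / lot_size

# Example from the slide: N = 100, n = 16, C = 1, received lots 10% defective
print(f"AOQ (classic formula) = {aoq_classic(100, 16, 1, 0.10):.2%}")
```

Under these assumptions the result comes out near the slide's 4.20%. Evaluating aoq_classic over a range of received % defective values and taking the largest result would give an estimate of the AOQL for that plan and lot size.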
Why do we do so much work
for so little information?
When we use classic AQL attribute sampling plans, we
settle for knowing almost nothing about the specific lot of
product from which the sample came. If our boss were to
ask, all we can say is "the lot passed".
We don't know the % defective in the just-passed lot
We may not know the actual % defective in Stores
(we may know only theoretical worst case = the "AOQL")
If all we do is focus on the AQL, we don't even have a
clear definition of a bad lot (the chosen "AQL%" is
considered a good-enough % defective)
Instead, for each Lot of product received, why don't we
calculate what % of that lot is "in-spec" ?
That is
Why not calculate its reliability at 95% confidence
("reliability" here means "% in-specification"), and
use % reliability specs (instead of % AQL specs)?
The future is now:
Using "Reliability Calculations"
instead of AQL Sampling Plans

Duke Empirical (a well-known contract Design & OEM
manufacturer in Santa Cruz, California, with a long list of
medical device clients, including billion-dollar corporations)
does not use any AQL sampling plans for IQC, unless
mandated by the client.

Instead of %AQL specifications, Duke uses %Reliability
specs (all at 95% Confidence), as described in this
seminar.

The client is asked to choose an IQC %Reliability spec
for each of its parts received by Duke. If the client is not
ready to do that, Duke defaults to the %Reliability listed
by Risk class in Duke SOPs (e.g., human-implant parts
are high risk).
What difference does it make if we continue to use AQL ?
Here's a real-life example of Sampling Plan Blues:
Using an AQL sampling plan ( n = 10, C = 0 ), actual (residual)
data from receiving inspection QC of catheter nose-piece was...
2.82, 3.72, 3.91, 4.70, 4.77, 5.24, 5.71, 6.09, 6.28, 7.18
Average = 5.04
Std Deviation = 1.33
Specification is " 2.50 or greater "
Lot passed, because all sample nose-pieces were in-spec.
However, more than 10% of finished devices made with that
lot of nose-pieces failed final test, causing a week-long shut-
down of production while root cause was investigated. And...
the root cause of those failures was...out-of-spec nose-pieces !!
Was that problem predictable / avoidable?
Data was "Normal". Using "Normal" K-table (on next slide)...
Observed K = ( 5.04 - 2.50 ) / 1.33 = 1.92
(1.92 results when the unrounded Average and Std Deviation
are used), which means < 90% reliability at 95% confidence
[Table: "Normal" K factors, from Juran's QH]

This is K for 95% confidence of 90% reliability when 10 is the
sample size and the population is Normally distributed.
Because the observed K (= 1.92, on the previous slide) is
smaller than this value (that is, 1.92 is less than 2.355),
we cannot claim 90% reliability at 95% confidence; the
reliability demonstrated at 95% confidence is less than 90%.
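The nose-piece calculation can be checked with a short script. This is a minimal sketch: the data, the specification, and the K-table value of 2.355 come from the slides, while the variable names are illustrative only.

```python
from statistics import mean, stdev

# Receiving-inspection data for the catheter nose-piece (from the slide)
data = [2.82, 3.72, 3.91, 4.70, 4.77, 5.24, 5.71, 6.09, 6.28, 7.18]
lower_spec = 2.50      # specification is "2.50 or greater"
k_required = 2.355     # K-table value: n = 10, 95% confidence, 90% reliability

# Observed K = distance from the average to the spec limit, in std deviations
k_observed = (mean(data) - lower_spec) / stdev(data)
meets_90_at_95 = k_observed >= k_required

print(f"Average = {mean(data):.2f}, Std Deviation = {stdev(data):.2f}")
print(f"Observed K = {k_observed:.2f}, required K = {k_required}")
print("90% reliability at 95% confidence demonstrated?", meets_90_at_95)
```

Although all 10 sample pieces were in-spec, the observed K falls short of the required K, so the lot does not demonstrate 90% reliability at 95% confidence.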
What this seminar proposes is this:
If we use sampling plans at Incoming QC to assess the
quality of the purchased product, then traditional
AQL attribute sampling plans are inadequate because ...
they do not tell us the % of incoming product that meets
product quality specifications; and therefore...
they do not guarantee that our formal risk-management
statements / requirements have been met.
What AQL attribute sampling plans do provide is
evidence that a given lot of product meets the
requirements of the sampling plan, NOT that the given
lot meets product quality requirements.
In these days of ubiquitous access to computers and
computer programs that can easily perform reliability
calculations, we should, instead of AQL plans, use...
%Reliability + %Confidence specifications
What this seminar proposes is that we all
start using a "New" kind of QC specification:

Product                  Design specifications             Old QC Specs   New QC Specs
Sterile-barrier pouch    1 psi, minimum burst pressure     0.65% AQL      99% reliable at 95% confidence
Injection-molded part    1.25 -- 1.30 inches long          1% AQL         97% reliable at 95% confidence
Label                    text color same as master copy    4% AQL         90% reliable at 95% confidence
Misleading advice from GHTF
(Process Validation Guidance, GHTF/SG3/N99-10:2004 (Edition 2))

1% AQL sampling plan would NOT support that statement!
A lot size of 300 would need an AQL 0.06 %, and a sample
size of 189, C = 0 (as determined using StatGraphics-XV).
STATISTICAL PROCESS CONTROL ( SPC )
BACKGROUND on SPC
"Statistical Process Control" was invented in the 1920s by
an engineer working at Bell Labs. His name was Walter
Shewhart. His goal was to increase the thru-put & reduce
the scrap rate at the nearby telephone manufacturing plant,
in order to meet the huge demand for telephones (which
were new, hi-tech gadgets in those days).
During June-August 1950, an American named W. Edwards
Deming trained hundreds of Japanese engineers,
managers, and scholars in SPC, a tool that became the
foundation for Japan's success in becoming the world
leader in product quality. In gratitude, Japan awards the
annual Deming Prize to companies and individuals who
have made major contributions to the advancement of
quality. The awards ceremony was (as of 2011) still shown
every year in Japan on national television!
"Quality" -- How can it be defined?
Which has higher quality: Rolls Royce or Ford Focus?
Which are higher quality: paper clips or binder clips?
Does a packet of "Sweet'n Low" have higher quality if it has
more or less than the targeted 1.00 gram?

What makes for higher quality of a "Sweet'n Low" packet?
Lot to lot conformance (on average) to design/QC specs.
( This is sometimes called being "on target". )
Similarity, packet to packet to packet (within a lot).
( This is sometimes called having "minimum variation". )
The modern, practical definition of quality is:
on target with minimum variation.
Basic SPC lingo:
Term      Meaning
Xbar      Same as a mathematical "Average"
Range     An indicator of how variable a process is,
          as shown by within-lot variability
Sigma     Same as a Standard Deviation
          Sigma X = std dev of raw data
          Sigma Xbar = std dev of sample avgs
          Sigma R = std dev of sample ranges
          ( Sigma Xbar and Sigma R are "standard errors" )
Every process has some variation:
"Common" causes of variation appear routinely &
randomly (e.g., the variation in results of honest dice or coin
tossing); to start an investigation about such normal
variation is to waste resources, because a common-cause
change is a "false alarm" (as Shewhart called it). The AIAG
reference manual calls them "the many sources of variation
that consistently act on the process."
Think of common causes as "background noise".
"Special" causes of variation appear unexpectedly and
definitely not randomly; you get a lot of "bang for your buck"
when you try to identify and eliminate the cause of "special"
variation, because there is "real change" & "the cause is
findable" (as Shewhart described it). Special causes act
inconsistently on the process.
Think of a special cause as a "signal " of opportunity.
What does SPC do ??
Statistical Process Control is poorly named. It really doesn't
"control" anything. It should be known as "Statistical
Process Monitoring".
All SPC does is MONITOR a process for times
of unexpected variation, which indicates to you
when your company might benefit by
spending resources to discover the cause of
the unexpected variation (that is,
to determine the identity of "special causes").
If SPC charts are used simply to decide when to adjust the
process up or down, you sabotage your process!!
What does SPC do ??

SPC "Control Charts" help you to identify when a
process is "out of control". By definition,
a process is "out of control" when it has been
affected by a "special cause" of variation and
therefore is not predictable.
If the special cause can be identified and eliminated,
the process will return to "control" (i.e., become
"in-control") and therefore will probably be
less variable (= more predictable) in the future, since if
there is no special cause acting on the process, then
the only cause of variation is "common cause".
IN CONTROL

The shape and location of the distribution
in the next lot ("Day 5") is predictable, and
so the situation is "in control".
IN CONTROL

[Figure: daily distributions shown against the Lower Spec and
the Target; labeled "Way out of Spec, low !!"]

The shape and location of the distribution in the next lot
("Day 5") is predictable, and so the situation is
"in control", even tho half of the product is out of spec!!
Product that is "In control" is not necessarily "In spec".
OUT OF CONTROL

The shape and/or location of the distribution
of the next lot ("Day 5") is NOT predictable,
and so the situation is "out of control".
OUT OF CONTROL

[Figure: daily distributions shown between the Lower Spec and
the Upper Spec]

The shape and/or location of the distribution of the next lot
("Day 5") is NOT predictable, and so the situation is
"out of control", even tho all the product is in spec.
"Out of control" product is not necessarily "Out of spec".
Basic types of Control Charts

1. Variables data ( = measurements )
are charted onto "XbarR" or "XbarS" or "XmR" control
charts.

2. Count data ( = 1, 2, 3, ...) are charted onto either " P "
(or " NP ") or " U " (or " C ") control charts.

Today, we'll discuss only XbarR charts.


Variables data entry ( n > 1 )
XbarR Control Chart
[Figure: XbarR chart titled "Variables Data, Statistical
Process Control (SPC) Chart"]

This upper chart shows the "between-sample variation",
i.e., variation from one sample Average (= Xbar) to the next.

This lower chart shows the "within-sample variation",
i.e., variation from one sample Range (= R) to the
next (if you plot Std Deviation here, then you have an
XbarS chart).
[Figure: the same XbarR chart, with these lines labeled:]
UCL (upper control limit) for Averages
LCL (lower control limit) for Averages
UCL (upper control limit) for Ranges
LCL (lower control limit) for Ranges

[Figure: the same XbarR chart]

The avg of the "current process" is drawn as the "midline".

Data representing the "current process" are marked here
with boxes; only these are used to calc limits & midlines.

Notice that we didn't use some lots.

[Figure: the same XbarR chart, with some points labeled
"Out of control" and others labeled "NOT out of control"]

We'll talk about these later.

Control Chart per GHTF
(Process Validation Guidance, GHTF/SG3/N99-10:2004 (Edition 2))

The GHTF would have been much more helpful had they added the
word "sample" just before the words "average" or "range".
Control Chart per GHTF ( FDA approved !!)
(Process Validation Guidance, GHTF/SG3/N99-10:2004 (Edition 2))

This is very definitely NOT an SPC Control Chart
(the chairman of SG3 agreed, in an email to J. Zorich, in 2012)
What is the difference between
(QC) Specification Limits
(a.k.a., "spec limits")
and
(spc) Control Limits
Spec Limits vs. Control Limits

Specification Limits ( USL & LSL )
are design or QC requirements; if the product is not within the
Spec Limits, it is considered to be "bad" or "defective" product.

Control Limits ( UCL & LCL )
are boundaries inside of which you can expect to see
almost 100% of Sample Avgs & Ranges,
in the current process, assuming it is " in control ".
The control limits are a graphical indication
of what your current process can do.
In effect, control limits are set at +/- 3 std errors
of the averages & ranges of the samples.
Control Limit Calculations
How to calculate the upper and lower control limits
for the various types of basic charts is shown on the
following slides.

However, the data to use in the calculation is to be taken from
the lots (batches) that YOU choose. You might choose to
use all the data you have, or use just the first 30 or 50 lots,
or use lots 73 thru 129.

You should choose the lots that are "relevant" to the current
process. That is, lots (i.e., data) that represent the current
production process.

For example....
The lots marked with squares were
used to calculate the control limits.

[Figure: XbarR control chart (Averages and Ranges)]

In effect, the control limits are set at +/- 3 std errors
from the midline (i.e., the distance between the control
limits is 6 standard errors wide), calculated indirectly,
using tables.

Control limits on the "Averages" chart are equivalent to
+/- 3 std errors of the mean.

Control limits on the "Ranges" chart are equivalent to
+/- 3 std errors of the range.
XbarR Chart, (n = 2 or more)
For sample averages:
UCL = AvgAvg + ( A2 x AvgRange )
AvgAvg = Average of all chosen measurements
LCL = AvgAvg - ( A2 x AvgRange )

For sample ranges:
UCL = AvgRange x D4
AvgRange = Average of all chosen ranges
LCL = AvgRange x D3

"Factors" ( A2, D3, D4 ) are from the table on the next slide...
[Table: control chart "Factors" ( A2, D3, D4 ) listed by n]

An error in the textbook !!
" n " is the size of a single sample, not the sequence # of
the sample, nor the # of samples !!

[ this is a scanned image from "Understanding Statistical
Process Control" (2nd ed.) by Wheeler & Chambers ]
Class exercise (XbarR chart):
Where should you draw the upper and lower control limits for
Averages and Ranges, if...

# of data pts per Sample Avg = 9
# of Sample Avgs = 7
Average of all 7 Avgs = 100
Average of all 7 Ranges = 10

Answer (using Factors from the table):
UCLavg = 100 + (10 x 0.337) = 103.37
LCLavg = 100 - (10 x 0.337) = 96.63
UCLrange = 10 x 1.816 = 18.16
LCLrange = 10 x 0.184 = 1.84
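The class exercise can be scripted as follows. This is a minimal sketch: the factors A2 = 0.337, D3 = 0.184 and D4 = 1.816 are the n = 9 values used in the answer above, and the function name is hypothetical.

```python
def xbar_r_limits(avg_of_avgs, avg_range, a2, d3, d4):
    """Control limits for an XbarR chart, using table factors A2, D3, D4
    that correspond to the sample size n."""
    return {
        "UCLavg": avg_of_avgs + a2 * avg_range,
        "LCLavg": avg_of_avgs - a2 * avg_range,
        "UCLrange": avg_range * d4,
        "LCLrange": avg_range * d3,
    }

# Class exercise: n = 9, average of Avgs = 100, average of Ranges = 10
limits = xbar_r_limits(100, 10, a2=0.337, d3=0.184, d4=1.816)
for name, value in limits.items():
    print(f"{name} = {value:.2f}")
```

Note that the number of Sample Avgs (7) does not enter the calculation; only the sample size n matters, because n determines which row of the factor table is used.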
The lots circled in blue are shown in the histogram and
were used to calculate the control limits.

LCL(avg) = 6.5
UCL(avg) = 13.5
The lots circled in blue are shown in the histogram and
were used to calculate the control limits.

LCL(avg) = 7.5
UCL(avg) = 12.5
The lots circled in blue are shown in the histogram and
were used to calculate the control limits.

LCL(avg) = 9.5
UCL(avg) = 10.5
Because the variation in both Averages and Ranges has been
greatly reduced, the process on the far right is making
higher quality product than the process on the far left,
regardless of what % of product is made "in spec".
The purpose of SPC is to help processes move from left to right!

The goal of an SPC program is to make product that is more "on target"
with "minimum variation".
How is "Out of Control" detected?
OUT OF CONTROL = any data point or set of points (on
the control chart ) that would have little likelihood of occurring
by chance alone, assuming the data is normally distributed
( "out of control" = "special cause" is present).

YOU decide what " little likelihood " means. Over 80 years
ago, that meant a probability of 1 in 20 , whereas current
preference is 1 in 370.
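The "1 in 20" and "1 in 370" figures correspond to limits at about +/- 2 and +/- 3 standard errors under a Normal distribution. A minimal sketch, assuming Normality (the function name is hypothetical):

```python
from math import erfc, sqrt

def false_alarm_rate(k):
    """Probability that a Normally distributed point falls outside
    +/- k standard errors purely by chance (two-sided tail area)."""
    return erfc(k / sqrt(2))

for k in (1.96, 3.0):
    p = false_alarm_rate(k)
    print(f"+/- {k} std errors: probability = {p:.4f}, about 1 in {1 / p:.0f}")
```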

Whenever you see "out of control" on a control chart, you
should investigate the reason, to try to determine root cause
(i.e., to identify and react to "special cause"); otherwise, you
are wasting your company's time by making SPC charts.
Red-circled points indicate "out of control" situations

[Figure: control chart with points labeled "Out of control
point", "Out of control trend", "Out of control series", and
"NOT out of control"]
"Rules" for detecting "out of control"
(all taken from SPC textbooks)
Rule                                       Probability of occurring by chance
                                           (assuming no "special cause" is acting)
1 point outside either control limit       1 in 370
9 in sequence on one side of midline       1 in 256
9 in an ascending or descending trend      1 in 256 (on avg)
8 in sequence on one side of midline       1 in 128
10 of 11 on same side of midline           1 in 102
12 of 14 on same side of midline           1 in 105
14 of 17 on same side of midline           1 in 117
16 of 20 on same side of midline           1 in 135
many, many others !!                       1 in 100 to 400 !!!
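Two of the rules listed above can be sketched in code as follows. This is only an illustration: the function names and the example data are hypothetical, and commercial SPC software may count runs differently (for example, in how it treats points that fall exactly on the midline).

```python
def points_outside_limits(values, lcl, ucl):
    """Rule: 1 point outside either control limit. Returns the positions found."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

def runs_on_one_side(values, midline, run_length=9):
    """Rule: run_length points in sequence on one side of the midline.
    Returns the position of each point that completes (or extends) such a run.
    A point exactly on the midline is assumed to end the run."""
    flagged = []
    run = 0
    previous_side = 0
    for i, v in enumerate(values):
        side = (v > midline) - (v < midline)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        previous_side = side
        if run >= run_length:
            flagged.append(i)
    return flagged

# Made-up sample averages, for illustration only
averages = [10.2, 10.1, 10.3, 10.2, 10.4, 10.1, 10.2, 10.3, 10.1, 9.8, 13.9]
print(points_outside_limits(averages, lcl=6.5, ucl=13.5))  # position of the 13.9 point
print(runs_on_one_side(averages, midline=10.0))            # position ending the run of 9
```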
Reasons not to have too many rules...
USING THE ROLL OF 4 DICE AS AN EXAMPLE
The chance of not getting a 6 on any die is 5/6 x 5/6 x 5/6 x 5/6 =
approximately 50 % ; therefore, about 50% of the time, a toss of 4 dice
will have a 6 showing on at least 1 of the dice.
USING ALL THE RULES ON THE PREVIOUS SLIDE
369/370 x 255/256 x 255/256 x 127/128 x 101/102 x 104/105 x 116/117
x 134/135 = 95 % ; therefore, about 5 % of the time ( = 1 out of every 20
times), a point will be called "out of control" even tho it is the result of
random variation ( = "common cause"; that is, NOT "special cause").
That may be too frequent for the boss's taste !!!
USING ONLY (the first) 3 RULES
369/370 x 255/256 x 255/256 = 99 %, which means only 1% or about 1
in a hundred times will a "false alarm" be triggered by chance (is that
more acceptable to your boss ??). That % is recommended in the
Handbook of Statistical Methods in Manufacturing (R. B. Clements, 1991).
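The arithmetic on this slide can be reproduced with a short script. Like the slide, this sketch multiplies the individual "no false alarm" chances together, which treats the rules as independent of each other; that is a simplifying assumption.

```python
from math import prod

# "1 in N" chance of a false alarm for each rule on the previous slide
rule_odds = [370, 256, 256, 128, 102, 105, 117, 135]

def chance_of_no_false_alarm(odds):
    """Probability that none of the listed rules is triggered by chance."""
    return prod((n - 1) / n for n in odds)

all_rules = chance_of_no_false_alarm(rule_odds)
first_three = chance_of_no_false_alarm(rule_odds[:3])
no_six_on_four_dice = (5 / 6) ** 4

print(f"4 dice, no 6 showing:     {no_six_on_four_dice:.0%}")
print(f"All 8 rules, no alarm:    {all_rules:.0%}")
print(f"First 3 rules, no alarm:  {first_three:.0%}")
```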
Random vs. Representative Sampling
The data on your SPC chart should faithfully represent the
process you're trying to improve.

To acquire representative samples, you could either

Wait till the end of the production run, and then randomly
choose a sample from output of the entire run, OR...

Take one item per arbitrary time period (e.g., one part
every hour, or one part after each 100 parts made, or ??)
and combine all the collected parts as a "representative"
sample.

If your boss makes you take only the first few parts from a
day's run, don't argue -- it's better than having no SPC program !!
Rational Sub-grouping of Samples
Basic rule for good SPC charts:

Plotted points must have NO KNOWN SYSTEMATIC
SOURCE OF VARIATION between them, other than
sequential production over time.

For example:
Manufacturing occurs on day, swing, and graveyard
shifts. The average of a sample from each shift's
production is plotted sequentially on the same SPC chart.

IS THIS GOOD OR BAD (see next slide)?


This is BAD Rational Sub-grouping !!

[Figure: control charts by shift; surviving label: "Evening, Night"]

This is GOOD Rational Sub-grouping !!
Rational Sub-grouping
Basic rule for good SPC charts: Plotted points must
have NO KNOWN SYSTEMATIC SOURCE OF
VARIATION between them, other than sequential
production over time.

If you ignore that rule, you may miss chances for valuable
investigations of "special cause" incidents, or you may
waste time investigating "common cause" effects.

Corollary: If your production process has not yet been
standardized (equipment, procedures, raw materials, etc.)
then it's too early for SPC.

CONTROVERSY: This instructor agrees with authors who
state that SPC can be initiated even if the process is
out-of-control from the start.
Sample Size = n
n = quantity of product chosen for each sample that is
plotted as a single point on a control chart (that is, " n " is
the "sample size" within each single point on the chart).
Historically, sample size has been a choice based on ease
of calculation (but with computers or calculators, this
reason is not important).
Theoretically, can be any value ( n = 1 or higher).
However, if n = 1, then cannot evaluate "within sample"
variation !! Practically, if your raw data is not "normal", you
should start with a large sample (e.g., n = 10 ) in order to
take advantage of the Central Limit Theorem (= avgs of
"large" samples tend to be normally distributed).
To ensure you have the best chance for improving the
process, have n = 7 or more.
Harold F. Dodge: Worked in the quality assurance department at Bell
Laboratories from 1917 to 1958.

Of his early experiences Dodge wrote:


"There have been several things of special interest in my work in
this field over the years. It all goes back to the beginnings of
statistical quality control in 1924. Our work in cooperation with shop
engineers was influenced heavily by great pressures to save money
and to make the quality control methods simple and easy to use.
Initially, the basic procedures for variables called for samples of
four, with one chart for the average, and another for the standard
deviation. Shop reaction was prompt against anything as complicated
as computing the standard deviation. After some study we proposed
the use of the range, R. On top of that we proposed shop use of
samples of five instead of four; it is easier to divide by five than by
four. These simplifying steps quickly became the basis for shop
practice."

That text is from: http://www.asq.org/join/about/history/dodge.html


HOW DOES AN SPC PROGRAM WORK?
1. Identify which important or troublesome steps in the QC,
assembly, or manufacture process are to be in the SPC Program,
and set up a separate SPC control chart to record data on each
such step.
2. Monitor the control charts, looking for out-of-control points or out-
of-control series or trends.
3. Investigate "out of control" situations, to discover their cause. This
may require not only technical studies of product, process, or
equipment, but also interviews with assemblers and production
personnel, and possibly even "brainstorming" sessions or other
such techniques involving personnel from all relevant departments
--- in other words, this will consume a lot of time !!
4. Devise + implement changes to product, process, documents,
equipment, personnel, or environment, as needed to eliminate or
reduce the effect of identified causes of variation.
HOW DOES AN SPC PROGRAM WORK?
5. Recalculate (or manually re-set) the SPC control limits whenever
the variation between lots or within lots has been significantly
reduced. This is done so that, assuming the process is in control,
about 1 in 370 data-points occurs outside the control limits of the
now less variable process!!
6. Go back to step 2 above !!
7. Continue these cycles as long as the reduction in variation is worth
the expense ( $$$ ) of the SPC effort ( = time & other resources ).
8. CONTROVERSY: Some textbooks state that SPC should not be
initiated until a process is "in control". The instructor agrees with
other authors (e.g., Wheeler and Chambers) who advise using
SPC for any process, even if it is currently "out of control",
because SPC can then be used to help improve the process (that
is, to help get it "in control").
Capability Indices
Capability Indices

This topic typically applies only to "variables" data.
Its application to "count" data is not discussed here.

Because these indices have no "confidence" statement
associated with them, the sample size chosen is irrelevant
(except for the fact that the larger the sample size, the closer
your result is likely to be to the "true" answer, as is concluded
from the "Law of Large Numbers").

Because of the lack of a confidence statement, John Zorich
prefers to use confidence/reliability statements (as were
taught during Day 2 of this workshop) rather than these
Capability Indices; but that is a personal choice.
CAPABILITY INDICES
In the following slides...
n = Sample Size used to calculate each
dot on the SPC chart
USL = Upper (QC) Specification Limit
LSL = Lower (QC) Specification Limit
NSL = Nearest Spec Limit (i.e., whichever of the USL or
LSL is nearest to the process average, i.e., nearest
to the midline of the Xbar chart)
UCL = Upper Control (chart) Limit of Avgs
LCL = Lower Control (chart) Limit of Avgs
"Sigma X" = standard deviation of raw data
(or of "transformed" data)
Capability Indices assume that the
raw data is "Normally Distributed".
[Figure: Normal distribution curve with "Sigma X" marked;
x-axis from 70 to 130]

In a "normal" distribution, virtually all of the raw
data (99.73% of it) is in a range that is
6 "Sigma X" long = ( Avg +/- 3 "Sigma X")
6 times "Sigma X" = "6Sigma"
To calculate a Capability Index, you need to know how large
the value "6Sigma" is.
6Sigma can be calculated 2 ways:
INDIRECT METHOD:
= ( UCL - LCL ) x ( Sqrt ( n ) )
(where n = sample size used to
calculate each dot on the SPC chart)
DIRECT METHOD:
= 6 x Standard Deviation of the raw data
(e.g., using Excel's "=stdev" function)
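Both methods can be sketched in a few lines. The indirect method works because the control limits for Averages span 6 standard errors of the mean, and each standard error equals "Sigma X" divided by Sqrt(n). The function names here are hypothetical.

```python
from math import sqrt
from statistics import stdev

def six_sigma_indirect(ucl_avg, lcl_avg, n):
    """6Sigma estimated from the control limits of the Averages chart;
    n is the sample size behind each plotted point."""
    return (ucl_avg - lcl_avg) * sqrt(n)

def six_sigma_direct(raw_data):
    """6Sigma taken directly from the sample standard deviation of the
    raw data (comparable to Excel's =stdev)."""
    return 6 * stdev(raw_data)

# Indirect example: UCL = 105, LCL = 95, n = 9  ->  10 x 3 = 30
print(six_sigma_indirect(105, 95, 9))
```

The two methods can give different answers for the same process, because the direct method also picks up any lot-to-lot shifts in the average.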
Capability Indices

Cp =
ratio of the width of the specification limits to the SPC-
estimated width of the range that encompasses
99.7% of the product population.
= ( USL - LSL ) / (6Sigma [indirect])
The larger Cp is, the better, because large numbers
indicate that a large % of the product might lie within
the Upper and Lower spec limits
( = USL & LSL), that is, a large % might pass QC.
NOTE: If you use "6Sigma [direct]" instead of "6Sigma
[indirect]", you're actually calculating Pp, not Cp.
Capability Indices

Cp
This is useful only if the average data value is
currently near the specification target. If the average
data value is not near the target spec, then this
gives a false indication of % in-spec.

Cp is used mostly to indicate what % of product
might be in-spec, IF the average data value were
near the specification target
(that is, if the midline of the control chart was
identical to the mid-point of the specification range).
What % is "in spec"
Assuming that the data is distributed "normally" and the
process is centered on the QC Specs, Cp and Pp indicate the
following:
Cp or Pp Value                    % Product within Specification
0.33 = 1 "Sigma X" away           67.8 %
0.67                              95.6 %
1.00 = 3 "Sigma X"                99.7 %
1.33                              99.99 %
1.67                              99.9999 %
2.00 = 2 x 3 "Sigma X"            99.999999 %
     = "Six Sigma" (slang for "Six Sigma X")

"Capable Process" is often defined as Cp = 1.33 or larger.
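The percentages in this table follow from the Normal distribution: for a centered process, each spec limit sits 3 x Cp "Sigma X" away from the average. A minimal sketch, under the same assumptions as the slide (Normal data, process centered on the specs); the function name is hypothetical.

```python
from math import erf, sqrt

def fraction_in_spec_centered(cp):
    """Fraction of product within spec for a Normally distributed process
    centered between the spec limits, given its Cp (or Pp) value."""
    z = 3 * cp                 # distance from average to each spec limit, in Sigma X
    return erf(z / sqrt(2))    # area of the Normal curve within +/- z

for cp in (0.33, 0.67, 1.00, 1.33, 1.67, 2.00):
    print(f"Cp = {cp:.2f}: {fraction_in_spec_centered(cp):.8%} in spec")
```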
Capability Indices

Cpk =
2 times the distance (as a positive number) that the
AVG (data) VALUE is from the nearest spec limit,
divided by "6 x SigmaX [ indirect ]":
= 2 x | NSL - AvgValue | / (6Sigma [indirect])
(where NSL = the nearest spec limit)
The larger Cpk is, the better, because large numbers indicate
that a large % of the product does pass QC.
NOTE: If you use "6Sigma [direct]" instead of "6Sigma [indirect]",
you're actually calculating Ppk, not Cpk.
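A minimal Python sketch of the Cpk formula (Python, the function name, and the example values are assumptions for illustration). Instead of an absolute value, it uses a signed distance to each spec limit, so the result comes out negative when the average is outside the spec limits. Either limit may be omitted, to show that Cpk can still be computed for a one-sided spec.

```python
def cpk(avg, six_sigma, usl=None, lsl=None):
    """Cpk if six_sigma is the indirect value; Ppk if it is the direct value."""
    distances = []
    if usl is not None:
        distances.append(usl - avg)  # negative if avg is above the USL
    if lsl is not None:
        distances.append(avg - lsl)  # negative if avg is below the LSL
    if not distances:
        raise ValueError("at least one spec limit is required")
    # The smallest signed distance is the distance to the nearest spec limit
    return 2 * min(distances) / six_sigma

# Hypothetical process: 6Sigma = 30, specs 70 to 130
print(cpk(avg=100, six_sigma=30, usl=130, lsl=70))  # 2.0 (centered)
print(cpk(avg=115, six_sigma=30, usl=130, lsl=70))  # 1.0 (off-center)
print(cpk(avg=135, six_sigma=30, usl=130, lsl=70))  # negative: avg is out of spec
print(cpk(avg=115, six_sigma=30, usl=130))          # 1.0 (one-sided spec)
```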
Capability Indices
Cpk is useful whether or not the average data value is
currently near the specification target; i.e., even if
the average data value is not near the target spec, Cpk
still gives a good indication of % in-spec.
That % in-spec would be higher IF the average data value
were nearer the spec target
(Cp gives an indication of that higher %).
Express Cpk as a negative number
only if the "Avg Value" is outside the spec limits.
Most companies claim to be calculating Cp & Cpk, but
an examination of their formulas reveals that they are
really calculating Pp & Ppk !!!
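To illustrate the point about % in-spec for an off-center process, here is a rough Python sketch (Python, the function names, and the process values are all assumptions). It assumes normally distributed data and compares the same hypothetical process with its average off-center and then centered between the spec limits.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def percent_in_spec(avg, sigma, usl, lsl):
    """% of a normal process (avg, sigma) that falls between LSL and USL."""
    return 100 * (phi((usl - avg) / sigma) - phi((lsl - avg) / sigma))

# Hypothetical process: Sigma X = 5 (so 6Sigma = 30), specs 70 to 130
off_center = percent_in_spec(avg=115, sigma=5, usl=130, lsl=70)  # Cpk = 1.00
centered = percent_in_spec(avg=100, sigma=5, usl=130, lsl=70)    # Cpk = Cp = 2.00

print(f"off-center: {off_center:.4f} %")
print(f"centered:   {centered:.7f} %")
```

Both cases have the same Cp (2.00); only Cpk reflects the lower % in-spec of the off-center case.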
Classroom exercise
Calculate the Cp, based upon this data, using the
equations given on the previous slides...
9 = n = Sample Size
130 = USL = Upper (QC) Specification Limit
70 = LSL = Lower (QC) Specification Limit
105 = UCL = Upper Control (chart) Limit of Avgs
95 = LCL = Lower Control (chart) Limit of Avgs
Answer:
Cp = (USL - LSL) / (6Sigma[indirect])
(USL - LSL) = 130 - 70 = 60
"6Sigma..." = ( 105 - 95 ) x sqrt( 9 ) = 10 x 3 = 30
Cp = 60 / 30 = 2.00
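The same arithmetic can be checked in a few lines of Python (the language and the function name are assumptions; the numbers are those given in the exercise).

```python
import math

def cp(usl, lsl, ucl, lcl, n):
    """Cp = (USL - LSL) / 6Sigma [indirect], with 6Sigma = (UCL - LCL) x Sqrt(n)."""
    six_sigma_indirect = (ucl - lcl) * math.sqrt(n)
    return (usl - lsl) / six_sigma_indirect

result = cp(usl=130, lsl=70, ucl=105, lcl=95, n=9)
print(f"Cp = {result:.2f}")  # Cp = 2.00
```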
What % is "in spec"
For a given set of data, Cpk and Ppk values are always
smaller than or equal to Cp & Pp, respectively (they can
never be larger).

Cpk & Ppk give a picture of the actual current situation;
whereas Cp & Pp give a picture of what could be, IF the
process were centered on the specification target.

Cp and Pp cannot be calculated if there is only a one-sided
spec, but Cpk and Ppk can.

The % associated with a given Cpk or Ppk value depends
on what the specs are. See STUDENT file
"CpCpkPpPpk Percent In-spec"
[Figure: panels labeled "Cp = Cpk", "Cp", and "Cpk"]
[Figure: histogram of raw data]
If QC spec limits are here, then Cpk = 1.00
If QC spec limits are here, then Cpk = 2.00
It is obvious that...
a Cpk of 2.00 is much better than a Cpk of 1.00 !!

[Figure: two histograms, each with x-axis ticks from 3 to 9]
Histogram of raw data BEFORE process improvements:
If QC spec limits are here, then Cpk = 1.00
Histogram of raw data AFTER process improvements:
If QC spec limits are here, then Cpk = 2.00
SPC helps to get you from here (BEFORE) to here (AFTER).
Non-normal Data (not transformed)

Raw data
Spec limits = 0.07 to 0.15
Cpk = 0.97
3 parts per 2000 are predicted to be out-of-spec
Non-normal Data, TRANSFORMED

Raw data transformed (1/X)
Transformed Spec limits = 6.7 to 14.3
Cpk = 1.09
1 part in 2,000 is actually out-of-spec

Therefore, you predicted 3 times more defective product
( 3 / 2000 vs. 1 / 2000) when you assumed (incorrectly)
that the untransformed data was normally distributed.
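A rough Python check of this example (Python and the function names are assumptions; how the slide's own figures were calculated is not stated). The first function applies the 1/X transformation to the spec limits; note that the transformation reverses their order. The second estimates parts per 2000 beyond the nearest spec limit from a Cpk value, assuming normality and ignoring the far tail, so it should land near, not exactly on, the slide's rounded figures.

```python
import math

def reciprocal_spec_limits(lsl, usl):
    """Spec limits after a 1/X transformation (the old USL becomes the new LSL)."""
    return 1 / usl, 1 / lsl

def nearest_tail_per_2000(cpk):
    """Parts per 2000 beyond the nearest spec limit, which is 3 x Cpk sigma away."""
    z = 3 * cpk
    tail_fraction = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z), standard normal
    return 2000 * tail_fraction

new_lsl, new_usl = reciprocal_spec_limits(lsl=0.07, usl=0.15)
print(f"Transformed spec limits = {new_lsl:.1f} to {new_usl:.1f}")  # 6.7 to 14.3

print(nearest_tail_per_2000(0.97))  # untransformed data
print(nearest_tail_per_2000(1.09))  # transformed data
```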
In summary for this course:
How to implement what you've learned?

Be patient (no one wants to talk about statistics!!)

Gather data (Make observations, calculations, and charts
that you design to be convincing to your MANAGEMENT.
Try to relate to something they consider important. In a
"start-up" company, that might be "time to market" or a
successful product launch. In an established firm, that
might be scrap-rate or labor-savings, i.e., $$$$.)

Present your ideas at the right time in the right setting
(a tense meeting about a product problem might not be the
best time to talk about statistics --- it might be better to wait
until after the meeting, and then tell your boss in private).
