On Design of Experiments in Continuous Processes
Erik Vanhatalo
ACKNOWLEDGEMENT
The research presented in this thesis was carried out at the Division of Quality
Technology, Environmental Management and Social Informatics, Luleå University of
Technology between November 2005 and November 2009. Indeed, this work would
not have been possible without the important contributions from a large number of
people and organizations.
I am indebted to my supervisors Dr. Bjarne Bergquist and Prof. Kerstin Vännman
for their continuous guidance, support, and valuable cooperation during the work
presented in this thesis. Thank you!
I thank LKAB for the financial support of this research, for the ongoing exchange
of ideas and research results during the research project, and for providing me with a
very interesting and challenging research case: the LKAB Experimental Blast Furnace
(EBF). The financial support from the European Union, European Regional Development Fund, Produktion Botnia is gratefully acknowledged.
Many people at LKAB have made important contributions to the work presented
in this thesis. First of all, I thank Gunilla Hyllander who has been the head of the EBF
methodology development project and worked tirelessly together with me to move
the project forward. Gunilla has made important contributions to the research presented
in this thesis. For valuable contributions, interesting discussions and support I thank:
Mats Hallin, Anna Dahlstedt, Peter Sikström, Nicklas Eklund, Guangqing Zuo, Jonas
Lövgren, Mikael Pettersson, Carina Brandell, Anna Brännmark, Sofia Nordquist, Per-Ola Eriksson, and Bo Lindblom. They have all made important contributions to the
results presented in this thesis and made my visits to LKAB a pleasure.
I want to thank all my outstanding colleagues at the university for their friendship
and support. Special thanks to Björn Kvarnström for all interesting collaborations,
exchange of ideas, discussions, and hockey talk during the years. For valuable feedback
at, for example, pie seminars I thank Thomas Zobel, Peter Söderholm, Bengt Klefsjö,
Karin Schön, Åsa Wreder, Malin Albing, Anna-Karin Jonsson-Kvist, Klara Palmberg,
Peder Lundqvist, Fredrik Backlund, and Mari Runardotter.
Special thanks to Dr. Rickard Garvare for priceless feedback and comments on an
early draft of this thesis.
Finally, but certainly not least, I thank my family and friends for their support and,
at times, taking my mind off the continuous process of research questions and
questioning of my research. I dedicate this thesis to my mother Doris and my
grandparents Elsa and Bertil, who always support and encourage me.
Erik Vanhatalo
Luleå, November 2009
ABSTRACT
Design of Experiments (DoE) includes powerful methods, such as factorial designs, to
help maximize the information output from conducted experiments while minimizing
the experimental work required for statistically significant results. The benefits of using
DoE in industry are thoroughly described in the literature, although the actual use of
the methods in industry is far from pervasive.
Continuous processes, frequently found in the process industry, highlight special
issues that are typically not addressed in the DoE literature. The overall objective of this
research is to increase the knowledge of DoE in continuous processes. More
specifically, the aims of this research are [1] to identify, explore, and describe potential
problems that can occur when planning, conducting, and analyzing experiments in
continuous processes, and [2] to propose methods of analysis that help the experimenter
in continuous processes tackle some of the identified problems.
This research has focused on developing analysis procedures adapted for
experiments in continuous processes using a combination of existing DoE methods and
methods from the related fields: multivariate statistical methods and time series analysis.
The work uses real industrial data as well as simulations. The research method is
dominated by the study of the practical use of DoE methods and the developed analysis
procedures in an industrial case: the LKAB Experimental Blast Furnace plant.
The results are presented in six appended papers. Paper A provides a tentative
overview of special considerations that the experimenter needs to consider in the
planning phase of an experiment in a continuous process. Important experimental
complications discussed further in the papers include the multivariate nature of these
processes, their dynamic characteristics, the need for randomization restrictions due to
experimental costs, the need for process control during experimentation, and the time
series nature of the responses. Paper B develops a method to analyze factorial
experiments with randomization restrictions using principal components combined
with analysis of variance. Paper C shows how the use of the multivariate projection
method principal component analysis can reduce the monitoring problem for a process
with many and correlated variables. Paper D focuses on the dynamic characteristic of
continuous processes and presents a method to determine the transition time between
experimental runs combining principal components and transfer function-noise models
and/or intervention analysis. Paper E further addresses the time series aspects of
responses from continuous processes and illustrates and compares different methods to
analyze two-level factorials with time series responses to estimate location effects. In
particular, Paper E shows how multiple interventions with autoregressive integrated
moving average models for the noise can be used to effectively analyze experiments in
continuous processes. Paper F develops a Bayesian procedure, adapted from Box and
Meyer (1986), to calculate posterior probabilities of active effects for unreplicated two-level factorials, successively considering the sparsity, hierarchy, and heredity principles.
Keywords: Design of Experiments, Continuous process, Process industry, Multivariate
statistical methods, Process monitoring and control, Time series analysis, Analysis of
unreplicated factorials.
SWEDISH ABSTRACT
Försöksplanering omfattar kraftfulla metoder, exempelvis faktorförsök, för att maximera informationsutbytet vid experiment och samtidigt minimera de resurser som krävs för att nå statistiskt säkerställda resultat. Nyttan av att använda försöksplanering vid industriella experiment är väl beskriven i litteraturen men varken kännedomen om eller användningen av metoderna är lika utbredd i industrin.
Kontinuerliga processer, vilka är frekvent förekommande i processindustrin, ger upphov till speciella problem vid experiment som normalt inte behandlas i litteraturen. Det övergripande syftet med den föreliggande forskningen är därför att öka kunskapen om försöksplanering i kontinuerliga processer. Mer specifika mål är att: [1] identifiera, utforska och beskriva potentiella problem som kan uppstå vid planering, utförande och analys av experiment i kontinuerliga processer, samt [2] att föreslå analysmetoder som kan vara till hjälp för att hantera några av de identifierade problemen.
Denna forskning fokuserar på att utveckla analysmetoder anpassade för experiment i kontinuerliga processer genom att kombinera befintliga metoder inom försöksplanering med metoder från de närliggande områdena multivariat dataanalys och tidsserieanalys. Arbetet använder verklig industriell data samt simuleringar. Forskningsmetoden domineras av praktiskt användande och tester av metoder inom försöksplanering och de utvecklade analysmetoderna kring ett verkligt industriellt fall: LKAB:s experimentmasugn i Luleå.
Resultaten av forskningen presenteras i sex bifogade artiklar. Artikel A ger en preliminär översikt av särskilda överväganden som behövs vid planeringen av ett experiment i kontinuerliga processer. Exempel på problem som kan uppstå i kontinuerliga processer och som diskuteras i de efterföljande artiklarna är: deras multivariata natur och dynamiska karaktär, behovet av begränsad randomisering av delförsök för att minska kostnader, behovet av processtyrning under pågående försök och resultatvariabler som representeras av tidsserier. Artikel B utvecklar en metod för att analysera faktorförsök, med begränsad randomisering, baserat på principalkomponenter och variansanalys. I Artikel C används den multivariata projektionsmetoden principalkomponentanalys för att reducera övervakningsproblematiken för en process med många korrelerade variabler. Processdynamik diskuteras djupare i Artikel D som utvecklar en metod för att bestämma omställningstider mellan delförsök baserat på principalkomponentanalys samt överföringsfunktions-brus-modeller och interventionsanalys. Artikel E utvecklar, illustrerar och jämför olika metoder för analys av nivåeffekter hos simulerade tvånivåers faktorförsök med dynamiska effekter och resultatvariabler i form av tidsserier. Artikel E visar särskilt hur multipla interventionsvariabler, kombinerat med ARIMA-modeller för det kvarvarande bruset, kan användas för att analysera experiment i kontinuerliga processer. Artikel F utvecklar en Bayesiansk metod, baserad på Box och Meyers (1986) metod, som i tur och ordning tar hänsyn till analysprinciperna sparsity, hierarchy och heredity för att beräkna posteriorisannolikheterna att effekterna är aktiva för icke-upprepade tvånivåers faktorförsök.
Nyckelord: Försöksplanering, Kontinuerlig process, Processindustri, Multivariat dataanalys, Processövervakning och styrning, Tidsserieanalys, Analys av icke-upprepade faktorförsök.
CONTENTS
1. INTRODUCTION
1.1 Industrial experiments
1.2 Design of Experiments
1.3 Continuous processes
1.4 Design of Experiments in continuous processes
1.5 Research objective and scope
1.6 The organization of the thesis
5. FUTURE RESEARCH
APPENDIX I ABOUT THE BLAST FURNACE PROCESS
A.1 The blast furnace process
A.2 The LKAB Experimental Blast Furnace (EBF)
REFERENCES
APPENDED PAPERS (A–F)
APPENDED PAPERS
This thesis includes the following six papers. The papers, which are appended in full, are
summarized and discussed in the thesis.
1 Paper A was also presented by Erik Vanhatalo, as an invited paper, on October 9th, 2008 at the 52nd Annual Fall Technical Conference in Mesa, Arizona, USA.
2 The article [Vanhatalo, E., Vännman, K. and Hyllander, G. (2007). A Designed Experiment in a Continuous Process. Helsingborg, Sweden: Proceedings of the 10th International Quality Management and Organizational Development (QMOD) Conference] was presented on June 20th, 2007 at the conference by Erik Vanhatalo and is an early and less comprehensive version of Paper B.
3 Paper D was also presented by Erik Vanhatalo at the 9th Annual Conference of the European Network for Business and Industrial Statistics (ENBIS9) in Gothenburg, Sweden, on September 23rd, 2009.
LIST OF ABBREVIATIONS
Abbreviation  Full form
ANOVA    Analysis of variance
ARIMA    Autoregressive integrated moving average
ARMA     Autoregressive moving average
CUSUM    Cumulative sum
DoE      Design of experiments
EBF      Experimental Blast Furnace
EVOP     Evolutionary operation
EWMA     Exponentially weighted moving average
LKAB     Luossavaara-Kiirunavaara AB
LTU      Luleå University of Technology
MANOVA   Multivariate analysis of variance
PCA      Principal component analysis
PLS      Partial least squares
RSM      Response surface methodology
1. INTRODUCTION
This chapter provides an introduction and background to the research area. The objective
and scope of the research and the organization of the thesis are then presented.
Variables such as experimental variables are often labeled factors in DoE literature. Factors and
variables are used interchangeably in this thesis to label such entities that affect the system under
experimentation.
[Figure 1.1: block diagram of a process (or system) with controllable input factors X1, X2, X3, ..., Xp, uncontrollable factors Z1, Z2, Z3, ..., Zq, and output responses Y1, Y2, ..., Yr.]
Figure 1.1 A general model of a process (system) under experimentation. Adapted from
Montgomery (2009, p. 3). Changes in the output responses (Ys) due to changes of experimental
factors (Xs) are measured. However, the system is often affected by uncontrollable factors (Zs) too.
Wu and Hamada (2000) and many other authors, for example, Box et al. (2005), use the concept
Experimental design to label the body of knowledge which is also often referred to as Design of
Experiments (used throughout the thesis).
example, Montgomery (2009) and Wu and Hamada (2000). One of the most
significant contributions was the introduction of fractional factorial designs, see
Finney (1945). After World War II, the methods got a boost when they were
developed to tackle problems in industrial processes (especially in the chemical
industry). Key contributions were made by, for example, George E. P. Box leading
to, for instance, development of response surface methods and sequential
experimentation for process optimization, see Box and Wilson (1951). According to
Steinberg and Hunter (1984), other important subjects within DoE that received
attention during the 1970s were design optimality, computer-aided design, and
mixture designs. Since the 1980s and the much debated work of G. Taguchi,
discussed by, for example, Box et al. (1988), an increased focus has been on
experimental designs for variation reduction in products and processes. According to
Borror et al. (2000) and Montgomery (2009), the increased interest in quality
improvement by Western industries together with Taguchi methods helped to
expand the use of DoE. In particular, the methods became more widely used in
discrete parts industries, such as automotive and electronics manufacturing. Today,
DoE has grown far beyond the agricultural area and is now used in many areas of
science and engineering.
DoE contains many statistical methods and therefore knowledge about statistics
is a central part of understanding how the methods work. DoE, along with statistical
process control, was adopted early by the quality movement, and both are often classified
as important methodologies within quality management, see, for example, Hellsten
and Klefsjö (2000), Deleryd et al. (1999), Xie and Goh (1999), and Powell (1995).
Today, important forums for the development of DoE methods are, for example, the
journals published by the American Society for Quality and other journals with the
word quality in their names. DoE is also one of the important methodologies in the
strategic quality improvement initiative Six Sigma, see Goh (2002). Since quality
improvement is linked to reduction of variation in products and processes, not least
due to Shewhart (1931), statistical thinking and quality improvement are closely
connected (Snee, 1990).
DoE is well known by professionals in the fields of statistics and quality, but
Goh (2001) argues that the use of DoE in industry is far from being pervasive.
Studies of industrial use of DoE show mixed results. In Sweden, Gremyr et al.
(2003) report that DoE is used by a little over 50 percent of the studied industries,
while Bergquist and Albing (2006) present a much lower number in their study.
Tanco et al. (2008) report that DoE was used by about 20 percent of the companies
Continuous production processes can be found in, for example, the pulp and paper
industries, chemical industries, parts of the medical and food industries, as well as
parts of the mining and steel industries. The blast furnace process, which is studied
more closely in this work, is an example of a continuous process in the steel
industry.
Continuous production is generally characterized by, for example: high-technological and complex production processes, capital-intensive production plants,
low-technological products, low added value to products, high production speed,
low equipment flexibility, large change-over times, large volumes of product, and
divergent product flow (many products are produced from a few raw materials), see,
for example, Rajaram and Robotis (2004), Dennis and Meredith (2000), Fransoo
and Rutten (1994), and Kim and Lee (1993).
According to Fransoo and Rutten (1994), continuous processes are typically
hard to control, leading to variable yield and reflux flows of material. The raw
materials to the process industries often come from mining and agricultural
industries and this means that the materials often are afflicted with natural variations.
Dynamic systems
A prerequisite for correctly estimating the effect of a change in factor X on the
response Y is that the change has reached its full impact on the
process. In continuous processes, the propagation of a disturbance (for instance when
changing an experimental factor X) can take time which can lead to requirements of
prolonged experimental runs compared to experiments in non-continuous
production (Saunders and Eccleston, 1992). Black-Nembhard and Valverde-Ventura
(2003) differentiate between dynamic and responsive systems. They explain that
in a dynamic system, a period of delay will occur between the time that X is
changed and the time that this change is realized in the output Y, while this change
in Y is immediate in a responsive system. The time needed for the change in X to
reach full impact in the process, in this thesis often referred to as the transition time,
can also depend on which factor is changed and how large the change is. The
design of continuous processes often includes, for example, tanks, reactors, chemical
reactions, buffer systems, reflux flows, mixing, product state changes, and so on,
which typically make continuous processes dynamic systems.
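The notion of transition time can be made concrete with a small simulation. The sketch below uses an invented first-order model written in Python (the thesis work used Matlab); the gain and time constant are assumptions for illustration only, not EBF values.

```python
import numpy as np

def step_response(gain, tau, dt, n):
    """Discrete first-order response y[t] to a unit step in X applied at t = 0."""
    a = np.exp(-dt / tau)                     # discrete-time pole from time constant tau
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = a * y[t - 1] + gain * (1 - a)  # step input u[t] = 1 for t >= 1
    return y

# A hypothetical process with time constant tau = 8 h needs roughly
# 3 * tau, i.e. about 24 h, to realize ~95 % of a factor change's effect.
y = step_response(gain=2.0, tau=8.0, dt=1.0, n=48)
transition = int(np.argmax(y >= 0.95 * 2.0))  # first hour with >= 95 % of full impact
print(transition)  # -> 24
```

This illustrates why runs in a continuous process may need to be prolonged: observations taken before the transition time has elapsed mix the old and new process states.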
Draper and Stoneman (1968) make the point that, in some situations, it can be
desirable to keep the number of level changes at a minimum, since the time
required for the apparatus to return to steady-state after changes can be considerable and
depend on the number of factors that are changed. In fact, this is opposite to the
recommendations of Saunders et al. (1995), Cheng and Steinberg (1991), and Meyer
and Napier-Munn (1999), who focus on maximizing the number of level changes to
keep bias from time trends in the process from affecting the responses. John (1990) also
discusses how factorial designs can be made robust against time trends. Meyer and
Napier-Munn (1999) recommend that the sampling intervals should be as small as
possible, but not smaller than the time it takes for the process to reach a new
equilibrium state. Since recommended designs for time-dependent (autocorrelated)
processes may have a large number of level changes they may be costly to
implement (Martin et al., 1998). The experimenter may therefore need to balance
optimal designs against practical and cost issues, for example, the time required for
the process to return to steady-state after a change is made. Another related problem
highlighted by Pan et al. (2004) is that many continuous processes are non-stationary
and thus the steady-state assumption is often not reasonable for industrial data.
Although the processes often are non-stationary (or not in statistical control), key
DoE concepts like randomization, replication and blocking still make it possible to
perform designed experiments in these processes, see Bisgaard et al. (2008).
The conceptual ideas for the analysis method were sparked during a course in time
series analysis and all authors helped develop the analysis method. The data
analysis work in relation to the paper was carried out by Erik Vanhatalo with the
assistance of Björn Kvarnström. Erik Vanhatalo mainly wrote the paper with
contributions by the other authors.
1.6.7 Paper E: Analyzing two-level factorial experiments with time series
responses. Vanhatalo, E., Bergquist, B., and Vännman, K. (2009a).
Paper E focuses on the time series aspects of responses from continuous processes.
The paper proposes and compares different methods to analyze time series responses
and estimate location effects. Time series responses are simulated using dynamic
propagations of the effects to mimic a situation that can occur in continuous
processes. The results show how time series analysis and in particular multiple
interventions with autoregressive integrated moving average models for the noise
can be used to analyze two-level factorial experiments in a continuous process.
Time series analysis of the responses is compared with, for example, traditional
analysis methods using averages as the single response in analysis of variance. The
results indicate that by using intervention-noise models to estimate the significance
of the effects, fewer spurious effects are found when the effects are small compared
to the noise, and a larger number of the active effects are found when replication is
limited. The results also show that using averages for each run as the single response
is a straightforward and fairly robust analysis method, which is used to provide crude
estimates of the effects needed to guide the analyst using the multiple intervention-noise models.
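The gist of analyzing an intervention in the presence of autocorrelated noise can be sketched in a few lines. Everything below is illustrative: a single step intervention, AR(1) noise in place of a full ARIMA model, and a Cochrane-Orcutt style two-step fit standing in for the estimation procedure actually used in Paper E.

```python
import numpy as np

rng = np.random.default_rng(1)

# A single step intervention: the factor is at the low level for 100
# observations and at the high level for the next 100 (illustrative sizes).
x = np.concatenate([np.zeros(100), np.ones(100)])
beta = 1.5                                   # true location effect (assumed)

# AR(1) noise mimicking the autocorrelated response of a continuous process.
phi = 0.7
e = rng.normal(0.0, 1.0, x.size)
n = np.zeros(x.size)
for t in range(1, x.size):
    n[t] = phi * n[t - 1] + e[t]
y = beta * x + n

# Step 1: ordinary least squares, which ignores the autocorrelation.
b_ols = np.polyfit(x, y, 1)[0]

# Step 2: estimate phi from the residuals and refit on quasi-differences
# (a Cochrane-Orcutt style stand-in for fitting an intervention-ARMA model).
r = y - b_ols * x
phi_hat = np.corrcoef(r[:-1], r[1:])[0, 1]
ys = y[1:] - phi_hat * y[:-1]
xs = x[1:] - phi_hat * x[:-1]
X = np.column_stack([xs, np.ones(xs.size)])
b_gls = np.linalg.lstsq(X, ys, rcond=None)[0][0]
print(round(b_ols, 2), round(b_gls, 2))
```

Accounting for the noise structure changes the standard error of the estimated effect far more than the point estimate, which is why naive significance tests on autocorrelated responses tend to find spurious effects.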
This paper was initiated by Erik Vanhatalo who developed the simulation model
in Matlab for the dynamic effects and time series responses, performed all the
analyses, and did the main part of the writing of the paper. Bjarne Bergquist and
Kerstin Vännman were both involved in the discussions leading up to the
simulation of the time series, the proposed analysis methods, the setup of the
study, and were also involved in the writing process.
1.6.8 Paper F: A Bayesian analysis of unreplicated two-level factorials
using effects sparsity, hierarchy, and heredity. Bergquist, B., Vanhatalo,
E., and Lundberg Nordenvaad, M. (2009).
Paper F focuses on the analysis of unreplicated factorials. The paper is not limited to
continuous processes. However, in continuous processes, issues like process
stability concerns, cost of experimentation, and the relatively small differences in
factor levels can lead to small unreplicated designs and in these cases powerful
analysis methods are of special importance. The viability of the sparsity, hierarchy,
and heredity principles is first studied by analyzing experiments found in the
literature. The results are then used for prior probability assessment in a Bayesian
procedure, adapted from Box and Meyer (1986), to calculate posterior probabilities
of active effects for unreplicated two-level factorials. A three-step approach is
outlined using the results concerning the sparsity, hierarchy, and heredity principles.
Individual prior probabilities for each effect being active are specified in three steps,
successively considering sparsity, hierarchy, and heredity and posterior probabilities
are calculated for each step.
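The three-step prior specification can be illustrated for the two-factor interactions of a hypothetical 2^3 design. The probability values below are invented for illustration; Paper F assesses the corresponding values from experiments found in the literature.

```python
from itertools import combinations

# Illustrative (not Paper F's) prior probabilities.
p_main = 0.40                                        # sparsity: P(a main effect is active)
p_int_given_parents = {2: 0.25, 1: 0.10, 0: 0.02}    # heredity: P(AB active | # active parents)

mains = ["A", "B", "C"]
prior = {m: p_main for m in mains}

# Hierarchy and heredity: an interaction's prior is averaged over whether its
# parent main effects are active (law of total probability), and stays below
# the main-effect prior because interactions are a priori less likely.
for a, b in combinations(mains, 2):
    p = 0.0
    for ia in (0, 1):
        for ib in (0, 1):
            w = (p_main if ia else 1 - p_main) * (p_main if ib else 1 - p_main)
            p += w * p_int_given_parents[ia + ib]
    prior[a + b] = p

print({k: round(v, 4) for k, v in prior.items()})
# Each two-factor interaction gets prior 0.0952, below the 0.40 of a main effect.
```

These individual priors would then feed a Box and Meyer (1986) style calculation of posterior probabilities for each effect.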
The paper was originally initiated by Bjarne Bergquist who located and performed
the analysis of the experiments found in the literature. Erik Vanhatalo worked to
adapt the Box and Meyer (1986) approach to use the proposed three-step
procedure and developed a calculation application in Matlab with the help of
Magnus Lundberg Nordenvaad. Magnus Lundberg Nordenvaad did the main
development of the Markov chain Monte Carlo integration procedure needed for the
Bayesian analysis method. Analysis of the examples in the paper and the writing
of the paper were performed by Bjarne Bergquist and Erik Vanhatalo jointly.
2. RESEARCH METHOD
This chapter provides a summary of the research method and process including descriptions of
the methodological choices and data collection activities made during the research.
2.1 An introduction
I first came in contact with DoE in 2002 during courses at Luleå University of
Technology (LTU) within the master's programme in Industrial and Management
Engineering. I immediately found quality technology and applied statistics a very
interesting area and I was happy when I got the chance to focus on this area in my
Ph.D. studies at LTU. Before becoming a Ph.D. student, I held a teaching
position at the university for about 18 months where I taught, for example,
introductory courses in DoE, statistical process control, and multivariate statistical
methods for engineering students.
The research presented in this thesis began in late 2005 when I started as a
Ph.D. student. The arrangement of my research project meant that I, together with
my supervisors Bjarne Bergquist and Kerstin Vännman, became involved in a
collaboration project with the Swedish mining industry company Luossavaara-Kiirunavaara AB (LKAB).
LKAB had been running an Experimental Blast Furnace plant (hereafter the
EBF) in Lule since 1997, mainly for product development experiments and
customer experiments. When initially discussing the forming of the collaboration
project, the research engineers at the EBF (EBF engineers) had expressed an interest
in improving their experimental work, and they were also interested in testing factorial
designs at the EBF. The typical experimental designs used at the EBF at the start of
the project were different forms of one-factor-at-a-time experiments.
The collaboration project was viewed as an excellent opportunity to conduct
research on the use of DoE in continuous processes. Given that the EBF is
specifically designed for experimental purposes it would present opportunities to
study and take part in the planning and analysis of several experiments as well as to
learn from the experimental experiences of the EBF engineers. This led to the start
of the Experimental Blast Furnace methodology development project (hereafter
the EBF project) in November 2005. Two years into the project (in late 2007) it
was decided to prolong the collaboration for another two years until November
2009. Hence, the research presented in this thesis has been conducted within the
frame of this collaboration project and mainly with the EBF as the studied case.
Further descriptions of the blast furnace process and the EBF are given in Appendix
I and in the appended papers.
On behalf of LKAB, the project steering group has, with some minor variations over time, consisted
of Gunilla Hyllander, Mats Hallin, Anna Dahlstedt, Peter Sikström, Nicklas Eklund, Guangqing Zuo,
Carina Brandell, Jonas Lövgren, Mikael Pettersson, Anna Brännmark, and Per-Ola Eriksson.
elaborate and explain specific matters concerning the EBF process. Descriptions of
the seven steps of the interview process along with the questions can be found in
Vanhatalo (2007, p. 24 and Appendix).
With the understanding created from the interviews the next step was to plan,
conduct and analyze a pilot test of a factorial experiment in the EBF. The purpose
of this experiment was to investigate the potential of using factorial experiments as
experimental designs in the EBF. The design, analysis and results of this experiment
are not elaborated at any length in the appended papers. Briefly, the experiment was
a 2×2 factorial design with center points, testing two process variables: blast
volume and moisture content of the blast air. The experiment required seven days
of operation in the EBF, with 24 hours for each run. One of the conclusions
from the interviews and from the experience of running a factorial experiment in
the EBF was that a more structured way of planning experiments in the EBF was
needed. Therefore a new experimental planning guide was developed during the
spring and summer of 2006. This guide was developed in collaboration with the
EBF engineers and by incorporating recommendations found in the literature.
Further refinement of the planning guide came from using it to plan upcoming
experiments in the EBF. The essence of the planning guide is described by the
thirteen-step checklist given in Paper A.
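For reference, the coded design matrix of such a pilot experiment can be written out directly. The split into four factorial runs and three center runs below matches the seven 24-hour runs mentioned above but is an assumption, and in practice the run order would be randomized, possibly with restrictions.

```python
from itertools import product

# 2x2 factorial with center points, in coded units (-1, 0, +1), for the two
# process variables blast volume and blast-air moisture content.
factorial_runs = list(product([-1, 1], repeat=2))
center_runs = [(0, 0)] * 3          # assumed number of center runs
design = factorial_runs + center_runs

for i, (volume, moisture) in enumerate(design, start=1):
    print(f"run {i}: blast volume = {volume:+d}, moisture = {moisture:+d}")
```

The center runs make it possible to check for curvature and to get a model-independent estimate of run-to-run variation, which matters when each run costs a full day of furnace operation.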
The results from the pilot test of factorial designs in the EBF showed promising
potential but also raised new questions. An important concern was the minimum
length required for a factorial run in the EBF. There were also questions about the
proper analysis procedure for the experiments due to the multivariate nature of the
responses. Nonetheless, a new and somewhat more complex factorial experiment
was planned and conducted in October 2006. The second experiment tested two
experimental factors, one at three levels and one at two levels, see Paper B for more
details. This time experiences from the previous activities in the collaboration
project were used in the planning phase. Furthermore, after this second experiment,
the experience of working with the new planning guide was evaluated and flaws in
the guide were corrected to produce a template for future use at LKAB.
The period from November 2006 to May 2007 was spent reflecting over the
experiences from the two experiments in the EBF plant. The large number of
responses from the experiments in the EBF together with the recommendations of
multivariate analysis tools for similar situations in process industry found in the
literature led to a focus on multivariate statistical methods for the analysis. The
multivariate analyses of process data from the EBF experiments were made in close
Up to this point, the dynamics of the EBF process had been handled by simply excluding
observations of the responses during the transition time between the experimental
runs. The transition times had been estimated using the EBF engineers' experience,
adding some margin to be on the safe side. Time series analysis provided us with
formal methods to approach this problem. The initial work concerned the use of the
time series techniques transfer function-noise modeling and intervention analysis
[see e.g. Box et al. (2008)] combined with principal components to more formally
assess the transition times for experiments in the EBF and at the same time handle
the multivariate nature of the responses. This work was carried out in the spring of 2009 and
the results are presented in Paper D.
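A rough sketch of this idea: summarize the many correlated responses with their first principal component, then estimate when the process has settled after a factor change. The dynamics, noise level, and the simple settling rule below are all invented; Paper D instead fits transfer function-noise or intervention models to the component scores.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data: 10 correlated responses that all react to a factor
# change at t = 50 through the same first-order dynamics (tau = 10).
t = np.arange(150)
settle = np.where(t < 50, 0.0, 1.0 - np.exp(-(t - 50) / 10.0))
loadings = rng.uniform(0.5, 1.5, 10)
Y = np.outer(settle, loadings) + rng.normal(0.0, 0.02, (150, 10))

# PCA via SVD on mean-centered data; PC1 captures the common movement
# of the correlated responses.
Yc = Y - Y.mean(axis=0)
U, S, Vt = np.linalg.svd(Yc, full_matrices=False)
pc1 = Yc @ Vt[0]
if pc1[-1] < pc1[0]:
    pc1 = -pc1                     # fix the arbitrary sign of the component

# Crude settling rule: the transition ends at the first time after the
# change from which PC1 stays within 10 % of its final level.
final = pc1[-30:].mean()
inside = np.abs(pc1 - final) <= 0.10 * abs(final)
transition = next((i for i in range(50, 150) if inside[i:].all()), None)
print("share of variance on PC1:", round(S[0] ** 2 / (S ** 2).sum(), 3))
print("estimated transition time:", transition - 50, "observations")
```

Working on one component instead of ten separate responses is what makes a single transition-time estimate meaningful for the whole multivariate response.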
During the summer and fall of 2009 we continued to work on the time series
aspects of the responses from continuous processes. During the work with transfer
function-noise models to estimate the transition times, the idea arose to use these models
to analyze the entire time series from an experiment. In particular, it
was interesting to compare a time series analysis approach with other ways of
analyzing experiments with time series responses, such as using averages of the
response in each run. Paper E describes this work, where we chose to focus on two-level factorial experiments with time series responses. As experimental data from
two-level factorials in the EBF were limited, a simulation program was built in
Matlab to be able to simulate responses from a continuous process using the EBF
as inspiration for the dynamics of the effects and responses in the simulations. The
results of this work are given in Paper E.
A question that had been discussed throughout the whole research project was
effective analysis procedures for unreplicated factorials. The question is by no means
limited to continuous processes but grew stronger in our minds since it soon became
evident that large experiments with many replicates typically become too costly in
industrial settings, not least for continuous processes. Hence, powerful analysis
procedures for unreplicated experiments are of special importance for large-scale
industrial experiments. As early as 2006, we started to think about how the analysis
of unreplicated experiments could benefit from incorporating prior knowledge in
the form of the governing principles of sparsity, hierarchy, and heredity. The first idea
was to specify a method using the normal probability plot. However, in the spring
of 2009 the work was directed towards a Bayesian method that would allow us to
incorporate the principles into the prior probabilities and create a more formal
procedure. This work is presented in Paper F. Paper E also discusses the analysis of
19
unreplicated factorial designs, but focuses on experiments with time series responses.
The whole research process is also summarized in Figure 2.1.
November 2005: start of the research project.
February 2006: interviews with EBF engineers.
March 2006 to August 2007: literature study; developing a planning guide for experiments at the EBF (Paper A).
September to December 2007: reflecting on and prioritizing areas for continued research; writing of the licentiate thesis.
May to December 2008: planning, conducting, and analyzing factorial experiments at the EBF (a 2² factorial and a 3×2 factorial); developing a multivariate analysis method (Paper B).
Mainly in the spring of 2009: work to develop multivariate monitoring of the EBF (Paper C); developing a method to determine the transition time for a dynamic process like the EBF (Paper D).
February to June 2009: developing a Bayesian analysis procedure for unreplicated two-level experiments (Paper F).
August to November 2009: developing analysis methods and performing simulations of experiments with time series responses (Paper E).

Figure 2.1 An overview of the main research activities during the research process.
RESEARCH METHOD
The research has been run largely in accordance with the abductive approach, see Figure 2.2.
Figure 2.2 Deduction, induction, and abduction, seen as paths between theory and empirical data, according to Alvesson and Sköldberg (1994, p. 45), and the approach used in this research. The figure is inspired by Söderholm (2005).
The research strategy can be viewed as the framework for the collection and analysis of data, see Bryman (2001). This research has relied heavily on the EBF and the EBF engineers as the source for finding potential problems when planning, conducting, and analyzing experiments in a continuous process. The EBF has also been the source of industrial data and constitutes the case around which the proposed analysis methods have been developed and tested (see especially Papers B, C, and D).
It is possible to view the research strategy as dominated by a single case study
of an industrial case (the EBF). Yin (2003) is an excellent reference for case study
research but has an explicit focus on social science research. This research studies the use of statistical methods in an industrial context, but the work mainly relates to the development and testing of analysis methods inspired by the industrial context, a strategy that does not fit the typical description of a case study.
Nonetheless, I believe that certain aspects of the recommendations for research
design and methods in Yin (2003) are worth considering and I refer to them when
appropriate. For example, using a single industrial case was judged to be a good
strategy since the EBF provided the opportunity to closely follow the experimental
work in a continuous process specifically designed for experimental purposes. The
EBF can therefore be argued to be a unique case, and hence be a reason for
choosing a single case, see Yin (2003).
This research can also be considered to include elements of action research, as
the author (and his supervisors) participated in, for example, the planning and the
analysis of experiments at the EBF plant. Whether action research is a research strategy in its own right is not settled in the literature. However, I view action research as a method to perform a case study, see also Gummesson (2000). Coughlan and Coghlan (2002, p. 236) argue that action research generates emergent theory, "in which the theory develops from a synthesis of that which emerges from the data and that which emerges from the use in practice of the body of theory which informed the intervention and research intention." The action research process is undertaken in a spirit of collaboration and co-inquiry and aims to stimulate change in organizations, develop self-help competencies, and add to scientific knowledge (Shani and Pasmore, 1985). I believe these descriptions of action research fit the collaborative nature of the EBF project well.
10 The protocols can be viewed, after consideration of possible secrecy issues, by contacting the author of this thesis.
Paper A. Main data collection through interviews with EBF engineers, observations and discussions, and work to develop the planning guide. Data analysis activities: analysis of interviews.
Paper B. Main data collection through planning, performing, and analyzing a 3×2 factorial experiment in the EBF, and planning, performing, and analyzing two factorial experiments. Data analysis activities: univariate analysis of experimental data, analysis of experimental data in spreadsheet software and by multivariate statistical analyses, principal component analysis, and multivariate (and univariate) analysis of variance.
Paper C. Main data collection through study of process data from the EBF, discussions with EBF engineers and study of logbooks, and online tests of the monitoring model. Data analysis activities: principal component analysis, comparison of monitoring signals with actual control actions in logbooks, and evaluation of online tests.
Paper D. Main data collection through study of past process data and experimental factor changes in the EBF. Data analysis activities: principal component analysis, and time series analysis through transfer function-noise models and intervention analysis.
Paper E. Main data collection through study of process data from the EBF and simulations. Data analysis activities: comparison of analysis methods for simulated experiments with dynamic effects, analysis of variance, time series analysis, intervention-noise modeling, and discussions with supervisors and engineers at the LKAB EBF.
Paper F. Main data collection through a literature survey of published experiments and simulations. Data analysis activities: analysis of experiments found in the literature survey, Bayesian calculation of posterior probabilities, and re-analysis of experiments found in the literature.

Figure 2.3 The main data collection and analysis activities in connection to each paper.
Secondly, on a summarized level, the analysis of the empirical data and results from the separate research activities was done by comparing empirical evidence against theory through reflection and the drawing of rational conclusions. The analysis can hence be described as an iterative process in which the empirical results in the research project either strengthened or overturned our prevailing understanding of DoE in continuous processes. Discussions between the author and his supervisors, as well as with EBF engineers at both informal and formal project meetings, have been an important part of this analysis process.
Research quality

- Validity concerns the integrity of the conclusions generated from the research. That is, do we measure what we intended to measure?
- Reliability concerns the question of whether the results of a study are repeatable. That is, can the results be considered stable?
- For replication to take place, the study needs to be replicable. This criterion is closely connected to reliability.

Figure 2.4 Three important criteria for evaluating the quality of research. The figure is inspired by the explanations of the concepts in Bryman (2001).
Another important criterion for assessing the quality of research is the domain to
which the results can be generalized, which is referred to as external validity by Yin
(2003).
Yin (2003, p. 34) describes tactics useful within case study research to secure research quality. I believe that some of these tactics are relevant for strengthening the quality of this research. The validity of the conclusions from the research based on the EBF case is strengthened by the multiple sources of evidence that were used during data collection, such as experiments, interviews, analysis of process data, and observations. The many different sources of evidence gathered using the EBF case make data triangulation possible, see Yin (2003).
To further strengthen the validity, separate reports describing ongoing work
and emerging results have been produced within the EBF project and continuous
discussions were held with key informants at the EBF plant. Examples of such
separate reports are protocols from project meetings, monthly reports, and internal
feedback reports. The frequent discussions helped us to focus on problems that were
considered important to the engineers doing experiments in a continuous process.
A general weakness of research that relies heavily on the study of a single industrial case, here in the form of the EBF, is that the results may be hard to generalize; in other words, the results may have poor external validity (Yin, 2003). Instead of statistical generalization (made possible by studying many different industrial contexts), this research has to rely on analytical generalization, which tries to generalize the results to broader theory. The results are analytically compared to previous theory about DoE, continuous processes, and the methods of analysis. Analytical generalization is hence used to increase the external validity of the results.
In addition, the ongoing study of literature has been used to guide decisions
concerning data collection activities during the research project.
A challenge in case study research is to produce reliable results, or more
specifically, results that can be repeated (Yin, 2003). In the EBF project, most
decisions and data collection activities were recorded in protocols from meetings
and stored in a database. Documentation of activities and decisions within the EBF
project has thus been used to increase the reliability. However, it is difficult to
replicate the study because of the elements of action research, the collaborative
nature of data collection within the study, and the uniqueness of the EBF setting.
The specific data analyses described in the appended papers, using, for example, process data from the EBF, are, however, possible for an outsider to replicate, which strengthens the reliability of the results.
Barton (1997) discusses experimental planning but focuses on graphical tools to aid the planning process.
Table 3.1 Seven steps of designing an experiment. Source: Montgomery (2009, p. 14), who argues that steps 2 and 3 are often done simultaneously or in reverse order.

Step  Activity
1     Recognition of and statement of the problem
2     Selection of the response variable
3     Choice of factors, levels, and ranges
4     Choice of experimental design
5     Performing the experiment
6     Statistical analysis of the data
7     Conclusions and recommendations
In completely randomized designs, the accumulated transition times can become too costly. Hence, randomization restrictions are often needed for experiments in continuous processes.
Figure 3.1 An illustration of the need for a transition time between experimental runs in an experiment in a continuous process. After a change of factor level, the response variable Y_i needs a transition time before the full impact of the change is visible.
Many responses (often highly autocorrelated) are needed to capture the effect of the experimental treatments. A multivariate response situation needs to be considered already in the planning phase, as it undoubtedly affects the analysis of the experiment. The multivariate situation also makes it more complicated to follow the recommendations in the literature on detailed planning of, for example, anticipated effects, and to foresee and document suspected interactions.
Furthermore, experiments in continuous processes typically mean large-scale and long-term experimentation (around the clock). Coordination, information, and control issues become even more important (and complicated) in such cases. The scale, the complexity, and the many people involved make it hard to perform pilot tests of, for example, factor levels, and the experimenter must also plan for breakdowns of important process equipment. The continuous nature of the process makes every incident more severe, since it can affect a long period of process operation and result in large costs.
3.1.1 Experimental design and continuous processes
Industrial experimentation is expensive, not least in full-scale continuous processes. Therefore, factorial designs, especially two-level factorials, are often interesting designs that produce information at a relatively low cost. Montgomery (2009) considers two-level factorial designs to be the cornerstone of industrial experimentation. Two-level factorials also form the basis for fractional factorial designs, which are valuable for screening experiments (Box et al., 2005). Fractional
factorials are arranged so that less likely interactions are aliased (varied in the same
pattern) with factors or interactions considered more likely to be active. The
resolution of a fractional factorial design provides important information about the
alias structure of the design. In a resolution III (three) design, for example, main
effects are aliased with two-factor interactions. See, for example, Myers and
Montgomery (2002) for more on design resolution. Czitrom (1999) describes the advantages of factorial designs (compared to one-factor-at-a-time experiments) for testing two or more experimental factors:
- they require fewer resources for the amount of information that is obtained,
- the effect estimate for each factor (or interaction) is more precise given the same number of observations,
- the interaction effects between two or more factors are systematically estimated, and
- the experiment produces information in a larger region of the factor space.
Figure 3.2 gives an example of short notation (used in Paper F) and explanations for
a two-level 1/8 fractional factorial design testing six factors in eight runs with
resolution III.
Figure 3.2 The short notation $2_{III}^{6-3}$: 2 is the number of levels for each factor, 6 is the number of factors, 3 is the number of design generators (fractional factorials), III is the resolution of the design (fractional factorials), and $2^{6-3} = 8$ is the number of runs required for each replicate of the design.
The papers appended to this thesis have an explicit (Papers B, E, and F) as well as an implicit (Paper D) focus on the use and analysis of factorial designs in continuous processes.
Randomization is one of the core principles of well-designed experiments and
should be used whenever possible, see, for example, Young (1996) and Bjerke
(2002). That is, both the allocation of experimental material and the order of the
runs in the experiment should be determined randomly. Randomization is used to prevent bias or systematic error from affecting the conclusions of the experiment (Cox and Reid, 2000). Randomization is of special importance when experimenting in processes that are non-stationary in nature, see Bisgaard et al. (2008).
long time period) may be the only viable experimental design for full-scale
production plants.
3.1.2 The need for process control and monitoring during
experimentation
As highlighted in Papers A and B, and specifically discussed in Paper C, process control during the experiment may be unavoidable. In the EBF case, the thermal state of the process needs to be monitored and controlled for personnel and plant safety reasons. In many plants, autonomous and automatic control systems are constantly working to create process stability, even during experiments. Process control may also be non-automated (manual) and performed by operators. An adaptation of Figure 1.1 may therefore be appropriate to give a more realistic representation of an experiment in many industrial processes, see Figure 3.3.
Figure 3.3 An adaptation of Figure 1.1 showing a general model of an industrial process (or system) under experimentation, with inputs, controllable factors (experimental factors, control variables, and held-constant factors), and outputs Y1, Y2, …, Yr. Here the Xs (X1, X2, …, Xp) label experimental factors varied according to a pre-determined experimental design or held-constant factors. The Cs (C1, C2, …, Ck) label control variables that are varied to maintain process control during the experiment. The Zs (Z1, Z2, …, Zq) are the uncontrollable (noise) factors.
As pointed out by Hild et al. (2000), control actions can mean that the effects of experimental treatments are not directly visible as changes in typical responses, but instead appear as changes in control variables. In relation to Figure 3.3, this means that the response due to an experimental treatment may be displaced from typical responses (Ys) to control variables (Cs). Analyzing data from a process subjected to feedback control (automated or manual) is often referred to as analysis under closed-loop operation, see, for example, Box and MacGregor (1974). An implication is that sometimes these control variables must be used as responses. A prerequisite for doing so is, however, unbiased control. When people are involved in the control actions, as is the case at the EBF, a subjective dimension is added to control decisions, which further
complicates the matter. In particular, it becomes hard to ensure that different people would make the same control actions in a given process situation.
Another complicating issue is the need to analyze many responses jointly to determine the current process state and to judge the need for control actions. This is a consequence of the multivariate nature of continuous processes. Furthermore, the control of a continuous process is complicated by its dynamic character. The experimenter should anticipate some time lag before process control actions reach full effect, just as for responses to experimental factor changes. In addition, there may be a time delay in the measurement of responses used as information about the process state. Paper C deals with the situation at the EBF plant, where process control actions are made by operators based on information about the process state given by certain responses from the process. A starting point for
unbiased control actions is process monitoring that signals when something out of
the ordinary is occurring in the process. The need for process monitoring relates this
research, Paper C in particular, to the area of statistical process control (or statistical
quality control). It is a well-developed area and an introduction can be found in, for
example, Montgomery (2005). Distinctive tools that form the basis for statistical process control are control charts such as the Shewhart chart (Shewhart, 1931), the cumulative sum (CUSUM) chart (Page, 1954), and the exponentially weighted moving average (EWMA) chart (Roberts, 1959). Briefly, the control charts are used to monitor processes and are designed to signal12 when a process shift has occurred or when the variability in the process is unusually large or small.
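As a minimal illustration of the monitoring idea, the EWMA chart smooths the observations and signals when the smoothed statistic leaves its control limits. The sketch below uses common textbook parameter values (λ = 0.2, limits at three standard deviations of the EWMA statistic) and simulated data; it is not tied to the EBF application:

```python
import math
import random

def ewma_chart(x, mu0, sigma, lam=0.2, L=3.0):
    """Return the EWMA statistics z_t and the (1-based) indices where the
    chart signals.  lam is the EWMA weight, L the width of the limits."""
    z, zs, signals = mu0, [], []
    for t, xt in enumerate(x, start=1):
        z = lam * xt + (1 - lam) * z
        zs.append(z)
        # Exact (time-varying) variance of the EWMA statistic.
        var = sigma ** 2 * (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * t))
        if abs(z - mu0) > L * math.sqrt(var):
            signals.append(t)
    return zs, signals

random.seed(4)
# Fifty in-control observations followed by a sustained two-sigma mean shift.
data = [random.gauss(0, 1) for _ in range(50)] + \
       [random.gauss(2, 1) for _ in range(50)]
zs, signals = ewma_chart(data, mu0=0.0, sigma=1.0)
print("first signal at observation:", signals[0] if signals else None)
```

The small weight given to each new observation makes the EWMA chart sensitive to small sustained shifts, which is one reason it is often preferred over the Shewhart chart for continuous processes.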
Multivariate monitoring and control have received increasing attention, not least due to the development of computers and software over the last decades. The focus on multivariate monitoring and control is especially apparent within the area of chemometrics, and many examples of multivariate process monitoring and control come from process industries and continuous processes. Wise and Gallagher (1996) provide just one of many descriptions of how process industries often are richly instrumented with sensors routinely collecting measurements on many process variables, such as temperatures, pressures, and physical properties. Hence, process industries often need to monitor a multitude of variables and face a multivariate monitoring situation.
12 The typical terminology is to say that a process that operates with only chance causes of variation present is in statistical control, but when assignable causes are present the process is said to be out of control, see Montgomery (2005).
- a quick overview of the thermal state in the EBF process, and a summary of the information in many original process variables, which is important to make correct and timely control decisions,
- a way to standardize how the thermal state is assessed, which is one step towards an unbiased decision of when to perform control actions, even though human deliberations may still be needed to determine what the appropriate action is, and
- formal decision criteria to determine when the process is operating normally, which could be used to decide when a new experiment is to be started or to select data observations to be included in the analysis of the experiment.
significant differences within the group of responses. By doing this, the overall significance level is known and can be kept at the desired level. Individual ANOVAs can then be used to test the main and interaction effects on individual responses. Similar to ANOVA, MANOVA partitions the total variation into components attributable to the main and interaction effects and to the error. Unreplicated experiments have no internal estimate of the experimental error and therefore require other types of analysis. The analysis of unreplicated factorial designs is discussed more thoroughly later, in connection with Paper F.
3.2.1 Multivariate statistical analysis of experiments
Multivariate statistical analysis refers to all statistical methods that
simultaneously analyze multiple measurements on an object (Hair et al., 1998).
Typically, an analysis of more than two variables simultaneously can be considered multivariate. Today, it is common for industries to have to deal with large amounts of measurement data, such as temperatures, pressures, and physical properties, which often are logged on-line (Yang, 2004). This is a common situation for continuous processes, as discussed above and in the appended papers.
In a situation with many cross-correlated variables to analyze, a one-variable-at-a-time approach to analysis is often ineffective and inefficient, and can contribute to the drawing of wrong conclusions. MacGregor (1997) argues that interpreting results from a univariate approach to analysis in the presence of correlation among responses is analogous to the inferior one-factor-at-a-time approach to experiments in the presence of interactions; Daniel (1959) also makes this point.
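MacGregor's argument can be made concrete with a small numerical sketch (the data are invented): an observation may look unremarkable in each response separately and yet clearly break the correlation structure, which only a multivariate measure such as the Mahalanobis distance reveals:

```python
import math

# Two responses that normally move together (strong positive correlation).
y1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y2 = [1.1, 2.1, 2.9, 4.2, 4.9, 6.1, 7.0, 7.9]

# A new observation: each coordinate is unremarkable on its own,
# but the pair breaks the correlation (high y1 together with low y2).
new = (7.0, 2.0)

def mean(v):
    return sum(v) / len(v)

m1, m2 = mean(y1), mean(y2)
s11 = mean([(a - m1) ** 2 for a in y1])                 # variance of y1
s22 = mean([(b - m2) ** 2 for b in y2])                 # variance of y2
s12 = mean([(a - m1) * (b - m2) for a, b in zip(y1, y2)])  # covariance

# Squared Mahalanobis distance accounts for the covariance structure.
det = s11 * s22 - s12 ** 2
d1, d2 = new[0] - m1, new[1] - m2
maha2 = (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det

# Univariate z-scores look harmless; the Mahalanobis distance does not.
print("z1 =", d1 / math.sqrt(s11))
print("z2 =", d2 / math.sqrt(s22))
print("squared Mahalanobis distance =", maha2)
```

Both univariate z-scores stay well inside typical three-sigma limits, while the Mahalanobis distance is enormous, because the new point contradicts the near-perfect correlation in the historical data.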
The field of multivariate analysis incorporates many techniques, not all of which are elaborated here. A good reference is Johnson and Wichern (2002), who explain many of the commonly used multivariate techniques. The research in this thesis has made frequent use of latent variable techniques, and PCA in particular. The two latent variable techniques PCA and PLS reduce the dimensionality of the data by projecting the information in the data onto low-dimensional spaces defined by a small number of latent variables (Kourti et al., 1996). PCA is discussed and described in more depth in Papers B, C, and D and is therefore not elaborated here. Jackson (1980; 1981; 2003) also provides a good introduction to PCA. Eriksson et al. (2006) describe PLS as a regression extension of PCA with the aim of connecting two data matrices (X and Y) to each other. While PCA can be described as a maximum variance least squares projection of one of the matrices X or Y, PLS is a maximum covariance model of the relationship between X and Y (Eriksson et al., 2006). The
mathematics behind the PLS technique can be found in, for example, Eriksson et al. (2006) or Höskuldsson (1988).
Paper B illustrates the important connection between designed experiments
(using factorial designs) and multivariate statistical analysis and shows how the
multivariate characteristic of continuous processes affects the analysis of the
experiments. Paper B proposes an analysis method for experiments in the EBF
process that combines PCA, MANOVA, and ANOVA. PCA is used to derive
latent, uncorrelated variables (principal components) that summarize the strongest
signals in the response data. The principal components are then used as new
responses to test for statistical significance of main and interaction effects. The many
responses from each run in the EBF process are time series with observations
available each second or each hour. Consequently, the principal components also
become time series from each run. Since the original responses are highly
autocorrelated, so are the principal components. Therefore the approach used in
Paper B is to calculate averages of the values of the principal components from each
run before applying MANOVA and ANOVA to the averages. The assumption of
independent and normally distributed observations for MANOVA and ANOVA is
reasonably achieved by calculating averages for each run. The proposed analysis
method is summarized in Figure 3.4. The analysis in Paper B also compares the
result between the assumption of a completely randomized design and a split-plot
design since the design used in Paper B lies somewhere between the two extremes.
Figure 3.4 The proposed analysis method: a response matrix with N observations on M responses (Y1, Y2, …, YM) is reduced by PCA to k principal components (t1, t2, …, tk), which are then analyzed by MANOVA and ANOVA (on averages).
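The steps summarized above can be sketched numerically: compute principal component scores of the mean-centered response matrix, and reduce each run's autocorrelated score series to a run average before MANOVA/ANOVA. A minimal sketch with simulated data (the dimensions, effect sizes, and number of retained components are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated experiment: 4 runs, 100 observations per run, 5 responses.
n_runs, n_obs, n_resp = 4, 100, 5
run_means = rng.normal(0.0, 2.0, size=(n_runs, n_resp))   # treatment effects
Y = np.vstack([m + rng.normal(0.0, 1.0, size=(n_obs, n_resp))
               for m in run_means])

# PCA via singular value decomposition of the mean-centered data.
Yc = Y - Y.mean(axis=0)
U, S, Vt = np.linalg.svd(Yc, full_matrices=False)
k = 2                       # number of retained components
scores = Yc @ Vt[:k].T      # principal component score time series

# Average the score series within each run; the run averages (not the
# autocorrelated raw scores) are then carried into MANOVA/ANOVA.
run_avgs = scores.reshape(n_runs, n_obs, k).mean(axis=1)
print(run_avgs)
```

The run averages form a small (runs × components) table that satisfies the independence assumption of MANOVA and ANOVA far better than the raw score series.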
Two examples of analysis approaches similar to the one proposed in Paper B have been found in the DoE literature. Ellekjær et al. (1997) use PCA and normal probability plots of the estimated effects based on principal components to analyze split-plot experiments. Bjerke et al. (2008) use PCA as a first step when analyzing an experiment with restricted randomization in the food industry.
A first-order transfer function relates the output $Y_t$ to the input $X_t$ with dead time $b$:

$Y_t = \delta_1 Y_{t-1} + \omega_0 X_{t-b}$   (3.1)

or, using the backshift operator $B$, $(1 - \delta_1 B) Y_t = \omega_0 X_{t-b}$.
Figure 3.5 The dynamic relation between an input X_t to, and an output Y_t from, a system can be represented by a transfer function. The figure is adapted from Box et al. (2008, p. 440).
Intervention analysis (Box and Tiao, 1976) can be viewed as transfer function-noise
modeling using qualitative step or pulse variables as inputs (instead of continuous X
variables) to indicate the presence or absence of an event of some kind. Transfer
function-noise models and intervention analysis are further discussed in Papers D
and E, see also Box et al. (2008), Montgomery et al. (2008), Wei (2006), and Jenkins
(1979).
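As a sketch of the underlying idea, consider a first-order dynamic system subjected to a step intervention; the dynamic parameters can then be recovered from the observed series, here by simple least squares rather than full transfer function-noise model building. All parameter values below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Step intervention: the experimental factor changes level at t = 100.
n, b = 300, 2                      # series length and dead time
x = np.zeros(n)
x[100:] = 1.0

# Simulate a first-order transfer function response with noise:
#   y_t = delta1 * y_{t-1} + omega0 * x_{t-b} + a_t
delta1, omega0 = 0.8, 1.5
y = np.zeros(n)
for t in range(1, n):
    x_lag = x[t - b] if t >= b else 0.0
    y[t] = delta1 * y[t - 1] + omega0 * x_lag + rng.normal(0.0, 0.1)

# Recover the dynamics by least squares on (y_{t-1}, x_{t-b}).
t_idx = np.arange(b + 1, n)
X = np.column_stack([y[t_idx - 1], x[t_idx - b]])
coef, *_ = np.linalg.lstsq(X, y[t_idx], rcond=None)

# Steady-state gain of the step response: omega0 / (1 - delta1).
gain = coef[1] / (1 - coef[0])
print("estimated delta1, omega0:", coef, "gain:", gain)
```

The estimated delta1 determines how quickly the response approaches its new level, which is exactly the information needed to judge the transition time between experimental runs.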
Paper D returns to the observation discussed above and in connection with Papers A and B, namely that continuous processes are dynamic systems with inertia. Process dynamics must be considered already during experimental planning, since they affect the required length of the experimental runs in the process. Bisgaard and Kulahci (2007) briefly discuss the problem of studying what they call regime changes in industrial processes and point to the use of transfer functions and intervention analysis to study the transition periods. Paper D outlines a method to analyze process dynamics and formally estimate the transition time required between runs in a dynamic process with many responses. The proposed method combines the time series analysis techniques transfer function-noise modeling and intervention analysis with PCA. Similar to the approach in Paper B, PCA is performed in order to summarize the variation in the responses. The principal components are then used in conjunction with transfer function-noise models (quantitative X) or intervention analysis (qualitative X) to model the propagation of a change of experimental factors in the EBF process. The steps for model building proposed in Paper D (except the use of principal components) build on the recommendations in Montgomery et al. (2008) and Bisgaard and Kulahci (2006b; 2006c).
Paper D once again highlights the multivariate nature of continuous processes
and connects the use of multivariate statistical methods to time series analysis
operation. Dynamic effects are then allowed to affect the resulting time series during
the simulated experiment, for example:
yt
(3.2)
Even if two-level factorials and fractional factorials are used, the experimental design
can become too large and costly if it is replicated. Therefore, unreplicated two-level
factorials are often used to generate information at a low cost. However,
unreplicated designs provide no independent estimate of the experimental error.
As described in Paper F, the analysis of unreplicated two-level factorials is traditionally made by studying a normal or half-normal probability plot to determine which effects seem to deviate from the reference distribution of inert13 effects, see, for example, Daniel (1959). Other, more formal methods of analysis have been proposed in the literature. For example, it is possible to select a number of effects or contrasts, prior to the experiment, that are unlikely
13
Frequently used nomenclature calls effects that are significantly larger than most of the other effects
active effects, while those effects that seem to be measuring only random noise are called inert.
to be active and use them to estimate the experimental error, see Finney (1945).
Another alternative is to sort the contrasts based on their absolute sizes and use some
fraction of the smallest effects to estimate the distribution of inert effects, see, for
example, Lenth (1989). Much work has focused on developing objective methods of analysis for unreplicated experiments, for example, Daniel (1959), Zahn (1975), Box and Meyer (1986), Voss (1988), Benski (1989), Lenth (1989), Berk and Picard (1991), Le and Zamar (1992), Box and Meyer (1993), Dong (1993), and Venter and Steel (1996). Hamada and Balakrishnan (1998) provide a comprehensive review and comparison of these and other methods. The analysis of unreplicated factorials remains an active research area, and further contributions are given by, for example, Sandvik-Wiklund and Bergman (1999), Chen and Kunert (2004), and Costa and Pereira (2007). The results in foremost Hamada and Balakrishnan (1998) and Chen and Kunert (2004) show that there is no clear winner among the methods. Some methods are good when there are only a few active effects but perform worse when there are many active effects.
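Lenth's (1989) method is easy to state compactly: a robust pseudo standard error (PSE) is computed from the smaller contrasts and used to judge the rest. A sketch with invented effect estimates (the multiplier 2.571 is the t quantile with d = m/3 = 5 degrees of freedom suggested by Lenth for m = 15 contrasts):

```python
def lenth_pse(effects):
    """Lenth's pseudo standard error for a set of effect estimates."""
    abs_effects = sorted(abs(e) for e in effects)
    # Middle element (upper median when the count is even).
    s0 = 1.5 * abs_effects[len(abs_effects) // 2]
    # Trim contrasts that look active, then re-estimate the scale.
    trimmed = [a for a in abs_effects if a < 2.5 * s0]
    return 1.5 * trimmed[len(trimmed) // 2]

# Fifteen effect estimates from a hypothetical unreplicated 2^4 factorial:
# most are noise, two are clearly active.
effects = [-0.3, 0.1, 8.5, -0.2, 0.4, -0.1, 0.05, -0.5,
           0.2, -0.35, 5.0, 0.15, -0.25, 0.3, -0.15]

pse = lenth_pse(effects)
margin = 2.571 * pse   # margin of error, t quantile with d = m/3 = 5
active = [e for e in effects if abs(e) > margin]
print("PSE:", pse, "margin of error:", margin, "active:", active)
```

Because the initial scale estimate is trimmed before the final PSE is computed, a few large active effects do not inflate the reference distribution of the inert ones.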
Most analysis methods for unreplicated factorials rest on the implicit hypothesis called the effects sparsity principle, which states that, in general, only a few of the effects in a factorial experiment will be active, see Box and Meyer (1986). Two other important hypotheses for the analysis are the effects hierarchy principle, which states that lower-order effects are more likely to be important than higher-order effects (Wu and Hamada, 2000), and the effects heredity principle, which implies that an interaction is more likely to be active if at least one of its parent factors is active. These hypotheses are often referred to but have seldom been validated in the literature. Paper F sets out to investigate the viability of the three hypotheses by studying experiments found in the literature. All three principles are found to be viable. The results presented in Paper F largely agree with those presented by Li et al. (2006).
In Paper F it is argued that, during the analysis of unreplicated experiments, prior knowledge (such as knowledge about the three principles) could contribute to the analysis and increase its power. Paper F focuses on incorporating the prior knowledge about the three principles into the Bayesian14 approach to analyzing unreplicated two-level factorials presented by Box and Meyer (1986), which in its original form only considers effects sparsity. In the adapted Box and Meyer method
14 In Bayesian inference, the unknown parameters are regarded as stochastic variables with a prior distribution. From the observations, the posterior distributions of the parameters are calculated using Bayes' rule. In classical (or frequentist) inference, the unknown parameters are regarded as deterministic.
the posterior probability that effect i is active, given the estimated effect $T_i$ and the standard deviation $\sigma$ of the inert effects, is

$P(\text{effect } i \text{ active} \mid T_i, \sigma) = \dfrac{(\alpha_i/k)\exp\left(-T_i^2/(2k^2\sigma^2)\right)}{(\alpha_i/k)\exp\left(-T_i^2/(2k^2\sigma^2)\right) + (1-\alpha_i)\exp\left(-T_i^2/(2\sigma^2)\right)}$   (3.3)

where $\alpha_i$ is the prior probability that effect i is active and $k$ is the inflation factor for the standard deviation of the active effects.
The conditioning on $\sigma$ in (3.3) needs to be removed, and Paper F shows how this can be achieved by integrating (3.3) over the posterior distribution of $\sigma$, $p(\sigma \mid T)$. Box and Meyer (1986) propose numerical integration to calculate the posterior probabilities. Stephenson et al. (1989) show how the posterior probabilities can be calculated analytically for up to 15 effects, as well as by numerical integration. The solution in Paper F involves the application of Bayes' rule, see Gelman et al. (2004):
\[
p(\sigma \mid T) = \frac{p(\sigma)\, p(T \mid \sigma)}{p(T)} \propto p(T \mid \sigma)\, p(\sigma),
\tag{3.4}
\]
and numerical integration using a Markov chain Monte Carlo approach. In Paper F,
the Metropolis algorithm is used to perform the numerical integration required to
calculate the posterior probabilities. The Metropolis algorithm is a special case of the
Metropolis-Hastings algorithm discussed by, for example, Chib and Greenberg
(1995) and Gelman et al. (2004).
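To make the integration step concrete, the following sketch (my own illustration, not the code from Paper F) runs a random-walk Metropolis sampler on log σ, assuming the Box and Meyer mixture likelihood, a 1/σ prior, a common prior probability alpha, and inflation factor k; the unconditional activity probabilities are then averages of Eq. (3.3) over the sampled σ values:

```python
import math
import random

def log_post_sigma(sigma, effects, alpha, k):
    """log p(sigma | T) up to a constant: mixture likelihood times a
    Jeffreys-type 1/sigma prior (an assumption for this sketch)."""
    lp = -math.log(sigma)
    for t in effects:
        act = (alpha / (k * sigma)) * math.exp(-t * t / (2 * k * k * sigma * sigma))
        ina = ((1 - alpha) / sigma) * math.exp(-t * t / (2 * sigma * sigma))
        mix = act + ina
        if mix <= 0.0:
            return float("-inf")
        lp += math.log(mix)
    return lp

def activity_probs(effects, alpha=0.25, k=10.0, n_iter=20000, burn=2000,
                   step=0.3, seed=1):
    """Posterior activity probabilities via Metropolis integration over sigma."""
    rng = random.Random(seed)
    theta = 0.0  # log sigma; the log-scale target includes the Jacobian term
    def log_target(th):
        return log_post_sigma(math.exp(th), effects, alpha, k) + th
    lp = log_target(theta)
    sums = [0.0] * len(effects)
    kept = 0
    for it in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)  # random walk on log sigma
        lp_prop = log_target(prop)
        u = rng.random()
        if u > 0.0 and math.log(u) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if it >= burn:  # average Eq. (3.3) over the retained sigma draws
            s = math.exp(theta)
            for i, t in enumerate(effects):
                a = (alpha / k) * math.exp(-t * t / (2 * k * k * s * s))
                b = (1 - alpha) * math.exp(-t * t / (2 * s * s))
                sums[i] += a / (a + b)
            kept += 1
    return [v / kept for v in sums]
```

With a few clearly large effect estimates among many small ones, the sampler concentrates σ near the scale of the small effects, so the large effects receive posterior probabilities close to one.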
Paper F outlines a three-step method that successively considers the sparsity,
hierarchy, and heredity principles and calculates the posterior probabilities that
effects are active. The principles are incorporated in the adapted Box and Meyer
(1986) method by adjusting the prior probabilities, α_i (see Eq. 3.3), for the effects.
The method in Paper F extends the Box and Meyer (1986) approach by also
considering effects hierarchy and heredity. These principles have been incorporated
in other Bayesian algorithms for variable selection for more general regression
models, see Chipman (1996), and Chipman et al. (1997).
In addition, the Bayesian method in Paper F also allows for the consideration
of process knowledge when specifying the prior probabilities for the effects.
However, process knowledge and statistical analysis skills do not always reside in the
same person and therefore Box and Liu (1999) stress the collaboration between
statisticians and experimenters during design and analysis. Knowledge about both
statistics and the process itself is important to successfully design, conduct and
analyze an experiment.
control actions. Furthermore, there are often many variables that have to be
considered jointly to make the control decision. If the control includes human
deliberations at specific processing states, as in the EBF case, a subjective dimension
is added to the control decisions, which further complicates the matter. Process
monitoring is needed to make correct and timely control decisions. Paper C
discusses the development of multivariate monitoring of the EBF process using
principal components to reduce the number of variables to monitor, improve
monitoring results, decrease subjectivity in control actions, and in the end improve
experimental validity. However, an exhaustive investigation of the importance of
the effects of process control on the results and suitable strategies to tackle the need
for control actions during experiments has not been conducted during this research.
An experimenter who performs lengthy experiments in a complex process,
closely resembling a full scale production plant, should be prepared for the
possibility of disturbances (sometimes of a critical nature) during the experiment. In
continuous processes, disturbances in individual runs can, due to the dynamic nature
of the process, come to affect a long period of operation. Good reliability of process
equipment and proper maintenance are hence important, but so is developing,
beforehand, strategies to tackle such disturbances if they do occur. The
adaptive design strategy, exemplified in Paper B, is an example of such a strategy. In
brief, this means that the experimenter can choose to prolong individual runs during
the experiment if disturbances occur. Alternatively, leaving unplanned time at the
end of the experiment to be able to compensate for disturbances can be a good idea.
The experimenter in continuous processes often cannot measure the actual
phenomena taking place in the process due to a variety of practical issues. Instead,
many secondary responses such as flows, pressures, and temperatures are measured
and the experimenter needs to use prior process knowledge to interpret what is
occurring inside the process. The many responses are typically cross-correlated, and
frequent logging of many responses together with the dynamic characteristic of
continuous processes causes a high degree of autocorrelation (often positive).
Therefore, as shown in Papers A, B, C, and D, multivariate statistical methods make
an important contribution during the analysis of experiments in continuous
processes. This research has especially shown how latent variable techniques (in
particular PCA) can be used to extract the strongest signals in response data when
there are many responses to analyze. The latent variables can then be used as new
responses in the following analyses of the experiments or in process analysis before
the experiment. The multivariate nature of response data can also become
problematic during the planning phase, see Paper A. An abundance of responses and
possible interactions can make it impractical to maintain the level of detail (for
example, predicting effects and interactions) in the experimental planning process. However,
it is still critical that prior process knowledge is used in the planning process.
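The latent-variable extraction discussed above, using PCA on many cross-correlated responses, can be sketched as follows. This is a generic illustration, not the exact procedure of the appended papers; it assumes a matrix of logged responses with observations in rows:

```python
import numpy as np

def pca_scores(Y, n_components=2):
    """Principal component scores of a (time points x responses) matrix.

    Columns are standardized first, since process responses such as flows,
    pressures, and temperatures are on very different scales.
    """
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # new, uncorrelated responses
    explained = (S ** 2) / np.sum(S ** 2)            # share of variance per PC
    return scores, explained[:n_components]
```

The score columns are mutually uncorrelated and can therefore be analyzed as a small set of new responses in place of the original, heavily cross-correlated ones.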
In continuous processes, restrictions on the ranges in which factors may be
varied are frequent. With a weak signal sent into the system, the corresponding
effect on the process output or performance can be difficult to detect, especially if
the noise of the process is further amplified by process control activities. In addition,
as split-plot designs often need to be used in continuous processes, the experimenter
should also expect a lower precision when measuring whole-plot effects.
Furthermore, experiments in continuous processes are expensive and making many
replications of experimental runs is therefore not always realistic. Paper F, although
not limited to continuous processes, shows how prior process knowledge, or
knowledge about the sparsity, heredity, and hierarchy principles, can be used to
increase the power of the analysis of unreplicated factorial designs. Underlining the
importance of prior process knowledge in continuous processes is the fact that the
experimenter will need to consult process knowledge to interpret the meaning of,
for example, principal components, if multivariate statistical methods are to be used
during analysis.
This research also demonstrates some advantages of viewing responses from
experiments in continuous processes as time series. Again, this is due to the dynamic
characteristic of the processes and that experimenters in continuous processes
normally are interested in the performance of the process during experimental runs
with an experimental setup which is fixed for some period of time. As shown in
Papers D and E, time series analysis thus becomes a useful tool to analyze
experiments in continuous processes, to model process dynamics, and to establish
transition times. In particular, Paper E shows that using an intervention-noise model
to analyze an experiment with time series responses constitutes a more
comprehensive method that seems to result in fewer spurious effects and higher
power for unreplicated experiments than using a more simplified analysis based on
averages and ANOVAs.
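A minimal version of such an intervention-noise analysis can be sketched as below. This is an illustrative sketch, not the full transfer function-noise model of Paper E: it assumes AR(1) noise, a known intervention time, and a pure step effect, and profiles over the autocorrelation parameter:

```python
import numpy as np

def fit_step_intervention(y, step_start, phis=None):
    """Fit y_t = mu + omega*S_t + n_t with AR(1) noise n_t = phi*n_{t-1} + a_t.

    S_t is a step indicator switching on at index step_start. For each
    candidate phi the model is quasi-differenced so ordinary least squares
    applies; the phi giving the smallest residual sum of squares is kept.
    """
    y = np.asarray(y, dtype=float)
    if phis is None:
        phis = np.linspace(-0.95, 0.95, 39)
    S = (np.arange(len(y)) >= step_start).astype(float)
    best = None
    for phi in phis:
        yd = y[1:] - phi * y[:-1]  # quasi-differenced response
        X = np.column_stack([
            (1.0 - phi) * np.ones(len(yd)),  # intercept column, coefficient mu
            S[1:] - phi * S[:-1],            # quasi-differenced step, coefficient omega
        ])
        beta, _, _, _ = np.linalg.lstsq(X, yd, rcond=None)
        rss = float(np.sum((yd - X @ beta) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(phi), float(beta[0]), float(beta[1]))
    _, phi, mu, omega = best
    return {"phi": phi, "mu": mu, "omega": omega}
```

Because the autocorrelation is modelled explicitly, the estimated step effect omega is not inflated or masked by the positive autocorrelation that run averages simply ignore.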
in continuous processes is exclusively based on a blast furnace process (in pilot scale).
This choice has both advantages and drawbacks.
By focusing on one specific case (the EBF), an in-depth understanding of the
special considerations that are needed to plan, conduct and analyze experiments in
the process has been attained. The pilot-scale, and the fact that the EBF is
specifically designed for experimental purposes, also provided unique opportunities
to study and be a part of many experiments during the research project. Although
the EBF is in pilot scale, it is by no means a small plant. Running the EBF requires
similar deliberations, personnel, and machinery as running a full-scale furnace, but
the volumes handled are of course much smaller. If the choice had been made,
instead, to study several different continuous processes (during the same amount of
time) it would, according to this author, have been at the cost of the depth of
understanding of the phenomena that the experimenter needs to consider in the
specific processes. The collaboration between the author and the EBF engineers has
been valuable to create an understanding of the problems that are encountered when
trying to apply DoE methods and related analysis tools in a complex continuous
process setting. It is my belief that this understanding could not have been acquired
using, for example, interviews, questionnaires or simulations only. However, I am
aware that the strength of the approach at the same time is its weakness, since some
of the results presented in this thesis cannot without reflection be transferred to
other continuous processes. Instead I have to rely on analytic generalization.
However, I believe that the proposed analysis methods in papers B, D, E, and F are
general and not limited to the studied industrial case.
It is my conviction, from studying the literature, that the special considerations
that this research reports regarding planning, conducting, and analyzing experiments
based on the EBF case apply for many continuous processes. Important experimental
complications for continuous processes have been found and verified by studying
the EBF. Among these are, for example, the problems of running large and fully
randomized experiments, their dynamic characteristics, the multivariate nature of
data, the need for process control during the experiments, and the time series aspects
of the responses.
However, it is not unlikely that I would have found some additional
circumstances and complications of importance if I had been studying, for example,
a paper mill, a pelletizing plant, or a chemical process. In particular, I believe that the
even larger scale of a full-scale production plant with its specific complications, such
as having to sell the product produced during the experiment, further complicates
experimentation.
4.3 Contribution
Approaching the end of this thesis it is indeed time to ask the question: Is there a
contribution in all this? Below I put forward what I believe to be the main
contributions of the research presented in this thesis.
This research explicitly explores and describes special considerations and
problems that can be encountered when planning, conducting and analyzing
experiments in a dynamic continuous process. These types of considerations and
problems have been scarcely described in DoE literature. Using the EBF case to
discuss experimental challenges and demonstrate many of the proposed analysis
methods in the appended papers hopefully helps create a better understanding of
the practical use of DoE methods in industry. The identified special considerations
and problems can be seen as a theoretical contribution to the DoE field regarding
the use in practice of DoE methods, such as the use and analysis of factorial designs.
This research also identifies the need to, and illustrates the benefits achieved by,
combining methods from four rather distinct fields (DoE, multivariate statistical
methods, time series analysis, and statistical process control and monitoring) to deal
with some of the identified problems. In particular, this research shows how
multivariate approaches to analysis and monitoring, using time series analysis to
determine transition times and analyze experiments, and a Bayesian approach to
analysis of unreplicated experiments can be used to tackle some of the problems in
continuous processes. Although this research does not deal with theoretical
development of the specific applied analysis methods per se, I believe that the
development of adapted analysis procedures for experiments in continuous processes
is an important contribution to the DoE field and provides powerful aids for the
experimenter in continuous processes.
future work to develop strategies and methods of analysis for response data collected
from a process under control (closed-loop) would be valuable.
Many process responses from the EBF are time series that the analysts need to
handle in some way. Established analysis methods at the EBF often use, for example,
averages and standard deviations of the responses to analyze the experiments. Papers
D and E of this thesis show the benefits achieved by using time series analysis
techniques to analyze process dynamics and analyze experimental results. I believe
that time series techniques can be used to extract even more information from the
experiments, and reduce the possible negative effect that the noise in the process can
have on the analysis results. Since the experimental time in the EBF is costly, I argue
that more powerful analysis methods, although more complicated, are warranted to
make the most out of the information output from the conducted experiments.
Added importance comes from the fact that the experiments often have a limited
number of replications. Indeed, there still remain issues concerning, for example,
how to handle process disturbances and missing values in the response time series
from the EBF, that need to be addressed. The Bayesian analysis method presented in
Paper F can also be used to analyze experiments at the EBF, where the costs and
time concerns do not allow for many replications of experimental runs. The
Bayesian approach allows for incorporation of prior knowledge, which the analyst
may have regarding activity of effects, to increase the power of the analysis.
5. FUTURE RESEARCH
Research should also be viewed as a continuous process. Ideas and new questions for future
research have come up during the research process and this chapter presents those implications
for future research that I find the most interesting.
A natural continuation of this research could be to test the external validity of the
results presented here, which rely heavily on the study of a single industrial case
(the EBF). To closely study how experiments are performed in other continuous
process industry settings, for example, the pulp and paper industry or the chemical
industry, is probably a good idea to verify the results, discover new potential
complications, and learn more about experiments in continuous processes. To study
experiments in full-scale continuous processes would also be important to uncover
additional possible complications. I suspect that other statistical and quality
engineering methods like statistical process control and capability analysis also are
affected by the special characteristics and problems found in continuous processes. A
future study including these methods may therefore be valuable.
One of the recommendations in this thesis is to consider split-plot designs
when planning experiments in continuous processes. I would find it interesting to
study how frequently split-plot type designs are used in industry, how often they
should have been used, and how often these designs are in fact analyzed correctly.
Based on personal communication, I suspect that in many cases actual split-plot
experiments are analyzed as if they were fully randomized.
A complication that has been encountered during the research presented in this
thesis is the need to control the process during experimentation. It has been found
that process control decisions can cause ambiguity regarding the experimental
results. I suspect that experiments performed under closed-loop conditions are
common in process industry settings. Paper C discusses a multivariate method for
process monitoring, but this method does not eliminate the possible bias in the
experimental results due to control actions. For automated control loops (without
human deliberations) the control variable(s) can be used as a response, but how
should a manual control variable affected by subjective operator decisions be treated?
I believe that further research on the analysis of closed-loop experiments and how
the need for process control affects experimental procedures, analysis and results is
highly motivated for the future.
The time series nature of the responses from continuous processes is specifically
treated in Papers D and E, where time series analysis methods and multivariate
APPENDIX I
Figure A.1 A diagram of a blast furnace with important terms indicated. Source: Zuo (2000, p.
1), with permission from the author.
15. The Electric Arc Steelmaking process uses electric energy to melt scrap metal.
An obvious aim for a modern blast furnace is to produce as much raw iron as
possible at the lowest possible cost. Thus, minimizing coke consumption becomes
an important task since coke is the most expensive component in the burden. This
motivates substituting coke with less expensive auxiliary fuels, in the form of coal
powder or oil, injected through the tuyeres.
In general, the efficiency of the blast furnace process is measured by the
reductant rate per metric ton of hot metal. This is monitored by continuously
measuring the chemical composition of the top gas in the furnace. The percentage
of the CO gas that has been transformed into CO2 is expressed as the gas utilization;
see Geerdes et al. (2004) and Biswas (1981).
\[
\text{CO gas utilization} = \frac{\%\,\mathrm{CO_2} \text{ in top gas}}{\%\,\mathrm{CO} + \%\,\mathrm{CO_2} \text{ in top gas}}
\tag{A.1}
\]
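For illustration, Eq. (A.1) is straightforward to compute from top gas analyses (the numbers in the usage note below are hypothetical, not EBF data):

```python
def co_gas_utilization(pct_co, pct_co2):
    """CO gas utilization per Eq. (A.1): the share of CO converted to CO2."""
    return pct_co2 / (pct_co + pct_co2)
```

For example, a hypothetical top gas with 24% CO and 22% CO2 gives a utilization of 22/46, roughly 0.48.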
Zuo (2000) argues that the blast furnace process is quasi-stable. The process is
influenced not only by measures taken by operators but also by constantly occurring
complex variations in the blast furnace, for example, changes in the composition and
distribution of the burden and of the gas, as well as the position of the cohesive zone
in the furnace. Hence, Zuo (2000, p. 2) further argues that the modeling and
control of the process in practice is difficult and arduous because of the following
four characteristics:
1. The blast furnace is a continuous process, and any irregular fluctuation will
disturb the steady-state condition reached; such a disturbance can last from several
hours to days with a negative effect on production.
2. Time-lag. Furnace responses to adjustments of operational parameters are delayed.
3. Black box. The difficulties (often due to the high-temperature and dusty
environment) of online measurement for showing transport phenomena and
reactions occurring inside the furnace make parts of the blast furnace
operation a black box with multiple input and output variables.
4. Dynamics and non-linearity. Changing one parameter in the blast furnace often
causes a chain of changes. The quantitative relationships between process
variables are time-dependent and probably non-linear.
aim to study and develop the blast furnace process, for example, in projects with the
aim to lower the coke consumption and the CO2 emissions from the blast furnace
processes. Figure A.2 shows a picture of the EBF plant.
Figure A.2 The LKAB Experimental Blast Furnace (EBF). Source: LKAB with permission.
Although the experimental cost per run and the risks associated with performing
experiments are great even at this pilot scale, they are substantially lower than they
would have been in full-scale operation. Volume-wise the EBF is much smaller than
a commercial blast furnace, but running the EBF requires similar deliberations,
personnel, and machinery as running a full-scale furnace. The EBF has much of the
same measurement possibilities as commercial blast furnaces, but, in addition, the
EBF has burden probes for extraction of burden materials for analysis, such as semi-reduced
iron ore pellets from the process, see Figure A.3. The EBF is typically run
for two experimental campaigns per year. The length of the campaigns varies, but
often lies around two months. Each campaign may, in turn, consist of several
specific experiments with different aims. After each campaign, the EBF is quenched
by nitrogen and, after cooling, excavated and material samples from different layers
in the furnace can be analyzed.
Figure A.3 Exploded view of the EBF, specifically showing two shaft burden probes and one
inclined probe at the cohesive zone. Source: LKAB with permission.
The customers of iron ore pellets generally want their blast furnaces to run
efficiently and effectively, with few disturbances, and to produce iron of a high
grade. The experiments performed in the EBF include response variables
related to the quality of the produced iron, that is, the chemical composition of the
pig iron and the slag, which is determined by off-line analyses. Since LKAB is a
producer of iron ore pellets to be used in blast furnaces, there is a natural interest to
evaluate the performance of the EBF while a specific product (e.g. pellet type) is
being charged into the furnace. Therefore, responses related to energy efficiency and
stability of the process itself, such as top gas composition, burden descent rate, CO
gas utilization, pressure drops, temperatures in the top of the furnace and in the
shaft, and cooling effects are highly important during the analysis of experiments in
the EBF. Figure A.4 presents a schematic outline of the EBF process together with
examples of measurement possibilities, and Table A.1 gives some facts about the
EBF. Hallin et al. (2002) provide further details about the EBF plant. The
experimental work at the EBF is further discussed in Paper A.
[Figure A.4 shows: iron-bearers (pellets and/or sinter), fuel (coke), and fluxes charged
at the top of the 8 m furnace; sensors for measuring temperatures, cooling loss, etc.;
pre-heated blast air (≈1300 °C), oxygen, and auxiliary fuel (coal powder or oil)
entering through the tuyeres at the raceway; dripping materials; and the temperature
and chemical composition of pig iron and slag.]
Figure A.4 Schematic outline of the EBF process inspired by Zuo (2000). A few examples of
possible responses are underlined.
Table A.1 Examples of specifications of the EBF. Source: Hallin et al. (2002, p. 311).

Working volume                          8.2 m3
Hearth diameter                         1.2 m
Hearth height                           5.9 m
Number of tuyeres                       3 (diameter 54 mm)
Top pressure                            up to 1.5 bar
Injection through tuyeres               coal, oil, slag formers
Blast volume                            up to 2000 Nm3/h
Blast heating                           pebble heaters
Maximum blast temperature               1300 °C
Furnace crew, excluding sampling
and research staff                      5/shift
Tapping volume                          ~1.5 t/tap
Tap time                                5-15 min
Fuel rate                               ~500 kg/t hot metal
Experimental factors are often different raw materials tested under constant or varied
processing conditions. Table A.2 provides some examples of these and other
experimental factors in the EBF.
Control factors, Cs (and held-constant factors)
The thermal state of the blast furnace must be controlled during the experiments.
Furthermore, disturbances of a critical nature may also bring about a manipulation of
control variables during the experiment. The process is typically controlled by the
coke rate, adding or subtracting coke in the burden mixture, but can also be
controlled by other variables, see examples given in Table A.3. The choice is often
made to control the process by manipulating, for example, the coke rate and/or the
amount of auxiliary fuel injected through the tuyeres. Depending on the choice of
control variable(s), the remaining variables are normally kept at a constant level
during the experiment.
Disturbance factors, Zs
The number of possible disturbance factors that can affect an experiment of the
magnitude and complexity of those performed in the EBF is large. Table A.4
presents some examples of disturbances that can and have affected experiments run
in the EBF.
Responses, Ys
As discussed above, responses from the EBF are primarily related to two general
classes: process responses and output responses. There are also responses related to
other, more qualitative signals of furnace stability and to lab tests on the extracted
material through the burden probes. Table A.5 provides some typical examples.
Table A.4 Examples of disturbance factors that can affect an experiment in the EBF.
Examples of disturbance factors
Moisture content of coke
Furnace irregularities such as:
- Channeling (uneven gas distribution and flow through the burden)
- Scaffolds and scabs (build-up of materials at the shaft wall)
- Hanging (obstructed downward flow of the burden)
- Slips (sudden rapid downward movements of the burden)
Breakdown of vital machinery that disturbs the material flow
Variations in the incoming raw materials
Human factors, for example, control decisions
Table A.5 Examples of typical responses used to analyze experiments in the EBF.
Examples of typical responses
Process responses
- CO gas efficiency [%]
- Direct reduction rate [%]
- Burden permeability index [no unit]
- Differential pressures measured in the shaft [mbar]
- Production rate [t/hour]
- Burden descent rate [cm/min]
- Radial temperature index in furnace top [°C]
- Temperature at the shaft wall [°C]
- ...
Output responses
- Silicon in pig iron [weight %]
- Sulphur in pig iron [weight %]
- Carbon in pig iron [weight %]
- Hot metal temperature [°C]
- Iron content in slag [weight %]
- Magnesium oxide in slag [weight %]
- Slag basicity, CaO/SiO2 [no unit]
- ...
Responses from extracted burden material
- Disintegration strength tests through tumbling
- Chemical analyses
- Sieve fractions
- ...
Other signals of furnace stability, such as furnace irregularities
- Channeling, scaffolds, scabs, hanging, and slips
REFERENCES
Alvesson, M. and Sköldberg, K. (1994). Tolkning och reflektion: vetenskapsfilosofi och
kvalitativ metod. Lund, Studentlitteratur. (In Swedish).
APICS dictionary (2008). 12th ed. Edited by: Blackstone, J. H. Alexandria, VA,
APICS - the Association for Operations Management.
Barton, R. R. (1997). Pre-Experiment Planning for Designed Experiments:
Graphical Methods. Journal of Quality Technology, 29(3): 307-316.
Benski, H. C. (1989). Use of a Normality Test to Identify Significant Effects in
Factorial Designs. Journal of Quality Technology, 21(3): 174-178.
Bergquist, B. and Albing M. (2006). Statistical Methods - Does Anyone Really Use
Them? Total Quality Management, 17(8): 961-972.
Berk, K. N. and Picard R. R. (1991). Significance Tests for Saturated Orthogonal
Arrays. Journal of Quality Technology, 23(2): 79-89.
Bersimis, S., Psarakis S. and Panaretos J. (2007). Multivariate Statistical Process
Control Charts: An Overview. Quality and Reliability Engineering International,
23(5): 517-543.
Bingham, D. R. and Sitter R. R. (1999). Minimum-Aberration Two-Level
Fractional Factorial Split-Plot Designs. Technometrics, 41(1): 62-70.
Bingham, D. R. and Sitter R. R. (2001). Design Issues in Fractional Factorial Split-Plot Experiments. Journal of Quality Technology, 33(1): 2-15.
Bisgaard, S. (2000). The Design and Analysis of 2k-p x 2q-r Split Plot Experiments.
Journal of Quality Technology, 32(1): 39-56.
Bisgaard, S. and de Pinho A. L. S. (2004). The Error Structure of Split-Plot
Experiments. Quality Engineering, 16(4): 671-675.
Bisgaard, S. and Kulahci M. (2006a). Quality Quandaries: The Application of
Principal Component Analysis for Process Monitoring. Quality Engineering,
18(1): 92-103.
Bisgaard, S. and Kulahci M. (2006b). Quality Quandaries: Studying Input-Output
Relationships, Part I. Quality Engineering, 18(2): 273-281.
Bisgaard, S. and Kulahci M. (2006c). Quality Quandaries: Studying Input-Output
Relationships, Part II. Quality Engineering, 18(3): 405-410.
Bisgaard, S. and Kulahci M. (2007). Quality Quandaries: Process Regime Changes.
Quality Engineering, 19(1): 83-87.
Bisgaard, S., Vining G. G., Ryan T. P., Box G. E. P., Wheeler D. J. and
Montgomery D. C. (2008). Must a Process Be in Statistical Control before
Conducting Designed Experiments, Original article by S. Bisgaard with
discussion. Quality Engineering, 20(2): 143-176.
Biswas, A. K. (1981). Principles of Blast Furnace Ironmaking. Brisbane, Australia,
Cootha Publishing House.
Bjerke, F. (2002). Statistical Thinking in Practice: Handling Variability in
Experimental Situations. Total Quality Management, 13(7): 1001-1014.
Han, J. and Kamber M. (2001). Data Mining: Concepts and Techniques. San Francisco,
CA, Academic Press.
Hau, I., Matsumura E. M. and Tucker R. R. (1996). Building Empirical Models for
Data From Factorial Designs with Time Series Responses: Toward Fraud
Prevention and Detection. Quality Engineering, 9(1): 21-34.
Hellsten, U. and Klefsjö B. (2000). TQM as a Management System Consisting of
Values, Techniques and Tools. The TQM Magazine, 12(4): 238-244.
Hild, C., Sanders D. and Cooper T. (2000). Six Sigma* on Continuous Processes:
How and Why it Differs. Quality Engineering, 13(1): 1-9.
Höskuldsson, A. (1988). PLS Regression Methods. Journal of Chemometrics, 2: 211-228.
Jackson, J. E. (1980). Principal Component and Factor Analysis: Part I - Principal
Components. Journal of Quality Technology, 12(4): 201-213.
Jackson, J. E. (1981). Principal Components and Factor Analysis: Part II - Additional
Topics Related to Principal Components. Journal of Quality Technology, 13(1):
46-58.
Jackson, J. E. (2003). A User's Guide to Principal Components. Hoboken, NJ, Wiley.
Jenkins, G. M. (1979). Practical Experiences with Modeling and Forecasting Time Series.
St. Helier, Jersey, Channel Islands, Gwilym Jenkins & Partners.
John, P. W. M. (1990). Time Trends and Factorial Experiments. Technometrics,
32(3): 275-282.
Johnson, R. A. and Wichern D. W. (2002). Applied Multivariate Statistical Analysis,
5th ed. Upper Saddle River, NJ, Prentice Hall.
Kim, Y. and Lee J. (1993). Manufacturing Strategy and Production Systems: An
Integrated Framework. Journal of Operations Management, 11: 3-15.
Kourti, T. and MacGregor J. F. (1995). Process Analysis, Monitoring and Diagnosis,
Using Multivariate Projection Methods. Chemometrics and Intelligent Laboratory
Systems, 28: 3-21.
Kourti, T., Lee J. and MacGregor J. F. (1996). Experiences with Industrial
Applications of Projection Methods for Multivariate Statistical Process Control.
Computers & Chemical Engineering, 20: 745-750.
Kourti, T. (2005). Application of Latent Variable Methods to Process Control and
Multivariate Statistical Process Control in Industry. International Journal of
Adaptive Control and Signal Processing, 19: 213-246.
Kowalski, S. M., Parker P. and Vining G. G. (2007). Tutorial: Industrial Split-plot
Experiments. Quality Engineering, 19(1): 1-15.
Kresta, J. V., Macgregor J. F. and Marlin T. E. (1991). Multivariate Statistical
Monitoring of Process Operating Performance. The Canadian Journal of
Chemical Engineering, 69: 35-47.
Kvale, S. (1997). Den kvalitativa forskningsintervjun. Lund, Studentlitteratur. (In
Swedish, translation by Sven-Erik Torhell).
Zahn, D. A. (1975). Modifications of and Revised Critical Values for the Half-Normal Plot. Technometrics, 17(2): 189-200.
Zikmund, W. G. (2000). Business Research Methods, 6th ed. Fort Worth, Texas,
Dryden Press.
Zuo, G. (2000). Improving the Performance of the Blast Furnace Ironmaking Process.
Department of Chemical and Metallurgical Engineering, Luleå University of
Technology, Luleå. Doctoral Thesis.
PAPER A
Published as:
Vanhatalo, E. and Bergquist, B. (2007). Special Considerations when Planning
Experiments in a Continuous Process. Quality Engineering, 19(3): 155-169.
Available at:
Due to the publisher's restrictions on the use of the post-print version of the article,
Paper A is not published electronically. The paper can be accessed through the link
below.
http://dx.doi.org/10.1080/08982110701474100
PAPER B
Published as:
Vanhatalo, E. and Vännman, K. (2008). Using Factorial Design and Multivariate
Analysis When Experimenting in a Continuous Process. Quality and Reliability
Engineering International, 24(8): 983-995.
Available at:
Due to the publisher's restrictions on the use of the post-print version of the article,
Paper B is not published electronically. The paper can be accessed through the link
below.
http://dx.doi.org/10.1002/qre.935
PAPER C
Published as:
Vanhatalo, E. (2009). Multivariate Process Monitoring of an Experimental Blast
Furnace. Quality and Reliability Engineering International, In Press, published online ahead
of print. DOI: 10.1002/qre.1070
Available at:
Due to the publisher's restrictions on the use of the post-print version of the article,
Paper C is not published electronically. The paper can be accessed through the link
below.
http://dx.doi.org/10.1002/qre.1070
PAPER D
Abstract:
1. Introduction
Dynamic processes are frequently found in continuous process industries, where process steps
such as mixing, melting, chemical reactions, silos, and reflux flows contribute to their dynamic
characteristics. Planning, conducting, and analyzing experiments in dynamic processes highlight
special issues that the experimenter needs to consider. Examples of such issues are: process
dynamics (inertia), a multitude of responses, large-scale and costly experimentation, and many
involved people, see Hild et al. (2000) and Vanhatalo and Bergquist (2007). In this article we
focus on the dynamic characteristic of continuous processes and argue that process dynamics
must be considered already during the experimental planning phase.
In a dynamic process, a delay, here called the transition time, will occur between the
change of an experimental factor and the time at which the response is affected, whereas in a
responsive system this change is almost immediate, see, for example, Saunders and Eccleston
(1992) and Black-Nembhard and Valverde-Ventura (2003). Consequently, time series of the process responses
need to be studied after each experimental treatment is applied to allow for the possible effect
of the experimental treatment to manifest itself. By contrast, responses can often be measured
on individual experimental units directly after the experimental treatment has been applied in
many parts production processes.
When planning an experiment in a dynamic process it is important to have some
knowledge of the transition time caused by process dynamics. The transition time affects the
required length of each experimental run in the process and long transition times may call for
restrictions in the randomization order of the experimental runs, see, for example, Vanhatalo
and Vännman (2008). Knowledge about the transition time will help the experimenter to
avoid experimental runs that are either too short for a new steady state to be reached, which
leads to wrongly estimated treatment effects, or unnecessarily long, which increases costs. By
knowing the required transition time, the experimenter can choose a better design that
produces the needed information at a lower cost, without jeopardizing experimental validity.
Determining the transition time in a dynamic continuous process can be difficult due to
a number of reasons. Continuous processes are often heavily instrumented to measure different
aspects of the process and product being processed, and multiple process responses are often
needed to capture the effect of experimental treatments. The transition time may also vary for
different responses and treatment changes may affect some responses but not all. Process
variables are typically correlated and often react to the same underlying event, see Kourti and
MacGregor (1995). In such cases latent variable techniques such as principal component
analysis (PCA) can be used to achieve data reduction and aid interpretation, see, for example,
Wold et al. (1987) and Jackson (2003).
The sampling rates of the process measurements in continuous processes in industry are
often high enough to estimate process dynamics, with a sampling frequency usually higher
than the frequency of the process oscillations. Slow process drifts and oscillations combined
with high sampling frequencies lead to positively autocorrelated responses. The autocorrelation
indicates that time series analysis could be a useful analysis tool. Time series analysis contains
techniques where stochastic and dynamic models are developed to model the dependence
between observations sampled at different times. The time series analysis techniques transfer
function-noise modeling and intervention analysis have been proposed to model the dynamic
relation between an input time series Xt and an output series Yt and between a known
intervention at time t and an output series Yt respectively, see, for example, Box et al. (1994),
Wei (2006), and Montgomery et al. (2008).
In this article, we propose a method to determine the transition time for experimental
runs in a dynamic process. The proposed method combines multivariate statistical methods and
time series analysis techniques to analyze process dynamics and estimate the transition times in
dynamic processes. Section 2 introduces the method where PCA summarizes the systematic
variation in a multivariate response space. Transfer function-noise modeling or intervention
analysis is then used to model process dynamics and determine the transition time between an
input time series event and output time series response using principal component scores. The
approach is illustrated in Section 3 using data from a continuously operating experimental blast
furnace.
2. Proposed method
Continuous processes running in, for example, process industries usually come with large
amounts of old process data, and any important process that has run for some time has
typically undergone trials or experiments to improve it. Often, past interventions similar to the
changes planned for an upcoming experiment can be found. If similar past interventions do
not exist, a trial run (if possible) designed to induce the transition in the process may be
justified to estimate the transition times.
Normally, process analysis of continuous processes is a multivariate task, and determining
the transition time can be rather difficult. For this problem we use PCA to create a few,
independent, linear combinations of the original variables that together summarize the main
variability in the response space. We then use time series analysis on the principal component
scores to investigate the transition time. The following sections introduce PCA and the times
series analysis techniques transfer function-noise models and intervention analysis.
2.1 Principal component analysis (PCA)
PCA can reduce the dimensionality of the response space by extracting a few new, latent,
uncorrelated variables called principal components (linear combinations of the original
variables) that together explain the main variability in the data, see, for example, Johnson and
Wichern (2002) and Jackson (2003). The use of the PCA technique to summarize the response
space is outlined below.
Let $\mathbf{y}' = [y_1, y_2, \ldots, y_m]$ represent a random response vector describing an m-dimensional
response variable with covariance matrix $\boldsymbol{\Sigma}$. Let $\boldsymbol{\Sigma}$ have the eigenvalue-eigenvector pairs
$(\lambda_1, \mathbf{p}_1), (\lambda_2, \mathbf{p}_2), \ldots, (\lambda_m, \mathbf{p}_m)$. The m principal components (PCs) are formed as linear
combinations of the original responses:

$$PC_1 = \mathbf{p}_1'\mathbf{y} = p_{11}y_1 + p_{12}y_2 + \cdots + p_{1m}y_m$$
$$PC_2 = \mathbf{p}_2'\mathbf{y} = p_{21}y_1 + p_{22}y_2 + \cdots + p_{2m}y_m$$
$$\vdots$$
$$PC_m = \mathbf{p}_m'\mathbf{y} = p_{m1}y_1 + p_{m2}y_2 + \cdots + p_{mm}y_m \qquad (1)$$
The PCs are orthogonal to one another and ordered according to their variances: the first
PC has the largest variance, the second PC the second-largest variance, and so on, where the
eigenvalues, $\lambda_a$, $a = 1, 2, \ldots, m$, are the variances of the PCs. The eigenvectors, $\mathbf{p}_a$,
$a = 1, 2, \ldots, m$, have unit length, $\mathbf{p}_a'\mathbf{p}_a = 1$, and are called the PC loading vectors. PCA is
scale-dependent and the responses are usually scaled to unit variance, using standardized variables,
before the PCA. The PCs of the standardized variables are obtained by calculating the
eigenvalues and eigenvectors of the correlation matrix of $\mathbf{y}$ instead of the covariance matrix,
see, for example, Johnson and Wichern (2002). In the applications studied here we scale to
unit variance before the PCA and, hence, use the correlation matrix of $\mathbf{y}$ to derive the
eigenvectors and eigenvalues.
The correlation matrix is unknown in practice and estimated by the sample correlation
matrix calculated from an observed $n \times m$ matrix $\mathbf{Y}$ with n observations of each of the m
responses. The values of the PCs for each observation are here called PC scores, and the score
vectors, $\mathbf{t}_a$, $a = 1, 2, \ldots, m$, represent the n observed values of the PCs based on the observed
$\mathbf{Y}$ matrix.
The goal of the PCA is reduction of dimensionality, and if the variables are highly
correlated, much of the systematic variation described by a correlation or covariance matrix
can be described using $A < m$ dimensions, and the remaining $m - A$ dimensions are considered
to contain mostly random noise.
The loading vectors, $\mathbf{p}_a$, $a = 1, 2, \ldots, A$, define the reduced dimensional space (A) with
respect to the original responses, and the score vectors, $\mathbf{t}_a$, $a = 1, 2, \ldots, A$, are the projections of
the original observations onto the A-dimensional reduced space.
The number of retained principal components (A) can be derived by several methods,
see Jackson (2003). One way is to extract the number of components that are needed to
reproduce a specific fraction of the variance of the original response data. When working with
standardized variables, it is also common to keep only PCs with eigenvalues larger than one, so
that each PC explains at least as much of the total variation as one of the original variables.
Cross-validation, see Wold (1978), is also frequently used to select the appropriate number of
components.
If only the first A PCs are used to approximate the variability in $\mathbf{Y}$, we can write:

$$\mathbf{Y} = \mathbf{T}\mathbf{P}' + \mathbf{E} = \sum_{a=1}^{A} \mathbf{t}_a \mathbf{p}_a' + \mathbf{E} \qquad (2)$$

where $\mathbf{T}$ is an $n \times A$ matrix with the score vectors as columns, $\mathbf{P}$ is an $m \times A$ matrix with the
loading vectors as columns, and the variability in the remaining $m - A$ PCs is summed up in
the residual matrix $\mathbf{E}$.
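As a complement, the score calculation above can be sketched in Python/NumPy. This is a minimal illustration with simulated data; the function name `pca_scores` and all numbers are our own choices, not the software actually used in this work:

```python
import numpy as np

def pca_scores(Y, A):
    """PCA via the sample correlation matrix: standardize the responses,
    eigendecompose, and project onto the first A principal components."""
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)   # unit-variance scaling
    R = np.corrcoef(Y, rowvar=False)                   # m x m correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                  # reorder: largest first
    eigvals, P = eigvals[order], eigvecs[:, order]
    T = Z @ P[:, :A]                                   # n x A score matrix
    explained = 100 * eigvals / eigvals.sum()
    return T, P[:, :A], explained[:A]

# Illustrative data: 200 observations of 5 strongly correlated responses
rng = np.random.default_rng(1)
common = rng.normal(size=200)
Y = np.column_stack([common + 0.1 * rng.normal(size=200) for _ in range(5)])
T, P, expl = pca_scores(Y, A=2)
```

With highly correlated responses, the first PC captures nearly all of the variation, and the score columns in `T` are uncorrelated, as required by the orthogonality of the PCs.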
2.2 Transfer function-noise models
A transfer function-noise model describes the dynamic relation between an input time series $x_t$ and an output time series $y_t$:

$$y_t = \nu(B)x_t + N_t = \sum_{i=0}^{\infty} \nu_i x_{t-i} + N_t \qquad (3)$$

where $B$ is the backshift operator and $N_t$ is the unobservable zero-mean noise. The number of coefficients in $\nu(B)$ is usually assumed to
be limited to a fairly small number and to follow the structure:

$$\nu(B) = \frac{\omega(B)}{\delta(B)} = \frac{\omega_0 - \omega_1 B - \cdots - \omega_s B^s}{1 - \delta_1 B - \cdots - \delta_r B^r} \qquad (4)$$

The coefficients $\omega_0, \ldots, \omega_s$ and $\delta_1, \ldots, \delta_r$ determine the structure of the transfer function,
$\nu(B)$, and s and r are the orders of the numerator and denominator respectively. The
coefficients $\nu_i$, also called the impulse response function, can be obtained recursively from the
coefficients $\omega_0, \ldots, \omega_s$ and $\delta_1, \ldots, \delta_r$, see Montgomery et al. (2008, chapter 6). Sometimes there
is a delay before $x_t$ starts to affect $y_t$. If we assume that this pure delay is b time units, the
transfer function-noise model can be represented by:

$$y_t = \frac{\omega(B)}{\delta(B)} x_{t-b} + N_t \qquad (5)$$
The noise $N_t$ is assumed to follow an ARIMA model:

$$\Phi(B)(1-B)^d N_t = \Theta(B)\varepsilon_t \qquad (6)$$

where $\Phi(B)$ and $\Theta(B)$ are the autoregressive and moving average operators, d is the order of
differencing, and $\varepsilon_t$ is white noise. See, for example, Montgomery et al. (2008) for the process of fitting ARIMA models.
By combining (5) and (6), the transfer function-noise model can be expressed as:

$$y_t = \frac{\omega(B)}{\delta(B)} x_{t-b} + \frac{\Theta(B)}{\Phi(B)(1-B)^d} \varepsilon_t \qquad (7)$$
The following seven steps are taken to obtain the transfer function, $\nu(B)$, and the noise
model, see also Montgomery et al. (2008). In the following descriptions we assume that the
input and output series have been scaled so that the mean of each series is zero.
Step 1: Prewhiten the input series $x_t$
If the input series $x_t$ is autocorrelated, the method of prewhitening is needed to obtain the
transfer function. Prewhitening is a procedure that transforms the input series $x_t$ into white
noise. Normally, an appropriate ARIMA model is used to filter $x_t$:

$$\alpha_t = \frac{\Phi_x(B)(1-B)^d}{\Theta_x(B)} x_t \qquad (8)$$

where the filtered input series, $\alpha_t$, should be white noise with zero mean and variance $\sigma_\alpha^2$.
Step 2: Apply the prewhitening filter to the output series $y_t$
The same prewhitening filter is then applied to the output series $y_t$ to obtain

$$\beta_t = \frac{\Phi_x(B)(1-B)^d}{\Theta_x(B)} y_t \qquad (9)$$

where the filtered output series has variance $\sigma_\beta^2$ and is not necessarily white noise. The cross
correlation function between the prewhitened input series $\alpha_t$ and the filtered output series $\beta_t$ is
directly proportional to the weights, $\nu_i$, in the transfer function. We have the following
relation, see Montgomery et al. (2008),

$$\nu_i = \frac{\sigma_\beta}{\sigma_\alpha} \rho_{\alpha\beta}(i) \qquad (10)$$

where $\rho_{\alpha\beta}(i)$ is the cross correlation function between $\alpha_t$ and $\beta_t$ at lag i, $i = 0, \pm 1, \pm 2, \ldots$.
Step 3: Obtain initial estimates of the impulse response function $\nu_i$
Using the sample estimates $\hat{\rho}_{\alpha\beta}(i)$, $\hat{\sigma}_\alpha$, and $\hat{\sigma}_\beta$ of $\rho_{\alpha\beta}(i)$, $\sigma_\alpha$, and $\sigma_\beta$, respectively, and
applying Eq. (10), we obtain the initial estimates of the impulse response function as:

$$\hat{\nu}_i = \frac{\hat{\sigma}_\beta}{\hat{\sigma}_\alpha} \hat{\rho}_{\alpha\beta}(i) \qquad (11)$$

We use $\pm 2/\sqrt{n}$, where n is the length of the time series, as the approximate 95% confidence interval to judge the significance of the
cross correlations and thus the estimated weights $\hat{\nu}_i$.
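Steps 1-3 can be sketched in Python. For simplicity the sketch below uses an AR(1) prewhitening filter, whereas in practice a full ARIMA model identified for the input series would be used; the function name and the simulated series are our own assumptions:

```python
import numpy as np

def prewhiten_ccf(x, y, max_lag=10):
    """Steps 1-3 with an AR(1) prewhitening filter: estimate phi from x,
    filter both series (Eqs. 8-9), and scale the cross correlations into
    impulse response estimates (Eqs. 10-11)."""
    phi = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)    # AR(1) estimate
    alpha = x[1:] - phi * x[:-1]                          # prewhitened input
    beta = y[1:] - phi * y[:-1]                           # filtered output
    n = len(alpha)
    a = (alpha - alpha.mean()) / alpha.std()
    b = (beta - beta.mean()) / beta.std()
    ccf = np.array([np.sum(a[:n - k] * b[k:]) / n for k in range(max_lag + 1)])
    nu_hat = (beta.std() / alpha.std()) * ccf             # Eq. (11)
    return nu_hat, 2.0 / np.sqrt(n)                       # 95% bound for the CCF

# Simulated example: AR(1) input and a response with a pure delay of two periods
rng = np.random.default_rng(0)
m = 502
e = rng.normal(size=m)
x = np.zeros(m)
for t in range(1, m):
    x[t] = 0.7 * x[t - 1] + e[t]
y = np.zeros(m)
y[2:] = 0.8 * x[:-2] + 0.2 * rng.normal(size=m - 2)
x, y = x[2:], y[2:]
nu_hat, bound = prewhiten_ccf(x, y, max_lag=6)
```

For this simulated system the estimated impulse response has its dominant spike at lag 2, so a pure delay of b = 2 would be identified, mirroring the reasoning used in Section 3.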
Step 4: Specify the orders b, r, and s of the transfer function
Based on the pattern of the estimated impulse response function, the pure delay b and tentative
orders r and s are chosen, giving the tentative transfer function:

$$\hat{\nu}(B) = \frac{\hat{\omega}(B)}{\hat{\delta}(B)} \qquad (12)$$

Step 5: Identify a model for the noise series
An estimate of the noise series is obtained as:

$$\hat{N}_t = y_t - \frac{\hat{\omega}(B)}{\hat{\delta}(B)} x_{t-b} \qquad (13)$$

By studying the time series plot, the autocorrelation function, and the partial autocorrelation
function of the estimated noise series in (13), an appropriate ARIMA model is chosen to
model any remaining structure in the noise series.
Step 6: Fit the overall transfer function-noise model
The overall model

$$y_t = \frac{\omega(B)}{\delta(B)} x_{t-b} + \frac{\Theta(B)}{\Phi(B)(1-B)^d} \varepsilon_t \qquad (14)$$

is then fitted, where the parameter vectors $\boldsymbol{\delta} = (\delta_1, \ldots, \delta_r)'$, $\boldsymbol{\omega} = (\omega_0, \ldots, \omega_s)'$,
$\boldsymbol{\phi} = (\phi_1, \ldots, \phi_p)'$, and $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_q)'$ are obtained by an iterative maximum likelihood fit of the specified model to the data.
In intervention analysis the input series is replaced by an indicator variable, and the model becomes:

$$y_t = \frac{\omega(B)}{\delta(B)} \xi_{t-b}^{(T)} + \frac{\Theta(B)}{\Phi(B)(1-B)^d} \varepsilon_t \qquad (15)$$

where $\xi_t^{(T)}$ is a binary deterministic indicator variable with value 0 for nonoccurrence and
value 1 for occurrence of the specific event, and b determines the possible pure delay of
the intervention effect. Two common types of indicator variables are the step variable:
$$S_t^{(T)} = \begin{cases} 0, & t < T \\ 1, & t \geq T \end{cases} \qquad (16)$$

and the pulse variable:

$$P_t^{(T)} = \begin{cases} 1, & t = T \\ 0, & t \neq T \end{cases} \qquad (17)$$
A first-order dynamic response to a step intervention can be modeled by a transfer
function of the form $\omega_0 B/(1 - \delta_1 B)$. Often a gradual response is reasonable to assume, which
corresponds to the case $0 < \delta_1 < 1$.
Figure 1. Response to an intervention in the form of a step function, based on a step variable and a
simple transfer function, for different values of $\delta_1$. The figure is adapted from Box et al.
(1994, p. 464).
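The three cases in Figure 1 can also be reproduced numerically. The minimal Python sketch below (the function name and parameter values are our own choices) filters a step variable through the simple transfer function $\omega_0/(1-\delta_1 B)$ with pure delay b:

```python
import numpy as np

def step_response(omega0, delta1, T=5, b=1, n=30):
    """Filter a step variable S_t (Eq. 16) through omega0/(1 - delta1*B)
    with pure delay b, i.e. y_t = delta1*y_{t-1} + omega0*S_{t-b}."""
    S = (np.arange(n) >= T).astype(float)           # step variable
    S_lag = np.concatenate([np.zeros(b), S[:-b]])   # delayed step S_{t-b}
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = delta1 * y[t - 1] + omega0 * S_lag[t]
    return y

abrupt = step_response(1.0, 0.0)    # delta1 = 0: jumps to omega0 at T + b
gradual = step_response(1.0, 0.5)   # 0 < delta1 < 1: approaches omega0/(1 - delta1)
ramp = step_response(1.0, 1.0)      # delta1 = 1: increases without bound
```

The gradual case converges to the ultimate gain $\omega_0/(1-\delta_1)$, which is the quantity used later in Eq. (19).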
3.1 Transition time when changing oxygen content in the blast air
During experiments in the EBF, it is often of interest to change the production rate to test the
raw materials under different process conditions. The production rate can be changed either by
altering the oxygen content of the blast or by changing the blast volume. The needed
transition time when changing the oxygen content is therefore important to estimate. Process
responses calculated from, for example, pressure sensors and thermocouples in the EBF can be
used to study how the process reacts to changes. Table 1 presents the process responses used to
analyze the transition time.
Data consisting of hourly averages for each of the variables in Table 1 from a past
experimental campaign were located where the oxygen content of the blast air had been
changed between two target values: 45 and 90 Nm3/h. In total, 371 observations (hours) were
available for each variable.
[Figure 2: schematic of the 8 m high EBF shaft, with iron-bearers (PELLETS and/or sinter), fuel (coke), and fluxes charged at the top; coke layers, dripping materials, the raceway, and a tuyere are indicated, and temperature and the chemical composition of iron and slag are examples of responses.]
Figure 2. Outline of the EBF process. Examples of possible responses are underlined. The two
types of changes studied in this article (pellets and oxygen content in the blast air) are indicated by
bold uppercase font.
Table 1. Important process responses from the EBF. The numbering is for future reference.

 1. Differential pressure over furnace    bar
 2. Differential shaft pressure, dp5-45   bar
 3. Differential shaft pressure, dp5-225  bar
 4. Top temperature                       °C
 5. Temperature BR 1                      °C
 6. Temperature BR 2                      °C
 7. Burden descent rate                   cm·min⁻¹
 8. CO (CO gas utilization)               %
 9. Cooling effect tuyeres                kW
10. Blast speed                           m·s⁻¹
11. Gas speed furnace                     m·s⁻¹
12. Gas speed top                         m·s⁻¹
13. Flame temperature                     °C
14. Production rate                       ton·h⁻¹
15. Specific blast volume                 Nm³·ton⁻¹
16. Direct reduction rate, DRR            %
17. Solution loss                         kg[C]·ton⁻¹
18. Wall flow index                       °C
19. Center flow index                     °C
20. Top gas flow                          Nm³·ton⁻¹
21. Burden resistance index, BRI          no unit
PCA was conducted and the first four PCs explained 80.5% of the variation in the data,
see Table 2. We found that the first PC, which explains the largest part of the variability in the
process responses, separates the two oxygen contents. See Figure 3a-b, where the scores t1 of
the first PC are plotted against the scores t2 of the second PC. We conclude that the change of
oxygen content seems to explain the main variability visible in the variables in Table 1. None
of the other PCs showed similar clear dependence on the oxygen content. Hence, only the
scores t1 will be studied in the following analysis. Figure 4 shows a time series plot of the 371
observations on t1 together with the input oxygen content.
Table 2. Results from the PCA on the 371 observations of the process variables in Table 1.

PC   Explained variance [%]   Cum. explained variance [%]   Eigenvalue
1    32.5                     32.5                          6.83
2    24.0                     56.5                          5.04
3    14.9                     71.5                          3.14
4     9.0                     80.5                          1.89
[Figure 3. (a) Score plot of t[1] versus t[2] with observations marked by oxygen target level (45 or 90 Nm3/h); (b) corresponding loading plot of p[1] versus p[2] with the numbered process responses from Table 1.]
From Figure 4 we see that the oxygen content seems to affect the first principal
component. When the oxygen content is increased, the first PC decreases. However, since
there is autocorrelation present we need to apply transfer function-noise modeling to be able
to draw any clear conclusions.
Figure 4. Time series plots of a) the oxygen content of blast air and b) t1 .
Figure 5. Cross correlation function between $\alpha_t$ and $\beta_t$. The sampling interval was one hour.
By interpreting Figure 5, tentative values of the delay b and the orders r and s in (4)
can be found as discussed in Step 4 in Section 2.2. A pure delay of one hour (b = 1) seems
reasonable, since the lag 1 cross correlation coefficient is the first significant coefficient. We see
one (possibly two) significant spikes in the CCF at lags 1 and 2. We are only interested in the
cross correlations at lag 0 and positive lags to see how changes of the input are correlated to t1.
Large spikes at negative lags are likely due to spurious correlations.
Since the cross correlation coefficients are proportional to the impulse response function
according to (10), the pattern in the CCF was compared to theoretical patterns of the impulse
response function in Montgomery et al. (2008, pp. 305-306). From this comparison two
tentative transfer functions were identified and fitted. The transfer functions were assumed to
have denominator of degree 0, that is r = 0. Two possible values for the numerator in the
transfer function were considered, s = 0 (one spike) and s = 1 (two spikes). The remaining
correlation structure in the noise from both models was eliminated, as described in Step 5, by
an ARIMA(1,0,0) model. Note that this is an AR(1) model; since the noise requires no
differencing, the shifts in t1 can be explained by the oxygen content. Finally, the overall models were fitted as described in Step 6
in Section 2.2. Table 3 gives a summary of the two transfer functions-noise models found
together with the ARIMA(0,1,1) model that does not consider changes of the input variable.
We present model criteria for comparison in Table 3. For details about these criteria, see,
for example Montgomery et al. (2008, pp. 57-60). Generally, models with small standard
deviation of the residuals, small mean absolute error, high adjusted coefficient of
determination, and small values on the Akaike Information Criterion (AIC) and Schwarz
Information Criterion (SIC) are preferable. The AIC and SIC criteria penalize the sum of
squared residuals when including additional parameters in the model. Montgomery et al. (2008)
recommend using SIC over AIC.
Table 3. Comparisons of the transfer function-noise models and a univariate ARIMA(0,1,1) model
for the time series of the first principal component, $t_1$. The models were fitted using JMP 8.0
statistics software.

Fitted models (hourly averages):
a) ARIMA(0,1,1): $\nabla (t_1)_t = (1 - 0.21B)\varepsilon_t$
   d.f. = 369, s.d = 0.72, MAE = 0.54, R²adj = 0.92, AIC = 811, SIC = 815
b) b, r, s = (1,0,0): $(t_1)_t = 7.47 - 0.11\,x_{t-1} + \dfrac{\varepsilon_t}{1 - 0.67B}$
   d.f. = 367, s.d = 0.64, MAE = 0.49, R²adj = 0.94, AIC = 727, SIC = 740
c) b, r, s = (1,0,1): as model b) but with s = 1, that is, an additional numerator term in the transfer function
   d.f. = 365, s.d = 0.64, MAE = 0.49, R²adj = 0.94, AIC = 725, SIC = 741

Notes: $\nabla$ indicates that the first difference of the time series is modeled. Small values of s.d, MAE, AIC, and SIC and large values of R²adj are preferable.
d.f. = degrees of freedom; s.d = standard deviation of the residuals; MAE = Mean Absolute prediction Error;
AIC = Akaike Information Criterion; SIC = Schwarz Information Criterion.
Model b) and c) perform equally well. According to model b), the gain due to the
change of oxygen content is completely realized one hour after the change and according to
model c) two hours after the change. However, the standard error of the second parameter in the
transfer function in model c) is large compared to the estimated coefficient. We therefore
choose to exclude model c) from further consideration and conclude that the shift in t1 seems
to occur within the first hour after the oxygen content in the blast air has been changed.
Table 4. Transfer function-noise model for the first principal component, $t_1$, based on ten-minute
averages.

Fitted model (ten-minute averages):
d) b, r, s = (0,2,0): $(t_1)_t = -\dfrac{0.029}{1 - 1.37B + 0.40B^2}\,x_t + (1 - 0.83B)\varepsilon_t$
   d.f. = 2211, s.d = 0.77, MAE = 0.56, R²adj = 0.92, AIC = 5160, SIC = 5200
As t1 describes the main variability of the process responses caused by the change of the
oxygen content, we conclude that about 20 minutes is a reasonable estimate of the transition
time for the described change in the oxygen content of the blast air.
Table 5. Variables in the analysis of pig iron and slag from the EBF.

Hot metal                        Unit     Slag                        Unit
Hot metal temperature (Temp)     °C       Calcium oxide (CaO)         Wt. %
Iron content (Fe)                Wt. %    Silicon dioxide (SiO2)      Wt. %
Carbon (C)                       Wt. %    Manganese oxide (MnO)       Wt. %
Silicon (Si)                     Wt. %    Sulfur in slag (S slag)     Wt. %
Manganese (Mn)                   Wt. %    Aluminum oxide (Al2O3)      Wt. %
Phosphorus (P)                   Wt. %    Magnesium oxide (MgO)       Wt. %
Sulfur (S)                       Wt. %    Sodium oxide (Na2O)         Wt. %
Nickel (Ni)                      Wt. %    Potassium oxide (K2O)       Wt. %
Vanadium (V)                     Wt. %    Vanadium oxide (V2O5)       Wt. %
Titanium (Ti)                    Wt. %    Titanium dioxide (TiO2)     Wt. %
                                          Phosphorus oxide (P2O5)     Wt. %
Table 6. Results from the PCA on the 99 observations of pig iron and slag variables.

PC   Explained variance [%]   Cum. explained variance [%]   Eigenvalue
1    43.5                     43.5                          9.13
2    20.0                     63.5                          4.21
3     8.1                     71.6                          1.69
[Figure 6. Score plot of t[1] versus t[2] and loading plot of p[1] versus p[2] from the PCA of the pig iron and slag variables, with observations marked by pellets type (A or B).]
Figure 7. Time series plot of t 2 . Observation 38 is the first after the change of iron ore pellets has
occurred. The observations are approximately one hour apart.
Intervention analysis
Since the difference between the two iron ore pellets cannot be expressed quantitatively (the
pellets may differ not only in chemistry, but also in processing conditions, production time,
and production plant), the transition can instead be modeled by intervention analysis. Using
intervention analysis, the change of pellets can be described by a step function, $S_t^{(T)} = 0$ for
pellets A, and $S_t^{(T)} = 1$ for pellets B. With a binary input, prewhitening cannot be used to identify the
structure of the transfer function in an intervention model. Instead, the structure of the transfer
function must be estimated by viewing the time series in the light of the underlying
mechanisms behind the change. When new pellets are charged at the top of the furnace, they
will descend for a few hours before reaching the reaction zone. It is reasonable to assume that
the response will exhibit a pure delay during this descent. The newly molten material is then
mixed with the remaining material from the previous burden mix in the bottom of the
furnace, and it is thus likely that the chemistry of the melt will change gradually. A reasonable
assumption is therefore the following transfer function (see also Figure 2):

$$\frac{\omega_0}{1 - \delta_1 B} S_{t-b}^{(T)} \qquad (18)$$
Again, models of the time series in Figure 7 were developed in a stepwise manner and
compared. First an ARIMA(0,1,1) model was fitted to the time series of t 2 , where the
differencing was needed to account for the nonstationary behavior (the shift in t 2 ). Thereafter
the intervention variable was introduced, testing different values of the pure delay (b). The
intervention variable accounts for the shift in the time series and the remaining noise was
described by an ARIMA(1,0,1) model. See Table 7 for a summary of the tested models.
Table 7 shows only minor differences among the fitted models for the model criteria. It
can be concluded that the intervention variable can explain the shift in t 2 that otherwise
warrants first differencing of the output time series. Models f) and g) in Table 7 perform
similarly, and slightly better than model e) for all criteria except SIC, and practically differ only
in the choice of pure delay (b).
By testing other change-overs between pellet types at similar production rates in the EBF
(not elaborated here) we conclude that b = 3 is probably the best choice. We chose model f) in
Table 7 for calculation of the transition time.
Table 7. Comparison of intervention models and a univariate ARIMA(0,1,1) model for the second
principal component, $t_2$.

Fitted models:
e) ARIMA(0,1,1): $\nabla (t_2)_t = (1 - 0.41B)\varepsilon_t$
   d.f. = 97, s.d = 0.60, MAE = 0.46, R²adj = 0.91, AIC = 179, SIC = 181
f) Intervention + noise (b = 3): $(t_2)_t = 2.0 - \dfrac{0.78}{1 - 0.78B}\,S_{t-3} + \dfrac{1 - 0.46B}{1 - 0.91B}\,\varepsilon_t$
   d.f. = 91, s.d = 0.58, MAE = 0.43, R²adj = 0.92, AIC = 173, SIC = 186
g) Intervention + noise (b = 4): $(t_2)_t = 1.98 - \dfrac{0.96}{1 - 0.72B}\,S_{t-4} + \dfrac{1 - 0.45B}{1 - 0.92B}\,\varepsilon_t$
   d.f. = 90, s.d = 0.58, MAE = 0.43, R²adj = 0.92, AIC = 171, SIC = 184

Notes: $\nabla$ indicates that the first difference of the time series is modeled. Small values of s.d, MAE, AIC, and SIC and large values of R²adj are preferable.
d.f. = degrees of freedom; s.d = standard deviation of the residuals; MAE = Mean Absolute prediction Error;
AIC = Akaike Information Criterion; SIC = Schwarz Information Criterion.
Transition time
Using model f) in Table 7 we assume a pure delay of three observations (about three hours)
before the intervention starts to affect the chemistry in the pig iron and slag. According to
Jenkins (1979, p. 62) the estimated gain, $\hat{g}$, the ultimate change of $t_2$ due to the intervention,
can be calculated from the transfer function as:

$$\hat{g} = \frac{\hat{\omega}_0}{1 - \hat{\delta}_1} = \frac{-0.78}{1 - 0.78} = -3.54 \qquad (19)$$

Hence, the change of pellets will eventually cause the average of $t_2$ to decrease by 3.54
units. An estimate of the percentage of the change that has occurred after each time period
(with the start in period b = 3) can be calculated as:
$$100 \cdot \frac{\hat{\omega}_0 + \hat{\omega}_0\hat{\delta}_1 + \cdots + \hat{\omega}_0\hat{\delta}_1^{\,f}}{\hat{g}} \qquad (20)$$

where f is the number of time periods elapsed after the pure delay.
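Equations (19)-(20) are easy to evaluate numerically. The short Python sketch below uses the estimates from model f) and the 90-percent cutoff later suggested in the discussion; the variable names are ours:

```python
import numpy as np

# Estimates from model f): omega0 = -0.78, delta1 = 0.78, pure delay b = 3 h
omega0, delta1, b = -0.78, 0.78, 3
g = omega0 / (1 - delta1)                          # ultimate gain, Eq. (19)

# Cumulative percent of the total change f periods after the pure delay, Eq. (20)
f = np.arange(15)
percent = 100 * omega0 * np.cumsum(delta1 ** f) / g

# Transition time = delay + periods until 90% of the change is reached
periods_to_90 = int(np.argmax(percent >= 90))
transition_time_hours = b + periods_to_90
```

With these estimates the gain is about -3.54 and roughly 90% of the change is reached 9 hours after the pure delay, that is, a transition time on the order of 12 hours.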
Figure 8. The estimated cumulative percent of change after the change of pellets in the EBF as a
function of time, based on estimated intervention models for two pellet changes: model f) from
Table 7 and a validation model.
[Figure: overview of the proposed method. A quantitative or qualitative input series (X) to the dynamic process and the multivariate output series are recorded; PCA summarizes the outputs into the PC scores t1, t2, ..., tA, which are then analyzed with time series analysis to estimate the transition time.]
We illustrate the method using data from an experimental blast furnace where two types
of transitions were studied. The transition time after a change of a quantitative process variable
was determined through transfer function-noise modeling while the change of a qualitative
raw material variable was analyzed by intervention analysis. The results show that the estimated
transition time for the material variable is substantially longer than for the process variable,
which is important information for the experimental planning process, but also for the analysis
phase. The results differed from the prevailing understanding among the engineers at
the EBF. Previously, changes of pellets were thought to be noticeable after four hours, while
changes in blast parameters were considered to take even longer time, which also affected
decisions in the planning phase of experiments in the EBF. The complete transition times in
the EBF process for these changes were not fully known.
The estimation of the transition time using PCs provides a summarized and manageable
overview of the course of events in a multivariate response situation. However, if the change
of the input affects the process in several ways with different time lags, we have to be careful.
Assume a situation where our change affects response Yi more slowly than the other responses.
Then Yi will probably be uncorrelated with most other responses and have loadings of small
magnitude in the first few PCs. That is, Yi is correlated to the change of the input but
not to the other responses due to the different lag structure. Indeed, transfer function-noise
models and intervention models for single-output variables like Yi may be needed to
complement models using PCs to get the complete picture of the transition time. The
experimenter may check that responses of special interest have loadings of significant
magnitude in the PCs. Otherwise, single-output models should be considered.
By knowledge of the transition time we can establish the needed length of each
experimental run. Each run requires enough time to include the transition time and an
additional time during which responses can be sampled at the new process state. Furthermore,
knowing the transition time between runs is important when selecting representative data for
each run to include in the analysis of the experiment. Usually the responses during the
19
PAPER D
transition time are excluded from the analysis, which further stresses the importance to
correctly determine the transition time. In addition, good estimates of the transition time in
the process are useful to achieve better traceability in dynamic processes. For example, the
transition time can measure the dynamic propagation of a disturbance or product change in the
process output. The transition time can also be of importance for process control strategies and
design of engineering control systems.
For some process variables the estimated transition may be gradual and slow (see, for
example, Figure 8) and the experimenter may need to decide a reasonable cutoff. In such cases,
the transition time can, for example, be defined as the time required to reach 90 percent of the
total change modeled by the transfer function.
If the transition time for different experimental factors differs significantly, the
experimenter may consider randomization restrictions for factors with longer transition times.
Hence, split-plot designs can be arranged based on information about the transition time for
the factors to minimize the required length of the whole experiment. Further descriptions
about split-plot designs are given in, for example, Box and Jones (1992) and Kowalski et al.
(2007). Factors with longer transition times can be natural choices for whole-plots while those
with shorter transition times are potential sub-plot factors. The transition time is by no means
the only consideration when deciding on the appropriate experimental design, such as a
split-plot design, but since the experimental time in continuous processes normally is limited and
costly, the transition time is an important issue.
Acknowledgement
The financial support from the Swedish mining company LKAB and the European Union,
European Regional Development Fund, Produktion Botnia is gratefully acknowledged. The
authors thank all members of the LKAB EBF methodology development project for their
important contribution to the results presented here. Special thanks to Gunilla Hyllander at
LKAB for valuable support.
PAPER D
References
Bisgaard, S. and Kulahci M. (2006). Quality Quandaries: Studying Input-Output
Relationships, Part I. Quality Engineering, 18(2): 273-281.
Black-Nembhard, H. and Valverde-Ventura R. (2003). Integrating Experimental Design and Statistical Control for Quality Improvement. Journal of Quality Technology, 35(4): 406-423.
Box, G. E. P. and Jones S. (1992). Split-Plot Designs for Robust Product Experimentation.
Journal of Applied Statistics, 19(1): 3-26.
Box, G. E. P., Jenkins G. M. and Reinsel G. C. (1994). Time Series Analysis: Forecasting and
Control, 3rd ed. Englewood Cliffs, NJ, Prentice-Hall.
Hild, C., Sanders D. and Cooper T. (2000). Six Sigma* on Continuous Processes: How and
Why it Differs. Quality Engineering, 13(1): 1-9.
Jackson, J. E. (2003). A User's Guide to Principal Components. Hoboken, NJ, Wiley.
Jenkins, G. M. (1979). Practical Experiences with Modeling and Forecasting Time Series. St.
Helier, Jersey, Channel Islands, Gwilym Jenkins & Partners.
Johnson, R. A. and Wichern D. W. (2002). Applied Multivariate Statistical Analysis, 5th ed.
Upper Saddle River, NJ, Prentice Hall.
Kourti, T. and MacGregor J. F. (1995). Process Analysis, Monitoring and Diagnosis, Using
Multivariate Projection Methods. Chemometrics and Intelligent Laboratory Systems, 28:
3-21.
Kowalski, S. M., Parker P. and Vining G. G. (2007). Tutorial: Industrial Split-plot
Experiments. Quality Engineering, 19(1): 1-15.
Montgomery, D. C., Jennings C. L. and Kulahci M. (2008). Introduction to Time Series
Analysis and Forecasting. Hoboken, NJ, Wiley.
Saunders, I. W. and Eccleston J. A. (1992). Experimental Design for Continuous Processes.
The Australian Journal of Statistics, 34(1): 77-89.
Vanhatalo, E. and Bergquist B. (2007). Special Considerations when Planning Experiments in a
Continuous Process. Quality Engineering, 19(3): 155-169.
Vanhatalo, E. and Vännman K. (2008). Using Factorial Design and Multivariate Analysis
When Experimenting in a Continuous Process. Quality and Reliability Engineering
International, 24(8): 983-995.
Wei, W. W. S. (2006). Time Series Analysis: Univariate and Multivariate Methods, 2nd ed.
Boston, Pearson/Addison-Wesley.
Wold, S. (1978). Cross-Validatory Estimation of the Number of Components in Factor and
Principal Components Models. Technometrics, 20(4): 397-405.
Wold, S., Esbensen K. and Geladi P. (1987). Principal Component Analysis. Chemometrics
and Intelligent Laboratory Systems, 2: 37-52.
PAPER E
Abstract:
1. Introduction
Many industrial processes exhibit a dynamic behavior that, combined with typically high measurement sampling frequencies, makes the measurement series autocorrelated. The autocorrelation is especially evident in the process industry, where process dynamics contribute to
slow-moving propagations of disturbances. When experimenting on such systems, the
observed responses are represented by time series. In these situations, our experience is that the time series aspects often are ignored and a single response value is instead assigned to each experimental run. Analysis procedures that ignore the dynamic nature of the responses may be ineffective or even erroneous. Disregarding the time series characteristics by, for example, averaging the entire time series, including the transition periods when the process reacts to different treatments, is likely a poor alternative, as it leads to underestimation of location effects and overestimation of the variation.
We argue that a comparison and discussion of different ways to analyze experiments with
time series responses may be valuable for many experimenters. The purpose of this article is
hence to propose, illustrate, and compare different ways to analyze factorial experiments with
time series responses. Here, standard replicated and unreplicated two-level factorials with three
experimental factors are used as examples, and we limit our study to the estimation of location
effects. To compare the analysis methods, we use a simple simulation model that emulates an
experiment performed on a process with dynamic behavior.
The only prior work we have found in design of experiments (DoE) literature that
explicitly focuses on the analysis of experiments with time series responses is Hau et al. (1996).
They use regression analysis with the response series in each run as the dependent variable and
time as the independent variable. The overall average and trend for each run are estimated and
these regression parameters are then used as response observations of each run.
The serial dependence between adjacent observations in many industrial processes
suggests that methods such as autoregressive integrated moving average (ARIMA) models may
be more effective, see Box et al. (2008). However, according to Montgomery et al. (2008) the
estimation of the parameters of an ARIMA model requires at least 50 observations from a time
series, implying that the response series from each run needs to include sufficiently many
observations. Having 50 observations may also be inadequate. Each run needs to include as
many observations as needed to capture process reactions, and the observations should be
sampled close enough to capture the speed of change of relevant events.
Modeling of dynamic relations between, for example, experimental factors and responses,
is possible using transfer function-noise modeling or intervention analysis. Already Jenkins
(1979, p. 70) stated that intervention models represent generalizations of "methods used for the analysis of data, usually not expressed as time series, and referred to by statisticians as the design and analysis of experiments". See, for example, Box et al. (2008) for a discussion of intervention
analysis. Transfer function-noise models also allow for modeling of the dynamic relation
between experimental factors and the response, see Bisgaard and Kulahci (2006a; 2006b) and
Box et al. (2008).
Assume that the undisturbed process can be described by a stationary ARMA(p, q) model:

y_t = \delta + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t - \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}    (1)

where y_t is the value from the process at time t, δ is the constant term in the model, φ_1, φ_2, …, φ_p are the autoregressive (AR) coefficients, p is the order of the AR part of the model, θ_1, θ_2, …, θ_q are the moving average (MA) coefficients, q is the order of the MA part of the model, and ε_t is Gaussian white noise. Even if we define our process to be stationary during normal operation, it may still exhibit cyclic behavior and strong autocorrelation. Here we assume that the process is operating in such a way that the process settles at a new level some time after a treatment change, and that the process can be represented by a stationary model after stabilization. The stationarity of an ARMA(p, q) process is related to the AR part of the model, see, for example, Montgomery et al. (2008, p. 253). If the absolute values of the roots of the polynomial:
m^p - \phi_1 m^{p-1} - \phi_2 m^{p-2} - \cdots - \phi_p    (2)

are all less than one, then the ARMA(p, q) process is stationary. By choosing φ_1 = φ_2 = … = φ_p = 0 the process is reduced to an MA(q) process, and when θ_1 = θ_2 = … = θ_q = 0 the process is reduced to an AR(p) process. The mean of a stationary ARMA(p, q) process is:

E[y_t] = \frac{\delta}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}    (3)
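As a quick numerical sketch (not part of the paper), the mean in (3) and the AR(1) special case of the stationarity condition in (2) can be coded directly; the function names are illustrative:

```python
def arma_mean(delta: float, phi: list) -> float:
    """Mean of a stationary ARMA(p, q) process, Eq. (3):
    E[y_t] = delta / (1 - phi_1 - ... - phi_p); the MA part does not enter."""
    return delta / (1.0 - sum(phi))

def ar1_is_stationary(phi1: float) -> bool:
    """For p = 1 the polynomial in (2) is m - phi_1, so the stationarity
    condition reduces to |phi_1| < 1."""
    return abs(phi1) < 1.0

print(ar1_is_stationary(0.6))              # AR coefficient used later in the paper
print(round(arma_mean(19.2, [0.6]), 6))    # the corresponding mean, 48.0
```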
During the experiment, we assume that the process can be described by the model in (1) with added effect contributions:

y_t = \delta + \tau_t^{(A)} + \tau_t^{(B)} + \cdots + \tau_t^{(AB)} + \cdots + \tau_t^{(ABC)} + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t - \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}    (4)

where τ_t^{(A)}, τ_t^{(B)}, …, τ_t^{(AB)}, …, τ_t^{(ABC)} are the contributions to the mean of the time series at time t due to possible main and interaction effects. Note that the effects are time-dependent. Letting the effects depend on time allows for modeling of situations where the effects gradually develop and stabilize, and modeling of different dynamic responses for the different effects.
The time from the intervention until the response has stabilized at a new level is here referred to as the transition time; see also, for example, Black-Nembhard and Valverde-Ventura (2003) or Vanhatalo et al. (2009).
Below, the modeling of the effect dynamics is exemplified using the main effect of factor A, but all effects are modeled in the same way. As t increases and A is left unchanged, τ_t^{(A)} approaches its long-term value τ_A. The pace may, however, differ for the different factors. The model is intended to emulate a gradual change of the response and should also
allow for a pure lag. The pure lag denotes the possible initial delay before the effect starts to develop. We use transfer functions and ideas from intervention analysis to model this behavior, see, for example, Box and Tiao (1976), Jenkins (1979), and Box et al. (2008, Chapter 13). First, let a binary indicator variable, called the step variable, represent the two levels of each factor or interaction. Thus, for factor A:

S_t^{(A)} = \begin{cases} -1 & \text{if factor } A \text{ is at its low level at time } t \\ +1 & \text{if factor } A \text{ is at its high level at time } t \end{cases}    (5)

The dynamic response pattern of the effects is then modeled by a transfer function. This means that the contribution to the mean of the time series at time t due to factor A is given by:

\tau_t^{(A)} = \frac{\omega_A}{1 - \psi_A B} S_{t-b_A}^{(A)},    (6)

which corresponds to a change with the rate determined by the constant ψ_A, 0 ≤ ψ_A < 1, and the initial gain constant ω_A, see Box et al. (2008, p. 531). The pure lag is conveyed by the pure lag constant b_A, and B is the backshift operator on t so that B\tau_t^{(A)} = \tau_{t-1}^{(A)}. We can thus re-write (6) on the form:

\tau_t^{(A)} = \omega_A S_{t-b_A}^{(A)} + \psi_A \tau_{t-1}^{(A)}    (7)
The resulting change pattern of (7) is gradual if ψ_A > 0 and eventually, given that S_t^{(A)} remains unchanged, τ_t^{(A)} approaches the long-term value:

\tau_A = \frac{\omega_A}{1 - \psi_A}    (8)

The choice of ψ_A determines the inertia of the effect, and a larger value of ψ_A results in longer transition times. Letting ψ_A = 0 results in a direct response with value ω_A after any pure delay. The change pattern of the effect thus means that the effect is approximated through a first-order dynamic response to a step change. This means that the change rate is proportional to the difference between the effect at time t and the equilibrium at the high and low level, see Box et al. (2008, pp. 442-447).
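The recursion in (7) and the long-term value in (8) are easy to verify numerically. The sketch below is illustrative code (with S_t = −1 assumed before the experiment starts), using ω_A = 0.153, ψ_A = 0.5, and b_A = 1 from the forthcoming example:

```python
def effect_path(omega: float, psi: float, b: int, step: list) -> list:
    """Iterate the recursion (7): tau_t = omega * S_{t-b} + psi * tau_{t-1}.
    `step` holds the step variable S_t in {-1, +1}; before time 0 the factor
    is assumed to sit at its low level, at the equilibrium -omega/(1-psi)."""
    tau, path = -omega / (1.0 - psi), []
    for t in range(len(step)):
        s = step[t - b] if t - b >= 0 else -1
        tau = omega * s + psi * tau
        path.append(tau)
    return path

# Switching to the high level, tau_t approaches tau_A = 0.153 / (1 - 0.5) = 0.306.
path = effect_path(0.153, 0.5, 1, [+1] * 60)
print(round(path[-1], 3))
```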
The contribution of the AR part of the undisturbed process must also be considered to obtain the expected long-term main effect of A. Using (3) and (8), and given that the main effect of factor A is defined as the expected change in the response when factor A is changed from its low to its high level, the expected long-term effect of factor A is:

A_{\text{effect}} = \frac{2\tau_A}{1 - \phi_1 - \phi_2 - \cdots - \phi_p} = \frac{2\omega_A}{(1 - \psi_A)(1 - \phi_1 - \phi_2 - \cdots - \phi_p)}    (9)
Consider the following example as an illustration of the suggested simulation model. Let ω_A = 0.153 and ψ_A = 0.5 (from one of the forthcoming simulations in Section 6) and let the undisturbed process be an ARMA(1,1) process with φ_1 = 0.6. Also let the undisturbed process be affected by introducing factor A at its high level at time t = 1, after which factor A is changed to its low level at time t = 49. For ease of illustration, disregard the current mean of the process at time t = 0 and let y_0 = 0. Furthermore, disregard the Gaussian noise ε_t that normally affects the process; then, with a pure delay b_A = 1, the deterministic part of the effect is seen in Figure 1.
Figure 1. An illustration of the dynamics (the first 97 observations) of the A effect, given ψ_A = 0.5, ω_A = 0.153, φ_1 = 0.6, and b_A = 1. Here τ_A = 0.306 and A_effect = 1.53.
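The deterministic path in Figure 1 can be reproduced by feeding the effect recursion (7) through the AR(1) part of the process; a sketch under the stated assumptions (δ and the noise disregarded, y_0 = 0, run lengths illustrative):

```python
def deterministic_effect(omega, psi, phi1, b, n_run=48, n=97):
    """Noise-free part of the response: tau from recursion (7) propagated
    through y_t = tau_t + phi1 * y_{t-1}. Factor A is at +1 during run 1
    (t = 1..n_run) and at -1 afterwards; with pure delay b the input is
    felt from time t = b + 1 onwards."""
    y, tau, ys = 0.0, 0.0, []
    for t in range(1, n + 1):
        s = +1 if 1 <= t - b <= n_run else (-1 if t - b > n_run else 0)
        tau = omega * s + psi * tau
        y = tau + phi1 * y
        ys.append(y)
    return ys

ys = deterministic_effect(0.153, 0.5, 0.6, b=1)
print(round(ys[47], 3))   # run 1 settles near tau_A / (1 - phi1) = 0.306/0.4 = 0.765
```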
In Method I, the arithmetic average of all n_i observations in run i is used as the response:

\bar{y}^{(i)} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_j^{(i)}    (10)
4.2 Method II, based on averages after removal of the transition time
The expected consequence of including observations during the transition time in Method I is
an underestimation of the location effects. A presumable improvement, given that the
transition time is known or can be estimated, is to eliminate the observations during the
transition time from each run. Vanhatalo et al. (2009) propose a formal method to determine
the transition time for experimental factors in dynamic processes based on transfer function-noise modeling or intervention analysis. A less formal alternative is to use engineering
judgements based on inspection of the time series. Once the transition time is estimated, the
observations during the transition time are removed from each run and the adjusted averages
are calculated and used as responses.
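The difference between Methods I and II is only which observations enter the average; a minimal sketch (the run values are made up for illustration):

```python
def run_average(series, transition_time=0):
    """Method I: average of the whole run (transition_time = 0).
    Method II: the first `transition_time` observations are dropped first."""
    kept = series[transition_time:]
    return sum(kept) / len(kept)

run = [47.0, 47.5, 48.2, 48.6, 48.8, 48.9, 48.9, 49.0]   # illustrative run
print(round(run_average(run), 3))      # Method I: the transition pulls the average down
print(round(run_average(run, 3), 3))   # Method II: first 3 observations removed
```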
(1 - B) y_t = y_t - y_{t-1}    (11)
Intervention analysis requires an input time series for each main effect and interaction effect to model their relation to the output response series. Normally, intervention analysis uses binary indicator variables with values 0 and 1, but here the two levels are coded as −1 (low level) and +1 (high level), following DoE conventions for two-level factorial designs.
Let yt be the time series response at time t from the entire experiment. Then we assume
that:
y_t = \frac{\omega_A(B)}{\psi_A(B)} \xi_{t-b_A}^{(A)} + \frac{\omega_B(B)}{\psi_B(B)} \xi_{t-b_B}^{(B)} + \cdots + \frac{\omega_{ABC}(B)}{\psi_{ABC}(B)} \xi_{t-b_{ABC}}^{(ABC)} + N_t    (12)
where \xi_{t-b_A}^{(A)}, for example, is a binary deterministic indicator variable with value −1 when factor A is at its low level and value +1 when A is at its high level, b_A determines the possible pure delay of the intervention effect of the main effect of A, and N_t is the remaining noise after the contributions from the input variables have been accounted for. An ARIMA(p,d,q) model is used to account for any remaining structure in the noise, N_t, thus producing an intervention-noise model.
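The coded input series ξ in (12) can be constructed by expanding the design columns over time; an illustrative sketch with hypothetical run levels and m = 3 observations per run (interaction columns are element-wise products of the main-effect columns):

```python
def indicator_series(run_levels, m):
    """Expand one coded design column (one -1/+1 entry per run) into a
    deterministic time series with m observations per run."""
    return [level for level in run_levels for _ in range(m)]

xi_a = indicator_series([-1, +1, -1, +1], 3)   # hypothetical run order for A
xi_b = indicator_series([-1, -1, +1, +1], 3)   # and for B
xi_ab = [a * b for a, b in zip(xi_a, xi_b)]    # interaction input series
print(xi_ab)
```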
The general structure of the transfer function of, for example, factor A's main effect is written as:

\frac{\omega_A(B)}{\psi_A(B)}    (13)
where s and r are the orders of the numerator and denominator polynomials respectively.
A possible drawback of using a binary coded variable for quantitative factors is that any deviation of the quantitative experimental factor settings from the experimental plan, such as difficulties of reaching and maintaining ±1, as well as unintended variation in the factors, is disregarded. When there are only quantitative experimental factors, transfer function-noise models could be used instead of intervention analysis, since transfer function-noise models allow the use of actual factor settings, see, for example, Box et al. (2008, chapters 11-12). Another difference between transfer function-noise models and intervention-noise models is that the analyst needs to postulate a tentative structure for the transfer functions in intervention analysis. When the input variables are quantitative continuous variables, the so-called prewhitening procedure is typically used to determine the structures, see, e.g., Jenkins (1979).
4.5.1 Model building procedure
To iteratively test all possible transfer functions in (13) for the model in (12) may become overwhelming, and fitting all parameters at once can cause numerical problems. We therefore propose the following simplifications. Let s = 0 and r = 1 in (13), which limits the possible candidates for the transfer function and gives a simple transfer function that can model a gradual response, see also Box et al. (2008, p. 531):

\frac{\omega_A(B)}{\psi_A(B)} = \frac{\omega_{0,A}}{1 - \psi_{1,A} B}    (14)

Initially, the pure lags are set to b = 0 for all effects.
One way to analyze the experiment is to use backward-selection as follows. First estimate
the parameters of the transfer functions, and then successively exclude nonsignificant transfer
functions. However, we sometimes encounter numerical problems using this approach, such as
non-convergence of the iterative maximum likelihood estimation algorithm in the software.
An alternative and, in our opinion, simpler way is to iteratively fit the transfer functions for the effects, then fit an ARMA(p,q) model for the resulting noise series. The proposed analysis procedure has four distinct steps:
1. Analyze the experiment using Method II. Estimate all effects and rank the effects based on their absolute sizes. Focus on the effects of largest absolute size that are found active or nearly active in, for example, an ANOVA.
2. Fit the transfer functions for the effects, starting with the effects that were found active and nearly active using Method II. Adjust the pure lags if appropriate. We use the model criteria described below as an aid to determine the appropriate pure lags for the factors. Study the resulting residuals from the models with the fitted transfer functions. Look for any remaining structure that is related to the remaining input variables. If no such structure is found, continue to Step 3.
3. Study the ACF and PACF of the residuals from the model from Step 2. Determine the appropriate ARMA (or ARIMA) model for the noise series and then fit the overall model. Time series from the process sampled before the experiment, or from stable operation in one or a few of the experimental runs, can also be used to find a tentative model for the noise series.
4. Study the significance of the estimated parameters of the transfer functions in the model and make the necessary adjustments. The effects are estimated through the parameters of the different transfer functions in the final selected model. The transfer functions for non-significant effects are removed, and such effects cannot be estimated using Method V.
An effect in Method V is considered significant if the fitted parameters of its transfer function are large compared to their standard errors (the corresponding p-values are smaller than the chosen significance level). In Steps 2-4 above, competing models are compared using model criteria such as the adjusted coefficient of determination (R²_adj), the standard deviation of the residuals, the mean absolute prediction error, the Akaike information criterion (AIC), and the Schwarz information criterion (SIC), see Montgomery et al. (2008, pp. 57-60). Models with a small standard deviation of the residuals, small mean absolute error, high adjusted coefficient of determination, and small values of the AIC and SIC are preferable. The SIC generally results in the choice of a more parsimonious model and is recommended over the AIC by Montgomery et al. (2008).
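The information criteria balance fit against the number of parameters; one common textbook form is sketched below (exact definitions vary between texts and software, so this is not necessarily the form used by the package that fitted the models):

```python
import math

def aic_sic(residuals, k):
    """One common form of the criteria for a model with k estimated
    parameters: AIC = n*ln(RSS/n) + 2k and SIC = n*ln(RSS/n) + k*ln(n)."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    base = n * math.log(rss / n)
    return base + 2 * k, base + k * math.log(n)

# Illustrative residuals; with n = 8 we have ln(n) > 2, so the SIC penalizes
# the k parameters harder than the AIC, which is why it favors parsimony.
res = [0.3, -0.5, 0.1, 0.4, -0.2, 0.2, -0.3, 0.1]
aic, sic = aic_sic(res, k=2)
print(sic > aic)
```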
(2007) and Vanhatalo and Vännman (2008). The simulated response we use to exemplify our work in this article is carbon monoxide (CO) efficiency (hereafter η_CO); an important response for the blast furnace where higher values generally are preferred as they indicate a more energy-efficient process. Throughout this article, η_CO is measured in percent units and it is assumed that a new observation on η_CO is available each hour. Hourly data correspond to the sampling frequency of other important responses in the studied blast furnace, such as the chemical analysis of the pig iron and the slag. Using hourly observations, the gas efficiency response, under normal operation, can be described by an ARMA(1,1) process:
\eta_{CO,t} = \delta + \phi_1 \eta_{CO,t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}    (15)

with mean

E[y_t] = \frac{\delta}{1 - \phi_1}    (16)

and variance

\sigma^2_{ARMA(1,1)} = \frac{\sigma^2_\varepsilon (1 + \theta_1^2 - 2\phi_1\theta_1)}{1 - \phi_1^2}    (17)
Based on stable furnace operation, the following parameter values are assigned to the ARMA(1,1) model in (15): δ = 19.2, φ_1 = 0.6, and θ_1 = 0.28. Hence, the model used for simulating the response under normal operation is:

\eta_{CO,t} = 19.2 + 0.6\,\eta_{CO,t-1} + \varepsilon_t - 0.28\,\varepsilon_{t-1}    (18)

with σ²_ε = 0.36. This implies a process with the mean 19.2/(1 − 0.6) = 48 and standard deviation:

\sigma_{ARMA(1,1)} = \sqrt{0.585} \approx 0.765    (19)
Figure 2 presents a simulated time series with 100 observations from the model in (18), which emulates a process under normal, undisturbed, operation.
Figure 2. Simulated time series with 100 observations from the ARMA(1,1) model in Eq. (18). (The vertical axis shows η_CO [%].)
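The model in (18) is easy to simulate; the sketch below (illustrative code, with an arbitrary seed) checks that a long simulated series settles near the theoretical mean 19.2/(1 − 0.6) = 48:

```python
import random

def simulate_eta_co(n, delta=19.2, phi1=0.6, theta1=0.28, sigma2=0.36, seed=1):
    """Simulate Eq. (18): y_t = delta + phi1*y_{t-1} + eps_t - theta1*eps_{t-1},
    starting from the theoretical mean delta / (1 - phi1)."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    y, eps_prev, ys = delta / (1.0 - phi1), 0.0, []
    for _ in range(n):
        eps = rng.gauss(0.0, sd)
        y = delta + phi1 * y + eps - theta1 * eps_prev
        eps_prev = eps
        ys.append(y)
    return ys

sim = simulate_eta_co(100_000)
print(round(sum(sim) / len(sim), 1))   # close to 48
```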
[Table: the 2^3 factorial design with treatment combinations abc, Abc, aBc, ABc, abC, AbC, aBC, ABC, the corresponding sign columns for the interactions AB, AC, BC, and ABC, and, for each run i and replicate r, the observed response series y_t^{(i,r)}, y_{t+1}^{(i,r)}, …, y_{t+47}^{(i,r)}.]
We argue that it is reasonable to relate the size of the simulated effects to the standard deviation of the ARMA(1,1) process under normal operation. Using (9) and (19), SN_A is defined as the signal-to-noise ratio for the main effect of factor A in relation to the standard deviation of the process response under normal operation:

SN_A = \frac{2\omega_A}{(1 - \psi_A)(1 - \phi_1)} \Bigg/ \sqrt{\frac{\sigma^2_\varepsilon (1 + \theta_1^2 - 2\phi_1\theta_1)}{1 - \phi_1^2}}    (20)

The larger SN_A is, the easier it will be to detect the effect through the noise of the process.
A choice of ω_A = 0.153 and ψ_A = 0.5 gives (in percent units):

A_{\text{effect}} = \frac{2 \cdot 0.153}{(1 - 0.5)(1 - 0.6)} = 1.53    (21)

and

SN_A = \frac{2 \cdot 0.153}{(1 - 0.5)(1 - 0.6) \cdot 0.765} = 2    (22)
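The numbers in (21) and (22) can be checked with the long-term effect formula (9), using the process standard deviation 0.765 from (19); a quick sketch:

```python
def long_term_effect(omega, psi, phi1):
    """Eq. (9) for an ARMA(1,1) process: 2*omega / ((1 - psi) * (1 - phi1))."""
    return 2.0 * omega / ((1.0 - psi) * (1.0 - phi1))

sigma = 0.765                                  # Eq. (19)
effect = long_term_effect(0.153, 0.5, 0.6)
print(round(effect, 2), round(effect / sigma, 1))   # 1.53 and SN_A = 2.0
```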
Ten randomized 2³ factorial experiments are simulated using A_effect in (21). All other effects are set to 0. Each simulation uses a new randomization order of the 16 runs in the design. We are aware that ten simulations are few, but time series model building, especially Method V, is an iterative approach where the analyst must take an active part in each step. Therefore, we cannot (and do not want to) automate the analysis of the simulated time series. This makes the analysis time consuming.
The proposed analysis methods are illustrated by outlining the analysis procedure for the
first simulated experiment below. The illustrated experiment has the randomized run order:
AbC, ABc, abc, ABC, ABc, aBc, Abc, aBC, Abc, AbC, abc, aBC, abC, aBc, abC, ABC, and the
corresponding time series from the experiment is given in Figure 3.
[Figure 3: the time series response η_CO [%] from the first simulated experiment, with the 16 runs (AbC, ABc, abc, ABC, ABc, aBc, Abc, aBC, Abc, AbC, abc, aBC, abC, aBc, abC, ABC) marked in order along the time axis t.]

To apply Method II we thus need to estimate the transition time. Here the transition time is estimated by visual
inspection of the time series and we conclude that the transition time is approximately ten
hours. Consequently, the first ten observations in each run are disregarded. Another way to
estimate the transition time is to consider those runs that produce a change of the different
factors and assign individual transition times for the runs accordingly. However, such an
approach requires the analyst to speculate about which of the effects are active before doing the analysis. Therefore, the robust choice is made here to eliminate the same transition time for all runs, long enough to assume that even the slowest of the effects are fully developed.
Table 2 presents the averages using Methods I and II. Table 3 presents an ANOVA table based
on the averages in Table 2. The significance level 0.05 is used in all analyses.
Table 2. Averages for the runs of the simulated 23 factorial experiment in Figure 3 using Method
I (M-I) and Method II (M-II). In Method II, the first ten hours of each run are excluded.
[Table body: the run averages under Method I (M-I) and Method II (M-II), for Replicates 1 and 2 of the eight treatment combinations in standard order (abc, Abc, aBc, ABc, abC, AbC, aBC, ABC).]
Table 3. ANOVA and estimated effects for Method I (M-I) and Method II (M-II). In Method II the first ten hours of each run are excluded.

Source      Sum of Squares     D.f.   Mean square      F value          Prob>F               Estimated effect
            M-I      M-II             M-I     M-II     M-I     M-II     M-I       M-II       M-I      M-II
Model       8.167    9.012     7      1.167   1.287    24.16   12.80    < .0001   .00089
A           7.710    8.132     1      7.710   8.132    159.6   80.85    < .0001   < .0001    1.388    1.426
B           .001     .0003     1      .001    .0003    .030    .0029    .867      .958       -.019    -.0086
C           .335     .658      1      .335    .658     6.938   6.538    .0300     .0338      -.289    -.405
AB          .102     .0398     1      .102    .0398    2.109   .396     .185      .547       .160     .0997
AC          .001     .0044     1      .001    .0044    .023    .0440    .883      .839       -.017    -.0333
BC          .010     .0263     1      .010    .0263    .215    .261     .655      .623       .051     .0810
ABC         .008     .152      1      .008    .152     .157    1.512    .703      .254       .043     .195
Pure error  .386     .805      8      .048    .101
Cor. Total  8.553    9.817     15
From Table 3, the main effect of factor A is found significant for the average response
whether or not the transition time is excluded from the runs. In both cases, the main effect of
factor C is significant. The estimated effect for A is slightly larger when the transition time is
removed, and so is the absolute value of the C effect.
[Table 4: for each treatment combination and replicate, the estimated run mean and fitted ARMA-model parameters used in Method III (M-III, full response series) and Method IV (M-IV, first ten hours of each run excluded).]
Table 5. ANOVA table and the estimated effects for Method III (M-III) and Method IV (M-IV). In Method IV the first ten hours of each run are excluded.

Source      Sum of Squares     D.f.   Mean square       F value           Prob>F               Estimated effect
            M-III    M-IV             M-III    M-IV     M-III   M-IV      M-III     M-IV       M-III    M-IV
Model       7.286    9.026     7      1.041    1.289    39.62   13.99     < .0001   < .0001
A           6.935    8.283     1      6.935    8.283    264.0   89.85     < .0001   < .0001    1.317    1.439
B           .0076    .0014     1      .0076    .0014    .289    .0159     .605      .903       -.0436   .0191
C           .244     .514      1      .244     .514     9.296   5.574     .0159     .0459      -.247    -.358
AB          .0947    .077      1      .0947    .077     3.603   .838      .0942     .387       .154     .139
AC          3×10^-4  2×10^-5   1      3×10^-4  2×10^-5  .0110   .00020    .919      .989       .0085    -.0021
BC          4×10^-5  .0042     1      4×10^-5  .0042    .0016   .0456     .969      .836       -.0033   .0324
ABC         .0042    .146      1      .0042    .146     .161    1.580     .698      .244       -.033    .191
Pure error  .210     .738      8      .0263    0.092
Cor. Total  7.496    9.763     15
By comparing Table 5 and Table 3 we see that using the estimated mean from the ARMA models gives similar results to using the arithmetic run average. Again, it seems reasonable to remove the estimated transition time before fitting the appropriate ARMA model, since the estimated effect gets closer to the true value of 1.53. The estimated main effect of factor A for Method IV is slightly larger (1.439 > 1.426) than the estimated effect produced by Method II. The main effect of factor C is still significant at the 0.05 significance level.
For model d) in Table 6, the estimated long-term main effect of factor A is:

\hat{A}_{\text{effect}} = \frac{2\,\hat{\omega}_{0,A}}{1 - \hat{\psi}_{1,A}} = \frac{2 \cdot 0.265}{1 - 0.641} = 1.476    (23)
Figure 4 shows a time series plot of the fitted values of model d) versus the observed time series. The close agreement of the fitted time series with the observed values indicates that model d) follows the process behavior well.
Table 6. Comparison of intervention-noise models for the time series in Figure 3. The arrows next to the model criteria indicate whether the corresponding criterion should be large (↑) or small (↓). The models are fitted using the JMP 8.0 statistics software.

Fitted model                           d.f.   s.d.(↓)   MAE(↓)   R²_adj(↑)   AIC(↓)   SIC(↓)
a) A, C                                762    0.85      0.68     0.440       1926     1950
b) A, C                                760    0.84      0.68     0.445       1915     1939
c) A, C + ARMA(1,1) for the noise      758    0.57      0.45     0.744       1327     1360
d) A + ARMA(1,1) for the noise         760    0.58      0.46     0.742       1331     1354
   (s_A = 0, r_A = 1, b_A = 3)

The final model d) is:

\hat{\eta}_{CO,t} = 47.911 + \frac{0.265}{1 - 0.641B} S_{t-3}^{(A)} + \frac{1 - 0.303B}{1 - 0.598B} \varepsilon_t

Notes: d.f. = degrees of freedom; s.d. = standard deviation of the residuals; MAE = Mean Absolute prediction Error; AIC = Akaike Information Criterion; SIC = Schwarz Information Criterion. All transfer functions use s = 0 and r = 1.
Figure 4. Fitted values using model d) in Table 6 versus the observations from the simulated experiment.
Table 7. The estimated main effects of factor A using Methods I-V (M-I to M-V) for the ten simulations. The mean squared error is calculated as the mean squared deviation of the estimated effects from the true effect of 1.53.

Simulation                   M-I     M-II    M-III   M-IV    M-V
1 (the illustrated example)  1.388   1.426   1.317   1.439   1.476
2                            1.740   1.672   1.750   1.720   1.776
3                            1.347   1.547   1.247   1.476   1.598
4                            1.577   1.745   1.443   1.792   1.772
5                            1.366   1.502   1.313   1.505   1.495
6                            1.406   1.410   1.425   1.465   1.484
7                            1.276   1.416   1.159   1.293   1.421
8                            1.187   1.277   1.211   1.336   1.274
9                            1.409   1.534   1.302   1.539   1.539
10                           1.359   1.422   1.261   1.357   1.428
Average estimated effect     1.406   1.495   1.343   1.492   1.528
Standard deviation           0.154   0.137   0.168   0.160   0.155
Mean squared error           0.0368  0.0181  0.0603  0.0245  0.0218
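As a check on the arithmetic, the summary rows of Table 7 can be reproduced from the ten estimates; here for the Method II column, with the true effect 1.53:

```python
# The ten Method II estimates of the A effect from Table 7.
m2 = [1.426, 1.672, 1.547, 1.745, 1.502, 1.410, 1.416, 1.277, 1.534, 1.422]

mean = sum(m2) / len(m2)
mse = sum((x - 1.53) ** 2 for x in m2) / len(m2)   # mean squared deviation
print(round(mean, 3), round(mse, 4))               # 1.495 and 0.0181
```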
Table 8. The number of active effects found for the analysis methods I-V (M-I to M-V). The significance level 0.05 is used during the analyses.

Simulation                     M-I    M-II   M-III   M-IV     M-V
1 (the illustrated example)    A, C   A, C   A, C    A, C     A
2                              A, B   A      A       A        A
3                              A      A      A       A        A
4                              A      A      A       A        A
5                              A      A      A       A, ABC   A
6                              A      A      A       A        A
7                              A      A      A       A        A
8                              A, B   A      A, B    A        A
9                              A      A      A       A        A
10                             A      A      A       A        A
Tot. number of active effects  13     11     12      12       10
Tot. number of false effects   3      1      2       2        0
Based on these initial simulations with one active effect, some tentative conclusions are drawn before doing further simulations and analyses. The results in Table 7 show that the effects tend to be underestimated by Methods I and III, and we conclude that the observations during the transition time should be removed before calculating averages or fitting an ARMA
model to each run. We also note that similar results are obtained using the adjusted average
(Method II) of each run or the estimated mean from the ARMA models (Method IV).
Furthermore, the average of effect estimates from Method V is somewhat closer to the true
simulated effect of 1.53 than the other methods. The analysis methods using ARMA models
for each run (Methods III and IV) do not seem to produce better location effect estimates than
using averages. Fitting ARMA models to each run is also more dependent on the number of
observations in each run and more time-consuming, making it a less attractive method.
Methods I, III, and IV are therefore excluded from further comparisons.
Table 9. Results from Methods II and V for ten simulations in the case with SN_A = 0 (no effect). The simulated effect, A_effect = 0, is determined by: ω_A = 0, ψ_A = 0, and b_A = 0. For Method II the first ten observations in each run are excluded. For Method V the time series are adequately modeled by an ARMA(1,1) model for all ten simulations. Therefore, effect estimates of A are not available (n.a.) using Method V. Method II provides an effect estimate for each simulation.

                               Method II   Method V
Average estimated effect       .00800      n.a.
Standard deviation             .149        n.a.
Mean squared error             .0202       n.a.
Tot. number of false effects   3           0
Table 10. Results from Methods II and V for ten simulations in the case with SN_A = 0.5. The simulated effect, A_effect = .382, is determined by: ω_A = .0382, ψ_A = .5, and b_A = 1. For Method II the first ten observations in each run are excluded. For Method V the noise series are adequately modeled by an ARMA(1,1) model for all ten simulations.
Table 11. Results from Methods II and V for ten simulations in the case with SN_A = 1.0. The simulated effect, A_effect = .765, is determined by: ω_A = .0765, ψ_A = .5, and b_A = 1. For Method II the first ten observations in each run are excluded. For Method V the noise series are adequately modeled by an ARMA(1,1) model for all ten simulations.

                                          Method II   Method V
Average estimated effect                  .719        .721
Standard deviation                        .167        .188
Mean squared error                        .027        .034
Tot. number of active effects             14          10
Simulations where effect A was not found  0           0
Tot. number of false effects              4           0
By studying Tables 7-11, the following can be noticed. Method II and Method V seem to produce similar average effect estimates, standard deviations of the estimated effects, and mean squared errors. However, the number of significant effects found and the number of falsely active effects differ. False effects are declared active more frequently using Method II, especially for small effects. A possible explanation can be that the ARMA noise model in Method V manages to adjust for the smaller shifts and cyclical behavior of the time series caused by the random variation in the process. For Method V in Table 10 we see that only two out of ten transfer functions for A are found significant. One possible explanation is that the ARMA model may account for the added variation caused by a small effect (SN_A = 0.5).
Table 12. The experimental factors.

Factor   Explanation                             Low level (−)
A        Iron ore pellet type                    Type 1
B        Blast volume, Nm³/h                     1600
C        Moisture content of blast air, g/Nm³    15
Assume that the pellets of Type 2 are better in the sense that their performance in the blast furnace results in a higher carbon monoxide efficiency (η_CO). Also assume that an increased blast volume has a negative effect on η_CO, and at the same time increases the production rate. The moisture content of the blast air does not affect η_CO. We also assume that
the Type 2 pellets perform better on the low level of the blast volume and that the Type 1
pellets perform better on the high level. That is, we have a positive main effect of factor A, a
negative main effect of factor B, and a negative interaction effect AB. Table 13 gives the
parameters used for the simulations.
Table 13. Parameters used for the simulations.

Effect   Effect (ω)   δ     b    Effect (long-term)   SN_effect
A        .0612        0.5   3    .612                 0.8
B        -.2754       0.1   0    -1.53                2
AB       -.0918       0.5   3    -.918                1.2
Again, ten simulated experiments are performed and the resulting time series responses are analyzed using Methods II and V. By visual inspection of the time series we estimate the transition time to be ten hours and, consequently, the first ten hours of each run are disregarded for Method II.

The results from the analysis of the ten simulated experiments are given in Tables 14 and 15. Methods II and V produce similar estimates of the effects, standard deviations, and mean squared errors for the effect estimates. Again, Method II seems to generate more false active effects. We also note that the pure lags (b_A, b_B, and b_AB) for the transfer functions in the intervention-noise models are not consistently estimated throughout the ten simulations. A possible explanation is that the pure lags used in the simulations are small compared to the total run length and that a slower response can be modeled either by adding a pure lag or by increasing the denominator parameter in the transfer function.
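The interchangeability of a pure lag and a larger denominator parameter is easy to see from the step responses of the transfer function ω B^b/(1 − δB). The numbers below are illustrative choices of ours, picked so that both configurations have the same steady-state gain:

```python
import numpy as np

def step_response(omega, delta, b, n=20):
    """Response of the transfer function omega * B**b / (1 - delta*B) to a
    unit step applied at t = 0 (illustrative sketch)."""
    y = np.zeros(n)
    for t in range(n):
        prev = y[t - 1] if t else 0.0
        x = 1.0 if t - b >= 0 else 0.0   # step input, delayed by the pure lag b
        y[t] = delta * prev + omega * x
    return y

# Same steady-state gain (1.0), slow rise produced in two different ways:
lagged = step_response(omega=0.45, delta=0.55, b=1)   # pure lag + moderate delta
no_lag = step_response(omega=0.30, delta=0.70, b=0)   # no lag, larger delta
```

Both responses approach the same steady state and differ only modestly after the first few observations, which is why the two parameterizations are hard to tell apart when the run length is long relative to the effect dynamics.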
Table 14. The estimated effects using analysis methods II (M-II) and V (M-V) for the ten
simulations. The mean squared error is calculated as the mean squared deviation of the estimated
effects from the true simulated effects. The pure lags for the estimated effects for Method V are
given in brackets after each estimated effect. For Method V the noise series are adequately modeled
by an ARMA(1,1) model for all ten simulations.
                                          M-II                           M-V
Simulation                          A       B        AB          A           B            AB
1                                  .463   -1.483    -.877      .452 (4)   -1.467 (0)   -.872 (3)
2                                  .455   -1.626    -.852      .626 (0)   -1.642 (0)   -.937 (2)
3                                  .653   -1.533    -.786      .722 (2)   -1.588 (1)   -.783 (0)
4                                  .625   -1.876    -.915      .755 (1)   -1.873 (0)   -.931 (1)
5                                  .653   -1.383    -.857      .747 (3)   -1.477 (0)   -.823 (0)
6                                  .431   -1.507    -.869      .378 (0)   -1.563 (0)   -.912 (2)
7                                  .604   -1.789   -1.097      .678 (0)   -1.854 (0)  -1.056 (3)
8                                  .487   -1.457    -.947      .394 (0)   -1.573 (0)   -.968 (3)
9                                  .890   -1.413    -.866      .898 (1)   -1.398 (0)   -.837 (2)
10                                 .444   -1.454    -.843      .486 (1)   -1.521 (0)   -.869 (0)
True effect                        .612   -1.53     -.918      .612       -1.53        -.918
Average estimated effect           .571   -1.552    -.891      .614       -1.596       -.899
Standard deviation of estimates    .144    .163     .084       .177        .157         .079
Mean squared error                 .0204   .0245    .0071      .0281       .0266        .0060
Table 15. The number of significant effects found using Methods II (M-II) and V (M-V).

Simulation   M-II significant effects   M-II false   M-V significant effects   M-V false
1            A, B, AB, AC               AC           A, B, AB, AC              AC
2            A, B, AB, ABC              ABC          A, B, AB                  -
3            A, B, AB                   -            A, B, AB                  -
4            A, B, AB, BC               BC           A, B, AB                  -
5            A, B, AB                   -            A, B, AB                  -
6            A, B, AB                   -            A, B, AB                  -
7            A, B, AB                   -            A, B, AB                  -
8            A, B, AB                   -            A, B, AB                  -
9            A, B, AB                   -            A, B, AB                  -
10           A, B, AB                   -            A, B, AB                  -
Total        33                         3            31                        1
9. An unreplicated 2³ experiment
Unreplicated experiments are often used in industry to generate information at a low cost, but they lack an independent estimate of the experimental error. Analysis of unreplicated two-level factorials is traditionally made by studying a normal (or half-normal) probability plot of the effects (Daniel, 1959). In the case of a 2³ design, however, there are only seven effects to plot and determination of the reference distribution of the inert effects is difficult. To reduce the subjectivity of the normal probability plotting technique, several formal methods to analyze unreplicated factorials have been proposed in the literature; see Hamada and Balakrishnan (1998) and Chen and Kunert (2004) for reviews and comparisons of important methods.

Here we simulate an unreplicated 2³ factorial and then analyze the experiment using Methods II and V. The background and parameters used in the replicated 2³ design (Section 8) are also used in this example, except for the omission of the replicate. For Method II the effects are calculated after removing the first ten observations in each run due to the transition time, and then Lenth's method (Lenth, 1989) and the Box and Meyer method (Box and Meyer, 1986) are used for formal analysis of the estimated effects. We also use the method outlined in Bergquist et al. (2009), which builds on the Box and Meyer method.
Using Lenth's (1989) method we define an effect as likely active if the absolute value of the effect is larger than Lenth's margin of error (ME), and clearly active if larger than Lenth's simultaneous margin of error (SME). Let c_1, c_2, …, c_m be the effect estimates. Then Lenth's pseudo standard error (PSE) of the effects is:

PSE = 1.5 × median_{|c_j| < 2.5 s_0} |c_j|,   (24)

where s_0 = 1.5 × median_j |c_j|. Two 95 percent confidence limits for the effects are:

ME = t_{.975,d} × PSE   and   SME = t_{γ,d} × PSE,   (25)

respectively. In (25) t_{.975,d} denotes the .975th quantile of the t distribution with d degrees of freedom, d = m/3, and γ = (1 + .95^{1/m})/2.
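The ME and SME calculations are easy to implement. The sketch below is a straightforward reading of (24) and (25), assuming SciPy for the t quantiles; the function name is ours.

```python
import numpy as np
from scipy import stats

def lenth(effects, alpha=0.05):
    """Lenth's (1989) pseudo standard error (PSE), margin of error (ME) and
    simultaneous margin of error (SME) for m effect estimates."""
    c = np.abs(np.asarray(effects, dtype=float))
    m = len(c)
    s0 = 1.5 * np.median(c)
    pse = 1.5 * np.median(c[c < 2.5 * s0])      # trimmed median, eq. (24)
    d = m / 3.0                                  # degrees of freedom
    me = stats.t.ppf(1 - alpha / 2, d) * pse     # eq. (25), ME
    gamma = (1 + (1 - alpha) ** (1.0 / m)) / 2
    sme = stats.t.ppf(gamma, d) * pse            # eq. (25), SME
    return pse, me, sme
```

For a 2³ design, m = 7, so one clearly dominant contrast is trimmed away before the PSE is computed while the small contrasts define the noise level.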
Box and Meyer (1986) recommend using the prior probability α = 0.2 that an effect is active, and k = 10, which determines the inflation factor for the standard deviation of an active effect. Under the assumption that the effects c_i are independent and identically distributed from the Gaussian mixture (1 − α)N(0, σ²) + αN(0, k²σ²), the posterior probability, P, that effect i is active given c_i and σ is:

P(i active | c_i, σ) = [(α/k) exp(−c_i²/(2k²σ²))] / [(α/k) exp(−c_i²/(2k²σ²)) + (1 − α) exp(−c_i²/(2σ²))].   (26)
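Equation (26) is straightforward to evaluate for a given σ. The sketch below implements only this conditional posterior; in practice σ is unknown and is integrated out numerically, as in Bergquist et al. (2009). The function name is ours.

```python
import math

def posterior_active(c_i, sigma, alpha=0.2, k=10.0):
    """Posterior probability that effect i is active given c_i and sigma,
    eq. (26), under the (1 - alpha)*N(0, sigma^2) + alpha*N(0, k^2 sigma^2)
    Gaussian mixture with the Box and Meyer (1986) defaults."""
    num = (alpha / k) * math.exp(-c_i**2 / (2 * k**2 * sigma**2))
    den = num + (1 - alpha) * math.exp(-c_i**2 / (2 * sigma**2))
    return num / den
```

A contrast near zero gets a posterior close to α/k / (α/k + 1 − α) ≈ 0.024, while a contrast several σ away from zero is assigned a posterior near one.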
The posterior probabilities that each effect is active are calculated using a numerical integration procedure described in Bergquist et al. (2009). Effects are considered active if their posterior probability is ≥ 0.5. We also investigate the performance of the adjusted version of the Box and Meyer method using a three-step procedure outlined in Bergquist et al. (2009). The analysis principles of hierarchy and heredity, frequently used in analysis by normal probability plots, are included by allowing individual prior probabilities for the effects. These prior probabilities are: 0.5 for main effects, 0.3 for two-factor interactions exhibiting strong heredity (both parent factors are active), 0.02 for other two-factor interactions, and 0.01 for the three-factor interaction.
From Table 16, it can be concluded that the effect estimates are comparable for Methods II and V. Although Method V produces an average estimated effect of B somewhat further from the true simulated effect than Method II, the opposite is true for the estimates of the A and AB effects. Hence, there seems to be no clear difference between the methods regarding the effect estimates. The standard deviations and the mean squared errors for the estimated effects are also comparable for the two methods.
Table 16. The estimated effects using analysis methods II (M-II) and V (M-V) for the ten
simulations. The mean squared error is calculated as the mean squared deviation of the estimated
effects from the true simulated effects. The pure lags for the estimated effects for Method V are
given in brackets. n.a. means that the effect is non-significant and cannot be estimated. For
Method V the noise series are adequately modeled by an ARMA(1,1) model for all ten simulations.
                                          M-II                           M-V
Simulation                          A       B        AB          A           B            AB
1                                  .388   -1.110    -.712      n.a.       -1.202 (0)    -.649 (2)
2                                  .433   -1.879    -.997      n.a.       -1.635 (0)   -1.131 (3)
3                                  .581   -1.474   -1.139      n.a.       -1.405 (1)   -1.189 (1)
4                                  .617   -1.379    -.473      .732 (3)   -1.337 (0)    -.367 (2)
5                                  .423   -1.511   -1.164      n.a.       -1.356 (0)   -1.219 (3)
6                                  .712   -1.436    -.602      .521 (3)   -1.064 (0)    -.681 (2)
7                                  .508   -1.306    -.871      .404 (3)   -1.383 (0)    -.932 (3)
8                                  .856   -1.385    -.930      .734 (2)   -1.388 (0)    -.891 (2)
9                                  .644   -1.652    -.468      .599 (3)   -1.474 (1)    n.a.
10                                 .611   -1.543    -.602      .660 (2)   -1.389 (1)    -.870 (1)
True effect                        .612   -1.53     -.918      .612       -1.53         -.918
Average estimated effect           .577   -1.468    -.796      .608*      -1.363        -.881*
Standard deviation of estimates    .145    .206     .261       .129*       .151          .281*
Mean squared error                 .020    .042     .076       .014*       .048          .072*
*Based on six estimated effects for A and nine estimated effects for AB
Table 17 reveals the differences among the methods. Method II combined with Lenth's method seems to be the most conservative, declaring only five effects larger than Lenth's ME (likely active), B in all cases. The Box and Meyer (1986) method results in several additional effects being declared active. The selected prior probability of active effects, α = 0.2, is conservative, given that we know that three out of seven effects are active. Using the three-step procedure outlined in Bergquist et al. (2009), we find as many true effects as with Method V but also two false effects. Using Method V and multiple intervention-noise models, 25 out of the 30 simulated active effects are considered significant, and no effects are falsely declared active. Method V thus appears to perform better than the other methods when there are no replicates.
22
PAPER E
Table 17. The number of significant effects found using Method II (M-II) combined with Lenth's (1989) [LE89], Box and Meyer's (1986) [BM86], and Bergquist et al.'s (2009) [BE09] methods, and Method V (M-V). For Lenth's ME, SME, and Method V, the significance level 0.05 is used in the analyses.

Method           Significant effects found        False significant effects
M-II, LE89-ME    5 (B in all ten simulations)     0
M-II, LE89-SME   0                                0
M-II, BM86       15                               0
M-II, BE09       27                               2
M-V              25                               0
Method I: using averages for each run as the response in, for example, an ANOVA,
Method II: using averages for each run as the response, but with the observations during the estimated transition time removed,
Method III: using the estimated mean from an ARMA model fitted to each run as the response,
Method IV: using the estimated mean from an ARMA model fitted to each run, but with the observations during the estimated transition time removed, and
Method V: estimating the effects through the estimated transfer functions of an intervention-noise model fitted to the entire time series from the experiment.
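The contrast between Methods I and II can be sketched with a toy run: a first-order step response towards the new level plus autocorrelated noise. All parameter values below are illustrative assumptions, not the simulation settings used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(7)

def run_response(n, effect, delta=0.6, phi=0.7, sigma=0.5):
    """One run: AR(1) noise plus a first-order step response to the factor
    level, so the early observations sit below the new steady state."""
    y = np.zeros(n)
    level, noise = 0.0, 0.0
    for t in range(n):
        level = delta * level + (1 - delta) * effect   # transition towards `effect`
        noise = phi * noise + rng.normal(0, sigma)
        y[t] = level + noise
    return y

y = run_response(60, effect=2.0)
method_1 = y.mean()        # Method I: plain run average
method_2 = y[10:].mean()   # Method II: transition time (10 obs) removed
```

With the transition observations left in (Method I), the run average is pulled towards the old level; removing them (Method II) reduces this downward bias in the location-effect estimate.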
The analysis methods are compared by simulations (using Matlab scripts, available through the corresponding author) of a dynamic continuous process under the assumption that the effects only affect the mean of the process, not the process dynamics or variability. In Methods III and IV, we fit an appropriate ARMA model to each run. That is, we do not assume that most runs follow an ARMA(1,1) process, although this might be expected knowing the background of the simulation. With experience of the process, an engineer may have enough process knowledge to assume that the dynamics of the process will not change due to the experimental factors and hence fit, for example, the same ARMA(p,q) model to each run. We chose to be more general and make no such assumption, although it would provide the possibility to further automate the analysis of the time series from each run.
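As a minimal stand-in for the ARMA fits of Methods III and IV, the sketch below estimates the mean of a single run from a fitted AR(1) model via conditional least squares; the thesis fits full ARMA models instead, and all numbers here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate one run: AR(1) noise around an unknown mean mu (a simplified
# stand-in for the ARMA models used in Methods III and IV):
mu, phi = 5.0, 0.6
y = np.empty(300)
prev = 0.0
for t in range(300):
    prev = phi * prev + rng.normal(0, 1.0)
    y[t] = mu + prev

# Conditional least squares for y_t = c + phi * y_{t-1} + e_t:
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
c_hat, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
estimated_mean = c_hat / (1 - phi_hat)   # stationary mean of the fitted model
```

The estimated mean of the fitted time series model, rather than the raw run average, is what Methods III and IV carry forward into the effect calculation.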
Although we believe that the assumption of unchanged process dynamics is often valid in practice (given that basic process setups and processing steps are left unchanged), effects on the variability of the process are probably more frequent. A study of analysis methods to estimate dispersion effects from experiments with time series responses is therefore motivated.

Due to our limited number of simulations, our conclusions are tentative. The process of building intervention-noise models requires input and iterative evaluation by the analyst in the different model building steps. The analysis process is therefore difficult to automate, which would be needed in a large simulation study. Time concerns also explain why we chose to keep a number of parameters constant during the simulations, including the choice of the 2³ design, the run lengths, and the dynamics of the effects. In a future study we aim to investigate how the analysis methods perform when the run lengths become small compared to the effect dynamics. Other types of response dynamics will also be tested. Only first-order dynamic responses are modeled in this study, but higher-order dynamic responses may occur. Method V may then perform worse, as the simple transfer function in (14) specifically emulates a first-order dynamic response.
First, we conclude that observations from the transition time in the beginning of each run should be removed before using the averages of each run or the estimated means from ARMA models, to avoid underestimation of location effects. Consequently, the estimation of the transition time becomes important for the analysis of experiments in dynamic processes. Furthermore, using the estimated average from an ARMA model for each run of the experiment does not seem to improve estimates of location effects compared to using the run average. Moreover, splitting the entire time series among the runs and then removing the transition times can result in too few observations for reliable estimation of the time series models.

Based on these initial results and the observed drawbacks, we disregard analysis Methods I, III, and IV early on in the simulation study and focus on Methods II and V. Fitting time series models to each run is probably a more attractive approach if dispersion effects are to be estimated, as the models provide an unbiased estimate of process variability in the presence of autocorrelation.
We conclude that Methods II and V produce comparable effect estimates (given that a reasonable estimate of the transition time is made for Method II). However, Method II seems to produce more false active effects than Method V, especially when effects are small. For unreplicated experiments, the results indicate that Method V finds more of the active effects than Method II combined with Lenth's (1989) and Box and Meyer's (1986) methods for analysis of unreplicated experiments. The Bergquist et al. (2009) method finds as many active effects as Method V, but also two false effects. We view this result as important for industrial experiments in, for example, continuous processes, where replication often is difficult due to cost concerns.
We find Method II, using averages adjusted for the transition time, to be a robust and rather easy way to analyze experiments with time series responses. However, an intervention-noise model constitutes a more comprehensive method that seems to produce fewer spurious effects when the effects are small and also seems to find more of the truly active effects when no replications are made. Another advantage of intervention-noise models is that the entire time series from the experiment is used. The models also provide means to model the dynamics of the effects, something that is ignored entirely by Method II. The estimated intervention-noise model can also be used to create an estimated time series that can be compared to the original response observations (see Figure 4). We believe that comparing a fitted time series with the actual observations can be helpful during the analysis of time series responses. Method V also allows for adding more intervention variables that can be used to model the effect of known disturbances that occur during the experiment.

We are aware that the use of time series analysis methods to analyze industrial experiments may suffer from further complications, such as critical disturbances during the experiment and missing observations that break up the time series. It is probably common that time series from industrial experiments have periods with missing observations, making straightforward use of Methods II, IV and V difficult. If the missing observations cannot be re-created by, for example, interpolation, the analyst may have to use averages from each run as the response, a method shown to work quite well.
Finally, our recommendations to the analyst interested in location effects in two-level factorials with time series responses are:
• Step 1: Estimate the transition time and remove it from the time series of each run. Calculate the run average and use the average as the single response in an ANOVA or in a normal probability plot of the estimated effects. This is a robust way to analyze the experiment and estimate location effects.
• Step 2: Fit intervention-noise models to the resulting time series from the experiment and estimate the effects through the estimated transfer functions. Use the results from Step 1 and the information about the largest effects as a guide in the model building process.
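Step 1 amounts to an ordinary two-level factorial analysis once each run has been reduced to a single average. A minimal sketch for a 2³ design follows; the run averages are made-up numbers for illustration only.

```python
import numpy as np

# Standard-order 2^3 design matrix (factor A varies fastest):
design = np.array([[a, b, c] for c in (-1, 1) for b in (-1, 1) for a in (-1, 1)])
run_avg = np.array([10.1, 10.7, 9.2, 9.5, 10.0, 10.6, 9.1, 9.6])  # made-up data

def effects(design, y):
    """Contrasts for main effects and interactions of a two-level factorial:
    effect = (mean at + level) - (mean at - level)."""
    A, B, C = design.T
    cols = {"A": A, "B": B, "C": C,
            "AB": A * B, "AC": A * C, "BC": B * C, "ABC": A * B * C}
    return {name: float(np.mean(col * y) * 2) for name, col in cols.items()}

est = effects(design, run_avg)
```

The estimated effects can then be placed on a normal probability plot, or screened with Lenth's method, exactly as described for Method II above.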
Acknowledgment
The financial support from the Swedish mining company LKAB and the European Union,
European Regional Development Fund, Produktion Botnia is gratefully acknowledged. The
authors thank LKAB and the members of the LKAB EBF methodology development project,
especially Gunilla Hyllander, for the contributions to the results presented here. The authors
thank Dr. Murat Kulahci for valuable feedback on the work presented in this article.
References
Bergquist, B., Vanhatalo, E. and Lundberg Nordenvaad, M. (2009). A Bayesian Analysis of Unreplicated Two-Level Factorials Using Effects Sparsity, Hierarchy and Heredity. Submitted for publication.
Bisgaard, S. and Kulahci, M. (2006a). Quality Quandaries: Studying Input-Output Relationships, Part I. Quality Engineering, 18(2): 273-281.
Bisgaard, S. and Kulahci, M. (2006b). Quality Quandaries: Studying Input-Output Relationships, Part II. Quality Engineering, 18(3): 405-410.
Black Nembhard, H. and Valverde-Ventura, R. (2003). Integrating Experimental Design and Statistical Control for Quality Improvement. Journal of Quality Technology, 35(4): 406-423.
Box, G. E. P. and MacGregor, J. F. (1974). The Analysis of Closed-Loop Dynamic-Stochastic Systems. Technometrics, 16(3): 391-398.
Box, G. E. P. and MacGregor, J. F. (1976). Parameter Estimation with Closed-Loop Operating Data. Technometrics, 18(4): 371-380.
Box, G. E. P. and Tiao, G. C. (1975). Intervention Analysis with Applications to Economic and Environmental Problems. Journal of the American Statistical Association, 70(349): 70-79.
Box, G. E. P. and Meyer, R. D. (1986). An Analysis for Unreplicated Fractional Factorials. Technometrics, 28(1): 11-18.
Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (2008). Time Series Analysis: Forecasting and Control, 4th ed. Hoboken, NJ: Wiley.
Chen, Y. and Kunert, J. (2004). A New Quantitative Method for Analysing Unreplicated Factorial Designs. Biometrical Journal, 46: 125-140.
Daniel, C. (1959). Use of Half-Normal Plots in Interpreting Factorial Two-Level Experiments. Technometrics, 1(4): 311-341.
Hamada, M. and Balakrishnan, N. (1998). Analyzing Unreplicated Factorial Experiments: A Review with Some New Proposals (with comments by C. Benski, P. D. Haaland, and R. S. Lenth). Statistica Sinica, 8: 1-41.
Hau, I., Matsumura, E. M. and Tucker, R. R. (1996). Building Empirical Models for Data From Factorial Designs with Time Series Responses: Toward Fraud Prevention and Detection. Quality Engineering, 9(1): 21-34.
Jenkins, G. M. (1979). Practical Experiences with Modeling and Forecasting Time Series. St. Helier, Jersey, Channel Islands: Gwilym Jenkins & Partners.
Lenth, R. V. (1989). Quick and Easy Analysis of Unreplicated Factorials. Technometrics, 31(4): 469-473.
Montgomery, D. C., Jennings, C. L. and Kulahci, M. (2008). Introduction to Time Series Analysis and Forecasting. Hoboken, NJ: Wiley.
Vanhatalo, E. and Bergquist, B. (2007). Special Considerations when Planning Experiments in a Continuous Process. Quality Engineering, 19(3): 155-169.
Vanhatalo, E. and Vännman, K. (2008). Using Factorial Design and Multivariate Analysis When Experimenting in a Continuous Process. Quality and Reliability Engineering International, 24(8): 983-995.
Vanhatalo, E. (2009). Multivariate Process Monitoring of an Experimental Blast Furnace. Quality and Reliability Engineering International, in press, published online ahead of print. DOI: 10.1002/qre.1070.
Vanhatalo, E., Kvarnström, B., Bergquist, B. and Vännman, K. (2009). A Method to Determine Transition Time for Experiments in Dynamic Processes. Submitted for publication.
PAPER F
Abstract:
This article studies the viability and estimates the strengths of the sparsity,
heredity, and hierarchy principles using metadata. The results from the metastudy are used for
prior probability assessment in a Bayesian procedure to calculate posterior probabilities of
active effects for unreplicated two-level factorials. We specify individual prior probabilities for
each effect based on the results from the metastudy and the posterior probabilities are then
calculated in a three-step procedure where the principles of effects sparsity, hierarchy, and
heredity are successively considered. We illustrate our approach by reanalyzing experiments
found in the literature.
Keywords: Unreplicated factorials, Prior information, Bayesian analysis, Posterior probability of active
effects, Markov chain Monte Carlo integration, Engineering judgments.
1. Introduction
Experiments are usually expensive but often the only viable way to create process knowledge.
The area of Design of Experiments (DoE) was developed in the twentieth century to increase
the effectiveness and efficiency of experimentation, and DoE is now, in various forms,
frequently used in applications such as research, engineering and economics.
Commonly, only two levels of the factors are tested to reduce the experimental effort, but the experimental venture may be large even so. Unreplicated factorials are therefore often used to generate information at a lower experimental cost, and powerful analysis methods for unreplicated factorials are continually sought. This article discusses analysis methods for unreplicated two-level factorials.
Analysis of unreplicated experiments often rests on three implicit hypotheses. The first
hypothesis, the effects sparsity principle, is used by almost all methods. According to the sparsity
principle, only a few of the estimated effects are likely to be active. The rest of the tested main
or interaction effects have no practical influence on the measured responses, and the contrasts
could thus be used as estimates of the experimental noise.
The second hypothesis is that active interactions are less likely than active main factors,
and the higher the order of the interaction, the less likely it is that it is active. This principle is
usually called the effects hierarchy principle, and is used, for example, to plan screening
experiments. Since the main purpose of screening experiments is to investigate activity of
many factors rather than to obtain precise cause and effect relations, the possibility to separate
active aliased effects are often sacrificed arguing that active interactions are less common than
active main effects. The effects hierarchy principle is also often used during analysis of
unreplicated experiments, where higher order effects are considered less likely than, for
example, main effects although they are of similar size.
The third and final hypothesis used for separation of active and inert contrasts is the effect
heredity principle, which states that an interaction is more likely to be active if its parent factors
are active. In this article, we refer to these three principles as the governing principles.
A standard way to analyze unreplicated two-level factorials is to study a normal probability plot of the effects. The normal probability plot lets the analyst boost the analysis power of experiments lacking independent variation estimates through use of the heredity and hierarchy principles, while effects sparsity is the general assumption on which the analysis method rests. Normal probability plotting, or half-normal probability plotting, see Daniel (1959) and Daniel (1976), lets the analyst pinpoint outliers deviating from a distribution estimate based on the contrasts closest to zero. However, the normal probability plot is an analysis tool where the result of the analysis is highly dependent on the analytical skills and judgment of the user. Two skilled analysts could come to different conclusions, as the selection procedure includes a series of subjective classifications and considerations. The analyst must, for instance, select factors not likely to be active and individually weigh the hierarchy and heredity principles. In our experience, many find analysis by normal probability plots difficult; in particular, incorporating the hierarchy and heredity principles in the analysis requires skill and experience.
More formal tests to assess the activity of effects from unreplicated factorials have been proposed in the literature. Finney (1945) proposed to use the hierarchy principle to select contrasts that a priori were deemed unlikely to be active and to use these for error estimation, which may work well when there are such contrasts. More recent methods are based on the effects sparsity principle and start by sorting the contrasts based on their absolute sizes. Some fraction of the smallest contrasts is then used to calculate the reference distribution of inert contrasts; see, for example, Voss (1988), Lenth (1989), Berk and Picard (1991), Dong (1993), and Schneider et al. (1993). All these methods subjectively include some contrasts likely to be inactive in the reference distribution estimate, a procedure shown to have low power; see Haaland and O'Connell (1995). Other procedures based on distribution tests of all contrasts have been proposed. Venter and Steel (1996) formed a null hypothesis that all contrasts were inert and came from the same reference distribution. The normality assumption of the reference distribution was then tested, and contrasts causing rejections of the null hypothesis (and thus being outliers) were classified as active. Similar approaches have been used by Le and Zamar (1992) and Sandvik-Wiklund and Bergman (1999). Hamada and Balakrishnan (1998) provide a review and comparison of the above-mentioned and other methods. Effect hierarchy and heredity are seldom regarded in the methods referred to above. Hamada and Wu (1992) presented a method complementing sparsity selection with an iterative comparison of effect heredity.
We believe that formal methods, like the ones above, have the advantage that they may be automated and do not rely as heavily on the skills of the analyst. On the other hand, they usually lack the possibility for the analyst to weigh the likelihood of certain contrasts being active based on prior information such as the governing principles. The Bayesian approach, introduced by Box and Meyer (1986), is an exception. Box and Meyer used a prior probability, α, that the effects are active in the experiment and a parameter, k, expressing how much larger the standard deviation generated by active effects is relative to that of inert effects. Using Bayes' theorem, posterior probabilities of each contrast being active are then calculated, and contrasts having a posterior probability greater than 0.5 are considered more likely active than inert.

To use the Bayesian approach, the analyst must specify model parameters before the analysis. Prior estimates of, for instance, how many times more variation may be attributed to active factors than to inert factors, and the estimated probability that a factor or interaction is active, should be specified. Setting priors is a task to which the result is sensitive. The use of Bayesian algorithms by practitioners is further limited since these methods are not available in standard statistical computer packages.
In this article we look for ways to formalize the analysis procedure for unreplicated two-level experiments that can incorporate prior knowledge or knowledge about the governing principles to assess the activity of individual effects. We believe that the Bayesian approach is suitable in this respect, since it allows the incorporation of the experimenter's prior knowledge about the activity of effects. In the absence of prior knowledge about specific effects, the governing principles can also provide valuable decision support. We argue that since the Bayesian approach is easier to formalize, it can be made less sensitive to individual preferences and analysis experience than, for example, an analysis using a normal probability plot. Indeed, the results will still depend heavily on sound reasoning by the engineer when deciding on, for example, the prior probabilities.
Effect hierarchy and heredity were not incorporated in the method by Box and Meyer (1986), but these principles can be, and have been, implemented in Bayesian approaches. For example, Chipman (1996) and Chipman et al. (1997) incorporated effect hierarchy and heredity in a Bayesian variable-selection algorithm to search the model space of competing models and calculate their posterior probabilities. Their approach specifically targeted complex aliasing situations including screening, mixed-level, and supersaturated designs. In order to account for the governing principles, the methodology was constructed using hierarchical priors and the stochastic search variable-selection algorithm presented by George and McCulloch (1993). Some simplifications, compared to the approach taken by Box and Meyer (1986), were made in order to achieve reasonable complexity. These include an informative Gamma distribution for the noise variance, rendering analytical posteriors, and additional design parameters that might be difficult for an analyst to set.
To limit complexity we build on the ideas in Chipman (1996) and Chipman et al. (1997) to consider the hierarchy and heredity principles; however, we choose to incorporate them into the less parameterized method presented by Box and Meyer (1986).
The purpose of this paper is, first, to study the viability of the governing principles of sparsity, hierarchy, and heredity in experiments found in the literature. Second, we extend the Box and Meyer (1986) approach for analysis of unreplicated factorials to a three-step analysis procedure that successively considers effect sparsity, hierarchy, and heredity, calculates posterior probabilities of effects being active, and provides a numerical integration procedure. We use individual prior probabilities for the effects being active, which provides the possibility to incorporate prior knowledge. Results from the literature study are then used to illustrate how prior probabilities for the effects can be set up reflecting effect sparsity, hierarchy, and heredity. The approach is exemplified by analyzing experiments found in the literature.
PAPER F
Table 1. The selected experiments from the literature search. The column y is the number of
responses in the experiment. The column k is the variance inflation factor (calculated only for
unreplicated experiments) according to Box and Meyer (1986), see also Section 3 below.

Author                        Experiment                        Design              k
Box et al. (1978), p. 307     Pilot plant experiment            2 x 2^3             -
Box et al. (1978), p. 326     Process development example
Box et al. (1978), p. 375     Reactor example
Cerutti et al. (2004)         Plasma emission spectroscopy
Daniel (1959)                 Penicillin production
Gürses et al. (2002)          Electrocoagulation                2^4
Laus et al. (1997)            Polymerization                    2 x 2^3             -
Lundquist et al. (2004)       Pulp-reinforced thermoplastics    2^4                 7.3
Montgomery (2005), p. 215
Montgomery (2005), p. 228
Montgomery (2005), p. 239
Montgomery (2005), p. 242     Oxide thickness                   2^4                 11.06
Montgomery (2005), p. 290     IC yield                          2^(5-1), res. V     23.67
Montgomery (2005), p. 298
Montgomery (2005), p. 308     CNC jet turbine
Montgomery (2005), p. 326
Pedersen and Ramulu (2006)    Cutting force experiment          2 x 2^4
Poon and Williams (1999)      Solder printing process
Reche et al. (2000)           Formaldehyde extraction           2^(5-1), res. V     3.71
Silva et al. (2003)           Serine protease coupling          2^4
Smith et al. (1995)           Earth-moving systems              2^6
Tan and Tan (2005)            Epitaxial growth of Si/SiGe       2^(5-1), res. V
                              Injection mould shrinkage         2^(6-2), res. IV    10.16
                              Spin coater experiment            2^(6-2), res. IV    9.15
                              Filtration rate                   2^4                 6.88
                              Aircraft panel defects            2^4                 4.72
                              Plasma etch experiment
care. On the other hand, our metastudy is based on experiments including both full and
fractional factorials. If the full factorials are compared with the fractional ones, the probability of
factors being active was higher in full factorials than in fractional factorials. In the fractional
factorial case, 44 % of the tested main factors were active.
Table 2. Rate of active contrasts in experiments found in the literature.

                                Tested in total    Rate
All experiments
- Contrasts or effects                637          0.19
- Main factor                         160          0.53
- Two-factor interaction              320          0.10
- Three-factor interaction            100          0.02
Only full factorials
- Main factor                          94          0.60
- Two-factor interaction              146          0.16
- Three-factor interaction             96          0.02
Only fractional factorials
- Main factor                          66          0.44
- Two-factor interaction              174          0.04
- Three-factor interaction              0          -
Reflecting on the results presented in Table 2, the effects hierarchy principle receives
strong support. For full factorials, a main factor was 3.4 times as likely to be active as a
two-factor interaction; ten times as likely if only the fractional factorials are considered.
A likely explanation is that full factorials are often run with the special purpose
of modeling interactions, whereas fractional experiments may be run to detect whether any of
many possible factors or interactions are active.
The effect hierarchy principle may also be used for other predictions besides what types
of factors are likely to be active. Table 3 displays the number of times an effect was of the
largest, second largest, and third largest magnitude. It was more than 30 times as common for the
largest effect to be related to a main factor than to an interaction. Hence, the results show that
main effects are not only more frequent but also generally of larger magnitude than interaction
effects.
Table 3. Occurrence of largest effects. Note that two of the experiments did not generate active
effects.

Order of the largest effect in the experiments                          Occurrence   Rate
A main factor has the largest effect in the experiment                      31       0.97
A two-factor interaction has the largest effect in the experiment            1       0.03
A main factor has the 2nd largest effect in the experiment                  24       0.75
A two-factor interaction has the 2nd largest effect in the experiment        8       0.25
A main factor has the 3rd largest effect in the experiment                  14       0.56
A two-factor interaction has the 3rd largest effect in the experiment       11       0.44
The results also support the effects hierarchy principle. When all experiments are
included, active main factor effects were almost six times as common as active two-factor
interactions. If only full factorials are considered, the frequency of active main factors is about
four times as high as the frequency of active two-factor interactions. For fractional factorials,
the frequency of active main factors is around ten times as high. Only full factorials had
resolution large enough to test the activity of three-factor interactions, and only two of the
100 tested were active.
The effect heredity principle also appears useful for evaluating unreplicated factorials, see
Table 4.
Table 4. Heredity of two-factor interactions.

Type of heredity    Possibilities    No. of occurrences    Rate
Strong heredity          79                  26            0.33
Weak heredity           163                   4            0.02
No heredity              78                   1            0.01
Of the 31 active two-factor interactions, 26 showed strong heredity, implying that both
parent main effects would have been selected as active on their own merits. One third of all
two-factor interactions with strong heredity were active. Weak heredity (one of the two parent
factors was active) was present in four cases out of 163 possibilities. Only once (Reche et al.,
2000) was a two-factor interaction active without any of its parent factors being active.
However, that two-factor interaction was small, and a reanalysis of the experiment indicates a
need for transformation as well as strong curvature tendencies. If the response is transformed,
the size of the non-hereditary interaction effect is comparable to that of the inert effects.
$$\Pr(i \text{ active} \mid \theta_i, \sigma) = \frac{\dfrac{\alpha_i}{k\sigma\sqrt{2\pi}}\exp\!\left(-\dfrac{\theta_i^2}{2k^2\sigma^2}\right)}{\dfrac{\alpha_i}{k\sigma\sqrt{2\pi}}\exp\!\left(-\dfrac{\theta_i^2}{2k^2\sigma^2}\right)+\dfrac{1-\alpha_i}{\sigma\sqrt{2\pi}}\exp\!\left(-\dfrac{\theta_i^2}{2\sigma^2}\right)} \qquad (1)$$
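Equation (1) is a two-component Gaussian mixture and is straightforward to evaluate directly. The following minimal sketch is our own illustration (the function name is hypothetical; the defaults k = 10 and alpha_i = 0.2 match the values used later in the paper):

```python
import math

def pr_active_given_sigma(theta_i, sigma, k=10.0, alpha_i=0.2):
    """Posterior probability that effect i is active given sigma (Eq. 1).

    Active effects are modelled as N(0, (k*sigma)^2) and inert effects as
    N(0, sigma^2); alpha_i is the prior probability that effect i is active.
    """
    active = (alpha_i / (k * sigma * math.sqrt(2 * math.pi))) * \
        math.exp(-theta_i ** 2 / (2 * k ** 2 * sigma ** 2))
    inert = ((1 - alpha_i) / (sigma * math.sqrt(2 * math.pi))) * \
        math.exp(-theta_i ** 2 / (2 * sigma ** 2))
    return active / (active + inert)
```

An effect several noise standard deviations away from zero receives a posterior probability near one, while an effect near zero receives a probability below its prior.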
effects $\theta$ is therefore required to finalize our analysis. Since the effects are assumed to be
independently distributed, we obtain:

$$f(\boldsymbol{\theta} \mid \sigma) = \prod_{i=1}^{\nu} f(\theta_i \mid \sigma) \qquad (2)$$
Box and Meyer (1986) use a non-informative prior distribution for $\sigma$ (Jeffreys' prior):

$$f(\sigma) \propto \frac{1}{\sigma} \qquad (3)$$
Using this approach for the situation with individual prior probabilities, we get the following
conditional distribution for $\sigma$ given $\boldsymbol{\theta}$:

$$p(\sigma \mid \boldsymbol{\theta}) \propto f(\boldsymbol{\theta} \mid \sigma) f(\sigma) \propto \sigma^{-(\nu+1)} \prod_{i=1}^{\nu} \left[ \frac{\alpha_i}{k}\exp\!\left(-\frac{\theta_i^2}{2k^2\sigma^2}\right) + (1-\alpha_i)\exp\!\left(-\frac{\theta_i^2}{2\sigma^2}\right) \right] \qquad (4)$$
$$\Pr(i \text{ active} \mid \boldsymbol{\theta}) = \int \Pr(i \text{ active} \mid \theta_i, \sigma)\, p(\sigma \mid \boldsymbol{\theta})\, d\sigma \qquad (5)$$
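Equation (5) is a one-dimensional integral, so before turning to the MCMC scheme of the Appendix it can be approximated by simple quadrature on a grid of sigma values. The sketch below is our own illustration, not the paper's implementation; it normalizes the unnormalized density in Eq. (4) numerically over the grid:

```python
import numpy as np

def posterior_activity(thetas, alphas, k=10.0, n_grid=2000):
    """Approximate Eq. (5) for all effects with a sigma-grid instead of MCMC."""
    thetas = np.asarray(thetas, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    scale = np.std(thetas) + 1e-12
    sigmas = np.linspace(scale / 50, 5 * scale, n_grid)
    s = sigmas[:, None]                                   # grid x effects

    # log of Eq. (4) up to an additive constant, evaluated on the grid
    log_active = np.log(alphas / k) - thetas ** 2 / (2 * k ** 2 * s ** 2)
    log_inert = np.log(1 - alphas) - thetas ** 2 / (2 * s ** 2)
    log_post = (-(len(thetas) + 1) * np.log(sigmas)
                + np.logaddexp(log_active, log_inert).sum(axis=1))

    w = np.exp(log_post - log_post.max())
    w /= w.sum()                                          # normalized weights

    # Eq. (1) on the grid, averaged with the weights (Eq. 5)
    num = (alphas / k) * np.exp(-thetas ** 2 / (2 * k ** 2 * s ** 2))
    den = num + (1 - alphas) * np.exp(-thetas ** 2 / (2 * s ** 2))
    return (w[:, None] * (num / den)).sum(axis=0)
```

With one clearly large effect among several small ones, the large effect receives a posterior probability near one while the small ones stay well below 0.5.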
interaction effects, and 0.01 for three-factor and higher order interactions. The posterior
probabilities are then calculated adjusted for hierarchy. For a two-level factorial with 15
estimated effects, the selected priors sum to an average prior probability of
$(4 \times 0.5 + 6 \times 0.1 + 5 \times 0.01)/15 \approx 0.177$, and for a design with 7 effects to a slightly optimistic
average prior probability of $(3 \times 0.5 + 3 \times 0.1 + 1 \times 0.01)/7 \approx 0.259$. We note here that our proposed prior probabilities should
be considered as guidelines. The engineer may select appropriate prior probabilities for the
effects reflecting the prior knowledge for each unique experiment, but we argue that the
average prior probability should be calculated in this step to consider effect sparsity.
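The quoted averages follow directly from the prior choices; a quick arithmetic check (our own, not code from the paper):

```python
# 2^4 design, 15 effects: 4 main at 0.5, 6 two-factor at 0.1, 5 higher order at 0.01
avg_15 = (4 * 0.5 + 6 * 0.1 + 5 * 0.01) / 15

# 2^3 design, 7 effects: 3 main at 0.5, 3 two-factor at 0.1, 1 three-factor at 0.01
avg_7 = (3 * 0.5 + 3 * 0.1 + 1 * 0.01) / 7

print(round(avg_15, 3), round(avg_7, 3))  # 0.177 0.259
```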
In the third round, the prior probabilities were adjusted to reflect the heredity principle.
From the posterior probabilities in the second round it is possible to determine which main
effects seem to be active (posterior probability larger than 0.5). The prior probabilities for
two-factor interactions exhibiting strong heredity are then increased to 0.3, and the prior
probabilities for two-factor interactions with weak or no heredity are reduced to 0.02. The
posterior probabilities are then recalculated to reflect effect heredity. Notice that changing the
prior probability of an effect also affects the posterior probabilities of all other effects, as the
estimate of the standard deviation of random effects depends on the priors.
This three-step procedure produces posterior probabilities for all effects reflecting, in
turn, effect sparsity, hierarchy, and heredity. The three posterior probabilities for each effect
can then be compared when deciding on whether the effect is active or not. We argue that the
three steps are a formalization of the thinking process when analyzing a normal probability plot,
successively considering the three governing principles. Figure 1 gives a summary of the
procedure.
Round 1 (considers sparsity): all effects receive prior probability 0.2; posterior probabilities are calculated.
Round 2 (considers sparsity and hierarchy): main effects 0.5, two-factor interactions 0.1, higher-order interactions 0.01; posterior probabilities are calculated.
Round 3 (considers sparsity, hierarchy, and heredity): main effects 0.5, two-factor interactions with strong heredity 0.3, two-factor interactions with weak or no heredity 0.02, higher-order interactions 0.01; posterior probabilities are calculated.
Figure 1. An outline of the three-step Bayesian analysis procedure in which the sparsity, hierarchy,
and heredity principles are incorporated in the prior probabilities for the effects.
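The round-by-round prior assignments in Figure 1 can be written down compactly. The sketch below is our own illustration of the rule, with hypothetical helper and argument names; effect names are strings such as "A" or "BC", and active_mains is the set of main effects with posterior probability above 0.5 after round 2:

```python
def round_priors(effect_names, round_no, active_mains=frozenset()):
    """Prior probability alpha for each effect in rounds 1-3 (cf. Figure 1)."""
    alphas = {}
    for name in effect_names:
        order = len(name)                    # 1 = main effect, 2 = 2FI, ...
        if round_no == 1:                    # sparsity only
            alphas[name] = 0.2
        elif round_no == 2:                  # sparsity and hierarchy
            alphas[name] = {1: 0.5, 2: 0.1}.get(order, 0.01)
        else:                                # sparsity, hierarchy and heredity
            if order == 1:
                alphas[name] = 0.5
            elif order == 2:                 # strong heredity: both parents active
                strong = all(parent in active_mains for parent in name)
                alphas[name] = 0.3 if strong else 0.02
            else:
                alphas[name] = 0.01
    return alphas
```

For example, with A and B active after round 2, round_priors(["A", "B", "C", "AB", "AC"], 3, {"A", "B"}) assigns 0.3 to AB but only 0.02 to AC.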
4. Examples
To illustrate our proposed Bayesian approach using the sparsity, hierarchy, and heredity
principles we choose to analyze experiments from the literature where we do not have any
prior knowledge about the activity of effects.
4.1 The spray coating experiment
Consider the article by Saravanan et al. (2001), which describes an unreplicated 2^4 spray
coating experiment where the effects of altered fuel ratio (A), carrier gas rate (B),
frequency of detonations (C), and spray distance (D) were measured on six responses. Here we
choose to analyze only the porosity response (vol.% of the Al2O3 coating).
For the porosity response, the original authors concluded that factors A, B, and D were
active and that all interactions, including two-factor interactions, were only measuring noise.
Indeed, A, B, and D are the largest effects, but it is unclear from the article whether the original
authors had some prior knowledge that was used during the analysis. We now reanalyze the
experiment assuming a lack of prior knowledge and using our proposed three-step Bayesian
approach.
The effects and the three sets of prior probabilities, $\alpha$, used in our analysis, as well as the
posterior probabilities, Pr, are given in Table 5. Figure 2 shows a normal probability plot of
the effects in Table 5. The posterior probabilities from the three steps in the analysis procedure
can be viewed and compared in Figure 3.
From our analysis we also conclude that factors A, B, and D are active, and this is true
for all rounds. Furthermore, it seems likely that factor C as well as the two-factor interaction
BC are active, which becomes even more prominent after considering the hierarchy principle
(round 2).
Figure 2. Normal probability plot for the effects in the experiment in Saravanan et al. (2001).
Effect heredity is incorporated in round 3, and this further increases the posterior
probability of BC. In addition, the posterior probability of AC increases from about 0.4 to 0.8,
and the interaction may be considered active due to heredity. We conclude that it is likely that
A, B, C, D, BC, and AC are active. The spray coating example illustrates how the posterior
probabilities in the three-step procedure produce information about the activity of effects while
making the impact of the governing principles transparent. Furthermore, it illustrates how
considering effect hierarchy and heredity can increase the posterior probability of main effects
and of interactions exhibiting strong heredity (like the AC interaction above).
Table 5. Effects, prior and posterior probabilities for the analysis of the porosity response for the
experiment in Saravanan et al. (2001). k = 10. α for Round 1 is 0.2 for all effects. Posterior
probabilities with values of 0.5 or larger are marked with an asterisk (*).

Model term           Estimated   α        α        Pr        Pr        Pr
(effect)             effect      Round 2  Round 3  Round 1   Round 2   Round 3
A (fuel ratio)       -2.048      0.5      0.5      0.969*    0.999*    0.999*
B (gas rate)          0.929      0.5      0.5      0.699*    0.946*    0.992*
C (detonations)       0.787      0.5      0.5      0.654*    0.914*    0.984*
D (spray distance)    1.022      0.5      0.5      0.728*    0.960*    0.994*
AB                   -0.193      0.1      0.3      0.049     0.023     0.102
AC                   -0.488      0.1      0.3      0.456     0.416     0.790*
AD                   -0.051      0.1      0.3      0.025     0.012     0.044
BC                    0.884      0.1      0.3      0.685*    0.864*    0.985*
BD                   -0.196      0.1      0.3      0.050     0.023     0.104
CD                    0.237      0.1      0.3      0.074     0.034     0.159
ABC                  -0.103      0.01     0.01     0.029     0.001     0.001
ABD                   0.034      0.01     0.01     0.025     0.001     0.001
ACD                  -0.261      0.01     0.01     0.095     0.005     0.007
BCD                  -0.041      0.01     0.01     0.025     0.001     0.001
ABCD                 -0.056      0.01     0.01     0.026     0.001     0.001
Average                -         0.18     0.26       -         -         -
Figure 3. The posterior probabilities of effects after round 1, 2 and 3 for the spray coating
experiment.
in seconds (C), transfer time in seconds to the camber former (D), hold time in seconds for the
part in the camber former (E), and quench oil temperature in °F (O). The experiment was
replicated three times, and the authors assumed in their analysis that the replicates were true
replicates rather than duplicate measures, see Table 6.
Table 6. The design used in the leaf spring experiment and the corresponding data.

Run   Furnace   Heating   Transfer   Hold     Oil      Y = Free height
      temp. B   time C    time D     time E   temp. O  Rep. 1  Rep. 2  Rep. 3
1        -         -         -          -        -      7.78    7.78    7.81
2        +         -         -          +        -      8.15    8.18    7.88
3        -         +         -          +        -      7.50    7.56    7.50
4        +         +         -          -        -      7.59    7.56    7.75
5        -         -         +          +        -      7.94    8.00    7.88
6        +         -         +          -        -      7.69    8.09    8.06
7        -         +         +          -        -      7.56    7.62    7.44
8        +         +         +          +        -      7.56    7.81    7.69
9        -         -         -          -        +      7.50    7.25    7.12
10       +         -         -          +        +      7.88    7.88    7.44
11       -         +         -          +        +      7.50    7.56    7.50
12       +         +         -          -        +      7.63    7.75    7.56
13       -         -         +          +        +      7.32    7.44    7.44
14       +         -         +          -        +      7.56    7.69    7.62
15       -         +         +          -        +      7.18    7.18    7.25
16       +         +         +          +        +      7.81    7.50    7.59
The analysis of the experiment followed standard analysis of variance, see Table 7 for the
sums of squares and p-values. It was concluded that the contrasts including the main effects and
interactions O, B, C, CO, and E were active. The p-value of the BO interaction was
comparatively low, but it was not concluded to be active, and this conclusion was not further
discussed by Pignatiello and Ramberg.
Assume now that only one unreplicated experiment had been performed: how likely is it
that the Bayesian approach depicted here would point to the same conclusions, based on a
single-replicate experiment? To try to answer this question under the assumption of true
replicates, one of the three replicate measurements was drawn from each run to generate a
sample of 200 single-replicate experiments drawn from the 3^16 = 43,046,721 possible
combinations (for instance [7.78, 8.15, 7.50, 7.75, 8.00, 7.69, ... , 7.59]). Each of the 200
samples thus represented one possible outcome if the experiment had not been replicated.
Contrasts were calculated for each sample, and then the posterior probabilities for each contrast
were calculated, thus providing a sample of 200 posterior probability vectors. The average
posterior probabilities, P̄, and the standard deviations of the posterior probabilities, s_P, for
rounds 1 to 3 are given in Table 7.
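The resampling scheme is simple to reproduce. A sketch (our own code; the names are illustrative, and only the first three runs of Table 6 are shown as data):

```python
import random

def draw_single_replicate(runs, rng):
    """Pick one of the replicate responses for each run, giving one of the
    3^16 possible unreplicated outcomes of the experiment."""
    return [rng.choice(replicates) for replicates in runs]

rng = random.Random(1)
runs = [(7.78, 7.78, 7.81), (8.15, 8.18, 7.88), (7.50, 7.56, 7.50)]
samples = [draw_single_replicate(runs, rng) for _ in range(200)]
```

Contrasts and posterior probabilities are then computed for each of the 200 resulting response vectors.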
Table 7. Extract of the original ANOVA analysis based on three replicates (Pignatiello and Ramberg,
2000), and posterior calculations based on 200 single-replicate runs for the leaf spring experiment. The
contrasts considered active in the original analysis are marked with an asterisk (*), as are the average
posterior probabilities with values of 0.5 or larger in the three rounds.

Model term    Original analysis       Round 1        Round 2        Round 3
(contrast)    Sum of     p-value      P̄      s_P    P̄      s_P    P̄      s_P
              Squares
B+CDE*        0.587     <0.001       0.52*   0.23   0.81*   0.17   0.84*   0.17
C+BDE*        0.373     <0.001       0.40    0.27   0.69*   0.22   0.73*   0.24
D+BCE         0.010      0.44        0.07    0.05   0.24    0.10   0.25    0.13
O+BCDEO*      0.809     <0.001       0.65*   0.24   0.89*   0.11   0.91*   0.10
BC+DE         0.004      0.63        0.07    0.03   0.04    0.02   0.10    0.10
BD+CE         0.005      0.69        0.06    0.02   0.03    0.01   0.04    0.07
BO+CDEO       0.086      0.03        0.11    0.07   0.09    0.12   0.22    0.21
CD+BE         0.015      0.35        0.10    0.12   0.05    0.08   0.05    0.08
CO+BDEO*      0.328     <0.001       0.37    0.24   0.35    0.27   0.50*   0.33
DO+BCEO       0.035      0.16        0.10    0.10   0.07    0.12   0.04    0.13
E+BCD*        0.129      0.01        0.15    0.14   0.41    0.19   0.44    0.23
BCO+DEO       0.001      0.81        0.06    0.03   0.00    0.01   0.00    0.01
BDO+CEO       0.020      0.26        0.08    0.06   0.01    0.01   0.01    0.01
CDO+BEO       0.027      0.21        0.08    0.05   0.01    0.01   0.01    0.01
EO+BCDO       0.009      0.47        0.07    0.03   0.03    0.03   0.04    0.07

Note: P̄ is the average posterior probability for the contrast based on 200 randomly drawn single-replicate samples
from the data in Table 6. s_P is the standard deviation of the 200 posterior probabilities for each round.
If we assume that the original analysis of the activity of effects by Pignatiello and
Ramberg was correct, that analysis can be used as a benchmark. If we were to base our
decisions on the Box-Meyer approach (round 1), we would, on average, conclude that O and
B were the only active contrasts. Using the hierarchy principle (round 2), one would also
include C. With all governing principles (round 3), CO would also, on average, be considered
active approximately as often as not. The posterior probability of the E contrast was larger
than 0.5 for seven samples (of the 200) after round 1, a figure that increased to 59 samples after
round 2 and 71 samples after round 3. Note also that the average posterior probabilities of
active effects increase from round 1 to 3. As an illustration of the simulation results, a 3D
histogram of the posterior probabilities of the CO contrast is given in Figure 4.
The leaf spring example shows how the power of the analysis can be increased by
consecutively considering the sparsity, hierarchy, and heredity principles in our Bayesian
approach. In fact, for four out of five contrasts in this example we did, on average, arrive at
the same conclusion about activity using an unreplicated experiment as the original authors did
using three replicates of the design. We do not mean that the replication of the design was
unnecessary, only that with a powerful analysis method we would draw approximately the
same conclusions even if replication were not possible.
Figure 4. Histogram of the posterior probabilities of the CO contrast after rounds 1, 2, and 3 for the
leaf spring experiment, based on the 200 contrast vectors. Each bar represents the number of
occurrences where the posterior probability fell within that class, e.g. 0 ≤ P̄ < 0.05.
in the literature survey were initiated since previous experiments had indicated that some of
the main factors and interactions were active. Hence, the results regarding the frequency of
active main effects should probably be seen as an upper limit in a screening experiment. A
trend supporting this notion can be noted in Table 2, where main effects are less often
active in the fractional factorials than in the full factorials.
In this article we outline a Bayesian approach for analysis of unreplicated two-level
factorials, which extends the Box and Meyer (1986) method to allow for individual prior
probabilities of all effects. Process knowledge can therefore be considered when determining
the prior probabilities for each effect. In this article we focus on the situation when process
knowledge is limited and instead we illustrate how knowledge about the governing principles
of effect sparsity, hierarchy, and heredity can be used to increase the power of the analysis. We
use the results from our literature study to illustrate how the principles can be used to increase
the power of the Box and Meyer method. Indeed, process knowledge and knowledge about
the principles can be combined to further increase analysis power.
Our proposed method uses three steps in which effect sparsity, hierarchy, and heredity are
successively added by adjusting the prior probabilities for the effects. The posterior probabilities
for the activity of each effect are then calculated using MCMC integration. The advantage of
performing all three steps is that the experimenter can compare the three posterior probabilities
for each effect in a formal procedure that successively considers the governing principles. We
agree with Box and Meyer (1986) that the Bayesian approach should be considered
complementary to a normal probability plot. Together, the approaches provide a strong tool
for deciding on the activity of effects of an unreplicated experiment.
Note that if the proposed three-step approach changes the prior probability for a specific
effect, this also results in slight changes of the posterior probabilities of all other effects. For
borderline effects (posterior probabilities around 0.5), this may push the posterior probability
past 0.5, as the distribution of random effects is updated. The step-wise approach of updating the
prior probabilities does, therefore, raise the question of whether the assumption of independence of
effects is stretched too far, see also the discussion in Chipman et al. (1997) concerning the
choice of hierarchical priors. Since the prior probabilities of the effects in step 3 of our method
are based on the posterior probabilities in step 2, the priors depend on previous results of the
analysis. However, note that the selection procedure is based on the accumulated knowledge
of the likelihood of the effects being active. We thus argue that the three-step approach is a
better way to formally consider the hierarchy and heredity principles, which otherwise
must be considered in a one-shot analysis of a normal probability plot.
As industrial experimentation is expensive, most experimenters will not settle for
conclusions such as "the effect of treatment A was not quite significant at the 5 % significance
level." In our experience, the experimenter will usually include process knowledge
when determining which factors should be considered active. If an effect corresponds to what was
expected, much trust may be put in classifying the effect as active even though it is not
statistically significant. However, there is a risk that experimenters go too far in this direction.
A formal elicitation method like the one we propose is a way to make engineering reasoning
and analysis results more transparent for unreplicated experiments.
Bayesian analysis methods provide excellent possibilities to incorporate prior knowledge
of different kinds for decision-making, but currently require mathematical and statistical
knowledge above that of the average user of designed experiments. To facilitate the use of
Bayesian analysis methods for unreplicated factorials, the methods probably need to be made
available in common statistical analysis software.
Acknowledgement
We gratefully acknowledge the financial support from the Swedish mining company LKAB, as
well as the County Administrative Board under grant 303-02863-2008, and the Regional
Development Fund of the European Union, grant 43206 which made this research possible.
References:
Berk, K. N. and Picard, R. R. (1991). Significance Tests for Saturated Orthogonal Arrays.
Journal of Quality Technology, 23(2): 79-89.
Li, X., Sudarsanam, N. and Frey, D. (2006). Regularities in Data From Factorial Experiments.
Complexity, 11(5): 32-45.
Lundquist, L., Arpin, G., Leterrier, Y., Berthold, F., Lindström, M. and Månson, J.-A. E.
(2004). Alkali-Methanol-Anthraquinone Pulping of Miscanthus x Giganteus For
Thermoplastic Composite Reinforcement. Journal of Applied Polymer Science, 92(4): 2132-2143.
Montgomery, D. C. (2005). Design and analysis of experiments, 6th ed., New York, NY, Wiley.
Pignatiello, J. J. Jr. and Ramberg, J. S. (1985). Discussion of Kackar's "Off-Line Quality
Control, Parameter Design and the Taguchi Method". Journal of Quality Technology,
17(4): 199-206.
Pedersen, W. and Ramulu, M. (2006). Facing SiCp/Mg Metal Matrix Composites With
Carbide Tools. Journal of Materials Processing Technology, 172(3): 417-423.
Pei, Z. J., Xin, X. J. and Liu, W. (2002). Finite Element Analysis for Grinding of Wire-Sawn
Silicon Wafers: A Designed Experiment. International Journal of Machine Tools &
Manufacture, 43(1): 7-16.
Poon, G. K. K. and Williams, D. J. (1999). Characterization of a Solder Paste Printing Process
and Its Optimization. Soldering & Surface Mount Technology, 11(3): 23-26.
Reche, F., Garrigós, M. C., Sánchez, A. and Jiménez, A. (2000). Simultaneous Supercritical
Fluid Derivatization and Extraction of Formaldehyde by the Hantzsch Reaction. Journal
of Chromatography A, 896(1-2): 51-59.
Sandvik-Wiklund, P. and Bergman, B. (1999). Finding Active Factors From Unreplicated
Fractional Factorials Using the Total Time on Test (TTT) Technique. Quality and
Reliability Engineering International, 15(3): 191-203.
Saravanan, P., Selvarajan, V., Joshi, S. V. and Sundararajan, G. (2001). Experimental Design
and Performance Analysis of Alumina Coatings Deposited by a Detonation Spray
Process. Journal of Physics D: Applied Physics, 34: 131-140.
Schneider, H., Kasperski, W. J. and Weissfeld, L. (1993). Finding Significant Effects For
Unreplicated Fractional Factorials Using the n Smallest Contrasts. Journal of Quality
Technology, 25(1): 18-27.
Silva, C. J. S. M., Gübitz, G. and Cavaco-Paulo, A. (2003). Optimization of a Serine Protease
Coupling to Eudragit S-100 by Experimental Design Techniques. Journal of Chemical
Technology and Biotechnology, 81(1): 8-16.
Smith, S. D., Osborne, J. R. and Forde, M. C. (1995). Analysis of Earth-Moving Systems
Using Discrete-Event Simulation. Journal of Construction Engineering and Management,
121(4): 388-396.
Stephenson, W. R., Hulting F. L. and Moore K. (1989). Posterior Probabilities for Identifying
Active Effects in Unreplicated Experiments. Journal of Quality Technology, 21(3): 202-212.
Tan, B. L. and Tan, T. L. (2005). A Study of Si/SiGe Selective Epitaxial Growth by
Experimental Design Approach. Thin Solid Films, 504(1-2): 95-100.
Venter, J. H. and Steel, S. J. (1996). A Hypothesis-Testing Approach Toward Identifying
Active Contrasts. Technometrics, 38(2): 304-313.
Voss, D. T. (1988). Generalized Modulus-Ratio Tests For Analysis of Fractional Factorials
With Zero Degrees of Freedom For Error. Communications in Statistics - Theory and
Methods, 17: 3345-3359.
$$\Pr(i \text{ active} \mid \boldsymbol{\theta}) \approx \frac{1}{N} \sum_{n=1}^{N} \Pr(i \text{ active} \mid \theta_i, \sigma_n) \qquad (A.1)$$
The samples, $\sigma_n$, are generated by creating a Markov chain with stationary distribution
$p(\sigma \mid \boldsymbol{\theta})$. We apply the Metropolis algorithm (with a symmetric jumping distribution), which
in turn is a special case of the more general Metropolis-Hastings algorithm, see, for example,
Gelman et al. (2004):
- Initialize $\sigma_1$ (see below).
- For $n = 2, \ldots, N$:
  - Propose an update of $\sigma_n$, $\sigma'$, by adding a symmetrically distributed random
    variable (we use a normally distributed variable with mean zero and standard deviation $\gamma$), that is,
    $\sigma' = \sigma_{n-1} + W$, where $W \sim N(0, \gamma^2)$.
  - If $\sigma' > 0$, calculate $q = p(\sigma' \mid \boldsymbol{\theta}) / p(\sigma_{n-1} \mid \boldsymbol{\theta})$, else $q = 0$. Note that since $q$ is a
    ratio, it is sufficient to know $p(\sigma \mid \boldsymbol{\theta})$ up to proportionality in Eq. (4).
  - Draw a uniformly distributed random variable $a$ between zero and one, that is, $a \sim U(0,1)$.
  - If $a \le q$, keep the new sample: $\sigma_n = \sigma'$.
  - Else, keep the old sample: $\sigma_n = \sigma_{n-1}$.
- End.
The procedure will converge under fairly weak conditions, see Gelman et al. (2004).
Furthermore, Gelman et al. (2004) recommend an acceptance rate of 0.44 for new samples,
$\sigma'$, for a one-dimensional problem. To achieve the desired acceptance rate we first calculate a
reasonable starting value for $\sigma_1$ (or $\omega_1$ after reparameterization, see below). We also
continuously adjust the standard deviation, $\gamma$, of the symmetric jumping variable, $W$, to
prevent the algorithm from getting caught in, for example, heavy tails of the posterior distribution of $\sigma$.
To obtain a reasonable starting value $\sigma_1$, some of the effects with the smallest absolute
values are selected, and $\sigma_1$ is calculated as the standard deviation of these effects. The effects
are sorted by absolute value, the smallest half (rounded down to the nearest integer) are
selected, and the standard deviation of these effects is calculated. For example, the 3, 7, and
15 effects of smallest absolute value are selected for the cases with 7, 15, and 31 effects
(contrasts), respectively. Furthermore, an initial setting of $\gamma$ is required, here
chosen as $0.2 \times \sigma_1$. After every 100 samples of the Metropolis algorithm, the acceptance rate is
adjusted. If the acceptance rate is smaller (larger) than 0.44, $\gamma$ is decreased (increased) by 5
percent. The acceptance rate is recalculated after another 100 samples. This procedure
automatically calibrates the standard deviation, $\gamma$, of the symmetric jumping variable. We
have also found that a burn-in period of 1,000 samples before starting to sum the posterior
probabilities in (A.1) is useful to reduce the possible bias from the starting values of $\sigma_1$ and $\gamma$.
The total number of samples, N, required for stable approximations of the posterior
probabilities varies among the examples we have tested. Using N = 100,000 has produced
posterior probability estimates stable down to the third decimal. Using these settings, the
calculation time for each round in our proposed method is a few seconds on a PC with a
1.7 GHz processor. Higher precision is achieved by increasing N.
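The tuning scheme above (start at γ = 0.2·σ₁, recalibrate every 100 samples toward a 0.44 acceptance rate, discard a 1,000-sample burn-in) can be sketched as follows. This is our own minimal implementation, not the authors' code; log_post is any function returning log p(σ | θ) up to an additive constant:

```python
import math
import random

def metropolis_sigma(log_post, sigma1, n_samples=10000, burn_in=1000, seed=0):
    """Metropolis sampler for sigma > 0 with adaptive step size gamma."""
    rng = random.Random(seed)
    sigma, gamma = sigma1, 0.2 * sigma1
    lp = log_post(sigma)
    chain, accepted = [], 0
    for n in range(1, burn_in + n_samples + 1):
        prop = sigma + rng.gauss(0.0, gamma)      # symmetric jump
        if prop > 0:
            lp_prop = log_post(prop)
            # accept with probability min(1, q); only the ratio is needed
            if rng.random() < math.exp(min(0.0, lp_prop - lp)):
                sigma, lp = prop, lp_prop
                accepted += 1
        if n % 100 == 0:                          # recalibrate gamma
            gamma *= 1.05 if accepted / 100 > 0.44 else 0.95
            accepted = 0
        if n > burn_in:
            chain.append(sigma)
    return chain
```

The posterior probabilities in (A.1) are then averages of Eq. (1) over the returned chain.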
For some parameter settings, the distribution in (4) becomes challenging to integrate.
Particular concerns include near-singularities and heavy tails. In these cases, a large number of
samples, N, is generally required to ensure proper convergence of (A.1). To limit these
effects, the problem was reparameterized using:

$$\omega = \frac{1}{\sigma} \qquad (A.2)$$
Hence, using this variable change in Eqs. (5), (1), and (4) we now have:

$$\Pr(i \text{ active} \mid \boldsymbol{\theta}) = \int \Pr(i \text{ active} \mid \theta_i, \omega)\, p(\omega \mid \boldsymbol{\theta})\, d\omega \qquad (A.3)$$

$$\Pr(i \text{ active} \mid \theta_i, \omega) = \frac{\dfrac{\alpha_i \omega}{k\sqrt{2\pi}}\exp\!\left(-\dfrac{\theta_i^2\omega^2}{2k^2}\right)}{\dfrac{\alpha_i \omega}{k\sqrt{2\pi}}\exp\!\left(-\dfrac{\theta_i^2\omega^2}{2k^2}\right)+\dfrac{(1-\alpha_i)\omega}{\sqrt{2\pi}}\exp\!\left(-\dfrac{\theta_i^2\omega^2}{2}\right)} \qquad (A.4)$$

$$p(\omega \mid \boldsymbol{\theta}) \propto \omega^{\nu-1} \prod_{i=1}^{\nu} \left[ \frac{\alpha_i}{k}\exp\!\left(-\frac{\theta_i^2\omega^2}{2k^2}\right) + (1-\alpha_i)\exp\!\left(-\frac{\theta_i^2\omega^2}{2}\right) \right] \qquad (A.5)$$
We can now calculate the posterior probability for the effects by generating samples from
$p(\omega \mid \boldsymbol{\theta})$ with the Metropolis algorithm and approximating the integral in (5) by:

$$\Pr(i \text{ active} \mid \boldsymbol{\theta}) \approx \frac{1}{N} \sum_{n=1}^{N} \Pr(i \text{ active} \mid \theta_i, \omega_n) \qquad (A.6)$$
However, for cases with a large number of effects (large $\nu$) we encounter another difficulty.
The product of a large number of expressions of the kind $a\exp(x) + b\exp(y)$ in (A.5) may be
very small and cause numerical problems, especially when calculating the ratio, $q$, in the Metropolis
algorithm. To solve this problem, $p(\omega \mid \boldsymbol{\theta})$ was rewritten as:
$$p(\omega \mid \boldsymbol{\theta}) \propto \omega^{\nu-1} \prod_{i=1}^{\nu} \exp\!\left\{ \log\!\left[ \frac{\alpha_i}{k}\exp\!\left(-\frac{\theta_i^2\omega^2}{2k^2}\right) + (1-\alpha_i)\exp\!\left(-\frac{\theta_i^2\omega^2}{2}\right) \right] \right\}
\propto \omega^{\nu-1} \exp\!\left\{ \sum_{i=1}^{\nu} \log\!\left[ \frac{\alpha_i}{k}\exp\!\left(-\frac{\theta_i^2\omega^2}{2k^2}\right) + (1-\alpha_i)\exp\!\left(-\frac{\theta_i^2\omega^2}{2}\right) \right] \right\} \qquad (A.7)$$
We then use

$$l_1(i, \omega) = \log\frac{\alpha_i}{k} - \frac{\theta_i^2\omega^2}{2k^2} \quad \text{and} \quad l_2(i, \omega) = \log(1-\alpha_i) - \frac{\theta_i^2\omega^2}{2} \qquad (A.8)$$

so that

$$p(\omega \mid \boldsymbol{\theta}) \propto \omega^{\nu-1} \exp\!\left\{ \sum_{i=1}^{\nu} \log\!\left[ \exp\big(l_1(i,\omega)\big) + \exp\big(l_2(i,\omega)\big) \right] \right\} \qquad (A.10)$$
Formula (A.10) does not solve the problem completely, but it allows us to use the following
equality:

$$g(x, y) \equiv \log\big(\exp(x) + \exp(y)\big) = \max(x, y) + \log\!\left(1 + \exp\big(-\lvert x - y \rvert\big)\right) \qquad (A.11)$$
We can use this when calculating the ratio, $q$, in the Metropolis algorithm and thereby create a
robust implementation. That is,

$$q = \frac{p(\omega' \mid \boldsymbol{\theta})}{p(\omega_{n-1} \mid \boldsymbol{\theta})} = \frac{(\omega')^{\nu-1}}{\omega_{n-1}^{\nu-1}} \exp\!\left\{ \sum_{i=1}^{\nu} \Big[ g\big(l_1(i,\omega'), l_2(i,\omega')\big) - g\big(l_1(i,\omega_{n-1}), l_2(i,\omega_{n-1})\big) \Big] \right\} \qquad (A.12)$$
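Equation (A.11) is the standard "log-sum-exp" device for avoiding underflow. A sketch of g and of the log-scale Metropolis ratio in (A.12) (our own illustration; the function names are ours):

```python
import math

def g(x, y):
    """Stable log(exp(x) + exp(y)), Eq. (A.11)."""
    m = max(x, y)
    return m + math.log1p(math.exp(-abs(x - y)))

def log_q(omega_new, omega_old, l1, l2, nu):
    """log of the Metropolis ratio q in Eq. (A.12); l1(i, w) and l2(i, w)
    are the per-effect log terms from Eq. (A.8)."""
    lr = (nu - 1) * (math.log(omega_new) - math.log(omega_old))
    for i in range(nu):
        lr += g(l1(i, omega_new), l2(i, omega_new)) - \
              g(l1(i, omega_old), l2(i, omega_old))
    return lr

# Naive log(exp(-800) + exp(-801)) underflows to log(0); g stays finite:
print(g(-800.0, -801.0))   # -800 + log(1 + e**-1), about -799.687
```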