

CHAPTER V

RESULT & IMPLEMENTATION

5.1 CASE STUDY

Process capability index for two correlated quality characteristics following a bivariate exponential distribution.

The percentile method is used to evaluate process capability for non-normal bivariate characteristics. To use equation (4.20), one needs the probability that the quality characteristics fall between the specification limits, and to calculate this probability we first need to know the distribution of the data. The following steps are involved in the calculation:

1. Select the sample data (X1, Y1), (X2, Y2), ..., (Xn, Yn). A real data set is taken from Wang's paper [116]. Wang discussed a manufactured product (a connector) from the computer industry having seven quality characteristics. The data set contains a sample of 100 parts tested on the seven characteristics of interest to the manufacturer. Based on the quality characteristics and the manufacturing processes, variables 1 (contact gap) and 2 (contact loop) were found to be correlated, so these two variables were selected for the study; a correlation check is sketched after Table 5.1. The specification limits for these characteristics are 0.10 ± 0.04 and 0 + 0.50, respectively.

Table 5.1: The measurement data for the case study


S.No x1 x2
1. 0.1165 0.0614
2. 0.1259 0.0277
3. 0.1265 0.0762
4. 0.1185 0.0957
5. 0.1414 0.1319
6. 0.092 0.0476
7. 0.0804 0.044
8. 0.1103 0.0901
9. 0.1022 0.0918
10. 0.1103 0.0823
11. 0.1069 0.0943
12. 0.1075 0.095
13. 0.117 0.1177
14. 0.1276 0.1378
15. 0.1039 0.0645
16. 0.1251 0.0988
17. 0.1153 0.1553
18. 0.0961 0.0142
19. 0.117 0.0808
20. 0.1158 0.078
21. 0.1134 0.1511
22. 0.1356 0.0407
23. 0.1222 0.0654
24. 0.1124 0.1516
25. 0.1155 0.0695
26. 0.1223 0.0853
27. 0.113 0.0869
28. 0.1137 0.0652
29. 0.1112 0.1357
30. 0.1114 0.0909
31. 0.0978 0.097
32. 0.1209 0.0939
33. 0.1092 0.0419
34. 0.1161 0.1088
35. 0.1131 0.0698
36. 0.1193 0.1445
37. 0.1233 0.09
38. 0.089 0.1102
39. 0.1074 0.0507
40. 0.0947 0.0845
41. 0.1048 0.0661
42. 0.1192 0.0448
43. 0.1143 0.1009
44. 0.1102 0.0815
45. 0.1084 0.1617
46. 0.1071 0.0423
47. 0.1106 0.0585
48. 0.1051 0.1151
49. 0.1167 0.0827
50. 0.1062 0.0566
51. 0.1091 0.0221
52. 0.1138 0.119
53. 0.1309 0.0699
54. 0.1075 0.0417
55. 0.1157 0.0377
56. 0.1193 0.0752
57. 0.1239 0.0849
58. 0.1133 0.0418
59. 0.1206 0.0818
60. 0.122 0.0733
61. 0.1161 0.117
62. 0.1186 0.1044
63. 0.102 0.0529
64. 0.1116 0.0563
65. 0.1111 0.0352
66. 0.0986 0.0437
67. 0.1166 0.0651
68. 0.1187 0.0535
69. 0.1206 0.1073
70. 0.1139 0.0956
71. 0.1103 0.0332
72. 0.1089 0.0843
73. 0.1043 0.0558
74. 0.1056 0.0651
75. 0.1145 0.121
76. 0.1277 0.0945
77. 0.112 0.1285
78. 0.1134 0.0854
79. 0.1237 0.0752
80. 0.127 0.0933
81. 0.1154 0.1247
82. 0.0967 0.0685
83. 0.1056 0.0575
84. 0.1116 0.1168
85. 0.1113 0.0839
86. 0.1149 0.1019
87. 0.1091 0.017
88. 0.112 0.0734
89. 0.1074 0.0685
90. 0.1112 0.0807
91. 0.1097 0.0993
92. 0.1085 0.0882
93. 0.1009 0.1252
94. 0.1071 0.0686
95. 0.1157 0.0518
96. 0.1127 0.1203
97. 0.1124 0.1029
98. 0.1017 0.0753
99. 0.121 0.0299
100. 0.1107 0.1671
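As a quick check of the reported dependence between the two selected variables, a minimal MATLAB sketch is given below; it assumes the two columns of Table 5.1 have already been loaded as vectors x1 and x2 (hypothetical variable names).

% Hypothetical sketch: sample correlation between variables 1 and 2 of Table 5.1
% (x1, x2 are assumed to be 100-by-1 vectors already loaded from the table).
R = corrcoef(x1, x2);                      % 2-by-2 sample correlation matrix
fprintf('Sample correlation between X1 and X2: %.3f\n', R(1,2));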

2. Data fitting.
Determining which distribution the data follow is an important step. The data can easily be put into a software package that tests many candidate distributions to find which one fits best, but there should also be a reason for choosing a particular distribution: it must make sense for the process. Here it makes sense that the data follow an exponential distribution; a minimal fitting comparison is sketched below.
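A minimal MATLAB sketch of such a comparison is given below. It assumes the Statistics Toolbox and the hypothetical vector x1 from Table 5.1, and simply compares the maximized log-likelihoods of three candidate marginal fits.

% Hypothetical sketch: compare candidate marginal fits for X1 by maximized log-likelihood.
x   = x1(:);                                        % variable 1 from Table 5.1 (assumed loaded)
muE = expfit(x);   llE = sum(log(exppdf(x, muE)));              % exponential fit
pG  = gamfit(x);   llG = sum(log(gampdf(x, pG(1), pG(2))));     % gamma fit
pW  = wblfit(x);   llW = sum(log(wblpdf(x, pW(1), pW(2))));     % Weibull fit
fprintf('logL  exponential: %.1f   gamma: %.1f   Weibull: %.1f\n', llE, llG, llW);
% The same comparison can be repeated for x2.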

Figure 5.1: Scatter plot of variables X1, X2 by minitab.

The descriptive statistics for X1 and X2 from Minitab are:

Variable   N     N*   Mean      StDev     Variance
X1         100   0    0.11267   0.00937   0.000088
X2         100   0    0.08265   0.03341   0.00112

The corresponding coefficients of variation (100 × StDev/Mean) are about 8.3% for X1 and 40.4% for X2.

Once a distribution (with a particular set of parameters) has been fitted to the data, a number of additional indices and measures can be estimated. One can compute the cumulative distribution function (commonly denoted F(t)) for the fitted distribution, along with its standard errors. From this, the percentiles of the cumulative survival (and failure) distribution can be determined in order to predict the time at which a predetermined percentage of components can be expected to have failed.
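For example, for an exponential marginal fitted to X1, the fitted CDF and its percentiles can be obtained as in the MATLAB sketch below; the evaluation point 0.12 and the 90% level are illustrative choices only.

% Hypothetical sketch: fitted CDF values and percentiles for an exponential marginal.
mu  = expfit(x1);                 % MLE of the exponential mean for X1 (x1 assumed loaded)
Ft  = expcdf(0.12, mu);           % fitted CDF evaluated at t = 0.12 (illustrative value)
t90 = expinv(0.90, mu);           % percentile: time by which 90% of items are expected to fail
fprintf('F(0.12) = %.3f, 90th percentile = %.4f\n', Ft, t90);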

The residual plot is used to check the normality assumption of the errors; conclusions drawn when the errors are non-normal may be erroneous.
Figure 5.2: Error rate of Simple Exponential and proposed Bivariate Exponential -ML

This figure shows a weighted least-squares fit of a straight line to a set of points with error in both coordinates. The method can handle bivariate regression in which the errors in the two coordinates are correlated, and it is capable of performing force-fit regression.
Figure 5.3: Proposed bivariate exponential ML fit and simple Weibull distribution of y.

The Weibull distribution is one of the most widely used lifetime distributions in reliability engineering. It is a versatile distribution that can take on the characteristics of other types of distributions, depending on the value of its shape parameter.

Figure 5.4: Probability for Weibull plot for Data


The horizontal axis in Figure 5.4 is labeled as time (Data), and the vertical axis represents probability. In probability plots (P-P plots for short) the observed cumulative distribution function is plotted against the theoretical cumulative distribution function.
variable are first sorted into ascending order. The i'th observation is plotted against one axis
as i/n (i.e., the observed cumulative distribution function), and against the other axis as F(x(i)),
where F(x(i)) stands for the value of the theoretical cumulative distribution function for the
respective observation x(i). If the theoretical cumulative distribution approximates the
observed distribution well, then all points in this plot should fall onto the diagonal line (as in
the graph below).

Figure 5.5: Best fitted data of Proposed Bivariate Distribution to Best curve

A good fit of the theoretical distribution to the observed values is indicated when the plotted points fall onto a straight line. The adjustment factors r_adj and n_adj ensure that the value passed to the inverse probability integral falls between 0 and 1, excluding 0 and 1.
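A minimal MATLAB sketch of such a probability plot, for an exponential fit to X1, is shown below (x1 is again the hypothetical vector of Table 5.1 measurements).

% Hypothetical sketch: P-P plot of observed vs. theoretical CDF for an exponential fit to X1.
x   = sort(x1(:));
n   = numel(x);
emp = (1:n)'/n;                         % observed cumulative fractions i/n
thr = expcdf(x, expfit(x));             % theoretical CDF at each ordered observation
plot(thr, emp, 'o', [0 1], [0 1], '-')  % points should fall near the diagonal
xlabel('Theoretical CDF F(x_{(i)})'); ylabel('Observed CDF i/n');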
3. Parameter estimation and bivariate statistics

Given a distribution or model for the data, the next step is to fit the model to the data. Typical probability distributions have unknown parameters, numbers that change the shape of the distribution. The technical term for the procedure of finding the values of the unknown parameters of a probability distribution from data is estimation. During estimation one seeks parameters that make the model fit the data best. If this sounds a bit subjective, that's because it is: in order to proceed, we have to provide some mathematical definition of what it means to fit the data best. The description of how well the model fits the data is called the objective function. Typically, statisticians try to find estimators for the parameters that maximize (or minimize) an objective function, and statisticians will disagree about which estimators or objective functions are best.

In the case of the Gaussian distribution, these parameters are the mean and standard deviation, often written as μ and σ. In using the Gaussian distribution as a model for some data, one seeks values of μ and σ that fit the data.

Our proposed distribution is the bivariate exponential; its parameters are two scale parameters λ1 and λ2, and the distribution has a third parameter α indicating the correlation, which may be taken as constant.

Maximum Likelihood Estimation


The most commonly used objective function is known as the likelihood, and the most well-
understood estimation procedures seek to find parameters that maximize the likelihood.
These maximum likelihood estimates (once found) are often referred to as MLEs.

The likelihood is defined as a conditional probability: P(data|model), the probability of the data given the model. Typically, the only part of the model that can change is the parameters, so the likelihood is often written as P(X|θ), where X is a data matrix and θ is a vector containing all the parameters of the distribution(s). This notation makes explicit that the likelihood depends on both the data and the choice of any free parameters in the model.

We can use the joint probability rule to write:

L = P(X|\theta) = P(x_1|\theta)\,P(x_2|\theta)\cdots P(x_n|\theta) = \prod_{i=1}^{n} P(x_i|\theta)

Maximum likelihood estimation says: choose the parameters so that the data are most probable given the model, i.e. find the θ that maximizes L. In practice this equation can be very complicated, and there are many analytic and numerical techniques for solving it.

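A minimal MATLAB sketch of maximum likelihood estimation for the bivariate exponential model is given below. It assumes the FGM/Gumbel Type II density of equation (4.23) extended with parameters λ1 and λ2, i.e. f(x, y) = λ1 λ2 e^{−λ1 x − λ2 y}[1 + α(2e^{−λ1 x} − 1)(2e^{−λ2 y} − 1)]; the function and variable names are hypothetical.

% Hypothetical sketch: MLE for the FGM / Gumbel Type II bivariate exponential.
% The parameters are re-expressed so that lam1, lam2 > 0 and -1 < alpha < 1.
function [lam1, lam2, alpha] = fgm_exp_mle(x, y)
    nll = @(t) -sum(log(fgm_pdf(x, y, exp(t(1)), exp(t(2)), tanh(t(3)))));
    t0  = [log(1/mean(x)), log(1/mean(y)), 0];   % start at the marginal MLEs, alpha = 0
    t   = fminsearch(nll, t0);                   % minimize the negative log-likelihood
    lam1 = exp(t(1)); lam2 = exp(t(2)); alpha = tanh(t(3));
end

function f = fgm_pdf(x, y, l1, l2, a)
    u = exp(-l1*x); v = exp(-l2*y);              % marginal survival functions
    f = l1*l2 .* u .* v .* (1 + a*(2*u - 1).*(2*v - 1));
end

Calling [lam1, lam2, alpha] = fgm_exp_mle(x1, x2) with the Table 5.1 columns would return the fitted λ1, λ2 and the correlation parameter α under this assumed model.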

Manual calculation

Then the bivariate distribution and density functions become [34], for our formula,

F(x, y) = (1 - e^{-x})(1 - e^{-y})\,[1 + \alpha\, e^{-x-y}], \quad x \ge 0,\ y \ge 0,\ -1 \le \alpha \le 1   (4.22)

By Bilodeau and Kariya [37],

f(x, y) = e^{-x-y}\,[1 + \alpha\,(2e^{-x} - 1)(2e^{-y} - 1)]   (4.23)

This is Gumbel's bivariate exponential distribution, Model Type II.

By Farlie [38],

F_{X,Y}(x, y) = F_X(x)\,F_Y(y)\,[1 + \alpha\,(1 - F_X(x))(1 - F_Y(y))]   (4.24)

Because maximum likelihood estimation for this CDF is very complicated to carry out manually, a trial-and-error method is used to calculate the value of C, which is central to the process capability measure; alternatively, C can be obtained by the MLE method.

Manual method. Let

F_X(x) = 1 - e^{-\lambda_1 x} = C_1   (4.25)

F_Y(y) = 1 - e^{-\lambda_2 y} = C_2   (4.26)

and let P = F_{X,Y}(x, y).   (4.27)

Then equation (4.24) becomes

P = C_1 C_2\,\{1 + \alpha\,(1 - C_1)(1 - C_2)\}   (4.28)

In this case, there would be many choices of (C1, C2) corresponding to any specified P and α. An optimum choice would be attempted by considering the variance expression for Ib and minimizing it with respect to variation in C1 and C2, subject to the corresponding P = C1 C2 {1 + α(1 − C1)(1 − C2)}.

If C_1 = C_2 = C, then

P = C^{2}\,\{1 + \alpha\,(1 - C)^{2}\}   (4.29)

Equation (4.29) is a polynomial equation in C. For a fixed probability P, the process capability interval, or natural process interval, C obtained for negative values of α is greater than that for the corresponding positive values. The choice of C for a given P and α is unique, although the value of C has to be obtained numerically by solving equation (4.29).

For fixed P and α, R(α) can be calculated.

An estimate of the process capability index (Ib), from equation (4.20), works out as

I_b = \frac{U_1 U_2}{R(\alpha)\,\lambda_1 \lambda_2\,\{-\log(1 - C)\}^{2}}

On numerical computation (by trial and error), we find that, for fixed α, C increases as P increases; C of course depends on both P and α.

Table 4.1: Relation between the fixed probability P, the correlation parameter α, and the natural process interval C

P       α      C
0.99    0.5    0.995
0.995   0.5    0.9975
0.96    0.5    0.9797
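A small MATLAB sketch of this numerical solution is given below; it assumes the form of equation (4.29) as reconstructed above, with α = 0.5, and should reproduce values close to those in Table 4.1.

% Hypothetical sketch: solve P = C^2 * (1 + alpha*(1 - C)^2) for C numerically.
alpha = 0.5;
for P = [0.99 0.995 0.96]
    g = @(c) c.^2 .* (1 + alpha*(1 - c).^2) - P;
    C = fzero(g, [0 1]);                       % root bracketed in (0, 1)
    fprintf('P = %.3f, alpha = %.2f  ->  C = %.4f\n', P, alpha, C);
end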


Besides the two scale parameters λ1 and λ2, this distribution has a third parameter α indicating the correlation, whose range is [−1, 1]. There is another situation: when the two characteristics x and y have different distributional parameters λ1 and λ2, it is worthwhile to consider two different values C1 and C2 for F_X(x) and F_Y(y), respectively. In this case there would be many choices of (C1, C2) corresponding to any specified P and α. An optimum choice would be attempted by considering the variance expression for Ib and minimizing it with respect to variation in C1 and C2, subject to the corresponding P from equation (4.28).

4. Calculation of Cp (a short software sketch follows the manual calculation below).

Manual calculation: using the PCI formula (4.20), calculate Ib.

U1 = 0.14, U2 = 0.5. We assume that L = 0 for both data sets.

Then, from the table of least values of Var(Î) given by Mukherjee and Singh [80]:

Table 5.2: Least values of Var(Î)

K      p1       p2       f       Least Var(Î) × n/(U − L)²
3.5    0.0075   0.9825   4.336   0.532692
3.0    0.0040   0.979    4.348   0.389613
2.5    0.0006   0.9756   4.316   0.280669
2.0    0.0001   0.955    3.779   0.208278
1.5    0.0001   0.960    3.557   0.218330
1.0    0.0000   0.980    3.912   0.209214
0.75   0.0016   0.9976   8.137   0.034507

f = 3.912 for K = 1, p1 = 0.

Then for X1 and X2, f1 = f2 = 3.912, λ1 = 0.009374 and λ2 = 0.033408.


From equation (4.19),

I_b = \frac{U_1 U_2}{f_1\, f_2\, \lambda_1\, \lambda_2} = \frac{0.14 \times 0.5}{3.912 \times 3.912 \times 0.009374 \times 0.033408} = 1.06
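As a software counterpart to this manual calculation, the MATLAB sketch below computes the probability that the pair (X1, X2) falls inside the rectangular specification region, using the Gumbel Type II CDF of equation (4.22) with scale parameters. The λ values are the fitted values quoted above converted to rates, α is a placeholder, and the sketch is an assumption-laden illustration rather than the thesis procedure itself.

% Hypothetical sketch: probability of (X1, X2) falling inside the specification
% rectangle under the Gumbel Type II / FGM CDF with exponential marginals.
lam1 = 1/0.009374;  lam2 = 1/0.033408;   % rates implied by the fitted scale values (assumption)
alpha = 0.5;                             % placeholder correlation parameter
U1 = 0.14; U2 = 0.5; L1 = 0; L2 = 0;     % specification limits used above
F = @(x, y) (1 - exp(-lam1*x)) .* (1 - exp(-lam2*y)) ...
            .* (1 + alpha*exp(-lam1*x - lam2*y));        % CDF form of equation (4.22)
P = F(U1, U2) - F(U1, L2) - F(L1, U2) + F(L1, L2);       % rectangle probability
fprintf('P(inside specification region) = %.6f\n', P);
% C (and hence Ib) then follows from solving equation (4.28)/(4.29) for this P,
% as in the fzero sketch after Table 4.1.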

5. Software implementation or simulation (tool)

There are many popular statistical packages used by quality practitioners, such as MATLAB, Minitab, Statistica, R, etc. Minitab does have a provision for multivariate normal capability analysis, and using that analysis one can compare the results of multiple variables; however, it does not give the combined multivariate PCI when the variables are dependent. In MATLAB, multivariate analysis can be conducted by writing specific code, and MATLAB is used here as the experimental and simulation software for our data analysis. For our second case study the Minitab tool is used because it is very user friendly and gives a good comparative study.

5.2 SIMULATION


The purpose of this section is to show the capability of the proposed method for estimating the Cp and Cpk values of non-normal bivariate processes. Simulation studies have been conducted for bivariate non-normal processes; bivariate non-normal distributions such as the Gamma, Beta, Weibull and Weibull-Gamma are used, as sketched below.
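A hedged MATLAB sketch of one such simulation is given below: it generates correlated bivariate non-normal data through a Gaussian copula with Gamma and Weibull marginals and estimates the proportion non-conforming against assumed one-sided upper limits. The marginal parameters, the correlation 0.5 and the limits are illustrative assumptions, not the settings of the reported study.

% Hypothetical sketch: bivariate non-normal data via a Gaussian copula, with PNC estimate.
n   = 10000;  rho = 0.5;                     % sample size and copula correlation (assumed)
U   = copularnd('Gaussian', rho, n);         % n-by-2 matrix of correlated uniforms
x1s = gaminv(U(:,1), 2, 1);                  % Gamma(shape 2, scale 1) marginal (assumed)
x2s = wblinv(U(:,2), 1, 1.5);                % Weibull(scale 1, shape 1.5) marginal (assumed)
U1 = 8; U2 = 3;                              % placeholder upper specification limits
pnc = mean(x1s > U1 | x2s > U2);             % simulated proportion non-conforming
fprintf('Simulated PNC = %.4f\n', pnc);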

P-values

The p-value of a test statistic is the area of the sampling distribution beyond the sample result in the direction of the alternative hypothesis. If the null hypothesis is correct, the p-value is the probability of obtaining a sample statistic as extreme as the one observed, or one providing even stronger evidence against the null hypothesis. The p-value is therefore a measure of statistical significance: if the p-value is very small, there is strong statistical evidence in favor of the alternative hypothesis; if it is large, the statistical evidence is insignificant and one fails to reject the null hypothesis.

5.3 FORMULA ANALYSIS OF PNC METHOD


CDF PCI METHOD
Wierda [65] introduced a new approach to evaluate process capability for non-normal data using the cumulative distribution function (CDF). Castagliola [15] used the CDF approach to compute the proportion of non-conforming items and then estimated the capability index from this proportion. Castagliola showed the relationship between process capability and the proportion of non-conforming items, and used the CDF method to evaluate the PCI for non-normal data by fitting a Weibull distribution to the process data. He used a polynomial approximation to replace the empirical function in the Weibull distribution, and then applied the proposed method given by equation (5.1). Using the CDF method, Cp and Cpk are defined as follows.

Figure 5.7 presents a flowchart for estimating the PNC and PCIs using different methods and different non-normal distributions. The exact PNC value (p) in this flowchart is obtained using the following equations.
(5.1)

(5.2)

(5.3)

(5.4)

where f(x) represents the probability density function of the process, and T represents the process mean for normal data and the process median for non-normal data.

Three non-normal distributions, Gamma, Weibull and Beta, have been used to generate random data in this simulation. These distributions are used to investigate the effects of non-normal data on the process capability index; they are known to have parameter values that can represent mild to severe departures from normality, and the parameters are selected so that our simulation results can be compared with existing results using the same parameters from [12, 19]. The probability density function of the Gamma distribution, with shape α and scale β, is

f(x) = \frac{x^{\alpha - 1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \quad x > 0   (5.5)

The probability density function of the Weibull distribution, with shape β and scale α, is

f(x) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta - 1} e^{-(x/\alpha)^{\beta}}, \quad x > 0   (5.6)

The probability density function of the Beta distribution, with shape parameters α and β, is

f(x) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \quad 0 < x < 1   (5.7)

(5.8)

where f(x) represents the corresponding distribution function of the Gamma, Weibull and Beta distributions.

The methodology proceeds as follows:

1. Generate sample data using a non-normal distribution (e.g. Gamma, Weibull, Beta).
2. Compute Cpk using the CDF method (equation (5.2)) and compute the PNC for the corresponding Cpu (equation (5.4)); call it p1.
3. Compute Cpu using the Weibull method and compute the PNC for the corresponding Cpu; call it p2.
4. Compute Cpu using the Gamma and Beta method and compute the PNC for the corresponding Cpu; call it p3.
5. Compute Cpu using Clements' method and compute the PNC for the corresponding Cpu; call it p4.
6. Assess the efficacy of the different methods by comparing p1, p2, p3, p4 and the exact p.

Figure 5.7: Simulation methodology flowchart
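Since equations (5.1)–(5.4) are not reproduced here, the MATLAB sketch below illustrates one pass of the Figure 5.7 methodology under common textbook conventions: a Gamma sample, a Clements-type percentile Cpu from a fitted Weibull, and the usual one-sided relation PNC = 1 − Φ(3·Cpu). All parameter values and the chosen relation are assumptions for illustration.

% Hypothetical sketch: one pass of the simulation methodology of Figure 5.7.
n = 1000;  a = 2;  b = 1;                        % Gamma(shape 2, scale 1), assumed
x = gamrnd(a, b, n, 1);                          % generate non-normal sample data
USL = gaminv(0.999, a, b);                       % assumed upper specification limit
p_exact = 1 - gamcdf(USL, a, b);                 % exact PNC under the true distribution
prm = wblfit(x);                                 % fitted Weibull [scale shape]
Cpu = (USL - median(x)) / (wblinv(0.99865, prm(1), prm(2)) - median(x));   % Clements-type Cpu
p_hat = 1 - normcdf(3*Cpu);                      % PNC implied by Cpu (assumed relation)
fprintf('exact p = %.5f, implied p = %.5f\n', p_exact, p_hat);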



In the table below we find that the quantiles obtained from this proposed bivariate distribution are similar to those of the simulated Gamma, normal and Weibull distributions:

Table 5.3: Comparison of simulation results (specification limits [0, 0] and [0.0025, 0.005])

Method                                                P*(Gamma)   PNC(Gamma)   Cp       Cpk
Proposed bivariate distribution                       0.9771      0.57831      1.6637   0.9771
Bivariate Burr distribution (Abbasi and Ahmad) [12]   0.0325      0.0199       0.7761   0.8645
Multivariate process capability indices [19]          None        0.0091       0.5283   0.7872

In all of the above results the quantiles obtained from this distribution are similar to those of the simulated Gamma distribution. Figure 5.6 shows the scatter plot of data obtained by generating observations from the fitted distribution. In the above case Cp = 1.6637, PNC = 0.57831, p-gamma = 0.9771 and Cpk = 0.9771. The results of the simulation study for the various distributions are listed in Table 5.3; they indicate that the values of Cp and Cpk obtained are close to the values obtained from the exact distribution. A real data set is used from Wang's paper [58]. Wang discussed a manufactured product (a connector) from the computer industry having seven quality characteristics: 1 (contact gap), 2 (contact loop Tp), 3 (LLCR), 4 (contact Tp), 5 (contact loop diameter), 6 (LTGAPY) and 7 (RTGAPY). The specification limits for these characteristics can be two-sided or one-sided; they are 0.10 ± 0.04, 0 + 0.50, 11 ± 5, 0 + 0.2, 0.55 ± 0.06, 0.07 ± 0.05 and 0.07 ± 0.05, respectively. We selected variables 1 and 2 for the study. Ahmad [12] used his approach to determine this quantity from the above data and obtained 0.001; using the proposed distribution we obtained 0.0001, which is close to that result.

A similar case study for the univariate exponential distribution, with the Cp calculation done in Minitab, is given below for reference (the simulation part is not included).

Case study: univariate exponential distribution

A random sample of size n = 20 of exponential data, taken from Owen and Li [ ]:

0.029 0.483

0.046 0.528

0.133 0.606

0.194 0.789

0.265 0.940

0.287 1.681

0.322 1.766

0.433 2.014

0.441 3.088

0.464 3.279
2. Data fitting.
As in the bivariate case, the distribution followed by the data must be determined first. As shown in the figure obtained from the Minitab software, the best fit for these data is the exponential distribution.
3. Parameter estimation

For the univariate exponential distribution the cumulative distribution function is

F(x; \lambda) = 1 - e^{-\lambda x}

and the probability density function is

f(x; \lambda) = \lambda e^{-\lambda x}
The parameters are estimated by the MLE method. The likelihood function is

L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^{n} e^{-\lambda \sum_{i=1}^{n} x_i}

The derivative of the log-likelihood is

\frac{d}{d\lambda}\ln L(\lambda) = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i

The first-order condition for a maximum is obtained by setting this derivative equal to zero. (The division by \sum_i x_i is legitimate because exponentially distributed random variables take on only positive values, and strictly so with probability 1.) Therefore the estimator is just the reciprocal of the sample mean; the maximum likelihood estimator of λ is

\hat{\lambda} = n \Big/ \sum_{i=1}^{n} x_i

For the real data, the sample mean is 0.8894, so λ̂ = 1/0.8894 = 1.124. The sample variance is 0.935.

1. Calculation of Cp for the univariate exponential distribution (a short verification sketch follows the comparison table below).

With L = 0 and U = 3,

I_e = \frac{(U - L)\,\hat{\lambda}}{[-\ln(1 - p_1 - c)] - [-\ln(1 - p_1)]} = \frac{3}{3.912 \times 0.8894} = 0.86
2. Comparative study:

3. Process capability index based on the Pearson system:

C_p = \frac{U - L}{P_{0.99865} - P_{0.00135}} = 0.82

4. From the normal distribution:

C_p = \frac{U - L}{6\sigma} = 0.73

S.No. Distribution Cp Nonconforming(ppm)

1 Normal 0.73 14630

2 Exponential 0.86 36430

3 Pearson 0.82 59815
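A short MATLAB verification sketch for the exponential-based index of step 1 is given below; x denotes the 20 Owen and Li observations (assumed already loaded), and f = 3.912 is taken from Table 5.2.

% Hypothetical sketch: reproduce the exponential-based capability index of step 1.
lambda_hat = 1/mean(x);                 % MLE of the exponential rate (reciprocal of the mean)
U = 3;  L = 0;  f = 3.912;              % limits and f-value from Table 5.2 (K = 1, p1 = 0)
Ie = (U - L)*lambda_hat/f;              % equals (U - L)/(f * mean(x)); approximately 0.86 here
fprintf('lambda_hat = %.4f, Ie = %.2f\n', lambda_hat, Ie);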
