
Proceedings of the 2012 9th International Pipeline Conference IPC2012 September 24-28, 2012, Calgary, Alberta, Canada

IPC2012-90255

CALCULATING PROBABILITY OF DETECTION USING EXCAVATION DATA


Jason Skow C-FER Technologies Edmonton, AB, Canada Maher A. Nessim C-FER Technologies Edmonton, AB, Canada

ABSTRACT
The key to estimating the probability of detection (POD) of an inline inspection tool is to count existing defects that have not been detected by the tool. In an ideal experiment, the number of defects in the pipeline is known before running the tool and the rate of detection is calculated by comparing the number of defects detected by the inline tool with the total number of defects in the pipeline. In reality, the inline inspection is often the first method of identifying defects. Undetected defects are found during excavation only if they are near defects called by the inline tool. This paper describes a new model for probability of detection based on a combination of excavation data and the vendor claim. It addresses the limitations of existing models by assuming undetected defects can occur anywhere in the tool run. This is accomplished by modeling the rate of undetected defects using a Poisson distribution. Its main advantages are a more accurate representation of the probability of detection that considers all the available data, the flexibility to update the model with additional data sets as they become available, and the ability to quantify changes in the uncertainty of POD as additional data is uncovered through excavations. Overall, a more accurate and defensible process for determining probability of detection is proposed which can be used for managing pipeline integrity and risk.

NOMENCLATURE
d      length of pipeline excavations
dILI   length of ILI run
dv     length of ILI run scaled by vendor specs
λd     rate of defects
Ga     Gamma distribution
POD    probability of detection
p      probability of success (also POD)
kv     count of defects in vendor pull test
m      count of ILI detected defects in tool run
n      count of undetected defects in dig set
nv     count of undetected defects from vendor data
nd     count of defects in dig set
x      count of ILI detected defects in dig set

INTRODUCTION
Inspection uncertainties limit the ability of the measurement tool to detect all defects on the pipeline. The probability of detection is one of the parameters used to characterize inspection uncertainties; it is defined in API 1163 as the probability of a defect above a certain threshold size being detected by an inline inspection tool [2]. The API 1163 definition of POD is expressed as Equation (1), below.

\[ \mathrm{POD} = \frac{\text{Defects detected by ILI}}{\text{All defects on pipeline above threshold}} \tag{1} \]

The current practice for calculating POD involves calculating the rate of detection in the excavation length and extrapolating to other areas of the pipeline. In some variations, a binomial distribution is used to account for sample size uncertainty. In all variations, however, this approach is not statistically defensible. The next two sections detail the current practice and its drawbacks, followed by a description of the new POD model, its advantages and some example applications.

BACKGROUND
A literature review of POD reveals an evolution of the use and approach to calculating POD. Earlier papers, such as Desjardins 2005 [11], quote a calculated POD as a single ratio

λ      rate of undetected defects
Bi     Binomial distribution
Po     Poisson distribution

Copyright 2012 by ASME

calculated from a dig set. In these cases, the ratio is simply the detected defects divided by all the defects found at dig sites. The ratio is then assumed to be consistent with the rest of the pipeline. McNealy et al 2009 [3] advanced the analysis of POD by using a 95% binomial confidence interval to account for small sample size. This methodology is also used in Tandon et al 2011 [7]. McCann et al [6,9] advance the ideas of confidence interval coverage using the Clopper-Pearson method. Although these two works do not specifically apply uncertainty to POD, other works that reference these papers such as [8,9,10] do apply binomial uncertainty to POD. Gao et al 2011 [8] use a binomial distribution to calculate POD ranges and argue that exceeding the 95% confidence limit should be classified as accepting the tool tolerance. Also, the term "gray area" is used to describe values that fall within the 95% confidence interval, in which the tool tolerance can be neither accepted nor rejected.

These ideas are generally extensions of concepts described in API 1163, which outlines a process called null hypothesis significance testing, NHST, for sizing error. NHST allows the operator to either reject or not reject a tool vendor claim. It does this by calculating a confidence interval based on excavation data. The confidence level is usually chosen to be 95%, meaning that if the excavation program were to be carried out numerous times and confidence interval estimates were made on each occasion, the resulting set of intervals would contain the true pipeline parameter 95% of the time. Simply, this means that to reject the vendor claim, the operator must be 95% sure that the excavation data could not arise if the vendor claim was correct. The shortcoming of NHST is that it does not evaluate the validity of the data from the excavation set; it evaluates all of the possible combinations of outcomes that could occur if the excavation activity was repeated.

As a result, this process only allows the operator to either reject or not reject the vendor claim. It does not provide any guidance on the parameter value to use. To calculate the expected values based on a set of excavation data, another approach is needed. Specifically, the problem of interest is to calculate the most likely distribution for the parameter of interest in the pipeline, given the excavation data. To solve this, an approach that includes both the excavation data and vendor claim is needed. Instead of testing against the vendor claim, as in the NHST process, it is desirable to include the vendor claim in the calculation such that the results represent the most likely range of values for the pipeline. In the POD model described in this paper, the calculation of the most likely results for the pipeline is performed using a model that includes both the excavation data and the vendor claim.

CURRENT PRACTICE
The methodology currently used to estimate POD is based on the NHST process described above. The proportion of detected defects in the excavation length is calculated, small sample size uncertainty is accounted for with binomial

confidence intervals and the results are used to reject or not reject the tool vendor specification. The main conceptual deficiency in this method is that the sample is not consistent with the definition of POD. The sample required to calculate POD includes the number of detected defects out of a known total number of defects. The sample being used consists of a subset of the number of detected defects out of a total number of defects exposed in the excavation length. Since the excavation lengths are chosen to target detected defects, they cannot be used to properly assess what was not detected. In other words, dig sites are selected based on tool results and POD depends on what is not in the tool results, namely undetected defects. Undetected defects are found at dig sites if they happen to be near called defects. This methodology implicitly assumes the locations of undetected defects are in the vicinity of detected defects. Since there is no reason to be confident that this dependency should hold, a more general POD model is needed. As a result of the inconsistency mentioned above, the current methodology does not consider the assessed excavation length. Intuitively, exposing longer lengths of pipeline increases the chance of finding undetected defects; however the current model ignores excavation length and tool run length. It is therefore not possible to estimate changes in POD accuracy based on potential excavation scenarios. To illustrate, imagine two scenarios for conducting excavations; one where the minimum length bell-hole is excavated to assess defects, and one where a full joint on either side of the defect is exposed. Defects not exposed in the first case are not included in the calculation; defects found in the additional length exposed of the second case are included in the calculation. A correct model must include the differences between these two cases by attributing a larger uncertainty when less data is available. 
Conversely, more available data should result in a more representative estimate of the probability of detection. This can be accounted for if undetected defects are modeled as a rate over the length of the pipeline rather than as a count within an arbitrary exposed length. A second problem with current practice is that vendor specifications are not included in the calculation of POD, they are used only as a benchmark to compare the results from excavations. Vendor specifications represent what is known about the inline tool before conducting excavations. They represent a data set that should be included in the calculation of POD resulting in a more accurate representation of how the tool performs in the pipeline. Calculating the POD without the vendor specification implicitly assumes that the vendor specification has no value. In other words, it assumes that prior to acquiring the dig data, the tool performance is equally likely to have a probability anywhere between zero and unity. In the new model, vendor specifications are incorporated in the calculation as prior information in a Bayesian process. When the excavation data set is small, the vendor specification is only marginally updated. When the excavation data is


extensive, the vendor specification is significantly updated. This is done by expressing the vendor claim as a representative length of the pipeline and calculating the distribution of likely values for the probability of detection.

SAMPLE SIZE UNCERTAINTY
For counted events, sample size uncertainty can be accounted for by modeling the number of successes with a binomial distribution, as is the case for the NHST process. To illustrate why this is needed, imagine repeating an excavation program on the same pipeline. It is likely the results from two excavation sets would not result in the same number of counted successes and failures. By random chance, the resulting success rate of either excavation program will be different than the average rate for the whole pipeline. This does not indicate that the sample is incorrect, only that random variations affect the results of any given sample. As more excavations are done, the uncertainty in the resulting calculations will be reduced because the field measurements are expected to approach the value of the parameter being estimated for the entire pipeline. Since excavation data always represents a small sample of a larger population, it is desirable to consider the variations that could occur from repeated trials by considering sample uncertainty.

The binomial distribution requires the following properties: repeated trials have only two possible outcomes (success or failure), the probability of success is the same for every trial, and the outcome of one trial is independent of the outcome of other trials. The binomial distribution is written below in terms of the number of defects in the dig set, nd, the number of defects called by the inline tool, x, and the probability of detection, p.
\[ f(x) = \mathrm{Bi}(x \mid n_d, p) = \binom{n_d}{x} p^{x} (1 - p)^{n_d - x} \tag{2} \]

An example of a binomial distribution with 25 trials and a probability of success of 80% is displayed below.

[Figure 1 Example Binomial Distribution. Panel title: Probability of Identification Distribution, Binomial(x; 25, 0.8). Horizontal axis: Successes, Count of Correctly Identified Defects. Vertical axis: Probability Density. Annotation: f(x | 25, 0.8) = 11% when x = 18.]

Note that although 20 successes out of 25 trials is the most likely outcome, other outcomes are also possible. The vendor specification of 80% success in a simple deterministic calculation leads to 20 successes of 25 trials, but considering the range of results in the binomial distribution, an excavation program yielding 18 successes out of 25 trials is also possible. From Figure 1, above, 18 successes are expected to occur 11% of the time based on the randomness of the data sample, if the vendor claim of 80% POD is accurate. The same information is summarized in Figure 2 below, which is a graphical representation of the NHST process. The 95% confidence interval is shown as the region between the shaded tails of the beta distribution curve to the left of the bar chart. It represents the range of likely outcomes given 18 successes of 25 trials. The binomial distribution on the right of the bar chart is the same as Figure 1, above. Since the dotted line, representing 80% success, is contained within the confidence interval, the null hypothesis cannot be rejected. In other words, a tool specification of 80% cannot be rejected based on the dig results.

[Figure 2 NHST Results. Bar chart: Count of Correct Bin Depth (Correct, Incorrect). Curve: Beta Distribution, Alpha = 19, Beta = 8. Axes: Possible Values of POI, Probability Density.]
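The 11% value quoted for Figure 1 can be checked directly from equation (2). The following is a minimal sketch, assuming only the values stated in the text (25 trials, 80% probability of success); the function name `binomial_pmf` is illustrative and not part of the paper.

```python
from math import comb


def binomial_pmf(x: int, n_d: int, p: float) -> float:
    """Equation (2): probability of x detections among n_d defects at POD p."""
    return comb(n_d, x) * p**x * (1.0 - p) ** (n_d - x)


n_d, p = 25, 0.8  # values used for Figure 1

# Probability of every possible count of successes, 0..25.
pmf = [binomial_pmf(x, n_d, p) for x in range(n_d + 1)]

# The single most probable count of successes.
most_likely = max(range(n_d + 1), key=lambda x: pmf[x])

print(f"P(x = 18) = {pmf[18]:.3f}")
print(f"most likely count = {most_likely}")
```

Evaluating the full list rather than a single point makes it easy to see how much probability sits away from the deterministic value of 20 successes.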



The Poisson distribution models random occurrences of events over a continuum of time or space and is therefore used to model defect counts over a length of pipeline. The Poisson distribution is a discrete probability distribution that assumes defects are located randomly along the pipeline. Equation (3) is the Poisson distribution with the undetected defect count, n, as the random variable, expressed in terms of undetected defect rate, λ, and length of pipeline, d.

undetected defect rates in the pipeline is graphed in figure 4, with the 95% confidence interval between the two shaded tails.
[Figure 4 panel title: Defect Rate Distribution, Gamma(5, 1). Vertical axis: Probability Density.]

\[ f(n) = \mathrm{Po}(n \mid \lambda, d) = \frac{e^{-\lambda d} (\lambda d)^{n}}{n!} \tag{3} \]

An example of the Poisson distribution is graphed in figure 3. In this example, the undetected defect rate is known to be 5 per kilometer. The figure shows the probability of finding n undetected defects in 1.5 kilometers of pipeline. The probability of finding exactly 10 undetected defects in 1.5 kilometers is 8.6%.
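The 8.6% value in this example follows from equation (3) with a rate of 5 per kilometer and a length of 1.5 kilometers. A short sketch of that arithmetic is given here; the function name `poisson_pmf` is an illustrative choice, not something defined in the paper.

```python
from math import exp, factorial


def poisson_pmf(n: int, rate: float, length: float) -> float:
    """Equation (3): probability of n undetected defects in a given length."""
    mu = rate * length  # expected count, lambda * d
    return exp(-mu) * mu**n / factorial(n)


# Rate of 5 undetected defects per km over 1.5 km gives an expected count of 7.5.
p10 = poisson_pmf(10, rate=5.0, length=1.5)
print(f"P(n = 10) = {p10:.3f}")
```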
[Figure 3 panel title: Defect Count Distribution, Poisson(x; 7.5).]

[Figure 4 horizontal axis: Defect Rate (defects per km).]

Figure 4 Example Gamma Distribution

PROBABILITY OF DETECTION MODEL
The new model described here takes into consideration the fraction of the pipeline exposed by modeling the undetected defect count as a Poisson distribution, equation (3). The range of undetected defect rates is modeled as a gamma distribution, equation (4). The vendor specification is included in the model by converting it to a rate of undetected defects over a pipeline length. Once converted, the vendor specification is treated as prior information that characterizes what is known about the detection capability of the tool before excavating. Although POD is derived from a combination of pull tests (where the defects are known beforehand) and field validations (where the defects are measured by the tool and verified with digs), the derivation of prior information in this paper considers only the pull tests. If the details of the field validation are known, they can also be included in the assessment by updating the results with the field validation results using the Bayesian process described here. The pull test is a binomial experiment where each defect is a trial with POD chance of success. To equate this to a Poisson process, the defect rate, λd, is calculated by combining the tool specification with the inline tool results as follows

[Figure 3 axes: Probability Density versus Defect Count.]

Figure 3 Example Poisson Distribution

Sample uncertainty is accounted for by treating the rate as a random variable. The rate is modeled by a gamma distribution, which gets tighter as the sample size increases. The gamma distribution is used because it provides a simple way to incorporate additional data (i.e. it remains gamma when it is updated with new data using Bayes theorem). For a given pipeline and excavation set with n undetected defects and length d, the gamma distribution expressed as a rate of undetected defects, λ, is shown in equation (4).

\[ \lambda_d = \frac{m}{\mathrm{POD} \cdot d_{ILI}} \tag{5} \]

where
m      count of ILI detected defects in tool run
dILI   length of the tool run

\[ f(\lambda) = \mathrm{Ga}(\lambda \mid n, d) = \frac{d\,(\lambda d)^{n-1} e^{-\lambda d}}{\Gamma(n)} \tag{4} \]
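As an illustration of equation (4), the sketch below computes an interval for the undetected defect rate when n = 5 undetected defects are found in d = 1 km, the Gamma(5, 1) case of figure 4. It assumes an equal-tailed 95% interval and an integer n, which allows the gamma cumulative distribution to be written as a finite Poisson sum (the Erlang identity); the function names are illustrative and the paper does not state how its interval was computed.

```python
from math import exp


def gamma_cdf(lam: float, n: int, d: float) -> float:
    """CDF of Ga(n, d) at lam for integer n, using the Erlang identity:
    P(rate <= lam) = 1 - sum_{k=0}^{n-1} exp(-lam*d) * (lam*d)**k / k!"""
    mu = lam * d
    term, total = 1.0, 1.0  # k = 0 term of the Poisson sum
    for k in range(1, n):
        term *= mu / k
        total += term
    return 1.0 - exp(-mu) * total


def gamma_quantile(q: float, n: int, d: float) -> float:
    """Invert the CDF by bisection (assumed approach, adequate for a sketch)."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, n, d) < q:  # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, n, d) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n, d = 5, 1.0  # 5 undetected defects found in 1 km
lower = gamma_quantile(0.025, n, d)
upper = gamma_quantile(0.975, n, d)
print(f"mean rate = {n / d:.2f} per km, 95% interval = ({lower:.2f}, {upper:.2f})")
```

Doubling d while keeping the same observed rate (n = 10, d = 2) narrows the interval, which is the tightening with sample size described in the text.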

Continuing the same example used in figure 3, if 5 undetected defects are found in 1 km, the range of expected

Assuming the vendor claimed POD is based on trial runs in which a total of kv defects were assessed, the scaled length of the tool run is simply the number of defects in the vendor pull test divided by the defect rate.


\[ d_v = \frac{k_v}{\lambda_d} = \frac{k_v \cdot d_{ILI} \cdot \mathrm{POD}}{m} \tag{6} \]

\[ \mathrm{POD} = \frac{m}{m + \lambda\, d_{ILI}} \tag{10} \]

This can be interpreted as the length of this particular pipeline that would have produced the data used by the vendor in determining POD. This methodology assumes that POD is the same for this pipeline as it was in the pull test. The probability distribution for the rates of undetected defects is solved using Bayes theorem, which can be used to update the probability model of a parameter given a set of data. Bayes theorem is used to get from the probability of the data given the model to the probability model of the parameter given the data. Equation (7), below, is a form of Bayes theorem [5].

The rate of undetected defects, λ, in equation (10) is a random variable. A simple method to solve equation (10) is to use a Monte Carlo simulation that results in the distribution of the probability of detection. The example that follows the next section solves equation (10) using a Monte Carlo simulation.

VENDOR SAMPLE SIZE FOR POD
API 1163 requires the vendor to specify the detection threshold and the probability of detection for the inline tool. It states that these values should be statistically derived for each type of anomaly, but does not provide further guidance on how this should be done. For the POD model described in the next section, an estimate of the sample size used by the vendor is required. This section describes reasonable estimates for these values if they are not available from the vendor. To estimate sample size, a confidence level for the vendor specification is needed. Unfortunately, it is not common practice for vendors to provide a confidence level for POD. A common confidence level referenced in API 1163 and other works [12] is the 95% confidence level. A POD of 90% with a confidence level of 95% is abbreviated as POD90 (CL95). The objective is to determine the sample size required to estimate the lower detection limit of 90% at the confidence level of 95%. The solution can be solved for pairs of integers, x and nd, required to achieve the POD at the stated confidence, where x is the number of detected defects and nd is the total number of defects. Summing all the binomial probabilities from equation (2) produces the cumulative distribution function. Setting the lower confidence limit to 5% and solving produces pairs of x and nd values that result in a POD of 90% and a confidence level of 95%. Rewritten in terms of the desired confidence level, the equation is

\[ 0.95 = 1 - \sum_{i=x}^{n_d} \binom{n_d}{i} p^{i} (1 - p)^{n_d - i} \tag{11} \]

\[ p(A \mid B) = \frac{p(B \mid A)\, p(A)}{\sum_{A} p(B \mid A)\, p(A)} \tag{7} \]

Applying Bayes theorem to POD allows inferences about the rate of undetected defects, λ, to be made from known dig data: the count of undetected defects, n, and the length of excavation, d.

\[ f(\lambda \mid n, d) = \frac{p(n, d \mid \lambda)\, f(\lambda)}{\int_{0}^{\infty} p(n, d \mid \lambda)\, f(\lambda)\, d\lambda} \tag{8} \]

The prior information is a representation of what is known about the rate of undetected defects before any excavations. This is included in equation (8) as the second term in the numerator. It is modeled as a gamma distribution because it reasonably reflects the vendor data available and it simplifies the calculations, leading to an updated distribution that has a gamma form. Substituting the rate and vendor specification model into equation (8) and solving results in the following gamma distribution, Ga, which requires only two inputs, a count of defects and a length of inspection, as follows

\[ f(\lambda) = \mathrm{Ga}(\lambda \mid n + n_v,\ d + d_v) \tag{9} \]

where
nv     count of undetected defects from vendor data
dv     length of ILI run scaled by vendor specs
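The step from equation (8) to equation (9) is not written out in the text. A brief sketch of the standard argument is given here, assuming the prior has the form Ga(λ | nv, dv) from equation (4) and the dig data follow the Poisson model of equation (3); constant factors that do not depend on λ are dropped.

```latex
\begin{align*}
f(\lambda \mid n, d)
  &\propto \mathrm{Po}(n \mid \lambda, d)\;\mathrm{Ga}(\lambda \mid n_v, d_v) \\
  &\propto (\lambda d)^{n} e^{-\lambda d}\;(\lambda d_v)^{n_v - 1} e^{-\lambda d_v} \\
  &\propto \lambda^{\,n + n_v - 1}\, e^{-\lambda (d + d_v)}
\end{align*}
```

The last line is the kernel of a gamma distribution with count n + nv and length d + dv, so the denominator of equation (8) only supplies the normalizing constant.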

One of the combinations of x and nd that solves equation (11) is 29 and 29, respectively (29 measured defects with 0 misses). This represents the smallest sample size that can be used to claim a POD of 90% with a confidence of 95%. If there is one miss, then a total sample size of 46 is required to achieve the same results. Table 1 summarizes combinations of results from equation (11) that result in a POD90 (CL95).

Once the rates of undetected defects are known, a distribution of the probability of detection can be calculated as the rate of detected defects divided by the total rate of defects. The inline tool report provides the number of detected defects, m, and equation (9) provides an estimate of what the inline tool did not detect.


# Misses    # Defects
0           29
1           46
2           61
3           76
4           89
5           103
6           116
7           129
8           142
9           154
10          167
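The sample sizes in the table can be reproduced by searching equation (11) for the smallest number of defects that meets the criterion at each number of misses. The sketch below is one way to do this under the stated POD of 90% and confidence level of 95%; the function names and the simple linear search are illustrative choices, not the authors' procedure.

```python
from math import comb


def prob_at_most_misses(misses: int, n_d: int, pod: float) -> float:
    """Probability of `misses` or fewer missed defects among n_d defects
    when the true POD equals `pod` (the sum in equation (11), x = n_d - misses)."""
    return sum(
        comb(n_d, k) * (1.0 - pod) ** k * pod ** (n_d - k)
        for k in range(misses + 1)
    )


def min_sample_size(misses: int, pod: float = 0.90, confidence: float = 0.95) -> int:
    """Smallest n_d for which the result demonstrates `pod` at `confidence`."""
    n_d = misses + 1
    while prob_at_most_misses(misses, n_d, pod) > 1.0 - confidence:
        n_d += 1
    return n_d


table = {misses: min_sample_size(misses) for misses in range(6)}
for misses, n_d in table.items():
    print(f"{misses} misses -> {n_d} defects")
```

Changing `pod` or `confidence` gives the equivalent table for other vendor claims.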

The count of detected defects is known from the tool run results, and equation (9) provides an estimate of the undetected defect rate. Equation (10) is used to solve for POD. The results are shown in figure 6.
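A minimal Monte Carlo sketch of this step is given below, using the example values stated in the paper (m = 220 detected defects, a 290 km tool run, 3 undetected defects in 0.25 km of excavation, and a vendor pull test of 103 defects at 90% POD). The count of vendor misses, `n_v = 5`, is an assumption read from the 103-defect row of Table 1 and is not stated explicitly for the example; the sample count and random seed are also arbitrary. Because of these assumptions and sampling noise, the output should be compared with figure 6 only approximately.

```python
import random
import statistics

m, d_ili = 220, 290.0         # detected defects and tool run length (km)
n, d = 3, 0.25                # undetected defects found and excavated length (km)
k_v, pod_vendor = 103, 0.90   # vendor pull test size and claimed POD
n_v = 5                       # ASSUMPTION: vendor misses, taken from Table 1

d_v = k_v * pod_vendor * d_ili / m   # equation (6): vendor equivalent length
shape, rate = n + n_v, d + d_v       # equation (9): updated gamma parameters

rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = sorted(
    # gammavariate takes a scale parameter, so pass 1 / rate.
    m / (m + rng.gammavariate(shape, 1.0 / rate) * d_ili)  # equation (10)
    for _ in range(100_000)
)

mean_pod = statistics.fmean(samples)
lower = samples[int(0.025 * len(samples))]
upper = samples[int(0.975 * len(samples))]
print(f"mean POD = {mean_pod:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```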

[Figure 6 annotation: mean = 0.922.]

Table 1 90% POD at 95% Confidence

The estimates used in the POD example that follows contrast two scenarios: the minimum value of 29 defects and a maximum value of 103 defects. The maximum of 103 is chosen as an upper practical level for a vendor pull test. It assumes an extensive data set of 103 defects of a particular type and size category.

EXAMPLE APPLICATION
A tool run conducted on a length of 290 km of pipeline detects 220 crack defects. The vendor stated POD is 90% and the vendor pull test assessed 103 crack defects. From equation (6), the equivalent length of pipeline represented by the vendor specifications is
[Figure 6 annotation: 95% CI from 0.872 to 0.968. Horizontal axis: POD.]

Figure 6 The Probability of Detection Distribution

In figure 6, the calculated lower confidence limit, 87%, compares with the vendor specification of 90%. In this case, incorporating the excavation results leads to a calculated POD that is 3 percentage points lower than the claim at a 95% confidence level. The next two figures show the probability of detection for two hypothetical situations. Figure 7 shows the change in POD with increasing length exposed, assuming that new undetected defects are found at the same rate per unit length as in the original dig set. Figure 8 shows the change in POD with increasing length exposed, assuming no new undetected defects are found. The solid lines represent the mean POD and the dotted lines represent the 95% confidence interval.

\[ d_v = \frac{103 \times 290 \times 90\%}{220} = 122\ \text{km} \tag{12} \]
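The arithmetic of equations (5), (6) and (12) can be sketched as follows for the two pull test sizes contrasted in the paper (103 and 29 defects). The function names are illustrative only.

```python
def defect_rate(m: int, pod: float, d_ili: float) -> float:
    """Equation (5): total defect rate implied by the tool run and the claimed POD."""
    return m / (pod * d_ili)


def vendor_equivalent_length(k_v: int, rate: float) -> float:
    """Equation (6): pipeline length that would contain k_v defects at this rate."""
    return k_v / rate


lambda_d = defect_rate(m=220, pod=0.90, d_ili=290.0)  # defects per km
d_v_large = vendor_equivalent_length(103, lambda_d)   # extensive pull test
d_v_small = vendor_equivalent_length(29, lambda_d)    # minimum pull test

print(f"lambda_d = {lambda_d:.3f} per km")
print(f"d_v (k_v = 103) = {d_v_large:.0f} km")
print(f"d_v (k_v = 29)  = {d_v_small:.0f} km")
```

A smaller pull test maps to a shorter equivalent length, so the same excavation data carries more weight against the vendor claim.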

The excavations had a total length of 0.25 km and the number of undetected defects found was 3. Substituting these values into equation (9) gives the probability distribution of the rate of undetected defects. The results are displayed graphically in Figure 5.
[Figure 5 The Rate of Undetected Defects. Panel title: Undetected Defect Rate Distribution. Axes: Probability Density versus Undetected Defect Rate (per km).]

[Figure 7 POD vs Length - New Defects Found. Axes: Probability of Detection versus Length Exposed (km). Curves shown for kv = 103 and kv = 29.]

The POD in figure 7 starts close to the vendor spec of 90% and decreases with additional length exposed. When the exposed length is small, there is not enough information to move the POD away from the vendor claim, even though the


rate of undetected defects is high. If the rate remains high as more length is exposed, the calculated POD drops to appropriately reflect this information.

REFERENCES
[1] Stephens, M., Nessim, M. 2009. Guidelines for Reliability Based Pipeline Integrity Methods, PR-244-05302.
[2] API Standard 1163 First Edition, August 2005, In-line Inspection System Qualification Standard.
[3] McNealy, R., Gao, M., Krishnamurthy, R. 2009. Evaluation of Current In-line Inspection Technologies for Mechanical Damage Detection, 17th JTM, Blade Energy Partners.
[4] Gelman, A., Carlin, J., Stern, H., Rubin, D. 2004. Bayesian Data Analysis, Chapman & Hall / CRC.
[5] Jaynes, E.T. 1974. Probability Theory with Applications in Science and Engineering, Notes for Book Manuscript, Washington University, St. Louis, Missouri.
[6] McCann, R., McNealy, R., Gao, M. 2008. In-Line Inspection Performance, II, Validation Sampling, NACE 2008 Paper No. 08151.
[7] Tandon, S., Cazenave, P., Gao, M., Yan, B., Feng, Q. 2011. In-Line Inspection Performance Verification Benchmark Study. IBP1557_11.
[8] Gao, M., Krishnamurthy, R. 2011. A Review: Statistical Methods for INE Inspection Performance Evaluation. IBP2093_11.
[9] McCann, R., McNealy, R., Gao, M. 2007. In-Line Inspection Performance Verification. NACE 2007 Paper No. 07132.
[10] Blade Energy Partners. 2008. Investigative Fundamentals and Performance Improvements of Current In-Line Inspection Technologies for Mechanical Damage Inspection. PRCI PR-328-063502.
[11] Desjardins, G. 2005. Assessment of ILI Tool Performance. NACE 2005 Paper No. 05164.
[12] Yee, B., Chang, F., Couchman, J., Lemon, G. 1974. Assessment of NDE Reliability Data. NASA CR-134991.

[Figure 8 axes: Probability of Detection versus Length Exposed (km). Legend: kv = 103.]

Figure 8 POD vs Length - No New Defects

The POD in figure 8 starts close to the vendor spec of 90%. Since the new information from additional length exposed supports the vendor claim, the calculated POD changes very little with increasing exposed length. The contrasting results in figures 7 and 8 indicate a need to further investigate the effect of increased data on the calculated POD. Figures 7 and 8 also show that the 95% confidence interval does not significantly narrow with increasing excavation data due to the large amount of data assumed from the vendor specification.

CONCLUSIONS
The POD model considers unexcavated sections of pipeline by calculating a rate of undetected defects. Undetected defects are assumed to be equally likely to occur anywhere within the tool run length. Two data sources are used in the calculation: excavation data and vendor specifications. The result is a weighted POD that lies between the vendor claim and the excavation results. If the proportion of exposed pipeline length is significant, the POD is weighted towards the dig results. Alternatively, when the proportion of pipeline exposed is small, the POD is weighted towards the vendor claimed value. Although it would be difficult for a single operator to expose enough pipeline to greatly influence the calculated POD, the model described here could be used to assess several tool run results from various operators to calculate a tool performance that is representative of operator experience. This could be especially useful for field evaluation of newer technologies. Overall, a more accurate and defensible process for determining probability of detection has been proposed which can be used for managing pipeline integrity and risk.
