Stephen Tyson
Univariate Statistics
Bivariate Statistics
Multivariate Statistics
September 2006 1
Advanced geostatistics in Reservoir Modeling
Univariate Statistics
A Porosity Dataset
12.64 14.56 15.89 16.26 16.85 17.68 18.55 19.31 19.94 20.74
12.85 14.74 15.90 16.50 17.15 17.72 18.62 19.32 20.04 21.12
13.56 15.03 15.96 16.54 17.18 17.75 18.85 19.33 20.07 21.14
13.62 15.28 16.04 16.58 17.24 17.78 18.88 19.36 20.08 21.50
14.03 15.36 16.08 16.58 17.27 17.82 18.90 19.42 20.12 21.75
14.09 15.39 16.09 16.59 17.42 18.04 18.93 19.59 20.17 22.38
14.13 15.42 16.15 16.75 17.50 18.04 19.00 19.62 20.34 22.43
14.21 15.43 16.17 16.79 17.58 18.06 19.08 19.76 20.35 22.53
14.25 15.43 16.23 16.83 17.62 18.24 19.24 19.84 20.49 23.31
14.51 15.67 16.25 16.85 17.66 18.48 19.25 19.90 20.58 23.34
Porosity  Frequency
12   0
13   2
14   2
15   8
16   11
17   18
18   14
19   11
20   15
21   10
22   4
23   3
24   2
[Figure: histogram of frequency versus porosity]
Cumulative Plots
Porosity  Frequency  Cumulative  Cumulative %
12   0    0    0%
13   2    2    2%
14   2    4    4%
15   8    12   12%
16   11   23   23%
17   18   41   41%
18   14   55   55%
19   11   66   66%
20   15   81   81%
21   10   91   91%
22   4    95   95%
23   3    98   98%
24   2    100  100%
[Figure: cumulative frequency (%) versus porosity]
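The frequency and cumulative columns can be reproduced from the porosity dataset. The sketch below is a minimal illustration, assuming each class label is the upper limit of a unit-wide class (a value v is counted under floor(v) + 1), which is the reading that matches the tabulated counts. The function name `frequency_table` is hypothetical.

```python
import math
from itertools import accumulate

# Porosity dataset (%) as tabulated in the slides, 100 values.
porosity = [
    12.64, 14.56, 15.89, 16.26, 16.85, 17.68, 18.55, 19.31, 19.94, 20.74,
    12.85, 14.74, 15.90, 16.50, 17.15, 17.72, 18.62, 19.32, 20.04, 21.12,
    13.56, 15.03, 15.96, 16.54, 17.18, 17.75, 18.85, 19.33, 20.07, 21.14,
    13.62, 15.28, 16.04, 16.58, 17.24, 17.78, 18.88, 19.36, 20.08, 21.50,
    14.03, 15.36, 16.08, 16.58, 17.27, 17.82, 18.90, 19.42, 20.12, 21.75,
    14.09, 15.39, 16.09, 16.59, 17.42, 18.04, 18.93, 19.59, 20.17, 22.38,
    14.13, 15.42, 16.15, 16.75, 17.50, 18.04, 19.00, 19.62, 20.34, 22.43,
    14.21, 15.43, 16.17, 16.79, 17.58, 18.06, 19.08, 19.76, 20.35, 22.53,
    14.25, 15.43, 16.23, 16.83, 17.62, 18.24, 19.24, 19.84, 20.49, 23.31,
    14.51, 15.67, 16.25, 16.85, 17.66, 18.48, 19.25, 19.90, 20.58, 23.34,
]

def frequency_table(values, first_label, last_label):
    """Count values in unit-wide classes labelled by their upper limit.

    Assumption: a value v falls in the class labelled floor(v) + 1,
    so the class labelled 13 holds 12 <= v < 13.
    """
    labels = list(range(first_label, last_label + 1))
    counts = dict.fromkeys(labels, 0)
    for v in values:
        counts[math.floor(v) + 1] += 1
    return labels, [counts[k] for k in labels]

labels, freq = frequency_table(porosity, 12, 24)
cumulative = list(accumulate(freq))                      # running total
cum_percent = [100.0 * c / len(porosity) for c in cumulative]

for k, f, c, p in zip(labels, freq, cumulative, cum_percent):
    print(f"{k:>3} {f:>3} {c:>4} {p:>4.0f}%")
```

Dividing each frequency by the number of samples gives the relative frequency (%) used for a probability distribution plot.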
Probability Distribution
[Figure: histogram of frequency versus porosity]
[Figure: probability distribution, frequency (%) versus porosity]
Measures of Locations
Mode: The most frequent value (the value with the tallest bar)
Arithmetic mean: $m = \frac{1}{n}\sum_{i=1}^{n} x_i$
Geometric mean: $m = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$
Harmonic mean: $\frac{1}{m} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{x_i}$
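As a quick check on the three means, the following sketch evaluates each formula on a small set of illustrative values (not taken from the slides). The function names are hypothetical, and the geometric mean is computed through logarithms, which assumes strictly positive data.

```python
import math

def arithmetic_mean(x):
    """m = (1/n) * sum of x_i."""
    return sum(x) / len(x)

def geometric_mean(x):
    """n-th root of the product of x_i, via logs (needs x_i > 0)."""
    return math.exp(sum(math.log(v) for v in x) / len(x))

def harmonic_mean(x):
    """Reciprocal of the mean of the reciprocals (needs x_i != 0)."""
    return len(x) / sum(1.0 / v for v in x)

sample = [2.0, 4.0, 8.0]  # illustrative values only
a = arithmetic_mean(sample)
g = geometric_mean(sample)
h = harmonic_mean(sample)
print(f"arithmetic={a:.4f} geometric={g:.4f} harmonic={h:.4f}")
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean.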
Measures of Dispersion
Variance: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - m)^2$
Standard deviation: $\sigma = \sqrt{\sigma^2}$
Coefficient of variation: $C_v = \frac{\sigma}{m}$
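The dispersion measures follow directly from their definitions. This is a minimal sketch on illustrative values, using the divide-by-n (population) form shown in the variance formula. The function names are hypothetical.

```python
import math

def variance(x):
    """Mean squared deviation from the mean (divides by n, not n - 1)."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def std_dev(x):
    """Square root of the variance."""
    return math.sqrt(variance(x))

def coeff_variation(x):
    """Standard deviation divided by the mean (unitless)."""
    return std_dev(x) / (sum(x) / len(x))

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only
var = variance(sample)
sd = std_dev(sample)
cv = coeff_variation(sample)
print(var, sd, cv)
```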
Measures of Shape
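Skewness: $s = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - m)^3}{\sigma^3}$
[Figure: distribution shapes for s > 0 and s < 0]
Kurtosis (degree of peakedness): $k = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - m)^4}{\sigma^4} - 3$
[Figure: distribution shapes for k > 0, k = 0, and k < 0]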
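Both shape measures are standardized central moments, so one helper covers them. The sketch below uses illustrative samples: one symmetric, one with a long right tail. The function names are hypothetical.

```python
import math

def central_moment(x, order):
    """(1/n) * sum of (x_i - m)^order."""
    m = sum(x) / len(x)
    return sum((v - m) ** order for v in x) / len(x)

def skewness(x):
    """Third central moment divided by sigma cubed."""
    sigma = math.sqrt(central_moment(x, 2))
    return central_moment(x, 3) / sigma ** 3

def kurtosis(x):
    """Fourth central moment divided by sigma^4, minus 3."""
    sigma = math.sqrt(central_moment(x, 2))
    return central_moment(x, 4) / sigma ** 4 - 3.0

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]      # illustrative values only
right_tailed = [1.0, 1.0, 1.0, 1.0, 6.0]   # illustrative values only

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tailed), kurtosis(right_tailed))
```

The symmetric sample gives s = 0 and a negative k (flatter than a normal distribution); the right-tailed sample gives s > 0.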
[Figure: box plot along the variable axis, marking x min, Q1, Q2, Q3, x max, and the IQR between Q1 and Q3]
Box Plots
[Figure: annotated box plot along the variable axis. Labels: lower hinge (Q1), Q2, upper hinge (Q3), IQR, lower and upper whiskers, distances of 1.5 x IQR and 3 x IQR, outliers, and extremes]
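The quantities labelled on the box plot can be computed as follows. This is a sketch under stated assumptions, not the slide author's procedure: quartiles use the "inclusive" interpolation of Python's `statistics.quantiles` (other quartile conventions give slightly different hinges), points more than 1.5 x IQR from a hinge are called outliers, and points more than 3 x IQR from a hinge are called extremes. The function name and sample values are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Quartiles, IQR, whisker ends, outliers and extremes of a sample."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed fences: 1.5 x IQR (inner) and 3 x IQR (outer) from the hinges.
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    inside = [v for v in data if inner_lo <= v <= inner_hi]
    outliers = [v for v in data
                if outer_lo <= v < inner_lo or inner_hi < v <= outer_hi]
    extremes = [v for v in data if v < outer_lo or v > outer_hi]
    return {
        "quartiles": (q1, q2, q3),
        "iqr": iqr,
        # Whiskers reach the most distant data still inside the inner fences.
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
        "extremes": extremes,
    }

sample = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 30, 40]  # illustrative
summary = box_plot_summary(sample)
print(summary)
```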
[Figure: box plots of deep resistivity by facies (A to F); sample counts N = 16, 67, 82, 23, 25, 18]
Bivariate Statistics
Scatterplots
Correlation coefficient
Regression
Dependency
[Figure: scatterplot of Variable B versus Variable A: independent, no correlation]
[Figure: scatterplot of Variable D versus Variable C: perfect correlation]
Partial Dependency
[Figure: scatterplot of Variable F versus Variable E]
This is usually the real world.
Scatterplots
[Figure: True versus Estimate, scatterplot of true value against estimate with both axes from 0.0 to 16.0]
Scatterplots
[Figure: scatterplot of Log10 permeability versus porosity (%), with marginal histograms of permeability and porosity]
Porosity, %   Number of samples in each Log10 permeability class   Total
25      0  0  5   42   45   6   1   0    99
30      0  0  0   16   37   20  4   0    77
35      0  0  0   5    18   20  4   0    47
40      0  0  0   0    6    8   8   0    22
45      0  0  0   0    1    4   3   0    8
50      0  0  0   0    0    1   2   0    3
Total   0  7  96  182  132  61  22  0    500
Bivariate Histogram
[Figure: bivariate histogram of number of samples against porosity (%) and Log10 permeability (md)]
Covariance
The degree to which x and y go up and down together is quantified by a
calculation known as the covariance.
$c(x,y) = \mathrm{Cov}_{xy} = \sigma_{xy} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{X})(y_i - \bar{Y}) = \overline{XY} - \bar{X}\,\bar{Y}$
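The two forms of the covariance (mean product of deviations, and mean of the products minus the product of the means) give the same number. A minimal sketch on illustrative values, with hypothetical function names:

```python
import math

def covariance_deviations(x, y):
    """(1/n) * sum of (x_i - mean_x) * (y_i - mean_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def covariance_products(x, y):
    """mean(x * y) - mean(x) * mean(y)."""
    n = len(x)
    mean_xy = sum(a * b for a, b in zip(x, y)) / n
    return mean_xy - (sum(x) / n) * (sum(y) / n)

x = [1.0, 2.0, 3.0, 4.0]  # illustrative values only
y = [2.0, 4.0, 6.0, 8.0]
c1 = covariance_deviations(x, y)
c2 = covariance_products(x, y)
print(c1, c2)
```

The second form needs only running sums, but it can lose precision when the means are large compared with the spread of the data.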
Types of correlation
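$r = \frac{\sum_{i=1}^{n}(x_i - \bar{X})(y_i - \bar{Y})}{n\,\sigma_x \sigma_y} = \frac{\sum xy - n\bar{X}\bar{Y}}{n\,\sigma_x \sigma_y}$
where
$\sigma_x$ = standard deviation of variable x
$\sigma_y$ = standard deviation of variable y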
Used by most software
Microsoft Excel: CORREL(array1, array2)
Dividing by the product of the standard deviations makes r unitless
and -1 <= r <= 1
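The correlation coefficient can be computed from the sums in the second form of the formula. The sketch below uses illustrative values, and the function name `correlation` is hypothetical. It returns `None` when either standard deviation is zero, because the division is then undefined.

```python
import math

def correlation(x, y):
    """r = (sum(xy) - n * mean_x * mean_y) / (n * sigma_x * sigma_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    if sx == 0.0 or sy == 0.0:
        return None  # r is undefined when one variable does not vary
    sum_xy = sum(a * b for a, b in zip(x, y))
    return (sum_xy - n * mx * my) / (n * sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values only
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = correlation(x, y)
print(r)

# Changing the units of x (for example, percent to fraction) leaves r unchanged.
r_scaled = correlation([v / 100.0 for v in x], y)
```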
Example Correlation Coefficient
[Table: paired porosity and permeability values, in three column pairs]
Example Correlation Coefficient
[Table: porosity and permeability values]
$\sum xy = 847292.145$
$n = 30$
[Figure: example scatterplots illustrating r = -1, r = +1, low r, high r, r = 0, and r undefined]
Non-linear Correlation
So,
$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\mathrm{Cov}_{xy}}{\sigma_x \sigma_y}$
That is, a dimensionless covariance.
Correlation Coefficient
Rank Correlation
When linear correlation is a poor measure, we can correlate the ranks of the values instead.
Rank is the position of a data value when sorted in ascending order.
Smallest has Rank = 1 and largest has Rank = N.
Sort in ascending order of the first (independent) variable.
$r_R = \frac{\mathrm{Cov}_{R_x R_y}}{\sigma_{R_x}\sigma_{R_y}} = \frac{\sum_{i=1}^{n}(R_{x_i} - \bar{R}_x)(R_{y_i} - \bar{R}_y)}{n\,\sigma_{R_x}\sigma_{R_y}}$
What is the relationship between the mean of Rx and the mean of Ry?
If n is large, what is a good approximation of the means?
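Rank correlation is the ordinary correlation coefficient applied to ranks. The sketch below is illustrative only: the data are made up, the function names are hypothetical, and ties are not handled (each value is assumed distinct).

```python
import math

def ranks(values):
    """Rank 1 for the smallest value, N for the largest (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def correlation(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return cov / (sx * sy)

def rank_correlation(x, y):
    """Correlation coefficient of the ranks of x and y."""
    return correlation(ranks(x), ranks(y))

# Made-up data: y rises monotonically with x, but not linearly.
x = [5.0, 10.0, 15.0, 20.0, 25.0]
y = [1.0, 10.0, 100.0, 1000.0, 10000.0]

r_linear = correlation(x, y)
r_rank = rank_correlation(x, y)
mean_rank_x = sum(ranks(x)) / len(x)
mean_rank_y = sum(ranks(y)) / len(y)
print(r_linear, r_rank, mean_rank_x, mean_rank_y)
```

Without ties, the ranks of each variable are the integers 1 to n, so both rank means equal (n + 1) / 2.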
Porosity, %
39.8 27.4 19.1 20.9 5.1 35.6 22.8 34.2 17.9 23.4
37.5 29.4 29.3 25.5 16.2 19.5 28.2 28.1 26.8 38.1
14.7 21.4 31.7 24.3 26.5 34.9 14.3 5.7 22.2 37.0
23.7 26.0 29.6 28.4 11.5 17.8 22.1 23.0 7.6 13.3
25.0 29.9 26.1 15.1 10.8 26.3 26.0 18.4 20.7 22.4
33.8 29.2 31.9 34.6 11.3 24.4 9.5 4.1 15.8 27.2
12.0 24.0 39.1 12.9 42.1 35.1 11.7 14.7 43.6 12.2
20.5 26.9 20.1 29.5 31.5 32.5 16.5 17.3 21.2 13.0
7.8 9.1 25.9 8.0 2.5 21.9 11.1 28.3 12.4 18.3
Permeability, md
729.6 328.2 93.9 153.2 17.9 179.2 101.7 125.1 69.2 65.6
436.1 133.7 630.9 853.4 58.6 48.2 102.5 545.1 99.0 2949.7
44.2 42.7 1072.7 64.1 97.6 923.8 67.2 26.9 192.4 1196.4
35.3 139.9 282.8 206.5 22.0 54.7 143.0 160.9 36.3 33.4
72.1 203.2 80.2 31.1 76.1 40.6 59.3 89.0 151.9 46.0
2715.6 474.6 318.8 137.9 52.3 110.8 30.1 12.0 105.2 625.9
44.0 100.7 1979.1 92.1 2467.2 865.1 49.6 33.4 404.4 30.9
157.3 781.0 46.4 303.1 307.1 683.6 61.5 19.9 297.4 29.2
24.3 55.6 107.5 36.8 14.9 111.7 33.7 136.2 56.4 90.0
[Figure: scatterplot of permeability (md) versus porosity (%); r = 0.564, r_rank = 0.816]
another dependent variable Y.
Fit a line: $\hat{y} = mx + b$
One that minimizes the error in the prediction: error $= \hat{y} - y$
Define the error as the sum of squared differences between prediction and true value.
Minimize $\sum (y - \hat{y})^2$
[Figure: porosity (%) versus velocity (km/s) with a fitted line of slope m and intercept b, showing y and $\hat{y}$ at a given x]
Regression
The m and b that minimize the sum of squared deviations from the line
$Y = mX + b$
are given by
$m = r\,\frac{\sigma_y}{\sigma_x}$, $b = \bar{Y} - m\bar{X}$
or
$m = \frac{\sum xy - n\bar{X}\bar{Y}}{n\,\sigma_x^2}$
A measure of goodness-of-fit, $R^2$, is given by
$R^2 = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{Y})^2}{\sum_{i=1}^{n}(y_i - \bar{Y})^2}$
which is algebraically equivalent to the correlation coefficient squared! Hence the earlier statement that r is a measure of how closely the data fit a straight line.
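The slope, intercept, and goodness-of-fit can be checked numerically. This is a minimal sketch on illustrative values, with hypothetical function names. It computes m and b from r and the standard deviations, then evaluates R2 from the predicted values to confirm that it equals r squared.

```python
import math

def linear_fit(x, y):
    """Least-squares line y = m*x + b, with m = r*sigma_y/sigma_x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    cov = sum((a - mx) * (v - my) for a, v in zip(x, y)) / n
    r = cov / (sx * sy)
    m = r * sy / sx
    b = my - m * mx
    return m, b, r

def r_squared(x, y, m, b):
    """Explained sum of squares divided by total sum of squares."""
    my = sum(y) / len(y)
    explained = sum((m * a + b - my) ** 2 for a in x)
    total = sum((v - my) ** 2 for v in y)
    return explained / total

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values only
y = [2.0, 1.0, 4.0, 3.0, 5.0]
m, b, r = linear_fit(x, y)
R2 = r_squared(x, y, m, b)
print(m, b, r, R2)
```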
residual $= y - \hat{y}$
[Figure: Vel-to-PHI plot of porosity versus velocity (km/s) with fitted line y = -8.38x + 46.85, R2 = 0.527]
[Figure: residual plot of residual (data) versus velocity (km/s)]
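A residual is the observed value minus the value predicted by the fitted line. The sketch below uses the slope and intercept reported on the Vel-to-PHI plot; the velocity and observed porosity are hypothetical measurements chosen only for illustration, and the function names are hypothetical.

```python
import math

SLOPE = -8.38      # from the fitted line reported on the Vel-to-PHI plot
INTERCEPT = 46.85  # from the same fitted line

def predict_porosity(velocity_km_s):
    """Porosity (%) predicted from velocity (km/s) by the fitted line."""
    return SLOPE * velocity_km_s + INTERCEPT

def residual(observed_porosity, velocity_km_s):
    """Observed minus predicted porosity."""
    return observed_porosity - predict_porosity(velocity_km_s)

velocity = 3.5    # km/s, hypothetical measurement
observed = 18.0   # porosity (%), hypothetical measurement
predicted = predict_porosity(velocity)
res = residual(observed, velocity)
print(f"predicted={predicted:.2f} residual={res:.2f}")
```

A residual plot with no trend against velocity suggests that a straight line is an adequate description of the data.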
Regression
Application to what generic types of variables? And some specific
examples?
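[Figure: two panels, MA and RMA, each showing a data point (x, y), the corresponding point (x*, y*) on the fitted line, and the distance d_i between them]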
In what circumstances would you use each?
[Figure: data plotted against velocity (km/s) with four fitted lines: Y-on-X, X-on-Y, MA, and RMA]
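The four lines differ only in slope, since each passes through the point of means. The sketch below compares the slopes on made-up data. It assumes MA stands for major axis regression and RMA for reduced major axis regression, and it uses the standard textbook slope formulas for those two methods, which are not given in the slides. The function name is hypothetical, and the covariance must be non-zero.

```python
import math

def fitted_slopes(x, y):
    """Slopes (dy/dx) of the Y-on-X, X-on-Y, RMA and MA lines."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n                   # variance of x
    syy = sum((v - my) ** 2 for v in y) / n                   # variance of y
    sxy = sum((a - mx) * (v - my) for a, v in zip(x, y)) / n  # covariance
    diff = syy - sxx
    return {
        # Minimizes vertical distances.
        "Y-on-X": sxy / sxx,
        # Minimizes horizontal distances, expressed as dy/dx.
        "X-on-Y": syy / sxy,
        # Assumed RMA formula: ratio of standard deviations, sign of covariance.
        "RMA": math.copysign(math.sqrt(syy / sxx), sxy),
        # Assumed MA formula: minimizes perpendicular distances.
        "MA": (diff + math.sqrt(diff ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy),
    }

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up values
y = [3.0, 2.0, 8.0, 6.0, 11.0]  # made-up values
slopes = fitted_slopes(x, y)
for name, value in slopes.items():
    print(f"{name}: {value:.4f}")
```

The Y-on-X line suits predicting y from an x known without error; the symmetric MA and RMA fits treat both variables as uncertain.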
Correlation vs Regression
$r = \frac{\sum xy - n\bar{X}\bar{Y}}{n\,\sigma_x \sigma_y}$
$m = \frac{\sum xy - n\bar{X}\bar{Y}}{n\,\sigma_x^2}$
The only difference between correlation and regression is the denominator.
Conditional Expectations
Two issues:
Not only do we want to estimate a true value of one variable from another, we also want to know its variability (uncertainty).
The relationship might not be linear, especially when the variables are the same attribute (e.g. porosity) at different locations.
[Figure: scatterplot of permeability (md) versus porosity (%), highlighting the distribution of possible permeability values at a known porosity value]
Prediction of conditional distributions is at the heart of geostatistical algorithms.
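One simple way to look at a conditional distribution is to group the samples into classes of the conditioning variable and summarize the other variable within each class. The sketch below is an assumption-laden illustration, not the procedure from the slides: the class width, the data, and the function name `conditional_stats` are all hypothetical.

```python
import math
from collections import defaultdict

def conditional_stats(x, y, class_width):
    """Count, mean and standard deviation of y within each class of x.

    A sample with value x_i falls in the class
    [k * class_width, (k + 1) * class_width) with k = floor(x_i / class_width).
    """
    groups = defaultdict(list)
    for a, b in zip(x, y):
        groups[math.floor(a / class_width)].append(b)
    stats = {}
    for k in sorted(groups):
        values = groups[k]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        stats[(k * class_width, (k + 1) * class_width)] = (len(values), mean, std)
    return stats

# Made-up porosity (%) and permeability (md) pairs.
porosity = [5.0, 7.0, 12.0, 14.0, 18.0, 22.0, 25.0, 28.0]
permeability = [10.0, 20.0, 30.0, 50.0, 40.0, 100.0, 200.0, 300.0]

stats = conditional_stats(porosity, permeability, 10)
for (lo, hi), (count, mean, std) in stats.items():
    print(f"porosity {lo}-{hi}%: n={count} mean={mean:.1f} std={std:.1f}")
```

The mean in each class estimates the conditional expectation, and the standard deviation gives a first measure of the uncertainty around it.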
ascending order of x
Choose a window of M