1 Pengenalan Geostatistik (Introduction to Geostatistics)

GEOSTATISTICS
Syahrul
26 September 2018
Course syllabus
• Introduction to geostatistics (today's lecture)
• Non-spatial statistics (today's lecture)
• Spatial statistics
• Estimation
• Simulation
Introduction to Geostatistics
CONTENT
• What is geostatistics?
• Application of spatial statistics
• Basic assumptions in spatial statistics
• Key concepts in geostatistics
• Exploratory data analysis (EDA) for non-spatial
statistics
• Spatial description
Geostatistics
• Geostatistics: the branch of statistics that
deals with spatially correlated data
• Basic assumptions:
– Sample values are not independent
– Spatial dependency exists
• Goal of geostatistics:
– Spatial continuity model
– Use the model for estimation and/or
simulation of spatial data distribution
Geostatistics
• The term "geostatistics" was used by Hart (1952) for the application of
statistics in a geographic context
• Matheron (1962, 1963) used the term in a geological context, for
inferring ore reserves from data spatially distributed within an ore body:
– Developed the theory of regionalized variables
– Formally introduced a new statistic, the semivariogram
– Used kriging to obtain the best estimate of a property (e.g. ore grade)
at some location in an ore deposit
– Built the theory on the practical work of Krige (1951, 1960)
What is special about spatial data?
• The location of a sample is an intrinsic part
of its definition
• All data sets are implicitly related by
their coordinates (models of spatial
structure)
• Data values may be related to their
coordinates (spatial trend)
What is special about spatial data?
• Values at sample points can NOT be
assumed to be independent
• That is, there may be a spatial structure
to the data
– Classical statistics assumes independence
– This has implications for sampling design
Key Concepts
• Spatial dependence: the value of a
variable at a point in space is related
to its value at nearby points
• Spatial structure: the nature of the
spatial relation
• Support of a sample: the physical
dimensions it represents
Geostatistics applications
• Estimate the reservoir property distribution using the available well
log, core, and/or seismic data
[Figure: workflow from raw data, through geostatistical analysis and selection of a model, to an appropriate estimation or stochastic algorithm]
• Quantify uncertainty using multiple geologically and
statistically valid models
[Figure: n numbered property distributions (PHI, K) are each run through a reservoir flow simulator, giving a distribution of outcomes (recovery); the individual reservoir simulation runs are numbered]
Geostatistics and Earth Modelling
[Figure: 3D static earth model built from well-log data]
Limitations of geostatistics
Geostatistics does NOT :
• Create Data or Eliminate the Value of
Obtaining Additional Good Data
• Replace Sound Qualitative
Understanding and Expert Judgment
• Necessarily Save Time, At Least in the
Short Term.
• Work Well as a “Black Box”
Some useful sites
• The central information server for
Geostatistics and Spatial Statistics
http://www.ai-geostats.org/
• gstat: http://www.gstat.org
• ArcGIS Geostatistical Analyst:
http://www.esri.com/software/arcgis/arcgisxtensions/geostatistical/
• Geostatistical analysis tutor (Colorado
School of Mines):
http://uncert.mines.edu/tutor/
The first law of geography, put forward by Tobler, states that
everything is related to everything else, but near things are
more related than distant things (Anselin, 1988)
Geostatistics: Prediction and Interpolation
• Estimating a value at a location that cannot be sampled (missing data)
requires a model.
• Some studies, however, have no model, only a single data sample, or no
inference technique that can be used to estimate the unsampled values.
• Geostatistics addresses this by providing estimation methods that are
still based on a model of the data.
• Methods for predicting or estimating missing data:
– Nearest neighbour
– Inverse distance
– Trend surface analysis
– Kriging
– Cokriging
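The two simplest estimators in the list can be sketched in a few lines. This is a minimal illustration only: the sample coordinates and values are hypothetical, and the distance exponent `power=2.0` is a common default rather than something the lecture specifies.

```python
import math

# Hypothetical samples as (x, y, value); not taken from the lecture.
samples = [(0.0, 0.0, 1.0), (10.0, 0.0, 3.0), (0.0, 10.0, 5.0)]

def nearest_neighbour(samples, x, y):
    """Return the value of the sample closest to (x, y)."""
    closest = min(samples, key=lambda s: math.hypot(x - s[0], y - s[1]))
    return closest[2]

def inverse_distance(samples, x, y, power=2.0):
    """Weighted mean of all samples, with weights 1 / distance**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample: return the data value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

print(nearest_neighbour(samples, 9.0, 1.0))
print(inverse_distance(samples, 5.0, 0.0))
```

Neither estimator uses a model of spatial continuity; kriging differs in that its weights are derived from such a model.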
Variogram and Semivariogram
• Model the spatial continuity of the data that will be used for estimation
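The slides do not give a formula here, so the following is an assumption: the commonly used experimental semivariogram, γ(h) = (1 / (2 N(h))) Σ [z(xᵢ) − z(xᵢ + h)]², where N(h) is the number of sample pairs separated by lag h. The sketch below applies it to hypothetical values on a regularly spaced 1D line, which keeps the pairing trivial.

```python
def experimental_semivariogram(values, max_lag):
    """Semivariance for lags 1..max_lag on a regularly spaced 1D series."""
    gamma = {}
    for lag in range(1, max_lag + 1):
        # All pairs of samples separated by exactly `lag` steps.
        pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
        squared_diffs = sum((a - b) ** 2 for a, b in pairs)
        gamma[lag] = squared_diffs / (2 * len(pairs))
    return gamma

# Hypothetical regularly spaced measurements (not from the lecture).
series = [1.0, 2.0, 3.0, 4.0, 5.0]
for lag, value in experimental_semivariogram(series, 2).items():
    print(lag, value)
```

For irregularly spaced 2D or 3D data, pairs are usually grouped into lag bins with a distance tolerance; that step is omitted here.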
Types of Spatial Data
• Point Data (Point Pattern Analysis)
Indicates the location in the form of a point, for
example:
– Longitude and latitude
– x and y
• Line Data (Geostatistical Data)
– Continuous spatial surface
• Area Data (Polygons or Lattice Data)
Shows the location in the form of an area, such as a
country, district, or city.
[Figure: examples of point data, line data, and area data]
Spatial Pattern
Forms of spatial pattern: random, uniform, clustered
[Figure: point maps showing random, uniform, and clustered patterns]
Non-spatial statistics
Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) is an approach/philosophy
for data analysis that employs a variety of techniques
(mostly graphical)
Exploratory Data Analysis (EDA)
For example, multidimensional scaling is an EDA technique that uses
visual representations of distances or similarities between sets
of objects; it is up to the user to interpret exactly what the
distances represent
The purpose of exploratory data analysis is to:
1. Check for missing data and other mistakes.
2. Gain maximum insight into the data set and its
underlying structure.
3. Uncover a parsimonious statistical model, one which explains the
data with a minimum number of predictor variables.
4. Check assumptions associated with any model
fitting or hypothesis test.
5. Create a list of outliers or other anomalies.
6. Find parameter estimates and their associated
confidence intervals or margins of error.
7. Identify the most influential variables.
Univariate description
Measures that characterise the data population
• Mean
• Variance/standard deviation
• Histograms
• Spread/central tendency
• Skewness
Frequency Table
Values (18 samples):
2, 4, 1, 5, 2, 3, 6.9, 2, 5, 7, 2.1, 3.4, 4.2, 2.2, 2.9, 1.7, 3.5, 6.2
Interval      Frequency   Cumulative Frequency
1 - 1.999         2             2
2 - 2.999         6             8
3 - 3.999         3            11
4 - 4.999         2            13
5 - 5.999         2            15
6 - 6.999         2            17
7 - 8.000         1            18
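A frequency table like this can be tallied directly from the listed values. The sketch below is one way to do it, using the slide's values and unit-width intervals from 1 to 8. The function name `frequency_table` is illustrative, and the top interval is treated as closed so that a value of exactly 8 would be counted.

```python
values = [2, 4, 1, 5, 2, 3, 6.9, 2, 5, 7, 2.1, 3.4, 4.2, 2.2, 2.9, 1.7, 3.5, 6.2]
edges = [1, 2, 3, 4, 5, 6, 7, 8]  # interval boundaries

def frequency_table(data, edges):
    """Count values per interval [edges[i], edges[i+1]) and accumulate."""
    counts = [0] * (len(edges) - 1)
    for v in data:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # The last interval also includes its upper boundary.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running)
    return counts, cumulative

counts, cumulative = frequency_table(values, edges)
for i, (c, cum) in enumerate(zip(counts, cumulative)):
    # Interval, frequency, cumulative frequency, cumulative relative frequency
    print(f"{edges[i]} - {edges[i + 1]}: {c:2d} {cum:2d} {cum / len(values):.2f}")
```

Dividing each cumulative count by the number of samples gives the cumulative relative frequency, which is what a cumulative distribution plot shows.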
Histograms
[Figure: histogram of the data values (x-axis: data value, 0 to 8; y-axis: frequency, 0 to 6), and the same histogram with relative frequency on the y-axis (0 to 0.36)]
Histograms
• Shape varies with the number of bins
• Rule of thumb:
Number of Bins ≈ √(Number of Samples)
Cumulative Distributions
[Figure: cumulative frequency distribution (x-axis: data value, 0 to 8; y-axis: cumulative frequency, 0 to 1.0)]
The cumulative value for each bin can be read as:
• number of samples below bin maximum
• relative frequency below bin maximum
• probability of grade below bin maximum
Central Tendency Measurements
• Arithmetic Mean = (Sum of values) / (Number of values)
• Mode = Highest Probability (i.e. ‘tallest’ bin in
histogram)
• Median = 50th percentile (i.e. 50% of values
are below the median)
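As a quick illustration, the three measures can be computed for the 18 values from the frequency-table slide with Python's standard `statistics` module. Here the mode is taken as the most frequent raw value, which is an assumption; the slide defines it by the tallest histogram bin, so it depends on the binning.

```python
import statistics

values = [2, 4, 1, 5, 2, 3, 6.9, 2, 5, 7, 2.1, 3.4, 4.2, 2.2, 2.9, 1.7, 3.5, 6.2]

mean = sum(values) / len(values)     # sum of values / number of values
median = statistics.median(values)   # 50th percentile
mode = statistics.mode(values)       # most frequent raw value

print(f"mean={mean:.3f} median={median:.3f} mode={mode}")
```

For these values the mean lies above the median, which is typical of a distribution with a longer tail towards high values.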
Spread Measurements
• How different are values from the central
value?
– Range
• Maximum - Minimum
– Variance or Standard Deviation
– Inter-Quartile Range (IQR)
• 75th percentile - 25th percentile
– Other inter-percentile ranges
• e.g. 90th percentile - 10th percentile
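A short sketch of these spread measures, again using the 18 values from the frequency-table slide. Percentile estimates depend on the interpolation convention; this example uses the default ("exclusive") method of `statistics.quantiles`, which is an assumption rather than the lecture's convention.

```python
import statistics

values = [2, 4, 1, 5, 2, 3, 6.9, 2, 5, 7, 2.1, 3.4, 4.2, 2.2, 2.9, 1.7, 3.5, 6.2]

value_range = max(values) - min(values)
variance = statistics.variance(values)   # sample variance (n - 1 divisor)
std_dev = statistics.stdev(values)

quartiles = statistics.quantiles(values, n=4)   # 25th, 50th, 75th percentiles
iqr = quartiles[2] - quartiles[0]

deciles = statistics.quantiles(values, n=10)    # 10th, 20th, ..., 90th
p90_p10 = deciles[8] - deciles[0]

print(f"range={value_range} variance={variance:.3f} std={std_dev:.3f}")
print(f"IQR={iqr:.3f} P90-P10={p90_p10:.3f}")
```

The range uses only the two extreme values, so it is the measure most sensitive to outliers; the percentile-based ranges are more robust.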
Skewness Measurements
• How symmetrical is the distribution, and how heavy are its tails?
– Skewness (asymmetry)
– Kurtosis (peakedness and tail weight)
Skewness and kurtosis describe the shape of the distribution and are
easiest to judge on a graph of the data
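The slides do not state which formulas are intended, so the sketch below assumes the common moment-based definitions: skewness = m₃ / m₂^1.5 and excess kurtosis = m₄ / m₂² − 3, where mₖ is the k-th central moment. The function name `shape_measures` is illustrative.

```python
def shape_measures(data):
    """Moment-based skewness and excess kurtosis (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

# The 18 values from the frequency-table slide.
values = [2, 4, 1, 5, 2, 3, 6.9, 2, 5, 7, 2.1, 3.4, 4.2, 2.2, 2.9, 1.7, 3.5, 6.2]
skewness, excess_kurtosis = shape_measures(values)
print(f"skewness={skewness:.3f} excess kurtosis={excess_kurtosis:.3f}")
```

A symmetric data set gives a skewness of zero; a longer tail towards high values gives a positive skewness.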
Skewness Measurements
Normal Distribution
• Many biological characteristics (e.g. height,
weight) follow a symmetrical distribution with
a predictable shape
• Called a Normal (or "Gaussian") distribution
[Figure: bell-shaped frequency curve of a Normal distribution]
Normal Distribution
• Defined by mean and variance
[Figure: two panels, "Same Mean, Different Variances" and "Same Variance, Different Means"]
• Examples:
– Grain size, porosity, permeability, etc
Normal Distribution
[Figure: Normal frequency curve of grade, with the mean, mode, and median at the same central value]
• Shape has known equation
• Mean = Median = Mode
Normal Distribution
• f(x) = 1 / (σ √(2π)) · exp(−(x − μ)² / (2σ²))
• Where μ = Mean, σ = Standard Deviation
• From the equation, the proportion of values within a given number of
standard deviations of the mean can be calculated
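As a worked example of that last point, the proportion of a Normal distribution lying within k standard deviations of the mean is erf(k / √2), which follows from integrating the Normal density. The helper name `proportion_within` is illustrative.

```python
import math

def proportion_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.4f}")
```

The result does not depend on μ or σ, which is why the familiar figures of about 68%, 95%, and 99.7% apply to any Normal distribution.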
Probability Plots
• Straight line if data is Normally distributed

Skewed Distribution
• Unfortunately, most variables in geology
follow a skewed, non-symmetrical shape
[Figure: two frequency (%) versus grade curves, one positively skewed and one negatively skewed]
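LogNormal Distribution
• Some variables in geology have a lognormal distribution
– The logarithms of the values have a Normal distribution
– Sometimes it is the logarithm of (value + constant) that is Normal
[Figure: frequency (f%) versus grade for the raw data, and frequency (f%) versus log-grade for the log-transformed data]
• Mean ≠ Median ≠ Mode
• Mean is NOT the antilog of the mean of the logs!
– Mean = antilog(mean of logs + 0.5 × variance of logs), using natural logarithms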
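The back-transformation rule can be checked numerically. The sketch below draws simulated lognormal "grades" (hypothetical parameters, fixed seed, natural logarithms) and compares the arithmetic mean with the naive antilog of the mean of the logs and with the corrected value.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
# Simulated grades: log-mean 1.0 and log-standard-deviation 0.8 are assumptions.
grades = [rng.lognormvariate(1.0, 0.8) for _ in range(50_000)]

logs = [math.log(g) for g in grades]
mean_log = statistics.fmean(logs)
var_log = statistics.pvariance(logs)

arithmetic_mean = statistics.fmean(grades)
naive_mean = math.exp(mean_log)                      # antilog of the mean of logs
corrected_mean = math.exp(mean_log + 0.5 * var_log)  # antilog(mean + 0.5 x variance)

print(f"arithmetic mean : {arithmetic_mean:.3f}")
print(f"naive antilog   : {naive_mean:.3f}")
print(f"corrected value : {corrected_mean:.3f}")
```

The naive antilog recovers the median of a lognormal distribution, not its mean, which is why it underestimates the average grade.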
Mixed Distributions
[Figure: frequency versus grade histogram with more than one peak]
• More than one mode
• Could be due to
– mixed domains
– multiple phases of mineralisation
Sample Support
• The characteristics of a sample:
– sample size (core diameter, sample length)
– sampling method (diamond drilling, reverse circulation)
– assay method (fire assay)
Volume-Variance
• Variance of data set changes according to
the support of the data (i.e. the volume of
material)
• The larger the sample volume, the lower the
variance of the samples
Volume-Variance
• Variance decreases as volume
(size) increases
• Blasthole samples have a higher variance
than mining blocks
• Small model blocks have a higher variance
than bigger model blocks
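The volume-variance effect can be illustrated by averaging simulated samples into larger and larger blocks. This is a sketch under a strong simplifying assumption: the samples are drawn independently, so the variance falls roughly as 1 / (block size). With spatially correlated grades the variance still decreases with support, but more slowly.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
# Hypothetical point-support samples (independent draws, an assumption).
samples = [rng.lognormvariate(0.0, 0.5) for _ in range(4096)]

def block_means(data, size):
    """Average consecutive groups of `size` samples to mimic a larger support."""
    return [statistics.fmean(data[i:i + size]) for i in range(0, len(data), size)]

variance_by_support = {
    size: statistics.pvariance(block_means(samples, size)) for size in (1, 4, 16)
}
for size, variance in variance_by_support.items():
    print(f"support = {size:2d} samples per block, variance = {variance:.4f}")
```

The mean is essentially unchanged by the averaging; only the spread of values around it shrinks.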
[Figure: samples, mining blocks, and model blocks]
Outliers
• Outlier values may be cut to reduce their
impact on the arithmetic mean and estimation
Grade (g/t)   Metal (g Au)   Share of metal (%)
     2            20,000            1%
     4            40,000            3%
     3            30,000            2%
     7            70,000            4%
    90           900,000           58%
    15           150,000           10%
    10           100,000            6%
    20           200,000           13%
• E.g. if Top-Cut = 20, a value > 20 is changed to 20
• Derive the Top-Cut from the Probability Plot
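Applying the top-cut to the eight grades listed in the table shows how strongly the single 90 g/t value drives the arithmetic mean. The function name `apply_top_cut` is illustrative.

```python
grades = [2, 4, 3, 7, 90, 15, 10, 20]  # g/t, from the table above
top_cut = 20

def apply_top_cut(values, cut):
    """Replace every value above `cut` with `cut`; leave the rest unchanged."""
    return [min(v, cut) for v in values]

cut_grades = apply_top_cut(grades, top_cut)
raw_mean = sum(grades) / len(grades)
cut_mean = sum(cut_grades) / len(cut_grades)

print(cut_grades)
print(f"mean before top-cut: {raw_mean:.3f} g/t")
print(f"mean after top-cut : {cut_mean:.3f} g/t")
```

Only the 90 g/t sample changes, yet the mean of these eight values drops by almost half.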
De-Clustering
• Statistics relies on samples being random
and unbiased
• In mining, we are more interested in ore
than waste
• Usually in mining, more drillholes are
located in high-grade areas
• So sampling is inherently biased - more
samples in high grade areas
De-Clustering
• Overcome this bias with de-clustering
• Put (3D) grid over data
• Within each cell take sample closest to
centre (or average of all samples)
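The steps above can be sketched as follows. The example uses a 2D grid instead of the 3D grid on the slide to stay short, and the sample coordinates, grades, and 10-unit cell size are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical samples (x, y, grade): four clustered in one high-grade cell.
samples = [
    (1.0, 1.0, 10.0), (2.0, 3.0, 12.0), (4.0, 4.0, 11.0), (3.0, 1.0, 9.0),
    (15.0, 5.0, 2.0), (5.0, 15.0, 3.0), (15.0, 15.0, 1.0),
]

def decluster(samples, cell_size, mode="average"):
    """Return one value per occupied grid cell: cell average or closest to centre."""
    cells = defaultdict(list)
    for x, y, grade in samples:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y, grade))
    result = {}
    for (ix, iy), members in cells.items():
        if mode == "average":
            result[(ix, iy)] = sum(g for _, _, g in members) / len(members)
        else:
            # Keep the sample nearest to the cell centre.
            cx, cy = (ix + 0.5) * cell_size, (iy + 0.5) * cell_size
            nearest = min(members, key=lambda m: math.hypot(m[0] - cx, m[1] - cy))
            result[(ix, iy)] = nearest[2]
    return result

naive_mean = sum(g for _, _, g in samples) / len(samples)
by_average = decluster(samples, 10.0, mode="average")
by_closest = decluster(samples, 10.0, mode="closest")
mean_average = sum(by_average.values()) / len(by_average)
mean_closest = sum(by_closest.values()) / len(by_closest)

print(f"naive mean               : {naive_mean:.3f}")
print(f"declustered (average)    : {mean_average:.3f}")
print(f"declustered (closest)    : {mean_closest:.3f}")
```

Because the four clustered high-grade samples collapse to a single cell value, both declustered means are lower than the naive mean. The result depends on the chosen cell size.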
[Figure: gridded samples, one panel using the average of the 4 samples in a cell, the other using a single sample]
• Continue the analysis using this single value per cell
De-Clustering
• Any geological knowledge that allows the data to be split into
separate domains must be used first
• Only decluster if no geological separation is
possible
Bivariate description
• Comparing two distributions
• Scatterplot
• Correlation
• Linear regression
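Correlation and linear regression can be computed from the same sums of squares and cross-products. The sketch below assumes the Pearson correlation coefficient and an ordinary least-squares line y = intercept + slope · x, since the slides do not say which variants are meant; the paired data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares regression of y on x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical paired measurements (not from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
slope, intercept = least_squares_line(x, y)
print(f"r = {r:.3f}, y = {intercept:.3f} + {slope:.3f} x")
```

The correlation coefficient measures only the strength of the linear relation, so a scatterplot should always be inspected alongside it.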
Scatterplot
[Figure: scatterplots with regression lines for different values of the correlation coefficient, e.g. ρ = 0.7 (picture taken from Dubrule, 2003)]
