[Figure: diagram with the labels surprise, Description, Hypothesis, validate, Theory, Laws, formalize, Model]
Mathematical Notation
Summation Notation
Pi Notation
Factorial
Combinations
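The four notations listed above can be illustrated with Python's standard library. This is only a minimal sketch: the input values and the choices n = 5, k = 2 are illustrative assumptions, not part of the slides.

```python
import math

values = [8, 4, 2, 6, 10]  # illustrative input (assumption)

# Summation notation: x_1 + x_2 + ... + x_n
total = sum(values)

# Pi notation: x_1 * x_2 * ... * x_n
product = math.prod(values)

# Factorial: n! = n * (n - 1) * ... * 1
five_factorial = math.factorial(5)

# Combinations: C(n, k) = n! / (k! * (n - k)!)
pairs_from_five = math.comb(5, 2)

print(total, product, five_factorial, pairs_from_five)
```

`math.prod` and `math.comb` require Python 3.8 or later.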
Terminology
Population
A collection of items of interest in research
A complete set of things
A group that you wish to generalize your
research to
An example: All the trees in Battle Park
Sample
A subset of a population
Its size is smaller than the size of the population
An example: 100 trees randomly selected
from Battle Park
Terminology
Representative: An accurate reflection of the
population (a primary problem in statistics)
Variables: The properties of a population that
are to be measured (i.e., how do parts of the
population differ?)
Constant: Something that does not vary
Parameter: A constant measure which describes
the characteristics of a population
Statistic: The corresponding measure for a
sample
Descriptive Statistics
Descriptive statistics: Statistics that describe
and summarize the characteristics of a dataset
(sample or population)
Descriptive methods: Fall within the class of
exploratory techniques
The most common way of describing a variable
distribution is in terms of two of its properties:
central tendency & dispersion
Descriptive Statistics
Measures of central tendency
Measures of the location of the middle or the
center of a distribution
Mean, median, mode
Measures of dispersion
Describe how the observations are distributed
Variance, standard deviation, range, etc.
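As a rough illustration of the two families of measures, the sketch below computes them with Python's `statistics` module. The five input values are illustrative, and the distinction between the population variance (divide by N) and the sample variance (divide by n - 1) is a standard convention assumed here; the slides do not spell it out.

```python
import statistics

values = [8, 4, 2, 6, 10]  # illustrative input

# Measures of central tendency
mean = statistics.mean(values)
median = statistics.median(values)

# Measures of dispersion
data_range = max(values) - min(values)
pop_variance = statistics.pvariance(values)    # divides by N
sample_variance = statistics.variance(values)  # divides by n - 1
pop_std = statistics.pstdev(values)            # square root of pop_variance

print(mean, median, data_range, pop_variance, sample_variance, pop_std)
```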
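Population mean:
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]
Sample mean:
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
Example I
\[ \bar{x} = \frac{8 + 4 + 2 + 6 + 10}{5} = \frac{30}{5} = 6 \]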
Example II
Sample: 10 trees randomly selected from Battle Park
Diameter (inches):
9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5
\[ \bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{143.8}{10} = 14.38 \]
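The arithmetic in the examples so far can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` and the variable names are hypothetical.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)

# The five values from the first example
example_1 = [8, 4, 2, 6, 10]

# Tree diameters (inches) from Example II
diameters_in = [9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5]

print(round(arithmetic_mean(example_1), 2))     # 6.0
print(round(arithmetic_mean(diameters_in), 2))  # 14.38
```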
\( \bar{x} = 59.70 \)
Example IV
Chapel Hill, NC (1972-2001)
Mean = 1198.10 mm
Mean = 58.51 °F
Disadvantage
Very sensitive to outliers
#    Tree Height (m)    #    Tree Height (m)
1    5.0                6    5.3
2    6.0                7    7.1
3    7.5                8    25.4
4    8.0                9    7.5
5    4.8                10   4.5
Source: http://www.forestlearn.org/forests/refor.htm
Mean = 6.19 m (excluding the 25.4 m tree)
Mean = 8.11 m (all 10 trees)
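The sensitivity to outliers can be reproduced from the ten tree heights. The sketch below assumes that the single 25.4 m tree is the outlier; the mean is roughly 8.1 m with that tree and roughly 6.2 m without it.

```python
heights_m = [5.0, 5.3, 6.0, 7.1, 7.5, 25.4, 8.0, 7.5, 4.8, 4.5]

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Assumption: the tallest tree (25.4 m) is the outlier
outlier = max(heights_m)
without_outlier = [h for h in heights_m if h != outlier]

mean_all = mean(heights_m)
mean_without_outlier = mean(without_outlier)

print(f"all trees:       {mean_all:.2f} m")
print(f"without outlier: {mean_without_outlier:.2f} m")
```

One extreme value out of ten shifts the mean by almost 2 m, which is why the median is often reported alongside it.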
\[ (\bar{x}, \bar{y}) = \left( \frac{\sum_{i=1}^{n} x_i}{n}, \frac{\sum_{i=1}^{n} y_i}{n} \right) \]
Map Coordinates
Geographic coordinates: The geographic
coordinate system is a system used to locate points
on the surface of the globe (degrees of latitude
and longitude)
Geographic coordinates of Chapel Hill, NC
Lat: 35° 54' 25" N
Lon: 79° 02' 55" W
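Before coordinates like these can be averaged, they are usually converted from degrees, minutes, and seconds to decimal degrees. The sketch below shows one way to do that; the function name `dms_to_decimal` and the sign convention (south and west negative) are assumptions, not something stated in the slides.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Assumed convention: N and E are positive, S and W are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

# Chapel Hill, NC
lat = dms_to_decimal(35, 54, 25, "N")
lon = dms_to_decimal(79, 2, 55, "W")

print(f"{lat:.4f}, {lon:.4f}")
```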
Source: Xiao & Moody, 2004
Source: http://shookweb.jpl.nasa.gov/validation/UTM/default.htm
Source: http://www.geocities.com/CapitolHill/Lobby/3162/HiPlains/GeoCenter/hiplains_geocenter.htm
Source: http://www.cia.gov/cia/publications/factbook/geos/us.html
Weighted Mean
We can also calculate a weighted mean using
some weighting factor:
\[ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]

Avg. Income    Population
$23,000        100,000
$20,000        50,000
$25,000        150,000

Here, population is the weighting factor
and the average income is the variable
of interest
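Applied to the income table, the weighted mean can be computed as follows. This is a minimal sketch; the function name `weighted_mean` is hypothetical.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

avg_income = [23_000, 20_000, 25_000]    # variable of interest
population = [100_000, 50_000, 150_000]  # weighting factor

weighted = weighted_mean(avg_income, population)
unweighted = sum(avg_income) / len(avg_income)

print(f"weighted:   ${weighted:,.2f}")    # $23,500.00
print(f"unweighted: ${unweighted:,.2f}")  # $22,666.67
```

The weighted result is higher than the simple average because the most populous area also has the highest average income.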
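\[ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \qquad \bar{y}_w = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i} \]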
Here we weight
by area
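A weighted mean center applies the weighted mean to each coordinate separately. In the sketch below the centroid coordinates and areas are hypothetical values chosen only for illustration, and the function name `weighted_mean_center` is likewise an assumption.

```python
def weighted_mean_center(points, weights):
    """Return (x_w, y_w), the weighted mean of the x and y coordinates."""
    if len(points) != len(weights):
        raise ValueError("points and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    x_w = sum(w * x for (x, _), w in zip(points, weights)) / total_weight
    y_w = sum(w * y for (_, y), w in zip(points, weights)) / total_weight
    return x_w, y_w

# Hypothetical polygon centroids and their areas (not from the slides)
centroids = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
areas = [1.0, 1.0, 2.0]

weighted_center = weighted_mean_center(centroids, areas)
unweighted_center = weighted_mean_center(centroids, [1.0] * len(centroids))

print(weighted_center)    # (1.0, 2.0)
print(unweighted_center)
```

With equal weights the function reduces to the ordinary mean center; the larger third polygon pulls the weighted center toward (0, 4).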
Center of Population
Source: http://www.census.gov/geo/www/centers_pop.pdf
Source: http://rst.gsfc.nasa.gov/Sect13/Sect13_2.html
About 850m
\[ R = \sum_{i=1}^{n} f_i r_i \]
(mean: 6)
median: 6
Example II
Sample: 10 trees randomly selected from Battle Park
Diameter (inches):
9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5
(mean: 14.38)
7.8, 9.8, 10.1, 10.2, 13.9, 14.5, 15.5, 17.5, 20.0, 24.5
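Finding the median from the sorted list can be sketched as below. With an even number of observations, the usual convention (assumed here) is to average the two middle values. The function name `median` is hypothetical; `statistics.median` in the standard library follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

diameters_in = [9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5]

print(sorted(diameters_in))
print(median(diameters_in))      # average of 13.9 and 14.5
print(median([8, 4, 2, 6, 10]))  # 6
```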
Source: http://www.forestlearn.org/forests/refor.htm
#    Tree Height (m)    #    Tree Height (m)
1    4.5                6    7.1
2    4.8                7    7.5
3    5.0                8    7.5
4    5.3                9    8.0
5    6.0                10   25.4
30, 40, 25, 50, 45, 50, 55, 45, 48, 61, 60, 75, 70,
45, 72, 24, 45, 200, 205, 65, 65, 39, 58, 65, 45
24, 25, 30, 39, 40, 45, 45, 45, 45, 45, 48, 50, 50,
55, 58, 60, 61, 65, 65, 65, 70, 72, 75, 200, 205
mean: 63.28
median: 50
mode: 45
mean (without outliers): 51.17
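The three measures for this dataset can be reproduced with Python's `statistics` module. In the sketch below, treating the two values of 200 or more as the outliers is an assumption based on the sorted list; the slides do not state a cutoff.

```python
import statistics

values = [30, 40, 25, 50, 45, 50, 55, 45, 48, 61, 60, 75, 70,
          45, 72, 24, 45, 200, 205, 65, 65, 39, 58, 65, 45]

mean_all = statistics.mean(values)
median_all = statistics.median(values)
mode_all = statistics.mode(values)

# Assumption: values of 200 or more (200 and 205) are the outliers
without_outliers = [v for v in values if v < 200]
mean_without_outliers = statistics.mean(without_outliers)

print(f"mean:   {mean_all:.2f}")               # 63.28
print(f"median: {median_all}")                 # 50
print(f"mode:   {mode_all}")                   # 45
print(f"mean (without outliers): {mean_without_outliers:.2f}")  # 51.17
```

The median and mode are unchanged by the two extreme values, while the mean moves by about 12 units.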
C. Scales of Measurement
Data (the plural of datum) are generated
by recording measurements
Measurement involves either categorizing an
item (i.e., assigning it to one of a set of types) when
the measure is qualitative,
or using a number to give something a
quantitative measurement
C. Scales of Measurement
The data used in statistical analyses can be divided
into four types:
1. The Nominal Scale
2. The Ordinal Scale
3. The Interval Scale
4. The Ratio Scale
As we progress through
these scales, the types
of data they describe
have increasing
information content
[Figure: example distributions. Panels: Multi-modal distribution; Unimodal skewed; Unimodal symmetric; Unimodal skewed]