Computing in Archaeology
Session 12. Multivariate statistics
© Richard Haddlesey www.medievalarchitecture.net
Aims

 To introduce the techniques of multivariate analysis
• Cluster analysis
• Correspondence analysis
• Principal components and factor analysis
• Multiple regression
• Discriminant analysis

 Key text
• Fletcher & Lock 2005 Digging Numbers
Introduction to
multivariate analysis
 In earlier lectures we have seen examples
of univariate analysis using such
techniques as simple bar charts, frequency
tables of one variable and calculations of a
simple sample mean
 When two variables are involved, as in
clustered bar charts and scatterplots, when
we are comparing the means of two groups, or
when we are asking whether there is any
association between two variables, we are
using techniques of bivariate analysis

 When more than two variables are involved,
however, we are dealing with multivariate
analysis
SPSS
 These techniques require the use of
suitable statistical packages, such as
SPSS, because of the considerable
computation involved

 Consequently, the approach of working
examples by hand used in earlier lectures
is not relevant here and we will not be
going into the statistical and mathematical
details behind the techniques
Techniques discussed
 Type A: reduction and grouping
• Given several measurements (ordinal, interval
or presence/absence) on each of many objects
(i.e. several variables and many cases), is it
possible to reduce the number of variables while
still maintaining the information in the data?

• Using either the original variables or the new
reduced set, can these objects be put into
groups or clusters so that within each group
the objects are similar but between groups
there are interpretable differences?
 Type B: prediction
• Given several measurements (ordinal,
interval or presence/absence) on each
of many objects (i.e. several variables
and many cases), with one of the variables
of particular interest, is it possible to
predict this variable from the others and,
if so, which variables are important in
this prediction?
Type A techniques

 Cluster analysis

 Correspondence Analysis

 Principal Components and Factor Analysis (PCA)
Type B techniques

 Multiple regression

 Discriminant analysis
Type A:
1. reduction and grouping
2. cluster analysis
 We may wish to ask
• Can spearheads be grouped or clustered, so
that those within a cluster are similar to each
other but there are important differences
between the clusters?

• i.e. if we group by dimensions, thus creating
clusters of like-sized spearheads, will there be
clear differences between the size clusters?
Hierarchical cluster analysis
 Most stats packages offer a standard clustering
method called hierarchical cluster analysis

 It starts by making each spearhead a single
cluster. We then tell it how we want the clusters
produced and SPSS will successively merge the
most similar clusters until one big cluster remains

 It will then output the results, provide
information on cluster membership and indicate
how good the clustering has been (i.e. how
similar the members are)
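
The merging procedure can be sketched in a few lines of Python. This is only an illustration, not the routine SPSS uses: the spearhead measurements are invented, the function name hierarchical_clusters is hypothetical, and it assumes Euclidean distance with "single linkage" (the distance between two clusters is the distance between their two closest members), which is one of several possible choices.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def hierarchical_clusters(points):
    """Agglomerative clustering with single linkage.

    Starts with every object as its own cluster and repeatedly merges
    the two closest clusters until only one remains. Returns the list
    of merges as (members_a, members_b, distance) in the order made.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Single linkage: distance between the closest pair of members
            d = min(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two merged clusters by their union
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b), d))
    return merges


# Invented example data: (length, maximum width) in cm for five spearheads
spearheads = [(12.0, 2.0), (12.5, 2.2), (13.0, 2.1), (30.0, 5.0), (31.0, 5.5)]

for members_a, members_b, distance in hierarchical_clusters(spearheads):
    print(f"merge {members_a} + {members_b} at distance {distance:.2f}")
```

In this made-up data the three short spearheads join each other at small distances, as do the two long ones, and the two groups are only joined at a much larger distance in the final step.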
Dendrograms

 The way to “visualise” the clusters as
they are formed, as an aid to
deciding how many are “significant”,
is by asking the software to produce
a dendrogram
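
A dendrogram is normally read by eye: a long stretch with no merges suggests that the clusters below it are well separated. As a rough sketch of that reading, the hypothetical function below looks for the largest jump between successive merge distances and reports how many clusters remain if the tree is cut there. This is a common rule of thumb, not a formal significance test and not something taken from SPSS, and the merge distances are invented.

```python
def suggest_cluster_count(merge_heights):
    """Suggest a number of clusters from dendrogram merge heights.

    merge_heights: the distances at which clusters were joined, in the
    order the merges happened (n - 1 values for n objects).
    The tree is "cut" inside the largest gap between successive heights.
    """
    n_objects = len(merge_heights) + 1
    gaps = [later - earlier
            for earlier, later in zip(merge_heights, merge_heights[1:])]
    biggest = max(range(len(gaps)), key=gaps.__getitem__)
    # Cutting after merge number (biggest + 1) leaves this many clusters
    return n_objects - (biggest + 1)


# Invented merge distances for five objects
heights = [0.51, 0.54, 1.12, 17.24]
print(suggest_cluster_count(heights))
```

Any such suggestion still has to be checked against whether the resulting groups make archaeological sense.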
Type B:
1 prediction
2 multiple regression
 We have already covered the theory
of prediction and regression in the
previous lecture. Although we are
now talking about multiple
regression, the principle is the
same and is best understood through
the practical session to follow
 We may ask
• Can the length of a spear be predicted if
the tip is missing?

 Previously we discussed correlation
and regression between two
variables; multiple regression allows
us to use multiple variables
Multiple regression

 Multiple regression will produce a linear
equation relating spear length, the
dependent variable, to several
independent variables such as socket
length, maximum width, width of upper
socket and width of lower socket.

 Both the dependent variable (the one to
be predicted) and the independent variables
(the ingredients for this prediction) must
be measured on an interval scale or be
presence/absence data
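
The linear equation has the form length = b0 + b1 × (socket length) + b2 × (maximum width) + ..., and the software's job is to estimate the coefficients b0, b1, b2, ... The sketch below shows one way this can be done, by ordinary least squares using only the Python standard library. It is not SPSS's own procedure: the function names are hypothetical, only two of the independent variables are used, and the measurements are invented so that they lie exactly on the plane length = 2 + 3 × socket length + 1.5 × maximum width, which makes the result easy to check.

```python
def fit_multiple_regression(rows, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2 + ...

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan
    elimination and returns [b0, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept b0
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest entry in this column up
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, measurements):
    """Apply the fitted equation to one object's measurements."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], measurements))


# Invented data: (socket length, maximum width) and total length, in cm
measurements = [(4.0, 2.0), (5.0, 3.0), (6.0, 2.5), (7.0, 4.0), (8.0, 3.0)]
lengths = [17.0, 21.5, 23.75, 29.0, 30.5]

coeffs = fit_multiple_regression(measurements, lengths)
print("coefficients:", [round(c, 3) for c in coeffs])
# Estimate the length of a spearhead whose tip is missing
print("predicted length:", round(predict(coeffs, (6.5, 3.0)), 2))
```

Real measurements will not fit exactly, so a statistical package also reports how well the equation fits and which independent variables contribute to the prediction.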
