Journal of Housing Research • Volume 9, Issue 1 113

© Fannie Mae Foundation 1998. All Rights Reserved.

GIS Research Infrastructure for Spatial Analysis of Real Estate Markets

Luc Anselin*

Abstract
This article outlines the type of research infrastructure needed to complement existing commercial
Geographic Information System (GIS) environments to perform state-of-the-art spatial analysis of real
estate markets. The emphasis is on the relevance of a spatial data analytic perspective and on the
operational setting in which this can be implemented. These ideas are part of an overall framework
that distinguishes four spatial analysis functions: selection, manipulation, exploration, and confir-
mation.
The relevance of spatial econometrics and spatial statistics for empirical analysis of real estate markets
is illustrated with respect to three specific examples: efficient survey design, pattern recognition, and
the estimation of hedonic models. Particular attention is paid to the linkages between the spatial data
analytic functions and the more traditional GIS analysis functions, both conceptually and in existing
software environments.
Keywords: GIS; Spatial econometrics; Spatial statistics; Real estate analysis

Introduction
One of the four classic textbook characteristics that differentiate housing from other commodities
is locational fixity (the others are high cost of supply, durability, and heterogeneity;
see Quigley 1979). This implies that access to employment opportunities, shopping, public
service facilities, and other centers of activity is obtained jointly with the dwelling unit’s
physical characteristics. Hence, the importance of the spatial aspects of real estate markets
is unquestioned. Despite widespread recognition by both theorists and practitioners of the
complex roles of location and spatial interaction and the resulting geographically segmented
nature of real estate markets, however, an explicit ‘‘spatial’’ treatment of these markets in
empirical real estate research is still in its infancy. There have been two important imped-
iments to such a spatial treatment, one methodological and the other operational.
From a methodological perspective, a proper treatment of space requires the recognition of
the importance of the two-dimensional nature of spatial interaction (or spatial auto-
correlation) and its implications for statistical analysis. In other words, the prevalence of
spatial dependence in the cross-sectional data used in real estate analysis requires the ap-
plication of appropriate techniques of spatial statistics and spatial econometrics. It has been
amply demonstrated that ignoring the special nature of spatial data in econometric analysis
may lead to biased or inefficient estimates and misleading inference (see, e.g., Anselin 1988,
1990a; Anselin and Griffith 1988). Until recently, however, such an awareness was not prevalent
in empirical work outside geography (as indicated by the results of the literature
surveys in Anselin and Griffith 1988 and Anselin and Hudak 1992). Early efforts to implement
spatial regression models in urban and real estate analysis include Griffith (1981) and
Anselin and Can (1986), which focused on urban density functions, and Dubin (1988, 1992)
and Can (1990, 1992a), in the context of hedonic models for house prices. These studies were
characterized by the use of fairly small data sets (in contrast to more ‘‘mainstream’’
microeconometric cross-sectional analyses) and a focus on methodological issues. More recently,
an appreciation of spatial econometric approaches has started to appear in the public policy
literature as well, undoubtedly motivated in part by the emphasis on spatial outcomes in a
number of laws, such as those concerned with equal access to mortgage credit and the
enforcement of fair lending laws (see Buist, Megbolugbe, and Trent 1994 for a more extensive
discussion).

* Luc Anselin is Director of the Bruton Center for Development Studies and Professor of Economics, Geography,
and Political Economy at the University of Texas, Dallas.

Part of the research reported on in this article was supported by grant SBR-9410612 from the U.S. National Science
Foundation. I thank Ayşe Can for providing me with early drafts of her unpublished and forthcoming manuscripts.

From an operational perspective, the initial lack of dissemination of the methods of spatial
econometrics and spatial statistics to the practice of empirical real estate (and other policy)
research has often been attributed to a dearth of software tools (see, e.g., Haining 1989).
Although this may have been an important factor in the 1980s, it is clearly no longer the
case. For example, the SpaceStat software package for spatial data analysis, which contains
descriptive statistics to test for spatial autocorrelation as well as spatial econometric meth-
ods (Anselin 1992a, 1995a), has considerably eased the application of these techniques and
is increasingly used by practitioners. In addition, a number of software development efforts
have led to the implementation of basic spatial econometric techniques in existing commer-
cial statistical software. Examples are routines for the estimation of spatial regression mod-
els in Systat by Bivand (1992) and in SAS by Griffith (1993); specification tests and esti-
mators for spatial econometric models in Shazam, Limdep, Gauss, and S-Plus by Anselin,
Hudak, and Dodson (1993); and geostatistical methods for use with S-Plus in Venables and
Ripley (1994). There are also several freestanding software packages to test for spatial au-
tocorrelation (for an overview, see Legendre 1993).

More important, the spectacular growth in the technology of Geographic Information Sys-
tems (GIS) has allowed a much more realistic and detailed measurement and representation
of the features of the urban economic geography relevant to the analysis of real estate mar-
kets. Not only does a GIS allow the collection and integration of very large databases for
use in real estate marketing (e.g., the GeoData service of Environmental Systems Research
Institute [ESRI] and the National Association of Realtors [‘‘GeoData Product Will Trans-
form’’ 1995]), but it also extends the frontiers of the types of analytical studies that can be
carried out in a realistic setting (e.g., the assessment of residential quality in Can 1992b
and the study of underserved mortgage markets in Can and Megbolugbe 1996). The current
state of the art allows for the application of sophisticated spatial statistical techniques in
conjunction with an operational GIS environment that is geared to support policy analysis
as well as business decision making, as illustrated in the recent studies of Anselin and Can
(1995) and Can and Megbolugbe (1997).

In this article, I focus on the research infrastructure needed to complement existing commercial
GIS environments in order to effectively support real estate business decision making
and policy analysis. I emphasize two aspects of the research infrastructure: the types of
analyses that are most relevant and how they can be implemented in an operational setting.
The outcome is an integrated set of GIS functions and spatial data analysis operations.
Because the focus is on generic problems, the level of abstraction is purposely kept fairly
high. I stress the overall framework, rather than the peculiarities of specific applications.
The scope of the discussion is limited to statistical and econometric approaches. A similar
treatment could be extended to the application of optimization and other operations research
techniques, such as those needed to solve location-allocation problems in a spatial decision
support system (for an overview, see Densham 1991).

This article is motivated by the need to qualify the general impression (in part created by
vendors’ claims) that off-the-shelf GIS software is sufficiently equipped to address the com-
plex questions faced in the spatial analysis of real estate markets. Undoubtedly, great strides
have been made since Goodchild’s (1987) call to make GIS the premier tool for spatial anal-
ysis. However, the emphasis in most current commercial GIS software is still on the database
management and mapping functions of a transactions processing system. This has created
important new opportunities in terms of the scope and scale of studies that can be carried
out operationally, but as such it is insufficient to enable the full array of spatial data analyses
necessary to support decision making and policy evaluation in the real estate sector. The
available GIS functions must be augmented with a system of customized spatial analysis
tools to form an integrated modular framework for spatial decision support. The design of
such a framework is the topic of the present article.

In the following sections, I first review the special nature of spatial data and its relation to
GIS. Next, I focus more specifically on the research infrastructure for spatial analysis, both
in terms of what vendors customarily refer to as spatial analysis and in terms of spatial data
analysis and GIS.
I illustrate the importance of spatial data analysis in real estate analysis with three ex-
amples of research questions related to efficient survey design, pattern recognition (clusters
and outliers), and hedonic regression. This is followed by a brief discussion of operational
issues faced in the integration of spatial data analysis and GIS software, illustrated with
an example of a pilot project to interface the SpaceStat software for spatial data analysis
with ESRI’s ArcView GIS. I close with some concluding remarks.

Spatial Data and Geographic Information Systems

Spatial Data

Because of the importance of location in real estate analysis, observations on variables used
in empirical studies (e.g., house prices, mortgage lending rates, neighborhood characteris-
tics, access to public facilities) overwhelmingly tend to be spatial or georeferenced. Such
data often do not satisfy the requirements of independence and homogeneity required in
classical statistics. In fact, the so-called First Law of Geography (Tobler 1979) posits that
‘‘everything is related to everything else, but near things are more related than distant
things.’’ This link between locational similarity and value similarity—spatial autocorrela-
tion—may be the result of spatial interaction processes, externalities, spatial diffusion, copy-
catting, spillovers, poor choice of unit of observation, and so on. The main issue is that the
predominance of dependence rather than independence in geographic data has important
implications for their statistical analysis (Anselin 1988, 1990a, 1992b). It will tend to result
in observations that are spatially clustered; in other words, it will yield data samples that
are not independent and that therefore contain less information (in the case of positive
spatial autocorrelation) than similarly sized independent samples.

The presence of spatial dependence in cross-sectional georeferenced data has two important
consequences. On the one hand, if the focus is on obtaining proper statistical inference (es-
timation, hypothesis tests, predictors) from the dependent data, spatial autocorrelation can
be considered a nuisance. In such an instance, the main objective is to correct standard
statistical procedures for the effect of the spatial dependence, for example, by increasing the
sample size or by using robust methods or adjustments that incorporate the spatial auto-
correlation in a regression error term. Alternatively, when one is intent on discovering the
form of spatial interaction—the precise nature of spatial spillover and the economic and
social processes that lie behind it—the spatial dependence can be considered substantive. In
this case, the focus is on how to incorporate the structure of spatial dependence into a sta-
tistical model and how to estimate and interpret it.

Similar to spatial dependence, the intrinsic uniqueness of each location (another distinguish-
ing characteristic of housing units) may create a form of locational (regional) differentiation,
or spatial heterogeneity. Again, this heterogeneity may be considered a nuisance and cor-
rected for, or it may be the focus of interest in itself and be modeled explicitly. Spatial
heterogeneity is a special case of structural instability. From a methodological viewpoint, it
only merits special attention in that it is the spatial structure that creates the instability.

In contrast to spatial heterogeneity, dealing with spatial autocorrelation requires a specialized
set of statistical and econometric methods (for overviews, see, e.g., Anselin 1988; Cliff
and Ord 1981; Cressie 1991; Haining 1990). The two-dimensional (and multidirectional)
dependence across space is not a straightforward extension of one-dimensional (and one-
directional) dependence in time. Always central in geographical analysis, the need to apply
spatial econometric methods in empirical work that uses georeferenced cross-sectional data
has become increasingly accepted in the mainstream social sciences as well. Recent examples
of economic analyses in which spatial dependence was formally incorporated include Case,
Hines, and Rosen (1993), Murdoch, Rahmatian, and Thayer (1993), Holtz-Eakin (1994), and
Besley and Case (1995), which examine local public finance and public economic issues, and
Benirschka and Binkley (1994), an analysis of midwestern agricultural land values. While
sophisticated in their econometric methodology, these applications tend to be limited to a
fairly simple set of locational configurations, such as the 48 contiguous U.S. states or a set
of counties. In applied real estate research, the typical number of locations of interest, such
as parcels or neighborhoods (census blocks), is considerably larger (thousands to hundreds
of thousands). This volume precludes a manual approach and requires the functionality of
a complex information system to deal with the locational and topological aspects of the
spatial analysis. Such an information system is referred to as a Geographic Information
System, or GIS.

Geographic Information Systems

Simply put, the core of a GIS is a database that efficiently combines value (attribute) infor-
mation on the objects of interest (e.g., price, square footage, and amenities of a housing unit)
with locational and topological (spatial arrangement) information (e.g., street address, lat-
itude and longitude, and census block to which the housing unit belongs). In a much broader
sense, a GIS has been referred to as ‘‘a powerful set of tools for collecting, storing, retrieving
at will, transforming and displaying spatial data from the real world for a particular set of
purposes’’ (Burrough 1986, 6; see also Maguire 1991 for alternative definitions). In the extreme,
this view of GIS as a set of tools tends to identify it with specific (commercial) software
and obfuscate its role as an encompassing framework to address generic geographic research
questions—GIS as ‘‘geographic information science’’ (Goodchild 1992).

For the purposes of this article, and to avoid getting embroiled in the debate about what
GIS is and is not, the focus is on spatial analysis, which is commonly considered to be one
of the four major functions of a GIS (the others are input, storage, and output of spatial
information; see Anselin and Getis 1992; Goodchild 1987). The role in GIS of spatial analysis
in general and spatial data analysis in particular has been the topic of much recent research,
as reported in Anselin and Getis (1992), Goodchild, Haining, and Wise (1992), Fischer and
Nijkamp (1993), and Fotheringham and Rogerson (1994), among others. Rather than re-
viewing this literature, I will elaborate on the classification of spatial analysis functions into
selection, manipulation, exploration, and confirmation, which was suggested in Anselin and
Getis (1992). In Anselin, Dodson, and Hudak (1993), the first two of these functions were
considered to form a ‘‘GIS module’’ and the latter two a ‘‘Data Analysis module’’ to emphasize
the practical division of labor between GIS software and spatial data analysis software.
However, such a distinction has become increasingly irrelevant, as statistical software adds
mapping and GIS capabilities to its functions, and a growing number of statistical operations
are included in commercial GIS software. More important than classifying these functions
as belonging to one or the other is to consider their interaction. This is approached here at
a fairly general level. The selection and manipulation functions are discussed first, followed
by the spatial data analysis functions. The interaction between the various functions is
schematically summarized in figure 1.

Spatial Analysis Functions in GIS

Current commercial GIS contain hundreds and even thousands of operators. It is
therefore important to keep in mind that the classification of spatial analysis functions into
selection, manipulation, exploration, and confirmation is only one of many suggested tax-
onomies. The main point is that the first two capabilities (shown on the left in figure 1) are
present in virtually all systems and have become known as spatial analysis in the commer-
cial world (e.g., ESRI 1995a, lesson 8). The spatial data analysis functions (shown on the
right in figure 1) are much less prevalent in commercial systems.

Spatial Data Selection

The selection function includes operators necessary to obtain a set of variables for particular
locations from a spatial database. This capability ranges from simple zooming and browsing
functions and traditional relational database queries to so-called spatial queries. In the
latter, the outcome is the result of a Boolean operation that includes both attribute infor-
mation and locational information. An example would be to construct a query for all recent
home purchases in a particular price range (nonspatial query) that occurred within a 30-
minute drive from a shopping center (spatial query). This operation involves finding the
appropriate value combinations and the selection of all addresses (coordinates, or points in
a polygon) that fall within a ‘‘buffer’’ constructed around a given point.
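
The combined query described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sales records, the coordinates, and the function name `spatial_query` are hypothetical, and a straight-line distance buffer stands in for the drive-time buffer, which would require a street network.

```python
from math import hypot

# Hypothetical sales records: (id, price, x, y), coordinates in kilometers.
sales = [
    ("a", 150_000, 1.0, 1.0),
    ("b", 210_000, 2.0, 2.5),
    ("c", 180_000, 9.0, 9.0),
    ("d", 400_000, 1.5, 0.5),
]


def spatial_query(records, price_min, price_max, center, radius):
    """Return ids that satisfy both the attribute and the buffer condition."""
    cx, cy = center
    selected = []
    for rec_id, price, x, y in records:
        in_price_range = price_min <= price <= price_max  # nonspatial part
        in_buffer = hypot(x - cx, y - cy) <= radius       # spatial part
        if in_price_range and in_buffer:                  # Boolean combination
            selected.append(rec_id)
    return selected


shopping_center = (1.0, 1.0)  # assumed location of the shopping center
result = spatial_query(sales, 100_000, 250_000, shopping_center, radius=3.0)
print(result)
```

Record `c` fails the buffer test and record `d` fails the price test, so only the two remaining sales are selected.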

Figure 1. Spatial Analysis Functions

Another aspect of the selection function is to perform spatial sampling, that is, the selection
of locations (e.g., house sales transactions) for future statistical analysis. Because of the
prevalence of spatial autocorrelation, standard random sampling techniques may not be
appropriate for spatial data sets. To carry out proper spatial sampling, often an initial data
analysis is needed to assess the range and significance of spatial autocorrelation. This would
be the result of an exploratory spatial data analysis, as illustrated by the link between global
spatial association and spatial sampling in figure 1. This link is discussed in more detail in
the following.
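
One simple way to act on an estimated range of spatial autocorrelation is to thin the candidate locations so that no two selected observations lie closer together than that range. The sketch below is a hypothetical illustration of this idea, not a procedure taken from the article; the function name `thin_by_distance`, the coordinates, and the range value are all assumptions.

```python
from math import hypot


def thin_by_distance(points, min_dist):
    """Greedily keep points that are at least min_dist from every kept point."""
    kept = []
    for p in points:
        if all(hypot(p[0] - q[0], p[1] - q[1]) >= min_dist for q in kept):
            kept.append(p)
    return kept


# Hypothetical transaction locations and an assumed autocorrelation range of 1.5.
candidates = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (2.2, 0.1), (4.0, 0.0)]
sample = thin_by_distance(candidates, min_dist=1.5)
print(sample)
```

The result depends on the order in which the candidates are visited, so in practice the list would presumably be shuffled first to keep a random element in the design.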

Spatial Data Manipulation


The manipulation function includes all operations to create spatial data. The virtually lim-
itless ability of GIS to produce maps of data at any scale and for any level of areal aggregation
is often seen as its most powerful analysis feature. However, though typically hidden from
the user, such operations are themselves based on specific algorithms and models and often
involve a prior statistical analysis of the data.

The data manipulation operations can be broadly classified into three groups. The first con-
tains those pertaining only to attribute values—that is, traditional data summaries and
transformations (aggregation, averaging, etc.). A second group consists of those operations
that pertain only to spatial information, that is, a manipulation of the coordinates of the
points, lines, and polygons in a spatial database to perform spatial transformations, map
abstraction, spatial aggregation and dissolution, the computation of topology (determination
of neighbors), centroids, area, perimeter, and so on. The most important aspect of this second
group for data analysis is the construction of topology or spatial arrangement for a set of
areal units. This information is crucial for the computation of any statistic for spatial au-
tocorrelation, for which each location (observation) needs to be associated with a set of
‘‘neighbors’’ (typically adjacent units) in a so-called spatial weights matrix or spatial lag
operator. This is illustrated by the links in figure 1 between topology and global and local
spatial association in the exploratory module and estimation and diagnostics in the confir-
matory module. Most commercial GIS include information on the topology of the data in one
form or another, although this is not always easily accessible to spatial data analysis func-
tions. This is considered in more detail in the following.
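
To make the notion of a spatial weights matrix and spatial lag operator concrete, the following minimal sketch builds a weights matrix from a neighbor list and applies it to a variable. The 2 x 2 grid of areal units, the price values, and the function names are hypothetical, and row standardization is assumed here as one common convention rather than something prescribed in the text.

```python
# Hypothetical contiguity (shared borders) for a 2 x 2 grid of areal units:
#   0 1
#   2 3
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}


def row_standardized_weights(neighbors):
    """Build an n x n weights matrix whose nonzero rows each sum to one.

    Assumes the units are labeled 0 .. n-1.
    """
    n = len(neighbors)
    w = [[0.0] * n for _ in range(n)]
    for i, nbrs in neighbors.items():
        for j in nbrs:
            w[i][j] = 1.0 / len(nbrs)
    return w


def spatial_lag(w, values):
    """Weighted average of the neighbors' values for every location."""
    return [sum(wij * v for wij, v in zip(row, values)) for row in w]


prices = [100.0, 120.0, 140.0, 200.0]  # hypothetical attribute values
W = row_standardized_weights(neighbors)
lagged = spatial_lag(W, prices)
print(lagged)
```

With row-standardized weights, each element of the spatial lag is simply the average of the values observed at the neighboring units.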

A third group of functions combines both spatial and nonspatial information and is com-
monly referred to as data integration. This capability allows for the construction of data for
a particular unit of analysis (e.g., per capita income for a school district) by combining in-
formation on different variables (e.g., proxy variables, such as retail sales) and at different
levels of spatial aggregation (e.g., sales data for specific stores, income data at the census
tract level) by means of polygon overlay and spatial interpolation operations. Such data
integration is itself based on the use of models and statistical analysis (e.g., the predicted
values of a spatial regression model or a kriging operation) and is not without associated
error (see Goodchild and Gopal 1989 and Veregin 1995 for a review of the salient issues).
Although it is often hidden from the user or highly simplified for display purposes, the
explicit consideration of the structure of spatial error embedded in a spatial database is
essential for any sound statistical analysis. Simplistic approaches, such as area-based spa-
tial interpolation, have only limited relevance in realistic policy analysis contexts. Instead,
custom solutions need to be developed in each particular application, based on a careful
interpretation of the results of both exploratory and confirmatory analysis, as suggested by
the links in figure 1.

Spatial Data Analysis and GIS

In contrast to what prevails in the commercial GIS world, the view in academia tends to be
that the core of the spatial analysis capacity in GIS consists of the exploratory and confir-
matory spatial data analysis functions (see, e.g., Anselin and Getis 1992; Bailey 1994). Of
the two, exploratory spatial data analysis (ESDA) is ideally suited to be integrated with the
other GIS functions, given its emphasis on interaction with the data and visualization (An-
selin 1994; Fotheringham and Charlton 1994; Openshaw 1991). On the other hand, confir-
matory data analysis provides the modeling capacity in a GIS environment, which is spatial
due to the nature of the data used in the statistical analysis. Clearly, the interaction between
modeling and the visualization capabilities of a GIS is less direct, and much spatial modeling
has been (and can be) carried out without a GIS. Increasingly, however, the size and com-
plexity of spatial data sets used in such modeling require the sophisticated data handling
capabilities of a GIS (see Bailey and Gatrell 1995; Batty and Xie 1994a, 1994b).

Exploratory Spatial Data Analysis

Exploratory spatial data analysis is data-driven and can be considered an extension of ideas
from mainstream exploratory data analysis (EDA) (see, e.g., Good 1983; Tukey 1977) to the
spatial domain. Consequently, ESDA consists of techniques to describe and visualize spatial
distributions, discover patterns of spatial association (spatial clustering), suggest different
spatial regimes or other forms of spatial instability (nonstationarity), and identify atypical
observations (outliers) (Anselin 1994). The description of spatial distributions is increasingly
integrated into modern techniques of dynamic graphics by including the map as an addi-
tional view of the data (in addition to box plots, scatterplots, and other standard EDA de-
vices), as in the Spider-Regard software of Haslett, Unwin, and associates (Haslett, Wills,
and Unwin 1990; Haslett et al. 1991; see also Monmonier 1989 for the extension of scatter-
plot brushing with a map view). However, this approach is not yet part of the standard
visualization functions of commercial GIS software.

Central to ESDA is the analysis of spatial association or spatial autocorrelation. A simple
taxonomy of different perspectives and techniques is presented in table 1. The classification
follows two dimensions (see also Anselin 1994). First, there is the distinction between
neighborhood- and distance-based statistics. In the former, spatial interaction is viewed as
a step function, wherein a location interacts with a given set of neighbors. The overall in-
teraction in the observed data is then obtained by imposing (assuming) a particular form of
a spatial stochastic process. This approach is predominant in the so-called lattice perspective
on spatial statistics, which is prevalent in the social sciences. It requires the formalization
of neighborhood structure for each observation (i.e., the topology or spatial arrangement of
the data) in the form of a spatial weights matrix (for a more extensive discussion, see Anselin
1988, 1992b). For realistically sized problems, the construction of such a weights matrix
cannot be carried out by visual inspection of a map and must rely on the data structures
present in a GIS (carried out as part of the manipulation function). The other, distance-
based, viewpoint is that typically taken in geostatistics, where spatial interaction is assumed
to follow a smooth function of distance between pairs of observations. A required input for
geostatistical indicators is thus a distance measure for each pair of locations, which is easily
computed as part of the manipulation functions shown in figure 1 (though in practice not
always straightforward to access for statistical computations). Both perspectives require
that a number of assumptions are satisfied in order for their interpretation to be meaningful
(a technical discussion of these is beyond the current scope and can be found in, among
others, Cliff and Ord 1981 and Cressie 1991).

Table 1. Classification of Spatial Autocorrelation Measures

               Global                   Local

Neighborhood   • Moran’s I              • Local Moran, local Geary, local Gamma
               • Geary’s c              • Gi and Gi*
               • Gamma                  • Moran scatterplot
               • Spatial correlogram

Distance       • Variogram              • Pocket plot
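
For the distance-based perspective, the pairwise distances feed directly into an empirical semivariogram. The sketch below uses the classical estimator (half the average squared difference between pairs of observations in each distance band). The point coordinates, values, bin width, and function name are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import hypot


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Half the mean squared difference of value pairs, per distance bin.

    Bin k collects pairs with distance in [k * bin_width, (k + 1) * bin_width).
    Bins without any pair are reported as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        d = hypot(p[0] - q[0], p[1] - q[1])
        b = int(d // bin_width)
        if b < n_bins:
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical observations along a transect with a linear trend in the values.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 3.0, 4.0]
gamma = empirical_semivariogram(points, values, bin_width=1.0, n_bins=4)
print(gamma)
```

In this toy example the semivariogram keeps rising with distance because the data contain a trend, so no sill is reached.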

The second dimension of the classification in table 1 is the distinction between global and
local indicators of spatial association. The former is the traditional approach to spatial auto-
correlation, in which the overall pattern of dependence is summarized into a single indicator.
This indicator is either a single statistic, such as the familiar Moran’s I, Geary’s c, or Gamma
indicators of spatial association (for details, see Cliff and Ord 1981; Haining 1990; Upton
and Fingleton 1985), or a function, such as the variogram (for details, see Cressie 1991).
Global measures of spatial association can be used to assess the range of spatial interaction
in the data, either from a careful study of a spatial correlogram (a series of spatial autocor-
relation measures for different orders of contiguity, i.e., increasingly wider bands around
the observations) or from an estimation of the distance to the sill in an empirical variogram
(the sill is the point at which the variogram no longer increases with distance). This infor-
mation provides an important input into the spatial sampling function in the selection mod-
ule of figure 1.
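
As an illustration of a global statistic, Moran's I can be computed directly from a variable and a spatial weights matrix using its usual definition: the ratio of the weighted cross-products of deviations from the mean to their sum of squares, scaled by the number of observations over the sum of all weights. The four-unit chain of neighbors and the data values below are hypothetical.

```python
def morans_i(values, w):
    """Moran's I for a list of values and an n x n weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                 # deviations from the mean
    s0 = sum(sum(row) for row in w)                # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical binary contiguity for four units arranged in a chain: 0-1-2-3.
w_chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_trend = morans_i([1.0, 2.0, 3.0, 4.0], w_chain)        # similar neighbors
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], w_chain)  # dissimilar neighbors
print(i_trend, i_alternating)
```

Similar values at neighboring locations give a positive statistic and alternating values a negative one. Repeating the computation with weights for second-order, third-order, and higher-order contiguity would yield the points of a spatial correlogram. Inference (for example, under a normal approximation or by permutation) is omitted from this sketch.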

Recently, attention has focused on local indicators of spatial association, or LISA. Following
the definition of Anselin (1995b), a LISA is an indicator that achieves two objectives: it allows
for the detection of significant patterns of local spatial association (i.e., association around
an individual location), and it can be used as a diagnostic for stability of a global diagnostic
(i.e., to assess the extent to which the global pattern of association is reflected uniformly
throughout the data set). Not all local statistics suggested in the literature fit the two re-
quirements. For example, based on the neighborhood perspective, the Gi and Gi* statistics
of Getis and Ord (1992) (see also Ord and Getis 1995) primarily satisfy the first objective,
while the pocket plot of Cressie (1991) (based on the distance perspective) and the Moran
scatterplot (Anselin 1997) are geared to the detection of local ‘‘pockets’’ of nonstationarity in
the variogram and Moran’s I, respectively. LISA statistics are particularly suited for visu-
alization on a map (for examples, see Anselin, Dodson, and Hudak 1993). In addition, a
judicious overlay of LISA maps for different variables may suggest the types of variables
that should be included in a spatial regression model (as indicated in figure 1) in the true
spirit of EDA. Whereas global indicators of spatial association, such as Moran’s I, have
recently been included as part of the statistical features in a few commercial GIS, this is not
the case for LISA statistics, even though they are much more relevant to ESDA.
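
The local Moran statistic gives a sense of how a LISA decomposes a global indicator. In the sketch below, each location's statistic is its standardized deviation multiplied by the weighted sum of its neighbors' deviations, and the local values add up to the global Moran's I times the sum of the weights. The chain of four units and the data are hypothetical, and the permutation-based significance assessment is left out.

```python
def local_moran(values, w):
    """Local Moran statistic for every location."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n  # second moment of the deviations
    return [
        (z[i] / m2) * sum(w[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]


# Hypothetical binary contiguity for four units arranged in a chain: 0-1-2-3.
w_chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
values = [1.0, 2.0, 3.0, 4.0]

local = local_moran(values, w_chain)
s0 = sum(sum(row) for row in w_chain)
global_from_local = sum(local) / s0  # recovers the global Moran's I
print(local, global_from_local)
```

Locations whose local statistic is far from the average of the local values are candidates for the local "pockets" of instability discussed above, and the list of local values can be joined to the areal units for display on a map.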

Confirmatory Spatial Data Analysis

Confirmatory spatial data analysis is model-driven and includes the full range of estimation
methods, specification tests, model validation procedures, and so on, necessary to implement
multivariate models for which the observations are cross-sectional and georeferenced. In
current GIS environments, the standard implementations of these techniques (to the extent
that they are present) are nonspatial in that the complicating effects of spatial autocorrelation
are typically ignored (Anselin and Getis 1992).

An explicit spatial regression approach consists of four main steps (see also figure 1): model
specification, estimation, diagnostics, and model prediction. The model specification is the
selection of variables to be included in the model and the functional form through which
they are related. When there are no strong prior theoretical foundations for the choice of
the model, the indications given by an exploratory analysis of the data (such as a series of
LISA statistics) can be extremely useful in this respect. Typically, a model is first estimated
without incorporating spatial effects, but the results of this estimation (and its residuals)
form the starting point for the diagnostics for spatial effects. Following the same distinction
as in the previous section, ideally these diagnostics aid in detecting and distinguishing be-
tween substantive and nuisance spatial autocorrelation, although not all popular diagnostics
satisfy this requirement (for a technical discussion, see Anselin 1988; Anselin and Florax
1995; Anselin and Hudak 1992).
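
The sequence of first estimating a nonspatial model and then examining its residuals can be sketched as follows. The example fits a one-variable least squares regression and computes Moran's I for the residuals. The data, the chain contiguity, and the function names are hypothetical. The statistic is shown only as a descriptive measure, since proper inference for regression residuals requires moments that differ from those for raw data and that are not implemented here.

```python
def ols_residuals(x, y):
    """Residuals from a least squares fit of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def morans_i(values, w):
    """Moran's I for a list of values and an n x n weights matrix w."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical data for six units in a chain; neighbors are adjacent indices.
n_units = 6
w_chain = [[1 if abs(i - j) == 1 else 0 for j in range(n_units)]
           for i in range(n_units)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0, 5.0, 5.0, 7.0, 11.0, 13.0]

residuals = ols_residuals(x, y)
i_resid = morans_i(residuals, w_chain)
print(residuals, i_resid)
```

A clearly nonzero value signals that something spatial is left in the residuals, but the statistic by itself does not indicate which form of dependence is responsible.
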
More precisely, when autocorrelation is present as a nuisance, it is formally incorporated
into the error structure of the regression model. Alternatively, when it is present in sub-
stantive form, it is incorporated as one of the explanatory variables of the model, as a so-
called spatially lagged dependent variable. More complex specifications are sometimes use-
ful as well, incorporating higher order forms of dependence with both error and lag
components. The estimation of such models must be carried out by means of specialized
methods, such as those based on the maximum likelihood principle or on the generalized method
of moments (for a technical discussion, see Anselin 1988, 1990b). In turn, the estimated
spatial models should be subject to diagnostic tests and model validation procedures until
the ‘‘best’’ model is selected.
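
The two basic specifications just described can be written compactly. The notation below is an assumption introduced here for illustration (it follows common usage in the spatial econometrics literature, e.g., Anselin 1988): y is an n by 1 vector of observations, X an n by k matrix of explanatory variables, W an n by n spatial weights matrix, and the error term is taken to be independent and identically distributed. The autoregressive form shown for the error model is one common choice of error structure, not the only one.

```latex
\begin{align}
  % Substantive dependence: spatially lagged dependent variable
  y &= \rho W y + X \beta + \varepsilon \\
  % Nuisance dependence: spatial autoregressive error structure
  y &= X \beta + u, \qquad u = \lambda W u + \varepsilon \\
  % Higher-order form with both lag and error components
  y &= \rho W_1 y + X \beta + u, \qquad u = \lambda W_2 u + \varepsilon
\end{align}
```

Setting both autoregressive coefficients to zero recovers the standard nonspatial regression, which is why a model estimated without spatial effects is a natural starting point for the diagnostics.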
In real estate analysis, the use of regression models is often restricted to the interpretation
of the significance and magnitude of the coefficients of variables of interest (e.g., variables
that represent particular policy instruments such as interest rates and regulatory controls).
In a GIS environment, however, the results of spatial regression analysis may also be use-
fully applied to create ‘‘predicted’’ values at locations or for areal units for which no obser-
vations are available. This is an important aspect of the overlay and interpolation operations
in the manipulation module of figure 1. This approach is most familiar in geostatistics, where
the kriging methodology provides an optimal predictor for values at unsampled points or
areas, based on the spatial relations estimated for a set of sampled points (for reviews of the
methodological issues, see Cressie 1988, 1989, 1991; examples of GIS applications are
treated in Mason, O’Conaill, and McKendrick 1994 and Oliver and Webster 1990).

In real estate analysis, the smooth continuous variation over space that underlies the geo-
statistical approaches is often less appropriate as a model. Instead, spatial prediction must
rely on the transposition of statistical relationships estimated at one level (or for one set of
areal units) to another level (or a different set of areal units). This issue is increasingly
encountered in policy analysis, in which a wide range of socioeconomic indicators must be
estimated (predicted) for any designated ‘‘target zone’’ (e.g., when the target zone boundaries
do not match census units). This problem is tackled by means of the methods of small area
estimation (for a review, see Ghosh and Rao 1994) and areal interpolation (see Flowerdew
and Green 1994; Goodchild, Anselin, and Deichmann 1993; Moxey and Allanson 1994).

Spatial Data Analysis in Real Estate Research


To illustrate some of the points just raised, I will take a closer look at three types of research
issues in applied econometric analysis of real estate markets, where the special nature of
spatial data is central. Through such an examination, the need to deal with the presence of
spatial autocorrelation and/or spatial heterogeneity as an essential part of any empirical
work that involves these research issues will become clear. The three examples consider
efficient survey design by means of spatial sampling, the detection of significant patterns
(clusters, outliers) using local indicators of spatial association (LISA), and the efficient es-
timation of hedonic models by means of spatial regression. While the last two are central to
the exploratory and confirmatory modules shown in figure 1, the first is auxiliary to the
selection function. It is included here to stress the close integration of the data analysis
functions (ESDA and spatial regression analysis) with the more traditional GIS spatial anal-
ysis functions.

Efficient Survey Design

In most empirical studies, the efficient design of surveys intended to obtain detailed infor-
mation on the behavior, actions, and intentions of individual actors is the starting point of
the statistical analysis. In many instances, federal and state regulations define the type of
information that is collected, which is increasingly available in digital and geocoded form.
Such databases are often very large and can sometimes be considered to contain the entire
‘‘population.’’ They thus may seem to make statistical inference unnecessary. However, they
often are too large or do not include particular items of interest to allow for meaningful and/
or cost-effective analysis. In such instances, a proper sampling procedure must be designed,
either to select records from the database or to complement it with additional information.

When dealing with georeferenced cross-sectional data, it is crucial to determine the extent
to which the sampled ‘‘observations’’ are spatially dependent. This is a familiar problem in
sampling from geographical units (see Cochrane 1963). For example, if one is interested in
explaining the effect of the existence of particular mortgage instruments or regulatory con-
trols on the loan application behavior of individual households, one may want to correct for
the influence of the neighbors’ actions. Since actors that are in spatial proximity often show
similar behavior (following the first law of geography), a degree of positive spatial auto-
correlation may be expected. A typical database of mortgage transactions may contain sev-
eral hundred thousand observations. Sampling households from this set without recognizing
the potential of a neighborhood effect would be inefficient. Hence, one may want to design
the sampling procedure so that only observations that are outside each other’s range of
spatial interaction are sampled, such as representative households from different neighbor-
hoods. In this case spatial autocorrelation is considered a nuisance. To include only the
desired observations, the sampling procedure should include analyzing the pattern of spatial
autocorrelation for a subset of the data, either for the variable of interest or for a proxy
variable. This information would then be used in a second stage to determine the needed
distance between sampled units or some other adjustment (as in the so-called DUST—de-
pendent areal units sequential technique—sampling procedure in Arbia 1993). This is easily
accomplished in a GIS environment by an interaction between the exploratory and selection
modules of figure 1.
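
As a rough illustration of the second stage, the sketch below draws a sample in which no two selected units lie closer together than a given distance. It is a simple greedy thinning rule, not the DUST procedure of Arbia (1993); the function name, the grid of household locations, and the cutoff of 3.0 distance units are all hypothetical. In practice the cutoff would come from the estimated range of spatial autocorrelation.

```python
import math
import random


def thinned_sample(points, min_dist, seed=0):
    """Greedy sample: visit points in random order and keep a point only
    if it is at least min_dist away from every point already kept."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    chosen = []
    for i in order:
        xi, yi = points[i]
        far_enough = all(
            math.hypot(xi - points[j][0], yi - points[j][1]) >= min_dist
            for j in chosen
        )
        if far_enough:
            chosen.append(i)
    return sorted(chosen)


# Hypothetical household locations on a 10 by 10 grid (projected coordinates).
households = [(float(x), float(y)) for x in range(10) for y in range(10)]
sample = thinned_sample(households, min_dist=3.0)
```

The random visiting order keeps the selection from favoring any particular corner of the study area; a different seed gives a different, equally valid sample.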

Alternatively, when one is more interested in the specific spatial range of interactive behav-
ior itself (e.g., adoption by households of a new loan type, or by banks of a new processing
technology)—that is, in the case of interest in substantive spatial dependence—one has to
make sure that sufficient spatially proximate observations are sampled (an example of this
is given in a study of the adoption of new harvesting techniques by Indonesian farmers in
Case 1992). Again, an understanding of the structure and strength of spatial autocorrelation
is a precondition for a proper design of the survey instrument.

Pattern Recognition (Clusters and Outliers)

Recent policy concerns regarding the availability of mortgage credit and its link to urban
and community development have focused on the identification of so-called underserved
areas. This involves the selection of an appropriate spatial scale of analysis (e.g., a census
block, block group, or census tract) and the identification of areal units or collections of areal
units that differ significantly from the others, showing lower than expected mortgage activity
(Can and Megbolugbe 1996).

The detection of significant spatial clusters (contiguous spatial units that are more alike—
showing high values or low values—than would be expected under spatial randomness) or
spatial outliers (spatial units that are more different from their neighbors—showing signifi-
cantly higher or lower values—than under spatial randomness) is an ideal application of
LISA. For example, using the local Moran statistics in combination with a Moran scatterplot
(Anselin 1995b, 1997), the type of spatial association of each unit with respect to its neigh-
bors can be categorized into two forms of positive association (high value surrounded by high
value; low value surrounded by low value) and two forms of negative association (high value
surrounded by low value; low value surrounded by high value). The latter are particularly
suited to detect so-called spatial outliers—locations that show low values and are sur-
rounded by neighbors with high values to such an extent that it cannot be attributed to
randomness. Outliers can indicate the presence of underserved areas or suggest other local
aberrations in the functioning of the market that merit further investigation in a multi-
variate context (see Anselin and Can 1995). Since LISA statistics are location-specific, they
can be easily visualized in a map and/or related to other variables in a spatial database.
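
The four-way classification can be sketched in a few lines. The code below computes a local Moran statistic for each unit, using row-standardized weights derived from a neighbor list, and assigns the unit to a quadrant of the Moran scatterplot. The function name, the neighbor-list format, and the toy values are assumptions made for illustration. The sketch stops short of inference: judging whether a statistic is significant requires a further step, such as a permutation approach, that is not shown.

```python
def local_moran(values, neighbors):
    """Return (statistic, quadrant) for each unit.

    values    -- list of observations, one per areal unit
    neighbors -- neighbors[i] is the list of units adjacent to unit i
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    m2 = sum(d * d for d in z) / n          # second moment, used for scaling
    results = []
    for i in range(n):
        nbrs = neighbors[i]
        # Spatial lag: average deviation among the neighbors of unit i.
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        stat = z[i] * lag / m2
        if z[i] >= 0 and lag >= 0:
            quad = "high-high"
        elif z[i] < 0 and lag < 0:
            quad = "low-low"
        elif z[i] >= 0:
            quad = "high-low"
        else:
            quad = "low-high"
        results.append((stat, quad))
    return results


# Five hypothetical tracts in a row; the middle one has low activity.
activity = [10.0, 9.0, 1.0, 9.0, 10.0]
adjacent = [[1], [0, 2], [1, 3], [2, 4], [3]]
lisa = local_moran(activity, adjacent)
```

In this toy example the middle tract falls in the low-high quadrant with a negative statistic, the pattern associated with a potential underserved area.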

Estimation of Hedonic Models

Hedonic models of house prices are a mainstay of empirical research in real estate analysis.
Spatial effects, though typically ignored, are relevant in two respects. On the one hand,
neighborhood characteristics that cannot be captured in the explanatory variables of the
model tend to affect transactions in geographic proximity in a similar manner, causing spa-
tial autocorrelation in the error term of the hedonic regression (see Can 1992a; Dubin 1992).
Alternatively, transactions occurring near each other may exhibit an adjacency effect, which
could be incorporated as a spatial lag in the model (see Can and Megbolugbe 1997). In both
instances, the application of spatial econometric estimation methods and specification tests
for spatial dependence is needed, which requires a close interaction with the topology build-
ing functions in a GIS. This interface is further considered in the next section.
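
A first check for spatial dependence in a hedonic regression is often Moran's I computed on the residuals. The sketch below shows the computation of the statistic only; the residual values and the chain-like contiguity structure are hypothetical, and the function name is an assumption. Proper inference for regression residuals requires moments that account for the estimated coefficients, which this sketch does not attempt.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a full weights matrix
    (list of lists, weights[i][j] > 0 when i and j are neighbors)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)   # sum of all weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in z)


def chain_weights(n):
    """Binary contiguity for n units arranged in a line."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


# Hypothetical residuals for six transactions along a street:
# similar signs cluster together, suggesting an omitted neighborhood effect.
residuals = [1.0, 0.8, 0.9, -1.1, -0.7, -0.9]
w = chain_weights(6)
i_clustered = morans_i(residuals, w)
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], w)
```

Clustered residuals yield a positive value, whereas the strictly alternating pattern yields a negative one.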

Operational Integration of Spatial Data Analysis and GIS

From the previous discussion it should be clear that no current commercially available GIS
contains the range of functions required for a state-of-the-art analysis of spatial aspects of
real estate markets. Similarly, commercial statistical and econometric software lacks the data
storage and manipulation functions expected for spatial data handling. Although there are
increasing commercial efforts to build more effective links between the two types of software,
they are likely to remain deficient. The incorporation of state-of-the-art spatial statistical
methods requires considerable effort, and it is not at all clear that this can be accomplished
in a profit-driven and highly competitive environment. For example, the methods themselves
are constantly improving and changing, programmers and support personnel proficient in
the methodology are scarce (and expensive), and the market is small and quite specialized.
Hence, there is a tendency either to provide generic development tools, putting most of the
burden of application on the user, or to limit the range of techniques to a lowest common
denominator, which is unlikely to be satisfactory for state-of-the-art analysis.

Partially in response to the lack of off-the-shelf solutions, the integration of spatial statistical
routines with GIS software has received a great deal of attention in the research community.
Early attention focused on the type of integration, such as whether it should be closely or
loosely coupled, embedded or modular (see, e.g., Anselin, Dodson, and Hudak 1993; Anselin
and Getis 1992; Goodchild, Haining, and Wise 1992; Maguire 1995), and which techniques
would be most appropriate for inclusion (see Bailey 1994). Paralleling these conceptual dis-
cussions, a number of efforts were aimed at developing operational (albeit mostly proto-
typical) spatial analysis functionality within a commercial GIS environment, such as Ding
and Fotheringham (1992), Batty and Xie (1994a, 1994b), and Bao et al. (1995). These efforts
tended to be very specialized and used the available macro languages and script facilities of
the GIS in combination with specialized routines executing outside the GIS.

The development of customized spatial analysis functionality as part of or linked with a
commercial GIS environment is likely to be the only practical way to implement state-of-
the-art methods of spatial statistics and spatial econometrics for some time to come. There
are a few generic issues that need to be addressed in such an implementation, concerning
the way the spatial information is transferred between the GIS operations and the statistical
analysis and vice versa. I turn to these now, first in general terms and then illustrated with
specific examples of the integration of the software packages ArcView and SpaceStat, which
are specifically geared to ESDA.

Transfer of Spatial Information

Anselin, Dodson, and Hudak (1993) discuss in some detail several issues pertaining to the
operational interface between a GIS and a spatial data analysis module. The most interest-
ing aspect is how the spatial information needed to construct spatial weights and distance
matrices (to compute indicators of spatial autocorrelation, estimate spatial regression mod-
els, or compute a variogram) can be effectively extracted from the GIS for use in spatial data
analysis. I focus on this more closely here and extend the earlier framework. The information
transfer between the data analysis module and the GIS for mapping and storing is straight-
forward and should only deal with those statistics and results (e.g., LISA, regression resid-
uals) that are location-specific (see Anselin, Dodson, and Hudak 1993 for details and illus-
tration).

Spatial objects are stored in a vector-based GIS as points, lines, and polygons (areas) and
in a raster-based GIS as cells. For real estate analysis, points and polygons are most rele-
vant. For example, the former could represent the locations of house sales transactions while
the latter could pertain to census blocks or tracts. Points are stored as x, y coordinates
(latitude and longitude, or some projected map coordinates), whereas areas are defined as a
closed series of line segments between points. For the purposes of spatial analysis, the dis-
tinction between points and areas is not crucial because a set of points can be converted to
a system of areal units by means of a tessellation, and polygons can be reduced to their
centroids.
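
Reducing a polygon to its centroid is a short computation once the boundary coordinates are available. The sketch below uses the standard area-weighted formula for a simple (non-self-intersecting) polygon in planar coordinates; the function name and the example tract are hypothetical. It assumes projected map coordinates rather than latitude and longitude, and a polygon with nonzero area.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon.

    ring -- list of (x, y) vertices in order around the boundary;
            the first vertex need not be repeated at the end.
    """
    area2 = 0.0   # twice the signed area
    cx = 0.0
    cy = 0.0
    n = len(ring)
    for k in range(n):
        x0, y0 = ring[k]
        x1, y1 = ring[(k + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


# Hypothetical square tract with corners at (0, 0) and (2, 2).
tract = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
center_x, center_y = polygon_centroid(tract)
```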

Figure 2. Transfer of Spatial Information

A general framework to move from points and polygons to the distance and spatial weights
needed in spatial data analysis is sketched in figure 2. The simplest method is to deal with
point data for which a distance matrix (using Euclidean, great circle, or network distance)
is easily computed. This distance matrix can then be used to select observation pairs in given
distance bands, needed for a variogram, or can be converted into a binary (zero-one) distance-
based contiguity matrix. In the latter, two locations are considered to be neighbors if they
are within a given distance from each other. In practice, this is accomplished by applying a
mask to the distance matrix.
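
The masking step can be sketched as follows for Euclidean distance. The function names, the four sale locations, and the cutoff of 2.0 units are hypothetical; great circle or network distance would require a different distance function but the same mask.

```python
import math


def distance_matrix(points):
    """Full matrix of Euclidean distances between (x, y) points."""
    return [
        [math.hypot(xa - xb, ya - yb) for (xb, yb) in points]
        for (xa, ya) in points
    ]


def distance_band_weights(dist, cutoff):
    """Binary contiguity: 1 when two distinct locations lie within the
    cutoff distance of each other, 0 otherwise (including the diagonal)."""
    n = len(dist)
    return [
        [1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical house sale locations in projected coordinates.
sales = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
d = distance_matrix(sales)
w_band = distance_band_weights(d, cutoff=2.0)
```

With this cutoff the fourth sale has no neighbors at all. Such isolated observations deserve attention in practice, since they leave an empty row in the weights matrix.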

The technical difficulty in constructing a contiguity matrix for polygons depends on the data
structure and data model used in the GIS. For raster-based GIS, a simple scanning of raster
elements that contain polygon identifiers is sufficient (see Anselin, Hudak, and Dodson 1993,
appendix 4, for technical details). In vector-based GIS, the difficulty is a function of whether
the topology of the areal units is contained in the database. In most commercial-grade GIS
systems, this is the case, but in many desktop mapping ‘‘pseudo-GIS,’’ boundaries are stored
as so-called spaghetti files, without any topological information attached. When the topology
is present, as in the Arc/Node data structures used in ARC/INFO and GISPlus, it is relatively
straightforward to derive the neighborhood structure from the left-right polygons associated
with each arc (for technical details, see Anselin, Hudak, and Dodson 1993, appendix 4; Can
1996). When the topology is not present, it needs to be constructed by a series of searches
and comparisons to match x, y coordinates in the boundary sets of pairs of polygons. Although
tedious and time consuming, this is not a complex procedure.
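
For the nontopological case, the search-and-compare procedure might look like the sketch below, which treats two polygons as neighbors when their boundary sets share coordinates. It assumes that common boundaries were digitized with exactly identical x, y values, which real data may not satisfy without snapping or a tolerance. The function name, the `min_shared` parameter, and the three square polygons are hypothetical. Requiring two shared vertices only approximates a shared edge; requiring one also admits polygons that touch at a corner.

```python
from itertools import combinations


def contiguity_from_boundaries(polygons, min_shared=2):
    """Derive a neighbor structure by matching boundary coordinates.

    polygons   -- dict mapping polygon id to its list of (x, y) vertices
    min_shared -- number of common vertices required to call two
                  polygons neighbors
    """
    vertex_sets = {pid: set(coords) for pid, coords in polygons.items()}
    neighbors = {pid: set() for pid in polygons}
    # Compare every pair of polygons once.
    for a, b in combinations(polygons, 2):
        if len(vertex_sets[a] & vertex_sets[b]) >= min_shared:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


# Three hypothetical unit squares: B is east of A, C is north of B.
blocks = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(1, 1), (2, 1), (2, 2), (1, 2)],
}
edge_neighbors = contiguity_from_boundaries(blocks, min_shared=2)
corner_neighbors = contiguity_from_boundaries(blocks, min_shared=1)
```

The pairwise comparison grows with the square of the number of polygons, which is one reason the procedure is tedious for large data sets; a spatial index or bounding-box prefilter would reduce the number of comparisons.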

Once a first-order contiguity weights matrix is obtained, it can be used to construct higher-
order contiguity weights. A practical issue related to this is the storage requirement asso-
ciated with the weights. Since the dimension of the matrix equals the number of observa-
tions, this requires considerable computer storage for large data sets (N by N elements);
hence the need to use sparse data formats and efficient algorithms (Anselin and Smirnov
1996). Other transformations of the weights that are needed for the data analysis operations
are row standardization and the computation of eigenvalues (for technical details, see An-
selin and Hudak 1992).
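
Two of these transformations can be sketched with a sparse, neighbor-list representation that stores only the nonzero entries. The function names and the four-unit example are hypothetical. The second-order routine simply drops a unit itself and its first-order neighbors from the set reached in two steps; it illustrates the idea and is not the algorithm of Anselin and Smirnov (1996).

```python
def row_standardize(neighbors):
    """Convert neighbor sets to weights that sum to one in each row."""
    return {
        i: {j: 1.0 / len(nbrs) for j in nbrs}
        for i, nbrs in neighbors.items()
    }


def second_order(neighbors):
    """Units reachable in exactly two steps: neighbors of neighbors,
    excluding the unit itself and its first-order neighbors."""
    result = {}
    for i, first in neighbors.items():
        reach = set()
        for j in first:
            reach |= neighbors[j]
        result[i] = reach - first - {i}
    return result


# Four hypothetical units arranged in a line: 0 - 1 - 2 - 3.
first_order = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
w_std = row_standardize(first_order)
w2 = second_order(first_order)
```

Stored this way, the weights take space proportional to the number of neighbor pairs rather than to N by N elements.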

SpaceStat and ArcView

An operational implementation of an integrated system for ESDA, using SpaceStat Version
1.80 (Anselin 1995a) and ArcView Version 2.1 (ESRI 1995b), is outlined in Anselin and Bao
(1997). The translation of the general framework in figure 2 into the specific context of these
two software packages is illustrated in figure 3. In ArcView, spatial data are contained in a
so-called shape file, which is nontopological (a technical description of the shape file struc-
ture is given in ESRI 1995c). To build the spatial weights for any selected set of observations,
the topology must be built, which is a two-step procedure. First, the data in the shape file
are converted into ASCII format using a customized C program (shape to boundary). In the
second step, this file is read by SpaceStat and manipulated in the Tools-GIS module either
to construct a weights file directly (in sparse format) or to compute centroids, which can
then be converted into a distance-based spatial weights matrix in the Tools-Distance Weights
module. Once spatial weights are available, a range of global and local spatial autocorrela-
tion statistics can be computed in SpaceStat’s Explore module.

Figure 3. Transfer of Spatial Information between ArcView and SpaceStat

The feedback of results from the statistical analysis in SpaceStat to ArcView for mapping
and visualization is a two-step process. The output from SpaceStat is contained in an ASCII
comma delimited report file that can be added to an ArcView project by means of the ‘‘add
table’’ command. The table contains an identifier variable that matches the identifier in the
base table for a current ‘‘view.’’ After the report file table is joined with the attribute table
for the view, the relevant themes (e.g., significant LISA statistics) can be mapped or other
charts can be constructed. This is accomplished in a near-seamless way by means of Avenue
scripts in ArcView (for technical details, see Anselin and Bao 1997).
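
The join itself amounts to matching records on the shared identifier. The sketch below illustrates that logic outside of either package: it is not an Avenue script, and the column names (`TRACT_ID`, `LISA`, `P_VALUE`, `NAME`) and the report contents are hypothetical rather than the actual SpaceStat report layout.

```python
import csv
import io


def join_on_id(base_rows, report_rows, key):
    """Left join: attach report fields to each base record that has a
    matching identifier; unmatched base records are kept unchanged."""
    lookup = {row[key]: row for row in report_rows}
    joined = []
    for row in base_rows:
        merged = dict(row)
        merged.update(lookup.get(row[key], {}))
        joined.append(merged)
    return joined


# Hypothetical comma delimited report with one line per significant unit.
report_text = "TRACT_ID,LISA,P_VALUE\n101,1.82,0.01\n103,-0.95,0.04\n"
report = list(csv.DictReader(io.StringIO(report_text)))

# Hypothetical attribute table for the current view.
attributes = [
    {"TRACT_ID": "101", "NAME": "A"},
    {"TRACT_ID": "102", "NAME": "B"},
    {"TRACT_ID": "103", "NAME": "C"},
]
joined = join_on_id(attributes, report, key="TRACT_ID")
```

Keeping every record from the base table, whether or not it has a match, means that units without a reported statistic still appear on the map.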

Conclusions

In this article, I have outlined in broad terms the type of research infrastructure that is
needed to complement existing GIS environments to perform state-of-the-art spatial anal-
ysis of real estate markets. I have stressed two aspects of this research infrastructure in
particular. One is the scope and relevance of spatial econometric and spatial statistical meth-
ods for real estate analysis. The other is the operational setting in which the techniques of
analysis can be implemented. There are currently no off-the-shelf solutions for a broad
range of problems encountered in real estate policy analysis and business applications. In-
stead, there is a continuing need to develop customized interfaces between existing GIS
capabilities and software applications to carry out spatial data analysis.

References

Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Dordrecht, Netherlands: Kluwer Ac-
ademic.

Anselin, Luc. 1990a. What Is Special about Spatial Data? Alternative Perspectives on Spatial Data
Analysis. In Spatial Statistics: Past, Present, and Future, ed. Daniel A. Griffith, 63–77. Ann Arbor, MI:
Institute of Mathematical Geography.

Anselin, Luc. 1990b. Some Robust Approaches to Testing and Estimation in Spatial Econometrics.
Regional Science and Urban Economics 20:141–63.

Anselin, Luc. 1992a. SpaceStat, a Program for the Analysis of Spatial Data. Santa Barbara: University
of California, Santa Barbara, National Center for Geographic Information and Analysis.

Anselin, Luc. 1992b. Spatial Data Analysis with GIS: An Introduction to Application in the Social
Sciences. Technical Report 92–10. Santa Barbara: University of California, Santa Barbara, National
Center for Geographic Information and Analysis.

Anselin, Luc. 1994. Exploratory Spatial Data Analysis and Geographic Information Systems. In New
Tools for Spatial Analysis, ed. Marco Painho, 45–54. Luxembourg: Eurostat.

Anselin, Luc. 1995a. SpaceStat Version 1.80 User’s Guide. Morgantown, WV: West Virginia University,
Regional Research Institute.

Anselin, Luc. 1995b. Local Indicators of Spatial Association: LISA. Geographical Analysis 27:93–115.

Anselin, Luc. 1997. The Moran Scatterplot as an ESDA Tool to Assess Local Instability in Spatial
Association. In Spatial Analytical Perspectives on GIS, ed. Manfred Fischer, Henk Scholten, and David
Unwin, 111–25. London: Taylor and Francis.

Anselin, Luc, and Shuming Bao. 1997. Exploratory Spatial Data Analysis Linking SpaceStat and
ArcView. In Recent Developments in Spatial Analysis, ed. Manfred Fischer and Arthur Getis, 35–59.
Berlin: Springer.

Anselin, Luc, and Ayşe Can. 1986. Model Comparison and Model Validation Issues in Empirical Work
on Urban Density Functions. Geographical Analysis 18:179–97.

Anselin, Luc, and Ayşe Can. 1995. Spatial Effects in Models of Mortgage Origination. Paper presented
at the 91st annual meeting of the Association of American Geographers, March 14–18, Chicago.

Anselin, Luc, Rustin Dodson, and Sheri Hudak. 1993. Linking GIS and Spatial Data Analysis in Prac-
tice. Geographical Systems 1:3–23.

Anselin, Luc, and Raymond Florax. 1995. Small Sample Properties of Tests for Spatial Dependence in
Regression Models: Some Further Results. In New Directions in Spatial Econometrics, ed. Luc Anselin
and Raymond Florax, 21–74. Berlin: Springer.

Anselin, Luc, and Arthur Getis. 1992. Spatial Statistical Analysis and Geographic Information Sys-
tems. Annals of Regional Science 26:19–33.

Anselin, Luc, and Daniel A. Griffith. 1988. Do Spatial Effects Really Matter in Regression Analysis?
Papers of the Regional Science Association 65:11–34.

Anselin, Luc, and Sheri Hudak. 1992. Spatial Econometrics in Practice: A Review of Software Options.
Regional Science and Urban Economics 22:509–36.

Anselin, Luc, Sheri Hudak, and Rustin Dodson. 1993. Spatial Data Analysis and GIS: Interfacing GIS
and Econometric Software. Technical Report 93–7. Santa Barbara: University of California, Santa
Barbara, National Center for Geographic Information and Analysis.

Anselin, Luc, and Oleg Smirnov. 1996. Efficient Algorithms for Constructing Proper Higher Order
Spatial Lag Operators. Journal of Regional Science 36:67–89.

Arbia, Giuseppe. 1993. The Use of GIS in Spatial Statistical Surveys. International Statistical Review
61:339–59.

ARCNews. 1995. GeoData Product Will Transform Home Buying: ESRI Builds National GIS for the
Real Estate Industry. Winter.

Bailey, Trevor C. 1994. A Review of Statistical Spatial Analysis in Geographical Information Systems.
In Spatial Analysis and GIS, ed. Stewart Fotheringham and Peter Rogerson, 13–44. London: Taylor
and Francis.

Bailey, Trevor C., and Anthony C. Gatrell. 1995. Interactive Spatial Data Analysis. Harlow: Longman
Scientific and Technical.

Bao, Shuming, Mark Henry, David Barkley, and Kerry Brooks. 1995. RAS: A Regional Analysis System
Integrated with ARC/INFO. Computers, Environment, and Urban Systems 18:37–56.

Batty, Michael, and Yichun Xie. 1994a. Modelling inside GIS, Part I: Model Structures, Exploratory
Spatial Data Analysis, and Aggregation. International Journal of Geographical Information Systems
8:291–307.

Batty, Michael, and Yichun Xie. 1994b. Modelling inside GIS, Part II: Selecting and Calibrating Urban
Models Using ARC/INFO. International Journal of Geographical Information Systems 8:451–70.

Benirschka, Martin, and James K. Binkley. 1994. Land Price Volatility in a Geographically Dispersed
Market. American Journal of Agricultural Economics 76:185–95.

Besley, Timothy, and Anne Case. 1995. Incumbent Behavior: Vote-Seeking, Tax-Setting, and Yardstick
Competition. American Economic Review 85:25–45.

Bivand, Roger. 1992. Systat Compatible Software for Modeling Spatial Dependence among Observa-
tions. Computers and Geosciences 18:951–63.

Buist, Henry, Isaac F. Megbolugbe, and Tina R. Trent. 1994. Racial Homeownership Patterns, the
Mortgage Market, and Public Policy. Journal of Housing Research 5:91–116.

Burrough, Peter. 1986. Principles of Geographical Information Systems for Land Resources Assessment.
Oxford, England: Oxford University Press.

Can, Ayşe. 1990. The Measurement of Neighborhood Dynamics in Urban House Prices. Economic
Geography 66:254–72.

Can, Ayşe. 1992a. Specification and Estimation of Hedonic Housing Price Models. Regional Science
and Urban Economics 22:453–74.

Can, Ayşe. 1992b. Residential Quality Assessment: Alternative Approaches Using GIS. The Annals of
Regional Science 26:97–110.

Can, Ayşe. 1996. Weight Matrices and Spatial Autocorrelation Statistics Using a Topological Vector
Data Model. International Journal of Geographical Information Systems 10:1009–17.

Can, Ayşe, and Isaac F. Megbolugbe. 1996. The Geography of Underserved Mortgage Markets. Paper
presented at the 1996 midyear meeting of the American Real Estate and Urban Economics Association,
May 29, Washington, DC.

Can, Ayşe, and Isaac F. Megbolugbe. 1997. Spatial Dependence and House Price Index Construction.
Journal of Real Estate Finance and Economics 14:203–22.

Case, Anne. 1992. Neighborhood Influence and Technological Change. Regional Science and Urban
Economics 22:491–508.

Case, Anne, James R. Hines, and Harvey S. Rosen. 1993. Budget Spillovers and Fiscal Policy Inter-
dependence: Evidence from the States. Journal of Public Economics 52:285–307.

Cliff, Andrew, and J. Keith Ord. 1981. Spatial Processes: Models and Applications. London: Pion.

Cochrane, B. D. 1963. Sampling Techniques. London: Wiley.

Cressie, Noel. 1988. Spatial Prediction and Ordinary Kriging. Mathematical Geology 20:405–21.

Cressie, Noel. 1989. The Many Faces of Spatial Prediction. In Geostatistics, Vol. 1, ed. M. Armstrong,
163–76. Dordrecht, Netherlands: Kluwer Academic.

Cressie, Noel. 1991. Statistics for Spatial Data. New York: Wiley.

Densham, Paul. 1991. Spatial Decision Support Systems. In Geographical Information Systems: Prin-
ciples and Applications, ed. David J. Maguire, Michael F. Goodchild, and David W. Rhind, 403–12.
London: Longman.

Ding, Yueming, and Stewart Fotheringham. 1992. The Integration of Spatial Analysis and GIS. Com-
puters, Environment, and Urban Systems 16:3–19.

Dubin, Robin A. 1988. Spatial Correlation. Review of Economics and Statistics 70:466–74.

Dubin, Robin A. 1992. Spatial Autocorrelation and Neighborhood Quality. Regional Science and Urban
Economics 22:433–52.

Environmental Systems Research Institute. 1995a. Understanding GIS, the ARC/INFO Method. Red-
lands, CA.

Environmental Systems Research Institute. 1995b. ArcView, the Geographic Information System for
Everyone. Redlands, CA.

Environmental Systems Research Institute. 1995c. ArcView Version 2 Shapefile Technical Description:
White Paper. Redlands, CA.

Fischer, Manfred F., and Peter Nijkamp. 1993. Geographic Information Systems, Spatial Modelling,
and Policy Evaluation. Berlin: Springer.

Flowerdew, Robin, and Mick Green. 1994. Areal Interpolation and Types of Data. In Spatial Analysis
and GIS, ed. Stewart Fotheringham and Peter Rogerson, 121–45. London: Taylor and Francis.

Fotheringham, Stewart, and Martin Charlton. 1994. GIS and Exploratory Spatial Data Analysis: An
Overview of Some Research Issues. Geographical Systems 1:315–27.

Fotheringham, Stewart, and Peter Rogerson. 1994. Spatial Analysis and GIS. London: Taylor and
Francis.

Getis, Arthur, and J. Keith Ord. 1992. The Analysis of Spatial Association by Use of Distance Statistics.
Geographical Analysis 24:189–206.

Ghosh, M., and J. N. K. Rao. 1994. Small Area Estimation: An Appraisal. Statistical Science 9:55–93.

Good, I. J. 1983. The Philosophy of Exploratory Data Analysis. Philosophy of Science 50:283–95.

Goodchild, Michael F. 1987. A Spatial Analytical Perspective on Geographic Information Systems.
International Journal of Geographical Information Systems 1:327–34.

Goodchild, Michael F. 1992. Geographical Information Science. International Journal of Geographical
Information Systems 6:31–45.

Goodchild, Michael F., Luc Anselin, and Uwe Deichmann. 1993. A Framework for the Areal Interpo-
lation of Socioeconomic Data. Environment and Planning A 25:383–97.

Goodchild, Michael F., and Sucharita Gopal. 1989. Accuracy of Spatial Databases. London: Taylor and
Francis.

Goodchild, Michael F., Robert Haining, and Stephen Wise. 1992. Integrating GIS and Spatial Data
Analysis: Problems and Possibilities. International Journal of Geographical Information Systems
6:407–23.

Griffith, Daniel A. 1981. Modeling Urban Population Density in a Multi-Centered City. Journal of
Urban Economics 9:298–310.

Griffith, Daniel A. 1993. Spatial Regression Analysis on the PC: Spatial Statistics Using SAS. Wash-
ington, DC: Association of American Geographers.

Haining, Robert. 1989. Geography and Spatial Statistics: Current Positions, Future Developments. In
Remodelling Geography, ed. Bill Macmillan, 191–203. Oxford, England: Blackwell.

Haining, Robert. 1990. Spatial Data Analysis in the Social and Environmental Sciences. Cambridge,
England: Cambridge University Press.

Haslett, John, Ronan Bradley, Peter Craig, Antony Unwin, and Graham Wills. 1991. Dynamic Graphics
for Exploring Spatial Data with Applications to Locating Global and Local Anomalies. American Stat-
istician 45:234–42.

Haslett, John, Graham Wills, and Antony Unwin. 1990. Spider: An Interactive Statistical Tool for the
Analysis of Spatially Distributed Data. International Journal of Geographical Information Systems
4:285–96.

Holtz-Eakin, Douglas. 1994. Public-Sector Capital and the Productivity Puzzle. Review of Economics
and Statistics 76:12–21.

Legendre, Pierre. 1993. Spatial Autocorrelation: Trouble or New Paradigm? Ecology 74:1659–73.

Maguire, David J. 1991. An Overview and Definition of GIS. In Geographical Information Systems,
Principles and Applications, ed. David J. Maguire, Michael F. Goodchild, and David W. Rhind, 9–20.
London: Longman.

Maguire, David J. 1995. Implementing Spatial Analysis and GIS Applications for Business and Service
Planning. In GIS for Business and Service Planning, ed. Paul Longley and Graham Clarke, 171–91.
Cambridge, England: Geoinformation International.

Mason, D. C., M. O’Conaill, and I. McKendrick. 1994. Variable Resolution Block Kriging Using a Hi-
erarchical Spatial Data Structure. International Journal of Geographical Information Systems 8:429–
49.

Monmonier, Mark. 1989. Geographic Brushing: Enhancing Exploratory Analysis of the Scatterplot
Matrix. Geographical Analysis 21:81–84.

Moxey, Andrew, and Paul Allanson. 1994. Areal Interpolation of Spatially Extensive Variables: A Com-
parison of Alternative Techniques. International Journal of Geographical Information Systems 8:479–
87.

Murdoch, James C., Morteza Rahmatian, and Mark A. Thayer. 1993. A Spatially Autoregressive Me-
dian Voter Model of Recreation Expenditures. Public Finance Quarterly 21:334–50.

Oliver, Margaret A., and Richard Webster. 1990. Kriging: A Method of Interpolation for Geographical
Information Systems. International Journal of Geographical Information Systems 4:313–32.

Openshaw, Stan. 1991. Developing Appropriate Spatial Analysis Methods for GIS. In Geographical
Information Systems, Principles and Applications, ed. David J. Maguire, Michael F. Goodchild, and
David W. Rhind, 389–402. London: Longman.

Ord, J. Keith, and Arthur Getis. 1995. Local Spatial Autocorrelation Statistics: Distributional Issues
and Applications. Geographical Analysis 27:286–306.

Quigley, John. 1979. What Have We Learned about Housing Markets? In Current Issues in Urban
Economics, ed. Peter Mieszkowski and Mahlon Straszheim, 391–429. Baltimore: Johns Hopkins Uni-
versity Press.

Tobler, Waldo. 1979. Cellular Geography. In Philosophy in Geography, ed. Steven Gale and Gunnar
Olsson, 379–86. Dordrecht, Netherlands: Reidel.

Tukey, John. 1977. Exploratory Data Analysis. Reading, MA: Addison-Wesley.

Upton, Graham, and Bernard Fingleton. 1985. Spatial Data Analysis by Example. New York: Wiley.

Venables, W. N., and B. D. Ripley. 1994. Modern Applied Statistics with S-Plus. New York: Springer.

Veregin, Howard. 1995. Developing and Testing of an Error Propagation Model for GIS Overlay Opera-
tions. International Journal of Geographical Information Systems 9:595–619.
