Archaeological Spatial Analysis - A Methodological Guide-Routledge (2020)
Effective spatial analysis is an essential element of archaeological research; this book is a unique guide
to choosing the appropriate technique, applying it correctly and understanding its implications both
theoretically and practically.
Focusing upon the key techniques used in archaeological spatial analysis, this book provides the
authoritative, yet accessible, methodological guide to the subject that has thus far been missing
from the corpus. Each chapter opens with a richly referenced introduction to the particular technique,
followed by a detailed description of the methodology, then an archaeological case study illustrating
the application of the technique, and conclusions that point to the implications and potential of the
technique within archaeology.
The book is designed to function as the main textbook for archaeological spatial analysis courses at
undergraduate and postgraduate level, while its user-friendly structure also makes it suitable for
self-learning by archaeology students as well as researchers and professionals.
Piraye Hacıgüzeller is a senior postdoctoral researcher at the Ghent Centre for Digital Humanities and the
Archaeology Department of Ghent University. Her research interests are the theory and practice of digital
archaeology and, more generally, the digital humanities, particularly geospatial data visualisation,
management and analysis. She is the co-editor of a recent book on archaeological mapping, Re-mapping
Archaeology: Critical Perspectives, Alternative Mappings (Routledge, 2018).
Gary Lock is Emeritus Professor of Archaeology at the University of Oxford where he has spent 35 years
teaching and researching several areas of archaeology. One of his specialisms is the British Iron Age, especially
hillforts, and he was Co-PI of the Atlas of Hillforts of Britain and Ireland. His other main area of interest is
computer applications in archaeology, especially GIS and spatial archaeology, in which he has published several
books. He has recently retired as Chair of the Computer Applications in Archaeology conference.
Archaeological Spatial Analysis
A Methodological Guide
Piraye
To Mike Fletcher who sparked my interest in statistics many years ago and to Jude for
continuing support and love.
Gary
Contents
List of figures x
List of tables xxiv
List of contributors xxvi
3 Spatial sampling 41
Edward B. Banning
5 Percolation analysis 77
M. Simon Maddison
Index 475
Figures
2.1 Sources and types of errors in data collection and compilation, data processing and data
usage that result in final global error, adapted from Hunter and Beard (1992). 19
2.2 The archaeological workflow in terms of a computational pipeline from data
acquisition to unpublished data, and re-use. Problems of quality impact data in each
stage. Black boxes in this workflow occur wherever archaeologists employ software
and tools whose code is unavailable for review and modification and that do not enable
documentation of transformations. 20
2.3 Parsing options in OpenRefine for a file in Keyhole Markup Language (KML). 28
2.4 A spreadsheet-style interface in OpenRefine that shows information in columns. 29
2.5 The cleaned version of the file ready with coordinates for mapping. 29
2.6 The cleaning sequence or ‘recipe’ for converting KML into comma-separated value (CSV)
format. The code can be exported, modified and re-used. 30
2.7 An overview of the location and estimated size of the study area in Saint-Pierre, France. 31
2.8 A map showing points from two surveys collected with a total station. The location of
the total station, or origin, is represented as a star; the survey of archaeological
features on the surface is marked in green, and the survey of topography in brown. 32
2.9 The survey points overlaid on a scanned map that is geo-rectified to WGS-UTM 21.
A Python script was developed to enable rotation and transformation of points in
a local coordinate system to a global coordinate system (UTM) using two known
coordinate pairs. 33
3.1 Examples of random and systematic spatial samples using points, rectangles, and
transects as the sample elements. (a) random point sample, (b) systematic transect
sample (walking north), and (c) systematic, stratified, unaligned sample of small squares. 44
3.2 Example of a random Probability Proportional to Size (PPS) sample of agricultural
fields used as sampling elements. Any field that contains one or more of the random
points is included in the sample (hatched). Note how larger fields are over-represented,
but this may have practical advantages in fieldwork in terms of survey costs. 46
3.3 Map of the survey region of the Ayl to Ras an-Naqab Survey in southern Jordan, with
three strata and 500 m × 500 m sample elements. 52
3.4 Map of a portion of the Wadi Quseiba survey region in northern Jordan, showing the
ephemeral stream channels and the population of landscape elements or “polygons”
that constituted the sampling frame for Stratum 2 of this survey. 54
3.5 Decline in the Relative Standard Error (RSE) of micro-refuse counts with increasing
sample size in the use of sequential sampling in Wadi Ziqlab. Sampling stopped after
the three-point slope was less than 0.03 for three consecutive measures of RSE. 56
4.1 Examples of the first-order spatial intensity of a point pattern and its summaries: (a) a
random point distribution (n = 100, the study area is notionally 10 × 10 map units
in size), (b) a quadrat count of the same, (c) the histogram of observed quadrat counts
and the expected Poisson distribution if the pattern is random, (d) an inhomogeneous
point distribution where the intensity of points is higher in the top-right corner,
(e) a quadrat count of the same, (f) the histogram of observed quadrat counts and the
expected Poisson distribution if the pattern is random, (g) kernel density estimate of
the inhomogeneous pattern in (d) using a Gaussian kernel with a standard deviation
of 0.5 map units, (h) the same, but with a kernel standard deviation of 1 map unit, and
(i) the same, but with a kernel standard deviation of 2 map units. 62
4.2 Three hypothetical distributions (a–c) and how they manifest as K functions (d–f) and
pair correlation functions (g–i), the x-axes in (d–i) are in metres and refer to the radius of
the circles around each point within which the respective K or Pair Correlation Function
(PCF) statistic is calculated; the critical envelope encompasses 95% of 999 simulations. 64
4.3 Multi-scale second-order effects: (a) a simulated process with small-scale regularity
and medium-scale clustering; (b) the pair correlation function of (a) (the x-axis is in
metres and refers to the radius of the circles around each point within which the pair
correlation statistic [y-axis] is calculated; the critical envelope encompasses 95% of 999
simulations). 66
4.4 Mapping the spatial probability of a point subset: (a) a hypothetical example of 200 points
that combines the random and inhomogeneous point patterns from Figures 4.1 (a, d).
(b) the local spatial probability (out of 1) of finding a point from the inhomogeneous
surface in Figure 4.1d, with a much higher probability to the top-right, (c) UK Portable
Antiquities Scheme data showing Iron Age gold and silver coins of ‘Dobunni’ style,
and (d) the local spatial probability of finding gold coins with an area of much higher
probability on the western borders. 68
4.5 Archaeological survey evidence from the Greek island of Antikythera: (a) individual
houses and field huts of approximately 19th-early 20th century AD date, (b) surface
pottery of approximately 19th-early 20th century AD date collected during
fieldwalking of the whole island, (c) access to flat land (a count of how many flatland
cells are within a radius of 500 m), and (d) distance to the nearest large freshwater
spring (square root-transformed). 72
4.6 PCFs for three stages of model fitting, calculated in the same way for both buildings
and surface pottery (the x-axis is in metres and refers to the radius of the circles around
each point within which the pair correlation statistic [y-axis] is calculated; the critical
envelope encompasses 95% of 999 simulations): (a–b) observed PCF and 95% critical
envelope constructed from random simulations of a homogeneous Poisson process
(i.e. a null model of complete spatial randomness); (c–d) the same as (a–b), but with
envelopes now constructed from conditional random simulations from a fitted first
order model using the two covariates in Figure 4.5(c–d); and (e–f) the same as (c–d),
but with envelopes now constructed from simulations conditioned on both the two
first order covariates and an additional clustering model. 73
5.1 Discrete City Clustering Algorithm applied to population density in the UK. This
shows the step-by-step identification of a cluster on a given lattice. Top left shows
a populated lattice; top right, a cell chosen as the starting point, whose
immediate neighbours are then incorporated. In the final, bottom-right quadrant the
process has been reapplied to those neighbours as well. 78
5.2 Continuum City Clustering Algorithm (CCA). With Continuum CCA, the technique
is applied in a continuous space (as opposed to a lattice) and neighbours are defined
as falling within a given radius ‘l’. The technique is applied sequentially, starting with
an arbitrarily selected point, and is then applied repeatedly to the newly included
neighbours until the cluster grows no more. 79
5.3 Percolation transition plot – max. cluster size vs. percolation radius. 81
5.4 Percolation cluster transitions for Domesday settlement. Evolution of the largest cluster
in the percolation process of Domesday settlement, overlaid on the transition plot (as
in Figure 5.3). Maps of the clusters at the distance threshold for each transition are
depicted. Each vector point’s colour represents cluster membership, assigned when two
or more nodes are close enough to be part of the same cluster. 83
5.5 Domesday vill clusters at 3 km and 2.9 km overlaid on the English coastline and
Domesday counties. 84
5.6 Domesday vill and 19th-century settlement clusters. (a) Domesday vill clusters at
3.2 km overlaid on coastline and Domesday counties, generated from Domesday
vill datasets provided by Stuart Brookes; (b & c) Roberts and Wrathmell’s 19th-
Century Settlement Nucleation dataset at 3 km and at 3.5 km overlaid by Roberts and
Wrathmell’s central province. 84
5.7 Hillfort clusters in Britain, at (a) 34 km, (b) 12 km and (c) 9 km percolation radius. 86
5.8 Central Wales cluster at 9 km with sites plotted according to area, and the rivers
Wye and Severn. 87
5.9 Cotswold cluster at 10 km radius with sites plotted according to area, and the rivers
Wye, Severn and Thames. 88
6.1 Transect with paired points selected for lags of 1 and 2 units. 96
6.2 Selection of paired observations for directional variograms. 97
6.3 Omnidirectional experimental variogram with fitted model. 98
6.4 Bounded variogram model. 99
6.5 Location of GPS measurements at Ballyhenry rath. 104
6.6 Experimental variogram of GPS measured heights, Ballyhenry rath. 105
6.7 Experimental directional variogram of GPS measured heights, Ballyhenry rath. 106
6.8 Experimental detrended directional variogram of GPS measured heights, Ballyhenry rath. 107
6.9 Experimental detrended variogram of GPS measured heights, Ballyhenry rath, with
fitted model (Bessel model with a sill of 0.257 and a range of 10.379 m). 108
6.10 Elevation estimates (in metres), derived using kriging with a trend model. 109
6.11 Kriging variances. 110
6.12 ‘2.5D’ representation of (a) kriged elevations and (b) conditionally simulated values
(viewed from the southwest). 111
6.13 Radiate of Allectus: C mint percentages in 5 km grid cells. 112
6.14 Directional variogram of C mint percentages. 113
6.15 Directional variogram of C mint percentages: 90º clockwise from north (east-west);
with fitted model. 114
6.16 Kriged map of C mint percentages. 115
7.1 Simple interpolation examples: (a) with two known point values, linear
interpolation estimates D = 15; (b) adding a third sample location, inverse
distance weighted squared interpolation estimates D = 12.5. 121
7.2 Spline as a concept: (a) regularized with high weighting, allowing the interpolation
estimates to exceed the z-values to maintain smoothness at points marked by arrows;
(b) a tension spline, which adheres to the original data values at the expense of smoothness. 123
7.3 A variogram showing increasing variance between samples of values drawn from
increasing distances apart. After a distance of 60 m there is no increase in variance. 124
7.4 Anisotropy in a hypothetical sample of semi-regularly spaced test units. The isolines
depict sherd counts in 5-sherd intervals, illustrating how the rate of change is greater
on the north-south axis than on the east-west axis. 125
7.5 Archaeological point sample. (a) location of samples and artifact counts; (b) sample
with border-area edge correction. The random test samples are designated by an ‘x’; the
building location by the rectangle. 127
7.6 Visual differences in the surfaces of nine interpolation methods at 1 m resolution. RMS
errors for each model are provided in Table 7.2. 129
7.7 Interpolation example modified from Salisbury (2013, Figure 5). Note that the
high resolution (small pixel size) has exceeded the limits of the original data. The
interpolation is thus unstable and noisy where there is higher local variance, for
example in the area north of the ‘trample zone’ (arrows). 131
7.8 Interpolation example modified from Fort (2015, Figure 1). An interpolated surface
model of radiocarbon dates from Neolithic sites (black dots) depicting the space-time
process of the spread of agriculture across Europe. Note the areas indicated by arrows
in the southwest and northeast of the model showing where data sparseness causes
instability in estimates. 132
8.1 Cost-distance surface with 150 km isopleths, with (a) Buran Kaya III (BK);
(b) Geissenklösterle (GEISSE); (c) Krems-Hundssteig (KRE-H) as origin sites. 143
8.2 Linear regression models created with the Ordinary Least Squares (OLS) method
to determine the association between the time difference for the appearance of the
Gravettian techno-complex at different sites and their least-cost distance to three origin
sites. (a) model with Buran Kaya III as origin; (b) model with Geissenklösterle as origin;
(c) model with Krems-Hundssteig as origin. 145
8.3 (a) Map illustrating the difference between x value (i.e. least-cost distance in
kms) at each location and average x value in order to give an indication of spatial
autocorrelation. (b) Map illustrating the difference between y value (i.e. time difference
in years) at each location and average y value in order to give an indication of spatial
autocorrelation. 147
8.4 Map showing residuals at each location in order to give an indication of spatial
autocorrelation. 149
9.1 Screenshot depicting the distribution of radiocarbon dates available from the Canadian
Archaeological Radiocarbon Database, version 2.1. 157
9.2 Simulated point patterns with associated observed and expected L function (a variant of
Ripley’s K function where the theoretical expectation of Complete Spatial Randomness
(CSR) is a straight line): (a) homogeneous Poisson process; (b) clustered point process;
(c) spatially inhomogeneous Poisson process with different intensities between left
and right sides of the window of analysis; (d) second-order spatial heterogeneity with
a combination of regular and clustered patterns. The function suggests aggregation
(clustering) when the observed L function is above the expected value and segregation
(regular spacing) when below. 159
9.3 Lithic distribution analysis from the Sebkha Kelbia survey, Tunisia, showing contrasting
results between global and local bivariate L functions of stone tools divided by their
raw material (Gafsa flint vs. flint sourced elsewhere): (a) distribution of the analysed
stone tools (filled circle: Gafsa sourced flint; hollow circle: flint sourced from elsewhere);
(b) bivariate L function showing significant segregation between the two classes
between 20 and 320 meters (MC: Monte-Carlo); (c) local bivariate L function showing
evidence of aggregation at the 100-meter scale (black dots indicate locations of Gafsa
sourced flints with a statistically significant proportion of neighbours composed of flint
sourced from elsewhere). 161
9.4 Local spatial permutation test of the summed probability distribution of radiocarbon dates
(SPDRD) from Neolithic Europe showing locations with higher or lower geometric
growth rates than the expectation from the null hypothesis (i.e. spatial homogeneity in
growth trajectories) at the transition period between 6500–6001 and 6000–5501 cal BP. 164
10.1 Illustration of a possible fuzzy definition of the concept ‘young’ for humans. 172
10.2 Interpretation examples of membership values, and illustration of the α-cut concept
in the case where the fuzzy set is representing a possibility distribution. For instance,
the domain value subset [v9, v10] is the core (the α-cut A1) of the fuzzy set and
means very possible, while [min, max] is the support (the α-cut A0) and means almost
impossible. Any domain values outside [min, max] are impossible. 173
10.3 Illustration of a fuzzy wooded area. Experts determine three main areas: area 1, where the
concept of the wooded area is plainly respected (the α-cut A1); area 2, which encloses
area 1, where the concept is partially respected (the α-cut A0.5); and area 3, which
encloses the two previous areas, representing the limits for which the concept can at least
partially be defined (the α-cut A0). Considering several α-cuts (at least 2, A1 and A0), the
membership degree of each domain value can be obtained: it is at least the highest degree
of the α-cuts it belongs to and can potentially be obtained by spatial interpolation as well. 174
10.4 Illustration of a connected spatial α-cut. 175
10.5 Membership functions of two fuzzy periods/dates (f and g); a, b, c and d are the areas
of non-overlap. 176
10.6 Visualization of Roman streets in Reims that predate the period “around 200
AD”, according to the confidence we have in the results. 177
10.7 Visualization of entities which have an activity dated to the “Middle of the 2nd
century” and which belong to site PC 88. 178
10.8 Simulated map of Reims’ streets during the 3rd century AD generated from the fuzzy
spatiotemporal data stored in FGISSAR (Fuzzy Geographic Information System for
Spatial Analysis in aRchaeology) with an adaptation of a pattern recognition method
(the Hough Transform). The darker the object, the higher the possibility of its presence
during the 3rd century AD. 179
10.9 Spatiotemporal trajectories and imperfection of archaeological data in Syrian Arid
Margins during the Bronze Age. 180
dam samples with boxplot and cumulative graphs, (d) slope data and rock piles with
cumulative graphs, (e) sampling distribution of 9,999 sample means from the region
with indication of realized sample mean. 222
12.2 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope
in eight categories with midpoints plotted against rock pile density, (b) rock piles
and slopes within a circumscribed study region with cumulative distribution plots,
(c) elevation, slopes, and a logistic regression model for rock piles based on these data. 224
12.3 A northwest Arkansas historic data set from 1892: (a) the 18 × 27 km study region
with 589 historic farmsteads and roads plotted over topography with towns outlined,
(b) maps of the four principal components of historic settlement with central values of
legend indicating most preferred locations. 226
13.1 Southern portion of the coastal Georgia study area: maximum available calories for
white-tailed deer (Odocoileus virginianus) for the month of September (ca. 500 BP). 238
13.2 Southern portion of the coastal Georgia study area: maximum available calories for all
shellfish species for the month of September (ca. 500 BP). 239
13.3 Southern portion of the coastal Georgia study area: returnable calories for all resources
combined for the month of January (ca. 500 BP). 240
13.4 Southern portion of the coastal Georgia study area: returnable calories for all resources
combined for the month of September (ca. 500 BP). 241
14.1 Schematic illustration of the features of an Agent Based Model (ABM) with cognitive
agents, based on the model described in Lake (2000a). 249
14.2 Example of the realistic rendering of a simulated landscape. 250
14.3 Graphed Agent Based Model (ABM) simulation results which collectively illustrate
several aspects of experimental design: (a) plotted points of the same colour and k value
differ due to stochastic effects alone; (b) two different parameters s and k are varied;
and (c) two different agent rules, “CopyTheBest” and “CopyIfBetter”, are explored. 260
14.4 Comparison of Long House Valley simulation results with archaeological evidence. 263
14.5 Population curves produced by 100 runs of the calibrated Long House Valley Model,
differing only in random seed. 266
15.1 Four different network data representations of the same hypothetical Mediterranean
transport network. (a) adjacency matrix with edge length (in km) in cells
corresponding to a connection; (b) node-link diagram where edge width represents
length (in km); please refer to the colour plate for a breakdown by transport type
(red lines = sea, green = river, grey = road); (c) edge list; (d) geographical layout.
Once again, please refer to the colour plate for a breakdown of transport type. 274
15.2 A planar network representing transport routes plotted geographically (a) and
topologically (b). A non-planar social network representing social contacts between
communities plotted geographically (c) and topologically (d). Note the crossing edges
in the non-planar network. 278
15.3 Examples of three different node centrality measures: (a) nodes scaled by degree
centrality, (b) nodes scaled by betweenness centrality with path segment lengths shown,
(c) nodes scaled by closeness centrality with path segment lengths shown. 279
15.4 Examples showing relative and Gabriel graph neighborhood definitions: (a) A is a
relative neighbor of B because there are no nodes in the shaded overlap between the
circles around A and B, (b) A and B are not relative neighbors because C falls within
the shaded overlap. (c) A and B are Gabriel neighbors because there are no nodes
within the circle with a diameter AB, (d) A and B are not Gabriel neighbors because C
falls within the circle with a diameter AB. 281
15.5 Network representation of the Orbis network: geographical layout (a, c) and topological
layout (b, d). Node size and colour represent betweenness centrality weighted by
physical distance in (a) and (b), and unweighted betweenness centrality
in (c) and (d): the larger and darker blue the node, the more important it is as an
intermediary for the flow of resources in the network. By comparing (a, b) with (c, d),
note the strong differences in which settlements are considered central depending
on whether physical distance is taken into account (a, b) or not (c, d). 284
15.6 Geographical network representation of the Orbis network: geographical layout (a) and
topological layout (b). Node size and colour represent increasing physical distance over
the network away from Rome: the larger and darker the node, the further away this
settlement is from Rome following the routes of the transport system. Note the fall-off
of the results with distance away from Rome structured by the transport routes rather
than as-the-crow-flies distance. 286
15.7 Nearest neighbour network results of the Orbis set of nodes. Node size represents
degree. Insets show degree distributions. Note how the network only becomes
connected into a single component when assuming 4-nearest-neighbours. 288
15.8 Maximum distance network results of the Orbis set of nodes. Node size represents
degree. Insets show degree distributions. Note how the network only becomes
connected into a single component when assuming 440 km as the maximum distance. 290
15.9 Results of the Orbis set of nodes; (a) relative neighbourhood network and (b) Gabriel
graph. Node size represents degree. Insets show degree distributions. Note how the
networks, as compared to the results shown in Figures 15.7 and 15.8, better succeed in
representing the shape of the Orbis transport network and the long-distance maritime
routes crossing the Mediterranean. 291
16.1 London Tube maps: (a) the 1908 version superimposed on a city plan. (b) the 1933
version featuring H. Beck’s topological redesign. 297
16.2 Schloss Friedeburg, Saxony-Anhalt, Germany. The ground floor of the main residential
building (17th c. CE): (a) the state plan of 1930. (b) a simplified plan with points of
access marked by arrows, topological graph superimposed. (c) the justified graph with
room types, rings and depth from carrier indicated. (d) the path matrix with sums of
path lengths. 299
16.3 (a–c) Simplified ground floor plan of the main residential building of Schloss
Friedeburg: (a) with the non-convex rooms highlighted and a suggestion for
(approximately convex) subdivision of non-convex rooms 1 and 6. (b) with the axial
map superimposed and line segments indicated for the longest and most integrated
axial line. (c) with three overlapping isovists and their centre-points indicated. (d) the
axial map of the ground floor of the main residential building of Schloss Friedeburg,
with the topological graph of convex break-up superimposed. (e) examples of
diamond-shaped topological graphs. (f) justified graph of the axial break-up of the
ground floor of the main residential building of Schloss Friedeburg. 302
16.4 The Palace of Pylos (Ano Englianos), Messenia, Greece (13th c. BCE): (a) a simplified
plan of the earlier building state with results of VGA and the shortest convex routes to
throne room 6 superimposed as a partial topological graph. (b) a simplified plan of the
later building state with results of VGA and the shortest convex routes to throne room
6 superimposed as a partial topological graph. (c) a simplified plan of the later building
state with shading indicating areas of convex spaces most easily accessible from the
three different points of access and separate j-graphs for access through each of the
latter. (d) a simplified plan of the later building state with shading indicating areas of
convex spaces most easily accessible from the three main courts 58, 63 and 88, “A”
marking the archive, “NEB” the Northeastern Building (the presumed clearing-house)
and “P” pantries (courts assumed to be served from these are indicated by subscript
nos.). (e) J-graph of the later building state. 306
17.1 A single, binary viewshed designed to explore the visual impact of a 5 m high wooden
post that had been erected at the Avebury monument complex in Wiltshire, England.
A substantial structure, the post had been raised in the early Neolithic period at a
location that would eventually be traversed by a megalithic setting of paired standing
stones. The viewshed identifies those areas of the Avebury landscape where a 1.65 m
high viewer could have theoretically seen the post. 315
17.2 (a) Conceptualisation of the basic line-of-sight (LOS) algorithm: LOS between two
locations in an altitude matrix can be established by comparing the height of each cell
that intersects the line with the height of the line at that location, interpolating where
necessary. (b) Note that view-to and view-from are not necessarily reciprocal because
they represent different assumptions about the location of the viewer. An R3 viewshed
algorithm essentially repeats this calculation for every cell in the altitude matrix (except
the viewpoint) and records the result(s) for each cell. 317
17.3 The concept of the R2 viewshed algorithm which operates by: (a) generating a ‘view
horizon’, noting the visibility of each cell on that horizon, and storing the elevation
angle of view to the observer a1, a2 etc.; (b) expanding the horizon by one cell and
calculating new angles of view B for each cell on the new view horizon; (c) the angle
of intersection with the previous horizon A is inferred (from a2 and a3) and the new
angle B is compared with A to determine whether a new cell is visible or not. 318
17.4 Buffering to avoid edge effects. The map depicts a group of Roman coin hoards in
the Don Valley in northern England. A series of visibility analyses were carried out
in order to determine whether the hoards were preferentially placed in relatively
concealed (or hidden) locations. (a) In the centre is the convex hull bounding
the group of hoard locations. (b) Assuming a maximum view radius of 3,440 m
(corresponding to Ogburn’s (2006) limit of normal 20/20 vision for a 1 m wide
object) we would need to process the area included in this buffer to avoid edge effects.
(c) If we increased this to 6,880 m (the limit of human acuity for a 1 m wide object)
we would need to extend our processing area accordingly – in this case to the outer buffer. 319
17.5 Using a scaling factor to compensate for edge effects. 320
17.6 (a) A binary viewshed generated from the prehistoric post setting at Avebury depicted
in Figure 17.1 (circled in white) shown over a shaded relief model. (b) A probabilistic
viewshed calculated from the same location (digital elevation model (DEM) errors are
modelled as normally distributed with a root mean square error (RMSE) of 3 m).
White areas represent 100% probability, with the probability declining as the shading
becomes darker. 321
17.7 (a) The cumulative viewshed generated by summing the binary viewsheds of the 17
coin hoard locations depicted in Figure 17.4 with a maximum view radius of 6,880 m.
Colours at the red end of the green-red scale indicate locations from which higher
numbers of hoard locations are visible. (b) The total viewshed calculated for the entire study
region (in this case, the convex hull depicted in Figure 17.4 with a 500 m buffer –
8,938 viewpoint locations). This encodes views-from the individual viewpoints, where
the red end of the green-red scale indicates those locations from which a larger area is
modelled as being visible. (c) The above analysis repeated with viewpoint/target offsets
adjusted to encode views-to the viewpoint locations. 324
17.8 Viewsheds generated for each of the tower-kivas. The green zone represents the view
from ground level and the red the top of the tower. Blue dots indicate Puebloan
archaeological sites in the landscapes of the tower kivas. The radiating buffers extend
for 20 km around each site – the maximum viewing range used for the analyses. 326
18.1 Cost functions estimating walking time; on the x-axis downhill slopes are negative. 338
18.2 Cost functions estimating walking time: uphill and downhill costs are averaged. 339
18.3 Simple example of an isotropic cost grid (left) and the corresponding accumulated cost
surface (ACS) (right). The origin of the accumulation process is the centre of the cost
grid (cost value = 10). 341
18.4 Dijkstra’s algorithm applied to a cost grid. 341
18.5 (a) Simple isotropic cost grid, (b) possible moves starting at the origin, (c) traversing the
barrier cells by long moves, (d) subdividing long moves. 342
18.6 (a–f) Depict ACS results based on the cost grid shown in Figure 18.5(a). The outcomes
of an inadequate barrier radius of merely 5 m are shown in (a–c). For (d–f) an
adequate barrier radius of 7.5 m was chosen. The N values indicate the number of
nearest neighbouring cells that can be reached from the origin without detour. Images
(g) and (h) illustrate the impact of different N values by grids showing the differences
in accumulated costs. 343
18.7 Small digital elevation model (DEM) with a cell size of 10 m and a constant slope
value of 10% with the corresponding ACS for the cost function Q(ŝ) = 1 + (ŝ/š)²,
with š = 10. 344
18.8 ACS for different slope dependent cost functions (nos. 1, 6, and 9 in Table 18.2) on
three gradients, with an additional barrier as in Figure 18.5(a) (radius 7.5 m). For each
cost function, the costs vary from 0 at the origin, depicted in white, to the largest
accumulated cost value depicted in black. (a) Ericson & Goldstein, (b) Tobler, (c) Q(12). 345
18.9 A path built to be level on a steep slope in the hilly area east of Cologne, Germany. 346
18.10 Least-Cost Paths (LCPs) (black lines) from the origin in the centre to five different
targets. The outcome of the LCP algorithm depends on the number of nearest
neighbours that can be reached without detour (N) and the width of the barrier
((a) width = 5 m; (b, e, g) width = 7.5 m; (c, f, h) width = 10 m; cf. Figure 18.6). 347
18.11 The study area southwest of Cologne covering approximately 13 × 10 km. 348
18.12 LCPs based on the formulas by Tobler, Irmischer, and Herzog/Minetti as well as
quadratic slope dependent cost functions combined with costs for traversing water
courses and/or wet soils. 349
18.13 (a) Comparison of the Q(10) LCPs with the Agrippa road, (b) comparison of the local
prominence for these two routes (white = low, black = high prominence) (c) LCPs
with increased isotropic costs in areas of low prominence. 350
18.14 The best performing LCPs for the models considered and the Least-Cost Site
Catchments (LCSCs) derived from the Q(14) cost model for the Görresburg temple
and the villa. 351
19.1 The electric field remains perpendicular to the magnetic field as the electromagnetic
energy propagates. 361
19.2 A simplified depiction of electromagnetic radiation. The visible light, which is mainly
made up of red, green and blue, constitutes only a small fraction of the full spectrum. 362
19.3 The main steps in a satellite remote sensing analysis. The workflow has a hierarchical
structure. Evaluation of results may reset the workflow until a satisfactory solution is
achieved. 363
19.4 A CORONA scene (DS1102–1025DF007, December 1967) showing the location and
extent of hollow ways radiating from Tell Brak, Syria. The terminal points of hollow
ways can be mathematically modelled in order to estimate the area of agricultural
production around sites. 368
19.5 Multi-spectral data analysis with vegetation indices provides a detailed and dynamic
representation of agricultural landscapes. These models surpass static descriptions of
agro-economic zones, which are usually based on strict assumptions about productivity.
The circles in the figure show production boundaries of Bronze Age settlements. 369
19.6 Scatter plots revealing the strength of the relationship between settlement size and
estimates of total production. The plot (a) shows a weak relationship when estimated
production values are directly compared with settlement size. However, when a
biennial fallowing strategy is introduced for settlements smaller than 50 hectares (b),
the relationship becomes much stronger. 369
20.1 Multisensor (8 sensors) vertical magnetic gradient survey with SENSYS at Pella, North
Greece. Left image indicates the original data suffering from various spikes, traverse
striping and grid mismatching. Right image indicates the results of processing that
tried to remove those specific effects. 378
20.2 Application of the Fast Fourier Transform (FFT) power spectrum analysis of the
magnetic data obtained from the Bronze Age cemetery of the Békés Koldus-Zug
site cluster – Békés 103 (BAKOTA project). The depth of the various targets (h) is
easily determined by measuring the slope of the power spectrum at different segments
and dividing it by 4π (Spector & Grant, 1970). The radially averaged spectrum was
calculated and used to separate the magnetic signals coming from deep sources (h=2.87 m)
and shallow sources (h=0.73 m) below the sensor. The spectrum was also used as a
guide to define a bandwidth filter in order to eliminate the sources with wavenumbers greater
than 550 radians/m or less than 100 radians/m, and enhance the
magnetic signal coming from the potential archaeological structures. 381
20.3 (a) 3D resistivity model. (b) Three dimensional distribution of the calculated apparent
resistivity results from model A. (c) Pseudo-3D slices of the resistivity resulting from
the 2D inversions along the X, Y and XY axes. (d) 3D resistivity model from the three
dimensional inversion. Due to the wide range of the resistivity values, a logarithmic
scale is used. 384
20.4 An example of processing approaches applied to a radargram obtained at Lechaion
archaeological site with an antenna of 250MHz: (a) Raw data without any processing,
(b) application of Dewow, (c) Spreading Exponential Compensation (SEC) gain with
an attenuation of 6.16, start gain of 2.56 and maximum gain of 542, (d) application of
automatic gain with a window width of 1.5 and maximum gain of 500, (e) application
of regional background removal, (f) migration process, (g) application of a high pass
filter, and (h) envelope transformation. 386
20.5 Gravity residual anomalies recorded above two tombs (Tombs 4 (above) and 8 (below))
of the Roman cemetery on the Koutsongila Ridge at Kenchreai on the Isthmus of
Corinth, Greece. The centres of the tomb chambers are located approximately at the
middle of the transects. According to the resulting graphs, it is estimated that both
tombs have a width of about 4.5–5 m. The gravity signature of tomb T4 is better
defined than that of tomb T8, probably because T4 is located within a
more homogeneous geological unit (valley fill deposits), whereas T8 is located at the
border between the valley fill deposits and conglomerate outcrops that extend to the
central section of the ridge. Both tombs have created a well-defined gravity anomaly
with at least 0.04–0.08 mGal maximum variation with respect to the average background. 391
20.6 Results of a seismic refraction survey at the area of the assumed ancient port of
Priniatikos Pyrgos in East Crete, Greece: (a) 2D image representing the depth to
the bedrock, which reaches about 40 m below the current surface. The black dots
represent the position of the geophones along the seismic transects. The area has
been completely covered by alluvium deposits and other conglomerate formation
fragments as a result of past landslide and tectonic activity. The interpretation of the
velocity of propagation of the acoustic waves revealed the spatial distribution of (b) the
alluvium deposits at the top (velocity of 491 m/sec), (c) the lower and upper terrace
deposits (velocity of 1830 m/sec), (d) the medium depth sandstones and conglomerates
(velocity of 2400 m/sec) and (e) the deeper weathered limestone or cohesive
conglomerates (velocity of 4589 m/sec). 393
20.7 Results of the geophysical surveys at Velestino Mati. The magnetic data (a) indicates the
nucleus of the settlement at the west top of the magoula with some expansion towards
the east top. A number of high dipolar magnetic anomalies are associated with burnt
daub foundations that were also confirmed from the Electromagnetic Induction (EMI)
soil magnetic susceptibility (b) and the soil resistance data (c). Magnetic susceptibility
also confirmed the existence of enclosures around the tell. 397
20.8 Results of the geophysical surveys at Almyriotiki. The magnetic data (a) presented a
clear image of the internal planning of the settlement: Burnt daub structures follow
a circular orientation around the top of the tell. The houses expand further to the
south, where some weaker magnetic anomalies representing stone houses with internal
divisions are also present. An irregular wide ditch system encloses the settlement
from the east and the north and it is confirmed from the EMI magnetic susceptibility
(b) and soil conductivity measurements (c). The high soil conductivity to the north
coincides with an area susceptible to periodic flooding. The above were also confirmed
from the soil viscosity measurements (d) as an indicator of the soil permittivity. 399
20.9 Results of the geophysical surveys at Almyros 2. The magnetic data (a) depict clearly
the concentration of burnt daub structures at the centre of the tell, expanding
further to the south. The settlement is surrounded by a double ditch system, which
is confirmed by both EMI magnetic susceptibility (b) and soil conductivity data
(c). A number of breaks in this double enclosure are most probably associated with
multiple entrances to the settlement. Soil conductivity seems also to increase outside
the settlement to the south and west directions (north to the top), namely in the area
which is most susceptible to flooding. 403
21.1 Lin and Mark’s conceptual data models, where raster datasets perhaps based upon spatial
interpolation (SI)/generalisation (SG) methods are converted into voxels, which may in
4.1 Nearest neighbour (NN) test for the three hypothetical distributions in Figure 4.2(a–c)
(with an edge correction applied, as proposed by Donnelly (1978)). 65
4.2 Summary results of final models (after adjustment for the correlation of the clustering
component). 74
7.1 Interpolation methods and parameters used in the analysis. 128
7.2 Ranked RMS results by method and resolution. 128
8.1 Early Gravettian calibrated Accelerator Mass Spectrometry (AMS) dates of sites
included in the study together with least-cost path distances from the three earliest sites
to the sites included in each correlation and regression. 144
8.2 Details of calculations for the numerator of Equation 8.1 where Geissenklösterle is
taken as the origin. 146
8.3 Details of calculations for the Autocorrelated Errors Model (ρ = 0.36, Equation 8.11). 150
11.1 Geographic assignments for Burial K from Duggleby Howe ranked by highest
probability density and lowest Euclidean distance, using regions based on National
Character Areas (England), National Landscape Character Areas (Wales) and
Landscapes of Scotland (Scotland). 206
12.1 Descriptive slope (percent grade) statistics for rock piles, check dams, and background
samples. 223
12.2 Global correlation matrix (left) for the four variables measured in the agricultural field
complex and logistic regression parameter estimates (right) indicating multivariate
relationships between rock pile presence and the four variables. 225
12.3 Lowest four principal components derived from 10 environmental variables in historic
Northwest Arkansas with largest absolute coefficients of eigenvectors shown in
boldface for interpretive purposes. 227
14.1 Rules for choosing new farming and settlement locations (from Axtell et al., 2002, Table 2). 265
14.2 Original ‘base’ parameter values for the Long House Valley model (from Axtell et al.,
2002, Table 4). 266
Tables xxv
15.1 Top 20 highest ranking towns according to the topological betweenness centrality
measure and the distance weighted betweenness centrality measure. Towns highly
ranked according to both measures are highlighted. 285
15.2 Results of global network measures for all tested models and the undirected Orbis
network (in bold). Highlighted results show some similarity in global network
measures with the Orbis network. 289
18.1 Cost components applied in selected archaeological least-cost studies published in 2010
or later. 335
18.2 Slope-dependent cost functions, with ŝ percent slope, and s = ŝ/100 mathematical
slope. If Δd (or ΔD) is missing in the cost formula, the result of the cost formula is
to be multiplied by the distance covered. Rows 1 to 7 list cost functions estimating
time, the formulae in rows 10 to 12 estimate energy consumption. The cost functions
listed in rows 8 and 9 measure abstract cost units, which can best be understood by
comparing estimates resulting from movement on a gradient with that on level ground. 337
18.3 Published terrain factors for cost functions measuring time (unit: hour) or energy
consumption (unit: joule) of a walker. Note: ‘m asl’ refers to metres above sea level. 340
18.4 DEM data provided by the ordnance survey institution (Geobasis NRW) responsible
for this part of Germany. 348
18.5 Comparison of the Agrippa Road section and the LCP generated based on the cost
function Q(10) with a critical slope of 10 percent (see no. 9 in Table 18.2) combined
with a penalty factor of 5 for crossing streams. For the two routes, the percentage in
each prominence category is given. 350
18.6 Areas included within the LCSCs in hectares. 351
21.1 Table summarising the three key computational approaches to the integrated
conceptual modeling of spatiotemporality in spatial technologies. 411
21.2 Table summarising the seven baseline temporal operators of Allen’s interval algebra
(1983), which, along with their inversions, define a total of 13 relationships between
two temporal intervals. 423
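The Chapter 18 entries above (Figures 18.1–18.6, Table 18.2) revolve around two computations: a slope-dependent cost function, of which Tobler’s hiking function is the best-known example, and the accumulated cost surface (ACS) obtained by spreading such costs from an origin with Dijkstra’s algorithm. The sketch below is an illustrative reconstruction, not the chapter’s own code: the function names are ours, and the 8-neighbour move set (N = 8) and the mean-of-cell-costs charging scheme are simplifying assumptions.

```python
import heapq
import math

def tobler_time_per_metre(slope):
    """Seconds needed to walk one metre at the given mathematical slope
    (rise/run). Tobler's hiking function gives walking speed in km/h as
    6 * exp(-3.5 * |slope + 0.05|), so gentle downhill slopes (negative,
    as on the x-axis of Figure 18.1) are faster than level ground."""
    speed_m_per_s = 6.0 * math.exp(-3.5 * abs(slope + 0.05)) * 1000.0 / 3600.0
    return 1.0 / speed_m_per_s

def accumulated_cost_surface(cost, origin):
    """Dijkstra spread of an isotropic per-cell cost grid from `origin`.

    A move between neighbouring cells is charged the mean of the two cell
    costs times the step length (1 for orthogonal moves, sqrt(2) for
    diagonal moves). This is a minimal sketch of the accumulation process
    behind Figures 18.3-18.4; the 8-neighbour move set is an assumption."""
    rows, cols = len(cost), len(cost[0])
    acs = [[math.inf] * cols for _ in range(rows)]
    acs[origin[0]][origin[1]] = 0.0
    frontier = [(0.0, origin)]
    moves = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
             if (dr, dc) != (0, 0)]
    while frontier:
        d, (r, c) = heapq.heappop(frontier)
        if d > acs[r][c]:
            continue  # stale queue entry
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2.0
                if nd < acs[nr][nc]:
                    acs[nr][nc] = nd
                    heapq.heappush(frontier, (nd, (nr, nc)))
    return acs

# On a uniform cost grid the ACS simply scales distance from the origin.
flat = [[1.0] * 5 for _ in range(5)]
acs = accumulated_cost_surface(flat, (2, 2))
```

As Figure 18.6 discusses, the outcome of such an accumulation depends on how many nearest neighbours can be reached from a cell without detour; richer move sets than the N = 8 used here reduce the elongation error of grid-based least-cost paths.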
Contributors
Johanna Fusco – UMR IMBE (Aix-Marseille University, FR); UMR 7300 ESPACE (University of Nice, FR)
Irmela Herzog – The Rhineland Commission for Archaeological Monuments and Sites
explicit acceptance that space and spatial relationships are a fundamental part of ‘doing archaeology’. It
has also been highly proactive in co-opting and developing the methodological tools needed to explore
this spatiality, culminating in the rich variety of approaches available to us today, a consequence of devel-
opments in theory and practice alongside changing analytical and technological opportunities.
In fact, space, spatiality and spatial awareness are such fundamental parts of being human that we often
take them for granted at the bodily level of moving through and experiencing the world. Developments
in digital geospatial technology and the increasingly pervasive presence of locational media have increased
this familiarity further still (see Wood, 2012, p. 280). It is this very familiarity that risks blinding us to
the spatial formations of social life and to how we actively manipulate space and spatial relationships to
shape the world around us through the activities, behaviours and structures that give our lives meaning.
These spatial manipulations and interventions are what make us distinctive and different to other cultural
groups, offering both social cohesion and social exclusion at the same time. This assumption of essential
human spatiality is what underlies much traditional ‘spatial archaeology’ – the isolation and interpretation
of spatial patterns within archaeological evidence that relate archaeological activity in the present to the
generative processes in the past that we are interested in.
But what are some of the complex relations that describe archaeological space and spatialities?
Through which relations are we able to ‘do’ spatial archaeology and interpret human-space interactions
in the past? We briefly lay out five of them here. For one, space in archaeology is linked to time (see Tay-
lor, this volume). Prior to the mid-20th century, cultural evolutionary and cultural historical approaches
in archaeology explicitly privileged time over space, working with long time scales, grand themes
and trends (see Trigger, 1998), echoing a modernist discursive practice (Roberts, 2012, p. 14). Yet, argu-
ably, archaeologists realised relatively early on that notions of space and notions of time were intimately
linked and these were often treated as tacit conceptual axes along which analyses and interpretations were
structured, often alongside a further axis such as ‘form’ (Spaulding, 1960), sociality and the social (Soja,
1996) and materials and material relations (Conneller, 2011; Lucas, 2012, pp. 167–168).
Secondly, space in archaeology is about mobility, rendering movement across space a key focus of
archaeological and anthropological inquiry (e.g. Hammer, 2014; Richards-Rissetto & Landau, 2014;
Snead, Erickson, & Darling, 2009; cf. Verhagen, Nuninger, & Groenhuijzen, 2019). Regardless of its
context and spatial scale, traversing space affords imaginings of what is to come, reflections on
what is left behind and memories of places. As such, it connects time and space with living. After all,
human life is a temporal process that unfolds with the formation of places through movement and
through the material and immaterial traces that movement leaves behind (Ingold, 1993; see Atkin-
son & Duffy, 2019; McCormack, 2008). Yet how can a preoccupation with spatial movement and all
of the terms that come with it (e.g. flows, networks and liquidity, often used to describe and analyse
conditions of late capitalism, and to construct sweeping grand narratives of globalisation) give hope
to archaeologists trying to come to terms with “specific, tangible materialities of particular times and
places” (Dalakoglou & Harvey, 2012, p. 459)? It appears that thinking through space with movement
is a rather different approach than looking at movement to observe and describe space: while the latter
involves an examination of selected outcomes, including spatial patterns, in order to trace antecedent
causes, the former attempts to follow forward moment-to-moment spatial formations intently (see Ingold,
2011, pp. 6–7; Knappett, 2011).
Thirdly, space in archaeology is about stories and daily practices (such as practices of gathering, com-
position, alignment and reuse) that typically form spatial assemblages with archaeologically traceable
material dimensions (McFarlane, 2011, p. 649; see Seigworth, 2000; Thrift, 2008). The iterative processes
of creating these assemblages and relations are in fact processes of place-making, processes of dwelling.
They involve assembling relations between humans, non-humans, materials, immaterials, and animate
Archaeology and spatial analysis 3
and inanimate things that continually produce the character and history of places (see also McFarlane,
2011). Places come about then through routine repetition “that is permeated by the past and the pres-
ent, and oriented toward the future” (Resina & Wulf, 2019, p. vii); they come about through enactment
and performance of relations anew almost daily. What about telling and re-telling engaging qualitative
and quantitative archaeological stories about these places? They can be considered as just another act of
re-performing spatial relations that re-assemble archaeological places in the present.
Fourthly, space in archaeology is as much about absences as about presences. Archaeological forma-
tion processes often hinder us from asking certain questions about spatialities, since spatial relations between
things, or the things themselves, can be absent from the archaeological contexts in question. Often, archaeolo-
gists also lack an adequate appreciation of what is actually missing. Since such uncertainties are endemic
to archaeology, archaeological practices involve and remain open to new ways of incorporating them
within archaeological writing and analysis about space. Nonetheless, the presence of materials remains
almost exclusively the single origin of signification and meaning in archaeological contexts. What about
‘archaeologically empty spaces’, i.e. spaces devoid of materials that can be directly associated with past
human activities? As Löwenborg (2018, p. 37) stresses, “[a]n archaeologist should not assume that ‘empti-
ness’ is a random prehistoric phenomenon. The saying ‘the absence of evidence is not the evidence of
absence’ certainly applies to archaeology”. As such, ‘empty spaces’ in archaeology should be subject to
description, representation and interpretation as much as any other archaeological space. Their status
in archaeology today is in fact an effect of common signification and meaning-making practices in the
discipline. That is, there is nothing inherently meaningful or meaningless about ‘empty spaces’; their
status is simply an effect of how we do archaeology today.
Finally, space in archaeology is about the challenges of representing it and increasingly so in the digital
age. As discussed in Lefebvre’s (1991, p. 38) detailed treatment of space, “representations of space” form
one of the three fundamentals that structure spatial understanding. Lefebvre conceptualizes “representa-
tions of space” as the space of planners, scientists and engineers who attempt to identify “spatial practice”
and “representational spaces” with it. In archaeology, it is through the process of engaging with evidence
that we conceive representations of space, i.e. the interpretative constructs of the excavation plans, dis-
tribution maps and spatial models used to represent and explain spatial and social relationships. At the
same time it is through these representations of space that we attempt to link past spatial practices with
Lefebvre’s “representational spaces”, i.e. the lived spaces of past people. Put differently, representing space
in archaeology is a generative act, a process of constructing how archaeologists get to know, experience,
understand and deal with space. And as post-representational cartography has made clear, every engage-
ment with representations of space (such as ‘using’ maps) re-creates those representations as well as the
spaces that are represented (Hacıgüzeller, 2017). As such, creating representations of space via GIS or any
other tool, and using them in archaeology are not simple acts; they are processes with great consequences
that need to be identified and talked about (Wood, 2010).
national traditions (Schnapp, 1996; Boast, 2009). As a result, what we offer here is undoubtedly partial
and selective.
We start at the very birth of the discipline, because some of the core themes that shape current spatial
research in archaeology had their origins in the very first phases of archaeological enquiry; not least the
tensions that exist between empiricism and synthesis. For archaeology to become an evidence-based dis-
cipline it needed not just quantities of evidence but for that information to be structured, ordered, cata-
logued and made accessible. Increasing amounts of evidence require effective synthesis and interpretation
in order to produce narratives which provide an understanding of the past. Other key themes woven
through these developments have concerned the relationship between space and time, and the representation
of and reasoning about change through time and across space. Another important element is that of scale – at
its simplest level the details of and associations between artefacts, sites and landscape. The interacting
scales of empiricism and synthesis range from the details of artefacts to enable typologies and dating, to
the recording of sites, landscapes and broad geographical regions.
A defining characteristic of archaeology, and of much antiquarian activity, is what we can broadly
call fieldwork. Whether this is investigating the small-scale relationships available through excavation
or the larger views offered by landscape, the physical remains of the past have demanded an apprecia-
tion of spatial relationships explained through reasoning and representation. Those early explorers of the
past established methods of recording that have formed the basis for more recent methodologies, some
of which are still central to the writing of archaeology today, a good example being distribution maps
(Wickstead, 2019). Similarly the recording of excavations through plans and sections, and the later incor-
poration of stratigraphical relationships, have provided the basis for spatial thinking at that scale since the
early days of the discipline.
In 1533 John Leland was commissioned to travel the kingdom to “make a search for England’s Antiq-
uities” and record the “places wherein Records, Writings and secrets of Antiquity were reposed” (Chan-
dler, 1993). His recordings of historical documents, artefacts and places were assembled into his ‘Itinerary’,
a large collection of notes that offered a remarkable account of his nine years of travelling the land with
the intention of producing ‘a map’ of Great Britain. The importance of Leland’s contribution in terms of
methodology is that he escaped from the library and went on the road. Although Leland has been claimed
as the ‘father of English topography’, his Itinerary is in fact a ‘map in words’, for it contains very few graphical
representations and he never produced a map as originally intended. Spatial relationships between towns,
villages, archaeological sites and other points of interest are described in his notes by measurements of
distance and compass directions.
The next great work of English antiquarianism, William Camden’s Britannia, first published in Latin in
1586, was in many ways the realisation of Leland’s dream (Schnapp, 1996, p. 141). Even with the addition
of over 50 maps, however, the Britannia is still primarily a work of narrative, where text incorporates spa-
tial descriptions and understandings and the maps provide mainly a reference for location. For Schnapp
(1996, p. 154), Camden typified the British interest in archaeological cartography, the “description of
landscape and listing of monuments”. For the beginnings of spatial archaeology in the modern sense of
recording, representing and interpreting individual sites and then doing the same within their landscape
settings, however, we have to wait until the middle of the 17th century and the work of John Aubrey.
Aubrey’s approach to landscape and archaeological sites was somewhat different to those of Leland and
Camden whose studies and writings were very much in the tradition of continental Renaissance human-
ism (Sweet, 2004). He can be seen as an important part of the historical revolution that paralleled the sci-
entific revolution based on collecting evidence, questioning, interpreting and validating (Hunter, 1975).
This involved a shift in spatial thinking so that the accurate planning of earthworks and other (mainly
prehistoric) archaeological sites became tools for analysis rather than just for display: in his own words,
“comparative antiquitie writt upon the spott from the monuments themselves” (Hunter, 1975, p. 181).
His belief was that material remains could illuminate the past which should not be bound by texts alone.
A generation after Aubrey, and influenced by his writings, William Stukeley continued and developed
this tradition of fieldwork-based recording, analysis and interpretation (Piggott, 1985); his extensive
fieldwork between the years 1718 and 1725 has had a lasting archaeological legacy. His acceptance of
the importance of spatial relationships is inherent within his accurate plans but also in his other forms
of spatial representation. He produced many ‘prospect’ views, natural-style pen and wash drawings of
monuments in their landscape setting, often annotated, although his most innovative technique was the
“circular view” (Peterson, 2003). This representation is removed from vertical measured plans and land-
scape prospect views and shows the 360 degree horizon around a particular point with landscape features
and archaeological sites integrated.
In many ways Stukeley stood at a crossroads in the development of archaeological spatial thinking
and the resulting recording methods and interpretative representations. Aubrey had acknowledged that
what he did was chorography, literally “place writing”: discerning the past from the present through an
intimate knowledge of landscape in which all aspects of human activity and history are recorded, with the
emphasis on producing narrative, so that any spatial considerations were supporting information. In con-
trast, Stukeley was the first secretary of the newly formed Society of Antiquaries of London, an act in itself
which announced antiquarianism as a nascent discipline in its own right rather than being just one of
the pantheon of interests covered by the Royal Society. Sweet (2004) sees this as the ‘Battle of the Books’
between the ‘ancients’ and the ‘moderns’, which by the early 18th century saw an established difference
between historians, whose work was rooted in the study of texts, inscriptions and coins, and antiquarians
who, rather than partaking in “gentlemanly learning”, i.e. the Classical texts, were concerned with record-
ing and analysis in the belief that understanding could be gleaned from the material remains themselves
(Sweet, 2004, p. 8).
This move towards an antiquarianism as the foundation for archaeology is nowhere better demon-
strated than by Sir Richard Colt Hoare’s The Ancient History of Wiltshire (1812 and 1821). Colt Hoare’s
methodology, as well as the resulting publications, is important for being the first integration of large-
scale landscape survey with systematic targeted excavation. Colt Hoare describes himself as an “historian
and topographer” and his intentions are clear from the start: ‘we speak from facts not theory’, he claims in
the Introduction. As with earlier works, the maps and plans were integrated with rich textual description
and some novel techniques were favoured, for example the three-dimensional plan of a barrow group
justified by:
a large group of twenty-seven tumuli, which, being so thickly clustered, could not be numbered
sufficiently distinct on the general map: I have therefore had them engraved on a separate plate,
which will explain, better than any verbal description, the different forms of the barrows which
compose this group.
(Colt Hoare, 1812, p. 121)
The period up until 1840 has been characterised as ‘speculative’ in the development of American
archaeology (Willey & Sabloff, 1993) and mirrors the speculation witnessed in Europe but with the
added challenge of explaining the Native Americans, who they were and where they had come from.
Replacing the early descriptive writings following the first European contact, by the opening of the
19th century explorers and travellers crossing North America were systematically recording the topog-
raphy, flora and fauna of the new landscapes that confronted them. As in Europe this ‘natural scientific’
approach often included archaeology, and a major focus at this time was the mounds and earthworks
6 Mark Gillings, Piraye Hacıgüzeller and Gary Lock
west of the Appalachians, especially in Ohio and surrounding areas, which gave rise to the “mound
builder debate” (Trigger, 1989, p. 104). The essence of this was whether the Native Americans encoun-
tered at that time were descendants of the mound builders or whether a ‘lost race’ had built them, although
the importance in terms of spatial understandings is that the monuments provided complex cultural
remains in the form of earthworks which could be mapped. The first creditable attempt at this was
the work of Caleb Atwater (1820), a local postmaster in Ohio, who surveyed many of the monuments
and produced accurate scale plans with descriptive detail, an impressive achievement considering the
almost total lack of a precedent.
His important paper published in 1820 was in the first volume of the Transactions of the newly
formed American Antiquarian Society and marked the transition from the speculative to classificatory
descriptive period (1840 to 1914) (Willey & Sabloff, 1993), epitomised by the Society’s call in 1799 for
the recording of archaeological remains by “accurate plans, drawings and descriptions” (Willey & Sabloff,
1993, p. 32). Although Atwater’s interpretations were completely fanciful and favoured more advanced
mound builders who had since left the area rather than the existing local groups, his recording was
insightful and certainly responded to the Society’s call for accuracy. Twenty-five years after Atwater’s pub-
lication, Ephraim G. Squier and Edwin H. Davis began systematic fieldwork which would build on this
and result in what remains today the primary source for these monuments (Squier & Davis, 1848). The
quality of their field recording far surpassed anything previous, combining scaled plans, cross-sections, the
integrated recording of landscape and archaeological features, and two-dimensional attempts at represent-
ing elevation and topography.
In both North America and Europe through the first half of the 19th century the collection of information, through the recording and mapping of sites and the gathering of artefacts, increased rapidly, creating
the need for more ordered ways of classifying the material. The need for temporal refinement increased
in importance as it became obvious that the evidence spanned long periods of time that needed to be
sub-divided and ordered, and between the years 1820 and 1870 much endeavour in Europe, and Scan-
dinavia in particular, was focussed on chronology (Gräslund, 1987). This included both methodology
and the resulting chronological schemes, with the development of the Three-Age System and its refinement
and application by Christian Thomsen, Jens Worsaae, Oscar Montelius and others laying the enduring
foundations for prehistoric archaeology world-wide. Although the emphasis of these developments lay
heavily on the empirical study of large quantities of artefacts, there was an important spatial element to
the work. Thomsen’s original method, and Worsaae’s subsequent confirmation through fieldwork, relied
on accurate recording of ‘find associations’, the spatial relationships of finds within excavated contexts.
At the landscape scale the spatial implication was that sites could now be relatively dated through the
establishment of typologies, either of the sites themselves or through artefacts associated with them.
The decades either side of the opening of the 20th century on both sides of the Atlantic saw the
professionalisation of archaeology with the opening of museums and university departments and the
resulting establishment of networks for communicating ideas through travel, meetings and publications.
Chronological schemes were well developed by now, for example, Worsaae’s for much of European
prehistory, a remarkable feat based on extensive travel and the empirical study of many thousands of
artefacts. Within this milieu the concept of ‘cultural groups’, ‘cultural units’ or ‘cultural stages’ developed,
spread and was modified in various ways although in general terms it represented what became known as
‘culture-history’, characterised as a shift of interest from the artefacts to the people who made and used
them; a shift from establishing chronology to writing history (Trigger, 1989, Chapter 5). Although the
term ‘culture’ was applied in American archaeology at this time it was ill-defined and not an accepted
methodological tool (Willey & Sabloff, 1993, p. 89); the main developments in this respect took place in
Europe and especially through the work of Vere Gordon Childe (Green, 1981).
Archaeology and spatial analysis 7
Childe used the idea of cultural groups combined with Montelius’s typologies and, through an
extensive program of travelling and recording artefacts and sites, produced the first synthesis of Euro-
pean prehistory (1925). His approach offered little in terms of spatial representation as it was based on
severe space-time reductionism; any spatial considerations are in the accompanying text and occasional
distribution maps, and these are usually at the regional scale so lack detail. Space is conceptualised as a
passive background across which people, ideas and cultures diffuse and migrate. On both continents the
Childean reductionist chart with time categorised along one axis and space along the other, together with
the associated writing of culture-history, had a profound and lasting effect. Gordon Willey’s Introduction to
American Archaeology (1966) and Christopher Hawkes’ ABC of the British Iron Age (1959) are both based
on such spatial minimalism and the idea of an ‘archaeological culture’ still lingers in some quarters.
Incorporated into these early approaches are two important elements, firstly possible relationships
between past people and their environment, an ecological focus, and secondly how spatial differentiation
and spatial relationships can tell something about human relationships, a social focus. This theme has
developed into what we today call ‘landscape archaeology’ and its importance within the discipline is clear
throughout this book; this is landscape as a spatial metaphor.
The foundations for the representation and understanding of spatial distributions and relationships
through the use of distribution maps were established by antiquarians although it was in the first half of
the twentieth century that the technique was developed to incorporate added interpretative power. One
of the pioneers of this ‘geographical’ approach was Cyril Fox, best known for his Personality of
Britain (Fox, 1959, originally published 1932). Fox argued that geology, topography, the form of the coastline, climate
and vegetation all combine to have a profound effect on the areas of occupation and the cultural attributes of
people; in effect this was historical geography. These enriched distribution maps provided Fox with interpre-
tative power, especially through the temporal sequences added by phased maps, thus
enabling the comparison of change through time on the basis of spatial patterning. Fox himself recognised various
issues within spatial thinking of the time so that, for example, ‘massed maps’ use a variety of symbols to
convey ‘cultural complexity’ although it is the ‘resultant patterns that are of interest’ and when detail is
important to the argument a ‘special map’ is provided, for example of a specific pottery type. Fox’s own
words illustrate how a distribution map, a relatively simple form of spatial representation, can be used to
construct a sensitive appreciation of human spatiality through imaginative spatial reasoning:
the keys to understanding [the landscape] were the major landmarks; our traveller steered his way
along plateau and spur, past barrow and cairn and stone circle, by the sight of successive mountain
tops. So guided he reached his goal . . . .
(Fox, 1959 [originally 1932], p. 91).
Central to developments in North America were Julian Steward (1950) and his ideas and methodologies of ‘cultural ecology’ and ‘area studies’ (Kerns, 2003). As an example of this approach, and of great
importance in the development of spatial thinking linked to large-scale fieldwork, especially the organ-
isation and collection of survey data and its interpretation, is the Virú Valley Project, Peru (Willey, 1953,
1974). The thinking behind this project, its methodologies and interpretive framework have had a lasting
influence on both sides of the Atlantic. It was innovative in many ways, indeed Gordon Willey himself
described it as ‘experimental’, although at the time he did not realise its potential. While chronol-
ogy and ‘pottery sequences’ were still the accepted focus of fieldwork, the objectives here were ambitious
and innovative from the outset with the intention of identifying individual sites, recording each one and
reconstructing past landscapes so as to: “reconstruct cultural institutions as reflected in settlement configu-
rations”, with “settlement patterns” defined as “the way man [sic] disposed himself over the landscape”
as shown by dwellings and community life reflected by environment, technology and the “institutions
of social interaction” (Willey, 1953). The relationships between settlements, pyramids and cemeteries, for
example, are represented by arrows and dotted lines indicating social groups while the overall limit for
agricultural activity is also delineated. Spatial relationships between sites are no longer a descriptive extra
to understanding each individual site but are central to the wider focus of landscape and the understand-
ing of social and cultural life at different scales. Comparing this to, for example, Squier and Davis, it
can be seen that spatial thinking is developing into a more formal spatial analysis that goes beyond the
inherent locational relationships.
As early as 1948 Walter Taylor’s book A Study of Archaeology was suggesting that archaeology needed to
go beyond the collecting and ordering of data and to some extent predicted the positivism of the 1960s
by arguing for interpretation based on the repeated reworking of hypotheses. It was Willey and Phillips
(1958, based on two papers from 1953 and 1955) who built on Taylor’s proposals and
argued for theory based on rigorous methodology as the way forward for archaeology stating that the
discipline lacked ‘a systematic body of concepts and premises constituting archaeological theory’ (Willey &
Phillips, 1958, p. 1, their emphasis). They suggest that archaeology should be more science than history
and there is a need for cross-cultural generalisations so that rather than the existing concern solely with
“the nature and position of unique events in space and time”, its ultimate purpose should be “the discov-
ery of regularities that are in a sense spaceless and timeless” (Willey & Phillips, 1958, p. 2).
In Europe it was David Clarke’s Analytical Archaeology (1968) that most forcefully made the case for methodological rigour, famously characterising archaeology as “an undisciplined empirical discipline . . . lacking a scheme of systematic and ordered study based upon declared and clearly defined models and
rules of procedure” (Clarke, 1968, p. xv). What follows is a detailed exploration of systems theory applied
to different scales of cultural unit. While the spatial component is represented by often minimalist dia-
grams, the underlying interpretative theory is explained in depth in the text. Spatial data, relationships and
interpretative concepts are often represented through models and modelling which are implicit within
Clarke’s approach although it was four years later that these were made explicit through his collection
of papers in Models in Archaeology (1972a). This reinforced archaeology’s long-standing relationship with
geography and geographical spatial thinking, not least their parallel claimed revolutions in quantification;
the volume was a direct analogue of Chorley and Haggett’s Models in Geography (1967), which saw models as
“constituting a bridge between the observational and theoretical levels” (Chorley & Haggett, 1967, p. 24).
Typical of the approaches within Clarke’s book are the formal locational modelling techniques based
on forms of economic theory, particularly from the German geographical tradition. For example, Hod-
der’s (1972) analysis of Romano-British settlement uses Central Place Theory and Thiessen polygons to
interrogate the spatial relationships between different sized settlements based on the ‘services’ they pro-
vided. Ellison and Harriss (1972) apply Site Catchment Analysis to the location of sites within a landscape
wherein the ‘resources’ have been classified according to their agricultural potential. By quantifying the
resources around sites of different periods shifts between pastoral and arable could be claimed. Clarke’s
own paper in the volume (1972b) is an innovative approach to the analysis of a single settlement, the
Iron Age village of Glastonbury, important for its multi-scalar approach. Starting with ‘modular units’
of a single house and associated structures, the analysis then moves to the site’s area and then the region
bringing in added spatial and attribute data. A temporal element is added through an ‘economic cycle
model’ which represents the annual agricultural cycle based on an infield, outfield and wasteland. There
is no doubt that these approaches meet the criteria of spatial thinking by providing a conceptual and
analytical framework, by providing analysis and communication alongside the manipulation, interpreta-
tion and explanation of spatial data. These elements are brought together in Clarke’s influential Spatial
Archaeology (1977a), where in the opening chapter (Clarke, 1977b) he presents a thorough explication
of spatial theories and methods of the time incorporating four underlying general theories that attempt
to move beyond description to explanation: anthropological, economic, social physics and statistical. He
draws an important distinction between ‘quasi-deductive’ non-formal spatial approaches which are based
on empirical visual interpretation of patterning, and formal modelling based on quantitative analysis.
Interestingly though, even within this strongly positivist framework Clarke does offer a word of caution,
“the underlying theory has been criticised as too ideal in its disregard for non-economic factors and the
fact that ‘cost’ is at least in part a culturally conditioned and relative threshold” (Clarke, 1977b, p. 19).
An important volume which in many ways represents the zenith of spatial quantification during the
1970s is Hodder and Orton’s Spatial Analysis in Archaeology (1976). Acknowledging the importance of
the distribution map as the accepted method of displaying locations and relationships between them,
the book proceeds through a wide range of statistical techniques that extend their interpretative power
based on hypothesis testing possibilities – “the methods aid the testing of hypotheses about spatial pro-
cesses, allow large amounts of data to be handled, and enable predictions to be made about the location,
importance and functioning of sites” (Hodder & Orton, 1976, p. 241). The importance of this volume
is that, in comparison to the others mentioned above, it offers an effective ‘how to do it’ manual which
is light on theoretical discussion and justification, making it a popular choice for working archaeologists.
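One statistic typical of this hypothesis-testing toolkit is the Clark and Evans nearest-neighbour index, which compares the mean observed distance from each site to its nearest neighbour with the mean expected under a random (Poisson) pattern of the same density: values well below 1 suggest clustering, values near 1 randomness, and values above 1 dispersion. The site coordinates below are invented for illustration; a minimal sketch:

```python
import math

def nearest_neighbour_index(points, area):
    """Clark & Evans R statistic: ratio of the mean observed nearest-
    neighbour distance to that expected for a random (Poisson) pattern.
    R < 1 suggests clustering, R ~ 1 randomness, R > 1 dispersion."""
    n = len(points)
    dists = []
    for i, p in enumerate(points):
        dists.append(min(math.dist(p, q)
                         for j, q in enumerate(points) if j != i))
    observed = sum(dists) / n
    expected = 1.0 / (2.0 * math.sqrt(n / area))  # Poisson expectation
    return observed / expected

# Invented coordinates: a tightly clustered pattern in a 100 x 100 area ...
clustered = [(10, 10), (11, 10), (10, 11), (12, 11), (11, 12)]
# ... versus a regularly spaced ('dispersed') pattern.
dispersed = [(10, 10), (50, 10), (90, 10), (30, 90), (70, 90)]

r_clustered = nearest_neighbour_index(clustered, 100 * 100)  # well below 1
r_dispersed = nearest_neighbour_index(dispersed, 100 * 100)  # above 1
```

The index is notoriously sensitive to how the study area is defined and to edge effects, which is precisely why such results are best treated as hypotheses to be interrogated rather than as answers.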
By the early 1980s cracks were beginning to appear in the acceptance and validity of the quantitative
revolution in both archaeology and geography. In the rush towards post-modernist humanism, the scien-
tific method and its valorised claims of objectivity gave way to a wide range of approaches and theoretical
stances based on more subjective and person-centred understandings of space, place and how to interpret
spatial relationships. Rather than take a strictly chronological approach to these developments, we would
instead like to identify a number of generic trends that emerged from the complex web of sometimes
tensioned and contradictory approaches that have been grouped under labels such as post-processualism,
post-structuralism, phenomenology, non-representational theory and, most recently, new materialism
(Thomas, 2015).
Perhaps the most striking element of this critique was the explicit questioning of the ‘space’ that lay at
the heart of the spatial archaeology that had flourished under the aegis of the New Archaeology. Despite
its professed reaction to the culture historical paradigm it sought to disrupt, when it came to space, the
work of both the culture historians and processualists was predicated upon an uncritical acceptance of
the ontological status of space as a self-evident given. This served as a universal, a-priori container (in a
Kantian sense) with a 3-dimensional geometry or an inert background with 2-dimensions that could be
measured using standard units. This Cartesian space was neutral and external – the space of the Meso-
lithic was identical to the space of the Bronze Age, which was identical to the space of the Roman period
etc. Crucially, they were all identical to the space of the modern archaeologist. Patterns in this universal
space could be measured and analysed creating a link to the activities that originally created them. These
patterns may well have been distorted by a range of taphonomic and formation processes, but space itself
had no role to play, it had no agency.
The critiques initiated by post-processualism posed a simple, yet radical, question. What if space is not a
universal, a-priori backdrop to human existence? What if it is not (and never has been) a mute canvas upon
which past activities left tell-tale traces that could be mapped and statistically summarised? What if the
space of the Mesolithic was different to the space of the Bronze Age, which was different to the space of the
Roman period etc. and all were different to the space of the modern archaeologist? Further, if space was not
immutable and neutral, then at any given time rather than a single monolithic space, there may have been
a host of competing spaces differentiated on grounds of politics, gender and authority. Spaces could also
be active and agential, exerting power through deft manipulation and careful configuration (e.g. Foucault’s
“heterotopia” (1986)). A growing interest in Phenomenology led to an active concern with embodied
spatial understandings (rather than the kinds of spatial knowledge gained through abstract representations)
in which knowledge of the world emerged through meaningful engagement within it (Tilley, 1994). This
led to a broader questioning of the role and necessity of traditional archaeological maps. Why position
such abstract representational schema between archaeologists and the materials they sought to study?
In this context, anthropological and ethnographic studies (many of which themselves drew inspira-
tion from Phenomenology) shed light upon a range of non-western spatial understandings that could
serve as analogues, or points of departure, for interpreting the pre-modern worlds of much archaeologi-
cal enquiry. They also questioned the reality of modernity’s claims on space as a monolithic, universal,
a-priori container by revealing the sheer complexity of spatial understandings that actually exist in the
modern west (Rappaport, 1994) and the myriad ‘species of space’ through which we live our own lives
(Perec, 1997). Emerging from this work was a shift in focus from space to place. Places were portrayed as
key locales, rich in social meaning and significance, around which everyday life was anchored (Cresswell,
2004; Feld & Basso, 1996). Archaeologists had spent years marking dots (meaningful places) on maps with
space providing the situational context. In the ‘platial’ archaeologies and anthropologies of the 1990s,
space was instead argued to gain its very existence from this configuration of places (Nixon, 2006). This
nodal interpretation of place, and humanist assumptions that locations could only become meaningful
once they had been drizzled with cultural significance, in turn came under scrutiny. Drawing inspiration
from the philosophical work of Deleuze and Guattari (1988, see also Bonta & Protevi, 2006), a static
nodal depiction of place was replaced by a more fluid and dynamic conceptualisation that identifies
places as tangles and knots in the ongoing flow of life (Ingold, 2011, pp. 145–155; Thrift, 1996). In such
work, places and spaces are profoundly emergent and relational – people do not create them by adding a
dollop of ‘culture’ to otherwise neutral locations, but instead it is the very relations that the location itself
is folded up within that allow significance to emerge.
In short, archaeological studies in the 1990s inspired by then contemporary spatial thinking in other
disciplines (e.g. social anthropology, sociology and human geography) questioned the very foundations of
the spatial archaeology project of the 1970s – namely that “lurking beneath the distribution of dots on a
map was a spatial process and causality to be discovered” (Tilley, 1994, p. 9). The result was a questioning
of the key tools used by spatial archaeologists, such as projected maps and the battery of formal statistical
and mathematical tools used to analyse them. If space was a fluid, emergent, profoundly relational and
highly contextual phenomenon, then the identification, representation and analysis of spatial patterns
posed significant challenges that in turn required new methods to address. However, whilst these theoreti-
cal critiques were undoubtedly powerful, they were not matched by any commensurate development of
new methodologies; the rejection of these formal approaches largely created a methodological void. Although Tilley’s phenomenology is only one strand of post-processual thinking, his com-
ment applies generally – “there is and can be no clear-cut methodology arising from [post-processual
thinking] to provide a concise guide to empirical research” (Tilley, 1994, p. 11). As he explained later
(Tilley, 2008), formal methodology was not seen as being necessary as the aim was to provide a thick text
narrative, almost a return to the antiquarian’s chorography, that could be reinterpreted and re-written
within an ongoing spiral of changing understanding. So here we have an important difference between
spatial analysis and spatial narrative both of which involve concepts, albeit very different, of space and
processes of reasoning. One key difference concerns the tools we use to represent spatial phenomena;
although spatial data underlie spatial narratives there is no interpretative requirement for them to be
explicitly depicted. Although this situation is undoubtedly changing, particularly with regard to the
potential of representational schema such as maps (see papers in Gillings, Hacıgüzeller & Lock, 2019), the
methodological developments in spatial archaeology that did take place alongside the theoretical ruptures
sketched above were largely computational and took their theoretical inspiration predominantly from
the New Archaeology. As a result rather than running hand-in-hand with this dynamically evolving
theoretical landscape, they ran in parallel.
In methodological terms, this most recent phase of spatial archaeology has been characterised by the
increasing importance of spatial technologies not least Geographic Information Systems (GIS). Since the
early days of GIS in archaeology (Allen, Green, & Zubrow, 1990) their potential has been recognised
within two important areas – firstly the management, integration and display of increasingly large and
complex forms of spatial data and, secondly, their potential within the area of spatial analysis. The rapid
adoption of GIS since the late 1980s, in various forms and in various ways, has had a major impact on
archaeology such that their use is now almost taken for granted. Even so, the use of GIS in archaeology
always has been, and still is (see Howey & Brouwer Burg, 2017; Verhagen, 2018), somewhat contentious
at the theoretical level due to the branching developmental pathway alluded to above, although the attrac-
tions of the technology are usually seen to outweigh any restrictions or disadvantages. Whilst at their
most strident, these arguments centred on accusations of a return to positivism and the inability of the
technology to respond to humanist, subjective understandings of place and landscape, the last decade has
seen a slow but welcome convergence. This has been characterised by a willingness on the part of spatial
analysts to engage with developments in archaeological theory, and a new openness to the possibilities of
spatial representations such as in the case of maps (e.g. Aldred & Lucas, 2019) and the methodological
possibilities offered by technologies such as GIS on the part of theorists (e.g. Fowler, 2013).
According to the U.S. National Academies (2006), spatially literate students are those who:
1 Know where, when, how and why to think in spatial terms.
2 Practice spatial thinking in an informed way: they have a broad and deep knowledge of spatial con-
cepts and spatial representations, a command over spatial reasoning using a variety of spatial ways of
thinking and acting, and well-developed capabilities for using supporting tools and technologies.
3 Adopt a critical stance insofar as they: can evaluate the quality of spatial data based on its source and
its likely accuracy and reliability; can use spatial data to construct, articulate, and defend a line of
reasoning or point of view in solving problems and answering questions; and can evaluate the valid-
ity of arguments based on spatial information.
The implications of this for archaeology are twofold. First, that spatial thinking is something that has
to be learnt, developed and supported. Second, that the use of spatial methods, techniques and tech-
nologies alone does not make the user a spatial thinker. The uncritical use of computer software, for
example, the push-button solution of generating a viewshed, does not meet the three criteria above
for determining spatial intelligence (Lock & Pouncett, 2017). In helping researchers to develop such
an ability, the U.S. National Academies (2006, p. 12) also highlighted three fundamental elements, and
we have taken inspiration from this threefold schema in structuring the volume as a whole, as well as the
individual chapters that make it up:
1 Concepts of space: providing the conceptual and analytical framework within which data can be integrated,
related and structured into a whole – as already mentioned, the issue here is often characterised as the
difference between ‘space’ and ‘place’, two very different concepts. The former is considered to be
objective, a blank background or container within which human action takes place. The fact that
these actions can be mapped, measured and analysed within a co-ordinate system is largely unquestioned
and taken as given. ‘Place’ on the other hand is a culturally constituted locale embedded with mean-
ing through the human actions and experiences that happen there (Cresswell, 2004); it is relative space
as compared to absolute space. It has long been recognised in geography and archaeology that spatial
technologies are designed to work with absolute objective space and, therefore, are often challenged
by concepts of place (Curry, 1998).
2 Tools of representation: providing the forms within which structured information can be stored, analysed, compre-
hended and communicated – the issue here is encapsulated within the post-modern ‘crisis of representa-
tion’, including how to represent those aspects of human experience that cannot be ‘scientifically’
measured and plotted. The data structures of technologies such as GIS are based on spatial primitives
(point, line, polygon plus cell-based coverages and attribute data). These are designed to represent a cer-
tain view of the world, one which is at odds with understandings not based on empirical objectivism.
3 Processes of reasoning: providing the means of manipulating, interpreting and explaining the structured
information – the issue here can be characterised as one of methodology, and equally a lack of it and
a need for it. Here we can differentiate between explicit and implicit methodology. What we can
call implicit methodology is particularly important within GIS and is incorporated into the general
critique of “technological determinism” (Huggett, 2000). Specifically, the procedures which under-
lie GIS operations involve pre-determined algorithms, for example line-of-sight, least-cost-path and
interpolating a digital elevation model (DEM). These incorporate ‘black box’ logics and algorithms
not immediately available to the user but fundamental in influencing the interpretation arrived at
and often chosen unknowingly.
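The point is easily made concrete: even a minimal least-cost-path routine embeds choices, such as how cell ‘cost’ is defined and whether movement is allowed to 4 or 8 neighbouring cells, that the push-button user never sees yet that shape the resulting path. Below is a deliberately simple sketch over an invented cost surface, using Dijkstra’s algorithm, the shortest-path logic underlying many grid-based GIS cost-path tools; the grid values are assumptions for illustration only.

```python
import heapq

# Invented cost surface: each cell's value is the cost of entering it
# (in a real application this might be derived from slope on a DEM).
cost = [
    [1, 1, 9, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
    [9, 9, 1, 1],
]

def least_cost_path(cost, start, goal):
    """Dijkstra's algorithm on a 4-connected grid. The choice of
    neighbourhood (4 vs 8 cells) and of the cost model are exactly the
    kind of buried assumptions the text calls 'black box' logics."""
    rows, cols = len(cost), len(cost[0])
    frontier = [(0, start, [start])]
    seen = set()
    while frontier:
        total, (r, c), path = heapq.heappop(frontier)
        if (r, c) == goal:
            return total, path
        if (r, c) in seen:
            continue
        seen.add((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                heapq.heappush(
                    frontier,
                    (total + cost[nr][nc], (nr, nc), path + [(nr, nc)]))
    return None

total, path = least_cost_path(cost, (0, 0), (0, 3))
```

Swapping the 4-cell neighbourhood for an 8-cell one, or deriving cost from slope rather than assigning it directly, can produce a quite different ‘optimal’ route: exactly the buried, pre-determined logic that the critique of technological determinism targets.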
An objective of the current volume, albeit ambitious, is to make archaeologists better spatial thinkers,
and as a result better spatial analysts. It is concerned with formal techniques of spatial analysis. This is a
poorly defined term that is sometimes conflated with spatial thinking, although to be represented herein
it must involve precise and repeatable methodology, these days usually computer-based. Spatial analysis is
subsumed within spatial thinking and, as the following quote demonstrates, cannot be divorced from the
interplay between quantitative and qualitative approaches, formal and informal aspects:
Spatial analysis exists at the interface between the human and the computer, and both play important
roles. The concepts that humans use to understand, navigate, and exploit the world around them are
mirrored in the concepts of spatial analysis. So [a comprehensive discussion on spatial analysis] will
often appear to be following parallel tracks – the track of human intuition on the one hand, with all
its vagueness and informality, and the track of the formal, precise world of spatial analysis on the other.
(de Smith, Goodchild, & Longley, 2018; our emphasis).
More specifically, the objective of the current volume is to encourage spatial thinking on the part of
archaeologists by detailing a range of contemporary spatial analytical techniques in as accessible a fash-
ion as possible. The remit given to the authors was to structure their chapter in a clear and consistent
fashion. Each starts with an introduction to the technique, explaining why it is important and how it
has been used before. The next section in each chapter explains the methodology, followed by one or
more case-studies applying the technique while the conclusion indicates future directions and possibili-
ties. Whilst the nature of some of the topics made it more difficult to follow this structure to the letter
than others, we think the results have been worth the effort, providing an accessible summary of each
technique alongside relative inter-chapter harmony. The chapters are embedded within contemporary
practice and are, therefore, to a large extent computer-based and quantitative, although we avoid lengthy
discussions on software solutions which would rapidly age the book since they tend to come and go
quickly. What is crucial to stress is that the quantitative focus of the book does not indicate a return to
the positivism described above but rather a maturing realistic acceptance of the importance and poten-
tial of these methods and related technologies. We believe that the application of quantified, formal
methods can be used in an exploratory way – not least for developing more nuanced understandings
of space and spatiality – and that any results produced are not ‘the answer’ but rather the starting point
for a process of interpretation and understanding of the past. As a final note, we hope that the methods
explained in this book will encourage and guide archaeologists in undertaking spatial analysis for some
years to come.
Note
1 This work was re-published in 1998 by the Smithsonian Institution as a 150th Anniversary Edition with an extensive
introduction by D.J. Meltzer.
References
Aldred, O., & Lucas, G. (2019). The map as assemblage: Landscape archaeology and mapwork. In M. Gillings,
P. Hacıgüzeller, & G. Lock (Eds.), Re-mapping archaeology: Critical perspectives, alternative mappings (pp. 19–36).
London: Routledge.
Allen, K. M. S., Green, S. W., & Zubrow, E. B. W. (Eds.). (1990). Interpreting space: GIS and archaeology. London:
Taylor and Francis.
Atkinson, P., & Duffy, D. M. (2019). Seeing movement: Dancing bodies and the sensuality of place. Emotion, Space
and Society, 30, 20–26.
Atwater, C. (1820). Description of the antiquities discovered in the state of Ohio and other western states. Transac-
tions and Collections of the American Antiquarian Society, 1, 105–267. (1997 re-published by Arthur W. McGraw).
Binford, L. R. (1964). A consideration of archaeological research design. American Antiquity, 29, 425–441.
Binford, L. R. (1965). Archaeological systematics and the study of culture process. American Antiquity, 31(2), 203–210.
Boast, R. (2009). The formative century, 1860–1960. In B. Cunliffe, C. Gosden, & R. Joyce (Eds.), The Oxford hand-
book of archaeology (pp. 47–70). Oxford: Oxford University Press.
Bodenhamer, D. J., Corrigan, J., & Harris, T. M. (Eds.). (2010). The spatial humanities: GIS and the future of humanities
scholarship. Bloomington: Indiana University Press.
Bonta, M., & Protevi, J. (2006). Deleuze and geophilosophy. Edinburgh: Edinburgh University Press.
Chandler, J. (1993). John Leland’s Itinerary: Travels in Tudor England. Stroud: Alan Sutton Publishing.
Childe, V. G. (1925). The dawn of European civilization. London: Kegan Paul.
Chorley, R., & Haggett, P. (1967). Models in geography. London: Methuen.
Clarke, D. (1968). Analytical archaeology. London: Methuen.
Clarke, D. (Ed.). (1972a). Models in archaeology. London: Methuen.
Clarke, D. (1972b). A provisional model of an Iron Age society and its settlement system. In D. Clarke (Ed.), Models
in archaeology (pp. 801–869). London: Methuen.
Clarke, D. (Ed.). (1977a). Spatial archaeology. New York: Academic Press.
Clarke, D. (1977b). Spatial information in archaeology. In D. Clarke (Ed.), Spatial archaeology (pp. 1–32). New York:
Academic Press.
Colt Hoare, R. (1812 and 1821). The ancient history of Wiltshire (2 Vols.). Republished 1975 by EP Publishing and
Wiltshire County Library.
Conneller, C. (2011). An archaeology of materials. London: Routledge.
Cresswell, T. (2004). Place: A short introduction. Oxford: Blackwell Publishing.
Curry, M. R. (1998). Digital places: Living with geographic information technologies. London: Routledge.
Dalakoglou, D., & Harvey, P. (2012). Roads and anthropology: Ethnographic perspectives on space, time and (Im)
mobility. Mobilities, 7(4), 459–465.
Deleuze, G., & Guattari, F. (1988). A thousand plateaus. London: Athlone Press.
de Smith, M., Goodchild, M., & Longley, P. (2018). Geospatial analysis: A comprehensive guide (6th ed.). Retrieved May
2019, from www.spatialanalysisonline.com/
Doran, J. (1970). Systems theory, computer simulations and archaeology. World Archaeology, 1, 289–298.
Doran, J., & Hodson, F. (1975). Mathematics and computers in archaeology. Edinburgh: Edinburgh University Press.
Ellison, A., & Harriss, J. (1972). Settlement and land use in the prehistory and early history of southern England:
A study based on locational models. In D. Clarke (Ed.), Models in archaeology (pp. 911–962). London: Methuen.
Feld, S., & Basso, K. H. (Eds.). (1996). Senses of place. Santa Fe: SAR Press.
Foucault, M. (1986). Of other spaces. Diacritics, 16(1), 22–27.
Fowler, C. (2013). The emergent past. Oxford: Oxford University Press.
Fox, C. (1959). The personality of Britain: Its influence on inhabitant and invader in prehistoric and early historic times (4th
ed., originally published 1932). Cardiff: National Museum of Wales.
Archaeology and spatial analysis 15
Gillings, M., Hacıgüzeller, P., & Lock, G. (Eds.). (2019). Re-mapping archaeology: Critical perspectives, alternative mappings.
London: Routledge.
Gräslund, B. (1987). The birth of prehistoric chronology. Cambridge: Cambridge University Press.
Green, S. (1981). Prehistorian: A biography of V. Gordon Childe. Bradford-on-Avon: Moonraker Press.
Hacıgüzeller, P. (2017). Archaeological (digital) maps as performances: Towards alternative mappings. Norwegian
Archaeological Review, 50(2), 149–171.
Hammer, E. (2014). Local landscape organization of mobile pastoralists in southeastern Turkey. Journal of Anthropo-
logical Archaeology, 35, 269–288.
Hawkes, C. (1959). The ABC of the British Iron Age. Antiquity, 33, 170–182.
Hodder, I. (1972). Locational models and the study of Romano-British settlement. In D. Clarke (Ed.), Models in
archaeology (pp. 887–910). London: Methuen.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. New Studies in Archaeology, 1. Cambridge: Cambridge
University Press.
Howey, M. C. L., & Brouwer Burg, M. (2017). Assessing the state of archaeological GIS research: Unbinding analyses
of past landscapes. Journal of Archaeological Science, 84, 1–9.
Huggett, J. (2000). Computers and archaeological culture change. In G. Lock & K. Brown (Eds.), On the theory and
practice of archaeological computing (pp. 5–22). Oxford: Oxford University Committee for Archaeology.
Hunter, M. (1975). John Aubrey and the realm of learning. London: Duckworth.
Hymes, D. (Ed.). (1965). The use of computers in anthropology. London: Mouton.
Ingold, T. (1993). The temporality of landscape. World Archaeology, 25(2), 152–174.
Ingold, T. (2011). Being alive: Essays on movement, knowledge and description. London: Routledge.
Kerns, V. (2003). Scenes from the high desert: Julian Steward’s life and theory. Urbana: University of Illinois Press.
Knappett, C. (2011). Networks of objects, meshworks of things. In T. Ingold (Ed.), Redrawing anthropology: Materials,
movements, lines (pp. 45–63). Farnham: Ashgate.
Lefebvre, H. (1991). The production of space. Oxford: Blackwell’s.
Lock, G., & Pouncett, J. (2017). Spatial thinking in archaeology: Is GIS the answer? Journal of Archaeological Science,
84, 129–135. http://dx.doi.org/10.1016/j.jas.2017.06.002
Löwenborg, D. (2018). Knowledge production with data from archaeological excavations. In I. Huvila (Ed.), Archae-
ology and archaeological information in the digital society (pp. 37–53). Abingdon: Routledge.
Lucas, G. (2012). Understanding the archaeological record. Cambridge: Cambridge University Press.
McCormack, D. P. (2008). Geographies for moving bodies: Thinking, dancing, spaces. Geography Compass, 2(6),
1822–1836.
McFarlane, C. (2011). The city as assemblage: Dwelling and urban space. Environment and Planning D: Society and
Space, 29(4), 649–671.
Nixon, L. (2006). Making a landscape sacred. Oxford: Oxbow.
Perec, G. (1997). Species of spaces and other pieces. London: Penguin.
Peterson, R. (2003). William Stukeley: An eighteenth century phenomenologist? Antiquity, 77(296), 394–400.
Piggott, S. (1985). William Stukeley: An eighteenth century antiquarian (2nd ed.). London: Thames and Hudson.
Rappaport, A. (1994). Spatial organization and the built environment. In T. Ingold (Ed.), Companion encyclopedia of
anthropology (pp. 460–502). London: Routledge.
Redman, C. L. (1974). Archaeological sampling strategies. Modular Publications in Archaeology, 55. New York:
Addison-Wesley.
Resina, J. R., & Wulf, C. (2019). Repetition, recurrence, returns: How cultural renewal works. Lanham, MD: Lexington
Books.
Richards-Rissetto, H., & Landau, K. (2014). Movement as a means of social (re)production: Using GIS to measure
social integration across urban landscapes. Journal of Archaeological Science, 41, 365–375.
Roberts, L. (2012). Mapping cultures: A spatial anthropology. In L. Roberts (Ed.), Mapping cultures: Place, practice,
performance (pp. 1–25). Houndmills, Basingstoke, Hampshire and New York: Palgrave Macmillan.
Schnapp, A. (1996). The discovery of the past. London: British Museum Press.
Seigworth, G. (2000). Banality for cultural studies. Cultural Studies, 14(2), 227–268.
Snead, J. E., Erickson, C. L., & Darling, J. A. (2009). Landscapes of movement: Trails, paths, and roads in anthropological
perspective. Philadelphia: University of Pennsylvania Museum of Archaeology and Anthropology.
16 Mark Gillings, Piraye Hacıgüzeller and Gary Lock
Soja, E. (1996). Thirdspace: Journeys to Los Angeles and other real and imagined places. Oxford: Blackwell’s.
Spaulding, A. (1960). The dimensions of archaeology. In G. E. Dole & R. L. Carneiro (Eds.), Essays in the science of
culture in honor of Leslie A. White (pp. 437–456). New York: Crowell.
Squier, E. G., & Davis, E. H. (1848). Ancient monuments of the Mississippi Valley. Smithsonian Contributions to Knowl-
edge, 1. Washington. (Re-published in 1998 with an extended Introduction by D. J. Meltzer).
Steward, J. (1950). Area research: Theory and practice. New York: Social Science Research Council.
Sweet, R. (2004). Antiquaries: The discovery of the past in eighteenth century Britain. London: Hambledon and London.
Thomas, J. (2015). The future of archaeological theory. Antiquity, 89(348), 1287–1296.
Thrift, N. J. (1996). Spatial formations. London and Thousand Oaks, CA: Sage.
Thrift, N. J. (2008). Non-representational theory: Space, politics, affect. London: Routledge.
Tilley, C. (1994). The phenomenology of landscape: Places, paths and monuments. Oxford: Berg.
Tilley, C. (2008). Phenomenological approaches to landscape. In B. David & J. Thomas (Eds.), Handbook of landscape
archaeology (pp. 271–276). Walnut Creek: Left Coast Press.
Trigger, B. G. (1989). A history of archaeological thought. Cambridge: Cambridge University Press.
Trigger, B. G. (1998). Sociocultural evolution: Calculation and contingency. Malden, MA: Blackwell Publishers.
US National Academies. (2006). Learning to think spatially: GIS as a support system in the K-12 curriculum. Washington:
The National Academies Press. Retrieved March 2019, from www.nap.edu/read/11019/chapter/1
Verhagen, P. (2018). Spatial analysis in archaeology: Moving into new territories. In C. Siart, M. Forbriger & O.
Bubenzer (Eds.), Digital geoarchaeology: New techniques for interdisciplinary human-environmental research (pp. 11–25).
Cham: Springer International Publishing.
Verhagen, P., Nuninger, L., & Groenhuijzen, M. R. (2019). Modelling of pathways and movement networks in
archaeology: An overview of current approaches. In P. Verhagen, J. Joyce, & M. R. Groenhuijzen (Eds.), Finding
the limits of the limes: Modelling demography, economy and transport on the edge of the Roman Empire (pp. 217–249).
Cham: Springer International Publishing.
Warf, B., & Arias, S. (2009). The spatial turn: Interdisciplinary perspectives. London and New York: Routledge.
Wickstead, H. (2019). Cults of the distribution map: Geography, utopia and the making of modern archaeology.
In M. Gillings, P. Hacıgüzeller, & G. Lock (Eds.), Re-mapping archaeology: Critical perspectives, alternative mappings
(pp. 37–72). London: Routledge.
Willey, G. R. (1953). Prehistoric settlement patterns in the Virú Valley, Peru. Bulletin No. 155. Washington: Bureau of
American Ethnology.
Willey, G. R. (1966). An introduction to American archaeology. NJ: Prentice-Hall.
Willey, G. R. (1974). The Virú Valley settlement pattern study. In G. R. Willey (Ed.), Archaeological researches in retrospect
(pp. 149–178). Cambridge: Winthrop Publishers.
Willey, G. R., & Phillips, P. (1958). Method and theory in American archaeology. Chicago: University of Chicago Press.
Willey, G. R., & Sabloff, J. A. (1993). A history of American archaeology (3rd ed.). New York: W.H. Freeman and
Company.
Wood, D. (2010). Rethinking the power of maps. New York: The Guilford Press.
Wood, D. (2012). The anthropology of cartography. In L. Roberts (Ed.), Mapping cultures: Place, practice, performance
(pp. 280–303). Houndmills, Basingstoke, Hampshire and New York: Palgrave Macmillan.
2
Preparing archaeological data
for spatial analysis
Neha Gupta
Introduction
In the final pages of Spatial Analysis in Archaeology, Ian Hodder and Clive Orton (1976, p. 245) remarked
that the ‘slow collection of large bodies of reliable data, [. . .] will allow spatial processes to be better
understood’. This scholarly work made explicit spatial concepts in the field of archaeology, and drew
attention to cross-disciplinary conversations that archaeologists can have with geographers and social
scientists. Published in 1976, Hodder and Orton’s remarks might seem simple and banal, yet they underscore two key facets of archaeology that hold true in today’s digital data-rich environment: first, that
archaeologists will re-use archaeological data that were collected by other scholars at different times,
who employed different methods, tools and technologies; and second, that for archaeology to engage in
meaningful conversations on complex phenomena, we must have ‘large bodies of reliable data’ (emphasis
mine) which can be understood as a reliable archaeological database (Gupta & Devillers, 2017, p. 857).
Hodder and Orton (1976, p. 244) further note that spatial analytic techniques go ‘hand-in-hand’ with
the ‘collection of better data’ and remark on the value of ‘very detailed information’, but they do not
explicitly describe what reliable means, how research design and the goals of a particular project are
linked to data quality, or what role data quality might play in the analysis, interpretation and
re-use of archaeological data.
Archaeologists increasingly face a scenario where the re-use of archaeological data, particularly the
processing and analysis of digital archaeological data, is posing pointed challenges to the practice of
archaeology (Kansa, Kansa, & Arbuckle, 2014; Huggett, 2015). The social life of archaeological data
typically extends beyond specialists and is best understood in relation to local communities and society
as a whole. The changing relationship between archaeology and society is reflected in scholarship on
the abuse and misuse of archaeology (Silberman, 1989; Kohl & Fawcett, 1995; Meskell, 2005; Kohl,
Kozelsky, & Ben-Yehuda, 2007), as well as on challenging inaccurate views of the human past (Wylie,
2002; Trigger, 2006). Archaeology (and archaeological data), therefore, can serve both public and
scholarly goals.
Recent interests in the preparation of archaeological data for further use, and in the quality of archaeological data, are influenced by two broader developments: first, the growing use of digital and geospatial tools and technologies in data acquisition (Dibble & McPherron, 1988; Levy & Smith, 2007); and
second, the exponential growth of communication tools, particularly Web 2.0 technologies that facilitate
collaboration and can encourage exchange and sharing of data between scholars, institutions and non-
specialists (Kansa, 2011). The apparent democratization of archaeological data has renewed concerns
over the privacy of archaeological sites and the sharing of sensitive locational information (Bampton &
Mosher, 2001; Sitara & Vouligea, 2014).
Growing awareness of a digital data-rich environment (Bevan & Lake, 2013) has spurred calls for
Open Science in archaeology (Marwick, 2017): practices that aim to enable ‘reproducibility’ through
‘scripted workflows’, ‘version control’ and ‘collaborative analysis’, and to encourage the public availability of
pre-prints and data (Marwick et al., 2017, p. 8). Efforts in open archaeology are premised on the belief that
the ‘analytical pipeline’ in archaeological research has not been available for scholars to examine, critique
and re-use, a situation that impacts the range and scope of archaeology (Marwick, 2017, p. 424). This
situation is reflected in the prevailing use of ‘point-and-click’ commercial software that obscures underly-
ing algorithms and assumptions. Moreover, archaeologists typically do not document (or do not report)
sufficient information on how and why particular decisions were made during cleaning and analysis, a
situation that presents significant challenges in replicating analytical methods and results, even when data
are available for re-use.
Understanding the role of quality information in archaeology is pressing as digital geospatial data
acquired in the field are increasingly compiled with existing digitized information and together they are
combined into computational pipelines (Snow et al., 2006; Kintigh, 2006). Archaeologists are accumu-
lating large amounts of data through ‘real-time’ digital documentation in the field (Vincent, Kuester, &
Levy, 2014), ‘mobilizing the past’ (Averett, Counts, & Gordon, 2016) and promoting ‘transparency’ in
field collection (Strupler & Wilkinson, 2017). Typically paperless, these efforts are thought to minimize
redundancy and human-introduced errors in the recording of archaeological sites and archaeological data
(Austin, 2014) and potentially shorten the time interval between stages in the archaeological workflow
(Roosevelt, Cobb, Moss, Olson, & Ünlüsoy, 2015).
To manage, store and analyse these large amounts of digital archaeological data, archaeologists typically
harness geospatial technologies such as commercial Geographic Information Systems (GIS). However,
these spatial databases are known to have poor error management, which can result in error propagation that impacts subsequent analysis and the final result (Figure 2.1) (Hunter & Beard, 1992).
The widespread use of GIS in archaeology therefore can constrain broader assessments of archaeological
methods and the quality of data in terms of interpretation and re-use. Moreover, processing of data and
analysis within computational pipelines is rarely documented and shared (Costa, Beck, Bevan, & Ogden,
2013, p. 450), limiting what is known on procedures and transformations and the overall quality of
sources (Evans, 2013).
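The idea of error propagation can be made concrete with a simple calculation. The sketch below is illustrative rather than drawn from Hunter and Beard (1992): it combines hypothetical positional errors introduced at successive stages (source map, digitising, coordinate transformation) into a single global error estimate, using the common root-sum-of-squares simplification that assumes the components are independent.

```python
import math

def global_error(*component_errors):
    """Combine independent error components into one global error
    estimate via root-sum-of-squares -- a common simplification that
    assumes the components are independent."""
    return math.sqrt(sum(e ** 2 for e in component_errors))

# Hypothetical positional errors (metres) introduced at each stage:
source_error = 5.0      # error inherent in the original survey map
digitising_error = 2.0  # error added when digitising features
transform_error = 1.0   # error added by coordinate transformation

total = global_error(source_error, digitising_error, transform_error)
print(f"estimated global positional error: {total:.2f} m")
```

The point of the sketch is that the global error is larger than any single stage's error, which is why undocumented processing steps quietly degrade the final result.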
In this chapter, I discuss the preparation of archaeological data for spatial analysis and re-use from the
perspective of spatial data quality in the archaeological workflow. I draw from scholarship on geospatial
data quality to shed light on data-centric issues in archaeology. I show that archaeological spatial analysis
can be improved through better documentation of the procedures and transformations applied to archaeological data, and discuss how these practices can facilitate deeper understanding of archaeological methods and practice and open new forms of research in archaeology. I argue that documentation of data cleaning and
tidying procedures, and version control can enable more rigorous research practice, and attune archaeolo-
gists to data-centric imperfections in archaeological data.
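The argument for documenting cleaning and tidying procedures can be sketched as a scripted workflow in which every transformation is logged alongside the data. All field names and cleaning rules below are hypothetical, chosen only to illustrate the practice.

```python
def clean_records(records, log):
    """Apply simple cleaning rules to site records, appending a
    human-readable note to `log` for every transformation so the
    procedure itself is preserved for future re-users of the data."""
    cleaned = []
    for rec in records:
        rec = dict(rec)  # never mutate the raw data in place
        # Rule 1: normalise an inconsistent site-type spelling
        if rec.get("site_type") == "Earthwrk":
            rec["site_type"] = "Earthwork"
            log.append(f"{rec['site_id']}: corrected site_type spelling")
        # Rule 2: flag, rather than silently drop, records missing coordinates
        if rec.get("easting") is None or rec.get("northing") is None:
            rec["quality_flag"] = "missing coordinates"
            log.append(f"{rec['site_id']}: flagged missing coordinates")
        cleaned.append(rec)
    return cleaned

raw = [
    {"site_id": "S001", "site_type": "Earthwrk", "easting": 4025, "northing": 1180},
    {"site_id": "S002", "site_type": "Barrow", "easting": None, "northing": 1190},
]
log = []
tidy = clean_records(raw, log)
print(log)  # the log is archived with the data, not discarded
```

Kept under version control, such a script and its log make each decision reviewable, which is precisely the transparency the Open Science argument calls for.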
Uncertainty in geographical data is the recognition that there exists a difference between a complex
reality and our conceptualization and measurement of that reality (Plewe, 2002). Our conceptualization
of reality is necessarily a generalization and abstraction, and thus an imperfect model. Uncertainty in
Figure 2.1 Sources and types of errors in data collection and compilation, data processing and data usage that
result in final global error, adapted from Hunter and Beard (1992).
this model can be described as having three dimensions: space, time and theme. A map, for example, is
an imperfect model of a complex reality; a full scale (1:1) map of a town could never be rolled out, nor
would we make good use of such a document. Within this framework, we can better understand for
each of the three dimensions, elements of quality, including error, accuracy, precision, consistency and
completeness. Greater awareness of sources and causes of error and uncertainty can enable us to represent
and manage imperfections within spatial databases, which in turn can facilitate greater confidence in the
interpretation of digital archaeological data.
Quality issues are present in all data and throughout the research process, a situation that impacts
the interpretation of archaeological data and decision-making (Figure 2.2). The preparation of
archaeological data for analysis is tied to data quality which, in turn, is related to research design
and the archaeologist’s intended purpose for those data. Data quality can be understood in terms of
internal and external quality. Internal quality is the ‘level of similarity between data produced’ and the ‘ideal
data’ (data without error) or ‘control data’. The ideal data are based on a set of specifications, or rules
and requirements, that define how objects will be represented, which geometries will represent each
type of object, the attributes that describe them and the possible values for these attributes (Devillers &
Jeansoulin, 2006, p. 38). Therefore, when produced data differ to some degree from these specifications,
we have error, imprecision and incompleteness. External quality relates to how the data that were
produced meet the needs of a particular user. Whereas data quality elements can be
measured separately for space, time and theme, an assessment of overall data quality requires care-
ful consideration on all three dimensions because they are interdependent. Greater emphasis is now
placed on ‘fitness for use’, which shifts focus to the needs of particular users and their intended use
of data (Chrisman, 2006).
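A minimal sketch of measuring internal quality against a specification might look as follows; the specification, fields and records are hypothetical, standing in for the ‘rules and requirements’ that define the ideal data.

```python
SPEC = {
    "site_type": {"Earthwork", "Barrow", "Hillfort"},  # allowed attribute values
    "geometry": {"point", "polygon"},                  # allowed geometries
    "required": ["site_id", "site_type", "geometry"],  # completeness rule
}

def internal_quality(records):
    """Return the proportion of records conforming to SPEC -- a crude
    measure of similarity between produced and 'ideal' data."""
    conforming = 0
    for rec in records:
        complete = all(field in rec for field in SPEC["required"])
        valid = (rec.get("site_type") in SPEC["site_type"]
                 and rec.get("geometry") in SPEC["geometry"])
        if complete and valid:
            conforming += 1
    return conforming / len(records)

records = [
    {"site_id": "S001", "site_type": "Barrow", "geometry": "point"},
    {"site_id": "S002", "site_type": "Midden", "geometry": "point"},  # invalid value
    {"site_id": "S003", "geometry": "polygon"},                       # incomplete
]
print(internal_quality(records))  # one of three records conforms
```

External quality, by contrast, cannot be computed from the data alone: it depends on whether the same records are fit for a particular user's purpose.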
Figure 2.2 The archaeological workflow in terms of a computational pipeline from data acquisition to unpub-
lished data, and re-use. Problems of quality impact data in each stage. Black boxes in this workflow occur
wherever archaeologists employ software and tools whose code are unavailable to review and modify and that
do not enable documentation of transformations.
Scholarship on quality issues in archaeology ranges from ‘quality assurance’ (Banning, Hawkins, Stewart, Hitchings, & Edwards, 2017) and ‘quality standard’ (Willems & Brandt, 2004) in field surveys to
verifying and validating the quality of computational models (Burg, Peeters, & Lovis, 2016), the use of
statistical techniques to address spatio-temporal uncertainty (Zoghlami, de Runz, Akdag, & Pargny, 2012;
Kolar, Macek, Tkáč, & Szabó, 2015; cf. Fusco & de Runz this volume) and temporal uncertainty (Green,
2011; Bevan, Crema, Li, & Palmisano, 2013; Crema, 2012) to the quality of 3-dimensional photogram-
metric models (Porter, Roussel, & Soressi, 2016). While fruitful, these efforts typically offer immediate
solutions for project-specific problems and emphasize quantification techniques themselves, overlooking
data quality issues in processing digital archaeological data and their future usage.
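One concrete technique from this literature on temporal uncertainty is aoristic analysis (used, e.g., by Crema, 2012), which distributes the probability of an imprecisely dated event evenly across the time blocks its date range overlaps. A minimal sketch, with illustrative dates:

```python
def aoristic_weights(start, end, block_size=100):
    """Distribute one event dated to [start, end) uniformly across
    time blocks of `block_size` years; returns {block_start: weight}."""
    duration = end - start
    weights = {}
    block = (start // block_size) * block_size
    while block < end:
        overlap_start = max(start, block)
        overlap_end = min(end, block + block_size)
        weights[block] = (overlap_end - overlap_start) / duration
        block += block_size
    return weights

# An event dated only to 150-450 CE, against 100-year blocks:
print(aoristic_weights(150, 450))
# weights of 1/6, 1/3, 1/3 and 1/6 fall in the blocks starting 100-400
```

Summed over many events, these fractional weights give a per-block intensity that carries the dating uncertainty forward into analysis instead of discarding it.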
Recent interest in the quality of digital archaeological data shifts the focus away from positional
accuracy as the primary concern in archaeology, as is reflected in works such as Dunnell, Teltser, and Ver-
cruysse (1986), Dibble and McPherron (1988), Wheatley and Gillings (2002), Heilen, Nagle, and Altschul
(2008), Atici, Kansa, Lev-Tov, and Kansa (2013), Evans (2013), Kansa et al. (2014), Wilshusen, Heilen,
Catts, de Dufour, and Jones (2016), Cooper and Green (2016) and McCoy (2017). Managing and shar-
ing digital geospatial data are encouraging archaeologists to think in terms of data-intensive methods and
‘big data’, i.e. data that are characterized by volume, velocity, variety, veracity, visualization and visibility
(McCoy, 2017, p. 76; Green, this volume). Large sets of data, such as those that are generated through
multiple field seasons, are greatly impacted by error (Dunnell et al., 1986). A similar scenario might be
described for data collected through highly complex projects that involve the integration of multiple
sources of information that are heterogeneous spatially, temporally and thematically.
To improve data documentation and quality, Kansa et al. (2014) place emphasis on editorial and col-
laborative review of archaeological collections. They suggest that data cleaning early in the archaeologi-
cal workflow can avoid costly investments in terms of person hours, and publication delays later in the
process, particularly when the archaeologist who collected the data is not available to encode and link
individual field documents. Re-use and comparison of existing digital archaeological collections are highlighting data ‘accuracy, reliability and completeness’ (Evans, 2013, p. 20), and the challenges of integrating
diverse data that do not have clear documentation (Cooper & Green, 2016). Most importantly, recent
scholarship greatly expands what archaeologists consider pertinent to the quality of data, and draws attention to elements such as error, accuracy, precision, consistency and completeness in digital archaeological
collections. These efforts reflect growing intellectual interest in re-use of research data.
Research data management initiatives are now supported by governments. In the United States and
Canada as well as other countries, publicly funded research projects are required to lay out data man-
agement plans that ensure academic outputs are prepared for preservation and re-use (National Science
Foundation, 2017; Tri-Agency, 2016). The Canadian Tri-Agency Statement for Principles on Digital
Data Management, for example, includes an overview of the responsibilities of researchers, research com-
munities, research institutions and research funders, as well as best practices in data management planning
throughout the research project lifecycle. These broader developments are encouraging archaeologists
to share digital archaeological data over Web-based platforms. Data publishers such as Open
Context, and digital repositories such as the US-based Digital Archaeological Record (tDAR), the UK-based
Archaeology Data Service (ADS), and the Advanced Research Infrastructure for Archaeological Dataset
Networking in Europe (ARIADNE) offer new opportunities for data re-use. These efforts reflect greater
control over metadata in archaeology and the potential for new forms of collaborative research.
Method
Consideration of data quality is invariably linked to research design and the goals of a particular project.
Until recently, spatial data quality in archaeology seemed to refer to positional accuracy which is com-
monly associated with tools and technologies such as Global Positioning Systems (GPS) and remotely
sensed imagery (Wheatley & Gillings, 2002). Emphasis on locational information comes as no surprise,
given archaeological interests in field-based research and because of requirements in Cultural Resource
Management (CRM) and planning to inventorise archaeological sites (Heilen et al., 2008). Archaeologists
made great efforts to better model the spatial dimension in digital archaeological data, overlooking the
temporal dimension or chronology (Llobera, 2007; Rabinowitz, 2014). Yet archaeological data have spa-
tial, temporal and thematic dimensions, all of which must be considered in any evaluation of data quality,
and especially when data-intensive methods are employed.
Archaeological field data are typically complemented with terrestrial imagery (e.g. ground penetrat-
ing radar, aerial, and satellite imagery) and the recovery of portable artefacts such as potsherds, tools and
skeletal material. Archaeological documentation of surface features such as earthworks, field walls, monu-
ments, pathways, rock images, and subsurface ones such as hearths, camps, and dwellings and their spatial
relationships with other recovered material culture can be thought of as a collection. In this conceptual-
ization, the archaeological database is differentiated from the archaeological record. The latter refers to material
culture that exists, whether it has been recovered or awaits investigation (Gupta & Devillers, 2017). The
archaeological database consists of collections that archaeologists have successfully recovered at different
times and places, and can be thought of as an imperfect model of a complex reality. A growing, more
reliable archaeological database can facilitate insights into human history.
In practice, archaeologists are increasingly digitizing and integrating new archaeological data with
archaeological collections stored in local and national repositories for combined analysis (Kintigh et al.,
2015). Yet repositories are themselves a product of the society in which they were created, and thus,
social, political, cultural and historical circumstances influence them. One might consider, for example,
why a particular collection is chosen for digitization, and how and why specific classes of data within
that collection are preserved and curated. These decisions can impact subsequent study of these research
data. Data-intensive methods that integrate different collections take on their assumptions and limitations
(Atici et al., 2013), in addition to uncertainties in any new data (Allison, 2008).
Moreover, at some point in the life of archaeological data, regardless of their acquisition through
research or regulatory projects, they will be in the hands of experts who do not have direct access to
the original data collectors, their ‘contextual knowledge’ and field journals. In practice, the person who
acquired data on-site during archaeological fieldwork typically also encodes them for further use. Therefore, the encoder has pre-existing knowledge of the spatial relationships in the data that enables linkages between individual documents. Ideally, the same archaeologist analyses, interprets and presents results, a
situation that is typical in small academic projects. In the case of regulatory or CRM archaeology, digital
archaeological data, once acquired, might be transferred to data analysts and data managers.
With site information, aerial photographs, geophysical readings, and topographic surveys, an
archaeologist might prepare derived data products such as digital elevation models and files that store
the location, shape and attributes of archaeological features (points, lines, polygons). These data are
processed and analysed within a computational pipeline, the results of which are used to produce a
synthetic document that receives some form of peer review either as a technical report, or a scholarly
publication (Van der Linden & Webley, 2012). Such documents, particularly those produced under
regulatory frameworks, in turn can be the basis upon which scholars and policy makers make decisions
that impact local communities and society as a whole. Yet, in many cases, although not all, the data
themselves are not subject to review (Gobalet, 2001; Roebroeks, Gaudzinski-Windheuser, Baales, &
Kahlke, 2017), nor is quality information on research data necessarily made explicit (McCoy, 2017,
pp. 4–5). This oversight can ‘hide serious logical and empirical faults in the underlying assumptions’
in archaeological practice (e.g. in CRM archaeology, the failure to detect archaeological sites despite
100% or full-coverage survey) (Heilen et al., 2008, p. 1.1). This situation, however, does not mean that
the quality of data did not matter or that these data cannot be repurposed. Rather, recent scholarship
suggests that archaeologists are concerned about the quality of data, and have criteria upon which they
base their level of confidence. It should come as no surprise that insights into data management and
field methods are often gained through repurposing of existing data.
For example, Wells (2011) presents the integration of archaeological information from four state
historic preservation offices (SHPOs) in the United States. The SHPOs included in the study were
Kentucky, Illinois, Indiana and Missouri and each office stored and maintained archaeological data in
a GIS. The author notes that archaeological site records in each spatial database included similar basic
information. Wells examines the format, projection and coordinate system of location information (e.g.
polygon shapefile in Lambert conformal conic, measurements in feet) to assess interoperability across the
four sources. To bridge the four sources, the author devised six categories of attributes, including loca-
tion information, site identification, site type, definitions of one specific cultural affiliation, the quality of
previous investigations and an assessment of site informational quality (2011). Although Wells does not
explicitly define ‘quality’, he had clear criteria upon which to evaluate archaeological information such as
cultural affiliations and Mississippian culture change. Specifically, he bases the strength of these ‘ontological definitions’ on two factors, namely: first, the level of investigation at a site, as it offers consideration of how
far an ontological characterization can be extended (maximum intensity of previous investigations), and
second, the diversity of data structures used to represent the diversity of investigative approaches. Most
fundamentally, this approach highlights thematic information in assessing overall data quality. Wells shows
that careful evaluation of spatial and thematic accuracy enables thoughtful integration and meaningful
repurposing of archaeological site records.
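The kind of bridging Wells describes, mapping each source's own field names onto shared attribute categories, can be sketched as a crosswalk table. The field names and mappings below are hypothetical illustrations, not Wells's actual schema.

```python
# Hypothetical crosswalk: each source database's field names mapped onto
# shared target attributes so records can be integrated for analysis.
CROSSWALK = {
    "kentucky": {"SITE_NO": "site_id", "SITE_CLASS": "site_type",
                 "CULT_AFF": "cultural_affiliation"},
    "illinois": {"SiteNumber": "site_id", "Type": "site_type",
                 "Affiliation": "cultural_affiliation"},
}

def harmonise(record, source):
    """Rename a source record's fields to the shared schema, keeping a
    note of the source so provenance is not lost during integration."""
    mapping = CROSSWALK[source]
    out = {target: record[field] for field, target in mapping.items() if field in record}
    out["source"] = source
    return out

ky = {"SITE_NO": "15FA100", "SITE_CLASS": "mound", "CULT_AFF": "Mississippian"}
il = {"SiteNumber": "11S123", "Type": "mound", "Affiliation": "Mississippian"}
merged = [harmonise(ky, "kentucky"), harmonise(il, "illinois")]
print(merged[0]["site_id"], merged[1]["site_id"])
```

Retaining the source tag on every harmonised record is what allows later users to weigh each record by the quality of the database it came from.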
In an era of cyber-infrastructures, scholars are increasingly interested in ‘grey literature’, unpublished
reports prepared by professional archaeologists under regulatory frameworks as a source of archaeologi-
cal information. Some scholars have shed light on the ‘accuracy, reliability and completeness’ (Evans,
2013, p. 20) of these unpublished documents. In his examination of three sources on archaeological
field investigations in England – the National Monuments Record Excavation Index, the Archaeo-
logical Investigations Project, and Online Access to the Index of Archaeological Investigations – Evans
(2013) suggests that greater efforts are necessary to understand the limitations of unpublished reports.
The author examines and compares the three national spatial databases, and like Wells, considers
archaeological site records or ‘events’ within them. In his study, Evans (2013, p. 26) devised overarch-
ing nomenclature to incorporate the range of terminology that describes on-site investigations such as
‘post-determination/research’, ‘evaluation’ and ‘excavation’. The author then analysed the frequency
of reporting across these investigative approaches between 1990 and 2007. Evans also examined each
source for records on one county (Staffordshire) to ascertain gaps in coverage between them, challenging perceptions that national databases are complete and ‘authoritative’ (2013, p. 32). He concluded that meta-analyses highlight the uneven distribution of archaeological investigations, identifying
regions where investigations have been overlooked and where accepted data standards have not been
implemented (2013, pp. 21–22).
Similarly, in his examination of ‘integrative databases’, McCoy (2017, p. 77) remarks that ‘clear biases’ are evident when the distribution of site records is presented on a map. The author defines integrative databases as those that ‘continuously take in new information’, usually from a variety of sources, distinguishing them from ‘archival databases’, which ‘grow by accretion of distinct datasets’ (e.g. tDAR, ADS) (McCoy, 2017, pp. 75–76). The author suggests that the quality of geospatial data can
be thought of as ‘how well the dataset conforms to established best practices’ (2017, p. 78). To this end,
McCoy (2017, pp. 91–92) has proposed a ‘standalone quality report’ that describes the archaeological
geospatial data and how they were derived for tasks such as research, assessment and documentation. This
quality report would be a supplement to technical information in metadata. While potentially fruitful, it is not yet known how effective metadata and quality reports are in archaeology, to what degree quality information minimizes misuse of digital archaeological data, and whether it enables their re-use.
In the English Landscapes and Identities project, Cooper and Green (2016, p. 289) seek to integrate
diverse ‘secondary digital datasets’ from national and regional archaeological repositories (Green, this
volume). These data include GIS-based vector files, associated documents in portable document format
and spreadsheets, and in one case, records that were downloaded from a website (Cooper & Green, 2016,
p. 300). The authors treated ‘multiple and varied representations’ of an archaeological entity within dif-
ferent sources ‘as if it is accurate’ (Cooper & Green, 2016, p. 292). The authors note that the source data
have ‘diverse histories, contents and structures’ and are ‘riddled with gaps, inconsistencies and uncertain-
ties’ (Cooper & Green, 2016, p. 294). The authors do not offer insight into what they consider to be
‘inconsistencies and uncertainties’ or what impact error and imperfections have on potential re-use of
24 Neha Gupta
these data. They do, however, suggest that thematic information in site records can shed light on spatial
relationships between archaeological sites, particularly when analysed at the ‘national level’. The authors
remark that at this scale of analysis, spatial precision becomes less important and emphasis shifts to the
‘spatial character’ of structures such as field systems including their length and orientation.
Heilen et al. (2008) examine data quality in American archaeology from the perspective of CRM
projects. In their study of military installations for the Department of Defense, they focused on survey
reliability, site location recording and site boundaries. They note that whilst overall accuracy of site loca-
tion recording improved with the use of GPS, this brought other concerns to the fore. They suggest that
with definitions and standards ‘came the expectation that data collected at different times, by different
contractors, would be equivalent in quality’ (p. 5.1). The authors remark that this assumption has resulted
in sites being ‘mischaracterized’. For example, small artifact scatters that were recorded at a location were
later identified as large village sites, and some sites were missed entirely. Furthermore, they highlight key
issues in the management of inventory data: specifically, that location information on archaeological sites can be accurate, yet the ‘size, shape, depths and importance’ of sites change with ‘environmental conditions’ and ‘academic debate’, suggesting the complexity of delineating these attribute values (2008, pp. 5.6–5.7).
They observe the problematic practice of deletion of ‘repeated’ site records in favour of the most recent
site inventory record. In this context, the authors recommend detailed records on the history of site dis-
covery and recording that include the equipment used, as well as details on field methods such as survey
intervals, transect size and shapes, shovel-test design, and observations on erosion and visibility at the time
of field documentation. While the authors do not discuss how data managers would interpret this infor-
mation or how such an evaluation would impact the quality of inventory data, they do draw attention
to thematic and temporal accuracy in digital archaeological data. These efforts underscore institutional
practices as a factor in data quality in archaeology.
Data preparation is based on data quality, which in turn is fundamentally tied to research design and
a user’s intended purpose, as I have described above. In order for archaeologists to share the preparation of high quality data, methods and analytic techniques, we must document how our data are transformed from one state to another. As noted, in digital environments, the use of commercial software within the
archaeological workflow often means that we impose ‘black boxes’ that prevent us from examining and
modifying underlying algorithms and code. In this context, a black box can refer to an instrument, device
or software that receives an input and delivers an output, yet its internal workings are unknown or poorly
understood by a scholar. This situation can cast doubt on the received output. When archaeologists give
up the opportunity to critically evaluate and improve existing tools and technologies, we, in effect, limit
the aims and scope of archaeology (Marwick, 2018). Much effort is put toward cleaning and tidying data
to make them machine understandable and re-usable, yet these investments are lost in a computational
pipeline that is closed to scholarly review, modification and development.
In data-intensive archaeology, three concepts are of prime importance: scripted workflows,
versioning, and open and collaborative research processes. These concepts are central in preparing
archaeological data for data analysis and can be facilitated by data cleaning tools and techniques that offer
‘recipes’ or replicable steps. Some of these techniques are applicable specifically to geospatial data while others have a broader scope. These tools and techniques, in turn, can create opportunities to disassemble black
boxes in archaeology’s computational pipeline. Documentation of data transformation and code sharing
can enable more rigorous archaeological research, while also opening intellectual space for collabora-
tion across disciplinary boundaries. I show how archaeologists might employ tools such as OpenRefine,
languages such as Python and versioning systems such as GitHub to document, manage and share digital
archaeological data and code. I emphasize that whilst these technologies might change, the goal remains
the same: dismantling black boxes in archaeology.
Scripted workflows
Scripted workflows are a way to document the research process, which, in turn, can help disable black boxes
in archaeology. A script is typically a simple text file that consists of instructions to initiate and complete
tasks in a computational environment. These instructions can be combined with other instructions to
complete different tasks within the archaeological research process, or workflow. For example, an archaeologist might write a script to transform geographic coordinates (latitude, longitude) to Universal Transverse
Mercator (northing, easting) coordinates based on some specifications, save these transformed data to a
new file and display them on a map. In this case, the script serves not only as instructions for computa-
tional tasks, but also as a ‘very high-resolution record’ of the research process that can be shared, examined,
modified and re-used multiple times and by different scholars (Marwick, 2017, p. 432). This is crucial as
most commercial software does not enable documentation of the research process.
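A minimal sketch of the kind of script described above is given below, using only Python's standard library. It implements the standard series expansion for converting WGS84 geographic coordinates to UTM; a production workflow would more likely call a dedicated projection library such as pyproj, and the test coordinate is simply an invented point near Saint-Pierre.

```python
import math

def latlon_to_utm(lat, lon):
    """Convert geographic coordinates (degrees) to (zone, easting, northing).

    Illustrative implementation of the standard UTM series formulas for the
    WGS84 ellipsoid; accurate to well under a metre at mid-latitudes.
    """
    a = 6378137.0                  # WGS84 semi-major axis (m)
    f = 1 / 298.257223563          # WGS84 flattening
    k0 = 0.9996                    # scale factor on the central meridian
    e2 = f * (2 - f)               # first eccentricity squared
    ep2 = e2 / (1 - e2)            # second eccentricity squared

    zone = int((lon + 180) // 6) + 1
    lon0 = math.radians(zone * 6 - 183)     # central meridian of the zone
    phi, lam = math.radians(lat), math.radians(lon)

    N = a / math.sqrt(1 - e2 * math.sin(phi) ** 2)
    T = math.tan(phi) ** 2
    C = ep2 * math.cos(phi) ** 2
    A = (lam - lon0) * math.cos(phi)

    # Meridian arc length from the equator to latitude phi
    M = a * ((1 - e2 / 4 - 3 * e2 ** 2 / 64 - 5 * e2 ** 3 / 256) * phi
             - (3 * e2 / 8 + 3 * e2 ** 2 / 32 + 45 * e2 ** 3 / 1024) * math.sin(2 * phi)
             + (15 * e2 ** 2 / 256 + 45 * e2 ** 3 / 1024) * math.sin(4 * phi)
             - (35 * e2 ** 3 / 3072) * math.sin(6 * phi))

    easting = k0 * N * (A + (1 - T + C) * A ** 3 / 6
                        + (5 - 18 * T + T ** 2 + 72 * C - 58 * ep2) * A ** 5 / 120) + 500000
    northing = k0 * (M + N * math.tan(phi) * (A ** 2 / 2
                     + (5 - T + 9 * C + 4 * C ** 2) * A ** 4 / 24
                     + (61 - 58 * T + T ** 2 + 600 * C - 330 * ep2) * A ** 6 / 720))
    if lat < 0:
        northing += 10000000.0     # southern-hemisphere false northing
    return zone, easting, northing

# An invented point near Saint-Pierre falls in UTM zone 21
zone, e, n = latlon_to_utm(46.78, -56.17)
```

Saved as a text file, such a script is exactly the kind of ‘very high-resolution record’ of a transformation that can be shared, reviewed and re-run.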
Scripted workflows have been utilized in different fields, including processing of geospatial information from different sources for land classification (Leroux, Lemonsu, Bélair, & Mailhot, 2009). In a
scripted workflow, the ‘process becomes public, transparent and reproducible’ (Thompson, Matloff, Fu, &
Shin, 2017). A scripted workflow can contain instructions for several, often sequential, tasks within and throughout data processing, visualization, analysis and presentation. The key facet of a scripted workflow is its explicit description of process and code to enable transformation of data, a situation that can
facilitate insights into decisions that were made during processing, their potential impact on results, and
how best the data and code might be re-used.
Open Science aims to promote transparency, openness and reproducibility across scientific disciplines and
change the culture of research publication (Nosek et al., 2015). To that end, Nosek et al. (2015, p. 1424)
envision more rigorous publication policies, and they propose eight standards to which open research communication aspires: citation standards, data transparency (sharing), analytic methods (code) transparency, research materials transparency, design and analysis transparency, preregistration of studies, preregistration of analysis plans and replication. Recognizing that journals vary across disciplines and that
there are barriers to adopting the standards, each standard is measured on three different levels (Nosek
et al., 2015, p. 1425). The levels, increasing in stringency, are meant to facilitate gradual adoption of the
eight standards. Implementation is recognized by ‘badges’.
A growing number of scholars are interested in open research data as a way to practice ‘better science’
(Molloy, 2011; Foster & Deardorff, 2017). They draw attention to barriers to ‘maximum dissemination
of scientific data’ such as inability to access data, restrictions by publishers on data usage, and difficulties
in re-use due to poor annotation, as well as cultural concerns over losing control over data and the lack of incentives to make data re-usable. While informative, these efforts tend to address communication
issues in research, overlooking deeper structural inequalities in academia and in society.
For example, in his call to humanize open science, Eric Kansa (2014, p. 32) draws attention to ‘under-
lying causes’ of dysfunction in research, beyond technical and licensing issues. He argues that broadening
the boundaries of open science to encompass ‘systematic study’ creates intellectual space for social sci-
ence and humanities scholars, enabling them to meaningfully engage with efforts in reforming research.
Kansa (2014, p. 36) rightly observes that archaeology relies on primary research data, and that recovered
material culture is not replaceable or renewable, yet archaeologists are often reluctant to disseminate and
archive research data. He suggests that these challenges reflect neoliberal values and problematic institu-
tional practices (Kansa, 2014, pp. 50–51). Most crucially, Kansa (2014, p. 52) remarks that a ‘high level
of collegiality and trust’ are necessary for truly opening the research process to a wider community, a
situation to which archaeologists can certainly relate. He suggests that open science can succeed when
real efforts are made to ‘dismantle a powerful and entrenched set of neoliberal ideologies and policies’
(Kansa, 2014, p. 54).
In this context, collaborative research, particularly with Indigenous and descendant communities, is an overarching theme in archaeology of the 21st century. Ownership of the past, including digital archaeological data, is emerging as a key concern amongst equity-seeking groups in the United States, Canada, Australia
and New Zealand. In this context, Indigenous peoples want to generate knowledge about their ancestors,
and they are increasingly engaging with digital tools and technologies to challenge colonial practices that
prevented them from access to, and control of, archaeology. This is particularly pressing in scenarios where archaeology is practiced within a regulatory framework that privileges government- and/or CRM-led field
collection. Barriers to accessing primary research data persist for many Indigenous peoples and archaeolo-
gists, and these social issues are impacting how archaeological research is carried out (Gupta, Nicholas &
Blair, n.d.). These tensions will continue to influence the way ‘openness’ is practiced in archaeology.
Case studies
Data cleaning typically involves identifying and resolving extreme values, duplicate records, misspellings, missing values and other input errors. It should come as no surprise then that archaeologists generally do not document the transformation of data in the computational pipeline, although this situation is changing (Kansa et al., 2014; Marwick et al., 2017; Strupler &
Wilkinson, 2017; Marwick, 2018; more broadly, see Shawn Graham’s open lab notebook). The common
thread in each of these works is the aim to lay bare ‘point-and-click’ procedures, while documenting what
worked and what did not.
A key aspect of data cleaning is its iterative nature: the analyst must go through a number of transformations and cleaning routines that are often non-linear, and tailored to specific analytic goals
and quality specifications. Interactivity and visualization are important as an analyst works through
cleaning routines, and data cleaning systems typically offer user interfaces that enable an analyst to
write cleaning sequences, preview them on a portion of the data, and then apply these instructions to
whole sets of data. The instructions are saved and can be undone or extracted at any step. The cleaning sequences can also be applied to other data and offer a real-time history of transformations. This
kind of documentation can be readily reviewed, shared, modified and repurposed. Below, I offer an
example through OpenRefine, an open source, standalone, desktop application that supports iteration
with a spreadsheet style interface.
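The pattern described above, an ordered sequence of small, named cleaning steps that can be replayed on other data, can be sketched in plain Python. The step functions and sample records below are invented for illustration and do not reproduce OpenRefine's own mechanism:

```python
# A cleaning 'recipe' as an ordered list of named, replayable steps.
# Step functions and sample records are invented for this illustration.

def strip_whitespace(rec):
    """Trim stray whitespace from every string value in a record."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}

def drop_empty(recs):
    """Remove rows in which every field is empty or missing."""
    return [r for r in recs if any(v not in ("", None) for v in r.values())]

def dedupe(recs):
    """Keep the first occurrence of each identical record."""
    seen, out = set(), []
    for r in recs:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def apply_recipe(records, recipe):
    """Apply each named step in order; the recipe documents the process."""
    for name, step in recipe:
        records = step(records)
    return records

recipe = [
    ("strip whitespace", lambda rs: [strip_whitespace(r) for r in rs]),
    ("drop empty rows", drop_empty),
    ("remove duplicates", dedupe),
]

raw = [
    {"site": " Puketona Pa ", "type": "pit/terrace"},
    {"site": "Puketona Pa", "type": "pit/terrace"},  # duplicate once stripped
    {"site": "", "type": ""},                        # empty row
]
clean = apply_recipe(raw, recipe)
```

Because the recipe is itself data, it can be saved alongside the output, reviewed, and applied unchanged to the next batch of records.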
In his study of data quality, privacy and ‘geospatial big data’, McCoy (2017, p. 79) examines the
case of publicly available and ‘professional’ or privately maintained and restricted archaeological site records. Specifically, he evaluates the frequency and density of reported fortifications across New Zealand in three sources, and complements them with LiDAR images for one particular fortification called
Puketona Pa. The author employed spreadsheet software and ArcGIS for his analysis. The professional
database of archaeological site records, developed and maintained by the New Zealand Archaeological
Association, is available only through prior authorization and is not considered here. The two publicly available sources are a radiocarbon database maintained by the Waikato Radiocarbon Lab with 1,671 records, and location information on fortifications maintained by Land Information New Zealand. In a GIS, these data are represented as a point, a single location defined by a set of geographic
coordinates. Thematic information such as the name of an archaeological site, the site identification
number, site type, the radiocarbon date, the material that was sampled, and the source of the sample
are added as ‘attributes’ to the point.
For each source, McCoy describes the methods and analytic techniques he employed, yet there is lim-
ited documentation on his data cleaning and processing. This is somewhat surprising given his remarks
that ‘filtering, classifying and coding temporal information’ was the most time-intensive part of the
analysis (McCoy, 2017, p. 84). Elsewhere, the author notes that the radiocarbon data were downloaded as
a Google keyhole markup zip (kmz) and then ‘transferred’ to the commercial GIS software, ArcMap 10.3.
However, McCoy (2017, p. 83) notes that information on ‘site type, and material dated did not migrate
smoothly’, a problematic situation because these two fields were central in processing temporal values.
To correct this, the author manually searched the online database for lab identification numbers and ‘re-
attach[ed]’ the missing information to all 1,671 site records.
I offer an alternative processing sequence on OpenRefine for McCoy’s radiocarbon data that trans-
forms the information for use in a GIS without manual searching and re-attachment of missing data
fields. I note that ArcMap and other GIS software, such as QGIS, have tools that convert between
Google’s keyhole markup language (kml) and shapefiles (shp). These procedures are ‘point-and-click’
within GIS software, and by default, the software does not retain a history of transformations in a project.
In OpenRefine, the cleaning sequence created can be applied to other data in need of similar processing
and most importantly, for the purpose of this study, it serves as documentation of data transformations
that are typical in data-intensive methods. For example, a recipe can be created for identifying missing values, extreme or anomalous values, and resolving them.

Figure 2.3 Parsing options in OpenRefine for a file in keyhole markup language. Note that the information of interest is within the placemark tags.

More complex tasks such as linking codes or
shorthand (e.g. LBK for Linearbandkeramik and ‘grv’ for grave) from field journals and code books can
be facilitated through a cleaning sequence. The cleaned data can be exported in Comma Separated Value (csv) format, which is easily read by GIS software.
The radiocarbon data were directly accessed through the Web link supplied in McCoy (2017, p. 82)
(www.waikato.ac.nz/nzcd/C14kml.kmz). OpenRefine enables parsing of data in extensible markup lan-
guage (xml), which is a standard used in Google’s keyhole markup language (kml) (Google Developers,
2018) and is therefore interoperable (Figure 2.3). The placemark is an object that contains three elements: a name, a description and a point that specifies the position of the placemark on the Earth’s surface using a pair of coordinates (longitude and latitude). Additional thematic information is added to the
placemark as ‘description’, as well as styling for the icon and text. Thus, each placemark object contains
information that is of greatest interest.
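The placemark structure just described can be parsed with Python's standard library alone, as a sketch of what OpenRefine does when it reads the xml. The sample placemark below is invented; the real Waikato file carries far more thematic information in each description field:

```python
import xml.etree.ElementTree as ET

# A minimal, invented placemark with the three elements described above:
# a name, a description carrying thematic information, and a point.
KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>NZ-0001</name>
      <description>Site type: pa; Material: shell</description>
      <Point><coordinates>174.05,-35.28,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""

NS = {"kml": "http://www.opengis.net/kml/2.2"}

rows = []
root = ET.fromstring(KML)
for pm in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
    name = pm.findtext("kml:name", namespaces=NS)
    desc = pm.findtext("kml:description", namespaces=NS)
    coords = pm.findtext("kml:Point/kml:coordinates", namespaces=NS)
    lon, lat = coords.split(",")[:2]   # kml orders coordinates lon,lat
    rows.append({"name": name, "lon": float(lon), "lat": float(lat),
                 "description": desc})
```

Note that kml stores longitude before latitude, a common source of input error when records are re-typed by hand.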
Once parsed, the data are displayed within a spreadsheet-style interface with rows and columns
where they and the values within them can be cleaned (Figure 2.4). Column names are based on tags
<tag> </tag> within the placemark object, and include information that is not relevant for further analysis. Visual inspection of the data shows empty rows and unnecessary columns that can be removed. More importantly, thematic information (e.g. site name, site type, etc.) is all parsed into one column (Placemark – description), and it has tags that will cause problems in further analysis. But information within the tags
is needed and must be separated into individual columns for use in a GIS. When performed manually on
over 1,600 records, such an undertaking can easily result in input error and unintended modifications and deletions.

Figure 2.4 A spreadsheet style interface on OpenRefine that shows information in columns. Note that radiocarbon and site information is within tags and will require cleaning.

With OpenRefine, it is possible to write a cleaning sequence that can be previewed on part of the
data, and then applied to all records. In this case, the cleaning sequence is shown in Box 2.1:
Once the sequence is implemented, the data are manipulated accordingly. The resulting data are shown in Figure 2.5, along with the history of the cleaning sequence (Figure 2.6). Note that the
cleaning sequence is available as description and as code. The code can be extracted and applied to other
data that need similar processing. The ‘clean’ data can be exported as a cross-platform spreadsheet format
(csv) that can be read routinely by GIS software.
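An equivalent sequence can also be expressed as a script. The sketch below, with an invented field layout standing in for the parsed placemark descriptions, strips residual tags from a description column, splits it into separate columns and writes csv output:

```python
import csv
import io
import re

TAG = re.compile(r"<[^>]+>")   # matches residual markup tags

def description_to_columns(desc):
    """Strip tags, then split 'key: value; key: value' pairs into a dict."""
    text = TAG.sub("", desc)
    pairs = (p.split(":", 1) for p in text.split(";") if ":" in p)
    return {k.strip(): v.strip() for k, v in pairs}

# Invented records standing in for parsed placemark descriptions.
records = [
    {"name": "NZ-0001", "description": "<b>Site type</b>: pa; Material: shell"},
    {"name": "NZ-0002", "description": "Site type: midden; Material: charcoal"},
]

rows = [{"name": r["name"], **description_to_columns(r["description"])}
        for r in records]

# Export the cleaned rows as csv, readable by GIS software.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "Site type", "Material"])
writer.writeheader()
writer.writerows(rows)
output = buf.getvalue()
```

Run over the whole dataset at once, a routine like this removes the risk of mistaken entry or deletion that manual editing of 1,600 records would invite.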
Figure 2.5 The cleaned version of the file ready with coordinates for mapping.
Figure 2.6 The cleaning sequence or ‘recipe’ for converting kml into comma separated value (csv) format. The
code can be exported, modified and re-used.
This brief case study did not replicate McCoy’s manual processing and re-attachment of values, and thus it cannot offer a specific measure of the duration of that task, nor compare it with processing time in
OpenRefine. Automating transformations can reduce the potential for mistaken entry or deletion within
a spreadsheet. Moreover, in using a platform that enables writing of a cleaning sequence, we gain clear
documentation of data transformations and facilitate potential re-use of these procedures. This resulting
routine can be reviewed, modified and repurposed for processing other geospatial data.
Figure 2.7 An overview of the location and estimated size of the study area in Saint-Pierre, France.
I present the case of a survey on the island of Saint-Pierre, France, where initial archaeological fieldwork was carried out using a total station and a handheld Global Positioning System (GPS) unit. I
document efforts to transform data collected in a local coordinate system to a global system in a situa-
tion where only two control points are available. Transformation of locational information into a global
coordinate system can facilitate the integration of field data with other sources, and can enable spatial
analysis of archaeological data.
The study area is located on the eastern coast of the island of Saint Pierre, France (Figure 2.7). A field
survey with a total station and handheld GPS unit was carried out as part of an archaeological project at
Memorial University of Newfoundland, Canada, to identify historical (18th century onwards) settlement on this part of the island. The initial survey team consisted of three archaeologists. Field collection
in the study area (measuring approximately 100 × 200 m) was organized into two surveys: one focused
on archaeological features visible on the surface, and the second focused on recording topography at
regular intervals, with the intention to bring these data into a GIS and examine them with historical
maps and other documents. For example, the archaeological features can be made into polygons (where
appropriate) with thematic information, enabling measurement of size and shape of surface features. This
information can be used to assess survey strategy, and offers a historical document prior to site excava-
tion. Therefore, a geographically referenced model of the topography and archaeological features was
highly desirable.
The first survey on archaeological features (features) resulted in 178 points, and the second survey
on topography (landscape) consisted of 343 points (Figure 2.8). All measurements were made in metres.

Figure 2.8 A map showing points from two surveys that were collected on a total station. Location of the total station or origin is represented as a star, survey on archaeological features on surface is marked in green, and the survey of topography is in brown. A colour version of this figure can be found in the plates section. Note that the feature survey data is rotated.

Both surveys had the same origin and back sight for registration. Coordinates for the origin
and back sight were recorded on the handheld GPS unit with an error of +/-5 metres. During initial
processing of the data in a local coordinate system, we immediately identified a significant problem
with the first survey. The survey points were rotated to some degree and had to be corrected prior to
being transformed to a global coordinate system. The team recognized that a mistake in registering
the total station set-up was likely the source of this error, yet the second survey did not share this
dislocation. Because the team had used an old model total station that came with limited support for
processing survey measurements, the project directors decided it was necessary to develop an equation
to adjust the archaeological features based on known locations in geographic space, i.e. the origin and back sight.
The situation, however, was not ideal as there were only two control points (origin and back sight)
available, and most transformations require a minimum of three control points. For example, the CHaMP Topo Processing tool, developed by Wheaton, Garrard, Whitehead, and Volk (2012) at Utah State University for use in ArcMap, performs local-to-global coordinate transformations. However, this tool was not used because of its three-control-point prerequisite. To transform the survey data,
Maria Yulmetova (2018), a student at Memorial University of Newfoundland, developed a Python script that calculated the rotation factor to transform local coordinates into Universal Transverse Mercator (northing and easting) using two control points.

Figure 2.9 The survey points overlaid on a scanned map that is geo-rectified to WGS-UTM 21. A Python script was developed to enable rotation and transformation of points in a local coordinate system to a global coordinate system (UTM) using two known coordinate pairs. A colour version of this figure can be found in the plates section.

In practice, the script generated modified
UTM coordinates that could be aligned with two different sources: a scanned topographic map from
the National Institute of Geographical and Forestry Information (IGN-F), France (Figure 2.9), and imagery from Google Earth. The validation was based on visual inspection of the overlap between
survey measurements and features visible on the topographic map. With survey data corrected, it was
possible to more closely examine archaeological features, and their estimated size and shapes alongside
historical documents.
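The essential idea behind such a transformation can be sketched compactly. With exactly two control points, a two-dimensional similarity transformation (rotation, uniform scale and translation) is fully determined, and complex numbers make the algebra concise. The sketch below is a generic illustration of that idea, not Yulmetova's actual script, and all coordinates in it are invented:

```python
import math

# Two-control-point similarity transform (rotation + scale + translation),
# expressed with complex numbers: g = g0 + r * (p - l0), where the complex
# ratio r encodes both the rotation angle and the scale factor.
# Generic illustration; all coordinates below are invented.

def make_transform(local_origin, local_backsight, utm_origin, utm_backsight):
    l0 = complex(*local_origin)
    g0 = complex(*utm_origin)
    r = (complex(*utm_backsight) - g0) / (complex(*local_backsight) - l0)

    def transform(point):
        g = g0 + r * (complex(*point) - l0)
        return (g.real, g.imag)   # (easting, northing)

    return transform

theta = math.radians(30)          # pretend the set-up was rotated by 30 degrees
rot = complex(math.cos(theta), math.sin(theta))

# Invented control points: a local grid versus 'true' UTM positions.
utm0 = complex(563000.0, 5181000.0)
utm_bs = utm0 + rot * complex(0.0, 100.0)   # back sight 100 m 'north' locally

to_utm = make_transform((0.0, 0.0), (0.0, 100.0),
                        (utm0.real, utm0.imag), (utm_bs.real, utm_bs.imag))

e, n = to_utm((100.0, 0.0))       # a surveyed point 100 m 'east' locally
```

Because the rotation falls out of the ratio between the two control-point vectors, the same few lines both diagnose the mis-registration and correct it, which is precisely what the proprietary tooling in this scenario could not expose for inspection.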
The script reflects a step towards creating a tool that archaeologists can use for transformations of
survey data that have a limited number of control points. Because most tools for processing survey data
are proprietary, they are not available for scholarly review or modification, as was needed in this scenario.
When the underlying code is available, it is possible to alter and customize default criteria and this in
turn, can enable more appropriate decision-making and more rigorous research practice in archaeology.
More fundamentally, scripted workflows offer archaeologists the chance to engage more deeply with the
range of tools and technologies they employ throughout the archaeological workflow, and open cross-
disciplinary collaborations with geographers, cartographers and computer scientists in disabling black
boxes. By sharing their scripted workflows, archaeologists can encourage re-use, modification and refine-
ment of ‘recipes’ and data processing tools and technologies. Furthermore, greater attention to examining
and modifying code can create intellectual space for training and empowering undergraduate and gradu-
ate students for archaeology of the 21st century (Marwick, 2017).
Conclusion
Preparation of archaeological data for further analysis, curation and re-use is tied to data quality, which in
turn, is integral to research design and an archaeologist’s intended use of archaeological data. Geospatial
technologies such as GIS are routinely used in archaeology to manage, store and analyse large amounts of
digital archaeological data. However, these spatial databases are known to have poor error management,
a situation that can result in error propagation that impacts subsequent analysis and the final result.
The widespread use of GIS in archaeology therefore can constrain a broader assessment of archaeological
methods and the appropriateness of data in terms of interpretation and re-use. Furthermore, processing
of data and analysis within computational pipelines is rarely documented and shared, a situation that
limits what is known on procedures and transformations and the overall quality of archaeological data.
Without clear documentation of how data were processed, archaeologists impose black boxes within the
archaeological workflow that prevent examination of how data were transformed from acquisition to
their final presentation and publication. This situation is especially problematic when point-and-click
software is uncritically utilized. Archaeologists must be ready to dismantle black boxes at a moment when greater amounts of ‘born digital’ archaeological data are being generated.
Recent interests in the preparation of archaeological data and the quality of those data are influenced
by the growing use of digital and geospatial tools and technologies, and the rapid growth of communica-
tion tools that facilitate exchange and sharing of data between scholars, institutions and non-specialists.
Archaeologists are accumulating large amounts of data through real-time, paperless digital documentation in the field, and these efforts are thought to minimize redundancy and human-introduced errors in the recording of archaeological sites and archaeological data. These efforts can also potentially
shorten the time interval between data acquisition, processing, analysis and presentation and publication.
A growing awareness of a digital data-rich environment is encouraging archaeologists to think in terms of data-intensive methods and big data whilst highlighting that greater efforts are needed to document and report how decisions were made on cleaning, analysing and publishing data. This situation
presents challenges and opportunities for archaeologists. Calls for Open Science in archaeology reflect
these tensions, and offer a way forward in terms of promoting the generation of scripted workflows, version control for data management and collaborative research. Recognition that the interests and needs of social groups differ in terms of ownership of the past is attuning archaeologists to the role of institutional practices in data quality issues.
Greater and more stringent control over metadata is enabling archaeologists to document their data creation methods, sampling techniques and contextual information that facilitates the re-use of digital archaeological data. Metadata typically include authorship information, basic project and site descriptions, keywords, chronological ranges and geographical coverage (e.g. bounding box coordinates). The
data publisher Open Context, for example, has shown leadership in preparing digital data for re-use,
including data cleaning, such as performing basic checks on received data to correct data entry errors and
inconsistencies in classification fields, as well as more involved transformations to translate code books
and reconcile them with tabular information (Kansa et al., 2014, p. 60). We do not yet have sufficient
information on how metadata are being used beyond search, browse and filtering for specific records.
Nonetheless, recent developments show that archaeologists are aware of data quality issues and are actively
taking steps to communicate the level of confidence they have in their analysis and interpretation of
archaeological data.
The apparent democratization of archaeological site information has renewed concerns over privacy
and the security of sensitive locational information. Publishing archaeological information presents sig-
nificant challenges and opportunities. Conventional wisdom is that archaeological data collected in the
36 Neha Gupta
field contain sensitive locational information, and that sharing the locations of archaeological and historical sites can facilitate, if not result in, the destruction of those sites through looting. Looting and the illegal trafficking of archaeological artefacts and human bones are problems observed in many places (Brodie, Doole, & Renfrew, 2001; Huffer & Graham, 2017). These concerns are often heightened in national
contexts where tensions over ownership of the past exist between archaeologists and local communities
and Indigenous peoples. Yet recent developments in geovisual analytics demonstrate that scholars can
meaningfully analyse data even when they contain sensitive location information (Andrienko et al.,
2007). Archaeologists are now putting greater effort into examining how to share sensitive archaeological information, and into making explicit the scenarios in which such sharing is inappropriate. These efforts are
reflected in conference sessions at the 2018 Society for American Archaeology meetings, such as the
‘Futures and Challenges in Government Digital Archaeology’ symposium organized by Jolene Smith,
and a forum, ‘Keeping Our Secrets: Sharing and Protecting Sensitive Resource Information in the Era of
Open Data’, that was chaired by David Gadsby and Anne Vawser. The ethos of ‘openness’ is encouraging
archaeologists to better understand possibilities in and potential implications of publishing archaeological
data on the Web.
The re-use of digital archaeological data requires scholarly effort in data cleaning and in better documentation of these procedures and transformations. This scholarship is being promoted to facilitate deeper
engagement with archaeological methods, which in turn, can open new forms of research in archaeol-
ogy. Archaeologists are increasingly extracting geographical information from historical documents and
repurposing these data for spatial analysis (Murrieta-Flores & Gregory, 2015). Employing sophisticated
techniques such as Natural Language Processing, archaeologists draw out place-names in historical texts
and incorporate them into GIS software. Tools such as geoparsers that automate annotation of texts, and
geo-reference place-names (e.g. create pairs of coordinates) are now being developed for specific corpora.
Platforms such as ORBIS (2018), a geospatial network model of the Roman world developed at Stanford University, and the Pelagios Commons (2018), an online community that enables linked open data on
historical places, are highlighting the range and scope of interdisciplinary scholarship. These efforts often
emphasize collaborative code development and code sharing.
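The gazetteer-lookup step that a geoparser performs can be sketched very simply. The place-names, coordinates and sentence below are illustrative only; they are not drawn from any real corpus, gazetteer, or geoparsing service.

```python
import re

# A toy illustration of the gazetteer-lookup step of geoparsing: find known
# place-names in a text and pair each with coordinates for use in GIS.
GAZETTEER = {
    "Ephesus": (37.94, 27.34),   # (latitude, longitude), approximate
    "Miletus": (37.53, 27.28),
}

def geoparse(text):
    """Annotate each known place-name with a coordinate pair."""
    pattern = re.compile("|".join(re.escape(name) for name in GAZETTEER))
    return [(m.group(0), GAZETTEER[m.group(0)]) for m in pattern.finditer(text)]

matches = geoparse("The road ran from Ephesus south towards Miletus.")
```

Real geoparsers add the hard parts this sketch omits: disambiguating names shared by several places, and recognising name variants via Natural Language Processing rather than exact string matching.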
Growing numbers of archaeologists are employing programming languages such as R and Python in
documenting their research processes. Scripted workflows and code sharing are facilitated by Web-based
platforms such as GitHub and Jupyter notebooks. For example, the Open Digital Archaeology Textbook
and Environment (Graham et al., 2018), an open-access digital textbook makes extensive use of Jupyter
notebooks to share code and data for teaching purposes (https://o-date.github.io/support/notebooks-
toc/). Most crucially, the digital environment offers scholars and learners a platform to read and experi-
ment with code writing. Notebooks of particular interest include one on spatial analysis developed by
Rachel Opitz (https://mybinder.org/v2/gh/ropitz/spatialarchaeology/master), as well as one on process-
ing public data such as Light Detection and Ranging (LiDAR) that are published by local and national
institutions. These notebooks greatly extend the potential and possibilities for scripted workflows in
spatial analysis, within and without traditional GIS software.
Greater attention is now given to archaeological site records as historical documents, and the his-
tory of site discovery as a way to assess data quality. In this context, archaeologists are making greater
effort to employ version control systems that log changes in files and potentially reduce the likelihood
of errors going undetected through the archaeological workflow. Because all files in versioning systems
are authored, it is possible to organize and manage multi-authored projects on these platforms. As such,
documentation of changes to a digital object offers a history of that object, and a way to track error and
its propagation through the archaeological workflow. Implementing good documentation practices into
the archaeological workflow can enable better data quality. Yet version control systems present barriers
Preparing data for spatial analysis 37
in terms of implementation; expertise, institutional support and resources are necessary prerequisites. Nonetheless, version control systems offer great potential for archaeologists who manage site inventory information that changes over time, and for contexts in which archaeological data are curated by experts who do not have access to the original data collectors and their contextual knowledge. These challenges underscore the need for
better data management techniques in archaeology more broadly.
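The logging idea behind version control, recording who changed a dataset, when, and a checksum of its content so that an edit (or an error) can be traced through the workflow, can be illustrated with a minimal sketch. This is not a real version control system; the authors and file contents below are invented.

```python
import hashlib
from datetime import datetime, timezone

# A minimal illustration of versioning: an authored, time-stamped log of
# content checksums, so that any change to the data is detectable later.
def record_version(log, author, content):
    """Append an authored, time-stamped checksum of `content` to the log."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    log.append({"author": author, "sha256": digest,
                "timestamp": datetime.now(timezone.utc).isoformat()})
    return digest

log = []
record_version(log, "recorder-a", "site,easting,northing\nS1,4021,7755\n")
record_version(log, "recorder-b", "site,easting,northing\nS1,4012,7755\n")
changed = log[0]["sha256"] != log[1]["sha256"]  # the edit is detectable
```

Systems such as Git extend this idea by also storing the content itself, so that any earlier state of the file can be recovered, not merely detected.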
With more stringent control over metadata, there is enormous scope for data-intensive methods in
archaeology. Federal funding agencies are placing greater emphasis on data management plans for funded
projects and these developments are creating an environment in which archaeological data are being more
closely scrutinised for sharing on Web-based platforms. As a result, greater amounts of better documented
data are available for re-use in archaeology, which in turn, can facilitate a better understanding of the
human past. More fundamentally, these efforts are creating opportunities for new forms of research in
archaeology that can promote collaboration with anthropologists, historians, cognitive scientists, geog-
raphers and computer scientists, which in turn, can have broader implications in the social sciences and
humanities.
References
Allison, P. (2008). Dealing with legacy data: An introduction. Internet Archaeology, 24. http://dx.doi.org/10.11141/
ia.24.8.
Andrienko, G., Andrienko, N., Jankowski, P., Kraak, M-J., Keim, D., MacEachren, A. M., & Wrobel, S. (2007).
Geovisual analytics for spatial decision support: Setting the research agenda. International Journal of Geographical
Information Science, 21(8), 839–857.
Atici, L., Kansa, S. W., Lev-Tov, J., & Kansa, E. C. (2013). Other people’s data: A demonstration of the imperative of
publishing primary data. Journal of Archaeological Method and Theory, 19, 1–19.
Austin, A. (2014). Mobilizing archaeologists: Increasing the quantity and quality of data collected in the field with
mobile technology. Advances in Archaeological Practice, 2(1), 13–23.
Australian National Data Service (ANDS). (2018). Data versioning. ANDS. Retrieved October 2018, from www.
ands.org.au/working-with-data/data-management/data-versioning
Averett, E. W., Counts, D. B., & Gordon, J. (2016). Introduction. In D. B. Counts, E. W. Averett, & J. Gordon
(Eds.), Mobilizing the past for a digital future: The potential of digital archaeology. Retrieved from http://dc.uwm.edu/
arthist_mobilizingthepast/
Bampton, M., & Mosher, R. (2001). A GIS driven regional database of archaeological resources for research and CRM
in Casco Bay, Maine. BAR International Series, 931, 139–142.
Banning, E. B., Hawkins, A. L., Stewart, S. T., Hitchings, P., & Edwards, S. (2017). Quality assurance in archaeological
survey. Journal of Archaeological Method and Theory, 24(2), 466–488. https://doi.org/10.1007/s10816-016-9274-2
Bevan, A., Crema, E., Li, X., & Palmisano, A. (2013). Intensities, interactions, and uncertainties: Some new approaches
to archaeological distributions. In A. Bevan & M. W. Lake (Eds.), Computational approaches to archaeological spaces
(pp. 27–52). Walnut Creek, CA: Left Coast Press.
Bevan, A., & Lake, M. W. (2013). Computational approaches to archaeological spaces. Walnut Creek, CA: Left Coast Press.
Brodie, N., Doole, J., & Renfrew, C. (Eds.). (2001). Trade in illicit antiquities: The destruction of the world’s archaeological
heritage. Cambridge: McDonald Institute for Archaeological Research.
Burg, M. B, Peeters, H., & Lovis, W. A. (Eds.). (2016). Uncertainty and sensitivity analysis in archaeological computational
modeling. Switzerland: Springer.
Chrisman, N. (2006). Development in the treatment of spatial data quality. In R. Devillers & R. Jeansoulin (Eds.),
Fundamentals of spatial data quality (pp. 22–30). Newport Beach, CA: ISTE.
Cooper, A., & Green, C. (2016). Embracing the complexities of “Big data” in archaeology: The case of the English
landscape and identities project. Journal of Archaeological Method and Theory, 23(1), 271–304. doi:10.1007/
s10816-015-9240-4
Costa, S., Beck, A., Bevan, A. H., & Ogden, J. (2013). Defining and advocating open data in archaeology. In G. Earl,
T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos, I. Romanowska, & D. Wheatley (Eds.), Archaeology
in the Digital Era: Papers from the 40th annual conference of computer applications and quantitative methods in archaeology,
Southampton, 26–29 March, 2012 (pp. 449–456). Amsterdam: Amsterdam University Press.
Crema, E. (2012). Modelling temporal uncertainty in archaeological analysis. Journal of Archaeological Method and
Theory, 19, 440–461.
Devillers, R., & Jeansoulin, R. (2006). Spatial data quality: Concepts. In R. Devillers & R. Jeansoulin (Eds.), Funda-
mentals of spatial data quality (pp. 31–42). Newport Beach, CA: ISTE.
Dibble, H. L., & McPherron, S. P. (1988). On the computerization of archaeological projects. Journal of Field Archaeol-
ogy, 15(4), 431–440. https://doi.org/10.2307/530045.
Dunnell, R. C., Teltser, P., & Vercruysse, R. (1986). Efficient error reduction in large data sets. Advances in Computer
Archaeology, 3, 22–39.
Evans, T. N. L. (2013). Holes in the archaeological record? A comparison of national event databases for the historic
environment in England. The Historic Environment: Policy & Practice, 4(1), 19–34. https://doi.org/10.1179/1756
750513Z.00000000023.
Foster, E. D., & Deardorff, A. (2017). Open Science Framework (OSF). Journal of the Medical Library Association :
JMLA, 105(2), 203–206. doi:10.5195/jmla.2017.88.
Gobalet, K. W. (2001). A critique of faunal analysis: Inconsistency among experts in blind tests. Journal of Archaeologi-
cal Science, 28(4), 377–386.
Google Developers. (2018). What is KML? Retrieved March 2018, from https://developers.google.com/kml/
Graham, S. Open lab notebook. Retrieved March 20, 2018, from https://electricarchaeology.ca/
Graham, S., Gupta, N., Smith, J., Angourakis, A., Carter, M., & Compton, B. (2018). The open digital archaeology
textbook environment. Retrieved from https://o-date.github.io/draft/book/
Green, C. (2011). It’s about time: Temporality and intra-site GIS. In E. Jerem, F. Redő, & V. Szeverényi (Eds.), On
the road to reconstructing the past: Computer applications and quantitative methods in archaeology (CAA): Proceedings of the
36th international conference, Budapest, April 2–6, 2008 (pp. 206–211). Budapest: Archaeolingua.
Gupta, N., & Devillers, R. (2017). Geographic visualization in archaeology. Journal of Archaeological Method and Theory,
24(3), 852–885.
Gupta, N., Nicholas, R., & Blair, S. (n.d.). Post-colonial and indigenous perspectives in digital archaeology. In E.
Watrall & L. Goldstein (Eds.), Digital heritage and archaeology in practice. University Press of Florida. Retrieved from
http://dhainpractice.anthropology.msu.edu/
Heilen, M. P., Nagle, C. L., & Altschul, J. H. (2008). An assessment of archaeological data quality: A report submitted in
partial fulfillment of legacy resource management program project to develop analytical tools for characterizing, visualizing, and
evaluating archaeological data quality systematically for communities of practice within the department of defense. Department
of Defense Legacy Resource Management Program, Technical Report 08–65, Statistical Research Inc., Tucson, AZ.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. New York: Cambridge University Press.
Huffer, D., & Graham, S. (2017). The insta-dead: The rhetoric of the human remains trade on Instagram. Internet
Archaeology, 45. https://doi.org/10.11141/ia.45.5
Huggett, J. (2015). Digital haystacks: Open data and the transformation of archaeological knowledge, In A. T. Wil-
son & B. Edwards (Eds.), Open source archaeology: Ethics and practice (pp. 6–29). Walter de Gruyter GmbH & Co KG.
Hunter, G. J., & Beard, K. (1992). Understanding error in spatial databases. Australian Surveyor, 37(2), 108–119.
https://doi.org/10.1080/00050326.1992.10438784
Kansa, E. C. (2011). Introduction: New directions for the digital past. In E. C. Kansa, S. W. Kansa, & E. Watrall
(Eds.), Archaeology 2.0: New approaches to communication and collaboration (pp. 1–25). Los Angeles, CA: Cotsen
Institute of Archaeology Press.
Kansa, E. C. (2014). The need to humanize open science. In S. Moore (Ed.), Issues in open research data (pp. 31–58).
Ubiquity Press. doi:10.5334/ban.c.
Kansa, E. C., Kansa, S. W., & Arbuckle, B. (2014). Publishing and pushing: Mixing models for communicating
research data in archaeology. International Journal of Digital Curation, 9(1), 57–70. https://doi.org/10.2218/ijdc.
v9i1.301
Kintigh, K. (2006). The promise and challenge of archaeological data integration. American Antiquity, 71(3), 567–578.
Kintigh, K., Altschul, J. H., Kinzig, A. P., Limp, W. F., Michener, W. K., Sabloff, J. A., . . . Lynch, C. A. (2015).
Cultural dynamics, deep time, and data: Planning cyberinfrastructure investments for archaeology. Advances in
Archaeological Practice, 3(1), 1–15.
Kohl, P. L., & Fawcett, C. (Eds.). (1995). Nationalism, politics and the practice of archaeology. Cambridge: Cambridge
University Press.
Kohl, P. L., Kozelsky, M., & Ben-Yehuda, N. (Eds.). (2007). Selective remembrances: Archaeology in the construction, com-
memoration and consecration of national pasts. Chicago: The University of Chicago Press.
Kolar, J., Macek, M., Tkáč, P., & Szabó, P. (2015). Spatio-temporal modelling as a way to reconstruct patterns of past
human activities. Archaeometry, 58(3), 513–528. doi:10.1111/arcm.12182
Leeper, T. J. (2015). Collecting thoughts about data versioning: Contribute to Leeper/data-versioning development by creating
an account on GitHub. Retrieved October 2018, from https://github.com/leeper/data-versioning
Leroux, A., Lemonsu, A., Bélair, S., & Mailhot, J. (2009). Automated urban land use and land cover classification for
mesoscale atmospheric modeling over Canadian cities. Geomatica, 63(1), 13–24.
Levy, T. E., & Smith, N. G. (2007). On-site GIS digital archaeology: GIS-based excavation recording in southern
Jordan. In T. E. Levy (Ed.), Crossing Jordan: North American contributions to the archaeology of Jordan (pp. 47–58).
Oakville, CT: Equinox Publishing.
Llobera, M. (2007). Reconstructing visual landscapes. World Archaeology, 39(1), 51–69. http://doi.org/10.1080/
00438240601136496
Marwick, B. (2017). Computational reproducibility in archaeological research: Basic principles and a case study of
their implementation. Journal of Archaeological Method and Theory, 24(2), 424–450.
Marwick, B. (2018). Using R and related tools for reproducible research in archaeology. In J. Kitzes, D. Turek, &
F. Deniz (Eds.), The practice of reproducible research: Case studies and lessons from the data-intensive sciences. Oakland, CA:
University of California Press. Retrieved from www.practicereproducibleresearch.org/case-studies/benmarwick.
html
Marwick, B., d’Alpoim Guedes, J., Barton, C. M., Bates, L. A., Baxter, M., Bevan, A., . . . Wren, C. D. (2017). Open
science in archaeology. The SAA Archaeological Record, 17(4), 8–14.
McCoy, M. D. (2017). Geospatial big data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94.
Meskell, L. (2005). Archaeology under fire: Nationalism, politics and heritage in the eastern Mediterranean and Middle East.
London: Routledge.
Molloy, J. C. (2011). The open knowledge foundation: Open data means better science. PLoS Biology, 9(12),
e1001195. doi:10.1371/journal.pbio.1001195
Murrieta-Flores, P., & Gregory, I. (2015). Further frontiers in GIS: Extending spatial analysis to textual sources in
archaeology. Open Archaeology, 1(1), 166–175.
National Science Foundation. (2017). Dissemination and sharing of research results. Retrieved February 2018, from www.
nsf.gov/bfa/dias/policy/dmp.jsp
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., & Buck, S. (2015). Promoting an
open research culture. Science, 348(6242), 1422–1425. doi:10.1126/science.aab2374
ORBIS: The Stanford geospatial network model of the Roman world. (2015). Stanford University Libraries. Retrieved
October 2018, from http://orbis.stanford.edu/
Osborne, J. W. (2013). Best practices in data cleaning: A complete guide to everything you need to do before and after collecting
your data. Retrieved from http://srmo.sagepub.com/view/best-practices-in-data-cleaning/SAGE.xml
Pelagios Commons. (2018). Linking the places of our past. Retrieved October 2018, from http://commons.pelagios.
org/
Plewe, B. (2002). The nature of uncertainty in historical geographic information. Transactions in GIS, 6(4), 431–456.
Porter, S. T., Roussel, M., & Soressi, M. (2016). A simple photogrammetry rig for the reliable creation of 3D artifact
models in the field lithic examples from the Early Upper Paleolithic sequence of Les Cottés (France). Advances
in Archaeological Practice, 4(1), 71–86.
Rabinowitz, A. (2014). It’s about time: Historical periodization and linked ancient world data. ISAW Papers, 7(22).
Retrieved March 2018, from http://dlib.nyu.edu/awdl/isaw/isaw-papers/7/rabinowitz/
Rahm, E., & Hai Do, H. (2000). Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin,
23(4), 3–13.
Roebroeks, W., Gaudzinski-Windheuser, S., Baales, M., & Kahlke, R.-D. (2017). Uneven data quality and the earliest
occupation of Europe: The case of untermassfeld (Germany). Journal of Paleolithic Archaeology, 1(1), 5–31. https://
doi.org/10.1007/s41982-017-0003-5
Roosevelt, C. H., Cobb, P., Moss, E., Olson, B. R., & Ünlüsoy, S. (2015). Excavation is destruction digitization:
Advances in archaeological practice. Journal of Field Archaeology, 40(3), 325–346. https://doi.org/10.1179/2042
458215Y.0000000004
Silberman, N. A. (1989). Between past and present: Archaeology, ideology, and nationalism in the modern Middle East. New
York: Holt. Retrieved from http://hdl.handle.net/2027/heb.02303.0001.001
Sitara, M., & Vouligea, E. (2014). Open access to archeological data and the Greek law. In A. Sideridis, Z. Kardasi-
adou, C. Yialouris, & V. Zorkadis (Eds.), E-democracy, security, privacy and trust in a digital world. e-Democracy 2013.
Communications in Computer and Information Science, 441. Cham: Springer.
Snow, D. R., Gahegan, M., Giles, C. L., Hirth, K. G., Milner, G. R., Mitra, P., & Wang, J. Z. (2006). Cybertools and
archaeology. Science, 311(5763), 958–959.
Strupler, N., & Wilkinson, T. C. (2017). Reproducibility in the field: Transparency, version control and collaboration
on the project panormos survey. Open Archaeology, 3(1). https://doi.org/10.1515/opar-2017-0019
Thompson, P. A., Matloff, N., Fu, A., & Shin, A. (2017, August). Having your cake and eating it too: Scripted
workflows for image manipulation. ArXiv:1709.07406 [Eess]. Retrieved from http://arxiv.org/abs/1709.07406
Tri-Agency Statement of Principles on Digital Data Management. (2016). Retrieved February 2018, from www.
science.gc.ca/eic/site/063.nsf/eng/h_83F7624E.html?OpenDocument
Trigger, B. (2006). A history of archaeological thought (2nd ed.). New York: Cambridge University Press.
Van den Broeck, J., Argeseanu Cunningham, S., Eeckels, R., & Herbst, K. (2005). Data cleaning: Detecting, diagnos-
ing, and editing data abnormalities. PLoS Medicine, 2(10), e267. https://doi.org/10.1371/journal.pmed.0020267
Van der Linden, M., & Webley, L. (2012). Introduction: Development-led archaeology in northwest Europe: Frame-
works, practices and outcomes. In L. Webley, M. Van der Linden, C. Haselgrove, & R. Bradley (Eds.), Development-
led archaeology in Northwest Europe proceedings of a round table at the University of Leicester 19th–21st November 2009
(pp. 1–8). Oxford: Oxbow.
Vincent, M. L., Kuester, F., & Levy, T. E. (2014). OpenDig: Contextualizing the past from the field to the web. Medi-
terranean Archaeology and Archaeometry, 14(4), 109–116.
Wells, J. (2011). Four states of Mississippian data: Best practices at work integrating information from four SHPO
databases in a GIS-structured archaeological Atlas. Society for American archaeology e-symposium. Retrieved
from http://visiblepast.net/see/americas/four-states-of-mississippian-data-best-practices-at-work-integrating-
information-from-four-shpo-databases-in-a-gis-structured-archaeological-atlas/
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor & Francis.
Wheaton, J. M., Garrard, C., Whitehead, K., & Volk, C. J. (2012). A simple, interactive GIS tool for transforming
assumed total station surveys to real world coordinates: The CHaMP transformation tool. Computers & Geosci-
ences, 42, 28–36.
Willems, W. J. H., & Brandt, R. (2004). Dutch archaeology quality standard. Den Haag: Rijksinspectie voor de
Archeologie.
Wilshusen, R. H., Heilen, M., Catts, W., de Dufour, K., & Jones, B. (2016). Archaeological survey data qual-
ity, durability, and use in the United States. Advances in Archaeological Practice, 4(2), 106–117. https://doi.
org/10.7183/2326-3768.4.2.106
Wylie, A. (2002). Thinking from things: Essays in the philosophy of archaeology. Berkeley, CA: University of California
Press.
Yulmetova, M. (2018). Python script: Transformation of local coordinates to global coordinates. Retrieved March 2018, from
https://github.com/MariaYulmetova88/Transferring-local-coordinates-to-UTM-using-the-GPS-coordinates
Zoghlami, A., de Runz, C., Akdag, H., & Pargny, D. (2012). Through a fuzzy spatiotemporal information system for
handling excavation data. In J. Gensel, D. Josselin, & D. Vandenbroucke (Eds.), Bridging the geographic informa-
tion sciences: International AGILE’2012 Conference, Avignon (France), April 24–27, 2012 (pp. 179–196). New York:
Springer.
3
Spatial sampling
Edward B. Banning
Introduction
information in sample design (Hole, 1980); in fact, many archaeologists either summarized then ignored
useful information at their disposal or explicitly excluded it from their research plans.
Eventually, a backlash against simplistic and poorly conceived sampling designs led to the misplaced
rejection of sampling altogether, based on the idea that “such ‘samples’ often fail to capture the true vari-
ability present in the archaeological record” (Tartaron, 2003, p. 23). The sudden popularity of “full coverage
surveys” (Fish & Kowalewski, 1990) yielded sets of data that addressed some kinds of research questions more
effectively than a typical sample could but, in many cases, were actually still samples (Cowgill, 1990, p. 254),
albeit disguised as whole populations, at scales usually much smaller than those of older extensive surveys.
Because of the impression that the result was a whole population, often these projects paid no explicit atten-
tion to survey intensity, sample size, sampling error, or bias in the estimates that we might base on them (cf.
Cowgill, 1990; Plog, 1990; Tartaron, 2003). In part, this rejection of sampling resulted from the tendency to
confuse sampling with either searching (as for rare sites) or detection of spatial patterning (e.g., settlement
networks). Far too many introductory archaeology texts contributed to this confusion by describing sam-
pling as a method for “finding sites.” Conventional sampling is indeed a poor method for finding rare things
(Flannery, 1976, pp. 134–135; Redman, 1987, p. 251) or identifying extensive spatial structure (Banning,
2002, pp. 155–156). Archaeological sampling in a spatial context has almost disappeared from the scholarly
literature of the last three decades, despite the publication of Orton’s (2000) important book, mention of
sample design in standard references (e.g., Banning, 2002; Collins & Molyneaux, 2003; Drennan, 2010;
White & King, 2007), and the continued use of sampling in contract archaeology.
In reality, sampling is a tool that is useful in some situations, unhelpful in others, and that always
requires tailoring to the purpose at hand. The fact that poorly designed samples lead to incorrect con-
clusions or do not accomplish project goals (Redman, 1987, pp. 250–251) should not be an indictment
of sampling. We should also keep in mind that nearly a century of statistical research has explored the
nature of samples, their efficiency, statistics and sampling errors, making it unnecessary for archaeologists
to “reinvent the wheel.”
Method
have long noticed, this creates a paradox for most field archaeology in that it is usually impossible for us
to specify, in advance, what sites or artifacts exist in the population. Consequently, archaeologists have
favoured geometrical sampling frames, typically a rectangular grid arbitrarily imposed on a site or a region
from which they could select some rectangles for examination and ignore others. Early on, Binford (1964,
p. 428) claimed that “The units of the frame should be approximately equal in size,” and it was widely
assumed that using a regular grid ensured that this would be the case. This type of sampling frame became
de rigueur in North American archaeology, and increasingly in archaeology elsewhere, in the 1970s and 1980s.
Although archaeologists have favoured rectangular sampling grids, triangular ones are actually more
efficient. This is not only true in terms of the likelihood of intersecting sites or features, the characteristic
that archaeologists have most emphasized (Banning, 2002, pp. 97–102; Kintigh, 1988; Krakker, Shott, &
Welch, 1983; Verhagen, 2013), but also for minimizing errors in predictions based on the sample. Thanks
to geometry, a triangular grid with the same density of points is more likely than a square grid to “hit”
theoretically circular, or even oblong, sites or features that are smaller than the grid interval. In addition, the
triangular grid, by minimizing the farthest distance from sampled to non-sampled points, minimizes the
worst predictions of population characteristics (Thompson, 2012, pp. 302–303), such as artifact densities.
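The geometric advantage described above can be checked with a short Monte Carlo sketch. The figures are illustrative: a square grid of spacing 1 unit, a triangular grid with the same point density, and circular "sites" of radius 0.65 units placed at random; a site is "hit" when at least one grid point falls inside it.

```python
import math
import random

# Monte Carlo comparison: square vs. triangular point grids of equal density.
def nearest_square(x, y, s):
    # distance from (x, y) to the nearest point of a square lattice, spacing s
    dx = min(x % s, s - x % s)
    dy = min(y % s, s - y % s)
    return math.hypot(dx, dy)

def nearest_triangular(x, y, s):
    # triangular lattice with point spacing s: rows s*sqrt(3)/2 apart,
    # alternate rows offset by s/2; search the nearby rows and columns
    h = s * math.sqrt(3) / 2
    best = float("inf")
    j0 = math.floor(y / h)
    for j in range(j0 - 1, j0 + 3):
        off = (j % 2) * s / 2
        i0 = math.floor((x - off) / s)
        for i in range(i0 - 1, i0 + 3):
            best = min(best, math.hypot(x - (i * s + off), y - j * h))
    return best

def hit_rate(nearest, s, radius, trials=20000, seed=42):
    # fraction of randomly placed circular sites containing >= 1 grid point
    rng = random.Random(seed)
    return sum(nearest(rng.uniform(0, 50), rng.uniform(0, 50), s) <= radius
               for _ in range(trials)) / trials

s_square = 1.0
s_triangle = math.sqrt(2 / math.sqrt(3))  # equalizes the point density
p_square = hit_rate(nearest_square, s_square, 0.65)
p_triangle = hit_rate(nearest_triangular, s_triangle, 0.65)
```

With these figures the triangular grid intersects every site, because its farthest point-to-point distance (about 0.62 units) is below the site radius, while the square grid (farthest distance about 0.71 units) misses a small fraction of them.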
However, there is nothing sacred about square, triangular, hexagonal, or any other kind of geometric
sampling frames (Wobst, 1983). In fact, given that variability in both cultural and non-cultural spatial
information is unlikely to correspond even remotely with the borders of such units, and that some parts
of these units might even be inaccessible to observation (Binford, 1964, p. 428; Hole, 1980), they are argu-
ably a rather poor choice. A unit that is nominally 500 m × 500 m, but a third of which consists of a lake,
a steep and eroded slope, or a shopping mall, is clearly not comparable to one that has none of these. More
“natural” or “non-arbitrary” sampling frames, such as ones based on geological landscape elements at the
regional scale (e.g., Banning, 1996, 2002; Collins & Molyneaux, 2003, p. 21; Orton, 2000, pp. 3, 86; Sch-
langer, 1992; Stafford, 1995), or city blocks in an urban setting (e.g., Wallace-Hadrill, 1990, pp. 153–156)
can be more useful and also much more relevant to the variables in which we are most interested.
In addition, while spatial sampling always involves two dimensions, it is important to recognize that
it can often have a third: the thickness, depth or vertical component of deposits within a feature, site, or
landscape (Orton, 2000, pp. 167–168). As with two-dimensional sampling frames, this third dimension
can be geometrical and arbitrary, as with “arbitrary spits,” but it is usually better to use “natural” strati-
graphic boundaries whenever possible.
One of the pitfalls of spatial sampling frames is that it is easy to forget that they formally describe a
population that consists of spatial units, which are rarely congruent with sites, buildings or features, and
never with artifacts. It is important to remember that when we sample with spatial units as a way to get
at a population that consists of smaller things – like sites or artifacts – that these units contain, then we
are doing cluster sampling (see Cluster Sampling).
as a tendency to result in many elements with “zero” observations, or by increasing the tendency for sites
or features to fall into more than one spatial element (an “edge effect”). In reality, we need to balance
the spatial size of sample elements with other considerations, such as whether they will have sufficiently
uniform character within their boundaries, whether they are likely to have non-zero observations (for
cluster samples), whether accessibility or other practical factors will become a problem, how uniform or
“patchy” the sampled region is, and how much travel time or set-up time the sample size would entail.
Fixed sample sizes are ones that involve deciding, in advance, how large the sample will be, typically
on the basis of the cost of collecting or analyzing the sample. When we can calculate these costs, we can
simply budget for a particular sample size. For example, in a regional survey, we might estimate that our
survey team can walk a total of 60 km each day; if we have budgeted 30 field days to complete a survey,
then we could survey the number of sample elements that 1800 km would cover at whatever intensity or
coverage (Banning, Hawkins, & Stewart, 2011) we would like to accomplish, although it is a good idea
also to account for set-up and travel time between units. Alternatively, we might try to balance these costs
with the risk that the resulting sample will not meet particular objectives, such as being able to estimate
some parameter of interest or compare populations within a certain tolerance of precision, statistical
power, and confidence (see examples from McManamon, 1981 and Lee, 2012, below).
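The budgeting arithmetic in the regional-survey example above can be set out in a few lines. The 60 km per day and 30 field days come from the text; the crew size, transect length and walker spacing are hypothetical figures added for illustration.

```python
# Budgeting a fixed sample size from fieldwork costs (figures partly invented).
km_per_day = 60           # total walking the team can do per day (from the text)
field_days = 30           # budgeted field days (from the text)
total_km = km_per_day * field_days   # 1800 km of walking available

transect_length_km = 1.0  # assumed length of one sample transect
walkers = 5               # assumed crew size walking each transect
spacing_m = 25            # assumed interval between walkers (survey intensity)

km_per_transect = transect_length_km * walkers   # walking cost of one transect
n_transects = int(total_km // km_per_transect)   # affordable sample size
swath_width_m = walkers * spacing_m              # ground covered per transect
```

Under these assumptions the budget buys 360 transects, each sweeping a swath 125 m wide; tightening the walker spacing (higher intensity) shrinks the swath and, for the same budget, the area sampled.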
Systematic sampling
Unlike SRS, this design involves making only the first selection randomly, and then every other selection is
strictly determined by a “spacing” rule, such as taking every fifth element in a population ordered by the
sampling frame. In a spatial context, this usually means sampling by parallel, equally spaced transects, or
at the intersections of a regular grid, whether rectangular or not (Figure 3.1(b)). It is thus most common
Figure 3.1 Examples of random and systematic spatial samples using points, rectangles, and transects as the
sample elements. (a) random point sample, (b) systematic transect sample (walking north), and (c) systematic,
stratified, unaligned sample of small squares.
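A minimal sketch of generating a systematic spatial point sample like that of Figure 3.1(b): only the origin is chosen randomly, after which the grid spacing determines every other location. The region extent and spacing below are assumed values, not ones from the text:

```python
import random

# Systematic spatial point sample: one random start inside the first grid
# cell fixes the entire sample. Region size and spacing are assumptions.

random.seed(42)
width, height = 1000.0, 1000.0   # survey region in metres (assumed)
spacing = 200.0                  # grid interval (assumed)

x0 = random.uniform(0, spacing)  # the only random choices in the design
y0 = random.uniform(0, spacing)

sample = [(x0 + i * spacing, y0 + j * spacing)
          for i in range(int(width // spacing))
          for j in range(int(height // spacing))]
print(len(sample))  # 25 points on a 5 x 5 grid
```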
Spatial sampling 45
Stratified sampling
Stratified sampling takes prior information into account so as to ensure that important aspects of the
population’s variability are reflected in the sample. This involves subdividing the population into sub-
populations (or “strata”) that are different from one another in some meaningful way. Strata that are
highly arbitrary are useless or worse. From the statistical perspective, strata should differ significantly in
the parameters of interest; consequently, it is important to compare strata after sampling is complete to
ensure that the sample design was successful in differentiating these subpopulations. Some of the more
obvious and meaningful grounds on which to base stratification include differential probability of site
discovery (regionally) or feature preservation (in sites), or likely land use (regionally) or activity area (sites).
Stratifying by soil type or altitude only makes sense if we have an explicit theory as to why sites on red
soils or in highlands might differ in their character or distribution from ones on grey soils or in lowlands,
for example, while stratifying a site by cardinal directions (northwest quarter, etc.) would likely only make
sense by accident. Many archaeological applications of stratified sampling in the 1970s and 1980s failed to
justify their basis for stratification and it may have done little to improve inferences based on the samples
(Wobst, 1983, pp. 59–60). It is better to stratify within sites, for example, when we have information that
would lead us to believe that certain areas within them were predominantly industrial, cultic or adminis-
trative, while others were predominantly residential. As noted below, a version of stratified sampling can
also be useful in the context of testing specific spatial hypotheses.
Commonly, stratified sampling is proportional; that is, the sampling fraction in each of the strata is the
same. This can lead to problems when the strata themselves differ markedly in size because the smaller strata
might have too small sample sizes, while sampling the larger ones might be unmanageable or wasteful. In
such cases, and sometimes for other reasons, samplers may employ disproportional stratified sampling, in
which the sampling fraction varies from one stratum to another. Disproportional stratified sampling requires weighting factors for the various strata (typically the inverse of each stratum's sampling fraction) to compensate for the fact
that the probability of selection differs from stratum to stratum, and calculating statistics to estimate popula-
tion parameters must take these weights into account (Thompson, 2012, pp. 141–146). Ideally, dispropor-
tional stratified samples provide more precise estimates by having larger sample size and less sample variance
within the smaller strata than would be the case for a proportional design. However, many archaeological
samples have accidental imbalances in stratum size that are not optimal and may lead to larger variance.
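The weighting logic can be illustrated with a short sketch: stratum means are weighted by stratum size (N_h/N) so that unequal sampling fractions do not bias the overall estimate. All unit counts and densities below are invented:

```python
# Disproportional stratified estimate vs. a naive pooled mean.
# All stratum sizes and sherd densities are hypothetical.

strata = {
    # stratum: (N_h = units in stratum, observed values for sampled units)
    "floodplain": (100, [12.0, 15.0, 9.0, 14.0]),
    "upland":     (400, [2.0, 3.0, 1.0, 2.0]),  # same n despite larger N_h
}

N = sum(N_h for N_h, _ in strata.values())

# Weighted estimate: each stratum mean is weighted by N_h / N.
weighted_mean = sum(N_h / N * (sum(vals) / len(vals))
                    for N_h, vals in strata.values())

# Naive pooled mean, which over-weights the heavily sampled small stratum.
pooled = [v for _, vals in strata.values() for v in vals]
naive_mean = sum(pooled) / len(pooled)

print(round(weighted_mean, 2), round(naive_mean, 2))  # 4.1 7.25
```

Note how the naive pooled mean over-represents the small, densely sampled stratum, while the weighted estimate does not.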
elements will “line up” (Figure 3.1c). This strategy ensures a somewhat even sampling of the spatial popu-
lation without as artificial an arrangement as a purely systematic strategy would have, while also avoiding
problems that might result from spatially periodic patterns in the population, but it retains the disadvantages
of an artificially geometric sampling frame and omits the most important advantages of stratified sampling.
Another somewhat common sampling design is Probability Proportional to Size (PPS) sampling. In
this design, the population consists of spatial entities that vary in size (e.g., plots on a landscape, build-
ings in a site, or rock grains in a petrographic slide). Rather than randomly selecting these items directly,
sampling proceeds by imposing a set of randomly or systematically arranged points or line segments on
the area that encloses the population and then selecting every element that is intersected by a sampling
location (Figure 3.2; Orton, 2000, p. 186). Larger elements have a higher probability of intersection than
small ones (thus the PPS designation), which could result in biased estimates if we are interested in esti-
mating such parameters as average size unless we compensate for this effect. If, instead, we are interested
Figure 3.2 Example of a random Probability Proportional to Size (PPS) sample of agricultural fields used
as sampling elements. Any field that contains one or more of the random points is included in the sample
(hatched). Note how larger fields are over-represented, but this may have practical advantages in fieldwork in
terms of survey costs.
in estimating something like the total population of human communities that occupied a region, or the
ratio of one artifact type to another, there is no bias as long as we can reasonably expect that the density
of human occupation or the artifact ratio is much the same in large and small spaces. PPS sampling, typi-
cally with a grid of points, has been used to identify places that will be subsampled with archaeological
survey (e.g., Kuna, 1998).
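PPS selection of the kind shown in Figure 3.2 can be sketched with random points and axis-aligned rectangular "fields"; the field geometry and point count here are entirely hypothetical:

```python
import random

# PPS sketch: drop random points on the region and keep every field that
# contains at least one point. Field rectangles are invented for illustration.

random.seed(1)
fields = {  # name: (xmin, ymin, xmax, ymax)
    "big":    (0, 0, 80, 80),
    "small":  (80, 0, 100, 20),
    "medium": (80, 20, 100, 100),
}

points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(30)]

selected = {name for name, (x0, y0, x1, y1) in fields.items()
            for (px, py) in points
            if x0 <= px < x1 and y0 <= py < y1}
print(sorted(selected))
```

Because the large field occupies most of the region, it is almost certain to be intersected, while the small field often escapes selection; this is exactly the size bias that gives PPS its name.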
Cluster sampling
As mentioned above, whenever we use a sampling frame of spatial areas, whether geometrical or not, as a
means to select a subset of a population of some other kind of entity, such as sites, mounds, pits, features,
artifacts, seeds, or bone fragments, we are cluster sampling. In other words, the sampling elements are not
identical to the elements in the population of interest. For cluster samples, N and n are numbers of larger
sample elements or “clusters” in the population and sample respectively, while M is the number of things
in the population and m the number in our sample (Mueller, 1975; Orton, 2000, pp. 212–213; Read, 1975,
pp. 54–58; Thompson, 2012, pp. 157–166).
Calculating estimates of proportion, mean or density in such samples is not difficult. We can conceive of
the sample elements as n “clusters” and base the estimates on the total number of observations (m) across all
sampled clusters. So, the density of flakes from an excavation using a 1 m grid as a sampling frame would just be the total number of flakes found in all the sampled 1 m² units divided by the number of sampled units.
Calculating variance and standard error is somewhat more complicated, however, as we need to examine
how much each individual cluster deviates from the overall mean, proportion or density. The calculations
that most basic statistical and spreadsheet software provides unfortunately do not account for cluster sampling
and lead to biased estimates of dispersion or error. However, we can still calculate them properly by tracking
these deviations in a spreadsheet and applying appropriate formulae (Banning, 2000, p. 83; Drennan, 2010,
pp. 243–248; Orton, 2000, pp. 212–213). One other complication is the presence of “edge effects” when, for
example, a site falls partly within, but partly outside, an element of our spatial sample. Do we count it? Count-
ing whole sites in such cases leads to bias (i.e. the overestimation of site number and settled area). Among
the potential solutions is to count it as a fraction of a site, or only if its centre lies within the sample element.
Failure to deal with edge effects properly can lead to biased estimates from cluster samples.
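The cluster-sample calculations above can be tracked in a few lines rather than a spreadsheet: each sampled 1 m² unit acts as a "cluster", and dispersion is computed from how far each unit's count falls from the overall density. The counts and frame size are invented, and the finite population correction is a standard refinement rather than something the text specifies:

```python
import math

# Cluster-sample density and its standard error. Flake counts per sampled
# 1 m^2 unit, and the frame size N, are hypothetical.

flakes_per_unit = [4, 0, 7, 2, 0, 12, 3, 1]  # one count per sampled unit
n = len(flakes_per_unit)
N = 200                                       # units in the whole sampling frame (assumed)

density = sum(flakes_per_unit) / n            # flakes per m^2

# Variance of the per-cluster counts around the overall density...
var = sum((m_i - density) ** 2 for m_i in flakes_per_unit) / (n - 1)
# ...gives the standard error of the density estimate, here with a finite
# population correction since the sample is a noticeable fraction of N.
se = math.sqrt((1 - n / N) * var / n)

print(round(density, 3), round(se, 3))
```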
sampling units surrounded by “empty” sampling units. This is a very promising approach but, as with
cluster sampling, we need to be careful to use the correct statistics for this kind of sample; calculating the
mean and standard deviation for an adaptive sample requires us to account for the total number of sam-
pling elements in the population, the number of elements in the original sample, the number of networks,
and the number of elements in each network in order to avoid bias (Orton, 2000, p. 214; Thompson &
Seber, 1996, p. 96). Some North American jurisdictions now recommend or require professional archae-
ologists to use some version of adaptive sampling for survey by shovel-testing (e.g., Ministry of Tourism,
Culture and Sport [MTCS], 2011, p. 33), even though these cases are usually intended for discovery, not
estimation, and do not always conform to the statistical form of adaptive sampling. Consequently, they
may not provide unbiased estimates of population parameters.
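For adaptive cluster samples, one common unbiased estimator (a modified Hansen-Hurwitz form of the kind discussed by Thompson and Seber) has each initially selected unit contribute the mean of the "network" it belongs to, so the extra units added adaptively do not inflate the estimate. The sketch below uses invented counts:

```python
# Hedged sketch of a Hansen-Hurwitz-style adaptive cluster estimator:
# each initially sampled unit contributes its network's mean count.
# All counts are hypothetical; a unit with no finds is its own network.

networks_of_initial_units = [
    [0],          # empty unit (network of size 1)
    [5, 8, 3],    # unit that triggered adaptive expansion into a 3-unit network
    [0],
    [2],          # isolated unit with 2 finds
]

n = len(networks_of_initial_units)
est_mean = sum(sum(net) / len(net) for net in networks_of_initial_units) / n
print(round(est_mean, 3))  # 1.833 finds per unit
```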
One of the misconceptions that many archaeologists hold is that the formal or statistical sampling
methods discussed above are appropriate for the discovery or detection of sites and features in space. As
already noted, their actual purpose is to make inferences about populations, or to compare population
characteristics, not to ensure our detection of any particular observation (Shott, 1985; Wobst, 1983). In
fact, it should be quite obvious that a method whose premise is that we can make inferences on the basis
of a small subset of potential observations would, by definition, omit the majority of the population. The
only type of sampling that can maximize our chances of finding something in particular, let alone finding
almost everything, is purposive selection.
Non-probability sampling
Making inferences about populations – estimating their parameters, setting confidence intervals, or test-
ing statistical hypotheses about them – is relatively straightforward when we use any kind of probability
sampling, at least in the sense of avoiding uncontrollable bias. However, many archaeological samples were
not collected according to any formal probability sampling plan and there are even cases where it is more
effective or more reasonable to use a purposive (or authoritative) sample.
The simplest kind of non-probability sample is the convenience sample. This type of sampling involves
just taking whatever observations come most readily to hand, and results in a sample that has a high risk
of providing biased estimators. For example, spatial convenience samples are often clustered in space (e.g.,
samples taken along roads where accessibility is high) so that they are susceptible to spatial autocorrelation
and may not be representative of a much larger population space. From a Bayesian perspective, convenience
samples may still be useful as long as there is no a priori reason to think that one group of potential observa-
tions will differ from another group in the population with respect to the variables of interest (a concept
called exchangeability, Buck, Cavanagh, & Litton, 1996, pp. 73–74). In the case of sampling by intervals along
a pre-existing, straight road or pipeline corridor, for example, the resulting sample might be representative
of a population with respect to characterizing clay sources in the surrounding region as long as the road or
pipeline crosscuts the population space in a way that is not expected to favour particular soil types. However,
sampling along a road that follows the route of an ancient road or pathway would be a poor way to charac-
terize the proportions of pottery of different periods in the region, since sites and activities associated with
one period might cluster along the road while those of other periods have quite different spatial patterns.
to misunderstandings as to what it includes, how we can apply it, and for what objective, compounded
by a tendency to confuse it with convenience sampling. While it is true that convenience sampling is
likely to lead to biased estimates of population parameters, careful purposive designs can be much more
efficient than random ones when the goal is either to test specific hypotheses or to discover some kind of
“target,” such as a site or a feature, especially when the target is rare. They can also perform better than
probabilistic designs when prior information convincingly shows that sampling in some spaces would be
ineffective or wasteful.
Unlike convenience sampling, careful purposive selection involves many of the same steps as prob-
ability sampling, including creating a sampling frame and rules of selection. However, there are also
differences. Rather than obtaining a random or representative sample, purposive selection often has the
aim of targeting rare observations that a random sample would likely miss. In such cases, it approaches
a disproportional sampling of a stratified population so that spaces considered most likely to contain the
rare sites or artifacts receive greater density of examination. Such an approach is implicit in many state-
mandated survey guidelines that allow much lower survey intensity, or even no survey at all, in areas of
“low potential” (e.g., MTCS, 2011, p. 28). In other instances, geomorphological indications of the places
most likely to contain sites of a particular age can be the main criteria for allocation of survey (Hitchings
et al., 2013). Among the best-known applications of purposive survey are searches for shipwrecks; the
most efficient of these are informed by information on currents, reefs, ports, and the historically docu-
mented routes that ships favoured. However, it is notable that, in many of these cases, the goal is not to
make inferences about populations but merely to discover rare observations. As outlined below, purposive
sampling can also be important in the context of very specific hypotheses.
One of the hallmarks of purposive survey is its use of prior information. To return to the example of
sampling along a road, in the context of searching for a particular kind of site, say Roman forts and way-
stations that can be expected to cluster along ancient roads, purposive survey along the known routes of
such roads would be an excellent way to improve the chances of finding most of the relevant sites.
In excavation samples, purposive selection can be the sensible choice for a second stage of sampling
after an initial probabilistic sample (Redman, 1973). In a site with rectilinear architecture, for example, a
systematic sample might reveal segments of walls that provide no good information on the sizes of build-
ings. Following up with purposive sample units guided by the directions of these walls can lead to the
discovery of building corners, thus allowing us to reconstruct the sizes of whole buildings at relatively low
cost. However, we should keep in mind that these are likely to be, in some sense, PPS samples.
The most sophisticated versions of purposive selection involve optimal searching (Koopman, 1980;
Stone, 1975). Optimal searches are designed to locate particular kinds of observations with the lowest pos-
sible search cost, making them well-suited to the identification of rare kinds of sites or features, and they
make explicit use of prior information to guide the search. Searching of this kind can be a very useful
supplement to the more typical kinds of sampling – which provide information on the most common
or most representative phenomena – by providing information on rare phenomena that a conventional
sample would likely omit. However, it is important to keep in mind that observations obtained through
optimal searching should not be combined with those from probabilistic sampling to estimate population
characteristics without adjustments to account for the non-probabilistic way in which they were acquired.
For example, a more conventional sample might be sufficient to establish an upper limit (“detection limit”)
on the number or density of a rare kind of site, while the purposive sample can provide information on the
characteristics of those rare sites or document their presence when the conventional sample omits them.
Another important use of purposive selection is in the context of hypothesis testing. Some kinds of
archaeological hypotheses make specific predictions about where certain kinds of archaeological remains
should occur, and where they should not. A random sample would be an extremely ineffective way to
50 Edward B. Banning
test such predictions, and it is obvious that it is better to target spaces, whether in regions or within sites,
where the hypothesis predicts such remains should or should not be present. This involves classifying the
sampling frame into three categories – spaces where we predict there should be certain kinds of observa-
tions, spaces where they should not occur, and all the rest – and then disproportionately targeting the "positive" or "negative" spaces, or both. In cases where we introduce probability sampling within these three
categories of space, we can more formally describe this approach as disproportional stratified sampling.
When the “positive” spaces are few, however, we would probably want to investigate all of them.
One tool that has greatly enhanced the potential and role of purposive selection is predictive model-
ling, typically within a Geographical Information System (GIS) (see Verhagen & Whitley, this volume).
Although the most obvious use of GIS in a sampling context is for purposive selection, it can also be
extremely useful to create, and later test, the strata in a stratified sample design.
One thing to check is whether the sample size was adequate. Even if we attempted to determine an appropriate sample size with one of the methods illustrated below, such attempts are not always successful.
For example, it is only prudent to check the Standard Error of the estimates based on the sample to ensure
that they are within our tolerances for whatever confidence level we have chosen.
It is also critical to check the degree to which the strata in a stratified sampling design have captured
relevant variability. If there are no statistically significant differences between the characteristics of the strata, so that variances within strata are no smaller than the variance in the whole population, then the stratification was a failure. Among the ways to make this assessment would be to compare the stratum-level
estimates of a relevant ratio-scale parameter, such as site size or artifact density, using a statistical method
like analysis of variance (ANOVA).
However, in the case of archaeological sampling, we also need to look out for other kinds of errors.
In addition to the sampling error that is intrinsic to sampling, and well understood by statisticians, a
variety of non-statistical errors can have impacts on the results. Among the most common of these
is error that results from inaccessibility of some observations. For example, the sample design for a
regional survey may select some spatial units in which it was impossible to detect sites or artifacts
because of safety concerns, because a landowner, military authority or other government agency
denied access, or because the material we needed to detect was inaccessible to our detection methods,
as is usually the case whenever it is deeply buried or hidden by modern development. Similarly, a
sample of spaces within a site could easily be missing data either because the relevant materials were
destroyed or poorly preserved relative to other parts of the site, or because excavations in parts of the
site did not go deeply enough to encounter materials that are contemporary with those encountered
in the rest of the site. This effect, which is analogous to “non-response” in questionnaire-type sur-
veys, has an impact on the real sample size for some kinds of remains and, in some cases, could lead
to distorted (biased) inferences of site or artifact characteristics or their distributions in space. One
way to assess this is the response rate, which is simply the number of sampling units for which we
have complete data divided by the total number of sampling units that were nominally in our sample
(cf. Fowler, 1984).
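As a trivial worked example (the unit counts are invented), the response rate is simply:

```python
# Response rate for a spatial sample, analogous to questionnaire surveys.
# Both counts are hypothetical.

complete_units = 43   # sample units for which we have complete data
nominal_units = 50    # units nominally drawn into the sample

response_rate = complete_units / nominal_units
print(response_rate)  # 0.86
```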
Detection effects
A simplifying assumption is that any potential observation within each sample element will always be
observed without error. However, in archaeological spatial samples, potential observations are often
overlooked because of their lower detectability (Casteel, 1972; Gordon, 1993; Krakker et al., 1983).
In excavations, for example, we routinely fail to detect some of the lithics, sherds, seeds or bone frag-
ments because they are very small or difficult to distinguish from natural stones. Consequently, if we
do not account for this effect, our values of m in a cluster sample will be too low, leading to biased
estimates. Archaeologists use methods such as screening sediments in their attempt to improve detect-
ability, but these are never perfect. In some instances, it may be reasonable to assume that detectability
for a particular kind of artifact or feature is constant over our sample elements and we can account for
it with reasonable estimates of detectability (Thompson, 2012, pp. 215–219; Orton, 2000, pp. 26–27,
39; Banning et al., 2011). The more serious case is when there is differential detectability across our
population, as when we are attempting to survey artifacts of particular colour and texture against a
background of highly variable environmental characteristics (Banning, 2002, pp. 48–49), or when
there are significant inter-observer differences (Hawkins, Stewart, & Banning, 2003). To obtain
unbiased estimates of population characteristics, we then need to take these differences into account
(Thompson, 2012, pp. 224–225).
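Where per-unit detectability can be estimated, the correction is straightforward: dividing each observed count by its estimated detection probability yields an approximately unbiased count, in the spirit of the Thompson (2012) adjustment. The counts and probabilities below are invented:

```python
# Detectability-corrected counts: an observed count m with detection
# probability p implies roughly m / p items present. Data are hypothetical.

observed = [(30, 0.75), (12, 0.25), (20, 0.50)]  # (items seen, est. detectability)

naive_total = sum(m for m, _ in observed)
adjusted_total = sum(m / p for m, p in observed)  # inverse-probability correction

print(naive_total, adjusted_total)  # 62 128.0
```

With differential detectability across units, as in the survey examples cited above, each unit needs its own estimated probability, as here.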
Case studies
Figure 3.3 Map of the survey region of the Ayl to Ras an-Naqab Survey in southern Jordan, with three strata
and 500 m × 500 m sample elements (after MacDonald et al., 2012, p. 6).
dissected Stratum 1. In principle, it also provides a framework for making estimates of such things as the
number, density, and average size of sites, or the proportions of different artifact classes, in each stratum.
However, it also illustrates quite well some typical problems. First, given the arbitrarily square sam-
pling elements and the very jagged borders between strata, it is not surprising that many of the sampling
elements either straddle the boundary between two strata, making it unclear to which they belong, or
lie partly outside all three strata (edge effects). Second, in six of the squares in Stratum 1, severe ero-
sion over some or all of their extent made it impossible or unsafe to survey them as thoroughly as in
other areas (MacDonald et al., 2012, p. 9). Third, even had survey been possible, many archaeological
traces will not have survived in these areas because erosion has carried them all away. Consequently,
coverage of some parts of the survey region, especially in Stratum 1, is actually less than the nominal
sampling fraction of 5%, providing the potential for biased estimates of population parameters. Fourth,
the fact that the transects within sampling elements were so widely spaced means that these elements
were themselves subsampled, so that coverage, especially for artifacts and sites less than 50 m in size,
would be much less than 5% and estimates of some parameters, such as site size, would be biased unless
we correct for these effects. Finally, travel costs between such widely scattered survey units over a large
territory are large. These are not criticisms but rather the realistic consequences of surveys that employ
low-intensity subsampling and arbitrary, geometric sampling frames over highly diverse terrain. Strate-
gies to compensate for these challenges can include non-geometric sample elements (see next section)
and adding random units whenever units initially selected prove inaccessible or too eroded to preserve
archaeologically interesting deposits.
likely to be “hit” by at least one point than are small fields. Survey teams then subsampled the selected
fields by transects 10 m in width, subdivided into 10 m segments, and at varying intervals. Because visibility within fields is usually less variable than between fields, fields are often much better as sample
elements than arbitrary geometric units in that they make it easier to control for variability in detection
probabilities.
Taking this approach further, some archaeologists have used “landscape elements” defined on geoar-
chaeological grounds as sampling units. Where the archaeology of interest has considerable time depth, it
is possible or even likely that the landscape elements currently visible on the modern surface differ mark-
edly in age, some being remnants of ancient land surfaces that have mostly disappeared through erosion
or been deeply buried by alluviation or colluviation. In such circumstances, it makes sense to define a
sampling frame that consists of all landscape elements that are likely to be remnants of the ancient land-
scape, and either ignore younger ones or treat them as different strata in a stratified sample. In the deeply
incised drainage system of Wadi Quseiba, northern Jordan, Hitchings et al. (2013) recognized that most
of the valley floor that existed in Neolithic times had eroded away, leaving only fragments in the form of
“terraces” stranded some way up the sides of the valley walls. The set of these terraces, along with some
flat or gently sloping plateaus on the valley margins, became the sampling frame for a purposive sample
that employed predictive modelling in a GIS and Bayesian allocation methods to optimize the survey’s
chances of finding Neolithic sites (Figure 3.4).
Figure 3.4 Map of a portion of the Wadi Quseiba survey region in northern Jordan, showing the ephemeral
stream channels (dashed) and the population of landscape elements or “polygons” (hatched) that constituted
the sampling frame for Stratum 2 of this survey (after Hitchings et al., 2013).
n = (st)² / (rx̄)²
where n is the number of sample elements to be surveyed in a stratum or substratum, s is the standard deviation from the 1% pilot sample, used as an estimate of the population standard deviation (σ) for that stratum, t is the Z-score for 80% confidence (1.28), r is the relative error (0.1), and x̄ is the sample mean from the pilot sample, used as a rough estimate of the population mean (μ). This allowed him to estimate
the required number of sample elements as 38 in Stratum IA, 22 in Stratum IB, and 151 in Stratum II.
With the data at hand, it was not possible to establish the required sample size for Stratum IC.
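McManamon's sample-size formula is easy to apply directly. The function below implements n = (st)²/(rx̄)² with his t = 1.28 and r = 0.1 as defaults; the pilot-sample mean and standard deviation are invented figures, not his data:

```python
# Sample-size calculation n = (s*t)^2 / (r*x_bar)^2, with McManamon's
# defaults (80% confidence, 10% relative error). Pilot figures are invented.

def required_n(s, x_bar, t=1.28, r=0.1):
    """Sample elements needed to estimate the mean within r * x_bar."""
    return (s * t) ** 2 / (r * x_bar) ** 2

# e.g. a pilot sample with mean density 4.0 and standard deviation 6.0:
n = required_n(s=6.0, x_bar=4.0)
print(round(n))  # 369
```

Note how sensitive n is to the ratio s/x̄: patchy distributions (large s relative to the mean) demand far larger samples for the same precision.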
At the intra-site scale, Lee (2012) gridded the floors of large semi-subterranean houses at several
Korean Mumun sites in the Nam River Valley on a 1 m grid, and collected 10 litres of sediment from
each grid square and pit feature for paleoethnobotanical analysis. Taking the total of these at each site as
the population, and given the substantial labour costs of counting seeds in such volumes after flotation, she
used the same method as McManamon to estimate the sample size (number of grid squares and features)
needed to achieve a relative error on seed densities of 20% at 90% confidence (t = 1.83 and
r = 0.2). This provided a basis for estimating the proportions of different taxa among the seeds of whole
sites, but sacrificed the spatial information that could have resulted from sub-sampling the 10-litre vol-
umes of all of the grid squares and features. However, seed density was too low to achieve good estimates
of population characteristics from the latter strategy.
In a similarly time-consuming study of micro-remains in Late Neolithic house floors in Jordan,
Ullah, Duffy, and Banning (2015) opted for systematic sampling of the floors to reflect spatial pattern
combined with sequential sampling of the micro-remains in each sample element. Having collected
sediment samples from across a square grid imposed on each floor, they controlled for inter-observer
error in counting the remains in each square by having a large group of students each count a small
sample, with replacement, of the screened sediment from all volumes taken from gridded house floors.
To balance the problems of sparse data and the time costs of counting, they experimented with various
volumes for the subsample elements until settling on 3 ml and then used sequential sampling to have stu-
dents count remains in these small volumes until the standard error on the density of the most important
micro-remains levelled off (Figure 3.5). Having each grid context counted once by every student largely
controlled for varying student abilities at identifying particular classes of material, and makes it possible
to identify spatial variations across the house floors that provide hints of activity areas and site-formation
processes, in addition to estimates of the proportions of micro-remains in larger spatial units, such as
houses or activity areas.
Figure 3.5 Decline in the Relative Standard Error (RSE) of micro-refuse counts with increasing sample size in
the use of sequential sampling in Wadi Ziqlab. Sampling stopped after the three-point slope was less than 0.03
for three consecutive measures of RSE (after Ullah et al., 2015, p. 1254).
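The sequential stopping rule described for Figure 3.5 can be sketched as follows. The micro-refuse counts are simulated, and the "three-point slope" is interpreted here as the change in RSE across three successive subsamples; both the simulated data and that reading of the rule are assumptions, not Ullah et al.'s actual procedure:

```python
import math
import random

# Sequential sampling with an RSE-based stopping rule: keep counting 3 ml
# subsamples until the relative standard error of the running density
# estimate flattens out. Counts are simulated; the 0.03 slope threshold
# over three consecutive measures follows the Figure 3.5 caption.

random.seed(7)
counts, rses = [], []

def rse(xs):
    """Relative standard error of the mean of xs."""
    n, mean = len(xs), sum(xs) / len(xs)
    if n < 2 or mean == 0:
        return float("inf")
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return (sd / math.sqrt(n)) / mean

while True:
    counts.append(random.randint(0, 10))  # micro-refuse items in one 3 ml subsample
    rses.append(rse(counts))
    # Stop once the slope between RSE values three apart stays below 0.03
    # for three consecutive measures.
    if len(rses) >= 6 and all(abs(rses[-i] - rses[-i - 3]) / 3 < 0.03
                              for i in (1, 2, 3)):
        break

print(len(counts))  # subsamples counted before the RSE levelled off
```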
waste flakes in the sample elements in the strata/populations, eschewing a site-based approach. Statistical
analysis with respect to the hypotheses had equivocal results.
The polythetic stage of this research was a particularly good example of how sampling a region
could help archaeologists test a specific hypothesis about past human use of a landscape. However,
the Reese River project surveyed a large number of spaces that were likely irrelevant to the hypothesis
test, costing time that could have been invested in a larger sample size of spaces that had either high
or low probability of containing sites, or having high or low densities of projectile points and waste
flakes, if the Steward hypothesis was correct. In addition, the use of rectangular sample elements
almost certainly led to a lack of uniformity within each unit in the degree to which they satisfied
the polythetic criteria.
Williams et al. (1973) had already defined a more “natural” sample element than either sites or arbi-
trary squares. They note that it would be easy to identify spaces that qualify as sites, defining the spaces
by the edges of the artifact scatter, but it was impossible to do this for “non-sites.” Actually, it would be
equally impossible to identify the site areas until after the survey was complete. Consequently, the most
efficient approach would have been to classify the entire map of the survey area by the polythetic criteria
to identify contiguous areas of space that satisfied at least five of the seven criteria, as well as spaces that
satisfied, say, less than three of those criteria. This would have been somewhat difficult to do in the early
1970s (although they did accomplish this, at least roughly, for the “positive” spaces on the stereographic
pairs), but fairly straightforward today using a GIS. It then would have been possible to restrict survey to a
sample of the spaces where the hypothesis predicts there should be finds, and a sample of spaces where it
predicts there should not be any (or the artifact densities should be much lower). If the polythetic criteria
are good predictors of site location, there should be a significant difference between the two sets of spaces
in their densities of sites or artifacts, with most of the sites and the highest densities in the “positive” set; in
fact, the “negative” set should have hardly any sites or artifacts at all. Spaces between these two extremes
would contribute little or nothing to the hypothesis test (although they might be important for other
reasons), so avoiding their survey, or sampling only a small proportion of them, would reduce the survey
cost or allow a larger sample size of more useful spaces.
Conclusion
Although some flaws and misunderstandings in the archaeological applications of sampling theory caused
early enthusiasm to give way to a near-general scepticism, careful application of sampling theory continues
to allow archaeologists to draw some kinds of conclusions about populations and how they compare with
other populations with a high and well-specified degree of confidence, but only if they are realistic about
how they define such populations, and the research questions about them, in the first place.
In the context of that scepticism, most archaeology, including archaeology in an interpretive or post-
modernist vein, has continued to employ inferential statistics either formally or informally, but sometimes
lacks the careful attention to sample design that is necessary to draw sound conclusions from the data.
In a spatial context, most of our samples of artifacts are purposive or convenience cluster samples, often
from contiguous or highly clustered excavation areas within sites, that may not be representative of the
population of interest. While we will always need to make the most of the samples at our disposal, we
should at least be mindful of the potential biases in estimates we base on such samples.
Moving forward, defining the sampled population accurately will involve carefully considering the
site-formation processes that have altered the target population, and the factors, including detectability
and accessibility, that may impede our ability to draw meaningful samples from that population. A return
to broader uses of formal sampling should include greater attention to cluster sampling, more careful
58 Edward B. Banning
thought to the appropriate size and shape of sample elements, greater use of samples designed to test
specific hypotheses, and a bigger role for carefully conceived stratified samples.
As Hole (1980, p. 226) aptly points out, there is no magic formula to decide what sampling strategies
are best. “The optimal strategy always depends on what is meant by the word optimal, the nature of the
particular archaeological application, and the data.”
References
Banning, E. B. (1996). Highlands and lowlands: Problems and survey frameworks for rural archaeology in the Near
East. Bulletin of the American Schools of Oriental Research, 301, 25–45.
Banning, E. B. (2000). The archaeologist’s laboratory. New York, NY: Kluwer Academic and Plenum Publishing.
Banning, E. B. (2002). Archaeological survey. New York, NY: Kluwer Academic and Plenum Publishing.
Banning, E. B., Hawkins, A., & Stewart, S. T. (2011). Sweep widths and the detection of artifacts in archaeological
survey. Journal of Archaeological Science, 38(12), 3447–3458.
Beckner, M. (1959). The biological way of thought. New York, NY: Columbia University Press.
Binford, L. R. (1964). A consideration of archaeological research design. American Antiquity, 29(4), 425–441.
Buck, C. E., Cavanagh, W. G., & Litton, C. (1996). Bayesian approach to interpreting archaeological data. New York, NY:
John Wiley & Sons.
Casteel, R. W. (1972). Some biases in recovery of archaeological faunal remains. Proceedings of the Prehistoric Society,
38, 382–388.
Collins, J. M., & Molyneaux, B. L. (2003). Archaeological survey. Walnut Creek, CA: Altamira Press.
Cowgill, G. L. (1990). Toward refining concepts of full-coverage survey. In S. K. Fish & S. A. Kowalewski (Eds.),
The archaeology of regions: The case for full-coverage survey (pp. 249–259). Washington, DC: Smithsonian Institution Press.
Drennan, R. D. (2010). Statistics for archaeologists: A commonsense approach (2nd ed.). New York, NY: Springer.
Fish, S. K., & Kowalewski, S. A. (Eds.). (1990). The archaeology of regions: The case for full-coverage survey. Washington, DC:
Smithsonian Institution Press.
Flannery, K. V. (1976). Sampling on the regional level. In K. V. Flannery (Ed.), The early Mesoamerican village (pp. 131–
136). New York, NY: Academic Press.
Fowler, F. J. (1984). Survey research methods. Thousand Oaks, CA: Sage.
Gordon, E. A. (1993). Screen size and differential faunal recovery: A Hawaiian example. Journal of Field Archaeology,
20(4), 453–460.
Hawkins, A. L., Stewart, S. T., & Banning, E. B. (2003). Interobserver bias in enumerated data from archaeological
survey. Journal of Archaeological Science, 30(11), 1503–1512.
Hitchings, P., Abu Jayyab, K., Bikoulis, P., & Banning, E. B. (2013). A Bayesian approach to archaeological survey in
north-west Jordan. Antiquity, 87(336), project gallery.
Hole, B. L. (1980). Sampling in archaeology: A critique. Annual Review of Anthropology, 9, 217–234.
Judge, W. J., Ebert, J. I., & Hitchcock, R. K. (1975). Sampling in regional archaeological survey. In J. W. Mueller
(Ed.), Sampling in archaeology (pp. 82–123). Tucson, AZ: University of Arizona Press.
Kintigh, K. W. (1988). The effectiveness of sub-surface testing: A simulation approach. American Antiquity, 53, 686–707.
Koopman, B. O. (1980). Search and screening: General principles with historical applications. New York, NY: Pergamon Press.
Krakker, J. J., Shott, M. J., & Welch, P. D. (1983). Design and evaluation of shovel-test sampling in regional archaeo-
logical survey. Journal of Field Archaeology, 10, 469–480.
Kuna, M. (1998). Method of surface artefact survey. In E. Neustupný (Ed.), Space in prehistoric Bohemia (pp. 77–83).
Praha: Institute of Archaeology and Czech Academy of Sciences.
Lee, G.-H. (2012). Taphonomy and sample size estimation in paleoethnobotany. Journal of Archaeological Science, 39(3),
648–655.
Leonard, R. D. (1987). Incremental sampling in artifact analysis. Journal of Field Archaeology, 14, 498–500.
Lyman, R. L. (1995). Determining when rare (zoo-)archaeological phenomena are truly absent. Journal of Archaeologi-
cal Method and Theory, 2(4), 369–424.
MacDonald, B., Herr, L. G., Quaintance, D. S., Clark, G. A., & Macdonald, M. C. A. (2012). The Ayl to Ras an-Naqab
archaeological survey, southern Jordan 2005–2007. Boston, MA: American Schools of Oriental Research.
Spatial sampling 59
McManamon, F. P. (1981). Probability sampling and archaeological survey in the Northeast: An estimation approach.
In D. R. Snow (Ed.), Foundations of northeast archaeology (pp. 195–227). New York, NY: Academic Press.
Ministry of Tourism, Culture, and Sport [MTCS]. (2011). Standards and guidelines for consultant archaeologists. Toronto,
ON: Ministry of Tourism, Culture, and Sport.
Mueller, J. W. (1975). Archaeological research as cluster sampling. In J. W. Mueller (Ed.), Sampling in archaeology
(pp. 33–41). Tucson, AZ: University of Arizona Press.
Orton, C. (2000). Sampling in archaeology. Cambridge: Cambridge University Press.
Plog, F. (1990). Some thoughts on full-coverage surveys. In S. K. Fish & S. A. Kowalewski (Eds.), The archaeology of
regions: The case for full-coverage survey (pp. 243–248). Washington, DC: Smithsonian Institution Press.
Read, D. W. (1975). Regional sampling. In J. W. Mueller (Ed.), Sampling in archaeology (pp. 45–60). Tucson, AZ:
University of Arizona Press.
Redman, C. L. (1973). Multistage fieldwork and analytical techniques. American Antiquity, 34, 265–277.
Redman, C. L. (1987). Surface collection, sampling, and research design: A retrospective. American Antiquity, 52,
249–265.
Ringrose, T. J. (1993). Bone counts and statistics: A critique. Journal of Archaeological Science, 20, 121–157.
Rootenberg, S. (1964). Archaeological field sampling. American Antiquity, 30(2), 181–188.
Schlanger, S. H. (1992). Recognizing persistent places in Anasazi settlement systems. In J. Rossignol & L. Wandsnider
(Eds.), Space, time, and archaeological landscapes (pp. 91–112). New York, NY: Plenum Press.
Shott, M. J. (1985). Shovel-test sampling as a site discovery technique: A case study from Michigan. Journal of Field
Archaeology, 12(4), 457–468.
Stafford, C. R. (1995). Geoarchaeological perspectives on paleolandscapes and regional subsurface archaeology. Journal
of Archaeological Method and Theory, 2(1), 69–104.
Steward, J. H. (1938). Basin-plateau aboriginal sociopolitical groups. Bureau of American Ethnology Bulletin, 120.
Washington, DC: Smithsonian Institution.
Stone, L. D. (1975). Theory of optimal search. New York, NY: Academic Press.
Tartaron, T. F. (2003). The archaeological survey: Sampling strategies and field methods. In J. Wiseman & K. Zachos
(Eds.), Landscape archaeology in southern Epirus, Greece (pp. 23–45). Hesperia Supplements, 32. Princeton: American
School of Classical Studies at Athens.
Thomas, D. H. (1973). An empirical test of Steward’s model of Great Basin settlement patterns. American Antiquity,
38, 155–176.
Thomas, D. H. (1975). Nonsite sampling in archaeology: Up the creek without a site? In J. W. Mueller (Ed.), Sampling
in archaeology (pp. 61–81). Tucson, AZ: University of Arizona Press.
Thompson, S. K. (2012). Sampling (3rd ed.). Hoboken, NJ: John Wiley & Sons.
Thompson, S. K., & Seber, G. A. F. (1996). Adaptive sampling. New York, NY: John Wiley & Sons.
Ullah, I., Duffy, P., & Banning, E. B. (2015). Modernizing spatial micro-refuse analysis: New methods for collecting,
analyzing, and interpreting the spatial patterning of micro-refuse from house-floor contexts. Journal of Archaeologi-
cal Method and Theory, 22(4), 1238–1262.
Verhagen, P. (2013). Site discovery and evaluation through minimal interventions: Core sampling, test pits and trial
trenches. In C. Corsi, B. Slapšak, & F. Vermeulen (Eds.), Good practice in archaeological diagnostics (pp. 209–225).
New York, NY: Springer.
Vescelius, G. S. (1960). Archaeological sampling: A problem in statistical inference. In G. E. Dole & R. L. Carneiro
(Eds.), Essays in the science of culture, in honor of Leslie A. White (pp. 457–470). New York, NY: Crowell.
Wallace-Hadrill, A. (1990). The social spread of Roman luxury: Sampling Pompeii and Herculaneum. Papers of the
British School at Rome, 58, 145–192.
White, G. G., & King, T. F. (2007). The archaeological survey manual. Walnut Creek, CA: Left Coast Press.
Whitelaw, T., Bredaki, M., & Vasilakis, A. (2006). The Knossos urban landscape project. Archaeological Interpretation,
10, 28–31.
Williams, L., Thomas, D. H., & Bettinger, R. (1973). Notions to numbers: Great Basin settlements as polythetic sets. In
C. L. Redman (Ed.), Research and theory in current archaeology (pp. 215–237). New York, NY: John Wiley and Sons.
Wobst, M. (1983). We can’t see the forest for the trees: Sampling and the shapes of archaeological distributions. In
J. A. Moore & A. S. Keene (Eds.), Archaeological hammers and theories (pp. 37–85). New York, NY: Academic Press.
4
Spatial point patterns and processes
Andrew Bevan
Introduction
Point pattern analysis typically refers to a suite of statistical methods that address the potentially complex
spatial relationships that might exist among real-world phenomena (e.g. a spatial distribution of artefacts
across a house floor or of human settlements across a landscape), by simplifying these phenomena as 2D
points (occasionally extended to the 3D case). Sometimes the points can have categorical or numerical
labels (e.g. chronological phases or size/weight/area estimates), but often they do not, and the main ques-
tion of interest is what we might learn about the pure spatial structure of the point distribution itself. In
archaeology, such techniques have a long pedigree stretching back to the earliest informal interpretations
of artefact or site distribution maps (e.g. Crawford, 1912), to the rise of more quantitative archaeological methods from the later 1960s and 1970s (Clarke, 1968; Hodder & Hassall, 1971; Hodder & Orton,
1976; Clarke, 1977; Hodder, 1977), to the extra spatial support provided by Geographical Information
Systems (GIS) from the early 1990s onwards (Allen, Green, & Zubrow, 1990; Ladefoged & Pearson, 2000;
Wheatley & Gillings, 2002 pp. 114–120; Conolly & Lake, 2006 pp. 112–186), to the greater flexibility
offered by simulation-based techniques today (e.g. Crema, Bevan, & Lake, 2010; Nakoinz & Knitter,
2016). Archaeology has been an enthusiastic borrower of methods from neighbouring subject areas such
as ecology, statistical science or quantitative geography, but has also been forced to address some distinc-
tive challenges raised by its own uncertain, time-compressed and patchy evidence. This chapter reviews
some of the key concepts associated with point pattern analysis and point process modelling, whilst also
making some suggestions about which techniques are usually more effective in addressing archaeological
problems (for more detailed treatment beyond archaeology, please see Illian, Penttinen, Stoyan, & Stoyan,
2008; Gelfand, Diggle, Fuentes, & Guttorp, 2010; Diggle, 2013; Baddeley, Rubak, & Turner, 2015).1
Method
Spatial intensity
A simple, traditional way to progress from a first ‘eye-balling’ of a spatial point distribution to a
more formal characterisation of it is to measure the density of the points in the distribution within a
carefully-defined study area. Informally, archaeologists often get away without defining the exact spatial
region within which they are making their observations, but a crucial first step in any more formal treat-
ment is to define this zone explicitly (e.g. as a square, rectangle, irregular polygon, etc.). The stricter tech-
nical term for this measurement of density within a crisp analytical window is the first-order spatial intensity
of the point distribution. The simplest summary is a single number (often the Greek letter λ is used to refer to this) expressing the average intensity (aka expected value) of points per unit area. Figure 4.1(a) shows an example of a random distribution of 100 points in a 10 × 10 unit study area (for the sake of argument let us assume these units are in metres), so λ = 1. More localised impressions of spatial intensity are possible
if, for example, we divide this study area up into 25 grid squares or quadrats each 2 × 2m. A basic first
question to ask is how many points might we expect, by chance and all other factors being equal, to fall
into each of these quadrats (given each quadrat has an equal chance of receiving points)? On brief first
reflection, you might be forgiven for assuming that most quadrats should get about 100/25 = 4 points each. However, if the distribution is random, the actual observed count of points per quadrat should
not uniformly be 4, but rather should follow a theoretical Poisson distribution, with a lot of quadrats
exhibiting fewer points than 4 and a few exhibiting considerably more (Figure 4.1(b–c)). This theoreti-
cal premise is central to point pattern analysis: in a wholly random distribution of points, the observed
intensity of points per unit area should conform to a Poisson distribution (this is sometimes referred to as
complete spatial randomness or CSR, and the random behaviour that created the pattern is often referred to
as a Poisson point process). A more formal quadrat test of Figure 4.1(a), for example, confirms that it does
not depart significantly from what we would expect if the pattern were generated by a random Poisson
process (p [probability value]=0.44).
A further, related theoretical starting assumption is that any given point pattern will behave in a
homogeneous way across a given study area. However, many real world point distributions depart from
this starting assumption and are inhomogeneous, exhibiting systematically fewer or greater numbers of
points in certain parts of the study area than in others. An archaeological example might be a prehistoric
house-floor where more artefacts are found on the northeastern side of the floor than elsewhere. In such
observed cases of an inhomogeneous pattern, we typically assume that an external trend (e.g. a past cul-
tural behaviour that favoured artefact use in the northeastern part of a house or a preservation bias leading
to better survival of archaeological finds in this part of the house) is influencing the changing intensity
of points across the study area. As an example, Figure 4.1(d) shows a new pattern of 100 points where
there is now a clear first-order trend towards more points in the upper-right part. A quadrat count in
this second case produces a frequency distribution which does not look Poisson in shape and a quadrat
test confirms this (Figures 4.1(e–f), p = 0.001). In many cases, departure from this null model of a Poisson-distributed, random, homogeneous point pattern (aka CSR) is so obvious that we probably do not need to do
such a test. Indeed practical experience suggests that very few real-world archaeological examples are ever
wholly random (there is almost always some sort of first-order trend leading to an uneven distribution).
A popular and useful related method for summarising the first-order intensity of a point pattern is a
kernel density surface (aka kernel density estimation, or KDE). The principle behind this approach is similar
to the quadrat count map in Figures 4.1(b) and 4.1(e), but instead of the grid squares, a small window or
‘kernel’ is moved systematically across each part of the study area. At each location as the kernel moves, the
number of points inside the kernel are counted up and this total is then mapped at the temporary location
of the kernel centre, before the latter moves to a new location. The result, once the kernel has been moved
across the entire study area and a count at each specified location has been calculated, is often a raster (pixel-
based) map where each pixel expresses the intensity of the point pattern in that local neighbourhood. The
chosen kernel shape need not be square: GIS packages, for example, often provide kernel density estimates in
a crisply defined circular region (i.e. of a fixed radius), sometimes with a distance-weighting so that points
Figure 4.1 Examples of the first-order spatial intensity of a point pattern and its summaries: (a) a random point
distribution (n = 100, the study area is notionally 10 × 10 map units in size), (b) a quadrat count of the same,
(c) the histogram of observed quadrat counts and the expected Poisson distribution if the pattern is random,
(d) an inhomogeneous point distribution where the intensity of points is higher in the top-right corner, (e) a
quadrat count of the same, (f) the histogram of observed quadrat counts and the expected Poisson distribution if
the pattern is random, (g) kernel density estimate of the inhomogeneous pattern in (d) using a Gaussian kernel
with a standard deviation of 0.5 map units, (h) the same, but with a kernel standard deviation of 1 map unit,
and (i) the same, but with a kernel standard deviation of 2 map units.
falling near the centre of the circle contribute more to the density estimate than those falling on the edge.
In contrast, an even more useful choice for a range of statistical applications is a continuous two-dimensional
Gaussian kernel, which takes into account all points in each calculation (i.e. the kernel does not have an
abrupt outer boundary) and weights each point with a distance decay from the centre of the kernel that is
shaped like a normal (bell-shaped) distribution. This offers a statistically stricter estimate that maintains the
correct total intensity of the point pattern across the study area. A nice analogy for this stricter approach is
offered by Baddeley et al. (2015, p. 168) who suggest thinking of each point in the pattern as a square of
chocolate. A hairdryer (the continuous, distance-decaying kernel) is passed over the study area systematically
to melt each bit of chocolate a little. The result is that all of the chocolate bits turn into a smoother, slightly
melted surface, but one that still retains peaks of chocolate where there were more bits in the first place, and
one that conserves the original total mass of chocolate.
An important further choice in kernel density estimation is the size or ‘bandwidth’ of the kernel. In
Figures 4.1(g–i), kernel density surfaces are shown for three different Gaussian bandwidths corresponding
to 0.5, 1 and 2 map units, revealing finer- and coarser-scale information about the first-order patterning.
Automatic methods for selecting a statistically appropriate bandwidth do exist (Baddeley et al., 2015,
pp. 171–172) and are initially attractive in taking this otherwise often arbitrary decision out of the hands
of the user, but it is worth noting that such bandwidth optimisation routines do not always agree with
each other or produce the most visually convincing results (in the current example, for instance, three different bandwidth selectors suggest strikingly different optimal Gaussian bandwidths: σ = 0.5, 1.3 and 3.9), so the best strategy is often to explore more than one bandwidth and then justify your choice
in some manner (either by noting that it arises from consistent results amongst automatic methods or
because it reflects a meaningful behavioural scale for the pattern and process under study).
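The mechanics of a continuous Gaussian kernel surface can be sketched as follows (a deliberately naive Python illustration with simulated data and made-up parameter values; dedicated spatial packages offer faster, edge-corrected estimators):

```python
import numpy as np

def gaussian_intensity(xy, sigma, side=10.0, res=100):
    """Kernel intensity surface: a 2D Gaussian (sd = sigma) centred on each
    point, summed at every pixel centre of a res x res grid. No edge
    correction is applied, so some 'mass' leaks beyond the window."""
    gx = (np.arange(res) + 0.5) * side / res
    X, Y = np.meshgrid(gx, gx)
    dens = np.zeros_like(X)
    for x, y in xy:
        dens += np.exp(-((X - x) ** 2 + (Y - y) ** 2) / (2 * sigma ** 2))
    return dens / (2 * np.pi * sigma ** 2)   # points per square map unit

rng = np.random.default_rng(7)
xy = rng.uniform(0, 10, size=(100, 2))       # 100 points in a 10 x 10 window

cell = (10.0 / 100) ** 2                     # area of one pixel
for sigma in (0.5, 1.0, 2.0):                # the bandwidths of Figure 4.1(g-i)
    dens = gaussian_intensity(xy, sigma)
    # the integrated surface should stay close to n = 100 (the 'chocolate'
    # is melted, not created or destroyed), minus whatever leaks past the edges
    print(f"sigma = {sigma}: total intensity about {dens.sum() * cell:.1f}")
```

The widening shortfall from 100 as the bandwidth grows illustrates why careful implementations apply an edge correction in order to conserve the total intensity exactly.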
Beyond choices of kernel size and shape, it might be asked why one would bother with a kernel den-
sity surface at all when you could just eye-ball the spatial pattern of dots on the map itself (or use some
other statistical summary). Indeed, sometimes kernel density maps are published as ‘analysis’ when they
do not really seem to be adding to the authors’ argument. Nevertheless, there are several reasons why
KDEs are often an important starting point: (a) when different bandwidths are explored, they formalise our comparison of finer- versus coarser-scale first-order intensity, (b) they can summarise situations where there are too many points in the study area to eye-ball easily, where there are lots of nearly superimposed points, or where points need to be treated unequally with weights (although this should not be confused with situations where the points are sampling locations and interpolation is far more appropriate; see below), and (c) when a strict kernel density estimator is used, the non-parametric summary of inhomogeneous spatial intensity that it provides can be repurposed for more complex statistical treatments, for instance as part of inhomogeneous point process models or relative risk surfaces (see below).
Spatial interactions
So far we have considered the so-called first-order properties of a point pattern, those to do with the general
spatial intensity (density) of the points, and especially whether this intensity is homogeneous or not across
the study area. In cases where the pattern is not homogeneous, we might anticipate an external factor at
work: the vast literature of archaeological site location modelling (sometimes just called ‘predictive mod-
elling’, see Kvamme, this volume; Verhagen & Whitley, this volume), for example, is based firmly on the
assumption that a spatial pattern of past human settlements across a landscape is often non-random and that
site locations can be predicted by external variables such as ‘steepness of terrain’, ‘soil fertility’ or ‘distance
to the nearest river’. However, a crucial second aspect of point patterns is the fact that the existence of one
point at a certain location in the study area may well increase or decrease the chance that another point
occurs nearby. This is a second-order property of the point pattern, associated with the spatial interactions
that might exist between two or more points (it might also be called the pattern’s covariance structure).
Where non-random second-order patterning is observed, the assumption is usually that some process of
‘attraction’ encourages more clustering (aka clumping) of points than we would expect by chance or some
process of ‘inhibition’ is at work that discourages the points from getting too close to one another and leads
to a ‘regular’ (aka ‘uniform’ or ‘dispersed’) spatial pattern. Figures 4.2(a–c) show three simulated examples
of point patterns with, respectively, no second order effects, regular spacing and strong clustering of points.
Figure 4.2 Three hypothetical distributions (a–c) and how they manifest as K functions (d–f) and pair correla-
tion functions (g–i), the x-axes in (d–i) are in metres (see the scalebar in Figure 4.2(c)) and refer to the radius
of the circles around each point within which the respective K or Pair Correlation Function (PCF) statistic is
calculated; the critical envelope encompasses 95% of 999 simulations.
Table 4.1 Nearest neighbour (NN) test for the three hypothetical distributions in Figure 4.2(a–c) (with an edge
correction applied, as proposed by Donnelly (1978)).
There is also a long tradition of archaeologists addressing second-order patterns. For example, case studies
from the 1970s onwards sought to formalise our assessment of whether settlements were more evenly spaced
in the landscape than we might expect by chance, or whether certain artefact categories were clustered
together2 on a house floor (e.g. Hodder & Orton, 1976 pp. 38–51). A popular, but in truth highly problem-
atic, exploratory statistic and significance test for this in archaeology from the 1970s onwards was to calculate
the distance between each point and its nearest neighbour and compare the mean nearest neighbour distance
to what we might expect by chance given the sample size of points and size of the study area (borrowed from
ecology, Clark and Evans (1954), and known as a nearest neighbour test or Clark and Evans test). Table 4.1
summarises the results we would get for this test for the patterns in Figure 4.2(a–c). An r-value or nearest
neighbour index of around 1 suggests a random pattern, whereas r > 1 suggests a more regular pattern and r
< 1, a more clustered one, with the accompanying p-value (sometimes a z-score is quoted instead) indicat-
ing how significant this departure might be from our null hypothesis that the spacing of the points arose by
chance. For the three examples in Figures 4.2(a–c), the nearest neighbour test correctly identifies a random
pattern, significant regularity and significant clustering respectively.
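The Clark and Evans calculation itself is simple enough to sketch (in Python, on simulated data, and without the Donnelly edge correction used for Table 4.1, so r will be slightly biased upwards for points near the study-area boundary):

```python
import numpy as np
from scipy import stats
from scipy.spatial import cKDTree

def clark_evans(xy, area):
    """Clark & Evans (1954) nearest-neighbour index r with a z-test
    against the CSR expectation. No edge correction applied."""
    n = len(xy)
    lam = n / area
    # distance from each point to its nearest neighbour
    d, _ = cKDTree(xy).query(xy, k=2)      # k=2: self plus nearest neighbour
    d_obs = d[:, 1].mean()
    d_exp = 1.0 / (2.0 * np.sqrt(lam))     # expected mean NN distance under CSR
    se = 0.26136 / np.sqrt(n * lam)        # standard error of the mean NN distance
    z = (d_obs - d_exp) / se
    p = 2 * stats.norm.sf(abs(z))          # two-sided p-value
    return d_obs / d_exp, z, p

rng = np.random.default_rng(3)
xy = rng.uniform(0, 10, size=(100, 2))     # a CSR pattern, as in Figure 4.2(a)
r, z, p = clark_evans(xy, area=100.0)
print(f"r = {r:.2f}, z = {z:.2f}, p = {p:.3f}")   # r near 1 suggests randomness
```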
A first problem with the calculation of nearest neighbour distances, however (and one shared by many other methods discussed below), is the uncertainty introduced by the edges of the defined study area, in those
cases where we know the observed point pattern extends beyond the study area but is undocumented. In
such situations, we may over-estimate the nearest neighbour distance for a point falling near the edge of
the study area, because its real closest neighbour lies just outside but is not known. Fortunately, such edge
effects can be corrected in a variety of ways (as are those in Table 4.1), even if no single optimal solution
exists for all applications. A second, more severe limitation of the above nearest neighbour test is the fact
that it assumes second-order interactions only ever exist at one scale (the smallest spatial scale, operating
between nearest neighbours), whereas a host of real-world examples and simulated datasets demonstrate
that points might exhibit meaningful clustering or regularity at multiple scales (or indeed appear random
at one of these different scales, but not at others). A commonplace human example in many parts of the
past and present world is a landscape of villages, where houses might be spaced a small, regular distance
apart (e.g. to allow for gardens and private family space), but cluster together at a medium scale into
hamlets and villages (e.g. to enable various forms of social cooperation), only for those villages perhaps
to be spaced out from one another at the coarse scale (e.g. to share out farmland and other resources).
A variety of more complex methods have been developed since the late 1970s (especially Ripley,
1977) that respond to this challenge and are better designed to characterise random, regular and clustered
behaviour over different scales (with these methods only appearing in archaeology over the last 10–15
years, e.g. Bevan et al., 2010; Orton, 2004; Bevan & Conolly, 2006; Vanzetti, Vidale, Gallinaro, Frayer, &
Bondioli, 2010; Nakoinz & Knitter, 2016). Perhaps the most well-known of these multi-scale methods
is the K-function, which is constructed by considering each point in the pattern in turn and, for each
point, measuring the intensity of other points that fall in ever-expanding circular regions around it. The
K function is then a summary of this procedure: the mean point intensity for all circular neighbourhoods
of a particular radius. For simpler cases, it is possible to anticipate what the theoretical shape of the K
function would be if the point pattern were wholly random in nature (e.g. if it were Poisson distributed),
but as we shall see several times below, a more flexible approach to assessing this question is to simulate
many random sets of points (a Monte Carlo simulation, for the general approach, see Robert & Casella,
2004) and produce a ‘critical envelope’ that encompasses all (or a chosen percentage) of the simulated K
functions. If the observed K function falls above or below this envelope at a particular circle radius, then
it can be treated as departure from what we might expect by chance and therefore possibly worthy of
further attention.3 Figures 4.2(d–f) provide examples of the K-functions produced by the three simulated
point patterns in Figures 4.2(a–c). The results confirm that the first pattern is random at all scales, that
the second is regularly-spaced at small scales up to about 0.5m (a correct identification of the minimum
spacing imposed by the simulation) and that the third pattern is strongly clustered.
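A naive version of this procedure can be sketched in Python (illustrative only: no edge correction, a 95% envelope from 99 rather than 999 simulations, and simulated CSR data rather than the patterns of Figure 4.2):

```python
import numpy as np

def k_function(xy, radii, area):
    """Naive Ripley's K: mean number of further points within distance r
    of each point, divided by the overall intensity (no edge correction)."""
    n = len(xy)
    lam = n / area
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)            # exclude each point from its own count
    return np.array([(d < r).sum() / n / lam for r in radii])

rng = np.random.default_rng(11)
xy = rng.uniform(0, 10, size=(100, 2))
radii = np.linspace(0.1, 2.5, 25)
k_obs = k_function(xy, radii, 100.0)

# Monte Carlo critical envelope from 99 simulated CSR patterns
sims = np.array([k_function(rng.uniform(0, 10, size=(100, 2)), radii, 100.0)
                 for _ in range(99)])
lo, hi = np.percentile(sims, [2.5, 97.5], axis=0)

# Observed K falling outside [lo, hi] at some radius would flag clustering
# (above) or regularity (below); the theoretical CSR value is pi * r^2
inside = np.mean((k_obs >= lo) & (k_obs <= hi))
print(f"fraction of radii where observed K sits inside the envelope: {inside:.2f}")
```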
In fact, there are many alternatives to K-functions for measuring second-order patterning and each
one has certain strengths and weaknesses. As an example, it is worth making a direct comparison with the
results of calculating a pair correlation function (PCF) for the same three distributions (Figures 4.2(g–i)).
A PCF is calculated in a very similar way to the K-function, but with the important difference that its
circular radii are non-cumulative (i.e. each one is a doughnut-shaped ring or annulus that excludes the
previous circle [with a slight smoother then applied], whereas the K function uses cumulative circles). The
results agree closely with those provided by the K-function, except in the clustered case where the PCF
offers better information about the scale of clustering by suggesting that the pattern becomes random
again at circle radii greater than about 1.5–2m. This correctly identifies the approximate parameter used to create this
simulated pattern in the first place (and the fact that the K function does not identify it reflects its use
of a cumulative approach that is sometimes swamped by the presence of strong smaller-scale clustering).
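The annulus-based logic of the PCF can also be sketched (again a naive Python illustration on simulated CSR data, with simple ring binning rather than the smoothed, edge-corrected estimator described above):

```python
import numpy as np

def pair_correlation(xy, radii, dr, area):
    """Naive pair correlation function g(r): like K, but counting points in a
    non-cumulative ring (annulus) [r, r + dr) around each point, normalised
    so that g(r) = 1 under complete spatial randomness."""
    n = len(xy)
    lam = n / area
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)
    g = []
    for r in radii:
        ring = ((d >= r) & (d < r + dr)).sum() / n   # mean count per point
        g.append(ring / (lam * np.pi * ((r + dr) ** 2 - r ** 2)))
    return np.array(g)

rng = np.random.default_rng(5)
xy = rng.uniform(0, 10, size=(100, 2))
g = pair_correlation(xy, radii=np.arange(0.25, 2.5, 0.25), dr=0.25, area=100.0)
print(np.round(g, 2))   # values hovering around 1 indicate CSR at each scale
```

Because each ring excludes the previous circle, a clustered pattern produces g(r) well above 1 only at the scales where clustering actually operates, which is exactly the diagnostic advantage over the cumulative K-function noted above.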
The above examples so far have only addressed patterns with a single scale of second-order effect, but
in contrast Figure 4.3 provides an example with multiple-scales of second-order effect. More specifically,
Figure 4.3 Multi-scale second-order effects: (a) a simulated process with small-scale regularity and medium-
scale clustering (b) the pair correlation function of a (the x-axis is in metres and refers to the radius of the
circles around each point within which the pair correlation statistic [y-axis] is calculated; the critical envelope
encompasses 95% of 999 simulations).
for this example, a point simulation is constrained to produce spatial patterning with both a minimum
spacing of 25cm between points (i.e. small-scale, strong regularity) and then moderate clustering at
medium scales up to 2m. It is reassuring that the resulting PCF calculated for this example correctly
identifies both these patterns.
Figures 4.4(a–b) show a hypothetical example of such an approach, demonstrating that the deliberately manufactured first-order trend of higher point intensity in the top-right of the study area in Figure 4.1(d) is correctly recovered via the probability surface (i.e. by a kernel density of the Figure 4.1(d) points, divided by a kernel density of the Figure 4.1(a) and 4.1(d) points combined). Figures 4.4(c–d) offer a real
Figure 4.4 Mapping the spatial probability of a point subset: (a) a hypothetical example of 200 points that
combines the random and inhomogeneous point patterns from Figures 4.1 (a, d). (b) the local spatial probabil-
ity (out of 1) of finding a point from the inhomogeneous surface Figure 4.1d, with a much higher probability
to the top-r ight (c) UK Portable Antiquities Scheme data showing Iron Age gold and silver coins of ‘Dobunni’
style, and (d) the local spatial probability of finding gold coins with an area of much higher probability on the
western borders (the hatched area is an arbitrarily excluded zone where the overall number of coins of any
material is too low to allow a meaningful probability estimate).
Spatial point patterns and processes 69
world example showing how the same method can identify an interesting difference in gold versus
silver coins for an assumed late Iron Age tribal area in western England (associated with ‘Dobunni’-
style coinage), where the gold coins appear preferentially deposited at the margins of a possible tribal
territory (perhaps related to border disputes and/or mercenary payments) in contrast to the silver coins
which appear to have circulated more evenly within the tribal core region. These insights are possible
despite likely spatial biases in the modern recovery of Iron Age coins (Bevan, 2012, pp. 500–504, with
further references). A useful extra step to avoid misleading results is to truncate the probability surface
along its edges at an arbitrary minimum findspot density (i.e. do not calculate the probability where
the denominator is too low, such as the hatched area in Figure 4.4(d)).
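The kernel-density ratio and truncation step just described can be illustrated with a much cruder stand-in: a grid-count version in Python (standard library only; the function name, grid resolution and min_count threshold are all illustrative assumptions, and a real analysis would use kernel densities as in the chapter).

```python
import random

def probability_surface(subset, all_points, nx=10, ny=10, min_count=5):
    """Grid-count approximation of a relative probability surface:
    P(cell) = subset count / total count, masked (None) where the total
    count falls below min_count (cf. the hatched zone in Figure 4.4(d))."""
    def grid_counts(pts):
        g = [[0] * nx for _ in range(ny)]
        for x, y in pts:
            g[min(int(y * ny), ny - 1)][min(int(x * nx), nx - 1)] += 1
        return g
    num, den = grid_counts(subset), grid_counts(all_points)
    return [[(num[j][i] / den[j][i]) if den[j][i] >= min_count else None
             for i in range(nx)] for j in range(ny)]

random.seed(2)
# Background points everywhere, plus a 'subset' biased towards the top-right
background = [(random.random(), random.random()) for _ in range(800)]
subset = [(random.random() ** 0.5, random.random() ** 0.5) for _ in range(200)]
surface = probability_surface(subset, background + subset)
# Unmasked cells hold probabilities in [0, 1], higher towards the top-right.
```

The masking step mirrors the advice above: where the denominator count is too low, no probability is reported at all rather than an unstable one.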
As a final note in this section, it is worth addressing a common point of confusion with regard
to point patterns that have marks or attributes: the difference between a discrete point-event and a
point-based sample. If we have objects such as settlements or artefacts that are discrete in space and
time and can each be simply represented by a point with an accompanying mark representing some-
thing like size, then the point pattern and marked point pattern methods discussed in this chapter are
entirely appropriate. If, however, we have points that merely represent sampling locations from a wider
continuous field of possible measurements (that could in principle have been made anywhere in the
study area), then alternative methods are needed. For example, you might measure soil chemistry or
layer-depth information at multiple borehole sites across a valley or estimate surface pottery density in
a set of circular samples across a ploughed field. In these two examples, the point samples are part of a
wider continuous pattern, and a mixture of simple interpolation techniques, regression and geostatistical methods such as kriging is far more appropriate (Conolly, this volume; Hacıgüzeller, this volume;
Lloyd & Atkinson, this volume).
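As a contrast with the point-process methods above, one of those simple interpolation techniques, inverse distance weighting, can be sketched for hypothetical borehole samples (Python, illustrative names; kriging would additionally model the spatial covariance structure of the field).

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate of a continuous field at (x, y)
    from point samples [(xi, yi, value), ...]. Returns the exact sample
    value when (x, y) coincides with a sample location."""
    num = den = 0.0
    for xi, yi, v in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return v
        w = d ** -power
        num += w * v
        den += w
    return num / den

# Hypothetical borehole measurements (x, y, layer depth in metres)
boreholes = [(0.0, 0.0, 2.0), (1.0, 0.0, 4.0), (0.0, 1.0, 6.0), (1.0, 1.0, 8.0)]
centre = idw(boreholes, 0.5, 0.5)   # equidistant from all four samples -> 5.0
```

Because the query point is equidistant from all four samples, the estimate is simply their mean; nearer samples would dominate elsewhere.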
Case study
There are as yet very few published case studies that fit point process models to archaeological data (for a
preliminary example using Iron Age I sites in the West Bank, see Bevan et al., 2013), so it is worth exploring an additional case here. An interesting example is provided by two complementary perspectives
on recent settlement on the Greek island of Antikythera (Bevan & Conolly, 2013). On the one hand,
standing building evidence (e.g. old houses and other shelters) is what people traditionally think of as
defining the location, size and shape of historical period villages, but on the other hand, it is often through
surface artefact scatters that survey archaeologists seek to infer the same parameters for past episodes of
settlement. An important, under-appreciated question, then, is the degree to which the former and the
latter exhibit the same patterning, in the rare cases where both kinds of evidence are available. An inten-
sive survey across Antikythera’s entire extent (20.5 sq. km) revealed 407 standing buildings (excluding
special-purpose installations such as windmills) dating approximately to the 18th to earlier 20th centuries
AD (hereafter called the ‘Recent’ period, see also Bevan et al., 2004) and 1,644 diagnostic pottery sherds
of the same period (Figures 4.5(a–b)).
Both kinds of evidence are clearly clustered in only certain parts of the island and typically they seem
to coincide, as we might hope and expect. Pottery is likely to be discarded within people’s houses and next
to them in middens, for example, but it might also find its way further out into agricultural fields via pro-
cesses such as manuring. Previous regression modelling identified significant correlations between Recent
period sites (as identified by the pottery) and landscape variables such as access to lower lying, flat (and
better arable) land in the island’s softer geologies and proximity to freshwater springs (Bevan & Conolly,
2013, pp. 106–109). Here only two key variables are retained – the amount of local flat land within a 500m-radius neighbourhood and the distance to the two main freshwater springs (hereafter “flat land” and
“spring distance”) – not least because they seem to capture most of the first order variation we observe
(Figures 4.5(c–d)). For completeness, given the methodological focus of this chapter, Figures 4.6(a–b) show PCFs for the two observed datasets, with a critical envelope drawn from entirely random points. As expected, and obvious visually, both patterns are highly clustered at all spatial scales, such that the two plots are largely unnecessary. Inset maps in both figures, however, give an idea of what the random simulations that make up the critical envelope look like (completely spatially random).
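The envelopes themselves are produced in spatstat (via its envelope() machinery) from 999 simulations; the underlying Monte Carlo logic can be sketched in Python for a simpler statistic, the mean number of neighbours within radius r, bearing in mind the pointwise caveat of note 3. Standard library only; all names are illustrative.

```python
import math
import random

def neigh_stat(points, r):
    """Mean number of other points within distance r of each point."""
    n = len(points)
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.hypot(points[i][0] - points[j][0],
                                     points[i][1] - points[j][1]) < r:
                total += 1
    return total / n

def csr_envelope(n, r, nsim=99, rank=2):
    """Pointwise Monte Carlo envelope under complete spatial randomness:
    simulate nsim CSR patterns of n points in the unit square and take the
    rank-th smallest/largest statistic values (rank 2 of 99 simulations
    gives roughly a 96% envelope; the chapter uses 999 simulations)."""
    sims = sorted(
        neigh_stat([(random.random(), random.random()) for _ in range(n)], r)
        for _ in range(nsim))
    return sims[rank - 1], sims[-rank]

random.seed(3)
# A tightly clustered pattern: one Gaussian blob in the unit square
clustered = [(random.gauss(0.5, 0.05) % 1, random.gauss(0.5, 0.05) % 1)
             for _ in range(100)]
lo, hi = csr_envelope(100, r=0.1)
observed = neigh_stat(clustered, 0.1)
# A strongly clustered pattern should exceed the upper CSR envelope.
```

An observed statistic falling outside [lo, hi] at a given radius is then read, with the usual pointwise caution, as a departure from complete spatial randomness at that scale.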
More informative is the result when a first-order regression model is built by correlating the intensity
of observed points with the two covariates, flat land and spring distance. Both covariates are significant
and are retained in the resulting multivariate logistic models for both the buildings and the pottery. The
critical 95% envelopes of the PCFs can now be recalculated for simulations conditioned on these first-
order covariates (Figures 4.6(c–d)), with the result that the observed PCFs are now much closer to the
simulated ones. Even so, there is a suggestion of additional clustering of buildings up to about a 200m
interaction distance, and at even larger distances for the pottery. A third and final modelling stage there-
fore fits an additional clustering component, above and beyond the first order trend modelled by the
covariates.4 When the critical envelopes are now recalculated, the observed PCFs for both the buildings
and the pottery now fall almost entirely inside (Figures 4.6(e–f)), especially in the case of the buildings.
More informally, example simulations of the two models produce results that look close in character to
the observed villages and the observed pottery distributions, with finds in plausible locations and with
plausible levels of clumping (map insets in Figures 4.6(e–f)). The fit of the first order covariates can now also be adjusted in light of the additional fitted clustering component, and the final results suggest that flat land and spring distance are similarly strongly correlated with the buildings, but that for the pottery flat land was the more important of the two (with spring distance retained in the model but of marginal significance) (Table 4.2).
What can be learnt from such an exercise? It is reassuring that both pottery and buildings exhibit simi-
lar patterns with respect to the wider environment and also show signs of additional clustering. Whilst
we might further expect regularity in the spacing between villages in a landscape (i.e. not just short-range
clustering but also medium-range ‘inhibition’ due to competition over resources), in this particular case
the clustering alone is enough to account for the pattern. Perhaps more importantly, the pottery displays
stronger clustering than the standing buildings over short distances (up to 75m), but then also weaker
extended clustering at larger distances than the standing buildings.
Figure 4.5 Archaeological survey evidence from the Greek island of Antikythera: (a) individual houses and field huts of approximately 19th–early 20th century AD date; (b) surface pottery of approximately 19th–early 20th century AD date collected during fieldwalking of the whole island; (c) access to flat land (a count of how many flatland cells are within a radius of 500m); and (d) distance to the nearest large freshwater spring (square root-transformed).
Figure 4.6 PCFs for three stages of model fitting, calculated in the same way for both buildings and surface pottery (the x-axis is in metres and refers to the radius of the circles around each point within which the pair correlation statistic [y-axis] is calculated; the critical envelope encompasses 95% of 999 simulations): (a–b) observed PCF (black line) and 95% critical envelope (grey shaded area) constructed from random simulations of a homogeneous Poisson process (i.e. a null model of complete spatial randomness); (c–d) the same as (a–b), but with envelopes now constructed from conditional random simulations from a fitted first order model using the two covariates in Figures 4.5(c–d); and (e–f) the same as (c–d), but with envelopes now constructed from simulations conditioned on both the two first order covariates and an additional clustering model.
74 Andrew Bevan
Table 4.2 Summary results of final models (after adjustment for the correlation of the clustering component).
This suggests we should be a little cautious about our interpretation of the extent of pottery scatters as proxies for settlement footprints,
because (to offer an interpretation) highly localised, dense pottery dumps may under-represent village
spaces whilst wider manuring and other pottery dispersal in the wider landscape may over-represent them
(with a more accurate inference about what constitutes a ‘village’ probably lying somewhere in-between).
Conclusion
The above discussion is a necessarily rapid survey of the theoretical underpinnings, methodological chal-
lenges, and analytical opportunities associated with point pattern analysis in archaeology. While impor-
tant developments over the last few years have enhanced the applicability of these methods, a range of
challenges still remain. These methodological struggles are rarely new or trivial, but the way they arise with respect to the deceptively simple point pattern serves nicely to elucidate wider challenges of theory, method and communication in archaeology overall.
Acknowledgements
I have benefitted for many years from discussion with a range of spatially and statistically inclined colleagues
and students at University College London (although any remaining problems with it are of course my
own). Thanks especially to Mark Lake and James Conolly with whom I have co-taught relevant courses
and/or co-published on several occasions, as well as to Enrico Crema, Alessio Palmisano and Eva Jobbova
for further in-class assistance and insight. Thanks also to Denitsa Nenova for reading a draft of this chapter,
Daniel Pett for providing the PAS dataset and Joanita Vroom for her work on pottery from Antikythera’s
more recent past (for both the pottery and buildings datasets, see DOI: 10.5284/1024569). Many thanks to
Mark Gillings, Piraye Hacıgüzeller and Gary Lock for some fine editorial suggestions as well. Although soft-
ware solutions do come and go, it is worth noting here that the best place to conduct point pattern analysis
and point process modelling is currently the R statistical environment (R Development Core Team, 2011),
especially with the spatstat package (Baddeley & Turner, 2005; Baddeley et al., 2015).
Notes
1 In order to address as wide an audience as possible, this chapter tries to limit the amount of new statistical jargon
that it introduces, but inevitably some technical vocabulary is necessary. Important concepts are italicised the first
time they appear, while other jargon terms are sometimes mentioned more cursorily in single quotation marks.
The use of formalised statistical notation for concepts or particular analytical functions has been avoided: this
strategy of course has both strengths and weaknesses, but readers who prefer these formalisms are referred to the
four textbooks cited alongside this note.
2 While this chapter deals with how clustered points might be modelled, it does not deal with the related topic of
how one assigns points in a distribution to a particular clustered group. The latter is the spatial version of a wider
cluster-definition problem for which there are many spatial and aspatial applications (e.g. Maddison, this volume).
3 It is worth emphasising (a) that this Monte Carlo envelope does not provide a ‘confidence interval’ for the true
value of the K function, nor (b) does the default pointwise approach used in most software provide a ‘global significance test’ for interaction distances, because of the statistical pitfalls of testing multiple hypotheses at once (i.e. multiple circle radii; see Baddeley et al., 2015, pp. 233–236, for further details).
4 This paper largely avoids introducing the considerable technical terminology used for speaking about point
interaction models and covariance, but it is worth noting that for this case study, a Log Gaussian Cox Process (LGCP) was fitted with a spherical model for the buildings, and another with an exponential model for the
differently-shaped clustering of the pottery (in both cases also specifying an inhomogeneous trend via the two
mapped covariates). LGCPs are well-suited to situations where the causes of the additional clustering might
involve missing first-order variables and/or a mixed set of second-order interactions. It is also possible to fit
cluster processes that are more explicit about the kind of point interactions involved (e.g. parent-offspring
models of Neyman-Scott type, see the kppm() function in the R spatstat package and Baddeley et al., 2015,
pp. 473–479, for further details).
References
Allen, K. M. S., Green, S. W., & Zubrow, E. B. W. (Eds.). (1990). Interpreting space: GIS and archaeology. London, UK:
Taylor and Francis.
Baddeley, A. J., Rubak, E., & Turner, R. (2015). Spatial point patterns: Methodology and applications with R. Boca Raton,
US: Chapman and Hall and CRC.
Baddeley, A. J., & Turner, R. (2005). Spatstat: An R package for analyzing spatial point patterns. Journal of Statistical
Software, 12(6), 1–41.
Bevan, A. (2012). Spatial methods for analysing large-scale artefact inventories. Antiquity, 86, 492–506.
Bevan, A., & Conolly, J. (2006). Multi-scalar approaches to settlement pattern analysis. In G. Lock, & B. Molyneaux
(Eds.), Confronting scale in archaeology: Issues of theory and practice (pp. 217–234). New York, US: Springer.
Bevan, A., & Conolly, J. (2013). Mediterranean Islands, fragile communities and persistent landscapes: Antikythera in long-term
perspective. Cambridge: Cambridge University Press.
Bevan, A., & Wilson, A. (2013). Models of settlement hierarchy based on partial evidence. Journal of Archaeological
Science, 40(5), 2415–2427.
Bevan, A., Crema, E. R., Li, X., & Palmisano, A. (2013). Intensities, interactions and uncertainties: Some new
approaches to archaeological distributions. In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological
spaces (pp. 27–51). Walnut Creek, US: Left Coast Press.
Bevan, A., Frederick, C., & Krahtopoulou, N. (2004). A digital Mediterranean countryside: GIS approaches to the
spatial structure of the post-Medieval landscape on Kythera (Greece). Archaeologia e Calcolatori, 14, 217–236.
Clark, P. J., & Evans, F. C. (1954). Distance to nearest neighbour as a measure of spatial relationships in populations.
Ecology, 35, 445–453.
Clarke, D. L. (1968). Analytical archaeology. London, UK: Methuen.
Clarke, D. L. (Ed.). (1977). Spatial archaeology. Boston: Academic Press.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Crawford, O. G. S. (1912). The distribution of early Bronze Age settlements in Britain. The Geographical Journal,
40(2), 184–197.
Crema, E., Bevan, A., & Lake, M. (2010). A probabilistic framework for assessing spatio-temporal point patterns in
the archaeological record. Journal of Archaeological Science, 37(5), 1118–1130.
Diggle, P. (2013). Statistical analysis of spatial and spatio-temporal point patterns. Boca Raton, US: CRC and Taylor and
Francis.
Donnelly, K. (1978). Simulations to determine the variance and edge-effect of total nearest neighbour distance. In
I. Hodder (Ed.), Simulation studies in archaeology (pp. 91–95). New York, US: Cambridge University Press.
Eve, S., & Crema, E. (2014). A house with a view? Multi-model inference, visibility fields, and point process analysis
of a Bronze Age settlement on Leskernick Hill (Cornwall, UK). Journal of Archaeological Science, 43, 267–277.
Gelfand, A., Diggle, P., Fuentes, M., & Guttorp, P. (2010). Handbook of Spatial Statistics. London, UK: CRC and Taylor
and Francis.
Hodder, I. (1977). Spatial studies in archaeology. Progress in Human Geography, 1, 33–64.
Hodder, I., & Hassell, M. (1971). The non-random spacing of Romano-British walled towns. Man, 6, 391–407.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge, UK: Cambridge University Press.
Illian, J., Penttinen, A., Stoyan, H., & Stoyan, D. (2008). Statistical analysis and modelling of spatial point patterns. New
York, US: Wiley-Interscience.
Kelsall, J., & Diggle, P. (1995). Non-parametric estimation of spatial variation in relative risk. Statistics in Medicine,
14, 2335–2343.
Ladefoged, T., & Pearson, R. (2000). Fortified castles on Okinawa Island during the Gusuku Period, AD 1200–1600.
Antiquity, 74, 404–412.
Nakoinz, O., & Knitter, D. (2016). Modelling human behaviour in landscapes: Basic concepts and modelling elements. New
York: Springer.
Orton, C. (2004). Point pattern analysis revisited. Archaeologia e Calcolatori, 15, 299–315.
R Development Core Team. (2011). R: A language and environment for statistical computing. Vienna: R Foundation for
Statistical Computing. Retrieved from www.R-project.org/
Ripley, B. (1977). Modelling spatial patterns. Journal of the Royal Statistical Society B, 39(2), 172–212.
Robert, C., & Casella, G. (2004). Monte Carlo statistical methods (2nd ed.). New York, US: Springer.
Smith, B., Davies, T., & Higham, C. (2015). Spatial and social variables in the Bronze Age Phase 4 cemetery of Ban
Non Wat, Northeast Thailand. Journal of Archaeological Science Reports, 4, 362–370.
Vanzetti, A., Vidale, M., Gallinaro, M., Frayer, D. W., & Bondioli, L. (2010). The iceman as a burial. Antiquity, 84,
681–692.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS.
London, UK: Taylor and Francis.
5
Percolation analysis
M. Simon Maddison
Introduction
Percolation analysis is a technique for identifying and demarcating clusters within a set of spatially arranged points. Percolation theory was originally developed in the 1940s as a means of describing gelation processes in materials. Gelation occurs as small branching molecules chemically bond to become progressively larger macro-molecules through the formation of more and more bonds (Stauffer & Aharony, 1991), and is attributed to the work of Flory (1941) and Stockmayer (1943). An everyday example is the
change that happens to an egg when it is boiled.
There are two key aspects of this early work. First, it is based on a cellular lattice model, whereby each
cell is either occupied or not, and may or may not have neighbours. The second aspect is that the process
is applied on a sequential basis. Given a configuration of occupied cells, the process is applied step by
step, so that on the first step any given cell forms a cluster with its nearest neighbours; on the second step the process is reapplied to the newly clustered neighbours; and so on. Early research interest was in this
development within the lattice and the conditions that would allow it to progress (Stauffer & Aharony,
1991). At low occupation densities within the cellular lattice, clusters will be limited and will not spread far. However, at a critical density a cluster can grow across the lattice indefinitely, as in a boiled egg. It was on the conditions that would allow this to occur that this early work focused.
It was later that Broadbent and Hammersley (1957) dealt with the lattice process mathematically (see following) and gave the theory and method its name. Frisch and Hammersley (1963) describe the mechanism as fluid spreading through a medium and draw a clear distinction between percolation and diffusion: the behaviour is determined respectively either by the nature of the medium or of the fluid. These two approaches offer significantly different mathematical challenges, but it is the medium that is of interest in this case; hence the technique and term used is percolation.
Other applications perhaps better explain the percolation name, as the same model theory can be
used to describe, for example, the percolation of water through porous stone, hydrogen through solids, or
natural gas through porous rocks. The conditions sought are those that allow this to happen throughout
the material. A conceptually accessible application is the propagation of fire through a forest. Fire will
spread from tree to tree within a cluster when they are close enough, but at a critical density the fire can
spread indefinitely. The temporal aspect of the theory is important as it can be used to model the length
78 M. Simon Maddison
of time before the fire burns itself out, based on the number of steps it takes to propagate (Stauffer &
Aharony, 1991).
In summary, percolation theory is a way of mathematically describing clusters of spatially arranged
points and analysing related behaviour. A cluster is based on a defined distance threshold, so that for any
given point all neighbouring points falling within this threshold are part of the cluster. The test is then
re-applied for each of these neighbours, and any further points meeting this criterion are also deemed
to be part of the cluster (see following). The test is based on the distance between points as defined by
the number of cells between them (Stauffer & Aharony, 1991) and it is important to note, as can be seen
from the examples quoted, that this can be applied at any scale, from the molecular to the geographical
and beyond.
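The lattice model summarised above can be sketched with a simple flood fill over occupied cells (Python, standard library only; the lattice size and occupation probability are illustrative choices, set here above the well-known site-percolation threshold of approximately 0.593 for a square lattice).

```python
import random
from collections import deque

def lattice_clusters(occupied, size):
    """Group occupied cells of a size x size lattice into clusters of
    orthogonal neighbours, via the step-by-step spreading described above."""
    occupied = set(occupied)
    clusters, seen = [], set()
    for cell in occupied:
        if cell in seen:
            continue
        cluster, queue = set(), deque([cell])
        seen.add(cell)
        while queue:                      # each pass spreads to new neighbours
            x, y = queue.popleft()
            cluster.add((x, y))
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in occupied and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return clusters

def spans(cluster, size):
    """True if a cluster reaches from the left edge to the right edge."""
    xs = {x for x, _ in cluster}
    return 0 in xs and size - 1 in xs

random.seed(4)
size, p = 20, 0.7   # occupation probability above the critical density
cells = [(x, y) for x in range(size) for y in range(size)
         if random.random() < p]
clusters = lattice_clusters(cells, size)
percolates = any(spans(c, size) for c in clusters)
```

Rerunning with p well below the threshold typically leaves only small, non-spanning clusters, which is the critical-density behaviour the early work was concerned with.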
Method
Extending this diverse range of applications, percolation theory has more recently been used in
geography to identify metropolitan areas, based on population density. The City Clustering Algorithm
(CCA) has been developed out of the percolation theory by Rozenfeld, Rybski, Gabaix, and Makse
(2011) using British population data recorded for geographical cells. Described as ‘discrete CCA’, this is
based on a lattice model as described above and illustrated in Figure 5.1; the cellular structure and cluster
development are shown, illustrating the step-by-step approach of the cluster being identified.
Figure 5.1 Discrete City Clustering Algorithm applied to population density in the UK (Rozenfeld et al., 2008, Figure 1). This shows the step-by-step approach of the cluster being identified on a given lattice. Top left shows a populated lattice; top right, a cell is chosen as the starting point and its immediate neighbours are then incorporated (bottom left). In the final bottom right quadrant the process has been reapplied to those neighbours as well.
Percolation analysis 79
The top left quadrant shows a populated lattice. In the top right quadrant a populated cell is arbitrarily chosen as
the starting point, and its immediate neighbours are then incorporated (bottom left). In the final bottom
right quadrant the process has been reapplied to those neighbours as well. This approach using population
density has been carried forward by Arcaute et al. (2015).
This technique was further developed by Rozenfeld et al. (2011) to apply to US population data,
which was not available on a cellular basis as it is in Britain. In this development, the City Cluster-
ing Algorithm was modified to operate within a continuous (two-dimensional) space and use the
Euclidean distance between points, as opposed to distance within a cellular lattice for the ‘discrete
CCA’. The technique is described as ‘continuum CCA’, shown in Figure 5.2, where an arbitrary
point is selected as a start. Any point falling within a defined threshold distance ‘l’ becomes part of
the cluster, and the process is then re-applied to each of these points in turn, until the cluster grows
no further.
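The continuum CCA growth rule just described might be sketched as follows (Python rather than the published implementations; brute-force distance checks stand in for any spatial indexing, and all names are illustrative).

```python
import math
from collections import deque

def grow_cluster(points, start_index, l):
    """Continuum-CCA-style cluster growth: starting from one point, repeatedly
    absorb every point within threshold distance l of any cluster member,
    until the cluster grows no further. Returns the set of member indices."""
    in_cluster = {start_index}
    queue = deque([start_index])
    while queue:
        i = queue.popleft()
        xi, yi = points[i]
        for j, (xj, yj) in enumerate(points):
            if j not in in_cluster and math.hypot(xi - xj, yi - yj) <= l:
                in_cluster.add(j)
                queue.append(j)
    return in_cluster

# Two well-separated groups of 'sites'; a threshold smaller than the gap
# recovers one group, a larger threshold merges both.
pts = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.0), (5.0, 5.0), (5.5, 5.1)]
small = grow_cluster(pts, 0, l=0.7)    # -> {0, 1, 2}
large = grow_cluster(pts, 0, l=8.0)    # -> all five points
```

Note that the chain (0, 0)–(0.5, 0.1)–(1.0, 0.0) is recovered even though the two end points are more than 0.7 apart: membership propagates through intermediate neighbours, exactly as in the sequential reapplication described above.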
Arcaute et al. (2016) have more recently adopted this technique for defining urban areas, using
the density of street interconnections rather than population. Interestingly from the archaeological
point of view, they note that this approach reveals patterns that have evolved over millennia through
the influence of culture, politics, administration and trade, expressed as the modern patterns of streets
and roads. They have also developed analytical techniques for identifying transition points in cluster
growth as the distance threshold is progressively increased (see below). Before we go on to consider
the application of percolation techniques in archaeology, it is important to note that the focus of these studies has not been purely academic; they have the potential to directly influence regional development policy, for example.
Figure 5.2 Continuum City Clustering Algorithm (CCA) (Rozenfeld et al., 2011, Figure 2). With Continuum CCA, the technique is applied in a continuous space (as opposed to a lattice) and neighbours are defined as falling within a given radius ‘l’. The technique is applied sequentially, starting with an arbitrarily selected point (top left quadrant), and is then applied repeatedly to the newly included neighbours until the cluster grows no more.
The first application of the ‘percolation method’ to the spatial analysis of archaeological data came with the work of De Guio and Secco (1988), who were studying landscapes of power in Mesopotamia. Importantly and fundamentally, they recognised it as a technique of pattern recognition to identify ‘natural groupings’, which might be compared with other hypotheses and sources of information. They, however, used a model that incorporated several weighting parameters for each site pair in addition to the simple Euclidean distance between them; these were mainly based on estimates of site population size, and included ‘weighted density’, ‘demographic energy’ and ‘dominance’. Unlike Euclidean distance, this introduces significant dependencies on estimates and a degree of subjectivity. Possibly due to this complexity, their work does not appear to have been taken up or more widely used.
More successful have been approaches that built upon the CCA studies discussed earlier. For example, Arcaute, Brookes, Brown, Lake, and Reynolds (forthcoming) have collaborated in applying the same technique to sites extracted from the Domesday Book (e.g. The Domesday Book Online). They have used
Domesday vills and the administrative territories of hundreds, wapentakes and shires as recorded in 1086
AD to peel back the palimpsest of regions, territories and administrative boundaries in order to reveal
Domesday administrative organisation and relate it to modern Britain, through studies based on road
intersections. This builds on earlier work on the Anglo-Saxon state by Brookes and Reynolds (2011) and
the Landscapes of Governance Project.1 Likewise, Brown (2015) has developed a GRASS GIS routine for
percolation, again using the same core technique, and applied it to the database of rural settlements in
England created by Roberts and Wrathmell (2000). The approaches developed in the Domesday work
have been influential, directly inspiring the hillfort analysis that forms a case study in this chapter (Mad-
dison, 2016, 2017).
For the study of hillforts in Britain and Ireland, which forms one of the case studies described below,
a suite of programs has been developed to perform percolation analysis (Maddison, 2016) in the R sta-
tistical programming language.2 This is based on core code provided by Elsa Arcaute originally written
for the Domesday study. It uses an algorithm very similar to the continuum City Clustering Algorithm
developed by Rozenfeld et al. (2011) described above. In this approach clusters are identified by creating
a ‘graph’ of nodes based on distances within a defined radius. The process can be repeated for a range
of different percolation radii (threshold distances), given as parameters. It should be noted that whilst
this technique works satisfactorily for datasets of a few thousand points, as in the case of the study of
hillforts in Britain, it is not able to cope with the many tens of thousands of points that make up datasets
such as the UK street intersection analysis. To deal with these volumes of data, bespoke solutions have been developed using the C programming language (Arcaute et al., 2016). Brown (2015) implemented a similar approach through a GRASS GIS function (written in C), with some variations to optimize the data handling, described below. In all these bespoke solutions, data are processed in the form of x and y
coordinates for each point (e.g. the location of a hillfort or vill site), and typically read in from a .csv file,
including a unique identifier for each.
Examples of cluster plots (for hillforts) are shown in Figure 5.7. The colour indicates the size ranking of the cluster, with red being the largest cluster, blue the next and so forth. Only the largest 15 clusters are coloured; lesser-ranked clusters are shown in grey, so that all sites are still plotted. This provides a qualitative indication of clusters, which will be discussed further below, but it is invaluable also to have some way of identifying the significant clustering transitions as the percolation radius increases. After exploring various different approaches, Arcaute et al. (2016) developed the percolation transition graph, building upon earlier experiments in their city (Arcaute et al., 2015) and Domesday studies (Arcaute, Ferguson, Brookes, & Reynolds, 2014). An example of this is shown in Figure 5.3, from the hillfort case study. The vertical axis shows the normalized maximum cluster size plotted for each percolation radius. In this example it shows a ‘super-cluster’ forming at a percolation radius of 35km. This enables transitions of potential interest to be identified and further investigated. It is worth noting that transitions will not necessarily occur at the same values in different regions; this may depend on topography and site density, for example.
Figure 5.3 Percolation transition plot – max. cluster size vs. percolation radius.
Source: Taken from the hillfort case study, the vertical axis shows the normalized maximum cluster size plotted for each percolation radius. In this example it shows a ‘super-cluster’ forming at a percolation radius of 35km as well as other larger transitions such as those at 12km. The chart enables transitions of potential interest to be identified and further investigated.
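The data behind such a transition plot (the normalized largest-cluster size at each percolation radius) can be derived with a union-find pass per radius. The following Python sketch is illustrative only and is not the R suite described above.

```python
import math
import random

def largest_cluster_fraction(points, radius):
    """Fraction of all points belonging to the largest cluster at a given
    percolation radius, using a simple union-find over all close pairs."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.hypot(points[i][0] - points[j][0],
                          points[i][1] - points[j][1]) <= radius:
                parent[find(i)] = find(j)   # merge the two clusters
    sizes = {}
    for i in range(len(points)):
        r = find(i)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / len(points)

random.seed(5)
# Hypothetical 'sites' scattered over a 100 x 100 km study area
sites = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(200)]
radii = [2, 5, 10, 20, 40]
curve = [largest_cluster_fraction(sites, r) for r in radii]
# The curve is non-decreasing in the radius and reaches 1.0 once the
# threshold exceeds the largest gap between neighbouring groups of sites.
```

Sharp jumps in this curve as the radius increases correspond to the transitions of potential interest that the percolation transition graph is designed to reveal.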
This method of identifying clusters within a set of spatial points, based on Euclidean distance, is similar to Density-Based Spatial Clustering of Applications with Noise (DBSCAN) (Sander, Ester,
Kriegel, & Xu, 1998; Schubert, Sander, Ester, Kriegel, & Xu, 2017). DBSCAN provides two levels of
cluster membership, and defines clusters based on a minimum number of points that must lie within
the given radius. Points that satisfy this condition are 'core', and those that do not but are reachable via
core points are 'density-reachable' or 'non-core'. There is also a category of 'outliers' or 'noise', which are
neither. Requiring a minimum number of points (k) to form a cluster reduces the effect of single points
or thin chains of points linking clusters. Percolation analysis is essentially a reduced case of this, with
the minimum number of points per cluster k being 2, and therefore with only a single category of
cluster membership, namely 'core'.
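The relationship can be made concrete with a small sketch of DBSCAN's point classification (illustrative Python with invented coordinates, not any published implementation): with a minimum of 2 points, every point that has any neighbour within the radius is 'core', so the 'border' category disappears and only clusters and noise remain, exactly as in percolation analysis.

```python
from itertools import combinations
from math import dist

def classify(points, eps, min_pts):
    """DBSCAN-style classification: 'core' points have at least min_pts
    neighbours within eps (here counting themselves); 'border' points are
    non-core but within eps of a core point; the rest are 'noise'."""
    n = len(points)
    neighbours = {i: {i} for i in range(n)}
    for i, j in combinations(range(n), 2):
        if dist(points[i], points[j]) <= eps:
            neighbours[i].add(j)
            neighbours[j].add(i)
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    return ["core" if i in core
            else "border" if neighbours[i] & core
            else "noise"
            for i in range(n)]

pts = [(0, 0), (1, 0), (0, 1), (2, 0), (10, 10)]
# min_pts = 3 yields all three categories; min_pts = 2 (the percolation
# case) leaves only 'core' and 'noise'.
three = classify(pts, eps=1.5, min_pts=3)
two = classify(pts, eps=1.5, min_pts=2)
```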
Note that percolation analysis does not in any sense seek to explain the existence of clusters;
rather, it is a descriptive process that inductively highlights underlying patterns. The question is then
whether these clusters are in some sense a relict of former socio-political entities; other evidence can
then be sought to establish whether this is the case.
It is also important to remember that percolation analysis evolved from statistical techniques
in materials science, where it is applied to sets of spatially distributed points which are notionally identi-
cal and where there is no interest in distinguishing particular individuals. The same could be said for
road intersections when the technique is applied in geography. When used in archaeology, however, there
is potentially very great interest in the individual points, reflecting as they do distinct archaeological
entities. For example, when clusters merge as the radius threshold increases, specific sites may play a key
role in linking them together, and the ready identification of such sites is therefore of great value. This is
currently a limitation of the statistical functions described above, as recognised by both Brown
(2015) and Maddison (2017), and is a highly relevant topic for development in specifically archaeologi-
cal applications.
Case studies
Three case studies are presented below, which provide examples of the application of percolation analysis
to archaeological and historical databases; they illustrate the value and the further potential of the tech-
nique as well as providing some clear pointers as to where it might be developed in the future.
Domesday vills
Arcaute et al. (forthcoming) applied the continuum city clustering algorithm to Domesday vill sites, in
order to identify patterns of settlement in England at Domesday. The hypothesis driving the analysis was
that there was a hierarchy of territories, defined by the degree of interconnectivity between vill sites, for
which percolation is an ideal analytical tool. The dataset comprises 13,448 sites, which were analysed using a
suite of programs written in R (see following). The cluster transitions are neatly shown in Figure 5.4, with
thumbnails of the clusters plotted over an outline of England and Wales. Clusters have been computed with
the percolation radius incremented in steps of 0.1km, reflecting the geographical scale of the data. The step
size was determined empirically: a larger step size of, for example, 1km would mean missing some
of the potentially interesting intermediate clusters, while a smaller step size would result in additional
computation and plots of little value. By 8.6km, all sites fall within a single cluster.
The plots for radii of 3km and 2.9km overlaid on an outline of Domesday counties show clusters that
conform well to these boundaries (Figure 5.5). At 3.2km clusters also correspond well with the known
boundaries of older kingdoms derived by traditional historical methods, including those of Mercia, the
East Angles and Kent, as well as some other counties (Figure 5.6(A)). Subdivisions of the Iron Age and
Roman Province of Dumnonia are also identifiable. The visual power of the cluster plots, overlaid on
Figure 5.4 Percolation cluster transitions for Domesday settlement. Evolution of the largest cluster in the per-
colation process of Domesday settlement, overlaid on the transition plot (as in Figure 5.3). Maps of the clusters
at the distance threshold for each transition are depicted. Each vector point colour represents membership,
when two or more nodes are close enough to be part of the same cluster. A colour version of this figure can
be found in the plates section.
the Domesday county boundaries is striking, particularly as it corresponds so well with evidence from
historical sources of different periods.
Figure 5.6 Domesday vill and 19th-century settlement clusters. (a) Domesday vill clusters at 3.2km overlaid
on coastline and Domesday counties, generated from Domesday vill datasets provided by Stuart Brookes; (b &
c) Roberts and Wrathmell’s 19th Century Settlement Nucleation dataset at 3km (Brown, 2015, p. 37) and at
3.5km overlaid by Roberts and Wrathmell’s central province (Brown, 2015, p. 57).
Brown applied his percolation function to this database, using a radius step size of 500m. As above for
the Domesday example, this was chosen empirically to strike a balance between unnecessary computation
and output charts, and sufficient granularity to observe the development of clusters. By 13km all points
are included within a single cluster. Regions start to appear at 2.5km and agglomerate to three much
bigger clusters at 3km (Figure 5.6(B)), with a ‘central province’ convincingly appearing at a radius of
3.5km (Figure 5.6(C)). As the radius increases, Cornwall and part of Devon as well as Cumbria remain
independent at 5km as other areas are absorbed. There are some differences from Roberts and Wrathmell's
central province, but these highlight areas for focused investigation in order to identify possible explana-
tions. There is more here than can be covered, but the use of percolation analysis for this dataset
clearly provides great value, not only in corroborating some of the core conclusions of Roberts and
Wrathmell's work, but also in highlighting nuances as well as deeper and older patterns worthy of further
detailed investigation.
Figure 5.7 Hillfort clusters in Britain, at (a) 34km, (b) 12km and (c) 9km percolation radius.
These values were established empirically as a balance between distinctiveness and lack of differentiation
of the generated cluster plots. By 35km all sites form a single large cluster, except for the Hebrides,
Shetlands, Isle of Man and Scillies. Apart from southeast Scotland, the most interesting transitions occur
in the 6–13km range (see Figure 5.3). Above this there are few transitions and the clusters are very large.
Below this range the clusters fragment excessively, although they may be of value for very local studies.
Plots for 34km, 12km and 9km radii are shown in Figure 5.7.
At 34km (Figure 5.7(A)) the predominantly Scottish cluster includes sites in England as far south as a
line roughly between Morecambe and Flamborough Head, with more southern sites forming the largest
cluster. Looking at England as the radius reduces, sites in the Pennines and the east progressively
break out of the bigger cluster; at 14km the southwest peninsula forms its own cluster in Cornwall
and part of Devon, with other clusters appearing in the southeast.
The plot for 12km (Figure 5.7(B)) shows for example Cornwall and Devon/part of Somerset as
individual clusters, and a cluster along the Chilterns. The plot for 9km (Figure 5.7 (C)) shows northwest
Wales, the Clwydian Range, southwest Wales, the Gower, central Wales and the Marches and two clusters
on the northwest and the southeast of the Severn Valley, the latter being the Cotswolds. Some of these
clusters have been the subject of more detailed analysis (Maddison, 2016, 2017) and two are discussed
below.
To illustrate, two clusters in Britain are selected for discussion, namely central Wales and the Marches,
and the Cotswolds and lower Severn Valley. These have been identified from the cluster plots in Fig-
ure 5.8 and Figure 5.9. They have been plotted using a GIS on a topographical map, and the site size in
hectares has been used to scale the symbols. Specific sites and key rivers are also indicated.
Figure 5.8 Central Wales cluster at 9km with sites plotted according to area, and the rivers Wye and Severn.
Figure 5.9 Cotswold cluster at 10km radius with sites plotted according to area, and the rivers Wye, Severn
and Thames.
The Central Wales and the Marches cluster is situated around a high hilly region, with the upper
stretches of the Severn being a key feature. It incorporates sites that lead around the south to the upper
Wye, as well as the large site at Titterstone Clee over the river Teme to the east.
There are two dominant sites on the upper reaches of the Severn: Llanymynech Hill and Y Breiddin.
Llanymynech is one of the largest sites in Britain at 57ha, and is located where the rivers Vyrnwy,
Tanat and Cain reach the Severn Plain. It was very important as a source of copper, zinc and lead through
the Iron Age, and there is evidence of metal working dating earlier than this. Y Breiddin, at 28ha, is located
on the southern side of the Severn, shortly below where it is joined by the river Vyrnwy. These larger
sites suggest roles as dominant control points or entrepôts for goods moving from the hilly hinterland
to the plains and lowlands beyond, down the River Severn, as Brown (2008, pp. 196–204) has argued in
detail, building on the ideas of Sherratt (1996).
Figure 5.9 shows the Cotswold cluster at 10km radius located quite distinctly in the topography of
the hills, with two large and ten other sites on the north-west edge of the escarpment overlooking the
Severn Valley, Gloucester and Cheltenham. The upper reaches of the Thames are also shown to the south,
with Trewsbury Hillfort next to its source, as well as the Fosse Way, a railway and the Thames and Severn
Canal. One other site close to the Severn is Towbury Hill Camp, next to the M50 Severn bridge 4km
north of Tewkesbury. The location of the larger sites reflects not only the longevity of their importance,
through incorporation of much older monuments, for example from the Neolithic and Bronze Ages at
Nottingham Hill Camp (shown), but also their possible role in trade, being positioned on key waterways
and routes which have continued in importance up to modern times (e.g. Roman road, canal and railway,
M50 river crossing).
These brief analyses strongly hint at groups with distinct regional identities, tied to topographical
regions, with the potential for specific sites to be important either in terms of landscape dominance or as
key linking sites through important transshipment routes. The Cotswold group is also notable for fitting
in well within the old Gloucestershire county boundary. Studies of other clusters (Maddison, 2016) yield
comparable results. Together these suggest that further detailed comparative work, using additional data
attributes from the Atlas database (e.g. architectural features) and different data sources (e.g. finds records),
might build a strong case for identifying clearly defined regions for the period.
Implementation details
The following notes describe some key implementation details for the case studies described above.
The Domesday vills study used R code developed by Elsa Arcaute. This was then further developed
by the author for the hillfort study. The studies on street intersections in Britain used bespoke programs
written in C to handle the large datasets of many tens of thousands of points, which the R code could not.
The x and y coordinates for each point, along with a unique identifier, are read in from a .csv file. Dis-
tances between points are computed and stored in a sparse matrix by systematically working through each
point to every other point. An upper distance limit is set as a parameter, and values above this are not stored;
this keeps the file size from becoming unnecessarily large. Initial runs with the hillfort dataset for instance
showed that by a threshold of 40km almost all such sites in Britain fall within a single cluster; there was
therefore no point in storing the distance for points further apart, such as between those in Cornwall and
Scotland, for example. This improves speed of processing and keeps data files manageable in size.
For a given percolation radius, the matrix is reduced to those point pairs where the inter-point dis-
tances are less than or equal to the radius. A graph generation process is then applied to this sub-matrix,
and the clusters computed (a cluster comprising at least two points); each point is assigned a cluster
identifier for that particular percolation radius. The R functions used are graph.edgelist(), to create the
graph from the pair-list matrix, and clusters(), to generate the clusters from the resulting graph (R igraph
manual pages – Connected components of a graph, 2015). This process is repeated for each required value of
the percolation radius.
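The per-radius step can be sketched in Python, assuming the pair list is held as a {(id_a, id_b): distance} mapping (an illustration of the same graph-components idea; the case studies themselves used R's igraph):

```python
from collections import defaultdict

def clusters_at_radius(pair_distances, radius):
    """Label connected components of the graph formed by point pairs whose
    stored distance is <= radius; points with no qualifying pair receive
    no label, matching the rule that a cluster needs at least two points."""
    adj = defaultdict(set)
    for (a, b), d in pair_distances.items():
        if d <= radius:
            adj[a].add(b)
            adj[b].add(a)
    labels = {}
    next_label = 0
    for start in adj:
        if start in labels:
            continue
        stack = [start]  # depth-first traversal of one component
        while stack:
            node = stack.pop()
            if node in labels:
                continue
            labels[node] = next_label
            stack.extend(adj[node] - labels.keys())
        next_label += 1
    return labels

pairs = {("A", "B"): 1.0, ("B", "C"): 2.0, ("C", "D"): 10.0}
labels = clusters_at_radius(pairs, radius=5.0)
# A, B and C share a cluster identifier; D is unclustered at this radius
```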
For mapping, the clusters are ranked by size, and a colour sequence assigned, so that clusters can be
displayed with a colour coding according to rank (see Figure 5.7 for example).
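The ranking and colour assignment (as described at the start of this section, with lesser-ranked clusters shown in grey) can be sketched as follows; the function name, palette and fallback colour are illustrative:

```python
from collections import Counter

def rank_colours(labels, palette, fallback="grey"):
    """Map each cluster label to a colour by size rank; clusters beyond
    the palette length fall back to grey, as in the chapter's plots."""
    ranked = [lab for lab, _ in Counter(labels).most_common()]
    colour_of = {lab: palette[i] if i < len(palette) else fallback
                 for i, lab in enumerate(ranked)}
    return [colour_of[lab] for lab in labels]

# Three clusters of sizes 3, 2 and 1 with a two-colour palette: the
# smallest cluster drops to the grey fallback.
labels = [0, 0, 0, 1, 1, 2]
colours = rank_colours(labels, ["red", "blue"])
```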
Further R code has been developed to generate the data in map form using overlay boundary outlines,
provided as shape files, and accommodating the appropriate geographical reference frame. In the hillfort
case study earlier, these frames are different for Ireland and the British mainland.
Comparison with the R implementation of DBSCAN for hillforts has shown that the results are
indeed the same for k=2, with visual inspection of the clusters and the cluster transition plots show-
ing them to be identical. The recent R implementation of dbscan was used (Hahsler, Matthew, Arya, &
Mount, 2017) and integrated into the percolation analysis package described above, so the same source
data, parameters, mapping and analysis programs could be used. According to the R documentation
(dbscan), this implementation is significantly faster than earlier versions published in the fpc package
(Hennig, 2015) and works with larger datasets.
There is also a GRASS GIS function for clustering computations, v.cluster which includes a DBSCAN
method (Metz, 2015), but, to date, this has not been compared with the R implementation described and
may not handle large datasets.
Brown (2015, pp. 22–24) took a different approach, implementing a GRASS function in C.
He similarly starts with the computation of a distance matrix for every point pair. However, he then con-
verts the matrix to an edge list, and sorts the list on the basis of edge length. Clusters are then identified by
a membership algorithm which assigns cluster membership to points whose associated edge distances are
less than the defined percolation threshold. A key benefit of this approach is that the cluster identity (an
integer) is consistent as it grows through different radii (until it is either the largest cluster or is absorbed
into a larger cluster). This is unlike the R solutions above where the cluster identity is arbitrarily assigned
for each computation. This means it is much easier to inspect and interpret clusters over a range of radii
and makes it more accessible for archaeological applications.
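Brown's approach can be illustrated with a sketch in which a merged cluster keeps the label of the larger component, so labels remain comparable across radii (illustrative Python, not Brown's C implementation; the tie-breaking details are assumptions):

```python
def stable_cluster_labels(edges, radii):
    """Process edges (a, b, length) in order of increasing length; when two
    clusters merge, the combined cluster keeps the label of the larger one,
    so a cluster's identity persists as the radius grows."""
    parent, size = {}, {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    snapshots = {}
    edges = sorted(edges, key=lambda e: e[2])
    k = 0
    for r in sorted(radii):
        while k < len(edges) and edges[k][2] <= r:
            a, b, _ = edges[k]
            k += 1
            for node in (a, b):
                parent.setdefault(node, node)
                size.setdefault(node, 1)
            ra, rb = find(a), find(b)
            if ra != rb:
                if size[ra] < size[rb]:
                    ra, rb = rb, ra  # the larger cluster keeps its label
                parent[rb] = ra
                size[ra] += size[rb]
        snapshots[r] = {node: find(node) for node in parent}
    return snapshots

edges = [("A", "B", 1.0), ("B", "C", 2.0), ("C", "D", 5.0)]
snaps = stable_cluster_labels(edges, [1.5, 3.0, 6.0])
# at radius 1.5 only A and B are clustered; by 6.0 all four points carry
# the same persistent label
```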
Conclusion
Percolation analysis is a technique for identifying clusters within a set of spatial points. The examples
described above have used simple Euclidean distance, avoiding unnecessary complexity. In this way it is
a simple application of Tobler’s Law (Tobler, 1970) in that relatedness increases with proximity. The case
studies suggest that percolation analysis has great potential within archaeology for exploring and investi-
gating spatial data for evidence of past socio-political entities and distinct regions with their own identity.
In the first two studies historical datasets presenting tight snapshots of particular points in time have
been used for comparison with other evidence to explore hierarchies of territory. For the Domesday vills
this has provided corroboration of some county/administrative boundaries, as well as highlighting the
relicts of older kingdoms that no longer existed. The enduring character of these boundaries suggests that
the factors that created them in the first place continued to be influential in later periods and, in some cases,
up to modern times. Where there are differences, as in the central province of 19th-century settlements,
then they provide focus for more detailed investigation to better understand those specific areas. The fact
that there is good historical evidence from other sources for comparison, and that percolation analysis
provides not only corroboration but leads to other channels of investigation, gives a robust validation for
the application of the technique.
The hillfort study is different in a number of ways. There is no historic evidence for comparison, and
the dataset embodies sites that were created, developed and abandoned over different periods and over
very many centuries. In this case, percolation analysis has been applied in a totally exploratory way, to
identify groups of sites for investigation based on their relative proximity, rather than something poten-
tially much more arbitrary, as for example a modern county boundary. The results are visually compelling
and the comparison of clusters with topography, combined with symbology scaled by the enclosed area
of the sites, gives strong support to an argument for regionality, which warrants pursuing with other evi-
dence such as coins, pottery, building styles and so forth (see Cunliffe, 1991). It also provides the potential
for detailed comparisons with other datasets such as the Portable Antiquities Scheme,3 finds, place names,
population genetics (Leslie et al., 2015) and indeed the results of the other case studies.
A major weakness of course is that no account has been taken of site dating. The data and the clusters
generated reflect the final state of hillfort construction, and do not necessarily represent the situation that
may have prevailed in earlier times. However, the approach lends itself readily to more refined analy-
ses with such dating information as exists, and of course can be run repeatedly over time as more data
becomes available.
Percolation analysis is a new technique for archaeology and there are clear opportunities for further
development. As noted earlier it comes from a statistical background where points are identical, but this
is not the case in archaeology. The ability to readily identify individual points out of the analysis, such
as the peripheral ‘linking’ sites that bring clusters together as the radius is extended, would be of great
value. Other potentially important features are establishing a metric for cluster ‘robustness’ as the radius is
changed, and the need to find a way of labelling clusters as they grow. This would aid comparisons with
other datasets, and other groupings over long periods of time. As the available statistical methods are not
designed to provide this, development and adaptation of tools specifically for archaeology is an obvious
next step, as has been argued by Brown (2015). To this end, DBSCAN with its different classes of cluster
membership may provide a useful starting point.
All the examples here are based on simple Euclidean distance between points, and this has the advan-
tage of being easy to justify. However, as De Guio and Secco (1988) attempted, other parameters could
also be used, including weighted distances as a way of exploring cluster relationship to human movement
(Brown, 2015). Even for the transitions observed based on Euclidean distance (e.g. in Figure 5.4), some
comparisons with 'day's walk' distances for different terrains might help explain variations in cluster
transition distances in different regions (see Herzog, this volume).
Percolation analysis provides a powerful way of visualizing clusters within an archaeological spatial
dataset and deserves to become a core tool for spatial analysis in archaeology. It is not a magic tool to
elicit the past, but it can generate useful hypotheses and the starting point for more detailed work, which
can then focus on relevant details and incorporate other sources of data, providing guidance, support and
corroboration. It hints at possible prehistoric groupings and cultural/socio-political entities, but as always
detailed investigation on a case by case basis is required.
Notes
1 www.ucl.ac.uk/archaeology/research/projects/assembly, Reynolds, A., Yorke, B., Carroll, J., Baker, J., & Brookes,
S. Landscapes of governance project. Retrieved from http://www.ucl.ac.uk/archaeology/research/projects/assembly.
Accessed 2016.
2 www.r-project.org, The R project for statistical computing. Retrieved from https://www.r-project.org. Accessed
2015, 2016.
3 https://finds.org.uk/, The portable antiquities scheme. Retrieved from https://finds.org.uk/. Accessed 2016.
References
Arcaute, E., Brookes, S., Brown, T., Lake, M., & Reynolds, A. (forthcoming). Case studies in percolation analysis:
The distribution of English settlement in the 11th and 19th centuries compared. Journal of Archaeological Science.
Arcaute, E., Ferguson, P., Brookes, S., & Reynolds, A. (2014). Natural regional divisions of places in Domesday Book. Paper
presented at The Connected Past, Imperial College London.
Arcaute, E., Hatna, E., Ferguson, P., Youn, H., Johansson, A., & Batty, M. (2015). Constructing cities, deconstructing
scaling laws. Journal of the Royal Society Interface, 12(102), Article 20140745.
Arcaute, E., Molinero, C., Hatna, E., Murcio, R., Vargas-Ruiz, C., Masucci, A. P., & Batty, M. (2016). Cities and regions
in Britain through hierarchical percolation. Royal Society Open Science, 3(150691). http://dx.doi.org/10.1098/
rsos.150691
Broadbent, S. R., & Hammersley, J. M. (1957). Percolation processes. Mathematical Proceedings of the Cambridge Philo-
sophical Society, 53(3), 629–641. doi:10.1017/S0305004100032680
Brookes, S., & Reynolds, A. (2011). The origins of political order and the Anglo-Saxon state. Archaeology International,
13/14, 84–93. http://dx.doi.org/10.5334/ai.1302
Brown, I. (2008). “Beacons” in the landscape: The hillforts of England and Wales. Bollington: Windgather.
Brown, T. (2015). The potential for percolation analysis within archaeology: Constructing and implementing an accessible percola-
tion method (Unpublished MSc thesis). University College London.
Cunliffe, B. (Ed.). (1991). Iron Age communities in Britain (3rd ed.). London and New York: Routledge.
dbscan. Retrieved from www.rdocumentation.org/packages/dbscan/versions/1.1-1/topics/dbscan
De Guio, A., & Secco, G. (1988). Archaeological applications of the percolation method. Computer and quantitative methods
in archaeology, University of Birmingham.
Introduction
Geostatistics offers a framework for characterising the spatial structure of archaeological variables and
for predicting their values at locations where no sample observations are available. It can also guide sam-
pling strategies by relating spatial variation to sample spacing. In this chapter, we review the principles
of geostatistics at a basic and accessible level and we consider, via case studies, how geostatistics can ben-
efit archaeological applications. We do this with the overall intention of encouraging continued wider
adoption of geostatistics by archaeologists. The chapter begins by introducing some key concepts which
underlie geostatistics. Spatially referenced variables such as measurements of heights on earthworks,
resistivity surveys, or artefacts, have spatial structure and measuring this structure may offer considerable
insights. A key application of geostatistical tools is spatial prediction in cases where there are sparse
samples (e.g. soil geochemistry or elevation) and a complete set of gridded values is desired for the whole
study area (see also Banning this volume; Conolly this volume). In such cases, kriging prediction can
be used to derive values with the predictions informed by the variogram – a measure of the scale and
magnitude of spatial variation. Kriging and related approaches are introduced before providing some real
case studies to bring these principles and techniques to life in an archaeological context.
Geostatistical methods have been used to analyse a wide variety of variables in archaeological studies.
Introductions to the topic are provided by Robinson and Zubrow (1999), Wheatley and Gillings (2002),
Lloyd and Atkinson (2004), and Conolly and Lake (2006). Lancelotti, Negre Pérez, Alcaina-Mateos,
and Carrer (2017) offer a brief review of kriging and allied methods. Applications of geostatistics in
archaeology are clearly diverse, focusing on a variety of variables: digital elevation models (DEMs)
(Bentley & Schneider, 2000; Hageman & Bennett, 2003; Hesse, 2010; this chapter); soils (Lloyd &
Atkinson, 2004; Entwistle, McCaffrey, & Dodgshon, 2007; Wells, 2010; Burnett et al., 2012); tephra
thickness (Athanassas, Modis, Alçiçek, & Theodorakopoulou, 2018); multiple variables including pollen,
non-pollen palynomorphs, macrofossils, and loss on ignition (Revelles et al., 2017); human remains (14C)
(Bocquet-Appel & Demars, 2000); electrical resistivity (Webster & Burgess, 1980); settlement terminal
dates (Neiman, 1997); site density (Zubrow & Harbaugh, 1978); coins (this chapter); pottery (Bentley &
Schneider, 2000; Lloyd & Atkinson, 2004; Bevan & Conolly, 2009) and lithics (Ebert, 2002; Barrientos,
Catella, & Oliva, 2015).
94 Christopher D. Lloyd and Peter M. Atkinson
An early application was concerned with archaeological site location prediction: Zubrow and Har-
baugh (1978) made use of kriging in an application focused on reducing the effort expended in locat-
ing archaeological sites in Cañada del Alfaro in Guanajuato, Mexico, and the Hay Hollow Valley in
east-central Arizona, USA. The study aimed to predict, from a sample of the sites identified through
fieldwork, the expected number of sites in each cell of a regular grid. The authors found that increasing
the initial sample from 12.5% of the surveyed area to 50% made relatively little difference to the number
of sites found in cells predicted by kriging. In other words, kriging enabled the location of almost as
many of the total sites from 12.5% of the total sample as it did from 50% of the total sample. The study
demonstrated that the density of sites was spatially dependent.
Applications of geostatistics in archaeology are concerned with the spatial distribution of environ-
mental characteristics (elevations, soils, tephra, etc.) or structures or other objects (e.g. pottery, coins
or lithics) produced by humans. While geostatistics has its roots in mining and most applications have
focused on features of the physical environment, using a random function model to represent uncer-
tainty in, for example, settlements or artefacts is logical given that we generally do not understand the
complex sets of interacting processes which produce the spatial distributions we observe. Athanassas
et al. (2018) used kriging to map Minoan tephra; the authors found that there was no spatial structure
for seaborne tephra while airborne tephra exhibited strong structure. An example of the mapping of
artefact distributions using kriging is provided by Barrientos et al. (2015) who mapped lithics from
the Late Holocene (c. 3000–200 14C years BP) in part of east-central Argentina; the authors considered
the production of spatially continuous models to be helpful in understanding the spatial distribution
of lithics, although they comment on the need for richer sources of information to help construct
explanatory models of lithic landscapes. Bocquet-Appel and Demars (2000) used geostatistics to
analyse the spatial structure of 14C dates associated with archaeological levels or directly from human
remains. The authors used kriging to chart the expansion of modern humans and the spatial contrac-
tion of the Neanderthals.
Method
provides a useful summary of the spatial distribution of data values. This tendency (for observations
located close together to be more alike than those further apart) has been referred to as the 'first law of
geography' (Tobler, 1970). A related concept is spatial autocorrelation: correlation refers to the relationship
between different variables (e.g. how strong is the correlation between distance from a river and artefact
density?), whereas spatial autocorrelation refers to the correlation of a variable with itself
(see also Hacıgüzeller this volume). Measures such as the Moran’s I spatial autocorrelation coefficient
(see Lloyd, 2010 for an introduction) represent the magnitude of the correlation between data values
and values of the same variable at neighbouring locations. Such measures can be computed over a range
of different neighbourhoods and this provides a representation of spatial structure. Returning to the
example of elevation, a mountainous area has a very different spatial structure to a river flood plain. In
the former case, the spatial variation is short range (or high frequency), as elevation values may differ
even over very short distances. In the latter case, the spatial variation is long range (or low frequency),
as elevation values tend to be quite similar over even large distances. Similarly, spatial properties such
as find locations of pottery sherds or measurements of soil phosphate have a distinct spatial structure.
This chapter introduces tools for measuring spatial structure and it suggests ways in which the derived
information may be useful.
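As a concrete illustration of measuring spatial autocorrelation (a hypothetical sketch with invented values, not data from the chapter's case studies), Moran's I can be computed from a variable and a binary neighbour matrix:

```python
def morans_i(values, weights):
    """Global Moran's I for values z_i and a symmetric binary weight
    matrix w_ij (1 = neighbours, 0 = not); see Lloyd (2010) for the
    formal definition."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    w_total = sum(map(sum, weights))
    den = sum(d * d for d in dev)
    return (n / w_total) * (num / den)

# Four observations along a transect with first-order (chain) neighbours;
# smoothly increasing values give positive autocorrelation.
z = [1.0, 2.0, 3.0, 4.0]
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
i_val = morans_i(z, w)
```

Values of I near +1 indicate similar values clustering together; values near the small negative expectation under randomness indicate no spatial structure.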
Introducing geostatistics
The basis of geostatistics is the theory of regionalised variables. In geostatistical analysis, spatial varia-
tion (at a location, x) is divided into two distinct parts: a deterministic component, µ(x), representing
'gradual' change over the study area, and a stochastic (or 'random') component, R(x):

Z(x) = µ(x) + R(x) (6.1)
This is termed a random function (RF) model. The random part reflects our uncertainty about spatial
variables – what seems random to the observer is a function of a multiplicity of factors that may be
impossible to model directly (but this does not mean that we really think variation is random; Isaaks &
Srivastava, 1989). In geostatistics, a spatially-referenced variable, z(x), is treated as an outcome of a RF,
Z(x). That is, we consider an observation to have been generated by the RF model and this gives us a
framework to work with these data. A realisation of a RF is called a regionalized variable (ReV; a spatial
observation set). The theory of regionalised variables (RVT) (Matheron, 1971) is the fundamental frame-
work on which geostatistics is based.
Just as we estimate parameters, for example, the mean and variance of a distribution, we estimate
parameters of the RF model using the data. These parameters, like the mean and variance, summarise the
variable. The mean and variance describe the Gaussian (or normal) distribution and so are only useful
if the empirical distribution is fitted well by a Gaussian distribution. Similarly, the parameters of the RF
model are only meaningful in certain conditions. Where the properties of the variable of interest are
the same, or at least similar in some sense, across the region of interest we can employ what is termed a
spatially stationary model. In other words, we can use the same model parameters at all locations. If the
properties of the variable are clearly spatially variable, then a standard stationary RF model may not be
appropriate (see Crema, this volume). There are different degrees of stationarity, but for present purposes
we will only consider one, termed intrinsic stationarity. There are two requirements of intrinsic stationar-
ity. Firstly, the mean is assumed to be constant across the region of interest. In other words, the expected
value of the variable does not depend on the location, x:
E[Z(x)] = µ for all x (6.2)
Secondly, the expected squared difference between paired RFs (i.e. the observations), summarised by the
variogram, γ(h), should depend only on the separation distance and direction (the lag h) between the RFs
and not on the location of the RFs:
γ(h) = (1/2) E[{Z(x) − Z(x + h)}²] for all h (6.3)
where x + h indicates a distance (and direction) h from location x.
In terms of the data, the expected semivariance should be the same for all observations separated
by a particular lag irrespective of where the paired observations are located. In practical terms, the
geostatistical approach can be applied irrespective of these conditions, but the results will clearly
be sub-optimal if the data depart markedly from them. In some cases the mean is allowed to vary
from place to place, but be constant within a moving window. This is termed quasi-stationarity
(Webster & Oliver, 2007).
The variogram
Analysis of the degree to which values differ according to how far apart they are can be conducted by
computing the variogram (or semivariogram). With reference to the variogram, the term lag is used to
describe the distance and direction by which observations are separated. For example, two observations
may be 5 km apart and one may be directly north of the other. In simple terms, the variogram is estimated
by calculating the squared differences between all the available paired observations and obtaining half the
average for all observations separated by that lag (or within a lag tolerance, e.g. 5 km +/-2.5 km, where
the observations are not on a regular grid). The term semivariance refers to half the squared difference
between data values. Figure 6.1 gives a simple example of a transect along which observations have been
made at regular intervals to estimate the variogram. Lags h of 1 and 2 are indicated. So in this case, half
the average squared difference between observations separated by a lag of 1 is calculated and the process
is repeated for a lag of 2 and so on. In many cases the distance between observations will not be regular,
so ranges of distances are grouped. The selection of the bin size (e.g. 0–5 km, >5–10 km, >10–15 km, …
or 0–10 km, >10–20 km, …) is important. Smaller bin sizes will result in more noisy variograms while
a bin size which is too large will smooth out too much spatial structure and it will not be possible to
capture the spatial variation of interest. In other words, the plotted values in a variogram with too small a
bin size will appear to be widely scattered, while the values in a variogram with a larger bin size will tend
to be more similar to neighbouring values on the plot. Finding an appropriate bin size (usually through
trial and error) is important in characterising spatial structure and in guiding the selection and fitting of
a model, as detailed below.
The variogram can be estimated for different directions to enable the identification of directional vari-
ation (termed anisotropy). That is, rather than consider all observations 5 km from a given observation,
Figure 6.1 Transect with paired points selected for lags of 1 and 2 units.
Geostatistics and spatial structure 97
we may consider only observations that are directly north or south (for example) of the observation of
interest within a particular angular tolerance (e.g. north or south +/-45 degrees). In variogram estimation
(see equation (6.4)), observations z(xi) and z(xj) may be included in calculations if location xj is north
of location xi (indicated by 0º clockwise from north) within an angular tolerance of 22.5º. Using
an angular tolerance of 22.5º (22.5º either side of the directional lines), one strategy would be to compute
directional variograms for 0º, 45º, 90º and 135º, thus giving complete coverage and with no overlap
between the directions (Webster & Oliver, 2007). The selection of paired data is illustrated graphically in
Figure 6.2 where the specified direction is 45º clockwise from north and the angular tolerance is 22.5º
(i.e. 22.5º either side of the 45º directional line). In this case, z(xi) would be paired with z(xj) since it is
within the specified tolerance.
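The pair-selection rule just described can be made concrete in code. The chapter's analyses use R's gstat; the following is an illustrative Python sketch of our own (all names are ours): a pair qualifies for a directional variogram when the bearing from xi to xj, measured clockwise from north and treated axially (a direction and its opposite define the same axis), falls within the angular tolerance of the chosen direction.

```python
import math

def in_direction(xi, xj, direction_deg, tol_deg):
    """True if the bearing from xi to xj (clockwise from north, axial)
    lies within tol_deg of direction_deg."""
    dx = xj[0] - xi[0]   # easting difference
    dy = xj[1] - xi[1]   # northing difference
    bearing = math.degrees(math.atan2(dx, dy)) % 180.0  # axial bearing
    diff = abs(bearing - direction_deg % 180.0)
    return min(diff, 180.0 - diff) <= tol_deg

# x_j to the north-east (45 degrees clockwise from north) of x_i qualifies
# for the 45-degree variogram with a 22.5-degree tolerance; a pair due east
# (90 degrees) does not.
print(in_direction((0, 0), (1, 1), direction_deg=45, tol_deg=22.5))
print(in_direction((0, 0), (1, 0), direction_deg=45, tol_deg=22.5))
```

Treating bearings axially (the `% 180.0`) means a pair due south of a point is counted for the north–south direction, as the variogram does not distinguish the two ends of an axis.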
In summary, the variogram characterises the degree of difference in values as a function of the distance
by which they are separated. The experimental variogram, γ̂(h), relates semivariances to distances (and
directions). It thus places the lag (distance, for a given direction) on the x axis and semivariance on
the y axis. If a property is spatially autocorrelated, we would expect the semivariance to increase as the
distance between observations increases.
As an example, if we take a distance range of 1000 to 2000 m and there are 346 pairs of observations
separated by a distance within that band then p(h), the number of paired observations, is 346. Note that for
each pair, the semivariance is calculated twice – once with respect to the first location and once with respect
to the second. We then calculate the squared difference between each of these paired values. The first value
in each pair is given by z(xi) and the value separated from it by the specified lag h (in this example the
distance is 1000 to 2000 m and we are concerned with all directions) is given by z(xi + h). So, their squared
difference is given by {z(xi) − z(xi + h)}². The summed values are then divided by two – hence the term
semivariance. Putting this together, the experimental variogram for lag h is computed with:
γ̂(h) = (1/(2p(h))) ∑i=1…p(h) {z(xi) − z(xi + h)}² (6.4)
98 Christopher D. Lloyd and Peter M. Atkinson
As an example, if the lag is 5 km +/-2.5 km (i.e. 2.5 km to 7.5 km) and two values are separated by
6.2 km, then these paired observations qualify and we compute the squared difference. If the two values
are 26.2 and 43.3 units then their squared difference is:
{z(xi) − z(xi + h)}² = {26.2 − 43.3}² = {−17.1}² = 292.41 units²
In the same way, we compute the squared difference for all other pairs separated by 2.5 km to 7.5 km and
at each stage add the computed value to the previous values computed for that lag. Once this is done, we
multiply the summed values by 1/(2p(h)).
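The estimator of Equation 6.4 amounts to a few lines of code. The chapter's own analyses use R's gstat; the sketch below is an illustrative Python version with hypothetical transect data (coordinates, values and bin edges are ours, not the chapter's).

```python
import numpy as np

def experimental_variogram(coords, z, bin_edges):
    """Half the mean squared difference between all pairs of values whose
    separation distance falls within each distance bin."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    # all unique pairs: separation distances and squared value differences
    i, j = np.triu_indices(n, k=1)
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq = (z[i] - z[j]) ** 2
    gamma, counts = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (d > lo) & (d <= hi)
        p = mask.sum()                       # p(h), the number of pairs
        counts.append(p)
        gamma.append(sq[mask].sum() / (2 * p) if p else np.nan)
    return np.array(gamma), np.array(counts)

# toy transect: observations every 1 unit along a line (as in Figure 6.1)
coords = [(x, 0.0) for x in range(10)]
z = [2.0, 2.1, 2.3, 2.2, 2.8, 3.0, 3.1, 2.9, 3.4, 3.5]
gamma, p = experimental_variogram(coords, z, bin_edges=[0, 1.5, 2.5, 3.5])
```

With regularly spaced data the first bin collects exactly the lag-1 pairs, the second the lag-2 pairs, and so on; for irregular data the same code implements the lag-tolerance binning discussed above.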
Figure 6.3 gives an example of an experimental variogram; the fitted model is described later. The
lags are 0–5000 m, 5000–10000 m and so on in groups of 5000 up to 70000 m. In this case, values are
compared irrespective of the direction by which they are aligned. That is, whether they are aligned
(approximately or absolutely) on a line north-south or east-west etc. of one another is irrelevant. A var-
iogram computed from data in all directions is termed omnidirectional.
In Figure 6.3, the semivariance values (that is, the average squared difference between observations
separated by each distance band) tend to be smaller for small lags and they generally increase with an
increase in lag size until perhaps 25,000 m where the values tend to level out (this is demonstrated further
down). This indicates that values are spatially dependent up to approximately this distance. At distances
larger than this, there is no spatial correlation. The variogram provides a useful means of summarising
how values change with separation distance. Using topography as an example, data representing a
‘smooth’ surface like a flood plain will have a very different variogram to data representing a ‘rough’
surface like a mountain range.
A mathematical model may be fitted to the experimental variogram and the coefficients of this model
can be used for spatial prediction using kriging or for conditional simulation (defined further down). A
model can be fitted using some fitting procedure such as ordinary least squares or weighted least squares.
A model is usually selected from one of a set of ‘authorised’ models (see Webster & Oliver, 2007). There
are two principal classes of variogram model. Transitive (bounded) models have a sill (finite variance); that
is, the variogram model levels out as it reaches a particular lag. Unbounded models do not reach an upper
bound. Figure 6.4 shows the components of a bounded variogram model. These will be defined and
then practical examples given. The nugget effect c0 represents unresolved variation (a mixture of spatial
variation at a finer scale than the sample spacing and measurement error). The structured component c
represents the spatially correlated variation. The sill (or sill variance), c0 + c, is the a priori variance. The
range, a, represents the scale (or frequency) of spatial variation. For example, if a region is mountainous
and elevation varies markedly over quite small distances then the elevation can be said to have a high
frequency of spatial variation (a short range a) while if the elevation is quite similar over much of the
area (e.g. it is a river flood plain) and varies markedly only at the extremes of the site (that is, at large
separation distances) then the elevation can be said to have a low frequency of spatial variation (a long
range). The structured component captures the magnitude of variation, while the range represents the
spatial scale of variation.
As noted above, there are many different models that can be fitted to variograms. The variogram in
Figure 6.3, was fitted with a nugget effect and a spherical model component – as defined in Equation 6.6.
The nugget effect (nugget variance) is given as:
γ(h) = 0 if h = 0
γ(h) = c0 if h > 0 (6.5)
In words, the modelled semivariance has a value of zero for a lag of zero, but is equal to c0 for all positive
lags. In Figure 6.4, the nugget effect is indicated on the y axis of the graph.
The spherical model, a bounded model (that is, it reaches a sill), is defined as:
γ(h) = c{1.5(h/a) − 0.5(h/a)³} if 0 < h ≤ a
γ(h) = c if h > a (6.6)
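A nugget-plus-spherical model can be evaluated and fitted numerically. The sketch below is ours (the chapter fits its models in R's gstat, and all data values here are hypothetical): it evaluates the combined model and fits it by weighted least squares, exploiting the fact that for a fixed range a the model is linear in c0 and c, so only the range needs a grid search.

```python
import numpy as np

def nugget_spherical(h, c0, c, a):
    """Nugget effect plus spherical component: zero at h = 0, then
    c0 + c*(1.5*(h/a) - 0.5*(h/a)**3) up to the range a, and the sill
    c0 + c beyond it."""
    h = np.asarray(h, dtype=float)
    hr = np.clip(h / a, 0.0, 1.0)
    gamma = c0 + c * (1.5 * hr - 0.5 * hr ** 3)
    return np.where(h == 0, 0.0, gamma)

def fit_wls(lags, semiv, weights, a_grid):
    """Weighted least squares fit: for each candidate range a, solve the
    linear coefficients (c0, c) and keep the best-fitting triple."""
    best = None
    for a in a_grid:
        hr = np.clip(np.asarray(lags, dtype=float) / a, 0.0, 1.0)
        f = 1.5 * hr - 0.5 * hr ** 3
        X = np.column_stack([np.ones_like(f), f])
        w = np.sqrt(np.asarray(weights, dtype=float))
        coef, *_ = np.linalg.lstsq(X * w[:, None], semiv * w, rcond=None)
        sse = np.sum(weights * (semiv - X @ coef) ** 2)
        if best is None or sse < best[0]:
            best = (sse, coef[0], coef[1], a)
    return best[1], best[2], best[3]   # c0, c, a

# hypothetical experimental semivariances (lags in metres), weighted by
# the number of contributing pairs at each lag
lags = np.array([2500.0, 7500, 12500, 17500, 22500, 27500, 32500])
semiv = np.array([18.0, 24.0, 27.5, 29.0, 29.8, 30.1, 30.0])
pairs = np.array([120.0, 340, 500, 620, 700, 730, 760])
c0, c, a = fit_wls(lags, semiv, pairs, np.linspace(5000, 60000, 56))
```

Weighting by pair counts gives more influence to well-estimated semivariances, one common choice among the fitting criteria discussed by Webster and Oliver (2007).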
Kriging
There are many varieties of kriging and a summary of the procedure is also presented by Conolly in this
volume. Its simplest form is called simple kriging (SK). To use SK it is necessary to know the mean of the
property of interest and this must be constant across the region of interest. In practice this is rarely the case.
The most widely used variant of kriging, ordinary kriging (OK), allows the mean to vary and the mean is
estimated for each prediction neighbourhood. The OK predictions are weighted averages of the n available
data (that is, the predictions are based on the n nearest neighbours of the prediction location). The OK
prediction, ẑ(x0), is defined as:
ẑ(x0) = ∑i=1…n λi z(xi) (6.7)
with the constraint that the weights, λi, sum to 1 (this is to ensure an unbiased prediction):
∑i=1…n λi = 1 (6.8)
In words, the objective of the kriging system is to find appropriate weights by which the available
observations will be multiplied before summing them to obtain the predicted value. These weights are
determined using the coefficients of a model fitted to the variogram (or another function such as the
covariance function).
The weights are obtained by solving (that is, finding the values of unknown coefficients in) the OK
system:
∑j=1…n λj γ(xi − xj) + ψ = γ(xi − x0), i = 1, …, n
∑j=1…n λj = 1 (6.9)
where ψ is what is called a Lagrange multiplier. This equation may seem at first sight complicated. In
words, it says that the sum of the weights multiplied by the modelled semivariance for the lag separating
locations xi and xj, plus the Lagrange multiplier, equals the semivariance between location xi and the pre-
diction location x0, with the constraint that the weights must sum to one. The way we find the weights
and the Lagrange multiplier is outlined below.
Computing the weights and a value of the Lagrange multiplier, ψ, allows us to obtain the prediction
variance of OK, a by-product of OK, which can be given as:
σ̂²OK = ∑i=1…n λi γ(xi − x0) + ψ (6.10)
The kriging variance is a measure of confidence in predictions and is a function of the form of the var-
iogram, the sample configuration and the sample support (the area over which an observation is made,
which may be approximated as a point or may be an area) (Journel & Huijbregts, 1978). If the variogram
model range is short, then the kriging variance will increase markedly with distance from the nearest
sample(s). There are two varieties of OK: punctual OK and block OK. With punctual OK the predic-
tions cover the same area (the support, V) as the observations. In block OK, the predictions are made to
a larger support than the observations (e.g. prediction from points to areas of 2 m by 2 m). The system
presented here is for the more commonly used form, punctual OK.
Returning to Equation 6.9, using matrix notation, the OK system can be written as:
Kλ = k (6.11)
where K is the n+1 by n+1 (with n nearest neighbours used for prediction) matrix of semivariances
between each of the observations:
K =
[ γ(x1 − x1)  …  γ(x1 − xn)  1 ]
[     ⋮       ⋱      ⋮       ⋮ ]
[ γ(xn − x1)  …  γ(xn − xn)  1 ]
[     1       …      1       0 ]
λ is the vector of OK weights (with the Lagrange multiplier ψ placed in the bottom position) and k is the
vector of semivariances between the observations and the prediction location (with a one placed in the
bottom position):
λ = [λ1 … λn ψ]ᵀ   k = [γ(x1 − x0) … γ(xn − x0) 1]ᵀ
To obtain the OK weights, the inverse of the data semivariance matrix is multiplied by the vector of data
to prediction semivariances:
λ = K⁻¹k (6.12)
The OK variance is then obtained with:
σ²OK = kᵀλ (6.13)
Note that the semivariance between a given location and itself is set to zero.
Solving the OK system, the weights are as follows: λ1 = 0.368, λ2 = 0.227, λ3 = 0.234, λ4 = 0.171 and
ψ = 33.332.
The predicted value is then given by: (0.368 × 68) + (0.227 × 29) + (0.234 × 48) + (0.171 × 53) =
51.889.
The kriging variance is given by: (0.368 × 268.116) + (0.227 × 311.250) + (0.234 × 311.983) +
(0.171 × 367.662) + (33.332 × 1) = 338.537.
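The matrix form of the OK system is straightforward to implement. The sketch below is ours, with hypothetical data coordinates and variogram coefficients (the chapter's worked example draws its input semivariances from a figure, so its exact numbers cannot be reproduced here): it builds K and k, solves for the weights and ψ, and returns the prediction and kriging variance.

```python
import numpy as np

def spherical_gamma(h, c0, c, a):
    """Nugget plus spherical variogram model, zero at h = 0."""
    hr = np.clip(h / a, 0.0, 1.0)
    g = c0 + c * (1.5 * hr - 0.5 * hr ** 3)
    return np.where(h == 0, 0.0, g)

def ordinary_kriging(obs_xy, obs_z, x0, c0, c, a):
    n = len(obs_z)
    # data-to-data semivariance matrix (zero on the diagonal), bordered
    # with ones and a zero for the unbiasedness constraint
    d = np.linalg.norm(obs_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    K = np.ones((n + 1, n + 1))
    K[:n, :n] = spherical_gamma(d, c0, c, a)
    K[n, n] = 0.0
    # data-to-prediction semivariances, with a one in the bottom position
    k = np.ones(n + 1)
    k[:n] = spherical_gamma(np.linalg.norm(obs_xy - x0, axis=1), c0, c, a)
    lam = np.linalg.solve(K, k)     # weights and psi (lambda = K^-1 k)
    pred = lam[:n] @ obs_z          # weighted average of the data
    var = k @ lam                   # kriging variance
    return pred, var, lam

# hypothetical configuration: four observations around a prediction point
obs_xy = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
obs_z = np.array([68.0, 29.0, 48.0, 53.0])
pred, var, lam = ordinary_kriging(obs_xy, obs_z, np.array([1.5, 1.5]),
                                  c0=2.0, c=20.0, a=10.0)
```

Solving the bordered system with a single `np.linalg.solve` call handles the constraint automatically; the last entry of `lam` is the Lagrange multiplier.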
The kriging variance is a useful by-product which, as detailed earlier, provides a guide to the uncer-
tainty in the predicted values. Where values of the kriging variance are large, this suggests a higher level
of uncertainty; values will be larger as distance from the nearest samples increases and for short range
variation (as defined previously), as this indicates greater spatial variation.
With SK, the mean is assumed to be constant across the study area; this is very unlikely in reality as most
real-world properties (including, for example, elevation or artefact densities) have spatially-variable means. OK
allows for variation in the mean. In some cases there is a strong spatial trend (e.g. large values in the east
and small values in the west). In such cases, an alternative to OK, kriging with a trend model (KT) may
be advisable and may make more accurate predictions (cf. Lloyd, 2014). There are several other forms of
kriging; cokriging, for example, allows the integration of information about secondary variables. In cases
where we have a secondary variable (or variables) which is cross-correlated with the primary variable
both (or all) variables may be used simultaneously to make predictions using cokriging. With cokriging,
the variograms (which can be termed autovariograms) of both (or all) variables and the cross-variogram
(describing the spatial dependence between the two variables) must be estimated and models fitted to
all of these. Cokriging is based on what is called the linear model of coregionalization (see Atkinson,
Webster, & Curran, 1992), which provides a means to model the autovariograms and cross-variograms,
so as to ensure that the variances of any combination of the variables are positive. For cokriging to be
beneficial, the secondary variable should be cheaper to obtain or more readily available than the primary
variable (i.e. the variable which will be mapped) such as in the case of precipitation maps produced using
information on elevation, with which precipitation is positively correlated. An archaeological example is
given by Conolly and Lake (2006) who suggest the case of lithic artefacts and slope values. If the variables
are strongly related linearly then cokriging may provide more accurate predictions than OK.
Conditional simulation
Kriging predictions are weighted moving averages of the available sample data. Kriging is, therefore,
a smoothing interpolator. Conditional simulation (also called stochastic imaging) is not subject to the
smoothing associated with kriging (conceptually, the variation lost by kriging due to smoothing is added
back) as predictions are drawn from equally probable joint realisations of the random variables (RVs)
which make up an RF model (Deutsch & Journel, 1998). In other words, simulated values are not the
expected values (i.e. the mean) but are values drawn randomly from the conditional cumulative distribu-
tion function (ccdf): a function of the available observations and the modelled spatial variation (Dungan,
1999). The simulation is considered ‘conditional’ if the simulated values ‘honour’ (that is, at data locations,
the simulated values match the observed values) the observations at their locations (Deutsch & Journel,
1998). Simulated realisations represent a possible reality whereas kriging does not. Simulation allows the
generation of many different possible realisations that may be used as a guide to spatial uncertainty in the
construction of a map (Journel, 1996), that is, encapsulating the uncertainty in spatial prediction.
Probably the most widely used form of conditional simulation is sequential Gaussian simulation (SGS).
With sequential simulation, simulated values are conditional on the original data and previously simulated
values (Deutsch & Journel, 1998). With SGS, all locations at which simulated values are required are
visited in random order and the neighbouring values are used to derive the simulated value. By using
different random number seeds the order of visiting locations is varied and, therefore, multiple realisations
can be obtained. In other words, since the simulated values are added to the dataset, the values available
for use in simulation are partly dependent on the locations at which simulations have already been made
and, because of this, the values simulated at any one location vary as the available data vary. Using SGS,
multiple alternative realisations can be generated and the distribution of values simulated at each location
can be used to assess spatial uncertainty.
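A heavily simplified one-dimensional SGS may make the sequential logic concrete. This is our illustration, not the chapter's implementation (which uses R's gstat): it assumes the data are already standard Gaussian, uses simple kriging with an assumed exponential covariance model, and omits the normal-score transform, search neighbourhoods and back-transformation of a full implementation.

```python
import numpy as np

def cov(h, sill=1.0, a=5.0):
    """Assumed exponential covariance model."""
    return sill * np.exp(-np.abs(h) / a)

def sgs_1d(data_x, data_z, sim_x, rng):
    known_x = list(data_x)
    known_z = list(data_z)
    out = {}
    for x0 in rng.permutation(sim_x):        # visit locations in random order
        xs = np.array(known_x)
        C = cov(xs[:, None] - xs[None, :])   # data-to-data covariances
        c = cov(xs - x0)                     # data-to-target covariances
        w = np.linalg.solve(C, c)            # simple kriging weights (mean 0)
        mean = w @ np.array(known_z)
        var = max(cov(0.0) - w @ c, 0.0)     # SK variance, clipped at zero
        z0 = rng.normal(mean, np.sqrt(var))  # draw from the local ccdf
        known_x.append(x0)                   # condition later draws on this value
        known_z.append(z0)
        out[x0] = z0
    return np.array([out[x] for x in sim_x])

rng = np.random.default_rng(42)
data_x = np.array([0.0, 10.0])
data_z = np.array([-1.0, 1.0])
sim = sgs_1d(data_x, data_z, np.arange(1.0, 10.0), rng)
```

Because each simulated value is appended to the conditioning set, earlier draws shape later ones, and changing the random seed changes both the visiting order and the draws, yielding a different, equally probable realisation.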
Case studies
Two case studies are used here to illustrate the application of some key geostatistical tools. These focus on
(1) an earthwork in Northern Ireland and (2) Roman coins in southern Britain. All of the analyses make
use of the R statistical language (R Core Team, 2018) and the Gstat package1 in particular (see Bivand,
Pebesma, & Gómez-Rubio, 2013).
concave upwards at the origin. These model coefficients were used for kriging with a trend model (KT).
Variograms provide a useful way of summarising spatial variation and these could be used, for example,
to characterise a set of earthworks and to provide an index of surface roughness in each case.
The KT predictions are shown in Figure 6.10; the shape of the rath, with part of the western side
damaged, is apparent. Given the finely spaced source data, and the smoothly varying elevations, use of
a simpler interpolation method such as inverse distance weighting (IDW) may produce similar results
(see Conolly, this volume) and the added-value of kriging tends to increase as sample spacing increases
and spatial variation increases. Nonetheless, kriging offers the optimal prediction amongst linear predic-
tors as, unlike IDW, it makes use of information on underlying spatial variation through the use of the
variogram. In addition, a by-product of kriging is provided – the kriging variance (Figure 6.11). This
is a function of the sampling configuration and the form of the variogram and it constitutes a guide to
uncertainty in predictions.
Figure 6.7 Experimental directional variogram of GPS measured heights, Ballyhenry rath.
Conditional simulation provides a way to construct multiple equally-probable realisations (in this case,
multiple sets of different elevation values). These possible realities are arguably a more appropriate rep-
resentation of real-world properties such as elevation than are the over-smoothed grids derived through
interpolation; Figure 6.12 shows ’2.5D’ representations of the rath derived using KT (6.12(A)) and a
single realisation derived using conditional simulation (6.12(B)). Note that the simulated map exhibits
greater variation than the KT map and is strictly half as precise as the KT map. However, whereas the
KT map can never exist in reality (because it is over-smooth) the conditionally simulated map might (it
is one of multiple possible realities).
Coins of Allectus
In AD 286/7 a breakaway empire was formed in Britain and northern Gaul by the usurper Carausius.
After his death in 293 he was replaced by Allectus (died 296), finance minister to Carausius. Carausius
and later Allectus struck billon (debased silver) coins marked with an ‘L’, denoting Londinium (London)
Figure 6.8 Experimental detrended directional variogram of GPS measured heights, Ballyhenry rath.
and a ‘C’, which has been variously attributed to Camulodunum (Colchester), Corinium (Cirencester),
Glevum (Gloucester; on the grounds that C and G are indistinguishable on the coins), or a travelling
mint. Here, the percentage of coins within sites (defined later) which are from the C mint is analysed
using a geostatistical approach. It is worth noting that Portable Antiquities Scheme data are point events
and they could be treated instead as a point pattern; kernel estimation could then be used to produce an
intensity grid (see Bevan this volume). However, the data used here may be considered a ‘random’ sample
(i.e. realisations) of a much larger population (i.e. constituting a RF) of coins and we are not as interested
in the density of coins as in the expectation of the proportional attribution to L or C assuming that coins
may be found anywhere. Geostatistical methods were, therefore, considered appropriate.
Lloyd (1998) noted a western focus for C mint coins, suggesting that Glevum may be the most likely
attribution. In contrast, Walton (2011) observed no obvious trends in the products of the two mints for
either Carausius or Allectus. Here, as in Walton (2011), coins recorded as a part of the Portable Antiquities
Scheme2 are the subject of analysis. Point locations of coin finds were aggregated into grid cells of 5 km
by 5 km and only those cells containing at least two coins marked L or C were retained. The percentage
Figure 6.9 Experimental detrended variogram of GPS measured heights, Ballyhenry rath, with fitted model
(Bessel model with a sill of 0.257 and a range of 10.379 m).
of coins from the C mint are shown in Figure 6.13. There are clear localised concentrations of coins
from London (corresponding to small percentages of C mint coins) or the C mint, although there is no
obvious region-wide trend. The focus here is on determining if there is any spatial structure or if their
distribution appears unstructured. Analysis of raw percentages using statistical methods is not appropri-
ate and the percentages were log-ratio transformed (see Aitchison, 1986) prior to analysis: C mint log
ratio = ln((C mint % + 0.01)/(100.0 – C mint % + 0.01)) (with 0.01 added to prevent logging zeros). In
the final kriged output the log-ratios were back-transformed to percentages with: exp(C mint log ratio)/
(1+exp(C mint log ratio)) * 100.0.
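The transform and back-transform just stated can be written directly; the following is a small Python sketch of the chapter's formulas (function names are ours).

```python
import math

def to_logratio(pct):
    """C mint log ratio = ln((pct + 0.01) / (100.0 - pct + 0.01)),
    with 0.01 added to prevent logging zeros."""
    return math.log((pct + 0.01) / (100.0 - pct + 0.01))

def to_percent(logratio):
    """Back-transform a (kriged) log ratio to a percentage."""
    return math.exp(logratio) / (1.0 + math.exp(logratio)) * 100.0

round_trip = to_percent(to_logratio(75.0))   # ~75; the 0.01 offset makes it inexact
```

The 0.01 offset means the round trip is not exact at the boundaries (0% maps back to roughly 0.01%), a small price for keeping zero and one-hundred percent cells in the analysis.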
Figure 6.14 shows the directional variogram for C mint percentages. This suggests little spatial struc-
ture in most directions – for 0, 45 and 135° the models fitted to the semivariances would be close to flat;
this indicates no spatial structure. However, for 90° (east-west direction) there is clear spatial structure
with semivariances increasing systematically from the smallest lag to 50 km and then levelling out. Fig-
ure 6.15 shows the variogram for 90° with a fitted model (nugget effect of 15.4, and a spherical model
Figure 6.10 Elevation estimates (in metres), derived using kriging with a trend model.
Figure 6.11 Kriging variances.
Figure 6.12 ‘2.5D’ representation of (a) kriged elevations and (b) conditionally simulated values (viewed from
the southwest). A colour version of this figure can be found in the plates section.
Figure 6.13 Radiate of Allectus: C mint percentages in 5 km grid cells. A colour version of this figure can be
found in the plates section.
component with a structured component of 14.2 and a range of 32,549 m; the units are percentages).
This suggests that there is structure in the east-west direction, corresponding to bands of small/large C
mint percentages with a range of approximately 32 km. Figure 6.16 shows a map of C mint percentages
derived using kriging; this indicates that there are several localised concentrations of C mint coins – an
area to the east of the River Severn, an area around Essex, parts of the English midlands, and areas around
Leicestershire and Lincolnshire. The values in the far southwest are discounted as they fall at the edges
of the study area. A problem underlying analyses based on these data is that they are often single finds
rather than assemblages from sites and here all grid cells with more than two coins are used. Extending
the analysis to include sets of finds from archaeological excavations (as in Lloyd, 1998) would be benefi-
cial. But the provisional findings do suggest that there may be spatial structure in mint products and that
Figure 6.15 Directional variogram of C mint percentages: 90º clockwise from north (east-west); with fitted
model.
circulation of coins has not removed all evidence of such structure. However, based on this analysis, there
is clearly no strong evidence for any of the possible candidates for the C mint.
Conclusion
The chapter has introduced some key concepts and standard basic methods in geostatistics. The field is
dynamic and methodological innovation continues. In most analyses, the variogram is assumed to be con-
stant across the study area; however, in many cases the underlying spatial structure is not constant and an
array of methods for estimation of local variograms in such cases have been developed (see Lloyd, 2014 for
a review). Multiple point geostatistics (Mariethoz & Caers, 2015) offers a powerful means to incorporate
information on physical reality in stochastic modelling. Other innovations include the use of non-linear
distance measures; a simple example is the use of cost surfaces to model travel time between places rather
than using straight line (Euclidean) distances. Negre, Muñoz, and Lancelotti (2016) provide an archaeo-
logical example whereby the walls of a house were used as barriers to the distribution of calcium residues.
The scope and number of applications of geostatistics in archaeology has grown rapidly in the last
decade as the availability of proprietary and free open source software environments (e.g. in the R statistical
environment) to implement the methods has increased. A growth in examples has also facilitated new
analyses. Geostatistics has considerable potential in archaeological applications ranging from site prospec-
tion, through to analysis of artefact distributions, soil properties, and construction of earthwork digital
models. In detailing some key principles and outlining some ways in which geostatistical methods have
added to the study of archaeological variables it is hoped that this chapter will encourage further analyses,
Figure 6.16 Kriged map of C mint percentages. A colour version of this figure can be found in the plates
section.
especially using new and exciting datasets such as those provided via the Portable Antiquities Scheme in
England and Wales.
Acknowledgements
Conor Graham of Queen’s University Belfast is thanked for allowing the use of the Ballyhenry rath data.
The staff of the Portable Antiquities Scheme (https://finds.org.uk/) are thanked for provision of the data
on coins of Allectus.
Notes
1 The gstat package for R was authored by Edzer Pebesma and Benedikt Graeler (https://cran.r-project.org/web/
packages/gstat/gstat.pdf).
2 The Portable Antiquities Scheme (PAS) is a joint initiative between the British Museum and Amgueddfa Cymru –
National Museum Wales that encourages the general public in England and Wales to record any archaeological
objects they find (https://finds.org.uk/).
References
Aitchison, J. (1986). The statistical analysis of compositional data. London: Chapman and Hall.
Athanassas, C. D., Modis, K., Alçiçek, M. C., & Theodorakopoulou, K. (2018). Contouring the cataclysm: A geo-
graphical analysis of the effects of the Minoan eruption of the Santorini volcano. Environmental Archaeology, 23,
160–176.
Atkinson, P. M., Webster, R., & Curran, P. J. (1992). Cokriging with ground-based radiometry. Remote Sensing of
Environment, 41, 45–60.
Barrientos, G., Catella, L., & Oliva, F. (2015). The spatial structure of lithic landscapes: The Late Holocene record of
East-Central Argentina as a case study. Journal of Archaeological Method and Theory, 22, 1151–1192.
Bentley, J., & Schneider, T. J. (2000). Statistics and archaeology in Israel. Computational Statistics and Data Analysis,
32, 465–483.
Bevan, A., & Conolly, J. (2009). Modelling spatial heterogeneity and nonstationarity in artifact-rich landscapes.
Journal of Archaeological Science, 36, 956–964.
Bivand, R. S., Pebesma, E., & Gómez-Rubio, V. (2013). Applied spatial data analysis with R (2nd ed.). UseR! Series.
New York: Springer.
Bocquet-Appel, J. P., & Demars, P. Y. (2000). Neanderthal contraction and modern human colonization of Europe.
Antiquity, 74, 544–552.
Burnett, R. L., Terry, R. E., Alvarez, M., Balzotti, C., Murtha, T., Webster, D., & Silverstein, J. (2012). The ancient
agricultural landscape of the satellite settlement of Ramonal near Tikal, Guatemala. Quaternary International, 265,
101–115.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Deutsch, C. V., & Journel, A. G. (1998). GSLIB: Geostatistical software library and user’s guide (2nd ed.). New York:
Oxford University Press.
Dungan, J. L. (1999). Conditional simulation. In A. Stein, F. van der Meer, & B. Gorte (Eds.), Spatial statistics for remote
sensing (pp. 135–152). Dordrecht: Kluwer Academic Publishers.
Ebert, D. (2002). The potential of geostatistics in the analysis of fieldwalking data. In D. Wheatley, G. Earl, &
S. Poppy (Eds.), Contemporary themes in archaeological computing (pp. 82–89). University of Southampton Depart-
ment of Archaeology Monograph, 3. Oxford: Oxbow Books.
Entwistle, J., McCaffrey, K., & Dodgshon, R. (2007). Geostatistical and multi-elemental analysis of soils to interpret
land-use history in the Hebrides, Scotland. Geoarchaeology, 22, 391–415.
Hageman, J. B., & Bennett, D. A. (2003). Construction of digital elevation models for archaeological applications.
In K. L. Wescott & R. J. Brandon (Eds.), Practical applications of GIS for archaeologists: A predictive modelling toolkit
(pp. 113–127). Boca Raton: CRC Press.
Hesse, R. (2010). LiDAR-derived local relief models: A new tool for archaeological prospection. Archaeological Prospec-
tion, 17, 67–72.
Isaaks, E. H., & Srivastava, R. M. (1989). An introduction to applied geostatistics. New York: Oxford University Press.
Journel, A. G. (1996). Modelling uncertainty and spatial dependence: Stochastic imaging. International Journal of
Geographical Information Systems, 10, 517–522.
Journel, A. G., & Huijbregts, C. J. (1978). Mining geostatistics. London: Academic Press.
Lancelotti, C., Negre Pérez, J., Alcaina-Mateos, J., & Carrer, F. (2017). Intra-site spatial analysis in ethnoarchaeology.
Environmental Archaeology, 22, 354–364.
Lloyd, C. D. (1998). The C mint of Carausius and Allectus. British Numismatic Journal, 68, 1–10.
Lloyd, C. D. (2010). Spatial data analysis: An introduction for GIS users. Oxford: Oxford University Press.
Lloyd, C. D. (2014). Exploring spatial scale in geography. Chichester: Wiley-Blackwell.
Lloyd, C. D., & Atkinson, P. M. (2004). Archaeology and geostatistics. Journal of Archaeological Science, 31, 151–165.
Lynn, C. J. (1984). Two raths at Ballyhenry, County Antrim early Christian period, each overlying prehistoric mate-
rial. Ulster Journal of Archaeology, Series 3, 46, 67–91.
Mariethoz, G., & Caers, J. (2015). Multiple-point geostatistics: Stochastic modeling with training images. Chichester: Wiley.
Matheron, G. (1971). The theory of regionalized variables and its applications. Les Cahiers du Centre de Morphologie
Mathématique de Fontainebleau, 5. Fontainebleau: École Nationale Supérieure des Mines.
Negre, J., Muñoz, F., & Lancelotti, C. (2016). Geostatistical modelling of chemical residues on archaeological floors
in the presence of barriers. Journal of Archaeological Science, 70, 91–101.
Neiman, F. D. (1997). Conspicuous consumption as wasteful advertising: A Darwinian perspective on spatial patterns
in Classic Maya terminal monument dates. In M. C. Barton & G. A. Clark (Eds.), Rediscovering Darwin: Evolution-
ary theory and archaeological explanation (pp. 267–290). Archaeological Papers of the American Anthropological
Association, 7. Washington, DC: American Anthropological Association.
Openshaw, S. (1984). The modifiable areal unit problem. Concepts and Techniques in Modern Geography, 38. Norwich:
Geo Books. Retrieved from http://qmrg.org.uk/files/2008/11/38-maup-openshaw.pdf
Pebesma, E. J. (2000). Gstat manual. Utrecht: Utrecht University.
R Core Team. (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical
Computing. Retrieved from www.R-project.org/
Revelles, J., Burjachs, F., Morera, N., Barceló, J. A., Berrocal, A., López-Bultó, O., . . . Terradas, X. (2017). Use of
space and site formation processes in a Neolithic lakeside settlement: Pollen and non-pollen palynomorphs spatial
analysis in La Draga (Banyoles, NE Iberia). Journal of Archaeological Science, 81, 101–115.
Robinson, J. M., & Zubrow, E. (1999). Between spaces: Interpolation in archaeology. In M. Gillings, D. Mattingly, &
J. van Dalen (Eds.), The archaeology of Mediterranean landscapes (pp. 65–83). Oxford: Oxbow Books.
Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46,
234–240.
Walton, P. J. (2011). Rethinking Roman Britain: An applied numismatic analysis of the Roman coin data recorded by the
Portable Antiquities Scheme (Unpublished PhD thesis). University College London.
Webster, R., & Burgess, T. M. (1980). Optimal interpolation and isarithmic mapping of soil properties III: Changing
drift and universal kriging. Journal of Soil Science, 31, 505–524.
Webster, R., & Oliver, M. A. (2007). Geostatistics for environmental scientists (2nd ed.). Chichester: John Wiley and Sons.
Wells, E. C. (2010). Sampling design and inferential bias in archaeological soil chemistry. Journal of Archaeological
Method and Theory, 17, 209–230.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor and Francis.
Zubrow, E. B. W., & Harbaugh, J. W. (1978). Archaeological prospecting: Kriging and simulation. In I. Hodder (Ed.),
Simulation studies in archaeology (pp. 109–122). Cambridge: Cambridge University Press.
7
Spatial interpolation
James Conolly
Introduction
To estimate the value of a phenomenon at an unsampled location from samples of surrounding data requires
interpolation. This contrasts with extrapolation, which is a process of estimating values beyond the extent of a
sample. Both approaches require a model to provide the estimate, which is based on a prediction function.
This might be a heuristic model – visually estimating or just educated guessing – but if the goal of predic-
tion is to estimate values at multiple unsampled locations in space, then a statistical approach provides a more
robust solution. Often, surface interpolation is used to generate a continuous raster-based surface model
from a scatter of discrete point samples. The utility of spatial interpolation means that it is a commonly
applied tool used across a wide range of disciplines with interests in visualizing and predicting spatial patterns
and processes. In archaeology, it provides opportunities for the visualization and prediction of a wide variety
of geographically variable phenomena, such as artifact intensities (densities), topographic features, or even
more complex models of processes such as the space-time dynamics of cultural change. Conversely, spatial
extrapolation is more prone to error and has more limited value, but might be useful in cases where a clear
trend such as declining artifact densities needs to be estimated beyond a survey zone. For discussion of the
issues and applications of extrapolation, see Miller, Turner, Smithwick, Dent, and Stanley (2004) and Peters,
Herrick, Urban, Gardner, and Breshears (2004).
The basics of spatial interpolation are relatively easy to define, but there are several distinct types of
methods – Li and Heap (2008) review over forty – and they vary in their assumptions (or lack of assump-
tions) about the source data and degree of statistical complexity. Choosing an appropriate interpolation
method can be difficult, as archaeologists deal with a variety of different data (biotic, physical, and
cultural), data are unlikely to have been optimally sampled, and samples may contain sources of noise or
error. To add to this, interpolation outcomes can also be considerably different depending on the method
chosen, so care is therefore required to ensure that the most appropriate method is selected that is sensitive
to the underlying data as well as the goals of the analysis.
The purpose of this chapter is to describe the basic concepts of spatial interpolation, to review and to
provide guidance on the use of three common interpolation methods, and to offer some examples and
discussion of how spatial interpolation provides opportunities for data visualization and prediction that
can build insight into the behaviours which generate patterns in the archaeological record.
First, to illustrate the utility of spatial interpolation, consider the following three scenarios, which
capture a representative variety of the types of applications that can benefit from spatial interpolation.
Scenario One You are a conservation or commercial archaeologist, and you need to define the
spatial characteristics (i.e. the varying intensity) of a large artifact scatter identified by a sample
of test units. In this scenario, the goal is primarily data visualization, such that the spatial proper-
ties of the cultural materials can be effectively portrayed and communicated to planners for the
purposes of avoidance. Spatial interpolation provides a solution for illustrating where the highest
artifact intensities are located, and for estimating the edges of the scatter. If removal of the site is
required, spatial interpolation can also offer a guide to the placement and number of excavation
units, and even estimates of artifact recovery rates, to aid in the calculation of costs. A basic but
robust method such as an inverse distance weighted (IDW) algorithm is a good first approach.
Scenario Two You are a graduate student and your research project involves modelling an inun-
dated (underwater) cultural landscape. The goal is to obtain a more precise understanding of the
surface of the lake bed to assist in the identification of submerged shorelines. The data consists
of a combined set of terrestrial topographic and underwater bathymetric measurements that
together provide an opportunity to model a palaeolandscape. The interpolation methods require
consideration of the possibility of localized features such as submerged shorelines but also to the
possibility of errors in data capture. An appropriate resolution sensitive to the required analytic
scale and the intensity of data observations must also be considered. A spline interpolation with a
tension set to respond to local features is a reasonable starting point.
Scenario Three You are researching the spatial characteristics of a sample of radiocarbon dates
obtained from a cultural phenomenon, for example, the spread of a cultivar such as maize
throughout northeastern North America. The goals of the analysis are to define and visualize
the geospatial trend and to identify regions where radiocarbon dates deviate from a global trend,
perhaps exhibiting an early or late adoption pattern. In this scenario the data comprise a set of
georeferenced radiocarbon dates across a large geographic region (1000s of square kilometres).
The spatial interpolation algorithm needs to be sensitive to potential local effects of varying scale,
as well as a likely directional north-south trend. For these types of highly complex problems, an
appropriate choice of spatial interpolation is kriging, in which the parameters of the interpolation
are derived empirically from the data. Geostatistical interpolation improves the outcome as it
provides analysts with the ability to identify where the modelled surface is more or less accurate
due to data sampling issues.
Although these three examples are typical uses of spatial interpolation, they do not cover the full range
of scenarios in which spatial interpolation provides a solution to an archaeological problem. Other
applications of archaeological interest, such as the use of spatial interpolation to model a continuous
distribution of soil chemistry values taken from point samples across a historic living surface, have similar
requirements. At its most basic, the process begins with a set of discrete point observations recording the
changing values of a phenomenon distributed across geographic space. This is followed by the selection
and implementation of an appropriate interpolation method, which requires the selection of a range of
parameters, as well as the grid resolution of the output model. The first issue is the selection of an algo-
rithm; options are presented in the following section, including some guidance on which of the myriad
options is likely to provide the best balance between the data requirements, computational and statistical
simplicity, and the archaeological problem to be solved.
Method
In this section I explain the methods behind different forms of spatial interpolation, beginning with the
relatively straightforward, followed by the more complex. My goal here is to provide sufficient knowledge
such that the underlying concepts of different forms of spatial interpolation are sufficiently understood so
that practitioners can make an appropriate decision as to which method is the most suitable for a specific
application. Although some forms of spatial interpolation are highly complex and require some under-
standing of more advanced statistical concepts, other approaches are relatively easy yet are also extremely
versatile and powerful tools for surface modelling and data visualization.
Fundamentally, interpolation is a predictive modelling tool used to estimate the value of a quantitative
phenomenon (such as an artifact count) between measurements. Interpolation differs from methods such
as kernel density estimation (KDE – see Bevan, this volume) because interpolation is concerned with
prediction, rather than characterization. Whereas KDE can take a set of point-based frequency data, such
as artifact counts, and convert observations to densities, KDE does not predict the densities between point
locations – it only characterizes them. Conversely, interpolation predicts the values between observations.
Interpolation is thus by its nature a more complex method than density estimations, but I have written
this chapter for archaeologists without a specialized background in spatial statistics, so I’ve avoided the use
of mathematical formula and have focused instead on written descriptions of the methods to present the
core concepts. It is nevertheless worth defining the fundamental concept of spatial interpolation, to show
how it is a weighted average of sampled data. This is illustrated by the equation (Li & Heap, 2008, p. 4):
ẑ(x₀) = ∑ᵢ₌₁ⁿ λᵢ z(xᵢ)    (7.1)
which simply says that to obtain an interpolated estimate ẑ at location x₀, one must take the sum of the
n weighted values from known locations z(xᵢ), the weighting of each location being given by a procedure
defined by λᵢ. Where interpolation methods vary is in how many samples are needed to estimate
accurately the value at the point of interpolation, and how these samples are to be weighted. At the most
basic level, different methods frame how many of the surrounding known values should be considered,
and to what extent should nearby points be given greater weight than those far away.
With reference to the three scenarios presented in the previous section, I next consider three classes of
interpolation: (1) distance-weighting; (2) thin plate splines; and (3) kriging (also see Lloyd & Atkinson,
this volume). There are many more approaches to spatial interpolation beyond these three, a few of which
I mention at the end of this section; but most archaeological problems in which spatial interpolation offers
a solution can be solved by the judicious application of one of these approaches.
Distance-weighting interpolation
Linear interpolation is illustrated in Figure 7.1(a). Using data presented in the figure as an example, it
can be observed that at any location on the imaginary line joining the two points in geographic space,
an interpolated value can be reasonably estimated as a function of the linear distance (d) between the
known values. Thus, the value at the mid-point of the line will be the mean of the two known observa-
tions; but as the point of interpolation moves closer to one of the two points the interpolated value is
linearly weighted towards the closest point. On this basis, point D in Figure 7.1(a) is therefore reasonably
estimated as being equal to 15.
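For readers who want to verify the arithmetic, the linear estimate for D can be reproduced in a few lines. This is a sketch in Python (not part of the original analysis); the function name is mine, and the values and distances are those read from Figure 7.1(a), with D lying 3 units from A and 6 units from B.

```python
# Linear interpolation along the line joining two known points:
# the estimate slides from A's value towards B's value in proportion
# to how far along the line the interpolation point lies.
def linear_interpolate(v_a, v_b, d_a, d_b):
    """Estimate the value at a point between A and B, given its
    distances d_a and d_b to each of the two known points."""
    return v_a + (d_a / (d_a + d_b)) * (v_b - v_a)

d_value = linear_interpolate(12, 21, 3, 6)
print(round(d_value, 6))  # 15.0
```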
Figure 7.1(b) shows a scenario in which the values at three points are known. This enables an
interpolation to be made at any point within the polygon formed by the three points (A, B and C).
[Figure 7.1 appears here: in panel (a), A = 12 and B = 21 lie on a line with D between them, 3 units from A and 6 units from B; panel (b) adds C = 8, at 5 units from D.]
Figure 7.1 Simple interpolation examples: (a) with two known point values, using linear interpolation esti-
mates D = 15; (b) adding a third sample location and using inverse distance weighted squared interpolation
estimates D = 12.5.
This is in fact the basis of a form of interpolation that is called a triangulated irregular network (TIN), in
which three neighbouring points form the vertices of a triangular surface that is empty of other points.
TIN-based approaches were common in the formative days of GIS-based spatial analysis, as they were
computationally easy and visually effective; however, they are one of the least accurate interpolation
methods and have been superseded by alternative ways of surface modelling. To interpolate the value,
rather than using only two points, we now form an estimate from three or more points. As with the first example,
we consider the distances between the interpolation point and other known points and provide more
weight to the closer points. This assumes that points closer to the unsampled location are more similar
than those lying further away – in other words, that there is some positive spatial autocorrelation in
the dataset. A basic way of weighting points is to use the inverse (reciprocal) of the distance, or more
often the inverse of the distance raised to a power of two (i.e. 1/d2), to reduce the weight of more
distant points. Thus, the weight of each of the known point values to the interpolation value decreases
with distance to the location of interpolation, giving rise to the method known as an inverse distance
weighted (IDW) interpolation. The following formula defines the method for estimating the value Zₚ at point p,
based on the values of the i known points and their distances:
Zₚ = [ ∑ᵢ₌₁ⁿ (Zᵢ / dᵢ²) ] / [ ∑ᵢ₌₁ⁿ (1 / dᵢ²) ]    (7.2)
With reference to Figure 7.1(b), equation 7.2 is applied to give the interpolated value at point D:

D = (12/3² + 21/6² + 8/5²) / (1/3² + 1/6² + 1/5²) = 12.5
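Equation 7.2 and the worked example can be checked with a short sketch. This Python snippet is illustrative rather than part of the chapter's own toolset; the function name is mine, and the values (12, 21, 8) and distances (3, 6, 5) are those of Figure 7.1(b).

```python
# Inverse distance weighted (IDW) estimate: a weighted average in
# which each known value is weighted by 1 / distance**power.
def idw_estimate(values, distances, power=2):
    numerator = sum(v / d ** power for v, d in zip(values, distances))
    denominator = sum(1 / d ** power for d in distances)
    return numerator / denominator

print(round(idw_estimate([12, 21, 8], [3, 6, 5]), 1))  # 12.5
```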
As the number of available sample points with known values increases beyond three, then the accuracy
of the prediction may be improved by using more points in the calculation. This is defined as
the neighbourhood search area, which can be limited either by a distance or a defined number of closest
points. Because most GIS programs with interpolation tools will by default use the nearest 12 neigh-
bours and apply a power of 2 to the distance weighting, this gives rise to its common shorthand name
of IDW-12.
As the neighbourhood search size or number of neighbouring points increases, this will influence the
prediction, but it is not always clear what search size to choose and how to weight more distant points. In
fact, the choice of neighbourhood size (n) and the power weighting (p) is arbitrary and depends both on
the goals of the interpolation and the characteristics of the sample points. For example, increasing p to val-
ues above 2, to 3 or even 4, will increase ‘bumpiness’ as it pays more attention to local values. Conversely,
the smoothness of the surface can be controlled by using a greater number of neighbours. Selecting
n = 12 or more may work well with uneven or noisy samples, in which the goal is to find a more general
trend. IDW methods thus usually require some degree of experimentation to dial-in the balance between
a surface that is locally sensitive and one that illustrates the regional pattern.
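The interaction between the neighbourhood size n and the power p can be explored directly. The sketch below (my own illustration, not code from the chapter) adds a nearest-n search to the basic weighted average; the sample coordinates and values are invented for the example.

```python
import math

# IDW with a nearest-n neighbourhood search, so that both the
# neighbourhood size n and the power p can be experimented with.
def idw(point, samples, n=12, power=2):
    """samples: list of ((x, y), value) pairs; returns the IDW
    estimate at `point` from its n nearest samples."""
    nearest = sorted(samples, key=lambda s: math.dist(point, s[0]))[:n]
    num = den = 0.0
    for loc, val in nearest:
        d = math.dist(point, loc)
        if d == 0:
            return val  # point coincides with a sample: use its value
        w = 1.0 / d ** power
        num += w * val
        den += w
    return num / den

# invented test units: three nearby samples plus one distant outlier
samples = [((3, 0), 12.0), ((0, 6), 21.0), ((0, -5), 8.0), ((20, 20), 40.0)]
# with n = 3 the distant sample is excluded; raising p would sharpen
# the influence of the closest sample further
print(round(idw((0, 0), samples, n=3, power=2), 1))  # 12.5
```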
It is worth considering an alternative interpolation method in cases where the underlying data leads to
problems with IDW results. This is typically manifest by the surface showing multiple peaks and pits around
the original data – and adjustment of the weighting parameters does not improve the result. The two most
common alternatives are thin plate splines (TPS) and interpolation with geostatistics, called kriging.
Figure 7.2 Spline as a concept: (a) regularized with high weighting, allowing the interpolation estimates to
exceed the z-values to maintain smoothness at points marked by arrows; (b) a tension spline, which adheres to
the original data values at the expense of smoothness.
the needs of the analysis. Overall, however, thin plate spline approaches are usually better than IDW
methods for data in which smoothness is valued in the final product, such as in elevation models. Be
aware though that smoothness may be misleading if the underlying data is in fact rough and the goal of
the analysis is to illustrate or model this characteristic. In instances where roughness could be attributed
to measurement noise (e.g. in bathymetric or topographic survey in which vegetation may be impacting
true ground height), or in which more generalized trends are desired (at the expense of local accuracy),
then splines do offer a robust solution.
One inherent problem with both IDW and spline interpolation methods is that the neighbourhood
size and weighting parameters are typically arbitrarily assigned and visually evaluated. Although it is
certainly possible to estimate parameters less subjectively by using cross-validation, this is rarely imple-
mented in the GIS environments in which interpolation is typically performed, and thus it is not routinely
applied. This is potentially problematic, as a visually-satisfying surface may be erroneous, and without
some form of evaluation of its accuracy it may be uncritically adopted and used as the basis for further
interpretation, compounding the error.
Kriging
Kriging (pronounced with a hard ‘g’, after the South African Daniel Krige) is an interpolation method
in which the parameters of the interpolation are estimated empirically using geostatistics. The integra-
tion of geostatistics means that kriging is a more complicated form of interpolation, but it provides some
advantages over IDW and spline approaches. The primary advantage is that by integrating geostatistics
to estimate weighting parameters, the interpolation is sensitive to the characteristics of the samples, and
generally produces a more accurate surface model. Some forms of kriging can also generate an error
surface so that the interpolation’s accuracy can be evaluated across the sampling window. Finally, because
kriging requires the analyst to examine the spatial characteristics of the data set before interpolation, it
provides opportunities to examine underlying spatial patterns in sample data, which can lead to a better
overall decision about the type of kriging to apply that will in turn improve the outcome.
Kriging works by first measuring the spatial autocorrelation of sample points. Spatial autocorrelation
is a measure of the relationship between distance and similarity: positive spatial autocorrelation describes
a situation in which the value difference between pairs of points is correlated with distance, such that the
closer two points are in space, the more similar they are likely to be (also see Hacıgüzeller, this volume).
Figure 7.3 A variogram showing increasing variance between samples of values drawn from increasing dis-
tances apart. After a distance of 60 m there is no increase in variance.
This relationship between distance and similarity is expressed on a graph called a variogram, which plots
distances between pairs of points on the x axis (referred to as the ‘lag’), and a statistic called the variance,
denoted by γ on the y axis. The variance is a measure of the variability in differences between all pairs of
point values within a defined lag (such that as lag increases, the variance normally increases too, up to the
level of the variation in the entire sample). The reason that a graph of lag against variance is useful is that
for each unsampled location surrounded by points at varying distance, the variogram provides informa-
tion about the distance-based weighting needed to estimate the value at unsampled locations. This means
that a sample-specific distance weighting can be derived that is sensitive to the original data and removes
some of the subjectivity inherent in IDW or TPS interpolation methods. For example, a variogram like
the one shown in Figure 7.3 shows that the variance in differences between observed pairs of values
increases rapidly up to about 15 m, variance then increases slowly to about 40 m and hits a maximum
that does not increase any further from distances of 60 m and higher. This means an appropriate weighting
function will weight values within 15 m more heavily than those further away, as these have the
lowest variances and thus the greatest predictive power; points more than 60 m away have little predictable
influence on estimates of local values.
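The empirical variogram described above can be computed from first principles: for each distance band (lag), take half the mean squared difference between all pairs of values whose separation falls in that band. The sketch below is my own minimal implementation, with invented sample data; real analyses would use dedicated geostatistical software.

```python
import math
from itertools import combinations

# Empirical semivariogram: gamma(h) = (1 / 2N(h)) * sum of squared
# value differences over the N(h) point pairs whose separation falls
# in the lag bin around h.
def empirical_semivariogram(points, values, lag_width):
    bins = {}
    for (p1, v1), (p2, v2) in combinations(zip(points, values), 2):
        h = math.dist(p1, p2)        # separation between the pair
        b = int(h // lag_width)      # which lag bin the pair falls in
        bins.setdefault(b, []).append((v1 - v2) ** 2)
    return {(b + 0.5) * lag_width: sum(sq) / (2 * len(sq))
            for b, sq in sorted(bins.items())}

# three invented collinear samples, purely to exercise the function
pts = [(0, 0), (1, 0), (2, 0)]
vals = [1.0, 2.0, 4.0]
print(empirical_semivariogram(pts, vals, lag_width=1.5))
# {0.75: 1.25, 2.25: 4.5}
```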
As well as the changing influence of distance, some forms of kriging can include the directionality (or
geographic orientation) between pairs of points into the weighting calculation. The term anisotropic refers
to situations when direction independently influences the rate of change in a sample (e.g. Figure 7.4).
This occurs commonly in topographic surfaces but also in other situations when there is a directional-
ity to the source of the response variable, such as might be expected in the amount of material reaching
settlements at progressive distances if primarily distributed along geographically oriented distribution
routes. To detect the presence of anisotropy requires the creation of a two-dimensional variogram sur-
face, in which the degree of change by direction can be measured, which will typically manifest itself
as an ellipse on the variogram surface. If there is anisotropy, then a method called anisotropic kriging will
produce more accurate surfaces. This requires calculating separate variograms at major and minor axes
of the ellipse, and these angles are then provided to the kriging solution to provide a weighting function
that considers the directionality of the surface.
These are the foundational concepts for all kriging methods, but they can be implemented differently
depending on starting assumptions and the use of additional parameters. Although there are many forms
Figure 7.4 Anisotropy in a hypothetical sample of semi-regularly spaced test units. The isolines depict sherd
counts in 5-sherd intervals, illustrating how the rate of change is greater on the north-south axis than on the
east-west axis.
of kriging, the three basic forms are simple, ordinary, and universal. Note that all three of these methods
can be implemented to estimate values at unsampled point locations (which is typically the default), or in
large polygon units, which is referred to as block kriging. Simple and ordinary kriging both assume that
the variance in point samples within distance ranges is stationary across the sample (i.e. the observed vari-
ation between point samples is not higher in one part of the distribution than another; also see Crema this
volume). Unlike simple kriging, ordinary kriging does not assume a constant mean value within a sample
of points – it allows for the likelihood that values may have a trend and be higher in one part of the sam-
pling window. Of the two, ordinary kriging is usually the better choice as it has the fewest assumptions.
Finally, universal kriging establishes, in a separate process, an equation that describes the first order trend
of the observed data within the neighbourhood search window. The kriging function then is a model
of the residuals from the trend function. The advantage of universal kriging is that first-order patterns
are managed by a separate model, allowing the kriging function to focus on the variability around the
global trend. In addition to these three there are further forms of kriging, such as co-kriging, which integrates
a secondary (correlated) variable into the weighting function; these further methods are described well
in overviews of geostatistics (see, for example, Haining, Kerry, & Oliver, 2010; Webster & Oliver, 2007).
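As a concrete, if deliberately stripped-down, illustration of the mechanics: ordinary kriging weights are obtained by solving a small linear system built from a variogram model, with a Lagrange multiplier enforcing that the weights sum to one. In this sketch (mine, not the chapter's) the exponential variogram model and its sill and range are assumed for illustration rather than fitted to real data.

```python
import math

def gamma(h, sill=1.0, rng=10.0):
    # exponential variogram model; sill and range are assumed values
    return sill * (1.0 - math.exp(-3.0 * h / rng))

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small system
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c]
                              for c in range(r + 1, n))) / M[r][r]
    return x

def ordinary_kriging(point, samples):
    """samples: list of ((x, y), value). Ordinary kriging estimate."""
    n = len(samples)
    # variogram values among samples, plus the unit-sum constraint row
    A = [[gamma(math.dist(samples[i][0], samples[j][0]))
          for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(math.dist(point, s[0])) for s in samples] + [1.0]
    lam = solve(A, b)[:n]          # drop the Lagrange multiplier
    return sum(l * s[1] for l, s in zip(lam, samples))

est = ordinary_kriging((5, 0), [((0, 0), 10.0), ((10, 0), 20.0)])
print(round(est, 1))  # 15.0 (midpoint of two samples, by symmetry)
```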
In general, kriging methods work well when there is spatial structure to the underlying data – i.e. some
trend can be observed or is expected in the mean or variation in observations in one or more directions
across the sampling window. In addition, as error surfaces can be created in some forms of kriging, a
measure of certainty can also be provided. This may be more useful in some situations than others, but
if interpolated surfaces are being used for forming decisions about where the highest concentrations in a
distribution, or where the edge of artifact scatter is likely to be, then having some understanding of the
probable error in the modelled surface is likely to be valuable.
Finally, as these descriptions of kriging methods show, interpolation with geostatistics involves some
additional calculations and interpretations to generate the weighting functions, and most GIS packages
have limited capacity for full geostatistical analysis. Dedicated spatial statistics software or geostatisti-
cal plugins provide more customizable solutions including construction of permutation analysis for
evaluating the stability of the models. Lloyd and Atkinson provide a detailed explanation of kriging in
this volume; for further extended discussions see Sen (2016).
p = w · √(A / N)    (7.3)
where p is the grid resolution (i.e. pixel edge length), which is based on the sample area (A) and number
of observation points (N) multiplied by a weighting factor of w. Hengl suggests that a weighting of w =
0.0791 to w = 0.25 is appropriate for random samples, but if the observation sample is regularly spaced
(e.g. as might be the case following a defined sampling interval), then a more appropriate weight is w =
0.5. It is better to err on the side of less precision (i.e. larger pixels) to maintain accuracy, although (as
shown later) an evaluation of the impact of larger or smaller resolutions is often worthwhile.
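Reading equation 7.3 as p = w·√(A/N), which matches the worked figures later in the chapter (202 test units over 9100 m² with w = 0.4 giving roughly 2.7 m), the rule reduces to one line of code. The function name below is mine.

```python
import math

# Grid resolution rule of thumb (equation 7.3): pixel edge length
# p = w * sqrt(A / N), for sample area A and N observation points.
def pixel_size(area, n_points, w):
    return w * math.sqrt(area / n_points)

# the chapter's example: 202 points over 9100 square metres, w = 0.4
print(round(pixel_size(9100, 202, 0.4), 1))  # 2.7
```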
Edge effects are a major concern in interpolation. Many processes we wish to model are continuous
beyond our sampling window, and thus suffer from a lack of sampled information beyond the window.
As the accuracy of a prediction is partially dependent on being surrounded by locations where the true
values are known, one solution is to reduce the extent of the area to be interpolated so that it is sur-
rounded by known points. This is formally known as a border-area edge correction (Yamada, 2009).
How much to step in, though, depends on the intensity of the point sample and thus is related to the
average n-th nearest neighbour distance, where n is the number of neighbours used in the interpolation.
For example, in Figure 7.5, to avoid edge effects in a routine interpolation based on eight nearest
neighbours, an estimate of the step-size can be calculated by deriving the average distance from each
point to its eighth nearest neighbour (nn). In this case, the average nn distance is 11 m, and this provides
the width of the internal buffer around the sample distribution. The obvious disadvantage of this approach
is potentially considerable data loss at the edges, but the error rates in this zone are high because of edge
effects, so keeping it risks false confidence in the model. There are other methods, but all solutions require
compromises between predictive accuracy and data loss. Yamada (2009) provides several examples of ways
to manage these concerns.
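The buffer-width estimate described above is simply the mean distance from each sample point to its n-th nearest neighbour. A sketch (my own, using an invented 3 × 3 grid of points to exercise the function; the chapter's survey data gave 11 m for n = 8):

```python
import math

# Mean n-th nearest-neighbour distance, used as the width of the
# internal buffer for a border-area edge correction.
def mean_nth_neighbour_distance(points, n):
    totals = []
    for i, p in enumerate(points):
        others = sorted(math.dist(p, q)
                        for j, q in enumerate(points) if j != i)
        totals.append(others[n - 1])   # distance to the n-th nearest
    return sum(totals) / len(totals)

grid = [(x, y) for x in range(3) for y in range(3)]  # unit spacing
print(mean_nth_neighbour_distance(grid, 2))  # 1.0
```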
Model comparisons
There can be considerable variation in output between different interpolation methods, as well as
within methods when parameters are adjusted. For example, consider the following typical scenario
which consists of a scatter of 202 small (30 × 30 cm) test units across an area approximately 70 m wide by
130 m long (Figure 7.5(a)). The data was collected in advance of an excavation project on a grassy field
interspersed with buildings (the most substantial of which is marked as such), and each test unit records
the count of pottery artifacts observed at that location (for general context, please see Conolly et al.,
2014). There is a considerable amount of local variation (noise) in the data, but there is a global trend of
higher values in the southwest declining to the northeast. The goal of interpolation is to visualize this
trend as a continuous process in order to estimate the scale of the sub-area(s) with substantially higher
artifact counts.
Three interpolation methods were used to construct the surfaces: inverse distance weighting (IDW),
splines, and kriging. In each, different parameters were selected to evaluate the impact these had on the
predictive accuracy of the modelled surface. Model resolution was also considered. There are 202
observations in the study area of 9100 m², and from equation 7.3 with a weighting factor of 0.4 to reflect the
semi-regular spacing, an appropriate pixel dimension is 2.7 m. To evaluate the impact of resolution on
accuracy, two models were run using each set of parameters for each method: one with a pixel dimension of
1 m, and one at 3 m. Table 7.1 summarizes the methods and parameters used.
Figure 7.5 Archaeological point sample. (a) location of samples and artifact counts; (b) sample with border-
area edge correction. The random test samples are designated by an ‘x’; the building location by the rectangle.
128 James Conolly
To evaluate the relative predictive error in each model, a random evaluation sample of thirty test units
was selected and removed from the analysis. The models were constructed on the remaining sample of
172 observations. A root-mean-square (RMS) error was calculated for the difference between the evalu-
ation sample and the predictive surface for each model. (Visual comparison of the models and the RMS
results are presented in Figure 7.6 and Table 7.2).
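The evaluation procedure just described (withhold a random sample of 30 of the 202 units, fit on the remaining 172, then compare predictions against the withheld counts) can be sketched as follows. The synthetic counts, which mimic a southwest-to-northeast declining trend plus noise, and the simple IDW predictor stand in for the real data and models:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the 202 test units: SW-to-NE declining trend plus noise
n = 202
xy = rng.uniform([0, 0], [70, 130], size=(n, 2))
counts = 20 - 0.1 * xy[:, 0] - 0.05 * xy[:, 1] + rng.normal(0, 2, n)

# Withhold a random evaluation sample of 30; fit on the remaining 172
test_idx = rng.choice(n, size=30, replace=False)
mask = np.zeros(n, bool)
mask[test_idx] = True
train_xy, train_z = xy[~mask], counts[~mask]
test_xy, test_z = xy[mask], counts[mask]

def idw_predict(q, power=2.0):
    d = np.hypot(*(train_xy - q).T)
    w = 1.0 / np.maximum(d, 1e-9)**power
    return np.sum(w * train_z) / np.sum(w)

pred = np.array([idw_predict(q) for q in test_xy])
rms = np.sqrt(np.mean((pred - test_z)**2))   # hold-out RMS error
print(f"hold-out RMS error: {rms:.2f}")
```

The same loop can be repeated for each method and parameter set, giving a table of RMS errors comparable to Table 7.2.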
First, as expected, the 3-m resolution affords a slight increase in accuracy over the 1-m pixel resolution, but the difference is not significant and the latter has been selected for the model outputs. Second, the RMS results
establish that kriging is consistently able to produce more accurate predictions than the other interpolation
methods, even if in this instance the results are only marginally more accurate than IDW-12. In this set of
data, the spline models performed relatively poorly, as they are better suited to phenomena with smoother
transitions and higher spatial autocorrelation, such as elevation surfaces. Clearly, a more locally sensitive
interpolation method like IDW provides some advantages over splines. In fact, in this implementation,
IDW-12 is roughly equivalent to ordinary kriging, suggesting that it is a reasonable model to use if simplicity
of implementation is worth more than the additional insight and potential increase in accuracy
geostatistical modelling provides.
Figure 7.6 Visual differences in the surfaces of nine interpolation methods at 1 m resolution. RMS errors
for each model are provided in Table 7.2. A colour version of this figure can be found in the plates section.
Case studies
Two case studies are considered, at two different scales of analysis, that illustrate how interpolation meth-
ods can convert samples of discrete observations into critical insights into the spatial patterns of human
behaviour. The first case study concerns the reconstruction of use-of-space in Late Neolithic (LN) and
Copper Age (CA) sites in Hungary by Salisbury (2013). The second examines the use of interpolation
to provide insight into the geographic patterns at continental scale generated by the spread of Neolithic
agricultural practices from Southwest Asia into and across Europe by Fort (2015).
In the first example, the goal of the analysis is to visualize the spatial variation in soil chemistry from
habitation sites in order to reveal patterns in use of space. This is a popular form of spatial analysis that
depends on interpolation and has been approached in different ways in a variety of contexts (see, for
example, Rondelli et al., 2014; Mikołajczyk & Milek, 2016; Negre, Muñoz, & Lancelotti, 2016). To
illustrate the potential and a few pitfalls in the application of interpolation methods, work by Salisbury
(2013) is used. The specifics involve a sample of six LN and CA habitation locales in eastern Hungary
and the data consists of element abundance (ppm) in soil samples taken systematically (at a 10 m or 5 m
grid interval) across each site. The multivariate data was reduced using principal components analysis
(PCA) to identify correlated variation in groups of elements that are assumed to reflect different anthro-
pogenic processes (e.g. cooking, food discard, metal working). The first five of the PCA component axes
(cumulatively representing over 80% of the variation) were then used as the variables to be interpolated.
Each component was examined separately by assigning each of the original samples a value based on the
sample’s position on the component’s axis.
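The data-reduction step can be illustrated with a short sketch: synthetic multi-element readings are standardised, decomposed by PCA (computed here via SVD), and each sample is scored on the leading components covering at least 80% of the variance; those scores are the values one would pass to the interpolator. All numbers below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for soil samples: rows = samples, cols = element ppm values;
# two latent "activities" create correlated groups of elements
n_samples, n_elements = 120, 6
latent = rng.normal(size=(n_samples, 2))
loadings = rng.normal(size=(2, n_elements))
ppm = latent @ loadings + rng.normal(0, 0.3, (n_samples, n_elements))

# PCA via SVD on the standardised data
z = (ppm - ppm.mean(0)) / ppm.std(0)
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained = s**2 / np.sum(s**2)          # variance proportion per component

# Keep enough leading components to cover >= 80% of the variance;
# each sample's score on a component is the value to be interpolated
k = int(np.searchsorted(np.cumsum(explained), 0.80) + 1)
scores = z @ vt[:k].T                    # shape (n_samples, k)
print(k, scores.shape)
```

Each column of `scores` would then be kriged separately, exactly as each PCA component was mapped separately in Salisbury's study.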
Salisbury (2013) used ordinary kriging to produce the interpolated maps for visualizing the spatial
patterns. As described in the previous section, this is an excellent approach as it measures local variation
in the mean observed values across different portions of the data to generate a weighting function for
the interpolation. The maps generated from this analysis illustrate clear differences between the differ-
ent PCA components, which the author uses to interpret patterns in the use of space and their different
chemical signatures. However, note that the output scale appears to have been defined at too fine a grain
given the distances between sample points and this appears to have led to some instability in at least one
of the outputs (Figure 7.7).
Unfortunately, the author did not provide any information about the specifics of the scale used, nor
about the variogram model, leaving it to readers to trust that kriging methods were correctly and appro-
priately derived. More fundamentally, as the raw data is not provided in this paper, there is no opportunity
for interested readers to replicate the analysis or build on this work using different methods. I highlight
this not to criticize the analysis or interpretation, only to illustrate that without sufficient supplementary
data the interpolation methods and output can only be taken on trust. Nevertheless, the interpolations
do provide information about patterns in different chemical signatures in the soil, and these patterns can
be evaluated and substantiated using other archaeological data. Without the methods provided by spatial
interpolation, these insights would be difficult to obtain.
Figure 7.7 Interpolation example modified from Salisbury (2013, Figure 5). Note that the high resolution
(small pixel size) has exceeded the limits of the original data. The interpolation is thus unstable and noisy where
there is higher local variance, for example in the area north of the ‘trample zone’ (arrows).
The next illustrative case study concerns a much smaller scale and correspondingly larger region of
analysis, which is the spread of agriculture across Europe. There are many papers on this topic which use
some form of interpolation to calculate patterns of movement (e.g. Ammerman & Cavalli-Sforza, 1984;
Gkiasta, Shennan, & Steele, 2003; Bocquet-Appel, Naji, Vander Linden, & Kozlowski, 2009). A recent
illustrative example by Fort (2015) uses interpolation to visualize and derive estimates for the absolute
speed of demic (i.e. movement of people) versus cultural diffusion related to this economic transforma-
tion. Fort’s work is based on a point sample of nearly 1000 radiocarbon dates scattered across Europe that
record the dates and locations of early farming communities. As there has been a long-standing debate
over the relative importance of demic versus cultural processes in this transition, Fort’s stated goal was to
use the temporal patterns to distinguish between the two processes.
Fort first derives mathematical models for demic, cultural, and demic-cultural diffusion rates and
shows that demic-cultural diffusion will spread over space faster than just demic or cultural diffusion
alone. Second, Fort interpolates the radiocarbon point data to generate a surface model to visualize the
temporal patterns related to the appearance of agriculture. Because there is a clear southeast-northwest
spatial trend he uses universal kriging to allow for the first order trend surfaces to be incorporated into
the weighting algorithm, although he does note that experimentation with different methods produced
similar results.
Figure 7.8 Interpolation example modified from Fort (2015, Figure 1). An interpolated surface model of
radiocarbon dates from Neolithic sites (black dots) depicting the space-time process of the spread of agriculture
across Europe. Note the areas indicated by arrows in the southwest and northeast of the model showing where
data sparseness causes instability in estimates. A colour version of this figure can be found in the plates section.
Like the Salisbury paper discussed earlier, there is no information on the model and variogram, but the
raw data is made available in Isern, Fort, and Vander Linden (2012) for further analysis and
verification by interested readers. Following the interpolation, Fort uses the predictive surface to derive a
local ‘directional’ surface that visualizes the speed and geographic direction of the transmission.
Note in the model (Figure 7.8) how the sparseness of the data samples in some locations (e.g. in Spain)
causes some instability in the predictions that create artificial looking temporal boundaries. This is an
unavoidable problem with sample data that are unevenly distributed, but contributing to the difficulties in
this case could be a grain that is too fine for the density of points. Creating sub-interpolations at different
resolutions to account for the heterogeneity of the sample and then combining them into a single map is a potential
solution. As a further possibility, because calibrated radiocarbon dates are non-normal probability distri-
butions, it would be useful to consider applying a Monte Carlo approach to generate multiple predictive
surfaces based on radiocarbon values randomly selected from under each site’s pooled probability curve.
This would allow for estimation of uncertainty in the surface prediction reflecting the challenge of using
radiocarbon data for time-space models. The details of this are beyond the scope of this chapter but it serves
to illustrate the potential additional uses of interpolation for the visualization of space-time dynamics.
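A minimal sketch of such a Monte Carlo approach is given below. For brevity it draws dates from normal distributions, whereas real calibrated radiocarbon distributions are non-normal, and it uses simple IDW rather than the universal kriging of the original study; the sites, dates and errors are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sites: (x, y, mean cal BP, 1-sigma). Real calibrated
# distributions are non-normal; normals keep this sketch short.
sites = np.array([[0, 0, 8000, 60], [50, 10, 7400, 80],
                  [20, 60, 7000, 50], [70, 70, 6500, 90]])

grid = np.array([(x, y) for x in range(0, 80, 20)
                 for y in range(0, 80, 20)], float)

def idw(z):
    out = np.empty(len(grid))
    for i, q in enumerate(grid):
        d = np.hypot(*(sites[:, :2] - q).T)
        w = 1.0 / np.maximum(d, 1e-9)**2
        out[i] = np.sum(w * z) / np.sum(w)
    return out

# Monte Carlo: redraw a date for every site, re-interpolate, repeat
surfaces = np.array([idw(rng.normal(sites[:, 2], sites[:, 3]))
                     for _ in range(500)])
mean_surface = surfaces.mean(0)   # central estimate per pixel
uncertainty = surfaces.std(0)     # per-pixel spread reflects dating error
print(uncertainty.round(1))
```

The per-pixel standard deviation is the useful by-product: it maps where the dating uncertainty makes the interpolated arrival times least trustworthy.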
Conclusion
I have described the core concepts underlying all forms of spatial interpolation, along with the primary
methods that archaeologists have used in the past and continue to use when they need to build and
interpret continuous surfaces from point observations. As so much of archaeological data is collected as
point observations – or can easily be converted to point observations – this means that interpolation has
a very wide range of potential applications.
From the case studies and scenarios described, it should be clear that interpolation methods are
also very flexible and offer multiple ways to tailor an analysis to fit the character of the data and the
inferred spatial process that is being modelled. However, inexperienced users are strongly advised to
not accept the defaults in many of the GIS software platforms, as these rarely provide optimum solu-
tions. Instead, first consider questions such as whether there is a spatial trend in the data, whether there
is a potential for anisotropy, whether the surface process is likely to be smooth or rough, whether there
are boundaries and how these are to be managed, how the edge effect is going to be managed, and what
the appropriate output resolution should be. Careful consideration of these questions will certainly
lead to a more successful application of interpolation and is likely to produce a more accurate surface.
Experimenting with several approaches and parameters and evaluating them against a test sample of
known points withheld from the analysis may be the best option. For critical applications, formal evaluation
of predictive accuracy is the preferred solution, such as through test and training samples or
cross-validation, or by using a method such as ordinary or universal kriging, which is generally seen as
more robust than IDW or TPS, especially with data having a global trend.
As a final note, interpolation does not have to end with visualization. There are ways of generating additional
insights into spatial processes by using the interpolated surfaces to derive other measurements. The obvious
ones are using maps of elevation to derive slope and aspect maps, but as shown earlier, Fort (2015) explains how
he obtains a direction of change map based on his interpolation of radiocarbon dates. If maps are produced
representing artifact density, they can similarly be converted into insightful visualizations that locate major
rates of change using similar methods. Interpolated maps showing different artifact densities (e.g. comparing
pottery to lithics) can also be manipulated with map algebra to illustrate how the two are correlated. These
types of approaches more generally fall under the umbrella of spatial data manipulation and raster or map
algebra but are mentioned here to emphasize that analysis need not end with the construction of a continuous
surface of a spatial process – surfaces can be the building blocks for additional forms of data visualization and
analysis.
References
Aguilar, F. J., Aguilar, M. A., & Carvajal, F. (2005). Effects of terrain morphology, sampling density, and interpolation
methods on grid DEM accuracy. Photogrammetric Engineering and Remote Sensing, 71(7), 805–816.
Ammerman, A. J., & Cavalli-Sforza, L. L. (1984). The Neolithic transition and the genetics of populations in Europe.
Princeton, NJ: Princeton University Press.
Bocquet-Appel, J.-P., Naji, S., Vander Linden, M., & Kozlowski, J. K. (2009). Detection of diffusion and contact
zones of early farming in Europe from the space-time distribution of 14c dates. Journal of Archaeological Science,
36(3), 807–820.
Conolly, J., Dillane, J., Dougherty, K., Elaschuk, K., Csenkey, K., Wagner, T., & Williams, J. (2014). Early collective
burial practices in a complex wetland setting: An interim report on mortuary patterning, paleodietary analysis,
zooarchaeology, material culture and radiocarbon dates from Jacob Island (BcGo17), Kawartha Lakes, Ontario.
Canadian Journal of Archaeology/Journal Canadien d’Archéologie, 38, 106–133.
Fort, J. (2015). Demic and cultural diffusion propagated the Neolithic transition across different regions of Europe.
Journal of the Royal Society Interface, 12, 20150166.
Gkiasta, M., Shennan, S., & Steele, J. (2003). Neolithic transition in Europe: The radiocarbon record revisited.
Antiquity, 77, 45–62.
Haining, R. P., Kerry, R., & Oliver, M. A. (2010). Geography, spatial data analysis, and geostatistics: An overview.
Geographic Analysis, 42, 7–31.
Hancock, P., & Hutchinson, M. (2006). Spatial interpolation of large climate data sets using bivariate thin plate
smoothing splines. Environmental Modelling and Software, 21(12), 1684–1694.
Hengl, T. (2006). Finding the right pixel size. Computers and Geosciences, 32(9), 1283–1298.
Hutchinson, M. (2007). Interpolating mean rainfall using thin plate smoothing splines. International Journal of Geo-
graphical Information Systems, 4, 385–403.
Isern, N., Fort, J., & Vander Linden, M. (2012). Space competition and time delays in human range expansions.
application to the neolithic transition. PLoS One, 7(12), e51106. doi:10.1371/journal.pone.0051.
Lam, N. S.-N. (2004). Fractals and scale in environmental assessment and monitoring. In E. Sheppard & E. B.
McMaster (Eds.), Scale and geographic inquiry: Nature, society, and method (pp. 23–40). Hoboken, NJ: Wiley.
Li, J., & Heap, A. D. (2008). A review of spatial interpolation methods for environmental scientists. Techni-
cal Report Record 2008/23, Retrieved from Geoscience Australia, Department of Resources, Energy
and Tourism, Commonwealth of Australia. Retrieved December 5, 2018, from https://data.gov.au/
dataset/a-review-of-spatial-interpolation-methods-for-environmental-scientists
Mikołajczyk, L., & Milek, K. (2016). Geostatistical approach to spatial, multi-elemental dataset from an archaeological
site in Vatnsfjörður, Iceland. Journal of Archaeological Science: Reports, 9, 577–585.
Miller, J. R., Turner, M. G., Smithwick, E. A. H., Dent, C. L., & Stanley, E. H. (2004). Spatial extrapolation: The
science of predicting ecological patterns and processes. Bioscience, 54(4), 310–320.
Negre, J., Muñoz, F., & Lancelotti, C. (2016). Geostatistical modelling of chemical residues on archaeological floors
in the presence of barriers. Journal of Archaeological Science, 70, 91–101.
Peters, D. P. C., Herrick, J. E., Urban, D. L., Gardner, R. H., & Breshears, D. D. (2004). Strategies for ecological
extrapolation. OIKOS, 106(3), 627–636.
Rondelli, B., Lancelotti, C., Madella, M., Pecci, A., Balbo, A., Pérez, J. R., . . . Ajithprasad, P. (2014). Anthropic activ-
ity markers and spatial variability: An ethnoarchaeological experiment in a domestic unit of Northern Gujarat
(India). Journal of Archaeological Science, 41, 482–492.
Salisbury, R. B. (2013). Interpolating geochemical patterning of activity zones at Late Neolithic and Early Copper
Age settlements in eastern Hungary. Journal of Archaeological Science, 40(2), 926–934.
Sen, Z. (2016). Spatial modeling principles in earth sciences (2nd ed.). New York: Springer.
Webster, R., & Oliver, M. A. (2007). Geostatistics for environmental scientists. Chichester, UK: Wiley.
Yamada, I. (2009). Edge effects. In R. Kitchin & N. Thrift (Eds.), International encyclopedia of human geography
(pp. 381–388). Elsevier.
8
Spatial applications of correlation
and linear regression
Piraye Hacıgüzeller
Introduction
Identifying the relationships that exist between variables and exploring the nature of these relationships is
an essential element of any study of archaeological spatial phenomena. Methods of correlation and regression
analysis serve this purpose by modelling how change in one or more variables accounts for change in
another. Once the association between the dependent/response and independent/explanatory
variables is quantified, the next step is to interpret this relationship. This process requires domain
knowledge and a critical approach since the mathematical identification of an association does not neces-
sarily mean there is a meaningful, real-world association in the observed data set.
There are various methods for linear and non-linear regression modelling, some of which have
found widespread application on spatially explicit archaeological data sets (i.e. data sets that comprise
observations with geospatial coordinates). One of the major themes in correlation analysis and regres-
sion modelling in archaeology has been the diffusion of populations or “cultures”. For example, Silva
et al. (2015) concentrated on the diffusion of rice cultivation in Asia while Bicho, Cascalheira, and
Gonçalves (2017), which also forms the case study in this chapter, investigated the demic dispersal of
the Anatomically Modern Humans (AMH) across Europe (see also e.g. Cobo, Fort, & Isern, 2019;
Jerardino, Fort, Isern, & Rondelli, 2014; Pinhasi, Fort, & Ammerman, 2005). Another spatial archaeo-
logical theme where one can frequently come across correlation and regression is the association
between “site” or find locations and environmental, cultural and/or administrative variables such as
terrain curvature (Bevan & Conolly, 2004) or modern land use (Bevan, 2012; see also e.g. Carrero-
Pazos, 2018; Contreras et al., 2018; Winter-Livneh, Svoray, & Gilead, 2010). In this context binary
logistic regression has also been widely used to model the absence-presence of archaeological features
and sites (e.g. Bevan & Conolly, 2011; Carrer, 2013; Spencer & Bevan, 2018; cf. Kvamme, this vol-
ume; Verhagen & Whitley, this volume). Among other examples of archaeological spatial phenomena
researched through regression modelling is the association between food collection strategies and
environmental variables (Zhang, Bevan, Fuller, & Fang, 2010), intra-site artefact densities (Domínguez-
Rodrigo et al., 2017) and landscape terracing (Fall et al., 2012).
Here I provide an overview of the bivariate correlation and bivariate linear regression with Ordinary
Least Squares (OLS) methods, both commonly used techniques in archaeological spatial analysis. I also
explain that in spatial applications of these methods (as well as their multiple, multivariate and, in the case
of regression analysis, non-linear versions) there are particular issues involved caused by spatial autocorre-
lation. The chapter includes methods and examples to deal with these issues both for bivariate correlation
and bivariate linear regression with OLS. Whilst these effects have largely been overlooked in archaeo-
logical applications of correlation and regression, spatial autocorrelation is a ubiquitous issue in spatial
phenomena and results in the replication of information and, hence, redundant information being used
in these analyses and models. As discussed in more detail later and in the case study, disregarding spatial
autocorrelation in correlation and regression studies can have significant effects on the results including
increased chances of committing Type I error (i.e. rejecting the null hypothesis even though it is true),
and yielding less precise and biased regression coefficient estimates.
Method
Correlation
In Pearson's product-moment correlation, the population correlation coefficient is conventionally denoted
by the Greek letter ρ; the corresponding sample statistic, that is the correlation coefficient of a sample taken
from the population, is represented by the Roman letter r. One of the many versions of the equation used to calculate r is
given by Equation 8.1, where n is the sample size, x̄ and ȳ are the average values of the observations for each
variable, and sx and sy are the sample standard deviations for each variable.
$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) \qquad (8.1)$$
The resulting correlation coefficient can be any real number between –1 and 1. A perfect linear association
(where all points can be connected through a straight line) will have a coefficient of –1 or 1, depending
on the direction of the correlation. In reality, the linear relationship between the variables of a data set
will almost always be imperfect meaning that the points will not sit neatly on a straight trend line and
the correlation coefficient will get a value between –1 and 1, where a strong negative association will be
close to –1 and a strong positive association close to 1. Importantly, a correlation coefficient value of zero does
not mean that there is no correlation between the variables in question but rather that there is no linear
correlation. It is, in fact, possible to have a strong non-linear association between two variables and yet have
zero as a correlation coefficient (for instance in u- or n-shaped relationships).
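Equation 8.1 can be checked directly in code; the short sketch below also demonstrates the point about non-linear association, using a u-shaped relationship whose linear correlation coefficient is (essentially) zero. The data are invented:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r as in Equation 8.1: the mean product of the two
    standardised deviates, with the n - 1 sample correction."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    return np.sum(zx * zy) / (n - 1)

x = np.linspace(-3, 3, 61)
print(pearson_r(x, 2 * x + 1))   # perfect positive linear association
print(pearson_r(x, x**2))        # strong u-shaped association, no *linear* correlation
```

The second call returns a value at machine precision of zero even though y is completely determined by x, illustrating why r alone cannot rule out non-linear structure.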
Significance testing for Pearson’s correlation coefficient and the issue with
spatial autocorrelation
There are no hard-and-fast rules to decide whether a certain calculated value of r is “sufficiently high to
make the researcher happy about the level of correlation” (Rogerson, 2010, p. 190). Therefore, the results
of correlation analysis should involve significance testing to judge how reliable r is. Specifically, significance
testing for Pearson's correlation coefficient calculates the probability of obtaining a sample correlation
coefficient at least as large in magnitude as r if the sample comes from a population whose correlation
coefficient, ρ, is zero. If this probability (the p value) is smaller than the significance level decided on
prior to analysis, denoted by α, then the null hypothesis (that
the population correlation coefficient is equal to zero) is rejected. It can therefore be concluded that in
the parent population of the sample, the two variables in question are also very likely to be correlated. Of
course, here, like in any significance testing, there is always the chance that the null hypothesis is rejected
even though it is true (Type I error) and, as discussed below, with the presence of spatial autocorrelation in
the variable values this chance increases. If, on the other hand, the p value is equal to or larger than α, the
null hypothesis cannot be rejected. This means that there is considerable probability (or at least enough
not to disregard it) that the sample comes from a population where there is no correlation between the
variables in question and that the observed correlation might in fact have arisen by chance. Here, then, making a Type II
error, where one fails to reject the null hypothesis even though it is false, remains a possibility. As a
rule of thumb in statistics, the best way to minimise the risk of making Type I and Type II errors is to
have as large a sample size as possible.
The conditions for the significance testing of r are bivariate normality1 and independence. If these
conditions are met, the t-statistic for significance testing can be calculated in terms of sample size n and
correlation coefficient r as given in Equation 8.2 and then used in a t-table to estimate the p value for
comparison to the selected significance level. Importantly, if there is a clear a priori expectation of either a
positive or a negative correlation between two variables, a one-tailed significance test is performed to test
for the possibility of correlation in that direction only. If the direction of correlation is unclear, however,
a two-tailed test should be used to test for the possibility of correlation in both directions.
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \qquad (8.2)$$
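A short sketch of the test, using Equation 8.2 and the t-distribution with n − 2 degrees of freedom; the r and n values are illustrative:

```python
import numpy as np
from scipy import stats

def r_significance(r, n, two_tailed=True):
    """t-statistic of Equation 8.2 and its p-value from the
    t-distribution with n - 2 degrees of freedom."""
    t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    p = stats.t.sf(abs(t), df=n - 2)
    return t, 2 * p if two_tailed else p

# The same r can be significant or not depending on sample size
for n in (250, 20):
    t, p = r_significance(0.2, n)
    print(f"n={n:3d}  t={t:.2f}  p={p:.3f}")
```

Note how r = 0.2 is significant at α = 0.05 for the large sample but not for the small one, a point that matters again below when spatial autocorrelation inflates the apparent sample size.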
Coming to the issue of spatial autocorrelation, Gangodagamage, Zhou, and Lin (2017, p. 92) define spatial
autocorrelation as the “natural inclination of a variable to exhibit similar values as a function of distance
between the spatial locations at which it is being measured” (see also Lloyd & Atkinson, this volume). As
mentioned earlier, independence is one of the conditions for significance testing for the correlation coef-
ficient. This condition requires that the data set in question comprises only independent observations of
the two variables. For x values, this means that each value of x is not affected by and does not affect other
values of x. The same goes for each y value. If the variables in the correlation analysis are spatially auto-
correlated however (which is often the case in spatial applications), this condition is not fulfilled. In such
a situation “a certain amount of information is shared and duplicated among neighbouring locations,
and thus, an entire data set possesses a certain amount of redundant information” (Lee, 2017, p. 361). The
result is that the actual sample size is larger than the effective sample size, the latter being the number of real
independent observations (Gangodagamage et al., 2017, p. 93; Griffith & Paelinck, 2011, p. xxii). Such
artificial inflation of the sample size is problematic because judging whether a correlation coefficient value
in a given study is high or low enough (in the cases of positive and negative correlation, respectively) for
a statistically significant association depends on sample size. As Rogerson (2010, pp. 189–190) explains in
detail, the minimum absolute value of r needed to attain significance decreases as sample size increases.
So, if the sample size is relatively large (e.g. n = 250), a seemingly low r value (e.g. r = 0.2) may point to
a statistically significant correlation. Yet when the same r value is calculated for a smaller sample (e.g. n =
20), the researcher may fail to reject the null hypothesis at the same significance level. Therefore, when
the effective sample size is rendered smaller than the actual sample size due to spatial autocorrelation,
the chances will be elevated that the null hypothesis is rejected even though it is true (i.e. a Type I error
is committed) (cf. Lee, 2017). As will be discussed below, spatial autocorrelation causes similar issues in
regression analysis. There are different ways of measuring spatial autocorrelation in a data set, one of the
most common being through Moran’s I statistic (used also in the case study later). The detailed coverage
of its methodology and calculations is beyond the scope of this chapter, but readers can consult Rogerson
(2010, pp. 268–273).
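Although its derivation is not covered here, computing Moran's I is straightforward. The sketch below uses a simple binary distance-band weights matrix (one of several common weighting choices) and an invented smooth trend surface, which should return a strongly positive value:

```python
import numpy as np

def morans_i(values, coords, band=15.0):
    """Moran's I with a binary distance-band weight matrix
    (w_ij = 1 if locations i and j are within `band` units, else 0)."""
    z = np.asarray(values, float) - np.mean(values)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff**2).sum(-1))
    w = ((d > 0) & (d <= band)).astype(float)
    # I = (n / sum of weights) * (z' W z) / (z' z)
    return (len(z) / w.sum()) * (z @ w @ z) / np.sum(z**2)

# A smooth SW-NE trend yields strong positive spatial autocorrelation
coords = np.array([(x, y) for x in range(0, 100, 10)
                   for y in range(0, 100, 10)], float)
trend = coords.sum(1)
print(round(morans_i(trend, coords), 2))
```

Values near 1 indicate strong positive spatial autocorrelation, values near 0 spatial randomness, and negative values dispersion; shuffled or random values over the same grid would yield an I near zero.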
Rogerson (2010, p. 194) explains that when only one variable shows spatial autocorrelation in a bivari-
ate correlation analysis, there is no effect on the significance testing result and so spatial autocorrelation is
not an issue. However, when both variables are spatially autocorrelated, corrective measures are necessary
to mitigate the risk of a Type I error occurring. One of the methods that can be used to address this issue
is modifying the significance testing to take the degree of autocorrelation into account. As Lee (2017,
pp. 364–365) points out, this can be done by replacing actual sample size n with effective sample size n*
calculated via Equation 8.3. Here, R̂x and R̂y are the estimated n × n spatial autocorrelation matrices
for each of the two variables and trace is a matrix operation summing the diagonal elements of a matrix,
which in this case is the product of R̂x and R̂y (see also Haining, 2003, p. 279). The diagonal elements
of this particular product matrix provide the relative degree of bivariate spatial autocorrelation at each
observation location (where 1 corresponds to no spatial autocorrelation and values of more than 1 to
positive spatial autocorrelation) and their sum quantifies the overall degree of spatial autocorrelation. The
effective sample size calculated in this way is then used to form a new t-statistic for significance testing
(Equation 8.2) which arguably takes spatial autocorrelation at each location into account (see case study
below).
$$n^{*} = 1 + n^{2}\left[\operatorname{trace}(\hat{R}_x\hat{R}_y)\right]^{-1} \qquad (8.3)$$
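A sketch of the adjustment is given below. The autocorrelation matrices are modelled here, purely for illustration, as a correlation decaying exponentially with inter-point distance; in practice R̂x and R̂y would be estimated from the data. Equation 8.3 then shrinks the actual sample size to an effective one:

```python
import numpy as np

rng = np.random.default_rng(7)

n = 50
coords = rng.uniform(0, 50, size=(n, 2))       # observation locations

# Illustrative spatial autocorrelation matrices: correlation decaying with
# distance as rho**d. Real analyses estimate R_x and R_y from the data.
diff = coords[:, None, :] - coords[None, :, :]
d = np.sqrt((diff**2).sum(-1))
R_x = 0.9**d
R_y = 0.9**d

# Equation 8.3: n* = 1 + n^2 * [trace(R_x R_y)]^(-1)
n_star = 1 + n**2 / np.trace(R_x @ R_y)
print(f"actual n = {n}, effective n* = {n_star:.1f}")
```

The effective n* is then substituted for n in Equation 8.2, which lowers the t-statistic and guards against the Type I errors described above.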
In bivariate linear regression with OLS, the best-fit line predicting y from x is given by Equation 8.4, where the slope m is obtained from the correlation coefficient and the sample standard deviations of the two variables as in Equation 8.5.

$$\hat{y} = mx + b \qquad (8.4)$$

$$m = r\,\frac{s_y}{s_x} \qquad (8.5)$$
The best-fit regression line determined by the OLS method always passes through the point with values
equal to the sample mean of the two variables, x̅ and y̅. Therefore, the y-intercept can be calculated using
Equation 8.6 once the slope value is known. Given the definition of regression residuals above, the observed
ith value of the y variable, yi, will be equal to the sum of the corresponding predicted value, ŷi and a
residual term (also known as error term) ei, as given in Equation 8.7.
$$b = \bar{y} - m\bar{x} \qquad (8.6)$$

$$y_i = \hat{y}_i + e_i = mx_i + b + e_i \qquad (8.7)$$
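Equations 8.4 to 8.7 translate directly into code. The sketch below fits a line with them on invented data and the residuals of Equation 8.7 fall out as the difference between observed and predicted values:

```python
import numpy as np

def ols_fit(x, y):
    """Slope and intercept via Equations 8.5-8.6: m = r * s_y / s_x and
    b = ybar - m * xbar (the OLS line passes through the two means)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    r = np.corrcoef(x, y)[0, 1]
    m = r * y.std(ddof=1) / x.std(ddof=1)
    b = y.mean() - m * x.mean()
    return m, b

# Invented data with a known underlying line y = 2.5x + 4 plus noise
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 40)
y = 2.5 * x + 4 + rng.normal(0, 1, 40)

m, b = ols_fit(x, y)
residuals = y - (m * x + b)        # the e_i of Equation 8.7
print(m, b, residuals.mean())      # residuals average to ~0 by construction
```

The result agrees with a standard least-squares routine (e.g. numpy's `polyfit`), confirming that the correlation-based slope formula is just another route to the OLS solution.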
The proportion of the variation in y accounted for by the regression is given by the coefficient of determination:

$$r^2 = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} \qquad (8.8)$$
The value of the coefficient of determination is equal to the squared value of Pearson’s correlation coef-
ficient r (hence its notation as R2 or r2 or alternatively R-squared) although discussing why this is the case
mathematically is beyond the scope of this chapter. The significance testing for R2 will involve testing the
null hypothesis that the true coefficient of determination (i.e. the coefficient of the determination of the
population), ρ2, equals zero (i.e. H0: ρ2 = 0). The F-statistic formed for this test through Equation 8.9 has
1 and n-2 degrees of freedom for the numerator and denominator respectively and is the square of the
t-statistic used for the significance testing for Pearson’s correlation coefficient, r (Equation 8.2). Impor-
tantly, the two tests are identical, providing identical p-values and conclusions when the same significance
level is selected (cf. Rogerson, 2010, pp. 209–210).
$$F = \frac{r^2(n-2)}{1-r^2} \qquad (8.9)$$
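Equations 8.8 and 8.9 can be verified numerically on invented data: the ratio of explained to total variation equals the squared correlation coefficient, and the F-statistic equals the square of the t-statistic of Equation 8.2:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 30)
y = 1.2 * x + rng.normal(0, 2, 30)

# Fit the OLS line, then compute r^2 two ways (Equation 8.8 vs squared r)
m, b = np.polyfit(x, y, 1)
y_hat = m * x + b
r2_eq = np.sum((y_hat - y.mean())**2) / np.sum((y - y.mean())**2)
r2_sq = np.corrcoef(x, y)[0, 1]**2
print(np.isclose(r2_eq, r2_sq))   # the two definitions agree

# Equation 8.9: F with 1 and n-2 degrees of freedom; F equals t^2
n = len(x)
F = r2_eq * (n - 2) / (1 - r2_eq)
t = np.sqrt(r2_eq) * np.sqrt(n - 2) / np.sqrt(1 - r2_eq)
print(np.isclose(F, t**2))        # hence identical p-values for the two tests
```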
RMSD, on the other hand, is concerned with the variation in the y variable that is not explained by the
variation in x. It is also known as the standard deviation of the residuals (or Root Mean Square Error
(RMSE)). Given that standard deviation is about spread, this alternative name implies that it measures the
spread around the regression line in the y direction or, in other words, how precisely the regression line
fits the data points in terms of y values. For bivariate regression, it can be calculated with Equation 8.10
where y – ŷ denotes the difference between actual observations for the y variable and the corresponding
values predicted by the regression line. The denominator, n – 2, is the number of degrees of freedom.
$$\mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-2}} \qquad (8.10)$$
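A quick numerical check of Equation 8.10 on invented data: when the linear model is correct, the RMSD recovers approximately the standard deviation of the noise around the line:

```python
import numpy as np

def rmsd(y, y_hat):
    """Equation 8.10: spread of the residuals around the regression
    line, with n - 2 degrees of freedom for a bivariate fit."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.sqrt(np.sum((y - y_hat)**2) / (len(y) - 2))

rng = np.random.default_rng(9)
x = rng.uniform(0, 10, 200)
y = 3 * x + rng.normal(0, 1.5, 200)   # noise standard deviation = 1.5

m, b = np.polyfit(x, y, 1)
print(round(rmsd(y, m * x + b), 2))   # roughly recovers the noise sd
```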
In spatial applications, the residuals will often themselves be spatially autocorrelated, which means that residuals will exhibit similar values as a function of distance between their
spatial locations. Crucially, this violates assumptions about the residual term in simple regression which are
the zero mean, independence, constant variance and normal distribution (Rogerson, 2010, p. 281; Sriniva-
san, 2017, p. 2067). Moreover, much like the case for correlation, redundant information is produced by
spatial autocorrelation and hence the linear model estimated through OLS without adjusting for spatial
autocorrelation will be erroneous. A spatial regression model needs to be created instead.
One should, however, not conflate larger scale spatial structures in a study area with the mainly
neighbourhood-scale (Kühn & Dormann, 2012, p. 995) spatial autocorrelation issues caused by intrinsic
factors. In a perfect regression model with the right choice of independent variables, spatial dependency
in the dependent variable will be fully explained by the spatial dependency in the independent variable(s)
if there is no additional spatial autocorrelation within the dependent variable caused by intrinsic factors.
In that case, the residuals will not be spatially autocorrelated and the spatial autocorrelation in the regres-
sion analysis will not be an issue (Beale, Lennon, Yearsley, Brewer, & Elston, 2010, p. 248; Bini et al., 2009,
p. 194; Kühn & Dormann, 2012, pp. 995–996). The spatial autocorrelation observed within the dependent
variable due to intrinsic factors, on the other hand, will result in the replication of information and hence
redundancy that cannot be explained by the independent variable regardless of how well the latter is chosen
in the modelling process. In such cases the spatial autocorrelation is observed in model residuals and a spatial
regression model is needed to obtain more accurate regression coefficient estimates (Bini et al., 2009, p. 194;
Kühn & Dormann, 2012, pp. 995–996; see also Bevan, this volume; Bevan & Conolly, 2011, pp. 1306–1307
on first-order and second-order effects). That said, Beale et al. (2010, p. 248) also stress that the theoretical
conditions mentioned here for the case of broader scale spatial structures (i.e. spatial autocorrelation in the
dependent variable being simply a function of spatial autocorrelation in the independent variable(s)) are
almost never encountered in practice meaning that the presence of spatial autocorrelation almost always
produces spatially autocorrelated residuals. Hence, spatial regression modelling needs to be used for almost
all phenomena where spatial autocorrelation in regression variables is observed.
In real-world data sets it is impossible to identify the true effects of spatial autocorrelation on regres-
sion analysis because “one can never know if the results are a true reflection of the input data or an arte-
fact of the analytical method” (Beale et al., 2010, p. 247). Therefore, Beale et al. (2010) use simulations
to compare true values from realistic simulation scenarios with regression model parameter estimates and
test and compare the performance of non-spatial (OLS) and a range of spatial regression methods (e.g. the
Simultaneous Autoregressive Model, Generalised Least Squares and Bayesian Conditional Autoregressive
Model). Their results show that using OLS regression on data sets that are spatially autocorrelated leads to
results with low precision (i.e. high variation around the true value; Beale et al., 2010, p. 247) and, similar
to the aforementioned effect of spatial autocorrelation on the results of correlation studies, increases the
possibility of Type I errors (Beale et al., 2010).
A major point of debate in spatial regression modelling is that different models can provide substan-
tially different regression coefficients for the same data set. Precisely why this happens is unclear and still
needs to be investigated (Beale et al., 2010; Bini et al., 2009). Yet it is clear that with spatial autocorrelation
in regression residuals, spatial regression methods will provide less biased and more precise model coef-
ficient estimates than the non-spatial OLS method and reduce the chances of Type I errors.
The effects of spatial autocorrelation can be incorporated in linear regression models in two major
ways: through the error term and as co-variates (cf. Anselin, 2009; Beale et al., 2010, pp. 250–251). In
this chapter, an accessible method to error modelling is presented which is, in fact, an example of a Simul-
taneous Autoregressive Model (SAR), often referred to as the Autocorrelated Errors Model (Bailey &
Gatrell, 1995, pp. 282–286; see Rogerson, 2010, pp. 283–284). For this method, a spatial regression model
is specified in the same way as the OLS linear regression method. The difference is that each residual is
modelled as a function of the nearby residuals (Rogerson, 2010, p. 283). The method is applied in the
following case study. Thorough discussions on SAR and other spatial regression models as well as further
references can be found elsewhere (e.g. Anselin, 2009; Chun & Griffith, 2013; Srinivasan, 2017).
The method involves calculating two sets of quantities using Equations 8.11. These equations specify
that a new set of values is defined on the basis of weight, w, which indicates spatial proximity, and a ρ
value. The latter is selected by trying a range of possible positive ρ values and observing how these dif-
ferent values improve the residuals when y* is regressed against x*. As Rogerson (2010, pp. 290–291)
explains, the ρ value associated with the “best” set of residuals is selected and among other methods the
decision can be based on minimizing the RMSD calculated for each regression model as further illustrated
in the case study below. Statistically more sophisticated methods for estimating ρ values do exist and
involve a computationally intensive maximum likelihood procedure (cf. Bailey & Gatrell, 1995, pp. 286–
289). Finally, Beale et al. (2010, p. 253) stress, on the basis of their simulation studies, that the regression
models that assign the effect of spatial autocorrelation to an error term, such as the one presented here,
will retain some spatial autocorrelation in the residuals “but the important difference is that these models
are tolerant of such autocorrelation and should provide [more] precise estimates and correct error rates”.
y* = y − ρ Σj=1..n wij yj
x* = x − ρ Σj=1..n wij xj    (8.11)
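The ρ-scanning procedure described above can be sketched as follows (an illustrative implementation, not the authors' code; the weight matrix w is assumed to be row-indexed with a zero diagonal):

```python
import math

def ols(x, y):
    """Bivariate OLS slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    b1 = num / den
    return b1, my - b1 * mx

def transform(v, w, rho):
    """Equation 8.11: v*_i = v_i - rho * sum_j w[i][j] * v_j."""
    return [vi - rho * sum(w[i][j] * v[j] for j in range(len(v)))
            for i, vi in enumerate(v)]

def best_rho(x, y, w, rhos):
    """Return the rho whose transformed regression minimizes RMSD."""
    def score(rho):
        xs, ys = transform(x, w, rho), transform(y, w, rho)
        b1, b0 = ols(xs, ys)
        sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(xs, ys))
        return math.sqrt(sse / (len(x) - 2))
    return min(rhos, key=score)
```

In practice one would evaluate `best_rho` over a grid of candidate ρ values (e.g. 0.00, 0.01, …) and regress y* on x* with the winner, as done with ρ = 0.36 in the case study below.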
Case study
In their article “Early Upper Paleolithic colonization across Europe: Time and mode of the Gravettian
diffusion”, Bicho, Cascalheira, and Gonçalves (2017) aim to model demic dispersal of Anatomically Mod-
ern Humans (AMH) between c. 37,000 and 30,000 years ago across Europe using correlation and linear
regression. The dispersal phenomenon they study led to the replacement of the previous Aurignacian
tradition and of Neanderthal populations in marginal areas. Some parts of Europe at the time, however, were
still devoid of hominins and were occupied for the first time through Gravettian diffusion during the period
studied. The analyses involve the oldest Gravettian calibrated Accelerator Mass Spectrometry (AMS)
dates from 33 sites spread across Europe. Hence, there is a single date corresponding to each site. The
authors explain carefully how they filter the data set they use in the study. They identify three potential
locations as the oldest Gravettian sites, namely Buran Kaya III, Geissenklösterle and Krems-Hundssteig.
They hypothesise that each of these sites may be the origin of the Gravettian techno-complex in Europe.
Following Fort, Pujol, and Cavalli-Sforza (2004), a 150 km radius for Paleolithic waves-of-advance
is adopted in the study and the authors compute three sets of 150 km isopleths starting in each of the
three potential origin sites using the least-cost distance method. They then plot the site locations and
select the oldest site between every two successive isopleths in different cardinal directions (Figure 8.1). Subsequently,
they carry out least-cost distance calculations from each of the three potential origin sites to each one
of the selected sites (Table 8.1). They also calculate the difference between the mean calibrated date of
each origin site and each one of the remaining sites. In the next step of their analysis, they create a scat-
ter plot for these three data sets where they place the potential dependent variable (i.e. the time interval
between each site and the origin site in years) on the vertical axis and the potential independent variable
(i.e. least-cost distance between each pair of sites in kms) on the horizontal axis. With this, they intend
to examine whether the time difference between the appearance of the Gravettian at each site and the
possible origin site may have been affected by the least-cost distance between them. Consequently, they
calculate the correlation coefficient r, a p value for its significance testing and a regression line with an
Figure 8.1 Cost-distance surface with 150 km isopleths having (a) Buran Kaya III (BK); (b) Geissenklösterle
(GEISSE); (c) Krems-Hundssteig (KRE-H) as origin sites (Bicho et al., 2017, Figure 1). A colour version of this
figure can be found in the plates section.
Table 8.1 Early Gravettian calibrated Accelerator Mass Spectrometry (AMS) dates of sites included in the study
together with least-cost path distances from the three earliest sites to the sites included in each correlation and regression.
80 percent confidence interval (Figure 8.2). Using the slope of each of the regression lines they calculate
the speed and spread of the Gravettian techno-complex.
For Buran Kaya, the sample size is n = 21 and the correlation coefficient is r = 0.358; for Geissen-
klösterle, n = 19 and r = 0.657; and for Krems-Hundssteig, n = 17 and r = 0.568.2 For one of the three
data sets, where Geissenklösterle is taken as the origin site (Figure 8.2(b)), the details of the correlation
coefficient calculation are as follows: the average least-cost distance of all sites to Geissenklösterle (x̄) is
1432.526 km; the average time difference (ȳ) is 4125.421 years; the sum of the products of the differences
at each location between least-cost distance and x̄, and time interval and ȳ (i.e. Σ(yi − ȳ) × (xi − x̄)),
is 19471293.789 (Table 8.2); the sample standard deviations for the x and y variables are 847.302 km
and 1942.505 years respectively. The product of the two standard deviations and n − 1 = 18 (i.e. the
denominator of Equation 8.1) is approximately 29625991, which yields r = 19471293.789 / 29625991 ≈ 0.657.
Figure 8.2 Linear regression models created with the Ordinary Least Squares (OLS) method to determine the
association between the time difference for the appearance of the Gravettian techno-complex at different sites
and their least-cost distance to three origin sites. (a) model with Buran Kaya III as origin; (b) model with Geissen
klösterle as origin; (c) model with Krems-Hundssteig as origin (related data is presented in Table 8.1; Bicho et al.,
2017, Figure 2).
Table 8.2 Details of calculations for the numerator of Equation 8.1 where Geissenklösterle is taken as the origin.
Site Code | Least-Cost Distance from Geissenklösterle (km) (x-values) | Time Difference (yrs) (y-values) | (xi − x̄) | (yi − ȳ) | (xi − x̄) × (yi − ȳ)
t = (0.657 × √17) / √(1 − 0.657²) = 3.593    (8.12)
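The arithmetic can be reproduced from the summary statistics just given (a sketch; the constants are the values reported in the text):

```python
import math

# Summary statistics for the Geissenklösterle data set (from the text).
n = 19
sum_cross_products = 19471293.789   # numerator of Equation 8.1 (Table 8.2)
sx, sy = 847.302, 1942.505          # sample standard deviations (km, yrs)

# Pearson's r (Equation 8.1) and its t-statistic (Equations 8.2 / 8.12).
r = sum_cross_products / ((n - 1) * sx * sy)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
```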
As discussed above, the conditions for significance testing for Pearson’s correlation coefficient are bivariate
normality and independence. In order not to elaborate beyond the scope of this chapter, let us assume
that the normality condition for this data set was checked by the authors and met. The independence
condition, however, will not be fulfilled if both variables happen to be spatially autocorrelated. The
strength and scale of spatial autocorrelation in the two variables is not discussed by the authors. The maps
in Figure 8.3 show the difference between average and actual values for each location and each variable.
They display fine-scale positive spatial autocorrelation since pairs of locations in close proximity to one
another often both score either above average or below average, contributing positively to the calculation
of Moran’s I. We can define weights to indicate spatial proximity and calculate Moran’s I to quantify
this spatial autocorrelation. Although there are less arbitrary ways to do this, let us simply assign weights
other than zero only to the two nearest neighbours of each site. Specifically, let us employ a distance-decay
function, w(d), where d is the distance (measured along great circles and in km) between the site in
question and each of its two closest neighbouring sites. When d ≤ 1000, w(d) = 1 − d/1000; when d ≥ 1000,
w(d) is equal to zero. This means that, for instance, for the case of Antonilako Koba (AK), only its two
closest neighbours El Castillo (CASTI) and Tarte (TARTE), which are both less than 1000 km away from
AK, are assigned weights other than zero, specifically 0.9 and 0.7 corresponding to a distance of 104 and
301 km, respectively. Accordingly, we calculate Moran’s I for the least-cost distance, x, variable as 0.672
and for the time, y, variable as 0.633.
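A generic Moran's I calculation with this kind of weighting scheme can be sketched as follows (illustrative code, not the case-study computation; `morans_i` expects a full weight matrix with zero diagonal):

```python
def morans_i(values, w):
    """Moran's I for a list of values and a weight matrix w (w[i][i] = 0)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations for all weighted pairs.
    num = sum(w[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d ** 2 for d in dev)
    s0 = sum(w[i][j] for i in range(n) for j in range(n))
    return (n / s0) * (num / den)

def weight(d):
    """Distance-decay weighting used in the text: w(d) = 1 - d/1000
    for d below 1000 km, zero otherwise (assigned only to the two
    nearest neighbours of each site)."""
    return max(0.0, 1.0 - d / 1000.0)
```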
Figure 8.3 (a) Map illustrating the difference between x value (i.e. least-cost distance in kms) at each location
and average x value in order to give an indication of spatial autocorrelation (after Bicho et al., 2017, Figure 1).
(b) Map illustrating the difference between y value (i.e. time difference in years) at each location and average
y value in order to give an indication of spatial autocorrelation (after Bicho et al., 2017, Figure 1). A colour
version of this figure can be found in the plates section.
In order to get an idea about the effect of spatial autocorrelation on significance testing of the correla-
tion coefficient and keep the discussion relatively brief and simple, let us simply follow the example of Lee
(2017, p. 365; cf. Chessel [1981] on the spatial autocorrelation matrix) and check how a hypothetical positive
bivariate spatial autocorrelation score of 2.0 on average across locations (which would be calculated through
the trace matrix operation explained above) would affect our results here. The effective sample size n* in this
case can be calculated as 10.5 (i.e. 1 + 19²/38; Equation 8.3), rounded down to 10 to be on the safe side. Hence,
the sample size drops from its actual size of 19 to an effective size of 10 and a new t-statistic can be calcu-
lated as 2.465 using Equation 8.2. In a Student’s t distribution table, we can see that for n* − 2 = 8 degrees of
freedom and α = 0.01, this t-statistic is smaller than the critical value of t for both the one-tailed (t = 2.896) and
two-tailed (t = 3.355) tests. Hence, we fail to reject the null hypothesis this time, meaning that the correla-
tion is no longer significant at this level and illustrating how not taking spatial autocorrelation into account
may lead to inaccurate inferences in correlation studies of spatial phenomena.
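To illustrate the adjustment in code (a sketch; the form n* = 1 + n²/trace is reconstructed here from the worked numbers rather than quoted from Equation 8.3 directly):

```python
import math

def effective_n(n: int, trace: float) -> int:
    """Effective sample size under bivariate spatial autocorrelation.
    Reconstructed from the worked example as n* = 1 + n^2 / trace,
    rounded down to be on the safe side."""
    return math.floor(1 + n ** 2 / trace)

def t_stat(r: float, n: int) -> float:
    """t-statistic for Pearson's r (Equation 8.2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# An average bivariate autocorrelation score of 2.0 over 19 locations
# gives trace = 38, shrinking the sample from 19 to an effective 10
# and the t-statistic from 3.593 to about 2.465.
n_star = effective_n(19, 38.0)
t_adjusted = t_stat(0.657, n_star)
```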
Moving to regression analysis, we calculate the slope parameter for the OLS regression using Equa-
tion 8.5 as 1.507 yrs/km. We then calculate the y-intercept for the linear model using Equation 8.6 as
1966.939 yrs. This means that the model predicts a time difference of approximately 1967 years at the
origin site of Geissenklösterle itself (where x is equal to 0), even though this difference should by definition
be zero. This of course does not make sense and forms an example of how interpretations of the y-intercept in the regression model or,
in fact, the parameter estimates of the regression models in general may not always be meaningful and
should be approached carefully and critically. Now that we have both parameters for our bivariate linear
model, we can write down the equation for the OLS regression line for Geissenklösterle (which is shown
in Figure 8.2(b)) as:
yˆ = 1.507x + 1966.939
The equation indicates that for every 1000 km increase in the least-cost distance from the origin site
Geissenklösterle, the appearance of the Gravettian techno-complex is delayed for approximately 1507
years. The coefficient of determination for this regression model, r2, is equal to 0.432 which means that in
this linear regression model approximately 43 percent of the variation in the dependent variable, the time
interval, is explained by variation in the independent variable, least-cost distance. The RMSD is calculated
as 1506.480 yrs. As discussed, the conclusions of the significance testing for r2 will be the same as those for r.
Hence, we can say that at a 0.01 significance level, we can reject the null hypothesis and infer that there
is a statistically significant relationship between time interval and least-cost distance.
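The parameter estimates above can be reproduced from the summary statistics reported earlier, since for bivariate OLS the slope equals r·sy/sx and the intercept equals ȳ − slope·x̄ (a sketch; small rounding differences from the published values are expected):

```python
# Summary statistics for the Geissenklösterle data set (from the text).
r = 0.657
sx, sy = 847.302, 1942.505          # sample standard deviations (km, yrs)
x_bar, y_bar = 1432.526, 4125.421   # means of x (km) and y (yrs)

# Bivariate OLS parameters expressed through r and the standard deviations.
slope = r * sy / sx                 # yrs per km (Equation 8.5)
intercept = y_bar - slope * x_bar   # yrs (Equation 8.6)

def predict(x_km: float) -> float:
    """Predicted time difference (yrs) at least-cost distance x_km."""
    return slope * x_km + intercept
```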
Spatial autocorrelation, however, changes things considerably. An examination of Figure 8.4 which
presents regression residuals at each location shows that some of the residuals in close proximity are
Figure 8.4 Map showing residuals at each location in order to give an indication of spatial autocorrelation
(after Bicho et al., 2017, Figure 1). A colour version of this figure can be found in the plates section.
Table 8.3 Details of calculations for the Autocorrelated Errors Model (ρ = 0.36, Equation 8.11).
similar in sign and magnitude, especially in the case of the first-degree neighbours. Calculating Moran’s
I as equal to 0.500 with the same weighting function used above confirms a positive spatial autocor-
relation effect. In order to build the Autocorrelated Errors Model explained above (Equation 8.11),
we use the same weighting scheme and calculate new quantities y* and x* for different values of ρ
starting from zero. When ρ is equal to 0.36 (Table 8.3), the RMSD value for the regression of y* versus
x* is minimized and equals 1349.358. This value is smaller than the RMSD value of 1506.480 for the
original linear regression model. The slope of the linear regression model that takes spatial autocorrelation
into account is 1.850 yrs/km. This new model roughly indicates that instead of a 1507-year increase in
time difference with every 1000 km increase in least-cost distance to the origin site, as suggested by the original model, an 1850-
year increase ought to be considered. The authors calculate the speed of advance for the Gravettian
with the equation speed = 1/slope, following Jerardino et al. (2014). The new regression model, which
adjusts for spatial autocorrelation, indicates that this dispersion rate drops from 1/1.507 = 0.664 km/yr
to 1/1.850 = 0.541 km/yr. Moreover, the coefficient of determination increases from 0.432 to 0.533. The
Applications of correlation and linear regression 151
F-statistic calculated for this new model using Equation 8.9 and the new r2 value is 6.730, which has
1 and (since there are 19 observations in the sample) 19 − 2 = 17 degrees of freedom for the numera-
tor and denominator, respectively. The F-table shows that at a 0.01 significance level the critical value
is 8.40. So, the null hypothesis that ρ2 is equal to zero can no longer be rejected at a 0.01 significance
level and, hence, the new slope value is no longer statistically significant. It therefore appears that not
taking spatial autocorrelation into account in the regression modelling does cause a Type I error in
this particular case.
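The dispersal-rate conversion used in the case study (speed = 1/slope, following Jerardino et al., 2014) can be sketched as:

```python
def speed_km_per_yr(slope_yrs_per_km: float) -> float:
    """Front speed of advance as the reciprocal of the regression slope
    (speed = 1 / slope, following Jerardino et al., 2014)."""
    return 1.0 / slope_yrs_per_km

# OLS model versus the Autocorrelated Errors Model (slopes from the text).
ols_speed = speed_km_per_yr(1.507)      # approx. 0.664 km/yr
spatial_speed = speed_km_per_yr(1.850)  # approx. 0.541 km/yr
```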
Conclusion
The aim of this chapter has been to give short summaries of the bivariate correlation and bivariate linear
regression with Ordinary Least Squares (OLS) methods which are both widely applied to archaeological
spatial phenomena, as well as presenting accessible methods and examples to account for the effects of
spatial autocorrelation on such analyses. As highlighted, these effects mainly manifest themselves in the
results of significance testing, but also lead to less precise linear regression models with a potential bias
in the regression model parameter estimates. Even though it is argued that null hypothesis testing is not
always the best way to deal with spatial data sets (and alternative methods for model selection are sug-
gested which can bypass the Type I error issue (Bini et al., 2009, p. 194; Hawkins, 2012; cf. Burnham &
Anderson, 2002)), it has also been demonstrated that ignoring spatial autocorrelation in regression mod-
elling may even lead to a dramatic inversion of the slope sign of the linear regression model turning an
estimated positive linear relation to a negative one (Kühn, 2007)! So, the issues with coefficient estimates
in the case of ignoring spatial autocorrelation in regression analysis certainly remain.
On the basis of the observation that both in simulated and real data sets different spatial regression
methods produce different regression coefficients, the main research questions concern how much these
coefficient shifts differ and why they occur (Beale et al., 2010; Bini et al., 2009). Therefore, difficulties
remain when choosing the best method to account for spatial autocorrelation during regression (and
correlation) analysis. Yet “it is not good practice to use a statistical method when the data do not meet its
(sic) underlying assumptions” (Bini et al., 2009, p. 202). Aiming to remedy the effects of spatial autocor-
relation on correlation and regression applications remains best practice and a spatially explicit method
will provide a more accurate regression model than a non-spatially explicit one.
While the effects of spatial autocorrelation in correlation and regression analyses in archaeology are
largely overlooked (see, however, Gil et al., 2016), the results of related studies in other disciplines are
certainly alarming and show that the effects for archaeological models can potentially be dramatic too.
The topic promises to be thoroughly researched and discussed in the social sciences. A recent book titled
Spatial regression models for the social sciences by Chi and Zhu (2020) is, for instance, of great interest to
archaeologists carrying out such analyses. Archaeology, with its own discipline-specific spatially autocor-
related phenomena, can add valuable data, information and insights to interdisciplinary research on spatial
regression in the future.
Acknowledgements
I would like to thank Frank Carpentier, Mark Gillings and Gary Lock for copyediting the chapter and
for their insightful comments. I am also grateful to Serkan Kemeç and Sumeeta Srinivasan who provided
valuable remarks on the content. A special thank you goes to Ingolf Kühn for a detailed and very helpful
review. Any remaining inaccuracies or mistakes are my own.
Notes
1 Bivariate normality implies that both variables considered in a correlation come from normal distributions and
their joint distribution is also normal-shaped (a three-dimensional bell curve). It is important to realise here that
even if each random variable X and Y is normally distributed, they will not necessarily be jointly bivariate normal.
2 The authors choose to include the origin site in the analyses of each case and this significantly increases the cor-
relation coefficient. For the case of Buran Kaya, r without the origin site (i.e. Buran Kaya) in the calculations is
equal to 0.146; for the case of Geissenklösterle, r = 0.571; and for Krems-Hundssteig, r = 0.467. The inclusion is
not a good choice in terms of the accuracy of the results because, as explained earlier, the authors are questioning
the association between the least-cost distance between non-origin and each of the three origin sites and time
difference between the appearance of Gravettian for each pair. It is clear that including the origin sites in these
calculations (i.e. where both least-cost distance and time difference are zero, and hence below average) will only
strengthen the expected positive correlation, and in this case significantly so. It is not clear, however, what kind of
interpretative advantage this inclusion provides to the researchers.
References
Anselin, L. (2009). Spatial regression. In A. S. Fotheringham & P. Rogerson (Eds.), The SAGE handbook of spatial
analysis (pp. 255–275). Los Angeles and London: SAGE Publications.
Bailey, T. C., & Gatrell, A. C. (1995). Interactive spatial data analysis. Harlow: Longman Scientific & Technical.
Beale, C. M., Lennon, J. J., Yearsley, J. M., Brewer, M. J., & Elston, D. A. (2010). Regression analysis of spatial data.
Ecology Letters, 13(2), 246–264. doi:10.1111/j.1461-0248.2009.01422.x. Retrieved from https://onlinelibrary.
wiley.com/doi/abs/10.1111/j.1461-0248.2009.01422.x
Bevan, A. (2012). Spatial methods for analysing large-scale artefact inventories. Antiquity, 86(332), 492–506.
doi:10.1017/S0003598X0006289X. Retrieved from www.cambridge.org/core/article/spatial-methods-
for-analysing-largescale-artefact-inventories/F52E018213D69DBC559D1AD2DAF5F8DD
Bevan, A., & Conolly, J. (2004). GIS, archaeological survey, and landscape archaeology on the Island of Kythera,
Greece. Journal of Field Archaeology, 29(1–2), 123–138. doi:10.1179/jfa.2004.29.1-2.123. Retrieved from https://
doi.org/10.1179/jfa.2004.29.1-2.123
Bevan, A., & Conolly, J. (2011). Terraced fields and Mediterranean landscape structure: An analytical case study from
Antikythera, Greece. Ecological Modelling, 222(7), 1303–1314. https://doi.org/10.1016/j.ecolmodel.2010.12.016.
Retrieved from www.sciencedirect.com/science/article/pii/S0304380010006824
Bicho, N., Cascalheira, J., & Gonçalves, C. (2017). Early Upper Paleolithic colonization across Europe: Time and
mode of the Gravettian diffusion. PLoS One, 12(5), e0178506. doi:10.1371/journal.pone.0178506. Retrieved
from https://doi.org/10.1371/journal.pone.0178506
Bini, L. M., Diniz-Filho, J. A. F., Rangel, T. F. L. V. B., Akre, T. S. B., Albaladejo, R. G., Albuquerque, F. S., . . . Hawkins,
B. A. (2009). Coefficient shifts in geographical ecology: An empirical evaluation of spatial and non-spatial regres-
sion. Ecography, 32(2), 193–204. doi:10.1111/j.1600-0587.2009.05717.x. Retrieved from https://onlinelibrary.
wiley.com/doi/abs/10.1111/j.1600-0587.2009.05717.x
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic
approach (2nd ed.). New York: Springer.
Carrer, F. (2013). An ethnoarchaeological inductive model for predicting archaeological site location: A case-study of
pastoral settlement patterns in the Val di Fiemme and Val di Sole (Trentino, Italian Alps). Journal of Anthropological
Archaeology, 32(1), 54–62. https://doi.org/10.1016/j.jaa.2012.10.001. Retrieved from www.sciencedirect.com/
science/article/pii/S0278416512000530
Carrero-Pazos, M. (2018). Beyond the scale: Building formal approaches for the study of spatial patterns
in Galician moundscapes (NW Iberian Peninsula). Journal of Archaeological Science: Reports, 19, 538–551.
https://doi.org/10.1016/j.jasrep.2018.03.026. Retrieved from www.sciencedirect.com/science/article/pii/
S2352409X17308052
Chessel, D. (1981). The spatial autocorrelation matrix. In P. Poissonet, F. Romane, M. A. Austin, E. van der Maarel, &
W. Schmidt (Eds.), Vegetation dynamics in grasslands, heathlands and mediterranean ligneous formations: Symposium of the
Working Groups for Succession research on permanent plots, and data-processing in phytosociology of the International Society
for Vegetation Science, held at Montpellier, France, September 1980 (pp. 177–180). Dordrecht: Springer Netherlands.
Chi, G., & Zhu, J. (2020). Spatial regression models for the social sciences. Los Angeles: SAGE.
Chun, Y., & Griffith, D. A. (2013). Spatial statistics & geostatistics: Theory and applications for geographic information sci-
ence & technology. Los Angeles: SAGE.
Cobo, J. M., Fort, J., & Isern, N. (2019). The spread of domesticated rice in eastern and southeastern Asia was mainly
demic. Journal of Archaeological Science, 101, 123–130. https://doi.org/10.1016/j.jas.2018.12.001. Retrieved from
www.sciencedirect.com/science/article/pii/S0305440318303765
Contreras, D. A., Hiriart, E., Bondeau, A., Kirman, A., Guiot, J., Bernard, L., . . . Van Der Leeuw, S. (2018). Regional
paleoclimates and local consequences: Integrating GIS analysis of diachronic settlement patterns and process-based
agroecosystem modeling of potential agricultural productivity in Provence (France). PLoS One, 13(12), e0207622.
doi:10.1371/journal.pone.0207622. Retrieved from https://doi.org/10.1371/journal.pone.0207622
Domínguez-Rodrigo, M., Cobo-Sánchez, L., Uribelarrea, D., Arriaza María, C., Yravedra, J., Gidna, A., . . . Mabulla,
A. (2017). Spatial simulation and modelling of the early Pleistocene site of DS (Bed I, Olduvai Gorge, Tanzania): A
powerful tool for predicting potential archaeological information from unexcavated areas. Boreas, 46(4), 805–815.
doi:10.1111/bor.12252. Retrieved from https://doi.org/10.1111/bor.12252
Fall, P. L., Falconer, S. E., Galletti, C. S., Shirmang, T., Ridder, E., & Klinge, J. (2012). Long-term agrarian landscapes
in the Troodos foothills, Cyprus. Journal of Archaeological Science, 39(7), 2335–2347. doi:https://doi.org/10.1016/j.
jas.2012.02.010. Retrieved from www.sciencedirect.com/science/article/pii/S030544031200074X
Fort, J., Pujol, T., & Cavalli-Sforza, L. L. (2004). Palaeolithic populations and waves of advance. Cambridge Archaeologi-
cal Journal, 14(1), 53–61. doi:10.1017/S0959774304000046. Retrieved from www.cambridge.org/core/article/
palaeolithic-populations-and-waves-of-advance/B1370E9320ABED5563469999FA41FE9B
Gangodagamage, C., Zhou, X., & Lin, H. (2017). Spatial autocorrelation. In S. Shekhar, H. Xiong, & X. Zhou (Eds.),
Encyclopedia of GIS (2nd ed., pp. 92–99). New York: Springer.
Gil, A. F., Ugan, A., Otaola, C., Neme, G., Giardina, M., & Menéndez, L. (2016). Variation in camelid δ13C and δ15N
values in relation to geography and climate: Holocene patterns and archaeological implications in central western
Plate 2.8 A map showing points from two surveys collected with a total station. The location of the total station (the origin) is represented as a star; the survey of archaeological features on the surface is marked in green, and the survey of topography in brown.
Plate 2.9 The survey points overlaid on a scanned map that is geo-rectified to WGS-UTM 21. A Python script
was developed to enable rotation and transformation of points in a local coordinate system to a global coordinate
system (UTM) using two known coordinate pairs.
Plate 5.4 Percolation cluster transitions for Domesday settlement. Evolution of the largest cluster in the percolation process of Domesday settlement, overlaid on the transition plot (as in Figure 5.3). Maps of the clusters at the distance threshold for each transition are depicted. Each vector point's colour represents cluster membership, assigned when two or more nodes are close enough to be part of the same cluster.
Plate 5.5 Domesday vill clusters at 3km and 2.9km overlaid on English coastline and Domesday counties
(generated from datasets provided by Stuart Brookes).
Plate 5.6 Domesday vill and 19th-century settlement clusters. (a) Domesday vill clusters at 3.2km overlaid
on coastline and Domesday counties, generated from Domesday vill datasets provided by Stuart Brookes; (b &
c) Roberts and Wrathmell’s 19th Century Settlement Nucleation dataset at 3km (Brown, 2015, p. 37) and at
3.5km overlaid by Roberts and Wrathmell’s central province (Brown, 2015, p. 57).
Plate 5.7 Hillfort clusters in Britain, at (a) 34km, (b) 12km and (c) 9km percolation radius.
Plate 6.12 ‘2.5D’ representation of (a) kriged elevations and (b) conditionally simulated values (viewed from
the south west).
Plate 6.13 Radiate of Allectus: C mint percentages in 5 km grid cells.
Plate 6.16 Kriged map of C mint percentages.
Plate 7.6 Visual differences in the surfaces of nine interpolation methods at 1 m resolution. RMS errors for
each model are provided in Table 7.2.
Plate 7.8 Interpolation example modified from Fort (2015, Figure 1). An interpolated surface model of radiocarbon
dates from Neolithic sites (black dots) depicting the space-time process of the spread of agriculture across Europe.
Note the areas indicated by arrows in the southwest and northeast of the model showing where data sparseness causes
instability in estimates.
Plate 8.4 Map showing residuals at each location in order to give an indication of spatial autocorrelation (after
Bicho et al., 2017, Figure 1).
Plate 9.1 Screenshot depicting the distribution of radiocarbon dates available from the Canadian Archaeological Radiocarbon Database, version 2.1 (Martindale et al., 2016).
Plate 9.4 Local spatial permutation test of the summed probability distribution of radiocarbon dates (SPDRD)
from Neolithic Europe showing locations with higher (red) or lower (blue) geometric growth rates than the
expectation from the null hypothesis (i.e. spatial homogeneity in growth trajectories) at the transition period
between 6500–6001 and 6000–5501 cal BP. The insets on the right show the observed local geometric growth
rates and the simulation envelope for locations a and b on the map (see Crema et al., 2017, for details).
Plate 10.9 Spatiotemporal trajectories and imperfection of archaeological data in Syrian Arid Margins during
the Bronze Age.
Plate 11.3 Focal medians for 5km BASr catchments calculated from the BASr baseline for Ireland (left) and
residuals between expected 87Sr/86Sr ratios based on the BASr catchments and the observed 87Sr/86Sr ratio for
Individual A2 (right). Locations from which Individual A2 could have originated are shown in white. Loca-
tions from which Individual A2 is unlikely to have originated are shown in blue (more depleted) and orange
(more enriched).
Plate 11.5 Probability density surface (Top Left) and maximum likelihood estimations showing the locations from which Burial K from Duggleby Howe could have originated, based on the observed 87Sr/86Sr ratio (Top Right), the observed δ18O value (Bottom Left), and both the observed 87Sr/86Sr ratio and δ18O value (Bottom Right). The geographic assignments based on dual isotope tracers are unduly influenced by one of the isotopes (oxygen), raising further questions about the utility of oxygen as a tracer isotope.
Plate 12.3 A northwest Arkansas historic data set from 1892: (a) the 18 × 27 km study region with 589 historic farmsteads and roads plotted over topography with towns outlined, (b) maps of the four principal components of historic settlement with central values of legend indicating most preferred locations.
Plate 13.1 Southern portion of the coastal Georgia study area: maximum available calories for white-tailed
deer (Odocoileus virginianus) for the month of September (ca. 500 BP).
9
Non-stationarity and local
spatial analysis
Enrico R. Crema
Introduction
One of the core assumptions held by most spatial analyses is that the generative process behind the
observed pattern is stationary. This implies that statistical properties such as the intensity of a point process,
the nature of the relationship between dependent and independent variables, or the patterns of spatial
interaction are independent of their absolute location, and hence homogeneous across space. The assumption
is often adopted implicitly and not exclusively in spatial analyses; for example, when inferring population
trajectories of a particular region (using site counts or density of radiocarbon dates), the pattern observed
in the aggregate time-series is considered to be, at least to some extent, representative of the region as
a whole. The advantage of holding this view is that information can be reduced into global statistics,
enabling for example the description of complex and multi-scalar patterns of spatial interaction using
a single, distance-based function (cf. Bevan this volume, Figure 4.3). Yet in many cases holding such an
assumption might be problematic as many processes do vary in their properties across geographic space.
They are, in other words, spatially heterogeneous and non-stationary. Under these circumstances choosing
inappropriate methods that assume stationarity might at best hinder the detection of interesting variations
and outliers in the data, and at worst lead to an erroneous understanding of the overall pattern.
The common way to informally approach potential issues derived from non-stationarity is to simply
select a window of analysis where the generative process can be assumed to be spatially homogeneous.
Intuitively speaking, stationarity is negatively correlated with scale, as larger study areas are more likely
to incorporate variation in spatial properties, making the use of global statistics less appropriate. The
problem is that the exact scale where the assumption stops being valid can vary depending on the nature
of the process under investigation and the idiosyncrasies of the specific case study. While informal rules
of thumb might be appropriate in some situations, stationarity should not be an a priori assumption, but
rather a hypothesis to be evaluated. This is particularly the case for large scale synthetic research that
harnesses the availability of increasingly larger collections of digital data, ranging from spatial databases
of radiocarbon dates (e.g. Shennan et al., 2013; Chaput et al., 2015) to remotely sensed data (e.g. Menze,
Ur, & Sherratt, 2006; Biagetti et al., 2017).
It is worth noting that stationarity is a property of the model, and not of the observed data per se (Fortin & Dale, 2005). This is a crucial point, as in practical terms non-stationarity arises from model misspecification. Stationarity is an assumption whereby statistical properties such as mean or variance are considered to be spatially invariant. But these properties do often vary over space as a consequence of some unidentified variables, and failing to model these appropriately will drastically reduce our capacity to explain spatial heterogeneity and lead to the incorrect use of global statistics (Fotheringham, Brunsdon, & Charlton, 2000). A trivial example can illustrate this issue. Suppose that someone is analysing the
regional distribution of archaeological sites over a rugged landscape characterised by patches of flat areas
that are more suited for human occupation. For the sake of simplicity, we can assume that the only driver
of site density is the terrain morphology, with the intensity of occupation being five times larger in the flat
patches. The density of archaeological sites will not be homogeneous over space, but instead characterised
by several clusters located in these patches. Examining this data and computing a single estimate of site
density (i.e. computing a global statistic) would be inappropriate, and similarly analysing for spatial inter-
action might misleadingly suggest evidence of second-order interaction (when in fact the sites are not
attracted to each other but only to absolute spatial locations). The problem can be solved by either ana-
lysing the flat patches separately by partitioning the study area or by specifying a variable that explains the
variation in site density (i.e. terrain ruggedness). Ignoring either option will lead to incorrect inferences.
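This hypothetical scenario is easy to reproduce numerically. In the sketch below the region size, patch areas, and baseline intensity are illustrative assumptions; only the five-fold contrast comes from the example above. A single global density estimate is a convex combination of the two local intensities and so describes neither process well:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative numbers: a 100 km^2 region in which flat patches cover
# 20 km^2 and attract five times the site intensity of the rugged remainder.
area_flat, area_rugged = 20.0, 80.0       # km^2
lam_rugged = 0.5                          # sites per km^2 on rugged ground
lam_flat = 5 * lam_rugged                 # five-fold intensity on flat ground

n_flat = rng.poisson(lam_flat * area_flat)
n_rugged = rng.poisson(lam_rugged * area_rugged)

# The global estimate falls between the two local intensities and
# characterises neither regime.
flat_density = n_flat / area_flat
rugged_density = n_rugged / area_rugged
global_density = (n_flat + n_rugged) / (area_flat + area_rugged)
print(rugged_density, global_density, flat_density)
```

Partitioning the window by patch, or modelling intensity as a function of terrain, recovers the two distinct regimes that the global figure obscures.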
The substantial growth in the availability of Geographic Information Systems (GIS)-based spatial data
in recent years has undoubtedly eased the creation of more sophisticated and complex models that can
account for different kinds of spatial dependencies induced by environmental variables. When these variables are appropriately identified and modelled, such advances can limit the risk of model misspecification. However, this is not
necessarily a trivial task, and the situation is worsened for two reasons. First, spatial differences can also
arise simply as a consequence of heterogeneity in archaeological research design. Different states, regions,
and individuals often employ different sampling strategies resulting in biases that can exhibit strong spatial
structure. Figure 9.1, for example, shows the location of North American archaeological sites included in
the Canadian Archaeological Radiocarbon Database (CARD, v.2.1). The overall variation in the density
of archaeological sites with radiocarbon dates is a combined effect of past population density and differ-
ences in sampling intensity, but the remarkable strength of the latter is not always as self-evident as in the
case of the state of Wyoming, shown here as a rectangular patch with a disproportionately high sample
density. Despite the known role of these forms of sampling biases, archaeological spatial analyses have
rarely addressed this issue formally (but see Bevan (2012) for an exception; see also Banning, this volume).
Yet examples in fields such as ecology showcase how the challenging task of quantifying and formally
integrating sampling bias is not only possible but can dramatically improve the predictive power of a
model (see Syfert, Smith, and Coomes (2013) and Stolar and Nielsen (2015) for applications in species
distribution modelling).
Second, whilst model misspecification and sampling bias are, at least potentially, tractable problems,
non-stationarity can also arise because different individuals might genuinely exhibit different relation-
ships across space. Cultural, behavioural, and economic differences can in fact lead to different practices,
attitudes, and preferences towards the very same environmental variable, and at the same time these varia-
tions are likely to exhibit spatial autocorrelation. Global analysis will, by definition, ignore these potential
variations as its core assumption is that individual observations are interchangeable and originating from
the same process. This can be regarded as a particular form of model misspecification (e.g. one could,
at least in theory, specify categorical variables to depict cultural affiliations), albeit one where identify-
ing and quantifying key variables is difficult, if not impossible. From a theoretical standpoint ignoring
potential spatial heterogeneity arising from these factors is an example of environmental determinism (see
Gaffney & van Leusen, 1995; Jones & Hanham, 1995), an approach that ‘denies geography and history’ by
assuming that ‘every time and everywhere is basically the same’ (Jones, 1991, p. 8; cited in Fotheringham
et al., 2000, p. 95).
Figure 9.1 Screenshot depicting the distribution of radiocarbon dates available from the Canadian Archaeo-
logical Radiocarbon Database, version 2.1 (Martindale et al., 2016). A colour version of this figure can be
found in the plates section.
Method
How then can we identify non-stationarity? How can we discern cases where using global statistics is
still appropriate in contrast to instances where model misspecifications, sampling bias, and un-modelled
cultural variables can deeply undermine the results of the spatial analysis? Within a typical modelling
framework (e.g. regression analysis), the standard way to tackle this issue is to examine for the presence
of spatial autocorrelation in model residuals. While this is an efficient solution that directly examines the assumptions of global statistics, the detection of spatial structure in the residuals provides only a general indication of misspecification and offers insufficient detail on the nature of the spatial variation per se.
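This diagnostic can be computed in a few lines. The NumPy sketch of global Moran's I below is illustrative (the grid, the rook-contiguity weight scheme, and the function name are assumptions, not from the text); it contrasts a smooth trend, which produces strong positive autocorrelation, with a perfectly alternating surface, which produces strong negative autocorrelation:

```python
import numpy as np

def morans_i(values, w):
    """Global Moran's I, the standard check for spatial autocorrelation in,
    e.g., model residuals. `w` is any spatial weights matrix (zero diagonal)."""
    z = values - values.mean()
    n = len(z)
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Illustrative check on a 5 x 5 grid with rook-contiguity weights.
xy = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
d = np.linalg.norm(xy[:, None] - xy[None, :], axis=2)
w = (d == 1.0).astype(float)               # orthogonally adjacent cells only

trend = xy[:, 0]                           # smooth west-east gradient
checker = (-1.0) ** (xy[:, 0] + xy[:, 1])  # perfectly alternating values
print(morans_i(trend, w), morans_i(checker, w))  # ~0.75 and -1.0
```

Residuals behaving like the first case would signal that the global model has missed a spatially structured variable.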
One way to approach this problem is to break down the average properties observed at the global
scale and focus the perspective on to its local scale constituents. Thus rather than yielding a single statistic
describing the entire window of analysis, the objective is to retrieve multiple values, one for each of the
sampled locations. By analysing these statistics or even simply visualising them on a map, regularities and
exceptions can be identified. This provides clues for identifying plausible missing variables, or offers some insight into the nature of culturally-driven spatial heterogeneity. The growth of geographic information systems (GIS) in the mid-1990s has particularly fostered the development of a suite of statistical
techniques, generally referred to as local spatial analysis, that implement this shift from a global to a local
perspective. These include both local versions of pre-existing global statistics (e.g. Local Ripley’s K,
Local Moran’s I, Geographically Weighted Regression, Spatial Expansion Method, etc.) as well as pur-
posely developed new methods (e.g. the geographical analysis machine, GAM, by Openshaw, Charlton,
Wymer, & Craft, 1987, but also the locally-adaptive model of archaeological potential, LAMAP, by Carleton,
Conolly, & Iannone, 2012).
While these techniques vary in their details (see below), they generally share two main properties:
(1) statistics are computed for each observed sample location, and hence they can be “mapped”; and
(2) statistics are computed by weighting the contribution of samples based on the distance to each focal
observation, i.e. they are based on local neighbourhoods that can be specified in a variety of ways (e.g. contiguity in polygon data, a fixed number of ‘nearest’ neighbours, a cut-off distance, distance-decay functions, etc.; see Getis & Aldstadt, 2010, for a review). The subsections below provide a brief summary
of the key concepts pertaining to the most commonly used forms of local spatial analysis, and a review
of their archaeological applications.
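As an illustration of property (2), the neighbourhood definitions listed above might be encoded as follows (the function name and default parameters are hypothetical, chosen only for the sketch):

```python
import numpy as np

def neighbourhood_weights(coords, scheme="knn", k=4, cutoff=1.5, bandwidth=1.0):
    """Encode three of the neighbourhood definitions mentioned in the text:
    k nearest neighbours, a fixed cut-off distance, or a Gaussian
    distance-decay function. Returns an n x n weight matrix."""
    d = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbour
    if scheme == "knn":
        w = np.zeros_like(d)
        np.put_along_axis(w, np.argsort(d, axis=1)[:, :k], 1.0, axis=1)
    elif scheme == "cutoff":
        w = (d <= cutoff).astype(float)
    else:                                 # "decay": weights fade with distance
        w = np.exp(-(d / bandwidth) ** 2)
    return w

# Demo on a 3 x 3 grid: with a cut-off of 1, corners have two neighbours
# and the centre has four.
xy = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
print(neighbourhood_weights(xy, scheme="cutoff", cutoff=1.0).sum(axis=1))
```

Whatever the scheme, the resulting matrix is what turns a global statistic into a mappable local one: each row defines the neighbourhood over which the statistic for that observation is computed.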
Figure 9.2 Simulated point patterns with associated observed (solid line) and expected (dashed line, under
Complete Spatial Randomness) L function (a variant of Ripley’s K function where the theoretical expectation
of Complete Spatial Randomness (CSR) is a straight line): (a) homogeneous Poisson process; (b) clustered point
process; (c) spatially inhomogeneous Poisson process with different intensities between left and right sides of
the window of analysis (separated by the dashed line); (d) second-order spatial heterogeneity with a combina-
tion of regular (left) and clustered (right) patterns. The function suggests aggregation (clustering) when the
observed L function is above the expected value and segregation (regular spacing) when below.
example). If the objective is the detection of spatial interaction then using a homogeneous Poisson process
in this case can be regarded as a particular form of misspecification (see Figure 9.2(c)). The issue can be
tackled by using more sophisticated techniques that can replace the null hypothesis with a spatially inho-
mogeneous version of the Poisson model, where the intensity varies as a function of external covariates.
For example, Eve and Crema (2014) investigated the distribution of Bronze Age houses at Leskernick
Hill (Cornwall, UK) by first fitting a point process model using a range of covariates including elevation,
slope, and visibility of landmarks (i.e. modelling induced spatial dependency), and subsequently used a
residual K function to detect clustering that was not accounted for by their fitted model (i.e. inherent
spatial dependency).
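A spatially inhomogeneous Poisson null of this kind can be simulated by ‘thinning’: candidate points are generated at the maximum intensity and each is retained with probability proportional to the local intensity. A minimal sketch, assuming purely for illustration that intensity rises linearly with a single covariate (here simply the x coordinate):

```python
import numpy as np

rng = np.random.default_rng(0)

# Thinning on the unit square: lambda(x, y) = lam_max * x (an assumed,
# illustrative covariate effect).
lam_max = 400.0
n = rng.poisson(lam_max)                  # candidates at the maximum rate
cand = rng.random((n, 2))
pts = cand[rng.random(n) < cand[:, 0]]    # keep with probability lambda/lam_max

# The resulting first-order trend concentrates points in the east; a
# homogeneous null model would misread this as clustering (interaction).
west = (pts[:, 0] < 0.5).sum()
east = (pts[:, 0] >= 0.5).sum()
print(west, east)
```

Comparing observed summary statistics against envelopes built from many such inhomogeneous simulations separates first-order trend from genuine second-order interaction.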
This solution is feasible as long as the inhomogeneous Poisson model can be assumed to be stationary.
However, the relationship between the intensity of the point process and the external variables (described
by the parameters of the fitted model) might also vary over space. If this is the case, a global fitted model is no longer a viable option and one should adopt alternative solutions (see Baddeley, 2017) similar to
those used in geographically weighted regression (see below).
Furthermore, even when variation in the externally induced spatial dependency is taken into account,
the nature of spatial interaction (i.e. inherent spatial dependency) might still vary over space (Fig-
ure 9.2(d)). Such second-order heterogeneity (Pélissier & Goreaud, 2001) cannot be tackled by the most
commonly adopted point-pattern analysis techniques such as Ripley’s K function or Nearest Neighbour
Index, as the mathematics underpinning the methods described above are based on aggregate statistics
(e.g. the mean density within a specific radius or the average distance to the nearest neighbour) that
effectively ignore variation between observations.
The solution in this case is to measure the same statistic for each observation point and map their
variation over space. The most widely adopted example of this approach is Getis and Franklin’s (1987)
second-order neighbourhood analysis, which is effectively equivalent to a local version of Ripley’s K function.
A few archaeological examples employ this technique either in its basic form (e.g. Palmisano, 2013) or in its bivariate version, where the inherent spatial dependency is investigated in terms of relationships of attraction or repulsion between two classes of points (e.g. two different artefact types). For example, Orton
(2004) re-examined the flint artefact distribution within the Mesolithic site of Barmose I, identifying
potential activity areas as an alternative to cluster analyses. Crema and Bianchi (2013), and more recently
Riris (2017), applied the same suite of techniques on survey data, operationalizing the transition from a
site-centric to artefact-centric analysis of surface scatters. Both of these studies identified local patterns
of inter-type artefact aggregation and segregation (with statistical significance obtained from random
permutation tests), and more importantly ‘mapped’ the variation of such relationships over space, iden-
tifying complex patterns within and between clusters that cannot be adequately described by standard
global spatial analysis. Figure 9.3 compares, for example, the output of a global (Figure 9.3(b)) and a local
(Figure 9.3(c)) point pattern analysis aimed at assessing the aggregation/segregation of stone tools made of different raw materials (see Crema & Bianchi (2013) for further details). The global bivariate L function suggests an aggregation between different materials (in this case Gafsa-sourced flint vs flint sourced from elsewhere) up to 350 meters. The local version of the same analysis shows, however, that this aggregation occurs only in some areas (see filled dots in Figure 9.3(c)).
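A bare-bones version of such a per-point L function can be sketched as follows (edge corrections are omitted for brevity, and the demo data are synthetic, not from the Sebkha Kelbia survey):

```python
import numpy as np

def local_L(points, r, area):
    """A per-point L function in the spirit of Getis and Franklin's
    second-order neighbourhood analysis (no edge correction). Under
    complete spatial randomness E[L_i(r)] is approximately r; larger
    values flag points sitting in local aggregations."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    counts = (d <= r).sum(axis=1) - 1     # neighbours within r, excluding self
    return np.sqrt(area * counts / ((n - 1) * np.pi))

# Synthetic demo: 100 background points plus a tight cluster of 20.
rng = np.random.default_rng(1)
pts = np.vstack([rng.random((100, 2)),
                 0.5 + 0.02 * rng.standard_normal((20, 2))])
L = local_L(pts, r=0.1, area=1.0)
print(L[100:].mean(), L[:100].mean())     # cluster members score far higher
```

Mapping each point's L value against the expectation is precisely what allows the kind of within- and between-cluster variation described above to be visualised, with significance assessed by random permutation of point labels.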
Figure 9.3 Lithic distribution analysis from the Sebkha Kelbia survey, Tunisia (after Crema & Bianchi, 2013), showing contrasting results between global and local bivariate L functions of stone tools divided by their raw material (Gafsa flint vs. flint sourced elsewhere): (a) distribution of the analysed stone tools (filled circle: Gafsa-sourced flint; hollow circle: flint sourced from elsewhere); (b) bivariate L function showing significant segregation between the two classes between 20 and 320 meters (MC: Monte-Carlo); (c) local bivariate L function at the 100-meter scale, showing evidence of aggregation (black dots indicate the location of Gafsa-sourced flints with a statistically significant proportion of neighbours composed of flint sourced from elsewhere).
simulations to assess statistical significance and formally define spatial neighbourhoods using a weighted
scheme (Getis & Aldstadt, 2010).
While the most conventional use of LISA is to provide a better diagnostic tool for regression analysis by identifying where residuals exhibit strong autocorrelation, the variety of archaeological applications testifies to how this suite of techniques (along with other local versions of geostatistical analyses) can be used in a
range of contexts. For example, Premo (2004) used Moran’s local I (Anselin, 1995) and Getis’s local Gi*
statistics (Getis & Ord, 1992; a related technique designed to identify clusters, distinguishing whether they comprise low or high values compared to the mean) to explore the spatial distribution of terminal long-count
dates carved on Classic Maya monuments. The objective in this case was to determine whether these
proxies of ‘collapse’ (the terminal dates indicate the most recent year when elites at a particular site raised
monuments) exhibit local variations in their extent of autocorrelation, and identify the presence and the
location of significant clusters of early and late dates. Crema, Bevan, and Lake (2010) also used the local
Gi* statistics as an exploratory analysis to identify areas of low or high chronological uncertainty in Middle
to Late Jomon pit-dwellings in central Japan. More recently Styring, Maier, Stephan, Schlichtherle, and
Bogaard (2016) used the same analysis on the δ15N value of cereal grains at the Neolithic site of Hornstaad-
Hörnle IA, Germany, to investigate patterns of inter-household variation in crop-husbandry practices.
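For illustration, the Gi* statistic in its usual z-score form can be written in a few lines of NumPy (the function name, demo grid, and values below are hypothetical):

```python
import numpy as np

def local_gi_star(vals, w):
    """Getis-Ord Gi* in its z-score form (the focal location is included in
    its own neighbourhood). Large positive values mark local concentrations
    of high values ('hot spots'); large negative values mark 'cold spots'."""
    n = len(vals)
    xbar, s = vals.mean(), vals.std()
    wsum = w.sum(axis=1)
    denom = s * np.sqrt((n * (w ** 2).sum(axis=1) - wsum ** 2) / (n - 1))
    return (w @ vals - xbar * wsum) / denom

# Hypothetical demo: a 5 x 5 grid with high values around the centre cell.
xy = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
d = np.linalg.norm(xy[:, None] - xy[None, :], axis=2)
w = (d <= 1.0).astype(float)              # rook neighbours plus the cell itself
vals = (np.linalg.norm(xy - [2.0, 2.0], axis=1) <= 1.0).astype(float)
z = local_gi_star(vals, w)
print(z.argmax())                         # index 12, the centre of the hot spot
```

Mapping these z-scores is what allows applications like those above to locate significant clusters of early terminal dates, high chronological uncertainty, or distinctive isotope values.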
Archaeological applications of regression analysis include the estimation of rates of expansion (e.g. of
the spread of farming), the fitting of fall-off curves of proportion data (e.g. artefact type) from potential
centres of production (e.g. Eerkens, Spurling, & Gras, 2008), and the modelling of site presence/absence via
logistic regression (e.g. Carrer, 2013; see Kvamme this volume). Most of these regression models assume that: (a) samples
are independent; and (b) the observed relationships between variables are the same across space, i.e. they
assume stationarity. The latter implies that estimates of the rate of expansion are assumed to be constant
over space, decrease in the proportion of artefact types from the source is assumed to be isotropic (i.e.
there is no directionality in the fall-off), and that the independent variables are assumed to have the same
role in determining the likelihood of site presence across the study area. As for the other cases, if these
assumptions are not justified, models can potentially be misspecified and estimates biased.
While diagnostics of regression residuals can help identify problematic cases, they do not explicitly
model spatial heterogeneity and hence do not provide the means to formally approach the non-stationarity
problem (i.e. they cannot tell us how relationships vary across space). The last two decades,
however, have seen the development of a wide range of regression techniques designed for the analysis of
spatial data. Problems such as the non-independence and autocorrelation of sample observations are being
tackled by tailored methods such as spatial auto-regressive models (see Gil et al., 2016 for an archaeo-
logical application). Geographically Weighted Regression (GWR) (Fotheringham, Brunsdon, & Charlton,
1998; Fotheringham et al., 2002) is one such technique, suited to instances where the relationships
between variables are known to be spatially heterogeneous. The method is essentially a 'local' version of
regression analysis in which global model parameters are replaced by continuous functions of the spatial
coordinates of each location. A 'global' regression model can thus be regarded as a
special case of GWR in which the outputs of these continuous functions do not vary across space.
By allowing model parameters to vary across space, the technique takes spatial heterogeneity into
account (reducing model misspecification), while at the same time making it possible to 'map' the spatial
variation of the parameters (and hence the spatial variation in the relationship between dependent and
independent variables). Geographically weighted regression assumes that, when estimating the parameters
for a given location i, sites in proximity have a larger impact on the estimates of the model parameters than
using some distance decay function. Geographically weighted regression shares some similarities with
the spatial expansion method (Jones & Casetti, 1992), an earlier technique that similarly highlighted the
importance of spatially varying relationships. Whilst the spatial expansion method is a relevant precursor of
GWR, it provides less flexibility in defining how parameters vary over space, as it is designed to capture
general directional trends and its form needs to be assumed a priori (Fotheringham et al., 2000).
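The core computation can be sketched as one weighted least-squares fit per location, with a Gaussian distance-decay kernel supplying the weights. This Python toy is purely illustrative (real applications would rely on a dedicated GWR package, and the fixed bandwidth here is an arbitrary choice rather than one selected by cross-validation):

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit a separate weighted least-squares regression at each location.

    Observations are weighted by a Gaussian distance-decay kernel, so
    nearby points dominate each local estimate. Returns an (n, p) array
    of local coefficients, intercept first.
    """
    coords = np.asarray(coords, float)
    Xd = np.column_stack([np.ones(len(y)), X])      # design matrix with intercept
    betas = np.empty((len(y), Xd.shape[1]))
    for i, ci in enumerate(coords):
        dist = np.linalg.norm(coords - ci, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian kernel weights
        WX = Xd * w[:, None]                        # weight each observation
        betas[i] = np.linalg.solve(WX.T @ Xd, WX.T @ y)  # local normal equations
    return betas

# Simulated data whose true slope varies smoothly from west (0) to east (2)
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
true_slope = coords[:, 0] / 5.0
y = 1.0 + true_slope * x + rng.normal(0, 0.1, 200)
local_slope = gwr_coefficients(coords, x[:, None], y, bandwidth=1.5)[:, 1]
```

Mapping `local_slope` against the coordinates would reveal the simulated west-to-east trend in the relationship, which is exactly the kind of output that makes GWR attractive for exploring non-stationarity.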
Despite its ability to address potential issues of environmental determinism, GWR has seen comparatively
few archaeological applications. Gkiasta, Russell, Shennan, and Steele (2003) explored local
variations in the rate of the spread of farming in Neolithic Europe, whilst Bevan and Conolly (2009)
examined how covariates such as slope, vegetation, and geology have different relationships to the surface
pot-sherd density in different parts of the Greek island of Antikythera using a geographically weighted
zero-inflated Poisson regression. The technique has also been explored in the context of predictive
modelling of site locations (Löwenborg, 2010), as well as for larger-scale synthetic research, such as the
study of Linearbandkeramik (LBK) faunal remains in western Europe (Manning et al., 2013).
Case study
The core principles shared across the methods described above can be applied to virtually any analysis that
seeks to tackle non-stationarity. One recent archaeological example is the spatial extension of the summed
probability distribution of radiocarbon dates (SPDRD). The non-spatial version of this technique has
Non-stationarity and local spatial analysis 163
recently rekindled strong interest in prehistoric demography, as the increasing availability of large
collections of radiocarbon dates provides a new proxy for inferring past population trajectories within
an absolute chronological framework. While the core assumptions of this “dates as data” (Rick, 1987)
approach are still being discussed, it is undeniable that SPDRD is quickly becoming part of the standard
toolkit in regional studies. In particular, the production of demographic time-series within an absolute
chronology is opening new possibilities for inferring the role of past climatic change (e.g. Kelly, Surovell,
Shuman, & Smith, 2013; Warden et al., 2017) or to explore cross-regional divergences in demographic
trajectories (e.g. Timpson et al., 2014; Crema, Habu, Kobayashi, & Madella, 2016), potentially at the
global level (Chaput & Gajewski, 2016).
The possibility of incorporating a spatial dimension is particularly noteworthy here, as it requires a careful
balance between sample size and the spatial extent of the window of analysis. Because the shape of
SPDRD is subject to sampling error, a formal assessment of its shape (i.e. the hypothesised demographic
trajectories) requires a sufficient number of radiocarbon dates. While threshold sizes have been
proposed (e.g. Williams, 2012), the optimal sample size ultimately depends on the specific
null hypothesis being tested (the most common being exponential and logistic population
growth) and the effect size being sought. Other things being equal, the most straightforward way
to increase the sample size is to expand the window of analysis. This, however, means that
stationarity is harder to justify as different regions are likely to experience heterogeneous demographic
histories (cf. sub-regions in Shennan et al., 2013 and Timpson et al., 2014) as well as different sampling
strategies (see Figure 9.1, Bevan et al., 2017; see also Banning, this volume). The latter in particular
hinders the straightforward application of methods such as Kernel Density Estimates (KDE; see Bevan,
this volume), as the number of radiocarbon dates is determined at least in part by local differences in
sampling intensity. Attempts to overcome this issue have been rare, with the notable exception of Chaput
and Gajewski (2016), who employ relative risk surfaces (see also the supplementary materials in Bevan et al.,
2017) obtained by taking the ratio of each KDE map to the overall sampling intensity. While this approach
offers a valuable correction to the observed pattern, it does not distinguish genuine instances of spatial
heterogeneity from variations arising from sampling error.
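As an illustration of the relative-risk construction, the sketch below (Python, with a hand-rolled fixed-bandwidth Gaussian KDE and simulated locations) divides the density of dates from one period by the density of all dated sites; values near 1 indicate that the period pattern merely mirrors sampling intensity:

```python
import numpy as np

def kde2d(points, grid, h):
    """Fixed-bandwidth 2D Gaussian kernel density estimate."""
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / h**2).sum(axis=1) / (len(points) * 2 * np.pi * h**2)

rng = np.random.default_rng(0)
# Simulated dated sites: sampling intensity twice as high in the western cluster
sites = np.vstack([rng.normal(2, 1, (200, 2)), rng.normal(6, 1, (100, 2))])
# Dates of one period, drawn uniformly from all sites (no genuine hotspot)
period = sites[rng.choice(len(sites), 80, replace=False)]

gx, gy = np.meshgrid(np.linspace(0, 8, 40), np.linspace(0, 8, 40))
grid = np.column_stack([gx.ravel(), gy.ravel()])
# Relative risk surface: period density divided by overall sampling density;
# values near 1 mean the period simply mirrors sampling intensity
risk = kde2d(period, grid, h=0.6) / np.maximum(kde2d(sites, grid, h=0.6), 1e-12)
```

A production analysis would instead use dedicated tools (e.g. spatstat's relative risk functionality in R) and, crucially, still faces the sampling-error problem discussed above.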
Crema, Bevan, and Shennan (2017) have recently explored this issue by developing a local spatial
analysis designed to identify the presence of spatial heterogeneity in the demographic trajectories
hypothesised from the SPDRDs, enabling the formal assessment of non-stationarity. The method involves the
following six steps (for the full description see the original paper):
1) Compute for each site i a local SPDRD which is created by summing all radiocarbon probabilities
but weighting (using an exponential decay function) the contribution of dates from neighbouring
sites as a function of distance from i.
2) Define temporal slices (e.g. 7500–7001 cal BP, 7000–6501 cal BP, etc.) and compute the geometric
growth rate between abutting pairs for each local SPDRD (e.g. between 7500–7001 and 7000–6501
cal BP, between 7000–6501 cal BP and 6500–6001 cal BP, and so on . . .).
3) Randomly permute the spatial coordinates of the radiocarbon dates, so that the entire set of dates
associated with a particular location x is given a new location y, and then execute steps 1 and 2 above.
4) Repeat step 3 n times, so that for each transition (e.g. from 7500–7001 to 7000–6501 cal BP) at each
site, there is one observed geometric growth rate (obtained in steps 1–2), and n simulated geometric
growth rates (obtained in step 3). The latter is the expected pattern under the assumption of spatial
stationarity (i.e. the same expected growth rate across space with variation entirely determined by
sampling error). Notice that the envelope of the simulated rates will be narrower in regions with a
higher sampling intensity and wider in areas with a lower sampling intensity.
164 Enrico R. Crema
5) Compare the observed and simulated growth rates for each location and compute the p-value for
significance testing, equivalent to (r+1)/(n+1) where r is the number of replicates where the simu-
lated growth rate is lower (or higher) than the observed rate.
6) Use the distribution of p-values to compute false discovery rates (q-values, Benjamini & Hochberg,
1997) to take into account expected inflation of type I error (i.e. incorrect rejection of a true null
hypothesis) due to multiple hypothesis testing.
Figure 9.4 shows the result of this local analysis applied in the context of Neolithic Europe. The red
dots indicate site locations with a significant (q-value < 0.05) local positive departure from the expected
growth rate under stationarity in the transition between 6500–6001 cal BP and 6000–5501 cal BP (transition
IV), whilst the blue dots indicate the opposite (a lower than expected rate). If all regions experienced
similar population trajectories (as inferred from the density of radiocarbon dates) and local variations in
the SPD were purely the result of sampling error, we would not expect to observe any significant positive
or negative departures. The insets show the result of two particular locations where the observed growth
rate (solid line with filled dots) is higher and lower than the expected rates under stationarity (dashed line
Figure 9.4 Local spatial permutation test of the summed probability distribution of radiocarbon dates
(SPDRD) from Neolithic Europe showing locations with higher (red) or lower (blue) geometric growth rates
than the expectation from the null hypothesis (i.e. spatial homogeneity in growth trajectories) at the transition
period between 6500–6001 and 6000–5501 cal BP. The insets on the right show the observed local geometric
growth rates and the simulation envelope for locations a and b on the map (see Crema et al., 2017, for details).
A colour version of this figure can be found in the plates section.
with hollow dots) and its associated simulation envelope (grey region) obtained from 10,000 permuta-
tions. The result indicates statistically significant instances of spatial heterogeneity in growth rates, with
southern Britain, southern Ireland, the Baltic regions, and parts of central Germany experiencing higher
growth rates, while most of continental Europe within the study area shows the opposite pattern.
Conclusion
The substantial heterogeneity in objectives, data types, and scales of analysis makes the application
of spatial analysis in archaeology a challenging and diverse task. Techniques are mostly developed
in other fields and come with assumptions that were valid for the particular contexts they were designed
for. Whilst generalised tools are highly desirable, the underpinning assumptions are not easily
transferable across different applications. The problem is exacerbated by the fact that too often we ignore the
assumptions and their implications entirely, leading to a divergence between archaeological theories and
spatial models.
The problem of non-stationarity is a good example of this; the majority of spatial statistics used in archae-
ology assume spatial homogeneity, yet the theoretical stance and interest of archaeologists is often focused
much more on heterogeneity. Despite the availability of a substantial range of techniques that are designed
to tackle non-stationarity (or to model spatially heterogeneous processes), archaeological applications are
comparatively rare with global statistics still being the most commonly adopted approach. The growing
amount of high quality data at increasingly larger spatial scales might however change this and promote the
use of local spatial analysis. This will no doubt provide new perspectives on the human past, enabling us to
answer questions that are perhaps in line with a wider range of theoretical approaches. Such a shift in scale
will, however, require the creation of more tailored techniques as well as the retrieval of data that can provide
the basis for exploring the effects of research bias. It is undeniable that with the increasing possibility to
engage with larger spatial scales, we will have to face the impact of heterogeneous research practices. These
will have a greater role in shaping the distributions we observe, hindering our ability to isolate the patterns
we truly seek to study. The adoption of local statistics can help this endeavour but it is worth noting that
these are ultimately exploratory tools and can never replace a global model where key missing variables are
correctly integrated. Detecting spatial heterogeneity tells us only that something is missing; we might
estimate where, and to some extent even how, but never what. Furthermore, one should
also avoid the temptation to rely exclusively on the inductive insights offered by the output of local analyses
and to conceive of them as the final stage of a research workflow. This is particularly so because the number
of statistical hypotheses tested is generally as large as the number of observations. As a consequence, there is an
increased possibility of incorrectly rejecting the null hypothesis even when it is true (a type I error). This is
a known problem and one that cannot be easily solved by standard correction methods, such as Bonferroni,
as tests are not entirely independent from each other and consequently an indiscriminate use of p-value
adjustment can lead to overly conservative conclusions (i.e. type II errors). This is also a known issue within
the literature of local spatial analysis (e.g. de Castro & Singer, 2006), and while some suggestions have been
proposed there is no consensus towards a single solution. Ultimately, local analyses should not be considered
as substitutes for global statistics but rather as a suite of complementary tools for evaluating assumptions,
providing clues for searching for missing variables, and refining hypotheses.
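The contrast between corrections can be made concrete with a toy set of p-values: Bonferroni multiplies each p-value by the number of tests, while the classic (unweighted) Benjamini–Hochberg step-up procedure controls the false discovery rate and is less conservative. A minimal Python sketch (the p-values are invented):

```python
import numpy as np

def benjamini_hochberg(p):
    """Classic BH step-up adjusted p-values (q-values), controlling the FDR."""
    p = np.asarray(p, float)
    n = len(p)
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest p-value downwards
    q = np.minimum.accumulate(ranked[::-1])[::-1].clip(max=1.0)
    out = np.empty(n)
    out[order] = q
    return out

p = np.array([0.001, 0.008, 0.012, 0.04, 0.2, 0.6])
bonferroni = np.clip(p * len(p), None, 1.0)
q = benjamini_hochberg(p)
```

Here Bonferroni retains two tests at the 0.05 level while BH retains three; with hundreds of spatially correlated local tests the gap, and the risk of over- or under-correction, grows accordingly.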
Acknowledgements
I would like to thank Mark Gillings, Gary Lock, and Piraye Hacıgüzeller for inviting me, for providing
constructive feedback on the manuscript, and above all for being patient. I am also grateful for endless
discussions on these topics with a number of colleagues, in particular Andrew Bevan, Mark Lake, and
Alessio Palmisano. Analyses were performed using the spatstat (Baddeley, Rubak, & Turner, 2015) and
rcarbon (Bevan & Crema, 2018) packages within the R statistical computing language (R Core Team, 2018).
References
Anselin, L. (1995). Local indicators of spatial association–LISA. Geographical Analysis, 27, 93–115.
Baddeley, A. (2017). Local composite likelihood for spatial point processes. Spatial Statistics, 22, 261–295.
Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial point patterns: Methodology and applications with R. London: Chap-
man and Hall/CRC Press.
Bailey, T. C., & Gatrell, A. C. (1995). Interactive spatial data analysis. Harlow: Prentice Hall.
Biagetti, S., Merlo, S., Adam, E., Lobo, A., Conesa, F. C., Knight, J., . . . Madella, M. (2017). High and medium reso-
lution satellite imagery to evaluate late Holocene human–environment interactions in Arid lands: A case study
from the Central Sahara. Remote Sensing, 9, 351.
Benjamini, Y., & Hochberg, Y. (1997). Multiple hypotheses testing with weights. Scandinavian Journal of Statistics,
24, 407–418.
Bevan, A. (2012). Spatial methods for analysing large-scale artefact inventories. Antiquity, 86, 492–506.
Bevan, A., Colledge, S., Fuller, D., Fyfe, R., Shennan, S., & Stevens, C. (2017). Holocene fluctuations in human popu-
lation demonstrate repeated links to food production and climate. Proceedings of the National Academy of Sciences of
the United States of America, 114(49), E10524–E10531.
Bevan, A., & Conolly, J. (2009). Modelling spatial heterogeneity and nonstationarity in artifact-rich landscapes.
Journal of Archaeological Science, 36, 956–964.
Bevan, A., & Crema, E. R. (2018). Rcarbon v1.2.0: Methods for calibrating and analysing radiocarbon dates. Retrieved from
https://CRAN.R-project.org/package=rcarbon
Carrer, F. (2013). An ethnoarchaeological inductive model for predicting archaeological site location: A case-study of
pastoral settlement patterns in the Val di Fiemme and Val di Sole (Trentino, Italian Alps). Journal of Anthropological
Archaeology, 32, 54–62.
Carleton, W. C., Conolly, J., & Iannone, G. (2012). A locally-adaptive model of archaeological potential (LAMAP).
Journal of Archaeological Science, 39, 3371–3385.
Chaput, M. A., & Gajewski, K. (2016). Radiocarbon dates as estimates of ancient human population size. Anthropo-
cene, 15, 3–12.
Chaput, M. A., Kriesche, B., Betts, M., Martindale, A., Kulik, R., Schmidt, V., & Gajewski, K. (2015). Spatiotemporal
distribution of Holocene populations in North America. Proceedings of the National Academy of Sciences of the United
States of America, 112(39), 12127–12132.
Clark, P. J., & Evans, F. C. (1954). Distance to nearest neighbour as a measure of spatial relationships in populations.
Ecology, 35, 445–453.
Crema, E. R., Bevan, A., & Lake, M. (2010). A probabilistic framework for assessing spatio-temporal point patterns
in the archaeological record. Journal of Archaeological Science, 37, 1118–1130.
Crema, E. R., Bevan, A., & Shennan, S. (2017). Spatio-temporal approaches to archaeological radiocarbon dates.
Journal of Archaeological Science, 87, 1–9.
Crema, E. R., & Bianchi, E. (2013). Looking for patterns in the noise: Non-site spatial analysis at Sebkha Kalbia,
Tunisia. In S. Mulazzani (Ed.), Le Capsien de Hergla (Tunisie): Culture, environnement et économie (pp. 385–395). Frankfurt:
Africa Magna.
Crema, E. R., Habu, J., Kobayashi, K., & Madella, M. (2016). Summed probability distribution of 14C dates sug-
gests regional divergences in the population dynamics of the Jomon Period in Eastern Japan. PLoS One, 11.
doi:10.1371/journal.pone.0154809
de Castro, M. C., & Singer, B. H. (2006). Controlling the False Discovery Rate: A new application to account for
multiple and dependent tests in local statistics of spatial association. Geographical Analysis, 38, 180–208.
Eerkens, J. W., Spurling, A. M., & Gras, M. A. (2008). Measuring prehistoric mobility strategies based on obsidian
geochemical and technological signatures in the Owens Valley, California. Journal of Archaeological Science, 35,
668–680.
Eve, S., & Crema, E. R. (2014). A house with a view? Multi-model inference, visibility fields, and point process analy-
sis of a Bronze Age settlement on Leskernick Hill (Cornwall, UK). Journal of Archaeological Science, 43, 267–277.
Fortin, M.-J., & Dale, M. (2005). Spatial analysis: A guide for ecologists. Cambridge: Cambridge University Press.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (1998). Geographically weighted regression: A natural evolution
of the expansion method for spatial data analysis. Environment and Planning A, 30, 1905–1927.
Fotheringham, S. A., Brunsdon, C., & Charlton, M. (2000). Quantitative geography: Perspectives on spatial data analysis.
London: Sage Publications.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002). Geographically weighted regression: The analysis of spatially
varying relationships. Chichester: John Wiley & Sons.
Gaffney, C. F., & van Leusen, P. M. (1995). Postscript–GIS, environmental determinism and archaeology. In G. Lock &
Z. Stančić (Eds.), Archaeology and geographic information systems (pp. 367–382). London: Taylor and Francis.
Getis, A., & Aldstadt, J. (2010). Constructing the spatial weights matrix using a local statistic. Geographical Analysis,
36, 90–104.
Getis, A., & Franklin, J. (1987). Second-order neighborhood analysis of mapped point patterns. Ecology, 68, 473–477.
Getis, A., & Ord, J. K. (1992). The analysis of spatial association by use of distance statistics. Geographical Analysis,
24, 189–206.
Gil, A. F., Ugan, A., Otaola, C., Neme, G., Giardina, M., & Menéndez, L. (2016). Variation in camelid δ13C and δ15N
values in relation to geography and climate: Holocene patterns and archaeological implications in central western
Argentina. Journal of Archaeological Science, 66, 7–20.
Gkiasta, M., Russell, T., Shennan, S., & Steele, J. (2003). Neolithic transition in Europe: The radiocarbon record
revisited. Antiquity, 77, 45–62.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge: Cambridge University Press.
Jones, J. P., & Casetti, E. (1992). Applications of the expansion method. London: Routledge.
Jones, J. P., & Hanham, R. Q. (1995). Contingency, realism, and the expansion method. Geographical Analysis, 27,
185–207.
Kelly, R. L., Surovell, T. A., Shuman, B. N., & Smith, G. M. (2013). A continuous climatic impact on Holocene
human population in the Rocky Mountains. Proceedings of the National Academy of Sciences of the United States of
America, 110(2), 443–447.
Löwenborg, D. (2010). Using geographically weighted regression to predict site representativity. In B. Frischer,
J. W. Crawford, & D. Koller (Eds.), Making history interactive, proceedings of CAA conference, 37th annual meeting
(pp. 203–215), Williamsburg, VA and Oxford: Archaeopress.
Manning, K., Stopp, B., Colledge, S., Downey, S., Conolly, J., Dobney, K., & Shennan, S. (2013). Animal exploitation
in the early Neolithic of the Balkans and central Europe. In S. Colledge, J. Conolly, K. Dobney, K. Manning, &
S. Shennan (Eds.), The origins and spread of domestic animals in southwest Asia and Europe (pp. 237–252). Walnut
Creek, CA: Left Coast.
Martindale, A., Morlan, R., Betts, M., Blake, M., Gajewski, K., Chaput, M., . . . Vermeersch, P. (2016). Canadian
Archaeological Radiocarbon Database (CARD 2.1). Retrieved April 1, 2017, from www.canadianarchaeology.ca/
Menze, B. H., Ur, J. A., & Sherratt, A. G. (2006). Detection of ancient settlement mounds. Photogrammetric Engineer-
ing & Remote Sensing, 72, 321–327.
Openshaw, S., Charlton, M., Wymer, C., & Craft, A. (1987). A Mark 1 geographical analysis machine for the auto-
mated analysis of point data sets. International Journal of Geographical Information Systems, 1, 335–358.
Orton, C. (2004). Point pattern analysis revisited. Archeologia e Calcolatori, 15, 299–315.
Palmisano, A. (2013). Zooming patterns among the scales: A statistics technique to detect spatial patterns among
settlements. In G. Earl, T. Sly, A. P. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos, I. Romanowska, & D. Wheat-
ley (Eds.), CAA 2012: Proceedings of the 40th Annual Conference of Computer Applications and Quantitative Methods
in Archaeology (CAA) (pp. 348–356). Amsterdam: Amsterdam University Press.
Pélissier, R., & Goreaud, F. (2001). A practical approach to the study of spatial structure in simple cases of heteroge-
neous vegetation. Journal of Vegetation Science, 12, 99–108.
Pinhasi, R., Fort, J., & Ammerman, A. J. (2005). Tracing the origin and spread of agriculture in Europe. PLoS Biol-
ogy. doi:10.1371/journal.pbio.0030410
Premo, L. (2004). Local spatial autocorrelation statistics quantify multi-scale patterns in distributional data: An
example from the Maya Lowlands. Journal of Archaeological Science, 31, 855–866.
R Core Team. (2018). R: A language and environment for statistical computing: R Foundation for Statistical Computing.
Vienna, Austria. Retrieved from: www.R-project.org/
Rick, J. W. (1987). Dates as data: An examination of the Peruvian Preceramic radiocarbon record. American
Antiquity, 52, 55–73.
Ripley, B. D. (1976). The second-order analysis of stationary point processes. Journal of Applied Probability, 13,
255–266.
Riris, P. (2017). Towards an artefact’s-eye view: Non-site analysis of discard patterns and lithic technology in Neotrop-
ical settings with a case from Misiones province, Argentina. Journal of Archaeological Science: Reports, 11, 626–638.
Shennan, S., Downey, S. S., Timpson, A., Edinborough, K., Colledge, S., Kerig, T., . . . Thomas, M. G. (2013). Regional
population collapse followed initial agriculture booms in mid-Holocene Europe. Nature Communications, 4.
doi:10.1038/ncomms3486
Stolar, J., & Nielsen, S. E. (2015). Accounting for spatially biased sampling effort in presence-only species distribution
modelling. Diversity and Distributions, 21, 595–608.
Styring, A., Maier, U., Stephan, E., Schlichtherle, H., & Bogaard, A. (2016). Cultivation of choice: New insights into
farming practices at Neolithic lakeshore sites. Antiquity, 90, 95–110.
Syfert, M. M., Smith, M. J., & Coomes, D. A. (2013). The effects of sampling bias and model complexity on the
predictive performance of MaxEnt species distribution models. PLoS One. doi:10.1371/journal.pone.0055158
Timpson, A., Colledge, S., Crema, E., Edinborough, K., Kerig, T., Manning, K., . . . Shennan, S. (2014). Reconstruct-
ing regional population fluctuations in the European Neolithic using radiocarbon dates: A new case-study using
an improved method. Journal of Archaeological Science, 52, 549–557.
Warden, L., Moros, M., Neumann, T., Shennan, S., Timpson, A., Manning, K., . . . Damsté, J. S. S. (2017). Climate
induced human demographic and cultural change in northern Europe during the mid-Holocene. Scientific Reports,
7, 15251.
Williams, A. N. (2012). The use of summed radiocarbon probability distributions in archaeology: A review of meth-
ods. Journal of Archaeological Science, 39, 578–589.
10
Spatial fuzzy sets
Johanna Fusco and Cyril de Runz
Introduction
Archaeologists have, for many years, sought more and more data in order to accurately reflect and retrace
the spatial and temporal dynamics of past societies. We have now entered the era of ‘big data’ and are
becoming used to dealing with archaeological datasets that have grown “undigested” (Orton, 2010,
p. 4; see also Bevan, 2015; Cooper & Green, 2015; Green this volume). These datasets merge together
huge amounts of heterogeneous and fragmentary information from various sources, datasets and surveys
(Cooper & Green, 2015; McCoy, 2017), which are rarely at the same scale (Gattiglia, 2015) and do not
stem from the same interpretative and methodological frameworks (Cooper & Green, 2015; Roskams &
Whyman, 2007). Take, for example, dating, which is significantly affected by the inconsistency of chrono-
logical systems, methods and interpretations between surveys, even within one single archaeological site
(Kennedy & Hahn, 2017). This heterogeneity generates a high variability in data accuracy and reliability
within datasets, which impacts the quality and the reliability of models and analyses.
On the bright side, if this accumulation of heterogeneous data has done little to improve the accuracy of our
statements about past phenomena, it has contributed to highlighting the various aspects of imperfection
that exist within archaeological information. While successful attempts at unifying interpretative
schemes and making datasets comparable have been put forward (Kennedy & Hahn, 2017; Roskams &
Whyman, 2007), this awareness has fuelled an added sense of urgency to tackle imperfection and to adopt
more sensitive methodological and theoretical approaches to past spatiotemporal phenomena. These new
approaches prevent archaeologists from rushing into inappropriate interpretation, and from confound-
ing the complexity of past phenomena with the analytical biases and complexities created by our own
failure to both manage dataset size and imperfection and shift towards more flexible methodological and
ontological frameworks (Bevan, 2015; Brouwer Burg, 2017). If swept under the carpet, data imperfec-
tion spreads throughout analyses, results and interpretation. It then grows out of control, and prevents
us from assessing the validity of our conclusions or from directly comparing situations and phenomena.
The extreme opposite approach is to remove from the analysis any data that do not display the required
quality level (see also Gupta, this volume). However, this upward standardisation process often results in a
great loss of information which might potentially impact the statistical representativeness of the sample.
We would argue that a more profitable pathway emerges from thinking within imperfection, making it
intelligible not only at the data level but at every step of analysis and data treatment and even in our own
reasoning schemes.
According to Veregin (1989), Plewe (2002) and Fisher (2005), identifying and classifying sources of
error, and elaborating a deep theoretical basis on data imperfection and its consequences on models and
analyses, is an indispensable step for managing it correctly, whether it concerns the attributes of the con-
sidered object, or its spatial and temporal dimensions. While an extensive literature on spatial and geo-
historical data imprecision exists (see Longley, Goodchild, Maguire, & Rhind, 2005; Plewe, 2002; Zhang &
Goodchild, 2002 for extensive discussion and review), Peter Fisher’s classification of data imperfection
(Fisher, 2005) is frequently considered one of the clearest and most useful. Fisher breaks down the
various dimensions of what we often call ‘imperfection’ or ‘uncertainty’ in a broad sense, into the specific
concepts of vagueness, incompleteness, uncertainty and ambiguity amongst other terms, in order to facilitate
their detection within datasets, and to anticipate their potential consequences for analyses. We propose to
synthesize this classification here, focusing on its adaptation to the typical imperfections of archaeological
data (for further discussion of the following classification see Fisher, 2005; Fisher, Comber, & Wad-
sworth, 2006; Longley, Goodchild, Maguire, & Rhind, 2005; Plewe, 2002; de Runz, Desjardin, Piantoni, &
Herbin 2011 in the specific context of archaeological data).
What we perceive about past human activities is limited to the materials that cross the ages to reach us,
and obviously to the areas investigated and methods used. This “lack of evidence” (Plewe, 2002, p. 11) is
referred to here as incompleteness; it prevents archaeological objects, larger structures or even regions and
time periods from being completely described, and prevents us from perceiving their functioning at different scales.
Our knowledge of the spatial, temporal and functional aspects of an archaeological object might be
questioned by the reliability of the source, by measuring or processing errors, or by our own interpreta-
tion and classification of the object. All of these cause uncertainty in archaeological datasets. For example,
the spatial (location) or temporal (period of time) measure of an archaeological object may simply be
wrong, an error may occur in the coding of these attributes, or the object may be attributed to the wrong
class in error. The validity of the information and knowledge concerning the object is thus uncertain.
Imprecision occurs when the boundaries of the categories used to classify archaeological objects are
inaccurately defined. This imprecision can be qualified as vagueness when we use subjective knowledge or
inaccurate measuring instruments. Most archaeological categories are inherently vague: relative chronolo-
gies (see Crema, 2012; Desachy, 2012; Kennedy & Hahn, 2017; Niccolucci & Hermon, 2015 for extensive
discussion on time classification and chronological inconsistency); typological classifications (Hermon &
Niccolucci, 2002); or even the use of what Zadeh (1975) calls “linguistic variables”, i.e. “variables whose
values are not numbers but words or sentences in a natural or artificial language” (Zadeh, 1975, p. 3).
These variables are mostly used in predictive modelling, to refer to ‘high’, ‘medium’ and ‘low’ potentialities
of finding settlement, or ‘preferred’, ‘indifferent’ and ‘avoided’ areas (Balla, Pavlogeorgatos, Tsiafakis, & Pavlidis,
2013; Jaroslaw & Hildebrandt-Radke, 2009; Vaughn & Crawford, 2009). Extending that logic brings us
to question several categories that might seem so common, so trivial, we often actually forget they are
categories and take them for granted:
When, exactly, is a house a house; a settlement, a settlement; a city, a city; a podsol, a podsol; an oak
woodland, an oak woodland? The questions always revolve around the threshold value of some
measurable parameter or the opinion of some individual, expert or otherwise.
(Fisher, 2005, p. 7)
Ambiguity intervenes when there is doubt in the definition of an archaeological object or a phenom-
enon, i.e. when they might belong to several categories or scales, or when their description is subject to
opposing interpretations. Ambiguity is thus a specific form of imperfection combining uncertainty and
imprecision. It is typical of urban archaeology, which is characterized by an intense superposition and
frequent reutilization of remains (de Runz et al., 2011).
Even though the ‘classic’ probabilistic framework developed in predictive archaeology and chronological
construction – notably the aoristic approach (Crema, Bevan, & Lake, 2010; Johnson, 2004) and
Bayesian networks (Buck, 2004; Lanos & Philippe, 2015; Litton & Buck, 1995) – has been investigated
by many in archaeology (Crema, 2012), it is not well suited to modelling imprecision.
Indeed, because probabilities have a frequency interpretation, they are not appropriate for modelling imprecision.
For instance, a 35-year-old man may be partially considered young – at least under a broad
definition of ‘young’ – even though he is older than a 20-year-old man. If one were to model
the concept ‘young’ with probabilities and assign a probability of 0.75 to the partially young 35-year-old
man, this would mean that 75% of 35-year-olds are considered young and the
other 25% are not, which makes no sense. By introducing fuzzy logic, Zadeh proposed an alternative to
probabilities that quantifies, with a membership value, the degree to which a domain value characterizes the
concept associated with it. With fuzzy logic, giving the 35-year-old man a membership value of 0.75
in the concept ‘young’ means that he is young, but not fully young. Niccolucci and Hermon (2015)
extensively discuss this argument and show its relevance in the field of archaeology. They argue that
fuzziness is about an imprecise concept; probability is about the unknown condition of a precise
concept. (. . .) a probabilistic model assumes that something is either true or false, and we do not
know which is the case, for various reasons: incomplete information about the past, or because the
condition concerns the future, etc. A fuzzy model concerns instead those situations in which ask-
ing if something is true or false is meaningless, because there are various degrees of being true (and
false), even concerning cases which can be thoroughly inspected.
(Niccolucci & Hermon, 2015, p. 69).
Such cases are numerous in archaeology, whether they concern dating, the prediction of archaeological
site location or the classification of archaeological artefacts. Thus, several authors have recently adopted
fuzzy logic, and consider it a framework better adapted to tackling archaeological imperfection (Hatziniko-
laou, 2006; Niccolucci & Hermon, 2004; Niccolucci & Hermon, 2015). Hatzinikolaou, Hatzichristos,
Siolas, and Mantzourani (2003) investigated fuzzy logic potential for predictive archaeology and applied
it on Melos Island, Greece, in order to estimate the degree to which archaeological objects belonged to
a set of functional and cultural categories. Banerjee, Srivastava, Pike, and Petropoulos (2018) also used
it for predictive archaeology to identify possible rock art sites in Central India, while Balla et al. (2012)
exploited it for finding Macedonian tombs in Northern Greece. Taking a similar approach, Hermon and
Niccolucci (2002, 2003) have built fuzzy typologies in order to define archaeological artefacts through
object classes with indefinite boundaries, and have also applied this approach to temporal classification
(Farinetti, Hermon, & Niccolucci, 2004; Niccolucci & Hermon, 2015), while Niccolucci, D’Andrea, and
Crescioli (2001) use it to assign a reliability coefficient to imprecise attributes of statistical data deriving
from archaeometry. As the flexibility inherent to fuzzy logic makes it easily combinable with other meth-
ods, Machàlek, Cimler, Olševičová, and Danielisová (2013) associated it with agent-based modelling to
describe patterns of agricultural use in Iron Age Europe, and Baxter (2009) used it in combination with
cluster analysis. While fuzzy logic is, as we have seen, increasingly used in archaeological data analysis,
Niccolucci and Hermon (2017) have made a compelling case for its upstream use, embedding it from the
documentation phase onwards, in order to include reliability assessments from the very beginning of the
data collection and gathering process.
Method
As argued above, imprecision should be considered in the modelling of information and as shown by the
Sorites paradox,1 probabilistic modelling is not well adapted to tackle imprecision. Zadeh (1965, 1978)
thus introduced fuzzy set theory, which defines the notion of partial and valued membership of a value to
a class. A fuzzy set A (also called a Type-1 fuzzy set) is characterized by a membership function µA
taking values in [0, 1]. For each domain value x, a membership degree µA(x) in [0, 1] is defined.
Therefore, concepts like ‘young’ (see Figure 10.1), ‘old’, etc. can easily be modelled by fuzzy sets. If the
membership degree is equal to 1, then the domain value fully belongs to the concept; if the member-
ship degree is equal to 0, then the domain value does not belong to it. Finally, if the membership degree
falls somewhere in between, then the domain value partially belongs to it. So, for instance, according to
Figure 10.1, a 2-year-old is young, a 50-year-old is not young, and a 30-year-old is neither fully young
nor fully not young, but somewhere in between.
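This can be sketched as a simple piecewise-linear membership function. The breakpoints of 25 and 45 years below are our own illustrative assumptions, not the values used in Figure 10.1:

```python
def mu_young(age, full_until=25.0, zero_from=45.0):
    """Membership of `age` in the fuzzy set 'young'.

    Fully young up to `full_until`, not young at all from `zero_from`,
    with a linear decrease in between (illustrative breakpoints only).
    """
    if age <= full_until:
        return 1.0
    if age >= zero_from:
        return 0.0
    # linear decrease between the two breakpoints
    return (zero_from - age) / (zero_from - full_until)

print(mu_young(2))   # 1.0  -> fully young
print(mu_young(50))  # 0.0  -> not young
print(mu_young(30))  # 0.75 -> partially young
```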
Two critically important concepts in fuzzy set theory are connectivity and the notion of the Alpha-cut
(α-cut). An α-cut Aα, for all α > 0, is the set of domain values (the set of x) having a membership
value higher than or equal to α (µA(x) ≥ α); see Figure 10.2. By convention, A0 is the set of x such that µA(x) >
0 and is also called the support of the fuzzy set A; A1 is the core. If A1 is not empty, then the fuzzy set A is
referred to as normalized.
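For a fuzzy set stored as a discrete value-to-membership mapping, the α-cut, support and core can be computed directly. A minimal sketch, using a hypothetical set A:

```python
def alpha_cut(fuzzy_set, alpha):
    """Alpha-cut of a discrete fuzzy set given as {value: membership}."""
    if alpha > 0:
        return {x for x, mu in fuzzy_set.items() if mu >= alpha}
    # by convention the 0-cut (the support) uses a strict inequality
    return {x for x, mu in fuzzy_set.items() if mu > 0}

A = {10: 0.0, 20: 0.4, 30: 1.0, 40: 0.7, 50: 0.0}
support = alpha_cut(A, 0)     # {20, 30, 40}: values with membership > 0
core = alpha_cut(A, 1)        # {30}: values with full membership
normalized = len(core) > 0    # True: the core is not empty, so A is normalized
```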
In a spatial context, fuzzy geometries refer to ‘fuzzy polygons’, ‘fuzzy lines’ and ‘fuzzy points’, cor-
responding to spatial objects whose boundaries cannot be determined accurately (Figure 10.3). In order
to obtain consistent fuzzy sets for these spatial shapes, the second important concept to be considered is
convexity for 1-dimensional fuzzy sets, and connectivity for fuzzy sets defined on higher dimensions.
By definition, a fuzzy set A is connected if, and only if, for all α in [0, 1], Aα is connected; that is,
each pair (x, y) of domain values of Aα can be joined by a path included in Aα, meaning that Aα is not composed of separate sets
Figure 10.1 Illustration of a possible fuzzy definition of the concept ‘young’ for humans.
Figure 10.2 Interpretation examples of membership values, and illustration of the α-cut concept in the case
where the fuzzy set is representing a possibility distribution. For instance, the domain value subset [v9, v10] is
the core (the α-cut A1) of the fuzzy set and means very possible, while [min, max] is the support (the α-cut
A0) and means almost impossible. Any domain values outside [min, max] are impossible.
Source: after Zoghlami, de Runz, and Akdag (2016), Figure 5
(see Figure 10.4). In order to compute information from fuzzy sets, it is important (but not compulsory)
to deal with connected (or convex) normalized fuzzy sets. Connectivity allows the modelling of simple
geographic shapes (fuzzy points, fuzzy lines, fuzzy polygons), and the composition of connected fuzzy sets
allows the definition of more complex shapes.
There are several ways to combine fuzzy sets: arithmetic operations are possible (see Zadeh, 1965), as
are logical operations. The main approach is to use t-norms (for AND) and t-conorms (for OR).
The probabilistic t-norm is the product of the two membership values, and the probabilistic t-conorm is their sum
minus their product. Zadeh (1965) proposed using their minimum for the
t-norm, and their maximum for the t-conorm. For example, let A and B be two fuzzy sets with µA and µB
their associated membership functions.
Probabilistic:
t-norm: µA AND B(x) = µA(x) × µB(x)
t-conorm: µA OR B(x) = µA(x) + µB(x) − µA(x) × µB(x)
Zadeh:
t-norm: µA AND B(x) = min(µA(x), µB(x))
t-conorm: µA OR B(x) = max(µA(x), µB(x))
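These four operators can be written down directly. A minimal sketch, with arbitrary membership values for a single domain value x:

```python
def prob_and(a, b):
    """Probabilistic t-norm: product of the two memberships."""
    return a * b

def prob_or(a, b):
    """Probabilistic t-conorm: sum minus product."""
    return a + b - a * b

def zadeh_and(a, b):
    """Zadeh t-norm: minimum of the two memberships."""
    return min(a, b)

def zadeh_or(a, b):
    """Zadeh t-conorm: maximum of the two memberships."""
    return max(a, b)

mu_A, mu_B = 0.5, 0.25        # memberships of x in A and in B
print(prob_and(mu_A, mu_B))   # 0.125
print(prob_or(mu_A, mu_B))    # 0.625
print(zadeh_and(mu_A, mu_B))  # 0.25
print(zadeh_or(mu_A, mu_B))   # 0.5
```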
Figure 10.3 Illustration of a fuzzy wooded area. Experts determine 3 main areas: area 1, where the concept of
the wooded area is plainly respected (the α-cut A1); area 2, which encloses area 1, where the concept is partially
respected (the α-cut A0.5); and area 3, which encloses the two previous areas, representing the limits for which
the concept can at least partially be defined (the α-cut A0). Considering several α-cuts (at least 2, A1 and A0),
the membership degree of each domain value can be obtained: it is at least the highest degree of the α-cuts it
belongs to and can potentially be obtained by spatial interpolation as well.
Source: after Zoghlami, de Runz, and Akdag (2016), Figure 3
However, using this definition, each value in the domain set receives a single value (its
membership degree), which may thus be considered precise. According to Mendel (2003), it may be
paradoxical to consider that a precise value (a membership degree) may represent imprecision. As a result,
Type-2 fuzzy sets (Zadeh, 1975) have been introduced as an extension of Type-1 fuzzy sets, in order to
define values of membership that are themselves fuzzy. Thus, at each value of the primary variable, here
for example the number of sites, the membership degree is a function (e.g. an interval, as far as Interval
Type-2 fuzzy sets are concerned), and not just a point value. In this elaborated approach the membership
function is blurred, and becomes a surface representing the “footprint of uncertainty” (Mendel & Bob
John, 2002).
Case studies
Here we present two examples, the first based on data modelling and the second on data analysis.
The first example uses Type-1 fuzzy set modelling, while the second also employs Type-2 modelling
and discusses the issue of an intermediary scale of analysis.
issue (Allen, 1983). This is made more complex in a fuzzy context because there is no precise and abso-
lute definition of anteriority or posteriority between fuzzy periods or dates. The proposition, introduced
by de Runz, Desjardin, Piantoni, and Herbin (2010), is to moderate the decision, i.e. anterior or not, by
assigning it a confidence index. For that purpose, they defined an index, which evaluates the anteriority
between two fuzzy periods using the areas of non-overlap (a, b, c and d) of the fuzzy periods F and G, as
presented in Figure 10.5 and defined as follows:
Ant(F, G) = (b + c) / (a + b + c + d)
Considering the non-common areas between µF and µG, this index defines the anteriority between
F and G by the ratio of the areas that validate the anteriority hypothesis over the total of non-common
areas.
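Given the four non-overlap areas, the index itself is a one-line computation. A minimal sketch; the area values passed in below are hypothetical, and in practice a, b, c and d would be measured from the membership functions as in Figure 10.5:

```python
def anteriority(a, b, c, d):
    """Confidence index Ant(F, G) = (b + c) / (a + b + c + d).

    a, b, c, d are the areas of non-overlap between the membership
    functions of the fuzzy periods F and G; b and c are the areas
    that support the hypothesis 'F is anterior to G'.
    """
    total = a + b + c + d
    return (b + c) / total if total > 0 else 0.0

# hypothetical areas measured between two fuzzy periods
print(anteriority(1, 5, 3, 1))  # 0.8: strong confidence in the anteriority
```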
A sample output is shown in Figure 10.6 for the Reims (France) database on Roman street excavations
(F-BDRues).
Building upon this idea of fuzzy data selection, Zoghlami, de Runz, Pargny, Desjardin, and Akdag
(2012) present some examples of spatiotemporal queries to find entities (e.g. Roman sites, streets, walls)
within the FGISSAR database that respect soft constraints. For instance, a user may want to
select archaeological entities satisfying the two following constraints:
• Their activity is in a specific period (e.g. the 2nd century) with a user-defined membership degree
of at least 0.4;
• Their shape belongs to a specific site (e.g. “PC 88”) with at least a membership degree of 0.8.
In the proposed system, the Zadeh t-norm is used for the logical operator AND. The principle used to
model the BELONG operator is simple too. First, for each entity x, the algorithm determines the set of
Figure 10.5 Membership functions of two fuzzy periods/dates (f and g); a, b, c and d are the areas of
non-overlap.
Figure 10.6 Visualization of Roman streets in Reims that were anterior to the period “around 200 AD”
according to the confidence we have in the results.
coupled α-cuts from x and PC88 where the α-cut from x is included in the α-cut of PC88. Then, the
degree of each couple is computed according to the Zadeh t-norm (minimum between the two degrees).
After that, the final degree of the spatial relation is obtained by taking the maximum of the α-cut couple
degrees. The visualization of this query, which combines both spatial and temporal imperfections is
illustrated in Figure 10.7.
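The BELONG computation described above can be sketched with α-cuts represented as sets of raster cells, a crude stand-in for real fuzzy polygons. The toy shapes and cell numbering below are our own hypothetical example, not data from FGISSAR:

```python
def belong_degree(entity_cuts, site_cuts):
    """Degree to which an entity's fuzzy shape BELONGs to a fuzzy site.

    Both shapes are given as {alpha: set_of_cells}. Following the
    principle described above: keep the couples of alpha-cuts where
    the entity's cut is included in the site's cut, score each couple
    with the Zadeh t-norm (min), and return the maximum over couples.
    """
    degrees = [
        min(a_e, a_s)
        for a_e, cut_e in entity_cuts.items()
        for a_s, cut_s in site_cuts.items()
        if cut_e <= cut_s            # set inclusion of the alpha-cuts
    ]
    return max(degrees, default=0.0)

# hypothetical toy shapes on a 1D grid of cells
entity = {1.0: {4, 5}, 0.5: {3, 4, 5, 6}}
site = {1.0: {5, 6}, 0.8: {3, 4, 5, 6}, 0.2: set(range(1, 10))}
print(belong_degree(entity, site))  # 0.8
```

Here the entity's core is included in the site's 0.8-cut, so the best-scoring couple gives min(1.0, 0.8) = 0.8.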
In de Runz, Desjardin, Piantoni, and Herbin (2014), a more complex fuzzy spatiotemporal query is
performed in order to extract the configuration of Roman streets in Reims during different fuzzy time
periods. We take as an example the 3rd century AD (Figure 10.8). The black lines represent the possible
layout of Roman streets and therefore inform us about the Roman road network in Reims during the
3rd century AD. To compute it, an adaptation of the Hough Transform, a pattern recognition
method, was introduced that considers separately the fuzzy period, the fuzzy shape and the fuzzy
orientation of each object in the database. For more details, please refer to de Runz et al. (2014).
This case study presents a simple application context for the use of a GIS in which fuzzy queries
allow data to be represented and selected or, in the last case, new layers to be produced. The associated databases
are built to handle archaeological data while taking their fuzziness into account.
“A bundle of past possibilities”: modelling spatiotemporal structures of past settlement with fuzzy
logic. Application to Syrian Arid Margins during the Bronze Age (3600–1200 BC)
This case study proposes an exploratory method to model the spatiotemporal structures and dynamics
of settlements described by imperfect archaeological data (Fusco, 2015, 2016). It is based on the Bronze
Figure 10.7 Visualization of entities which have an activity dated to the “Middle of the 2nd century” and
which belong to site PC 88.
Source: after Zoghlami, de Runz, and Akdag (2016), Figure 29
Age Syrian Fertile Crescent’s “Arid Margins”,2 an area considered as transitional between the steppe and
the desert that was occupied by people from the Neolithic period through to the present day (Geyer,
2011), and which is described by highly imperfect data (Figure 10.9) for three particular reasons. First,
several zones within this area have been poorly surveyed by archaeologists or not surveyed at all, leading
to data incompleteness (designated under ‘intensity of archaeological survey’ on Figure 10.9). Therefore,
while detailed systematic survey zones characterized by an absence of sites can be considered ‘unoccupied’
during the Bronze Age, the voids in unsurveyed or Google Earth survey zones remain ‘ambiguous’. Second,
many sites display functional uncertainty, even in the well-surveyed zones, which means we cannot tell for
sure if they were habitation sites or not. ‘Reliable sites’ are the ones whose habitat function is unambigu-
ous for archaeologists and are displayed with full-colours in Figure 10.9. ‘Unreliable sites’ are the ones
whose habitat function is more ambiguous, and these are displayed using hatching. Third, the dating of
several Early Bronze Age sites is imprecise. Most of them are assigned to the last part of the sub-period
Early Bronze Age IV, from 2500 to 2000 BC (displayed in bright red on Figure 10.9), while some others
could not be dated as precisely. These undetermined sites have been assigned to the whole Early Bronze
Age period, from 3600 to 2000 BC (displayed in pale red on Figure 10.9).
This approach, presented by Fusco (2016), combines the theoretical frameworks of exploratory spatial
data analysis and fuzzy sets in a methodological chain (Figure 10.10) whose purpose is twofold: first, to
detect and model settlement spatiotemporal structures and dynamics in well-surveyed zones in order
to describe and partly explain site patterning, while taking into account the impact of data quality on
our results (steps 1 to 3 in Figure 10.10). Second, to use the information obtained about settlement
Figure 10.8 Simulated map of Reims’ streets during the 3rd century AD generated from the fuzzy spatio
temporal data stored in FGISSAR (Fuzzy Geographic Information System for Spatial Analysis in aRchaeology)
with an adaptation of a pattern recognition method (the Hough Transform). The darker the object, the higher
the possibility of its presence during the 3rd century AD.
spatiotemporal structures in well-surveyed zones to make estimates and assumptions about potential
settlement location in unsurveyed areas (step 4 in Figure 10.10).
Archaeological sites are not a random set of points; their location and pattern reflect specific land use
and management logics whose deciphering is fundamental to understand the functioning of past settle-
ments and to model the potential of unsurveyed zones for hosting settlement. The first step of Fusco’s
methodology presented in Figure 10.10 serves to reveal the diversity of spatial patterns, while estimating
the impact of data imperfection on the results. The local spatial autocorrelation measurement (Anselin,
1995) has been chosen in order to detect spatial clusters and outliers and to reveal homogeneous or more
atypical sub-areas in the location and evolution of archaeological sites. The Local Indicators of Spatial
Association3 (LISA), based on the Local Moran statistics (Anselin, 1995), have the ability to reveal the
spatial patterns created by local spatial autocorrelation, under the form of ‘spatial clusters’ and ‘spatial
outliers’ (Crema, this volume).
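The Local Moran statistic underlying LISA can be sketched in a few lines. This is our own minimal sketch of the statistic, not Fusco's implementation; a full analysis would also derive pseudo p-values by permutation to label significant clusters and outliers, and the toy counts and weights below are hypothetical:

```python
def local_moran(values, weights):
    """Local Moran's I_i (Anselin, 1995) for each observation.

    values:  observed values, e.g. site counts per grid cell
    weights: spatial weights matrix, ideally row-standardised
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]        # deviations from the mean
    m2 = sum(zi * zi for zi in z) / n     # second moment (denominator)
    return [
        (z[i] / m2) * sum(weights[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]

# four cells in a line, rook contiguity, row-standardised weights
counts = [10, 9, 1, 2]
w = [[0, 1, 0, 0],
     [0.5, 0, 0.5, 0],
     [0, 0.5, 0, 0.5],
     [0, 0, 1, 0]]
lisa = local_moran(counts, w)
# lisa[0] > 0: a high value next to a high value (part of a 'high-high' cluster)
# lisa[3] > 0: a low value next to a low value ('low-low' cluster)
```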
Figure 10.9 Spatiotemporal trajectories and imperfection of archaeological data in Syrian Arid Margins dur-
ing the Bronze Age. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figure 77
In order to evaluate the impact of data imperfection on spatial structures, analyses were first carried
out on reliable sites only, i.e. sites which are known to be habitat sites (mentioned as ‘sites with reliable
functional information’ on Figure 10.11), and then with the whole dataset, including reliable and unreli-
able archaeological sites (mentioned as ‘all sites’ on Figure 10.11). The impact of temporal accuracy on
spatial structures is also examined: the two upper grids on Figure 10.11 show the LISA results for the
Early Bronze Age sites dated more accurately (‘Early Bronze Age IV’), while the two grids below show
all the sites broadly related to the Early Bronze Age. The four LISA results on Figure 10.11 thus present
Figure 10.11 Detecting local spatial configurations from reliable and unreliable archaeological sites with Local
Indicators of Spatial Association.
Source: after Fusco, 2016, Figures 89–91.
four possible local spatiotemporal configurations in the form of ‘spatial clusters’ and ‘spatial outliers’,
following the considered levels of temporal and functional accuracy and reliability (see Fusco (2016) for
further information on the methodology).
The second step of the methodology involves finding out the location parameters which may have
influenced these spatial patterns. A strong correlation between the location of waterways4 and Early
Bronze Age sites has been shown and this variable has thus been chosen as an example to model Early
Bronze Age settlement potential in the Arid Margins area. Whilst the overall method will be presented
and illustrated here for only a small part of the area, the final results will be presented for the study area
as a whole.
The approach carried out in the third step implies starting from what we know, i.e. archaeological site
locations and shapes, and their relationship to waterways, to infer what we want to know, i.e. the high,
medium and low potential of attracting settlement in the Arid Margins (step 4). These levels of potential
are assessed by the proximity between sites and waterways: the smaller the distance between them, the
higher the attractiveness potential. The general postulate stemming from the preliminary statistical and
morphological analyses stated above is that the most ‘attractive’ zones for settlement will be located close
to the waterways, and that this attractiveness will decrease as the distance increases. Several ‘sub-spaces’
have thus been identified in the studied zone: each of them representing remoteness from waterways (Fig-
ure 10.12). The objective is to model the relationship between density and distance of sites to waterways
in order to determine systematically what characterizes a sub-space’s ‘high’, ‘medium’ or ‘low’ potential of
Figure 10.12 Delineation of ‘sub-spaces’ from waterways demonstrated on a small area of the Arid Margins
(after Fusco, 2016, Figure 110).
attracting settlement. This model is calibrated with the characteristics of surveyed zones (step 3) which
is then interpolated in the unsurveyed parts of each sub-space (step 4).
The variation of attractiveness throughout the studied area is modelled through the variation of sites
density between each sub-space. The attractiveness model is thus computed as follows:
x′ = (t′ × 100) / (t / n)
where
x′ is the over- or under-representation of the considered sub-space in terms of sites with respect to the average
number of sites per sub-space in the studied area;
t′ is the number of sites in the considered sub-space;
n is the total number of sub-spaces in the area; and
t is the total number of sites in the area.
In other words, a sub-space which is over-represented in terms of sites (i.e. a sub-space whose site density
is higher than the average number of sites in each subspace for the studied area) is considered as more
attractive than the others, and vice versa. Each sub-space is thus given a percentage expressing its over-,
under- or average representation in terms of sites.
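The index can be sketched directly from the formula; the counts below are hypothetical:

```python
def representation(t_prime, t, n):
    """x' = (t' * 100) / (t / n): the over- or under-representation of a
    sub-space holding t_prime sites, given t sites in total spread over
    n sub-spaces (so t / n is the average number of sites per sub-space).
    """
    return (t_prime * 100) / (t / n)

# hypothetical counts: 260 sites over 10 sub-spaces -> average of 26
print(representation(26, 260, 10))  # 100.0: exactly average
print(representation(13, 260, 10))  # 50.0: under-represented
print(representation(39, 260, 10))  # 150.0: over-represented
```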
The next task consists of classifying each sub-space into a ‘high’, ‘medium’ or ‘low’ attractiveness cat-
egory according to its representation in terms of sites. To do so the thresholds of each class need to be
fixed, i.e. to determine the site percentage which justifies the passage from one category to another. However,
while a sub-space containing 1% of sites can easily be considered as having a ‘low’ potential of attracting
settlement, distinguishing between ‘low’ and ‘medium’ potential is more ambiguous when the site
percentage is around 80% or 90%, for example. The ability of fuzzy logic to handle “linguistic variables”
(Zadeh, 1975, p. 3) is thus used during the modelling phase.
After testing various possibilities of threshold values, it appeared that ‘fuzzifying’ site density between
65%, 90%, 110% and 135% of the average was the most relevant option (Figure 10.13).
Figure 10.13 Fuzzy sets framework set up to estimate each sub-space’s potential of attracting settlement
throughout the studied area (after Fusco, 2016, Figure 122).
The framework represented in Figure 10.13 literally means that a sub-space containing a number of
sites representing 0 to 65%, 90% to 110% or more than 135% of the average will be considered as having,
respectively, ‘low’, ‘medium’ or ‘high’ potential of attracting settlement. The two ‘vague’ zones fall
between 65% and 90%, and 110% and 135%, where the definition of ‘high’, ‘medium’ and ‘low’ is more
uncertain. As a consequence, the potential of attracting settlement of a sub-space which contains a site
density representing 70% of the average will be considered as ‘low’ with a membership function – or to
put it differently, a ‘possibility degree’ – of 0.8, and ‘medium’ with a membership function of 0.2 (the
dotted line on Figure 10.13).
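The fuzzification step can be sketched with three piecewise-linear memberships built on the thresholds of 65%, 90%, 110% and 135% given above. A minimal sketch of one possible shape for the sets in Figure 10.13:

```python
def ramp(x, lo, hi, rising):
    """Linear transition between lo and hi, flat outside them."""
    if x <= lo:
        t = 0.0
    elif x >= hi:
        t = 1.0
    else:
        t = (x - lo) / (hi - lo)
    return t if rising else 1.0 - t

def attractiveness(pct):
    """Memberships of a site-density percentage in 'low', 'medium'
    and 'high', using the 65/90/110/135 thresholds from the text."""
    return {
        "low": ramp(pct, 65, 90, rising=False),
        "medium": min(ramp(pct, 65, 90, rising=True),
                      ramp(pct, 110, 135, rising=False)),
        "high": ramp(pct, 110, 135, rising=True),
    }

# reproduces the worked example from the text: a density of 70% of the
# average is 'low' with degree 0.8 and 'medium' with degree 0.2
print(attractiveness(70))  # {'low': 0.8, 'medium': 0.2, 'high': 0.0}
```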
Comparing each sub-space’s number of sites to the average, and positioning these values on the fuzzy sets
framework presented in Figure 10.13, then makes it possible to place each sub-space in one or more of the ‘high’,
‘medium’ or ‘low’ categories (Figure 10.14). These results can then be mapped (Figure 10.15).
The numbers in red calibrate the fuzzy sets presented in Figure 10.13 (i.e. as the average number
of sites is 26, 65% of the average equals 17, 90% of the average equals 23, and so on). The numbers in
black correspond to the observed site density in each sub-space. Placing these observed site densities in
the calibrated fuzzy sets framework enables us to determine which categories each sub-space belongs to.
We know, however, that the Arid Margins have not been completely surveyed and that survey inten-
sity plays a role in our perception of Arid Margins settlement. The method and the resulting models,
therefore, have been calibrated with data from the well-surveyed areas on the first three steps of the
methodology. In the fourth step of this methodology, each subset, and its attractiveness or site location
potential, has been categorised depending on how intensely it has been surveyed (Figure 10.16). What is
Figure 10.14 Application of the fuzzy sets framework to each sub-space. A colour version of this figure can
be found in the plates section.
Source: after Fusco, 2016, Figure 122
Figure 10.15 Mapping each sub-space’s fuzzy sets category as in Figure 10.14.
Source: after Fusco, 2016, Figure 110
called ‘attractiveness’ thus refers to the ‘known zones’, where an absence of sites is more likely to represent
absence of Bronze Age settlement rather than absence of information. In poorly surveyed or unsurveyed areas,
called ‘grey zones’, it makes more sense to talk about a ‘potential of unknown site location’, that is to say,
the possibility of finding settlement, rather than about ‘attractiveness’.
Figure 10.16 Mapping the attractiveness of and possibilities of finding settlements at the sub-spaces using
fuzzy set estimates and survey intensity levels. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figure 126
Type-1 fuzzy sets have laid the basis for a relevant method of modelling uncertainty, vagueness and
imprecision (John & Coupland, 2007). But applying this concept in a highly uncertain context faces a
paradox: if we cannot determine the precise value of a certain quantity, how could we determine its exact
membership grade in a fuzzy set (Karnik & Mendel, 1998; Mendel, 2003)? As discussed earlier, in order
to address this problem, Zadeh (1975) introduced type-2 fuzzy sets, where membership grades are no
longer crisp values but fuzzy membership functions, i.e. the membership value for each element of this
set is itself a fuzzy number in [0,1].
This paradox clearly interferes with our approach, as our fuzzy sets calibration is entirely based on the
distribution of data, i.e. habitat sites, in each sub-space of our study area (Figures 10.13, 10.14). However,
as stated above (Figure 10.9), part of our data displays what we called ‘functional uncertainty’, i.e. the
impossibility of knowing whether the sites concerned were habitat sites or not. As a consequence, the choice to
consider only reliable data or to take the whole dataset into account considerably changes the number of
sites in each sub-space, the average on which fuzzy set calibration is based, and, as a result, the potential for
attracting settlement attributed to each sub-space. Because they cannot encompass all the dimensions
of our data’s uncertainty in our models, type-1 fuzzy sets considerably limit the robustness of our results.
The ‘higher order vagueness’ (Fisher & Arnot, 2006, p. 1) introduced by type-2 fuzzy sets is the addi-
tional uncertainty ‘layer’ we need in order to deal with our data properly. Indeed, we may consider that
the actual number of sites in each sub-space is not a crisp value, but an interval between the minimum
possible number of sites (i.e. the number of reliable habitat sites) and the maximum possible number of
sites (i.e. all the sites found in the considered sub-space, reliable and unreliable).
Figure 10.17 shows that type-2 fuzzy sets are defined by the boundaries of type-1 fuzzy sets for reli-
able (the lower boundary of the type-2 fuzzy set) and unreliable sites (the upper boundary of the type-2
fuzzy set). The intervals defining the membership of each fuzzy set thus represent our higher and lower
degree of knowledge.
For example, if we consider only reliable sites, a sub-space containing 29 sites will be considered as
“medium attractive” with a membership degree of 1 (upper graph of Figure 10.17). However, if we take
the totality of sites, a sub-space containing 29 sites will be considered as “medium attractive” and “low
attractive” with membership degrees of respectively 0.8 and 0.2 (middle graph of Figure 10.17). From
a type-2 fuzzy sets perspective (lower graph of Figure 10.17), this sub-space will be thus considered as
“medium attractive” with a membership degree of [0.8;1], and “low attractive” with a membership
degree of [0;0.2]. However, this example relates to sub-spaces that do not contain any unreliable sites:
the number of sites remains 29 whether we deal with reliable sites or the totality of sites; the attractiveness
categories and membership degrees of this sub-space change because the composition of its
surroundings (i.e. the number of unreliable sites in other sub-spaces), and thus the resulting fuzzy sets
framework, changes.
An additional complexity in these type-2 fuzzy sets models emerges from the fact that each sub-space
is composed of both reliable and unreliable archaeological sites. Thus, each sub-space’s site number
is an interval between the minimum possible number of sites (reliable sites only) and the maximum
possible number of sites (totality of sites). In this way, the real number of sites contained in sub-space C
lies somewhere within the interval [10;22]. We may then picture sub-space C as a slider
between 10 and 22, whose raising or lowering changes its attractiveness category and the associated
membership degree. Depending on their needs, the user may choose to keep these nested intervals (an interval
on the number of sites, and an interval on the type-2 fuzzy membership degrees), or choose one reference
point in the site-number interval in order to represent each sub-space’s site number by a crisp value (mean,
maximum, minimum . . .). To facilitate attractiveness degree mapping, and as an example, we
chose to keep the maximum number of sites (i.e. the totality of sites). Sub-space C will then be
represented by its maximum possible number of sites (i.e. 22 sites) in our type-2 fuzzy sets mapping,
which makes it “medium attractive” with a membership degree of [0;0.85] and “low attractive” with
a membership degree of [0.15;1].
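The interval construction in this worked example can be sketched in code. This is a minimal illustration, not the authors’ calibration: the triangular membership functions and the category breakpoints below are hypothetical, standing in for the two type-1 calibrations (reliable sites only, and the totality of sites) whose envelope defines the type-2 interval.

```python
# Sketch of interval type-2 membership degrees for a sub-space's site count.
# Category breakpoints are hypothetical; in the case study they are calibrated
# from the distribution of habitat sites in the well-surveyed zones.

def tri(x, a, b, c):
    """Triangular type-1 membership function rising from a to b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Lower-bound calibration (reliable sites only) and upper-bound calibration
# (all sites, reliable and unreliable); breakpoints are illustrative only.
calibrations = {
    "reliable": {"low": (0, 10, 30), "medium": (10, 30, 50)},
    "all":      {"low": (0, 15, 40), "medium": (15, 40, 65)},
}

def type2_membership(n_sites):
    """Interval [min, max] of membership per category across both calibrations."""
    out = {}
    for cat in ("low", "medium"):
        grades = [tri(n_sites, *calibrations[c][cat]) for c in calibrations]
        out[cat] = (round(min(grades), 2), round(max(grades), 2))
    return out

print(type2_membership(29))  # {'low': (0.05, 0.44), 'medium': (0.56, 0.95)}
```

Raising or lowering the site count, as with the ‘slider’ for sub-space C, moves these intervals between categories.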
The defined type-2 fuzzy sets then allow the mapping of the attractiveness levels of ‘known
zones’ and the settlement potential of ‘grey zones’ as described in Figure 10.14 and Figure 10.15. In order
to evaluate the impact of local spatial patterns on the projected attractiveness and settlement potential,
the whole process was carried out separately on the ‘clusters’ and the ‘outliers’ detected with the LISA
(Figure 10.11). Figure 10.18 shows these results on the whole extent of the Arid Margins.
In summary, two types of information have been assessed in this case study: the degree of attractiveness
of well-surveyed zones, which constitutes the descriptive dimension of the maps, shown by the lighter
colours; and the settlement potential of ‘grey zones’, which constitutes the possibilistic dimension of the
maps, shown by the darker colours.
Mapping all of this information and the location of known archaeological sites on the same map is
of great interest as we can see at the same time the data, the results, and the deviations from the model as
shown by the sites that fall into unattractive zones. These ‘anomalies’ suggest two possibilities, which may
both be true at the same time: either the model has to be recalibrated and refined, or the sites which deviate
from the model follow different location logics or spatial patterns than those assumed by the model.
Figure 10.17 From type-1 to type-2 fuzzy sets: introducing the reliability of archaeological sites in fuzzy set
calibration. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figure 125
Figure 10.18 Type-2 fuzzy set settlement estimates for Early Bronze Age IV Arid Margins comparing results
for ‘cluster’ sites and ‘outlier’ sites. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figures 128 and 131
Whatever the case may be, our point has been to highlight the importance of integrating multiple
observations and levels of analysis, represented here by the imperfection of spatiotemporal information
and spatial patterns. Considering information in a fuzzy dimension offers an alternative method which
prevents us from making restrictive modelling choices and/or rejecting all unreliable data.
The latter not only limits the information that is available but, by artificially homogenising heterogeneous
data, also runs the very real risk of giving priority to quantity over quality.
Conclusion
The various dimensions of archaeological data imperfection should prevent us from assessing hypotheses
on past settlement patterns that are too rigid and restrictive. The amount and quality of data, but also
our own theoretical and methodological frameworks as well as our implicit or explicit choices have
strong and obvious impacts on the results. This does not mean, however, that we have to be resigned to
the assumption that ‘everything is possible everywhere’. Instead, by detecting and revealing the hidden
flaws in our reasoning, defining and assessing different levels of data imperfection, and taking them into
account throughout the research process, we can begin to constructively fold them into our analyses.
Reasoning within uncertainty or with imperfection through fuzzy logic and fuzzy set theory broadens
the horizons of archaeological research as it enables the formal exploration and ordering of a variety of
possibilities without being restrained by big trends, and allows consideration of outliers which are not
encompassed by a probability framework. We end with a challenge. Is it enough to simply acknowledge
‘valid’ or ‘false’ results due to data imperfection, and/or rush to eliminate data imperfection by any means?
We would argue not. Instead, we must examine deliberately and systematically how the various forms of
uncertainty and imperfection arise in our results and hypotheses, and consider how we can treat them
as a ‘bundle of past possibilities’, where the options are controlled by the level and the type of uncertainty
we consciously decide to assume.
Notes
1 A usual formulation of the paradox involves a heap of sand. If we remove a single grain from it, is it still a heap?
What happens when this process is repeated enough times: is a single residual grain still a heap? If not, when did
it switch from a heap to a non-heap?
2 This study was carried out in the context of the PaleoSyr/PaleoLib project, “Holocene palaeoenvironments and
settlement patterns in Western Syria and Lebanon”, directed by Frank Braemer and Bernard Geyer.
3 The LISA have been calculated with the freeware GeoDa 0.9 (Anselin, 2005).
4 For data availability reasons, the study refers to today’s waterways, not palaeo-environmental estimates.
References
Allen, J. F. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11), 832–843.
Anselin, L. (1995). Local indicators of spatial association: LISA. Geographical Analysis, 27, 93–115.
Anselin, L. (2005). Exploring spatial data with GeoDa: A workbook. University of Illinois, Urbana-Champaign: Spatial
Analysis Laboratory, Department of Geography.
Balla, A., Pavlogeorgatos, G., Tsiafakis, D., & Pavlidis, G. (2013). Locating Macedonian tombs using predictive model-
ling. Journal of Cultural Heritage, 14(5), 403–410.
Banerjee, R., Srivastava, P. K., Pike, A. W. G., & Petropoulos, G. P. (2018). Identification of painted rock-shelter sites
using GIS integrated with a decision support system and fuzzy logic. ISPRS International Journal of Geo-Information,
7, 326–346.
Baxter, M. (2009). Archaeological data analysis and fuzzy clustering. Archaeometry, 51, 1035–1054.
Bevan, A. (2015). The data deluge. Antiquity, 89(348), 1473–1484.
Brouwer Burg, M. (2017). It must be right, GIS told me so! Questioning the infallibility of GIS as a methodological
tool. Journal of Archaeological Science, 84, 115–120.
Buck, C. E. (2004). Bayesian chronological data interpretation: where now? In C. E. Buck & A. R. Millard (Eds.),
Tools for constructing chronologies: crossing disciplinary boundaries (pp. 1–24). London: Springer-Verlag.
Cooper, A., & Green, C. (2015). Embracing the complexities of Big Data in archaeology: The case of the English
landscape and identities project. Journal of Archaeological Method and Theory, 23(1), 271–304.
Crema, E. R. (2012). Modelling temporal uncertainty in archaeological analysis. Journal of Archaeological Method and
Theory, 19, 440–461.
Crema, E. R., Bevan, A., & Lake, M. (2010). A probabilistic framework for assessing spatiotemporal point patterns in
the archaeological record. Journal of Archaeological Science, 37(5), 1118–1130.
de Runz, C., Desjardin, E., Piantoni, F., & Herbin, M. (2014). Reconstruct street network from imprecise excavation
data using fuzzy Hough transforms. Geoinformatica, 18(2), 253–268.
de Runz, C., Desjardin, E., Piantoni, F., & Herbin, M. (2011). Towards handling uncertainty of excavation data into
a GIS. In E. Jerem, F. Redő, & V. Szeverényi (Eds.), On the road to reconstructing the past: Proceedings of the 36th
computer applications and quantitative methods in archaeology (CAA) international conference (pp. 187–191). Budapest:
Archaeolingua.
de Runz, C., Desjardin, E., Piantoni, F., & Herbin, M. (2010). Anteriority index for managing fuzzy dates in archaeo-
logical GIS. Soft Computing-A Fusion of Foundations, Methodologies and Applications, 14(4), 339–344.
Desachy, B. (2012). Formaliser le raisonnement chronologique et son incertitude en archéologie de terrain. Cybergeo:
European Journal of Geography, Systèmes, Modélisation, Géostatistiques, document 597.
Farinetti, E., Hermon, S., & Niccolucci, F. (2004). Fuzzy logic application to artefact surface survey data. In F. Nicco-
lucci & S. Hermon (Eds.), Beyond the artifact: Digital interpretation of the past: Proceedings of CAA 2004 (pp. 125–129).
Budapest: Archaeolingua.
Fisher, P. F. (2005). Models of uncertainty in spatial data. In P. A. Longley, M. F. Goodchild, D. J. Maguire, &
D. W. Rhind (Eds.), Geographical information systems: Principles, techniques, management and applications (pp. 191–205).
Hoboken, NJ: Wiley.
Fisher, P. F., & Arnot, C. (2006). Mapping type 2 change in fuzzy land cover. In A. Morris & S. Kokhan (Eds.),
Geographic uncertainty in environmental security: Proceedings of the NATO advanced research workshop on fuzziness and
uncertainty in GIS for environmental security and protection (pp. 167–186). The Netherlands: Springer.
Fisher, P. F., Comber, A., & Wadsworth, R. (2006). Approaches to uncertainty in spatial data. In R. Devillers &
R. Jeansoulin (Eds.), Fundamentals of spatial data quality (pp. 43–59). London: ISTE.
Fusco, J. (2015). Detection of spatio-morphological structures on the basis of archaeological data with Mathemati-
cal Morphology and Variography: Application to archaeological sites. In A. Traviglia (Ed.), Across space and time,
selected papers from the 41st annual conference of computer applications and quantitative methods in archaeology (CAA)
(pp. 249–260). Amsterdam: Amsterdam University Press.
Fusco, J. (2016). Analyse des dynamiques spatio-temporelles des systèmes de peuplement dans un contexte d’incertitude: Applica-
tion à l’archéologie spatiale (Unpublished doctoral dissertation). University Nice Sophia Antipolis. Retrieved from
https://tel.archives-ouvertes.fr/tel-01341554
Gattiglia, G. (2015). Think big about data: Archaeology and the Big Data challenge. Archäologische Informationen, 38,
113–124.
Geyer, B. (2011). The steppe: Human occupation and potentiality, the example of northern Syria’s Arid Margins.
Syria, 88, 7–22.
Hatzinikolaou, E. G. (2006). Quantitative methods in archaeological prediction: from binary to fuzzy logic. In
M. W. Mehrer & K. L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 437–446). New York:
Taylor & Francis.
Hatzinikolaou, E. G., Hatzichristos, T., Siolas, A., & Mantzourani, E. (2003). Predicting archaeological site locations
using GIS and fuzzy logic. In M. Doerr & A. Sarris (Eds.), The digital heritage in archaeology: Computer applications
and quantitative methods in archaeology (pp. 169–178). Heraklion: Archive of Monuments and Publications, Hellenic
Ministry of Culture.
Hermon, S., & Niccolucci, F. (2002). Estimating subjectivity of typologists and typological classification with fuzzy
logic. Archeologia e Calcolatori, 13, 217–232.
Hermon, S., & Niccolucci, F. (2003). A fuzzy logic approach to typology in archaeological research. In M. Doerr &
A. Sarris (Eds.), The digital heritage in archaeology: Computer applications and quantitative methods in archaeology
(pp. 169–178). Heraklion: Archive of Monuments and Publications, Hellenic Ministry of Culture.
Jaroslaw, J., & Hildebrandt-Radke, I. (2009). Using multivariate statistics and fuzzy logic system to analyse settlement
preferences in lowland areas of the temperate zone: An example from the Polish Lowlands. Journal of Archaeological
Science, 36(10), 2096–2107.
John, R., & Coupland, S. (2007). Type-2 fuzzy logic: A historical view. IEEE Computational Intelligence Magazine,
2, 57–62.
Johnson, I. (2004). Aoristic analysis: Seeds of a new approach to mapping archaeological distributions through time.
In Magistrat der Stadt Wien, Referat Kulturelles Erbe, Stadtarchäologie Wien (Ed.), Enter the past: The E-way into the four dimensions of cultural
heritage: Proceedings of the 31st computer applications in archaeology (pp. 448–452). Oxford: Archaeopress.
Karnik, N. N., & Mendel, J. M. (1998). Introduction to type-2 fuzzy logic systems. Proceedings of the 1998 IEEE
FUZZ Conference (pp. 915–920).
Kennedy, W. M., & Hahn, F. (2017). Quantifying chronological inconsistencies of archaeological sites in the Petra
area. eTopoi, 6, 64–106.
Lanos, P., & Philippe, A. (2015). Event model: A robust Bayesian tool for chronological modeling. Retrieved from
https://hal.archives-ouvertes.fr
Litton, C. D., & Buck, C. E. (1995). The Bayesian approach to the interpretation of archaeological data. Archaeometry,
37(1), 1–24.
Longley, P. A., Goodchild, M. F., Maguire, D. J., & Rhind, D. W. (2005). Geographic information systems and science (2nd
ed.). London: John Wiley & Sons.
Machálek, T., Cimler, R., Olševičová, K., & Danielisová, A. (2013). Fuzzy methods in land use modeling for
archaeology. In H. Vojackova (Ed.), Proceedings of the 31st international conference mathematical methods in economics
(pp. 552–557). Jihlava: College of Polytechnics.
McCoy, M. D. (2017). Geospatial Big Data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94.
Mendel, J. M. (2003). Type-2 fuzzy sets: Some questions and answers. IEEE Neural Networks Society Newsletter, 1,
10–13.
Mendel, J. M., & Bob John, R. I. (2002). Type-2 fuzzy sets made simple. IEEE Transactions on Fuzzy Systems, 10(2),
117–127.
Niccolucci, F., D’Andrea, A., & Crescioli, M. (2001). Archaeological applications of fuzzy databases. In Z. Stančič &
T. Veljanovski (Eds.), Computing archaeology for understanding the past: Proceedings of the 28th computer applications in
archaeology (pp. 107–116). Oxford: Archaeopress.
Niccolucci, F., & Hermon, S. (2004). A fuzzy approach to reliability in archaeological virtual reconstruction. In
F. Niccolucci & S. Hermon (Eds.), Beyond the artifact: Digital interpretation of the past: Proceedings of computer applica-
tions in archaeology (pp. 28–35). Budapest: Archaeolingua.
Niccolucci, F., & Hermon, S. (2015). Time, chronology and classification. In J. A. Barceló & I. Bogdanovic (Eds.),
Mathematics and archaeology (pp. 257–271). New York: Taylor & Francis.
Niccolucci, F., & Hermon, S. (2017). Documenting archaeological science with CIDOC CRM. International Journal
of Digital Libraries, 18, 281.
Orton, C. (2010). Fit for purpose? Archaeological data in the 21st century. Archeologia e Calcolatori, 11, 249–260.
Plewe, B. (2002). The nature of uncertainty in historical geographic information. Transactions in GIS, 6(4), 431–456.
Roskams, S., & Whyman, M. (2007). Categorising the past: Lessons from the archaeological resource assessment for
Yorkshire. Internet Archaeology, 23.
Vaughn, S., & Crawford, T. (2009). A predictive model of archaeological potential: An example from northwestern
Belize. Applied Geography, 29(4), 542–555.
Veregin, H. (1989). A taxonomy of error in spatial databases. Technical Paper, 89.12, Santa Barbara, CA: National
Center for Geographic Information and Analysis.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–355.
Zadeh, L. A. (1975). The concept of a linguistic variable and its application to approximate reasoning. Information
Sciences, 8, 199–249.
Zadeh, L. A. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1, 3–28.
Zhang, J. X., & Goodchild, M. F. (2002). Uncertainty in geographical information. New York: Taylor and Francis.
Zoghlami, A., de Runz, C., & Akdag, H. (2016). F-Perceptory: An approach for handling fuzziness of spatiotemporal
data in geographical databases. International Journal of Spatial, Temporal and Multimedia Information Systems, 1(1),
30–62.
Zoghlami, A., de Runz, C., Pargny, D., Desjardin, E., & Akdag, H. (2012). Through an archaeological urban data model
handling data imperfection. Paper presented at the Computer Applications and Quantitative Methods in Archaeol-
ogy (CAA) Conference, Southampton, UK.
11
Spatial approaches to assignment
John Pouncett
Introduction
Two deceptively simple questions are pivotal to definitions of geographic information and, by exten-
sion, spatial analysis – what? and where? (Goodchild, 2003). This chapter addresses the second of these
questions, focusing on spatial approaches to assignment that can be used to ascertain the likelihood of an
archaeological sample originating from a particular geographic region. These approaches are explored
with reference to evidence for mobility and migration that can be inferred from isotope tracers from
cremated bone and tooth enamel. Questions of geographic origins, however, are equally important
with regard to evidence for trade and exchange that can be inferred from the biological or chemical
signatures of raw materials. Geochemical analysis of the worked flint from the Gravettian sites at Rhens
and Koblenz-Metternich, Germany, for example, has identified the use of flint from western Belgium c.
260 km away (Moreau et al., 2016). Conversely, wood species identification and strontium isotope analysis
have suggested that the wooden artefacts analysed from Pitch Lake, Trinidad, were predominantly manufactured
from locally sourced materials (Ostapkowicz et al., 2017). In both instances, the distance that raw
materials were transported is critical to the understanding of trade and exchange. A degree of caution,
however, should be exercised when identifying possible sources of raw materials. In the case of Bronze
Age copper metals, theoretical frameworks that move beyond the concept of provenance have been
developed that recognise that metal chemistry is the product of a longer life-history of a unit of metal
which may reflect re-use and recycling as well as the original source of the ore (Bray et al., 2015; Pollard,
2018). These theoretical frameworks highlight the need to understand the complex range of processes
that can contribute to variation in the measurements that are commonly used to determine the possible
geographic origins of archaeological samples. The spatial approaches to assignment described below are
widely used in bioarchaeology but are equally applicable to other areas of archaeology.
Isotope tracers are used to make inferences about mobility and migration, comparing observed isotope
ratios or values for archaeological samples to baseline data to determine whether an individual is local
or non-local to a geographic region (e.g. Bentley & Knipper, 2005; Montgomery, Budd, & Evans, 2000;
Price, Burton, & Bentley, 2002). For the purposes of determining geographic origins of humans and
animals in archaeology, strontium and oxygen isotopes are the most commonly employed, with multiple
isotope tracers used to narrow down the range of possible locations from where an individual could have
originated (Emery, Prowse, Elford, Schwarcz, & Brickley, 2017; Evans, Chenery, & Fitzpatrick, 2006; Laf-
foon et al., 2017). Traditional approaches to identifying locals and non-locals based on the identification
of outliers or extreme values using parametric and non-parametric statistics (Lightfoot & O’Connell,
2016; Wright, 2005) are increasingly being supplemented by spatial approaches based on geographic
assignment using isoscapes and Bayesian statistics (Pellegrini, Pouncett, Jay, Parker-Pearson, & Richards,
2016; Schulting et al., 2019; Snoeck et al., 2018; Snoeck, Pouncett et al., 2016). The key principles
underpinning these spatial approaches to determining the geographic origins of individuals are outlined
below with reference to case studies from Annaghmare in Northern Ireland and Duggleby Howe on the
Yorkshire Wolds.
Strontium
Strontium is an alkaline earth metal with four naturally occurring isotopes (84Sr, 86Sr, 87Sr and 88Sr) which
are formed during primordial nucleogenesis. 87Sr is also radiogenic, being formed by the decay of
the radioactive alkali metal rubidium (87Rb). The 87Sr/86Sr ratios for bedrock and surface geology are
related to both the mineral composition and the age of the geological formation (Faure, 1986). Strontium
derived from the weathering of a geological formation will have the same 87Sr/86Sr ratio as the parent
geology. Consequently, the 87Sr/86Sr ratios of soil are related to the geological formations from which they
are derived. The 87Sr/86Sr ratio for rainwater is related to the mineral composition of the water vapour
(evaporated sea- or freshwater) and aerosolised particles in the atmosphere (e.g. Saharan dust), which act as
condensation nuclei for excess water vapour. In coastal regions, the 87Sr/86Sr ratio of rainfall is equivalent
to that of seawater (Hodell, Mueller, McKenzie, & Mead, 1989). Strontium from soil, groundwater and
rainwater is absorbed into plants and once it enters the food chain becomes incorporated into the tissues
of humans and animals (Capo, Stewart, & Chadwick, 1998).
87Sr/86Sr ratios of tooth enamel (Montgomery, Evans, & Cooper, 2007; Neil, Evans, Montgomery, &
Scarre, 2018) and cremated bone (Snoeck et al., 2018; Snoeck, Pouncett et al., 2016) reflect the food/
Scarre, 2018) and cremated bone (Snoeck et al., 2018; Snoeck, Pouncett et al., 2016) reflect the food/
drink consumed by an individual and, assuming that the food/drink is locally sourced, can be used to
infer the geographic locations where an individual spent specific times of their lives. The time of life
for which these inferences can be made is dependent on the tissue (bone, dentine or enamel) analysed
which is in turn dependent on the mode of burial. In the case of inhumations, the crystalline structure
of tooth enamel preserves the original in vivo 87Sr/86Sr ratio, while bone is often susceptible to diagenesis
and through the exchange of calcium for strontium equilibrates with the value of the soil in which it is
buried (Budd, Montgomery, Barreiro, & Thomas, 2000; Hoppe, Koch, & Furutani, 2003). 87Sr/86Sr ratios
of tooth enamel represent an average of the food/drink consumed during crown formation and depend-
ing on the tooth sampled can indicate where an individual spent different stages of their childhood. In
the case of cremations, high temperatures cause spalling and loss of tooth enamel, while at the same time
result in the crystallisation of bone making it resistant to diagenesis such that fully calcined bone retains its
original in vivo 87Sr/86Sr ratio (Snoeck et al., 2015). 87Sr/86Sr ratios of cremated bone represent an average
of the food/drink consumed over the decade or so before death and can indicate where an adult spent
the last c. 10 years of their life. The time of life represented for non-adults will be shorter given that it
reflects the growth stage of the skeleton.
Oxygen
Oxygen is a non-metal with three naturally occurring stable isotopes, a primary isotope (16O) formed
during primordial nucleogenesis and two secondary isotopes (17O and 18O) formed during the
carbon-nitrogen-oxygen cycle. Spatial and temporal variation in 18O of rainwater occurs as a result of
Rayleigh fractionation or distillation, i.e. the preferential condensation of water with 18O in air masses,
depleting the 18O relative to the 16O in the vapour phase (Sharp, 2007). The δ18O value of rainwater is
dependent on a wide range of variables including latitude, temperature, elevation, amount of rainfall,
and distance to surface water (Bowen & Wilkinson, 2002). The δ18O value of groundwater is related
to that of the local rainfall but may vary due to evaporation of surface water, fractionation within
aquifers, and recharge from rivers with water from higher elevations (Gat, 1971). Drinking water and
water from food are typically derived from a combination of local groundwater and local rainwater.
The δ18O values of food/drink will consequently approximate those of these sources. Oxygen from
food/drink is incorporated into body water and is in turn incorporated into the tissues of humans
and animals.
A linear relationship has been demonstrated between the δ18O values of phosphate from the bioapatite
fraction of teeth and bones (δ18Op) and the mean annual δ18O values of rainwater (δ18Ow) from the
region where an individual lived (Longinelli, 1984; Luz, Kolodny, & Horowitz, 1984). δ18Op values from
tooth enamel represent an average of the food/drink consumed during crown formation and depend-
ing on the tooth sampled can be used to infer the regions where an individual spent different stages
of their childhood (Evans et al., 2006). A number of issues have been raised regarding the utility of
oxygen as an isotope tracer (Lightfoot & O’Connell, 2016; Pouncett, 2019). This is because δ18O values
may be affected by a number of factors, including: variation in 18O due to changes in climate conditions
(Daux et al., 2008); fractionation of 18O as a result of the preparation of food/drink (Brettell, Mont-
gomery, & Evans, 2012); physiological variation between individuals (White, Spence, Longstaffe, &
Law, 2004); and enrichment of 18O due to weaning and/or the consumption of milk (Lin, Rau, Chen,
Chou, & Fu, 2003). Unlike strontium, oxygen cannot be used to infer the geographic origins of individuals
who were cremated, since cremation alters the δ18O values of bone and teeth, reflecting pyre
characteristics such as temperature and ventilation rather than diet and mobility (Snoeck, Schulting,
Lee-Thorp, Lebon, & Zazzo, 2016).
Local or non-local?
The geographic origins of an individual can be inferred by comparing observed values of one or more
isotope ratios to the expected values from baseline data for the area of interest – typically biologically
available strontium (BASr) from modern plants or animals in the case of strontium, and modern ground-
water or rainwater in the case of oxygen. If the isotope ratio from an archaeological sample has a value
that is similar to that for the baseline data for a given geographic region the individual could have been
local to that region, i.e. spent part of their childhood (tooth enamel) or the last decade or so of their
adult life (cremated bone) there. Conversely, if the isotope ratio from an archaeological sample has a value
that differs from the baseline data for a given geographic region, the individual is unlikely to have been
local to that region, i.e. did not spend part of their childhood (tooth enamel) or the last decade or so of
their life (cremated bone) there. The ability to identify whether an individual is local or non-local to a
geographic location is dependent upon a number of factors, most significantly in this context: analytical
errors associated with the isotope ratios for the archaeological samples; sampling errors associated with
the generation of the isotopic baseline data; equifinality with the same values of isotope ratios found in
multiple geographic regions; and the scale of analysis/spatial extent of the ‘local’ signal. The uncertainty
introduced as a result of these factors is integral to the methods of geographic assignment described below
and applied to the case studies for this chapter.
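As a minimal sketch (with illustrative values, not case-study data), the local/non-local decision described above reduces to a tolerance test combining the sampling and analytical errors:

```python
# An individual is consistent with ('local' to) a region when the observed
# isotope ratio lies within the baseline value plus or minus the combined
# sampling and analytical error. All values below are illustrative.

def consistent_with_region(observed, baseline, sampling_err, analytical_err):
    """True if the observed ratio could be local to the baseline region."""
    return abs(observed - baseline) <= sampling_err + analytical_err

# Hypothetical 87Sr/86Sr values
print(consistent_with_region(0.7096, 0.7092, 0.0003, 0.0002))  # True: within tolerance
print(consistent_with_region(0.7115, 0.7092, 0.0003, 0.0002))  # False: likely non-local
```

Note that, because of equifinality, a positive result only indicates that the region is one of possibly many candidate origins, never a unique assignment.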
Methodology
Two principal methods have been used to determine the geographic origins of humans and animals. The
first method, based on the calculation of residuals between the expected isotope measurement from the
baseline data for the area of interest and the observed isotope measurement for an archaeological sample,
is applied to a case study from Annaghmare in Northern Ireland using a single isotope tracer. The second
method, based on the use of Bayesian statistics to determine the likelihood that an individual came from
a particular location given the observed isotope measurement, is applied to a case study from Duggleby
Howe on the Yorkshire Wolds using two isotope tracers.
Calculation of residuals
The simplest method of determining the geographic origins of humans and animals is to calculate the
residuals between the expected isotope measurement for a location and the observed measurement for an
individual (Pellegrini et al., 2016; Snoeck et al., 2018):
ei = δs,i − δs (11.1)
where:
ei = the residual between the expected and observed isotope measurements.
δs,i = the expected isotope measurement for location i.
δs = the observed isotope measurement for the individual.
The expected isotope measurement for a location can be estimated from the baseline data for the area
of interest. A threshold can be applied to the residuals to identify locations from which an individual
could have originated (cf. Laffoon et al., 2017). Typically, the threshold used to identify the locations
from which an individual could have originated will be based on the sum of the sampling error for the
baseline data and the analytical error for the measured isotope. For example, in the case of the fragments
of cremated bone analysed from Aubrey Hole 7 at Stonehenge, Wiltshire, locations with residuals less than
±0.0005 (equivalent to the sum of the sampling error for the BASr baseline and the analytical error for
the strontium measurements) were identified as possible geographic origins for the cremated individuals
(Snoeck et al., 2018).
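The thresholding procedure described above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical 87Sr/86Sr values, not the chapter's actual data; NumPy is assumed:

```python
import numpy as np

def residual_assignment(expected, observed, sampling_error, analytical_error):
    """Compute residuals (Eq. 11.1) and flag possible locations of origin.

    expected         : expected isotope values per location (from the baseline)
    observed         : measured isotope value for the individual
    sampling_error   : per-location sampling error for the baseline
    analytical_error : analytical error for the measurement
    """
    residuals = expected - observed                    # e_i = d_{s,i} - d_s
    threshold = sampling_error + analytical_error      # sum of the two errors
    possible = np.abs(residuals) <= threshold          # within the threshold
    return residuals, possible

# Hypothetical baseline values for four candidate locations
expected = np.array([0.7082, 0.7095, 0.7108, 0.7121])
residuals, possible = residual_assignment(
    expected,
    observed=0.7108,
    sampling_error=np.array([0.0004, 0.0004, 0.0005, 0.0004]),
    analytical_error=0.0001,
)
# Only the third location falls within the combined error of the observation.
```

Applied cell by cell to a baseline raster, the same comparison produces a map of possible geographic origins of the kind discussed for the Annaghmare case study below.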
Bayesian statistics
Bayesian statistics have been widely employed by biologists and archaeologists for the purposes of deter-
mining the geographic origins of humans and animals (Bowen, Liu, Vander Zanden, Zhao, & Takahashi,
2014; Laffoon et al., 2017; Wunder, Kester, Knopf, & Rye, 2005). The likelihood that an individual came
from a particular location given the observed isotope measurement can be calculated using Bayes’ theo-
rem which can be expressed as:
P(A_i | B) = [ P(B | A_i) P(A_i) ] / [ Σ_i P(B | A_i) P(A_i) ] (11.2)
where:
P(Ai|B) = the posterior probability distribution of individual A originating from location i, given
observed isotope measurement B.
196 John Pouncett
P(B|Ai) = the sampling probability distribution of observing isotope measurement B, given all loca-
tions i from which individual A could have originated.
P(Ai) = the prior probability distribution of individual A originating from location i, given assump-
tions or knowledge prior to observing isotope measurement B.
If there are no prior assumptions or knowledge of the likely geographic origin of an individual, it is assumed that all locations are equally likely and a non-informative prior is used as the prior probability distribution – typically, this will be a continuous uniform distribution U(a, b) with probability density function:
f(x) = 1 / (b − a) (11.3)
where:
a = the minimum isotope measurement for all locations i.
b = the maximum isotope measurement for all locations i.
If there are prior assumptions or knowledge of the likely origin of an individual, the probability density
function which best describes the prior assumptions or knowledge should be used as the prior probability
distribution. For example, in a test of geographic assignment of mountain plover chicks (Charadrius mon-
tanus) using isotope tracers in feathers, it was assumed that the chick feathers were exclusively of known
geographic origin and the sample sizes per location were used to estimate the prior probability density
(Wunder et al., 2005).
It is generally assumed that the observed isotope measurement for a sample is an outcome of a random process and that the sampling probability distribution can consequently be estimated using the probability density function for a normal distribution N(µ, σ) with location µ and scale σ:
f(x) = (1 / (σ √(2π))) e^{−(x − µ)² / 2σ²} (11.4)
The parameters of the normal distribution are commonly estimated using the observed isotope value or ratio for the individual and the sum of the sampling error for location i and the analytical error for the observed isotope measurement:

µ = δ_s (11.5)

σ = σ_{s,i} + σ_ε (11.6)

where:
δ_s = the observed isotope measurement for the individual.
σ_{s,i} = the sampling error for location i estimated from the baseline data.
σ_ε = the analytical error for the observed isotope measurement.
If a uniform distribution is used as a non-informative prior and a normal distribution is used as the sampling probability distribution, Eq. (11.2) can be rewritten as:

P(A_i | δ_s) = [ (1 / (δ_{s,i max} − δ_{s,i min})) · (1 / ((σ_{s,i} + σ_ε) √(2π))) e^{−(δ_{s,i} − δ_s)² / 2(σ_{s,i}² + σ_ε²)} ] / [ Σ_i (1 / (δ_{s,i max} − δ_{s,i min})) · (1 / ((σ_{s,i} + σ_ε) √(2π))) e^{−(δ_{s,i} − δ_s)² / 2(σ_{s,i}² + σ_ε²)} ] (11.7)

where:
δ_{s,i} = the expected isotope measurement for location i estimated from the baseline data.

Because the uniform prior is constant across all locations, it cancels from the numerator and denominator, and Eq. (11.7) simplifies to:

P(A_i | δ_s) = [ (1 / ((σ_{s,i} + σ_ε) √(2π))) e^{−(δ_{s,i} − δ_s)² / 2(σ_{s,i}² + σ_ε²)} ] / [ Σ_i (1 / ((σ_{s,i} + σ_ε) √(2π))) e^{−(δ_{s,i} − δ_s)² / 2(σ_{s,i}² + σ_ε²)} ] (11.8)
The posterior probability distributions are commonly rescaled by the largest observed density, with the
resultant probability densities ranging between 0 and 1 (cf. Wunder, 2010). Where multiple isotope trac-
ers are used to determine the geographic origin of an individual, Bayes’ theorem can be applied iteratively
with the posterior probability density from the iteration for one isotope tracer used as the prior prob-
ability density for the iteration for the next isotope tracer.
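As a sketch of Eqs. (11.2)–(11.8), the posterior densities for a set of candidate locations can be computed as follows. The values are hypothetical, and the likelihood is written exactly as printed in Eqs. (11.7)–(11.8), with σ_{s,i} + σ_ε in the prefactor and σ_{s,i}² + σ_ε² in the exponent:

```python
import numpy as np

def posterior_surface(expected, observed, s_err, a_err, prior=None):
    """Posterior probability of origin per location, after Eqs. 11.2-11.8.

    With a non-informative (uniform) prior the constant prior terms cancel,
    reducing Bayes' theorem to the normalised normal densities of Eq. 11.8.
    Posteriors are rescaled by the largest density (cf. Wunder, 2010).
    """
    dens = (np.exp(-(expected - observed) ** 2 / (2 * (s_err ** 2 + a_err ** 2)))
            / ((s_err + a_err) * np.sqrt(2 * np.pi)))
    if prior is None:
        prior = np.ones_like(dens)          # uniform, non-informative prior
    post = dens * prior
    post = post / post.sum()                # denominator of Eq. 11.2
    return post / post.max()                # rescale to the range 0-1

# Hypothetical 87Sr/86Sr baseline for four candidate locations
exp_sr = np.array([0.7082, 0.7095, 0.7108, 0.7121])
mad_sr = np.full(4, 0.0004)                 # sampling error (MAD) per location
post = posterior_surface(exp_sr, observed=0.7095, s_err=mad_sr, a_err=0.0001)
best = int(np.argmax(post))                 # maximum likelihood assignment
```

For multiple tracers the function can be called iteratively, passing the rescaled posterior for one isotope as the `prior` for the next, as described above.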
Maximum likelihood estimation is commonly used to determine the geographic origin of an
individual, with the individual assigned to the location with the highest probability density. Ultimately,
the validity of this assignment is dependent on the robustness of the baseline data used to determine
the geographic origin of an individual. If the baseline does not adequately account for all of the factors
that influence spatial variation in the ratios or values of the isotope tracer within the area of interest (see
earlier), the location to which an individual is assigned may not be valid and the process by which it was
derived will not be robust.
A BASr baseline suitable for geographic assignment at a national or regional scale has recently been
produced for Ireland (Snoeck et al., 2019). This baseline is based on published 87Sr/86Sr measurements
for modern plants (Ryan, Snoeck, Crowley, & Babechuk, 2018; Snoeck, Pouncett et al., 2016; Snoeck
et al., 2019; Wilson & Standish, 2016) and Geological Survey of Ireland (GSI) Bedrock Geology 500k
Series data (https://data.gov.ie/dataset/gsi-bedrock-geology-500k-series). Strontium isotope ratios for
the plant samples were aggregated in order of preference by outcrop (single part polygons for individual
outcrops of bedrock), formation (multi-part polygons for each geological formation) and type/age
(multi-part polygons for formations of similar type/age), with a range of descriptive statistics calculated
for the polygons corresponding to each outcrop/formation. The 87Sr/86Sr ratios for the modern plants are
not normally distributed (Shapiro-Wilk test: W = 0.949, df = 228, p < 0.001) and the median and median
absolute deviation (MAD) are consequently used to describe the spatial variation in biologically available
strontium rather than the mean and standard deviation.
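The robust summary statistics used here can be sketched as follows, with hypothetical plant ratios rather than the published Irish data:

```python
import numpy as np

def median_mad(values):
    """Median and median absolute deviation (MAD): robust measures of centre
    and spread for baseline data that fail a test of normality."""
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    return med, mad

# Hypothetical 87Sr/86Sr ratios for plants from a single geological formation
ratios = np.array([0.7104, 0.7106, 0.7108, 0.7109, 0.7110, 0.7135])
med, mad = median_mad(ratios)

# The ">3 MAD from the median" rule used below to flag outlying plant samples
outliers = np.abs(ratios - med) > 3 * mad
```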
The expected 87Sr/86Sr ratio for the outcrops of Silurian sandstone, greywacke and shale (Formation
49, Snoeck, Pouncett et al., 2016) in the immediate vicinity (<5km) of Annaghmare based on the BASr
baseline for Ireland is 0.71081 ± 0.000491 (Figure 11.1). The observed 87Sr/86Sr ratio for Individual A2
falls more than 3 MAD below the median for the Silurian sandstone, greywacke and shale (Figure 11.2),
consistent with the interpretation of this individual as a non-local. Marked variation, however, can be seen
in the 87Sr/86Sr ratios for the modern plants from the Silurian sandstone, greywacke and shale (n=24). Six
of the plant samples have 87Sr/86Sr ratios which fall more than 3 MAD below the median for the formation, including two plant samples from the deposits of till overlying the formation in County Kildare
to the south of Annaghmare, and two plant samples from the gravel deposits overlying the formation in
County Cavan to the west of Annaghmare. In theory, the low 87Sr/86Sr ratios for these plant samples raise
the possibility that Individual A2 could have originated from areas to the south and west of Annaghmare
where the Silurian sandstone, greywacke and shale are locally overlain by drift deposits. In practice, these
plant samples are outliers – the 87Sr/86Sr ratio from cremated bone represents an average of all of the foods consumed by an individual over c. 10 years and, given the localised nature of the till and gravel deposits, Individual A2 is unlikely to have only eaten foods corresponding to these outliers (Warham, 2011).
Point-based comparisons between observed 87Sr/86Sr ratios for archaeological samples and expected 87Sr/86Sr ratios based on baseline data from modern plants are problematic for several reasons: they fail
to account for imprecision in the coordinates for the sites from which samples were taken – a particular
problem with legacy samples from nineteenth-century excavations; they do not account for localised differences in surface geology; and past populations would not have obtained their food from a single source.
These problems can in part be addressed by calculating BASr catchments, with focal medians calculated
based on the expected values of 87Sr/86Sr ratios for all of the BASr comparanda within a specified distance
of a site (cf. Snoeck, Pouncett et al., 2016). The size of the BASr catchment should be appropriate to
the scale of analysis and the distance from which food would have been sourced, with catchments <5km
representing locally sourced food, catchments <20km representing food sourced from the wider region
and catchments >20km representing food sourced from further afield based on analysis of comparable
Neolithic and Bronze Age sites in Ireland. In contrast to point-based comparisons which will only reflect
the 87Sr/86Sr ratio of food sourced from a single geological formation, comparisons based on BASr catch-
ments will also reflect the 87Sr/86Sr ratios of food sourced from multiple geological formations depending
upon the spatial extent of the geological formations and the size of the BASr catchments.
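The focal median computation can be sketched as follows. The chapter works on a 100m raster with a 50-cell (5km) radius; the toy example below uses a 5 × 5 array and a 1-cell radius so that the behaviour is easy to inspect. This is a naive sketch – in practice a GIS focal-statistics tool would be used:

```python
import numpy as np

def focal_median(raster, radius):
    """Median of all cells within a circular window of `radius` cells,
    computed for every cell of the raster (edges use the available cells)."""
    n, m = raster.shape
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    # Offsets of the cells falling inside the circular catchment
    offsets = np.argwhere(yy ** 2 + xx ** 2 <= radius ** 2) - radius
    out = np.empty_like(raster, dtype=float)
    for i in range(n):
        for j in range(m):
            vals = [raster[i + di, j + dj]
                    for di, dj in offsets
                    if 0 <= i + di < n and 0 <= j + dj < m]
            out[i, j] = np.median(vals)
    return out

# Toy "baseline" of expected 87Sr/86Sr ratios
rng = np.random.default_rng(0)
base = 0.710 + 0.001 * rng.random((5, 5))
catchments = focal_median(base, radius=1)
```

Residuals between an observed ratio and `catchments` can then be thresholded exactly as in the point-based comparison, but each cell now reflects all formations within its catchment.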
The BASr baseline for Ireland was converted to a raster dataset with a cell size of 100m to preserve
localised outcrops of bedrock and focal medians, representing the expected 87Sr/86Sr ratios for 5km BASr
catchments, were calculated for each cell in the raster dataset (Figure 11.3, left). Residuals between the
observed 87Sr/86Sr ratio for Individual A2 and the expected values of the 87Sr/86Sr ratios for the 5km BASr
Figure 11.1 Expected 87Sr/86Sr ratios in the immediate vicinity of Annaghmare based on the BASr baseline for
Ireland after Snoeck et al. (2019) (top: median; bottom: median absolute deviation from the median). The black
dots indicate the locations of the modern plant samples used to generate the biologically available strontium (BASr)
baseline and the white dots indicate outliers for the Silurian sandstone, greywacke and shale (Formation 49).
Figure 11.2 Boxplot showing the variation in the observed 87Sr/86Sr ratios of modern plants from the outcrops of
Silurian sandstone, greywacke and shale (Formation 49) in the Annaghmare region. The grey shaded area shows
the 87Sr/86Sr ratios that lie within 1 median absolute deviation (MAD) of the median. Samples with 87Sr/86Sr ratios
less than 3 MAD below the median or greater than 3 MAD above the median can be considered to be non-locals.
Figure 11.3 Focal medians for 5km BASr catchments calculated from the BASr baseline for Ireland (left) and
residuals between expected 87Sr/86Sr ratios based on the BASr catchments and the observed 87Sr/86Sr ratio for
Individual A2 (right). Locations from which Individual A2 could have originated are shown in white. Loca-
tions from which Individual A2 is unlikely to have originated are shown in blue (more depleted) and orange
(more enriched). A colour version of this figure can be found in the plates section.
catchments were subsequently calculated (Figure 11.3, right). The residuals were symbolised using gradu-
ated colours with defined values based on the sum of the sampling error for the baseline data – defined
as the median absolute deviation for the geological formation – and the analytical error for the observed
isotope ratio for Individual A2. Residuals with a magnitude less than the sum of the sampling error and
the analytical error represent possible locations from which Individual A2 could have originated, i.e.
spent the last decade or so of their lives. The possible locations from which Individual A2 could have
originated are largely confined to the area to the south of Annaghmare, including the Boyne Valley, and
the area to the west of Annaghmare. These areas mirror the spatial distribution of passage tombs – in
contrast, the distribution of court tombs is confined to the northern third of Ireland (Darvill, 1979). The
radiocarbon dates from Annaghmare fall within a similar time-frame to the megalithic tombs at Ballyna-
hatty 1855 and Millin Bay in County Down, both of which have an affinity with the developed passage
tomb tradition of the late fourth millennium BC (Schulting et al., 2012).
Riley, 1980). It was partially excavated during the late eighteenth century by the Reverend Christopher
Sykes and was re-opened by John Robert Mortimer in July and August 1890 (Cole, 1901; Mortimer,
1892, 1893, 1905). The structural sequence at Duggleby Howe is complex, with five phases of construc-
tion proposed (Pouncett, 2019) on the basis of recent radiocarbon dates (Gibson et al., 2011; Gibson &
Bayliss, 2009):
• Phase 1 (Early Neolithic) – the earliest phase of the monument was characterised by a shaft grave
(Grave B), marked by an up-cast mound of chalk;
• Phase 2 (Middle Neolithic) – two burials on the old land surface and a burial in a shallow grave
(Grave A) were added respecting the position of the shaft grave;
• Phase 3 (Late Neolithic) – an interim mound was constructed, and a series of burials and cremations
were inserted into the mound;
• Phase 4 (Chalcolithic) – a circular enclosure, defined by a causewayed ditch c. 350m in diameter, was
built around the interim mound; and
• Phase 5 (Early Bronze Age) – the interim mound was enlarged substantially with the construction
of a chalk outer mound over 22 feet in height.
Thirteen inhumations and fifty-three cremations, spanning a period of more than 1,000 years from the
middle of the fourth millennium BC to the late third millennium BC, were found within or beneath
the inner mound. This sequence of burials represents the full spectrum of Middle and Late Neolithic
funerary practices and is considered pivotal in understanding the transition between inhumation and
cremation as the dominant funerary rite during the Neolithic (Loveday, 2002). Prestige goods, including
a polished flint adze, a polished discoidal knife, a perforated antler mace head and a series of boar’s tusk
blades, were found with inhumations from the shaft grave and the old land surface. Duggleby Howe was
a lynch-pin in the framework established for the classification and dating of Neolithic round barrows
and ring ditches (Kinnes, 1979).
87Sr/86Sr ratios and d18Op values have been obtained for tooth enamel from seven of the burials from Duggleby Howe, including the inhumation that was buried at the base of the shaft grave (Burial K: 87Sr/86Sr = 0.70859; d18Op = 18.9‰) associated with the earliest phase of the monument (Evans, Chenery, & Montgomery, 2012; Montgomery, Cooper, & Evans, 2007; Montgomery, Evans et al., 2007). The 87Sr/86Sr ratios and d18Op values have been used to suggest that none of the individuals from Duggleby
Howe spent their childhood on the chalk of the Yorkshire Wolds and that Burial K could have come
from as far away as Western Scotland or Cornwall – assertions which have been woven into narratives
about mobility during the Neolithic and Early Bronze Age (Gibson, 2016; Hutton, 2014; Loveday, 2016).
These assertions have been accepted at face value for several reasons: (1) they fit with current models of
settlement practice which regard the uplands of the Yorkshire Wolds as a place where people buried their
dead and the lower-lying areas to the south and east as the epicentre of Neolithic settlement (Carver,
2012; Harding, 2006; Manby, 1988); (2) they explain the prestige goods found with several of the burials,
including the polished flint adze and discoidal knife thought to have been manufactured in the specialist
workshops at North Dale and South Landing (Durden, 1995; Loveday, 2011; Pierpoint, 1980); (3) they fit
with narratives about mobility which prioritise the exotic over the mundane, with the possible origins of
the Amesbury Archer in the Austrian Alps (Evans et al., 2006) more captivating than an ‘everyday tale of
country folk’ in Wiltshire. Although the individuals buried at Duggleby Howe might not have spent their
childhood on the chalk of the Yorkshire Wolds, this does not necessarily mean that they were not locals.
The geographic origins of Burial K from Duggleby Howe are re-evaluated below using Bayesian sta-
tistics and maximum likelihood estimation, by calculating probability density surfaces for the burial and
Figure 11.4 Expected 87Sr/86Sr ratios based on the BASr baseline for mainland Britain (after Snoeck et al.,
2018, Figure 2) (left) and expected d18O values based on the ground water baseline for the United Kingdom
and Republic of Ireland (after Darling et al., 2003, Figure 6) (right). The expected d18O values have been con-
verted from d18Ow values to d18Op values using the equation from Daux et al. (2008) to allow direct comparison
with the observed isotope value for Burial K at Duggleby Howe.
assigning the individual to the geographic region with the highest probability density. In contrast to the
Annaghmare case study that was reliant upon a single isotope tracer, two isotope tracers can be used to
determine locations from which Burial K could have originated. Two baselines were consequently used
for the purposes of the geographic assignment of Burial K (Figure 11.4): (1) a BASr baseline for mainland
Britain, based on published 87Sr/86Sr measurements (Chenery, Müldner, Evans, Eckardt, & Lewis, 2010;
Evans, Montgomery, Wildman, & Boulton, 2010; Schulting et al., 2019; Snoeck et al., 2018) and Brit-
ish Geological Survey DiGMapGB-625 bedrock geology data (www.bgs.ac.uk/products/digitalmaps/
digmapgb_625.html); and (2) a d18O baseline for mainland Britain, based on modern groundwater values
(Darling, Bath, & Talbot, 2003) converted to phosphate values using the equation d18Op = 0.501 d18Ow +
20.71 published by Daux et al. (2008). Both of these baselines are suitable for geographic assignment at
a national or regional scale. The expected 87Sr/86Sr ratio for the Cretaceous chalk of the Yorkshire Wolds
from the BASr baseline is 0.70818 ± 0.00036 and the expected d18Op for the Yorkshire Wolds from the
converted modern d18Ow values is 16.7 ± 0.3‰. No baseline is without limitations and the drawback of
the baselines currently available for mainland Britain is that they are based on bedrock geology and mod-
ern groundwater and do not directly take into account the other factors which might influence spatial
variation in 87Sr/86Sr ratios or d18O values highlighted in the introduction to this chapter.
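The groundwater-to-enamel conversion quoted above is a single linear regression and can be checked directly. Assuming a groundwater d18Ow value of about −8‰ for the Yorkshire Wolds (an illustrative input, not a value stated in the text), the regression reproduces the expected d18Op of 16.7‰ quoted above:

```python
def d18Ow_to_d18Op(d18Ow):
    """Convert a groundwater d18Ow value (permil) to an enamel-phosphate
    d18Op value using the regression of Daux et al. (2008) quoted in the
    text: d18Op = 0.501 * d18Ow + 20.71."""
    return 0.501 * d18Ow + 20.71

# An assumed groundwater value of -8.0 permil yields ~16.7 permil, matching
# the expected d18Op for the Yorkshire Wolds quoted in the text.
wolds = d18Ow_to_d18Op(-8.0)
```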
Building on the approach used for the Annaghmare case study, the BASr and d18Op baselines were
converted into raster datasets with a cell size of 100m, and focal means were calculated to represent 5km
BASr and d18O catchments for every cell in the resultant raster datasets. Probability density surfaces were
calculated from the focal means using Bayes' theorem, with the prior probability distribution defined using either the probability density function for a continuous uniform distribution as a non-informative prior (single tracer) or the posterior probability distribution for strontium (dual tracer), and the sampling probability distribution defined using the probability density function for a normal distribution with location µ and scale σ. Parameters for the normal distribution for each of the isotope tracers were estimated using the
observed 87Sr/86Sr ratio and d18Op value for the burial and the standard deviation of the expected 87Sr/86Sr
ratio and the converted modern d18Ow value for the Cretaceous chalk respectively, and the resultant pos-
terior probability densities were rescaled by the largest observed density with values ranging between 0
and 1. A Euclidean distance surface with a cell size of 100m was calculated for Duggleby Howe. Zonal
statistics were then calculated from the probability density and Euclidean distance surfaces using geo-
graphic regions based on National Character Areas (https://naturalengland-defra.opendata.arcgis.com/
datasets/national-character-areas-england), National Landscape Character Areas (https://landmap-maps.
naturalresources.wales) and Landscapes of Scotland (https://gateway.snh.gov.uk/natural-spaces/). Geo-
graphic assignments were determined for Burial K based on the zonal statistics, with the regions ranked
by highest probability density and lowest Euclidean distance (Figure 11.5).
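The final ranking step can be sketched as zonal statistics over the two surfaces. The region IDs and one-dimensional "surfaces" below are toy values; the real inputs would be the raster probability density and Euclidean distance surfaces together with the region polygons named above:

```python
import numpy as np

def rank_regions(prob, dist, region_ids):
    """Zonal statistics per region: rank by highest maximum probability
    density, breaking ties with lowest mean Euclidean distance."""
    stats = []
    for rid in np.unique(region_ids):
        mask = region_ids == rid
        stats.append((int(rid), float(prob[mask].max()), float(dist[mask].mean())))
    stats.sort(key=lambda t: (-t[1], t[2]))   # probability desc, distance asc
    return stats

# Toy surfaces covering two regions
prob = np.array([0.2, 0.9, 0.9, 0.1])        # posterior probability density
dist = np.array([10.0, 50.0, 5.0, 80.0])     # Euclidean distance from the site
regions = np.array([1, 1, 2, 2])
ranked = rank_regions(prob, dist, regions)
# Both regions reach a density of 0.9; region 1 ranks first because it is
# nearer to the site on average.
```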
The geographic assignments for Burial K based on Bayesian statistics and maximum likelihood esti-
mation paint a different picture to the previous analysis (Montgomery, Evans et al., 2007; Montgomery,
Cooper et al., 2007; Evans et al., 2012; Montgomery & Jay, 2013). Where a single isotope tracer is used,
the observed 87Sr/86Sr ratio for Burial K suggests that the individual spent their childhood in Eastern
Britain with the closest match on the Yorkshire Wolds while the observed d18Op value suggests that the
individual spent their childhood in Western Britain with the closest match on Barra and Uist, Outer
Hebrides (Table 11.1). The geographic assignment based on the observed 87Sr/86Sr ratio for Burial K
reflects averaging of the expected 87Sr/86Sr ratios of the Cretaceous Chalk of the Yorkshire Wolds, and
the Jurassic Clay of the Howardian Hills and the Triassic Rocks of the Humberhead Levels, the Vale of
Pickering and the Vale of York adjacent to the Yorkshire Wolds. In contrast to point-based comparisons,
comparisons based on catchments take into account the possibility of locally obtaining food/drink from
more than one source (cf. Montgomery, 2010). Where multiple isotope tracers are used, the observed 87Sr/86Sr ratio and d18Op value for Burial K suggest that the individual spent their childhood in Western
Britain with the closest match on The Lizard Peninsula, Cornwall. Links with Western Scotland and
Cornwall cannot be evidenced on the basis of the observed 87Sr/86Sr ratio for the burial and are instead
based on the observed d18Op value for the burial. This discrepancy is repeated for the other burials from
Duggleby Howe (Pouncett, 2019).
Whilst d18O values are commonly used to narrow the range of possible locations based on 87Sr/86Sr
ratios in multi-isotope tracer approaches, the interpretation of Burial K from Duggleby Howe is dispro-
portionately skewed by the d18O value – to the point where the geographic assignments are effectively
based on a single tracer isotope, with the local origins supported by the strontium isotope measurement
overridden by the distant origins supported by the oxygen isotope measurement. This analysis raises
significant questions about the utility of oxygen as a tracer isotope and the modern groundwater values
that are used as a baseline for the study of mobility and migration in mainland Britain. Analysis of the
oxygen isotope ratios carried out as part of the Beaker People Project has shown that burials from several
of the burial mounds from Eastern Yorkshire, including Garton Slack 37 on the Yorkshire Wolds, exhibit
more than half of the national variation (Pellegrini et al., 2016). This degree of variation is perhaps not
surprising given that d18Op values from tooth enamel can be affected by a wide range of factors, including
Figure 11.5 Probability density surface (Top Left) and maximum likelihood estimations showing the locations
where Burial K from Duggleby Howe could have originated from based on the observed 87Sr/86Sr ratio (Top
Right), the observed d18O value (Bottom Left), and both the observed 87Sr/86Sr ratio and d18O value (Bottom
Right). The geographic assignments based on dual isotope tracers are unduly influenced by one of the isotopes
(oxygen) raising further questions about the utility of oxygen as a tracer isotope. A colour version of this figure
can be found in the plates section.
Table 11.1 Geographic assignments for Burial K from Duggleby Howe ranked by highest probability density and
lowest Euclidean distance, using regions based on National Character Areas (England), National Landscape Character
Areas (Wales) and Landscapes of Scotland (Scotland).
Columns: Rank; 87Sr/86Sr; d18Op; 87Sr/86Sr + d18Op.
short-term climate conditions, sourcing waters from reservoirs, preparation of food and drink, analytical
errors and physiological differences between individuals. The uncertainty introduced by this variability is
compounded by the process of converting d18O values for modern water to d18O values for tooth enamel, which is known to be problematic (Pollard, Pellegrini, & Lee-Thorp, 2011). Different formulae
for converting between d18O values for modern groundwater and d18O values for tooth enamel (Chenery,
Pashley, Lamb, Sloane, & Evans, 2012; Daux et al., 2008; Longinelli, 1984; Luz et al., 1984; Pollard et al.,
2011) would potentially result in an individual being assigned to different geographic regions.
Conclusion
The case studies used to illustrate the spatial approaches commonly employed to determine the possible geographic origins of humans and animals highlight two key points. First, the
analysis of the individuals from Annaghmare and Duggleby Howe highlights the ambiguity in the pos-
sible geographic regions from which the individuals originated. Where a single isotope tracer is used
more than one geographic region may have the same isotope measurement as the individual, and where
multiple isotope tracers are used each isotope may suggest that the individual originated from a differ-
ent geographic region. Secondly, the analysis of the individuals from Annaghmare and Duggleby Howe
highlights the importance of the baseline data that are used for the purposes of geographic assignment. If
the baseline data do not adequately account for key factors that influence variation in the isotope tracer
measurements, the resultant geographic assignments will not be reliable.
At Annaghmare, the residuals calculated between the expected 87Sr/86Sr ratio for 5km BASr catch-
ments and the observed 87Sr/86Sr ratio from cremated bone suggested that Individual A2 was non-local
and could have spent the last decade or so of their life in Central or Western Ireland. The baseline data for Ireland are based solely on bedrock geology, and 87Sr/86Sr ratios comparable to the observed 87Sr/86Sr ratio for Individual A2 can be found in the superficial deposits that locally overlie the geological formation on which the court tomb is located but are not reflected in the plant samples taken from the immediate
vicinity of the tomb (Snoeck, Pouncett et al., 2016). At Duggleby Howe, Bayesian statistics and maximum
likelihood estimation highlighted a discrepancy between the geographic regions from which Burial K
could have originated based on the observed 87Sr/86Sr ratio and d18O value. The geographic assignment
based solely on the 87Sr/86Sr ratio suggested that the individual was local and could have spent part of
their childhood on the Yorkshire Wolds or adjacent regions, while the geographic assignments based on
the d18Op value (either as a single isotope tracer or a multi-isotope tracer) suggested that the individual
was non-local and could have spent part of their childhood on the Lizard Peninsula, Cornwall. This dis-
crepancy raises significant questions about the utility of oxygen as an isotope tracer and in particular the
converted d18O values of modern groundwater which are often used as a baseline.
Although both of the case studies in this chapter related to the use of isotope tracers from tooth enamel or cremated bone to ascertain the likelihood that an individual spent part of their childhood (tooth enamel, reflecting the c. 2–3 years of enamel formation) or the last c. 10 years of their life (cremated bone) in a particular geographic region, the spatial approaches that were introduced can be applied to other types of archaeological samples and analytical measurements, provided that suitable comparative data are available to create a robust baseline for the purposes of geographic assignment.
Both the approach based on the calculation of residuals and the approach based on Bayesian statistics
and maximum likelihood estimation will yield similar results. The approach based on the calculation of
residuals retains a direct link to the measured values and, as such, is perhaps more intuitive and easier to
interpret – particularly in instances where sources of error are poorly understood at the time the analysis
is carried out.
Acknowledgements
This chapter would not have been possible without the support of Christophe Snoeck who undertook
the original analysis on the samples from Annaghmare, and Maura Pellegrini who carried out the research
which prompted the re-analysis of the isotope data from Duggleby Howe. It arose from work carried
out with Joanna Ostapkowicz as part of the Black Pitch, Carved Histories project funded by the AHRC
(AH/L00268X/1), and the Stone Interchanges within the Bahama archipelago project funded by the
AHRC (AH/N007476/1). Emma Gowans, Chris Green, Stuart Pouncett, Mark Gillings and Rick Schult-
ing have commented on earlier drafts of this chapter and have greatly improved the final text and figures.
Any errors or omissions are entirely my own. Lastly, I would like to thank the editors for inviting me to
contribute to this volume.
Note
1 The expected 87Sr/86Sr ratio is quoted as the median ± median absolute deviation for the geological formation.
References
Bentley, R. A., & Knipper, C. (2005). Geographical patterns in biologically available strontium, car-
bon and oxygen isotope signatures in prehistoric SW Germany. Archaeometry, 47(3), 629–644. https://doi.
org/10.1111/j.1475-4754.2005.00223.x
Bowen, G. J., Liu, Z., Vander Zanden, H. B., Zhao, L., & Takahashi, G. (2014). Geographic assignment with stable
isotopes in IsoMAP. Methods in Ecology and Evolution, 5(3), 201–206. https://doi.org/10.1111/2041-210X.12147
Bowen, G. J., & Wilkinson, B. (2002). Spatial distribution of d18O in meteoric precipitation. Geology, 30(4), 315–318. https://doi.org/10.1130/0091-7613(2002)030<0315:SDOOIM>2.0.CO;2
Bray, P., Cuénod, A., Gosden, C., Hommel, P., Liu, R., & Pollard, A. M. (2015). Form and flow: The “karmic cycle”
of copper. Journal of Archaeological Science, 56, 202–209. https://doi.org/10.1016/j.jas.2014.12.013
Brettell, R., Montgomery, J., & Evans, J. (2012). Brewing and stewing: The effect of culturally mediated behaviour
on the oxygen isotope composition of ingested fluids and the implications for human provenance studies. Journal
of Analytical Atomic Spectrometry, 27(5), 778–785. https://doi.org/10.1039/C2JA10335D
Budd, P., Montgomery, J., Barreiro, B., & Thomas, R. G. (2000). Differential diagenesis of strontium in archaeological
human dental tissues. Applied Geochemistry, 15(5), 687–694. https://doi.org/10.1016/S0883-2927(99)00069-4
Capo, R. C., Stewart, B. W., & Chadwick, O. A. (1998). Strontium isotopes as tracers of ecosystem processes: theory
and methods. Geoderma, 82(1), 197–225. https://doi.org/10.1016/S0016-7061(97)00102-X
Carver, G. (2012). Pits and place-making: Neolithic habitation and deposition practices in East Yorkshire c. 4000–
2500 BC. Proceedings of the Prehistoric Society, 78, 111–134. https://doi.org/10.1017/S0079497X00027134
Chenery, C. A., Müldner, G., Evans, J., Eckardt, H., & Lewis, M. (2010). Strontium and stable isotope evidence
for diet and mobility in Roman Gloucester, UK. Journal of Archaeological Science, 37(1), 150–163. https://doi.
org/10.1016/j.jas.2009.09.025
Chenery, C. A., Pashley, V., Lamb, A. L., Sloane, H. J., & Evans, J. A. (2012). The oxygen isotope relationship between
the phosphate and structural carbonate fractions of human bioapatite. Rapid Communications in Mass Spectrometry,
26(3), 309–319. https://doi.org/10.1002/rcm.5331
Cole, E. (1901). Duggleby Howe. Transactions of the East Riding Antiquarian Society, 9, 57–61.
Darling, W. G., Bath, A. H., & Talbot, J. C. (2003). The O and H stable isotope composition of freshwaters in the
British Isles. 2. Surface waters and groundwater. Hydrology and Earth System Sciences, 7(2), 183–195. https://doi.
org/10.5194/hess-7-183-2003
Darvill, T. C. (1979). Court cairns, passage graves and social change in Ireland. Man, 14(2), 311. https://doi.
org/10.2307/2801570
Daux, V., Lécuyer, C., Héran, M.-A., Amiot, R., Simon, L., Fourel, F., . . . Escarguel, G. (2008). Oxygen isotope frac-
tionation between human phosphate and water revisited. Journal of Human Evolution, 55(6), 1138–1147. https://
doi.org/10.1016/j.jhevol.2008.06.006
Durden, T. (1995). The production of specialised flintwork in the later Neolithic: A case study from the Yorkshire
Wolds. Proceedings of the Prehistoric Society, 61, 409–432. https://doi.org/10.1017/S0079497X00003157
Emery, M. V., Prowse, T. L., Elford, S., Schwarcz, H. P., & Brickley, M. (2017). Geographic origins of a War of 1812
skeletal sample integrating oxygen and strontium isotopes with GIS-based multi-criteria evaluation analysis. Jour-
nal of Archaeological Science: Reports, 14(Supplement C), 323–331. https://doi.org/10.1016/j.jasrep.2017.06.007
Evans, J. A., Chenery, C. A., & Fitzpatrick, A. P. (2006). Bronze Age childhood migration of individuals near Stone-
henge, revealed by strontium and oxygen isotope tooth enamel analysis. Archaeometry, 48(2), 309–321. https://
doi.org/10.1111/j.1475-4754.2006.00258.x
Evans, J. A., Chenery, C. A., & Montgomery, J. (2012). A summary of strontium and oxygen isotope variation
in archaeological human tooth enamel excavated from Britain. Journal of Analytical Atomic Spectrometry, 27(5),
754–764. https://doi.org/10.1039/C2JA10362A
Evans, J. A., Montgomery, J., Wildman, G., & Boulton, N. (2010). Spatial variations in biosphere 87Sr/86Sr in Britain.
Journal of the Geological Society, 167(1), 1–4. https://doi.org/10.1144/0016-76492009-090
Faure, G. (1986). Principles of isotope geology (2nd ed.). Chichester: Wiley.
Gat, J. R. (1971). Comments on the stable isotope method in regional groundwater investigations. Water Resources
Research, 7(4), 980–993. https://doi.org/10.1029/WR007i004p00980
Gibson, A. (2016). Who were these people? A sideways view and a non-answer of political proportions. In
K. Brophy, G. MacGregor, & I. Ralston (Eds.), The neolithic of mainland Scotland (pp. 57–73). Edinburgh: Edinburgh
University Press.
Gibson, A., Allen, M., Bradley, P., Carruthers, W., Challinor, D., French, C., . . . Walmsley, C. (2011). Report on the
excavation at the Duggleby Howe causewayed enclosure, North Yorkshire, May–July 2009. Archaeological Journal,
168(1), 1–63. https://doi.org/10.1080/00665983.2011.11020828
Gibson, A., & Bayliss, A. (2009). Recent research at Duggleby Howe, North Yorkshire. Archaeological Journal, 166(1),
39–78. https://doi.org/10.1080/00665983.2009.11078220
Goodchild, M. F. (2003). The nature and value of geographic information. In M. Duckham, M. F. Goodchild, &
M. Worboys (Eds.), Foundations of geographic information science (pp. 18–30). London: Taylor & Francis.
Harding, J. (2006). Pit-digging, occupation and structured deposition on Rudston Wold, Eastern Yorkshire. Oxford
Journal of Archaeology, 25(2), 109–126. https://doi.org/10.1111/j.1468-0092.2006.00252.x
Spatial approaches to assignment 209
Hodell, D. A., Mueller, P. A., McKenzie, J. A., & Mead, G. A. (1989). Strontium isotope stratigraphy and
geochemistry of the late Neogene ocean. Earth and Planetary Science Letters, 92(2), 165–178. https://doi.
org/10.1016/0012-821X(89)90044-7
Hoppe, K. A., Koch, P. L., & Furutani, T. T. (2003). Assessing the preservation of biogenic strontium in fossil bones
and tooth enamel. International Journal of Osteoarchaeology, 13(1–2), 20–28. https://doi.org/10.1002/oa.663
Hutton, R. (2014). Pagan Britain. New Haven, CT: Yale University Press.
Kinnes, I. (1979). Round barrows and ring-ditches in the British Neolithic. London: British Museum.
Laffoon, J. E., Sonnemann, T. F., Shafie, T., Hofman, C. L., Brandes, U., & Davies, G. R. (2017). Investigating human
geographic origins using dual-isotope (87Sr/86Sr, d18O) assignment approaches. PLoS One, 12(2), e0172562. https://
doi.org/10.1371/journal.pone.0172562
Lightfoot, E., & O’Connell, T. C. (2016). On the use of biomineral oxygen isotope data to identify human migrants
in the archaeological record: Intra-sample variation, statistical methods and geographical considerations. PLoS
One, 11(4), e0153850. https://doi.org/10.1371/journal.pone.0153850
Lin, G. P., Rau, Y. H., Chen, Y. F., Chou, C. C., & Fu, W. G. (2003). Measurements of dD and d18O stable isotope
ratios in milk. Journal of Food Science, 68(7), 2192–2195. https://doi.org/10.1111/j.1365-2621.2003.tb05745.x
Longinelli, A. (1984). Oxygen isotopes in mammal bone phosphate: A new tool for paleohydrological and paleoclimato-
logical research? Geochimica et Cosmochimica Acta, 48(2), 385–390. https://doi.org/10.1016/0016-7037(84)90259-X
Loveday, R. (2002). Duggleby Howe revisited. Oxford Journal of Archaeology, 21(2), 135–146. https://doi.
org/10.1111/1468-0092.00153
Loveday, R. (2011). Polished rectangular flint knives–elaboration or replication? In A. Saville (Ed.), Flint and stone in
the Neolithic period (pp. 37–61). Oxford: Oxbow Books.
Loveday, R. (2016). Monuments to mobility? Investigating cursus patterning in Southern Britain. In J. Leary &
T. Kador (Eds.), Moving on in Neolithic studies: Understanding mobile lives (pp. 67–109). Oxford: Oxbow Books.
Luz, B., Kolodny, Y., & Horowitz, M. (1984). Fractionation of oxygen isotopes between mammalian bone-
phosphate and environmental drinking water. Geochimica et Cosmochimica Acta, 48(8), 1689–1693. https://doi.
org/10.1016/0016-7037(84)90338-7
Manby, T. (1988). The Neolithic in Eastern Yorkshire. In T. Manby (Ed.), Archaeology in Eastern Yorkshire: Essays in hon-
our of T. C. M. Brewster (pp. 35–88). Sheffield: Department of Archaeology and Prehistory, University of Sheffield.
Montgomery, J. (2010). Passports from the past: Investigating human dispersals using strontium isotope analysis of
tooth enamel. Annals of Human Biology, 37(3), 325–346. https://doi.org/10.3109/03014461003649297
Montgomery, J., Budd, P., & Evans, J. (2000). Reconstructing the lifetime movements of ancient people: A
Neolithic case study from Southern England. European Journal of Archaeology, 3(3), 370–385. https://doi.
org/10.1177/146195710000300304
Montgomery, J., Cooper, R. E., & Evans, J. (2007). Foragers, farmers or foreigners? An assessment of dietary stron-
tium isotope variation in Middle Neolithic and Early Bronze Age East Yorkshire. In M. Larsson & M. Parker-
Pearson (Eds.), From Stonehenge to the Baltic: Living with cultural diversity in the third millennium BC (pp. 65–75).
Oxford: Archaeopress.
Montgomery, J., Evans, J. A., & Cooper, R. E. (2007). Resolving archaeological populations with Sr-isotope mixing
models. Applied Geochemistry, 22(7), 1502–1514. https://doi.org/10.1016/j.apgeochem.2007.02.009
Montgomery, J., & Jay, M. (2013). The contribution of Skeletal Isotope Analysis to understanding the Bronze Age in
Europe. In H. Fokkens & A. Harding (Eds.), The Oxford handbook of the European Bronze Age. Retrieved from www.
oxfordhandbooks.com/view/10.1093/oxfordhb/9780199572861.001.0001/oxfordhb-9780199572861-e-10
Moreau, L., Brandl, M., Filzmoser, P., Hauzenberger, C., Goemaere, É., Jadin, I., . . . Schmitz, R. W. (2016). Geochemi-
cal sourcing of flint artifacts from Western Belgium and the German Rhineland: Testing hypotheses on Gravettian
period mobility and raw material economy. Geoarchaeology, 31(3), 229–243. https://doi.org/10.1002/gea.21564
Mortimer, J. (1892). An account of the opening of the tumulus “Howe Hill” Duggleby. Proceedings of the Yorkshire
Geological and Polytechnical Society, 12, 215–225.
Mortimer, J. (1893). Further observations on the contents of the Howe Tumulus [Duggleby]. Proceedings of the York-
shire Geological and Polytechnical Society, 12, 242–245.
Mortimer, J. (1905). Forty years’ researches in British and Saxon burial mounds of East Yorkshire: Including Romano-British
discoveries, and a description of the ancient entrenchments of a section of the Yorkshire Wolds. London: A Brown and Sons,
Limited.
210 John Pouncett
Neil, S., Evans, J., Montgomery, J., & Scarre, C. (2018). Isotopic evidence for landscape use and the role of causewayed
enclosures during the earlier Neolithic in Southern Britain. Proceedings of the Prehistoric Society, 1–21. https://doi.
org/10.1017/ppr.2018.6
Ostapkowicz, J., Brock, F., Wiedenhoeft, A. C., Snoeck, C., Pouncett, J., Baksh-Comeau, Y., . . . Boomert, A. (2017).
Black pitch, carved histories: Radiocarbon dating, wood species identification and strontium isotope analysis of
prehistoric wood carvings from Trinidad’s Pitch Lake. Journal of Archaeological Science: Reports, 16, 341–358. https://
doi.org/10.1016/j.jasrep.2017.08.018
Pellegrini, M., Pouncett, J., Jay, M., Parker-Pearson, M., & Richards, M. P. (2016). Tooth enamel oxygen “isoscapes”
show a high degree of human mobility in prehistoric Britain. Scientific Reports, 6. https://doi.org/10.1038/
srep34986
Pierpoint, S. (1980). Social patterns in Yorkshire prehistory 3500–750 B.C. Oxford: British Archaeological Reports.
Pollard, A. M. (Ed.). (2018). Beyond provenance: New approaches to interpreting the chemistry of archaeological copper alloys.
Leuven: Leuven University Press.
Pollard, A. M., Pellegrini, M., & Lee-Thorp, J. A. (2011). Technical note: Some observations on the conversion of
dental enamel d18Op values to d18Ow to determine human mobility. American Journal of Physical Anthropology, 145(3),
499–504. https://doi.org/10.1002/ajpa.21524
Pouncett, J. (2019). Neolithic occupation and stone working on the Yorkshire Wolds (Unpublished doctoral dissertation).
University of Oxford, Oxford.
Price, T. D., Burton, J. H., & Bentley, R. A. (2002). The characterization of biologically available strontium isotope ratios
for the study of prehistoric migration. Archaeometry, 44(1), 117–135. https://doi.org/10.1111/1475-4754.00047
Riley, D. (1980). Recent air photographs of Duggleby Howe and the Ferrybridge henge. Yorkshire Archaeological
Journal, 52, 174–178.
Ryan, S. E., Snoeck, C., Crowley, Q. G., & Babechuk, M. G. (2018). 87Sr/86Sr and trace element mapping of geosphere-
hydrosphere-biosphere interactions: A case study in Ireland. Applied Geochemistry, 92, 209–224. https://doi.
org/10.1016/j.apgeochem.2018.01.007
Schulting, R. J., le Roux, P., Gan,Y. M., Pouncett, J., Hamilton, J., Snoeck, C., . . . Lock, G. (2019). The ups & downs
of Iron Age animal management on the Oxfordshire Ridgeway, south-central England: A multi-isotope approach.
Journal of Archaeological Science, 101, 199–212. https://doi.org/10.1016/j.jas.2018.09.006
Schulting, R. J., Murphy, E., Jones, C., & Warren, G. (2012). New dates from the north and a proposed chronology
for Irish court tombs. Proceedings of the Royal Irish Academy. Section C: Archaeology, Celtic Studies, History, Linguistics,
Literature, 112C, 1–60.
Sharp, Z. (2007). Principles of stable isotope geochemistry. Upper Saddle River, NJ: Pearson Education.
Smith, A. G., Pilcher, J. R., & Pearson, G. W. (1971). New radiocarbon dates from Ireland. Antiquity, 45(178), 97–102.
https://doi.org/10.1017/S0003598X00069246
Snoeck, C., Lee-Thorp, J., Schulting, R., Jong, J. de, Debouge, W., & Mattielli, N. (2015). Calcined bone provides
a reliable substrate for strontium isotope ratios as shown by an enrichment experiment. Rapid Communications in
Mass Spectrometry, 29(1), 107–114. https://doi.org/10.1002/rcm.7078
Snoeck, C., Pouncett, J., Claeys, P., Goderis, S., Mattielli, N., Parker-Pearson, M., . . . Schulting, R. J. (2018). Strontium
isotope analysis on cremated human remains from Stonehenge support links with west Wales. Scientific Reports,
8(1), 10790. https://doi.org/10.1038/s41598-018-28969-8
Snoeck, C., Pouncett, J., Ramsey, G., Meighan, I. G., Mattielli, N., Goderis, S., . . . Schulting, R. J. (2016). Mobility
during the Neolithic and Bronze Age in Northern Ireland explored using strontium isotope analysis of cremated
human bone. American Journal of Physical Anthropology, 160(3), 397–413. https://doi.org/10.1002/ajpa.22977
Snoeck, C., Ryan, S. E., Pouncett, J., Pellegrini, M., Claeys, P., Wainwright, A., . . . Schulting, R. J. (2019). Towards a
biologically available strontium baseline for Ireland. Manuscript submitted for publication.
Snoeck, C., Schulting, R. J., Lee-Thorp, J. A., Lebon, M., & Zazzo, A. (2016). Impact of heating conditions on the
carbon and oxygen isotope composition of calcined bone. Journal of Archaeological Science, 65, 32–43. https://doi.
org/10.1016/j.jas.2015.10.013
Warham, J. O. (2011). Mapping biosphere strontium isotope ratios across major lithological boundaries: A systematic inves-
tigation of the major influences on geographic variation in the 87Sr/86Sr composition of bioavailable strontium above the
Cretaceous and Jurassic rocks of England (Doctoral dissertation). Retrieved from https://bradscholars.brad.ac.uk/
handle/10454/5500
Spatial approaches to assignment 211
Waterman, D. M., & Morton, W. R. M. (1965). The court cairn at Annaghmare, Co. Armagh. Ulster Journal of
Archaeology, 28, 3–46.
White, C. D., Spence, M. W., Longstaffe, F. J., & Law, K. R. (2004). Demography and ethnic continuity in the Tlai-
lotlacan enclave of Teotihuacan: The evidence from stable oxygen isotopes. Journal of Anthropological Archaeology,
23(4), 385–403. https://doi.org/10.1016/j.jaa.2004.08.002
Wilson, J. C., & Standish, C. D. (2016). Mobility and migration in late Iron Age and Early Medieval Ireland. Journal
of Archaeological Science: Reports, 6, 230–241. https://doi.org/10.1016/j.jasrep.2016.02.016
Wright, L. E. (2005). Identifying immigrants to Tikal, Guatemala: Defining local variability in strontium isotope
ratios of human tooth enamel. Journal of Archaeological Science, 32(4), 555–566. https://doi.org/10.1016/j.
jas.2004.11.011
Wunder, M. B. (2010). Using isoscapes to model probability surfaces for determining geographic origins. In J. West,
G. Bowen, T. Dawson, & K. Tu (Eds.), Isoscapes: Understanding movement, pattern, and process on earth through isotope
mapping (pp. 251–270). Dordrecht: Springer.
Wunder, M. B., Kester, C. L., Knopf, F. L., & Rye, R. O. (2005). A test of geographic assignment using isotope tracers
in feathers of known origin. Oecologia, 144(4), 607–617. https://doi.org/10.1007/s00442-005-0071-y
12
Analysing regional environmental
relationships
Kenneth L. Kvamme
Introduction
The study of the distributions of archaeological settlements and sites across the landscape has a long
history. Frequently referred to as “settlement pattern” research or as “settlement archaeology” (Chang,
1968), most scholars agree that it first became a serious focus with the publication of Willey’s (1953)
Prehistoric Settlement Patterns in the Virú Valley, Peru. That work aimed to describe “prehistoric sites with
reference to geographic . . . position” and “to reconstruct cultural institutions insofar as these may be
affected by settlement configurations” (Willey, 1953, p. 1). Billman (1997, p. 1) argues that “no other
single project has so fundamentally changed the manner in which archaeology is conducted.”
Yet, there were antecedents. Trigger (1967) observes that archaeologists have long noticed environmental associations, such as the one between Linearbandkeramik sites and loess soils in Europe. It was
Julian Steward, a cultural anthropologist active in archaeology, who probably played the largest role in
getting settlement archaeology off the ground by combining a focus in cultural ecology with the study
of regions as units in such pioneering papers as “Ecological Aspects of Southwestern Society” (Steward,
1937), where regional archaeological distributions yielded insights into the development of Puebloan
society. Murphy (1977, p. 26) concludes that Steward pioneered settlement archaeology, noting that the
Virú Valley project was largely planned by Steward (along with Wendell Bennett) who "assigned" Willey
to the settlement pattern investigations.
Just what is settlement archaeology and the study of settlement patterns? Before proceeding, it should
be clarified that although settlements are commonly regarded as villages or communities of multiple
households, the investigation of settlement patterns pertains to regional archaeological distributions of
any kind, so hereafter “site” and “settlement” are used interchangeably. Early investigators (e.g. Chang,
1968; Parsons, 1972; Trigger, 1968) commonly recognized several areas of investigation, but my focus
is on the distribution of settlements over the landscape. In this context Kantner (1996, p. 636) defines
settlement pattern as “the distribution of human activities across the landscape and the spatial relationship
between these activities and features of the natural and social environment.” These two environments are
critical to understanding the perspectives and methodologies that have developed.
It is generally argued that the natural environment is central to settlement because regional popula-
tion distributions are largely governed by the nature and availability of natural resources. At the same
time, social, political, and religious institutions also frame patterns of settlement (Trigger, 1968). Winters
(1969, p. 110) views settlement pattern as “geographic and physiographic relationships” (the natural
environment), while settlement system pertains to “the functional relationships among the sites con-
tained within a settlement pattern” (the social environment). The Southwestern Archaeological Research
Group (SARG), a consortium of researchers in the American Southwest that focused on the question
of why “prehistoric populations located sites where they did”, explicitly recognized that sites had to be
investigated in the context of these two environments (Plog & Hill, 1971, pp. 8–9). Separate analytical
approaches have formed around these distinct perspectives in settlement pattern studies.
The foregoing duality is emphasized because it fits well within contemporary spatial analytic perspec-
tives. At the scale of regions an archaeological distribution may be regarded as a point pattern, which is
a realization of one or more spatial processes (O’Sullivan & Unwin, 2003, pp. 64–66). Bevan, Crema,
Li, and Palmisano (2013) describe first- and second-order characteristics of a realized point pattern (see
Bevan, this volume). The former influence the intensity of points, causing regional variation. Favour-
able environmental circumstances (soils, access to water) commonly encourage such first-order trends.
Second-order effects occur when interactions between the points themselves impact their distributions,
such as repulsion or attraction. In other words, the presence of a site, such as a central place, may influence
the distribution of other sites. This chapter focuses exclusively on first-order spatial characteristics and
ways that have been employed to analyse environmental relationships with archaeological distributions.
Core methods
5 km (one hour) for farming peoples. An application by Peebles (1978) examined siting preferences of Mis-
sissippian settlements in the vicinity of Moundville, a ceremonial centre in Alabama. Early twentieth century
pre-hybrid corn yields in bushels per acre by soil type were employed as a proxy for Mississippian period
corn yields (recognizing that absolute yields might be overestimated, but relative scaling would be accurate).
Catchment radii of 0.6 and 1.2 miles (approximately 1 and 2 km, respectively) were investigated around
each site and corn productivity was cumulated by soil type. The data showed that larger settlements were associated with more productive catchments, with strong correlations (some as high as r = .87) between catchment productivity in bushels of corn and village size.
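Peebles' correlation-based approach can be sketched in a few lines of Python; the catchment yields and settlement sizes below are invented for illustration, not his published figures:

```python
import math

# Hypothetical data (not Peebles' actual figures): cumulated corn yield
# (bushels) within each settlement's catchment, and settlement size (ha).
catchment_yield = [1200, 950, 700, 640, 400, 310, 220]
village_size_ha = [9.5, 7.8, 6.0, 5.1, 3.9, 2.6, 2.2]

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r = pearson_r(catchment_yield, village_size_ha)
print(f"r = {r:.2f}")  # a strong positive association for these toy data
```

With real data the same calculation would be run on productivity cumulated within each catchment radius.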
approximated from only three data points – the remainder of the cumulative curve was simply “sketched
in” between them. It would have to wait until the advent of geographical information systems (GIS) more
than a decade later before more accurate approximations of background distributions could be achieved.
GIS modifications
By the 1990s GIS technology was making a tremendous impact in regional studies. Besides ease of data
handling, Harris and Lock (1990) realized that it provided a means to link the two primary approaches of
regional investigations: visual and subjective appraisals of map-type information and quantitative analysis.
It also encouraged a change in the direction of “spatial analysis” which began to lean heavily toward
visualization and away from quantification. Yet, although data quality, volume, and ease of analysis were improved, in practice GIS approaches delivered few advances over analysis methods established decades earlier; they simply made those methods easier to apply.
New variables
With GIS a variety of new variables could be explored that were previously difficult or impossible to
compute. Two that stand out are viewsheds, areas visible from a point or points, and cost distances, quan-
tifications of non-linear distances computed by considering difficulty of human travel over landscapes
(see Gillings & Wheatley; Herzog, this volume; Conolly & Lake, 2006, p. 215). Wheatley (1995) exam-
ined inter-visibility between barrows in the region of Stonehenge, in the United Kingdom, where their
prominence on the landscape has been long noted. A viewshed was computed from each barrow and
the multiple viewsheds were then summed to yield a cumulative viewshed, where each cell in the raster
data structure held the number of barrows visible. Comparing the cumulative distributions measured at
the barrows against the cumulative distributions for the entire region using a one-sample Kolmogorov-
Smirnov test showed significantly greater inter-visibility between barrows, suggesting their siting for
visibility and a ritual authority that required them to be seen.
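The cumulative-viewshed logic can be sketched with NumPy; the binary viewsheds below are randomly generated stand-ins for real DEM-derived ones, and the barrow locations are likewise invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for computed binary viewsheds: one boolean raster per
# barrow (True = cell visible from that barrow). In practice each comes
# from a GIS viewshed operation on a DEM.
n_barrows, shape = 12, (50, 50)
viewsheds = rng.random((n_barrows, *shape)) < 0.3  # hypothetical

# Cumulative viewshed: per-cell count of barrows that can see the cell.
cumulative = viewsheds.sum(axis=0)

# Hypothetical barrow cell locations; sample the cumulative raster there.
rows = rng.integers(0, shape[0], n_barrows)
cols = rng.integers(0, shape[1], n_barrows)
at_barrows = cumulative[rows, cols]

# Wheatley compared the distribution of these values at the barrows with
# the background distribution (all cells) using a Kolmogorov-Smirnov test.
background = cumulative.ravel()
print("mean visibility at barrows:", at_barrows.mean())
print("mean visibility overall:  ", background.mean())
```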
In GIS-based SCA, Hunt (1992) showed quite early how GIS can increase the accuracy of environmental features represented within catchments and how more realistic boundaries could be formed utilizing drainages or their basins (as opposed to arbitrary circular territories). However, more profound changes were
introduced by Gaffney and Stančič (1996) who proposed cost-distance weighted catchments based on
difficulty of travel to or from a site that formed more realistic territories of irregular shape within which
catchment calculations could be based. Ullah (2011) gives a more contemporary example of this approach.
New methods
Raster GIS permit quantification of entire regions on a cell-by-cell basis (e.g. every 10 m) allowing
entire “population distributions” of covariates to be more closely approximated for better estimates of
background proportions in goodness-of-fit tests. Kvamme (1992a), for example, replicated Hodder and
Orton’s (1976, pp. 226–229) analysis of Iron Age coin finds against the Roman road network in the
south of England, but instead of approximating the background distribution of distances to roads with
only three data points, more than 23,000 were employed, one for each square kilometre in the study
region. This more accurate analysis yielded the same conclusion, but the statistical significance was
somewhat lower. Digital characterization of entire background environments also allows replacement
of two-sample tests with one-sample forms, because it is no longer necessary to sample the background.
The unusualness of a single archaeological sample’s locational tendencies can be examined against entire
background populations (Kvamme, 1990).
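A one-sample comparison of this kind can be sketched in pure Python, assuming an exhaustively enumerated background raster; all values below are invented:

```python
from bisect import bisect_right

# Complete background "population": one invented elevation per raster cell.
background = sorted(range(0, 1000))
# Hypothetical site elevations, all drawn from the low end of the range.
sites = sorted([40, 55, 80, 120, 150, 210, 260, 300])

def ecdf(sorted_values, x):
    """Proportion of sorted_values that are <= x."""
    return bisect_right(sorted_values, x) / len(sorted_values)

# One-sample Kolmogorov-Smirnov statistic: the largest gap between the
# site sample's ECDF and the fully known background distribution.
D = max(abs(ecdf(sites, x) - ecdf(background, x))
        for x in set(sites) | set(background))
print(f"D = {D:.3f}")  # a large D flags unusual locational tendencies
```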
GIS also facilitate use of resampling or randomization methods where, given a sample of n archaeologi-
cal sites, raster methods can generate thousands of random samples of the same size (with replacement)
for a variable of interest. The result is a sampling distribution of some focus statistic (e.g. mean, median,
variance) against which the actual sample is compared. If the archaeological sample statistic lies in the
extreme five percent of all samples, then tendencies significantly different from the sampled region can be
claimed (Fisher, Farrelly, Maddocks, & Ruggles, 1997; Kvamme, 1996). As a form of permutation test, this
approach is resistant to extreme values and offers freedom from the limiting assumptions (e.g. normality,
homogeneity) associated with many statistical tests (Berry, Johnson, & Mielke, 2014, p. 6).
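The randomization procedure can be sketched in plain Python; the background elevations and site values are invented, and the seed is fixed only for reproducibility:

```python
import random

random.seed(42)

# Background "population": one elevation value per raster cell (invented).
background = list(range(1000))                           # elevations 0..999
site_values = [10, 25, 30, 45, 50, 60, 70, 85, 90, 95]   # hypothetical sites
observed_mean = sum(site_values) / len(site_values)

# Draw many random samples of the same size (with replacement) from the
# background and build a sampling distribution of the mean.
n_iter = 999
random_means = []
for _ in range(n_iter):
    sample = random.choices(background, k=len(site_values))
    random_means.append(sum(sample) / len(sample))

# One-tailed Monte Carlo p-value: how often is a random mean as low as
# the observed site mean? (+1 terms include the observed sample itself.)
count = sum(1 for m in random_means if m <= observed_mean)
p = (count + 1) / (n_iter + 1)
print(f"observed mean = {observed_mean}, p = {p:.3f}")
```

Because the comparison is against the empirical sampling distribution, no normality or homogeneity assumptions are needed.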
Multivariate approaches
Multivariate approaches came to the forefront with the advent of GIS and ALM. While some ALM only
focus on modelling as an end-product, largely for cultural resource management and planning, others
employ multivariate statistical models for the insights they offer into relationships between sites and
environment. When all variables are considered jointly in a multivariate analysis, many frequently lose
significance owing to inter-correlations and redundancies between them. Moreover, some models yield
coefficient weights that have interpretive value. Warren and Asch (2000), for example, based on a logistic
regression analysis between samples of sites and non-sites in central Illinois, interpret findings to show
increasing site probability near streams, in regions of higher local relief, on steep slopes, in floodplains,
and on upland knolls.
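A minimal sketch of a logistic-regression locational analysis, using synthetic data and a hand-rolled gradient-ascent fit in place of a statistics package (the variable name and effect sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic training data (no real survey data): distance to the nearest
# stream (km) for sampled locations, with site probability falling off
# with distance.
n = 400
dist_stream = rng.uniform(0.0, 1.0, n)
p_true = 1.0 / (1.0 + np.exp(-(3.0 - 8.0 * dist_stream)))
is_site = (rng.random(n) < p_true).astype(float)

# Fit a one-variable logistic regression by simple gradient ascent on
# the log-likelihood.
w, b, lr = 0.0, 0.0, 1.0
for _ in range(5000):
    p_hat = 1.0 / (1.0 + np.exp(-(b + w * dist_stream)))
    grad_w = np.mean((is_site - p_hat) * dist_stream)
    grad_b = np.mean(is_site - p_hat)
    w += lr * grad_w
    b += lr * grad_b

# A negative coefficient indicates decreasing site probability with
# increasing distance from streams, the kind of interpretive weight
# discussed above.
print(f"coefficient for distance-to-stream: {w:.2f}")
```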
Another multivariate strategy comes from species distribution modelling in the biological sciences
(Browning, Beaupré, & Duncan, 2005; Dunn & Duncan, 2000; Rotenberry, Preston, & Knick, 2006)
where an unusual application of principal components analysis (PCA) offers an approach for isolating
dimensions most relevant to analysed locations. It recognizes that many measured variables are obtained
through convenience or ease of measurement in GIS, that some variables may be partially or highly
correlated representing redundant expressions of similar phenomena, and that other variables may not
actually be relevant. PCA permits new independent dimensions to be defined that exhibit minimum
variance at the locations under analysis (i.e. the lowest principal components), thereby removing the most
variable siting aspects of a point pattern and leaving location commonalities in a reduced set of dimen-
sions. This approach was investigated by Kvamme (in press) using historic farmstead data from northwest
Arkansas where the lowest location-constraining components demonstrate natural and social features of
environment as dimensions relevant to farmstead placement (see the final case study in this chapter).
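The low-variance-component idea can be sketched with NumPy; the two environmental variables below are fabricated so that slope is tightly constrained at the (hypothetical) site locations while elevation varies freely:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented environmental measurements at 200 hypothetical site locations.
n = 200
elevation = 100.0 + 50.0 * rng.standard_normal(n)   # varies freely
slope = 2.0 + 0.01 * rng.standard_normal(n)         # nearly constant

X = np.column_stack([elevation, slope])
Xc = X - X.mean(axis=0)

# Principal components from the covariance matrix; eigh returns
# eigenvalues in ascending order, so index 0 is the LOWEST-variance
# component, the dimension most constraining site placement.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
lowest = eigvecs[:, 0]   # loadings [elevation, slope] of that component

print("component variances:", eigvals)
print("loadings of lowest component:", lowest)
```

The lowest component loads almost entirely on slope, identifying it as the constrained (and hence locationally relevant) dimension.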
Issues
One of the greatest difficulties in the investigation of first-order environmental relationships with archae-
ological distributions is constancy of environment. This is particularly true for zonal approaches that
investigate frequencies of site occurrence with respect to environmental communities, whose boundaries
may have varied through time. Modern alterations to landscape, changing courses of rivers and streams,
and climate change frequently challenge assumptions of unchanged environments (Lock & Harris, 2006).
Such considerations are frequently ignored in locational analyses.
Also ignored is the nature and extent of the “study region” itself. Most studies, particularly in GIS set-
tings, simply employ an arbitrary rectangle surrounding a point pattern of interest. Yet, in virtually all of
the foregoing, statistical conclusions are reached through comparisons against data from the background
region, so the nature of how that region is defined can profoundly affect results. A region that extends
beyond the point pattern might include higher elevations, lower slopes, poorer soils, and the like, that
will bias results when compared against the more restrictive space occupied by sites. These effects were
illustrated by Kvamme (1996) in a central Arizona study that analysed environmental variables measured
at 30 settlements. When the background was defined as the area within a minimum polygon encompass-
ing the settlement distribution, less significant and even insignificant differences were found compared
to an analysis that bounded the distribution with a larger and arbitrary rectangle containing large areas
devoid of settlements.
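The effect of the study-region definition can be illustrated with a toy raster (all values invented); the "restricted" background below is a crude stand-in for a minimum enclosing polygon:

```python
import numpy as np

rng = np.random.default_rng(3)

# A toy elevation raster: a low valley in the west rising to high
# ground in the east. All values invented.
cols = np.arange(100)
elevation = np.tile(cols * 10.0, (100, 1))   # rises west -> east

# Hypothetical settlements confined to the valley (western third).
site_cols = rng.integers(0, 33, 30)
site_rows = rng.integers(0, 100, 30)
site_mean = elevation[site_rows, site_cols].mean()

# Background 1: an arbitrary rectangle around everything, including
# large areas devoid of settlements (the eastern high ground).
bg_rectangle = elevation.mean()

# Background 2: restricted to the area actually spanned by the sites.
bg_restricted = elevation[:, :site_cols.max() + 1].mean()

diff_rect = abs(site_mean - bg_rectangle)
diff_restricted = abs(site_mean - bg_restricted)
print(diff_rect, diff_restricted)  # the rectangle exaggerates the contrast
```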
Method
In the following, a generic null hypothesis is one of “no difference” between sites and background or
between site types in the samples analysed. Depending on the statistical test, this hypothesis might vary
from no differences from expectation to no difference between means, medians, variances, or cumula-
tive distribution functions. Unless otherwise stated, two-tailed testing forms are presented for simplicity.
10% of the region, producing an expectation of 10 sites, a real preference for that zone is indicated because the 25 observed are far more than expected. The foregoing conclusions are subjective; statistical
evaluation is achieved by computing:
\chi^2_{Obs} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \qquad (12.1)
(where Oi are observed and Ei are expected frequencies per category), which follows a chi-square dis-
tribution with k -1 degrees of freedom (df; when, conventionally, all Ei ³ 5; Conover, 1999, p. 240). A
significant result (p << .05) indicates that observed frequencies in one or more categories deviate mark-
edly from expectation, showing preferences or avoidances for those categories.
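The goodness-of-fit computation of Eq. (12.1) is easily scripted. The chapter's analyses were run in R; the sketch below uses Python with numpy and scipy instead, and the site counts and zone fractions are invented for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical example: 100 sites counted across four landform zones.
# Expected frequencies follow each zone's share of the study region area.
observed = np.array([25, 40, 20, 15])
zone_area_fraction = np.array([0.10, 0.30, 0.35, 0.25])
expected = 100 * zone_area_fraction  # expectation under "no preference"

# Eq. (12.1): chi-square statistic with k - 1 degrees of freedom
chi2 = np.sum((observed - expected) ** 2 / expected)
df = len(observed) - 1
p = stats.chi2.sf(chi2, df)
print(chi2, df, p)
```

A small p-value flags zones whose observed counts deviate markedly from their areal expectation, as in the 25-observed-versus-10-expected zone discussed above.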
Differences between two sample means are assessed with the approximate (unequal-variances) t-statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \qquad (12.2)$$

where the $\bar{x}_i$ are the respective sample means (sites and background or two site types), the $s_i^2$ the sample variances, and the $n_i$ the sample sizes. This statistic is compared against a t-distribution with $\nu$ df, where

$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{s_1^4}{n_1^2(n_1 - 1)} + \dfrac{s_2^4}{n_2^2(n_2 - 1)}}$$
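This unequal-variances (Welch-type) test is implemented by scipy's `ttest_ind` with `equal_var=False`, and the ν formula can be checked directly. The slope samples below are invented (Python rather than the chapter's R):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical slope samples (percent grade): sites vs. background
sites = rng.normal(loc=8.0, scale=2.0, size=50)
background = rng.normal(loc=5.0, scale=4.0, size=50)

# Welch's t-test for unequal variances
t, p = stats.ttest_ind(sites, background, equal_var=False)

# The nu degrees of freedom, computed from the formula in the text
v1 = sites.var(ddof=1) / sites.size
v2 = background.var(ddof=1) / background.size
nu = (v1 + v2) ** 2 / (v1 ** 2 / (sites.size - 1) + v2 ** 2 / (background.size - 1))
print(t, p, nu)
```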
The nonparametric alternative for comparing two samples is the Mann-Whitney rank-sum test, with statistic

$$T = \sum_{i=1}^{n_k} R(x_i) \qquad (12.3)$$
where R(xi) is the rank of the ith case in group k. When n1 < 20 and n2 < 20 and there are no ties, exact probabilities associated with T may be obtained; otherwise, various approximations are employed
(Conover, 1999, p. 272).
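A Python sketch of Eq. (12.3) with invented samples: scipy's `mannwhitneyu` reports the equivalent U statistic, and the rank sum T itself can be recovered from the pooled ranks:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rock_piles = rng.normal(7.0, 1.5, size=50)   # hypothetical slope values
background = rng.normal(5.0, 3.0, size=50)

# Mann-Whitney test; U relates to the rank sum T by U = T - n1(n1 + 1)/2
u, p = stats.mannwhitneyu(rock_piles, background, alternative='two-sided')

# The rank sum T of Eq. (12.3), computed directly from the pooled ranks
pooled = np.concatenate([rock_piles, background])
T = stats.rankdata(pooled)[:rock_piles.size].sum()
print(u, p, T)
```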
The Kolmogorov-Smirnov test looks for any distributional differences between two samples based on
their empirical cumulative distribution functions. It is nonparametric and has the advantage of a graphical
solution, useful for illustrating differences (see following). The test statistic is simply:
$$T = \sup_x \left| S_1(x) - S_2(x) \right| \qquad (12.4)$$
Analysing regional environmental relationships 219
which indicates the greatest vertical distance (denoted by “sup” for supremum) between the Si(x), the
respective sample empirical distribution functions for variable, x. Under the null hypothesis of no distri-
butional differences, the distribution of T follows the two-sample Smirnov distribution (Conover, 1999,
p. 456).
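A two-sample Kolmogorov-Smirnov sketch in Python with invented samples; scipy returns the supremum distance of Eq. (12.4) together with its p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sites = rng.normal(7.0, 1.5, size=50)        # hypothetical site sample
background = rng.normal(5.0, 3.0, size=50)   # hypothetical background sample

# Eq. (12.4): greatest vertical distance between the two empirical CDFs
D, p = stats.ks_2samp(sites, background)
print(D, p)
```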
The parametric approach for evaluating differences between two sample variances computes an
F-ratio:
$$F_{Obs} = \frac{s_1^2}{s_2^2} \qquad (12.5)$$
which follows an F-distribution with n1 − 1 and n2 − 1 df (Hays, 1994, p. 362). Yet this test is highly sensitive to departures from normality, so a nonparametric test, such as Levene's (R Core Team, 2016; Levene, 1960), is preferred. In the two-sample case it computes:
$$W = \frac{(n_1 + n_2 - 2)\displaystyle\sum_{i=1}^{2} n_i\,(\bar{x}_i - \bar{x}_{Tot})^2}{\displaystyle\sum_{i=1}^{2}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2} \qquad (12.6)$$
where xTot is the grand mean over all cases in both groups. This statistic follows an F-distribution with
1 and n1 + n2 -2 df. Kvamme, Stark, and Longacre (1996) demonstrate the superiority of a version of this
test (based on medians instead of means) in an archaeological application.
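Both the F-ratio of Eq. (12.5) and Levene's test are available in scipy; `center='median'` selects the median-based variant of the kind favoured in the Kvamme, Stark, and Longacre (1996) comparison. The samples below are invented (Python rather than the chapter's R):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sites = rng.normal(6.0, 1.0, size=50)        # hypothetical low-variance sample
background = rng.normal(6.0, 3.0, size=50)   # hypothetical high-variance background

# Parametric F-ratio of Eq. (12.5)
F = sites.var(ddof=1) / background.var(ddof=1)

# Levene's test (Eq. (12.6)); center='median' gives the median-based version
W, p = stats.levene(sites, background, center='median')
print(F, W, p)
```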
$$T = \sup_x \left| F^{*}(x) - S(x) \right| \qquad (12.7)$$
which indicates the greatest vertical distance between sample and theoretical distribution functions.
Under a null hypothesis of no distributional differences, the distribution of T follows the one-sample
Kolmogorov distribution (Conover, 1999, p. 430).
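The one-sample form of Eq. (12.7) compares a sample against a fully specified theoretical CDF, F*(x); in scipy this is `kstest`. A Python sketch with an invented sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sample = rng.normal(0.0, 1.0, size=50)   # hypothetical standardized measurements

# Eq. (12.7): sample empirical CDF versus a theoretical N(0, 1) CDF
D, p = stats.kstest(sample, 'norm', args=(0.0, 1.0))
print(D, p)
```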
With an entire regional population encoded within GIS, randomization methods, or Monte Carlo significance tests, may be performed. A sample of archaeological locations can be regarded as a subset
(S′) of n observations from the population. GIS can generate k random samples of size n from the same
population (S1, S2, . . ., Sk). A summary statistic of interest, T, is computed for each sample, yielding k + 1
values (T ′, T1, T2, . . ., Tk). Assuming each sample is an equally probable outcome from the population, the probability of the observed test statistic, T′, or one more extreme, is the proportion of resampled statistics
with values equal to or more extreme than that value. To illustrate, if k = 999 and R(T ′) = 40 (i.e. the
fortieth smallest), then p = .04. This value should be doubled for two-tailed probabilities.
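The Monte Carlo procedure is easy to sketch with numpy (hypothetical data; the chapter's implementation was GIS-based). Here an array of cell values stands in for the encoded region, and the "site" sample is deliberately drawn from steeper cells so that an effect exists:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical regional population: a slope value for every raster cell
population = rng.gamma(shape=2.0, scale=3.0, size=10_000)

# Observed "site" sample of n = 50, drawn here from the steepest 20% of cells
steep_cells = np.sort(population)[-2_000:]
observed_sample = steep_cells[rng.integers(0, steep_cells.size, size=50)]
T_obs = observed_sample.mean()

# k random samples of the same size n from the full population
k = 999
T_rand = np.array([rng.choice(population, size=50, replace=False).mean()
                   for _ in range(k)])

# One-tailed Monte Carlo p: rank of T_obs among all k + 1 values
# (double this for a two-tailed probability, as in the text)
p_one_tailed = (1 + np.sum(T_rand >= T_obs)) / (k + 1)
print(T_obs, p_one_tailed)
```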
corresponding site densities. Significant trends suggest site intensity variations with the covariate. With
a null hypothesis of “no correlation”, statistical significance is evaluated with Pearson’s r, the product-
moment correlation coefficient, given by:
$$r = \frac{\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})^2}\;\sqrt{\displaystyle\sum_{i=1}^{n}(y_i - \bar{y})^2}} \qquad (12.8)$$

(where the xi are bin midpoints, the yi are bin densities, n is the number of bins, and −1 ≤ r ≤ 1). The ratio r √(n − 2) / √(1 − r²) follows a t-distribution with n − 2 df (Hays, 1994, p. 647). Owing to the bivariate
normality assumption of this test a safer course is to rely on the nonparametric Spearman’s rho, rs, which
assesses correlation based on data ranks (Conover, 1999, p. 314).
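A sketch of the bin-density correlation in Python (the midpoints and densities below are invented for illustration). scipy supplies both Pearson's r of Eq. (12.8) with its t-based test and the rank-based Spearman alternative:

```python
import numpy as np
from scipy import stats

# Hypothetical binned data: slope-bin midpoints vs. site density per bin
midpoints = np.array([2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5])
density = np.array([0.10, 0.42, 0.95, 1.40, 1.18, 0.80, 0.31, 0.12])

# Eq. (12.8): Pearson's r with its t-based significance test
r, p_r = stats.pearsonr(midpoints, density)

# Nonparametric alternative based on ranks (safer under non-normality)
rho, p_rho = stats.spearmanr(midpoints, density)
print(r, p_r, rho, p_rho)
```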
Multivariate approaches
Bevan et al. (2013) go a step further with the foregoing method by examining site intensity relationships
against a suite of environmental covariates which are then subjected to multiple linear regression (Hays,
1994, p. 687) as a means to examine (and model) multivariate relationships. This section, however, exam-
ines the more commonly employed multiple logistic regression model, a nonparametric method that compares
k-independent variables for differences between two classes, such as site-presence and site-absence, which
form a dichotomous dependent variable, Y (coded 1, 0, respectively). Computing the logarithm of the
odds ratio
$$\ln\!\left(\frac{P(Y = 1)}{1 - P(Y = 1)}\right)$$
is known as a logit transformation which produces a dependent variable that varies between plus and
minus infinity and becomes increasingly large as the odds ratio increases. The relationship between this
dependent variable and the independent variables then becomes:

$$\ln\!\left(\frac{P(Y = 1)}{1 - P(Y = 1)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k \qquad (12.9)$$

Each estimated coefficient may be evaluated through its Wald statistic, $W_j = \hat{\beta}_j / \widehat{SE}(\hat{\beta}_j)$, where the hat, “^”, means “estimate of ” and “SE” is the standard error of estimate. Under a null hypothesis
that these coefficients are zero (the variable has no effect), the Wj may be evaluated for significance against
the standard normal distribution to ascertain variables bearing noteworthy relationships with site presence
(Hosmer & Lemeshow, 2000, p. 37). Moreover, the coefficients lend themselves to interpretation. The signs,
when positive, indicate that high values of the associated variable are related to site presence, with the reverse
for negative coefficients. Coefficient sizes may also be compared when measurement scales are the same
(standardization may achieve this), with larger absolute coefficients giving greater influence.
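Logistic regression fits are available in many statistical packages; as a self-contained illustration of how coefficients, standard errors, and Wald statistics arise, the sketch below fits the model by iteratively reweighted least squares on invented site-presence data (Python; all parameter values are hypothetical):

```python
import numpy as np

def logistic_irls(X, y, n_iter=25):
    """Logistic regression via iteratively reweighted least squares.
    Returns coefficient estimates, standard errors, and Wald z values."""
    X = np.column_stack([np.ones(len(X)), X])    # prepend intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # Newton-Raphson step: beta += (X'WX)^-1 X'(y - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se    # Wald z = estimate / standard error

# Invented data: site presence (1) made more likely on steeper slopes
rng = np.random.default_rng(6)
slope = rng.uniform(0.0, 20.0, size=400)
true_p = 1.0 / (1.0 + np.exp(-(-3.0 + 0.3 * slope)))
y = rng.binomial(1, true_p)
beta, se, z = logistic_irls(slope[:, None], y)
print(beta, se, z)
```

With these invented data the recovered slope coefficient should be positive and its Wald z large, mirroring the interpretation of signs and magnitudes described above.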
PCA is a multivariate method that can be applied to an n × k matrix of k variables measured at n settlements. The method produces k uncorrelated principal components (PCs) that represent independent sets
of relationships between a settlement distribution and the measured variables. Each PC, 1 through k, is
associated with an eigenvalue (variance), λ, representing the portion of the total variance included in the
k variables, such that λ1 > λ2 > . . . > λk. Each PC is also associated with an eigenvector, k coefficients
that when multiplied against the original variables and summed form the linear composites that are the
PCs. The absolute sizes of these coefficients indicate the relative importance of each variable to each
component, forming a basis for PCA interpretation. In other words, the meaning of a PC is gained by
determining which variables are associated with the largest absolute coefficients. The data rearrangement
of PCA can generate unanticipated insights into data structures and reveal latent underlying dimensions
(Jolliffe, 2002). Unstandardized PCA is based on the original or raw measurements where different scales,
magnitudes, and variances can profoundly influence results. Most applications therefore employ standard-
ized PCA where each variable is standardized to a variance of unity and therefore offers an equal influence
in the analysis. While the highest PCs maximize variation, the lowest exhibit minimum variance. When
the latter are derived from settlement data they portray dimensions that isolate least variable contexts for
settlement locations.
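Standardized PCA with an eye on the lowest components can be sketched directly from the correlation matrix (Python/numpy; the data are invented, with one variable made nearly redundant so that a low-variance dimension exists):

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical environmental measurements at 200 settlements, 5 variables;
# the last variable nearly duplicates the first, creating a latent constraint
data = rng.normal(size=(200, 5))
data[:, 4] = data[:, 0] + 0.1 * rng.normal(size=200)

# Standardized PCA: eigendecomposition of the correlation matrix
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)     # eigenvalues in ascending order

# The first columns are the LOWEST, location-constraining components;
# the variance of the scores on a component equals its eigenvalue
lowest_scores = Z @ eigvecs[:, 0]
print(eigvals)
print(lowest_scores.var(ddof=1))
```

The eigenvalues sum to k (here 5), and the smallest one is far below unity, isolating the near-redundant dimension built into the data.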
Case studies
Many of the foregoing methods are illustrated through a case study from the Sonoran Desert of southern
Arizona that contains a prehistoric Hohokam agricultural field complex of the Classic Period (13th–14th
centuries CE). This field, originally mapped by Fish, Fish, Miksicek, and Madsen (1985), contains an
abundance of agricultural features including terraces, check dams, and the ubiquitous “rock pile.” Rock
piles are circular lens-shaped mounds of earth about 1.5 m in diameter and .75 m high covered with
fist-sized cobbles. They were employed for agave growing, an important plant for food and fibre. Local
soils have a high clay content that causes rainfall to run-off rather than penetrate. The mounds enhance
the plant-growing environment because their relatively porous surfaces permit absorption of run-off
and direct rainfall during the monsoon season (July–August). Moreover, the rocks act like a mulch by
preserving interior moisture. An early GIS-based analysis of these rock piles (Kvamme, 1992b) revealed
tendencies for higher elevations (perhaps to reduce water run-off volume), but on steep slopes (necessary
to capture run-off), with an avoidance of drainages (where too much run-off could be damaging) and
ridge-like concave-down surfaces (that reduce run-off), and an insignificant tendency for north-facing
slopes (reducing the effects of intense solar radiation). The same sample of n = 50 rock piles analysed by
that study, and for brevity only the slope data (as percent grade at 4 m spatial resolution), are examined
in the 400 × 400 m study region.
Figure 12.1 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope in 3 equal-
area categories, (b) slope data showing rock pile and background samples with boxplot and cumulative graphs,
(c) slope data showing rock pile and check dam samples with boxplot and cumulative graphs, (d) slope data
and rock piles with cumulative graphs, (e) sampling distribution of 9,999 sample means from the region with
indication of realized sample mean.
larger. While this test is robust against its normality assumption, the boxplot of the background distribution illustrates great skewness. For comparison, the nonparametric Mann-Whitney test (Eq. (12.3))
based on ranks yields p = .0003 (W = 1781). Finally, the Kolmogorov-Smirnov test (Eq. (12.4)), which
detects distributional differences of any kind between the empirical cumulative distribution functions
(right, Figure 12.1(b)), yields a maximum difference of D = 0.44, significant at p < .0001, with the rock
pile distribution skewed markedly toward higher slopes.
Because a priori theory suggests that human activities and occupations are placed in lower variance
settings compared to a region at large (Kvamme, 1985, 2006), a conventional one-tailed F-test for
comparing two variances (Eq. (12.5)) is explored, which yields a highly significant result (FObs = 2.159;
the theoretical F49,49 gives p = .004). Yet, because this test is highly sensitive to departures from normality, the
nonparametric Levene’s test (Eq. (12.6)) was also run, which yields a significant result (FObs = 6.5763; the
theoretical F1,98 gives p = .006), so evidence suggests rock pile placements occur in less variable settings
with respect to slope.
Table 12.1 Descriptive slope (percent grade) statistics for rock piles, check dams, and background samples.
Figure 12.2 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope in eight
categories with midpoints plotted against rock pile density, (b) rock piles and slopes within a circumscribed
study region with cumulative distribution plots, (c) elevation, slopes, and a logistic regression model for rock
piles based on these data.
“background” as lying within a minimum bounding polygon surrounding the points, a solution not
without problems (e.g. when the point distribution is highly irregular or non-contiguous). In any case,
for the present analysis the first nearest-neighbour distance was computed for each rock pile in the sample
and the mean nearest-neighbour distance was determined (21.75 m). This distance was then employed
as a buffer radius around each rock pile (via GIS), and a minimum bounding polygon was established
around these buffered areas to redefine a more relevant background region (Figure 12.2(b)). The empiri-
cal cumulative distribution function of the new background is substantially less different from the rock
pile distribution compared to the full background (Figure 12.2(b)), with a maximum difference in a one-
sample Kolmogorov-Smirnov test (12.7) of D = .23, compared to the previous D = .45 (Figure 12.1(d)).
The result, nevertheless, remains significant (p < .01).
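The mean first nearest-neighbour distance used as the buffer radius can be computed with a k-d tree (a Python/scipy sketch; the coordinates below are random stand-ins for the rock pile sample):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(9)
# Hypothetical rock pile coordinates (m) within a 400 x 400 m region
points = rng.uniform(0.0, 400.0, size=(50, 2))

# k=2 returns each point itself (distance 0) plus its first true neighbour
tree = cKDTree(points)
distances, _ = tree.query(points, k=2)
mean_nn = distances[:, 1].mean()   # buffer radius for redefining the background
print(mean_nn)
```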
Multivariate approaches
The slope data are again considered together with the three variables from the original analysis of the
Marana field complex (Kvamme, 1992b) to examine how they simultaneously relate to the rock pile loca-
tions. Globally, these variables, elevation, a ridge-drainage index, and aspect (measured on a north-south
scale), exhibit low inter-correlations (Table 12.2, left) indicating they are largely independent (the largest
correlation, between slope and aspect, is r = −.2311, indicating only 100r² = 5.3% of the variance in common). Utilizing a random sample of n = 500 background points (to more fully characterize the background
environment) and the n = 50 rock piles, these data were subjected to a logistic regression analysis (Eq. (12.9))
with coefficients and associated statistics given in Table 12.2 (right). The positive coefficients indicate that
rock pile placement is associated with high values of slope, elevation, and the ridge-drainage index (pointing
to ridge-like settings), although the last is insignificant (p = .28), and aspect is inconsequential (p = .77). The
data set was standardized to remove scale differences and the analysis was re-run to obtain beta coefficients
whose absolute values may be directly compared (Table 12.2). They indicate that slope is nearly twice as
influential as higher elevation settings in the rock pile placements. The coefficients may be directly applied
to the entire region via GIS map algebra to produce a mapping of the modelled relationships using:
$$p(\mathrm{Rock\ pile}) = \frac{1}{1 + e^{-\left(-7.554 \,+\, \ln\frac{500}{50} \,+\, .160\,\mathrm{Slope} \,+\, .129\,\mathrm{Elevation} \,+\, .004\,\mathrm{RidgeIndex} \,-\, .001\,\mathrm{Aspect}\right)}}$$
Table 12.2 Global correlation matrix (left) for the four variables measured in the agricultural field complex and
logistic regression parameter estimates (right) indicating multivariate relationships between rock pile presence and
the four variables.
Slope   Elevation   Ridge Index   |   Estimated Coefficient   Std. Error   z value   p(>|z|)   Beta Coeff
(where the log-ratio removes sample size imbalances). Clearly, the effects of slope are dominant (Fig-
ure 12.2(c)), but the contribution of elevation may also be discerned.
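Applying the fitted coefficients across raster layers is simple map algebra; a numpy sketch with invented layers follows (the coefficient values echo the equation above, but the raster values, including a hypothetical relative elevation in metres, are made up):

```python
import numpy as np

rng = np.random.default_rng(8)
# Invented 100 x 100 raster layers standing in for the GIS data
slope = rng.uniform(0.0, 30.0, size=(100, 100))       # percent grade
elevation = rng.uniform(0.0, 20.0, size=(100, 100))   # hypothetical relative elevation (m)
ridge_index = rng.uniform(-1.0, 1.0, size=(100, 100))
aspect = rng.uniform(-1.0, 1.0, size=(100, 100))

# Cell-by-cell map algebra; ln(500/50) adjusts for the unequal sample sizes
z = (-7.554 + np.log(500 / 50)
     + 0.160 * slope + 0.129 * elevation
     + 0.004 * ridge_index - 0.001 * aspect)
p_rock_pile = 1.0 / (1.0 + np.exp(-z))   # logistic probability surface
print(p_rock_pile.min(), p_rock_pile.max())
```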
A second case study utilizes the species distribution modelling technique pioneered in the biologi-
cal sciences based on the lowest PCs of a PCA analysis (Browning et al., 2005; Dunn & Duncan, 2000;
Rotenberry et al., 2006). An historic data set includes 589 farmsteads and roads in an 18 × 27 km area of
northwest Arkansas derived from surveyor-grade maps made in 1892 (Kvamme, in press, Figure 12.3(a)).
Figure 12.3 A northwest Arkansas historic data set from 1892: (a) the 18 × 27 km study region with 589
historic farmsteads and roads plotted over topography with towns outlined, (b) maps of the four principal
components of historic settlement with central values of legend indicating most preferred locations. A colour
version of this figure can be found in the plates section.
Table 12.3 Lowest four principal components derived from 10 environmental variables in historic Northwest
Arkansas with largest absolute coefficients of eigenvectors shown in boldface for interpretive purposes.
Eigenvectors
Eight GIS-generated variables of the natural environment were acquired from the study region (aspect on
a north-south scale, aspect on an east-west scale, elevation, slope, a soil quality index, soil neighbourhood
variance, stream distance, stream density), and two of the social environment (road distance and road den-
sity), all at 10 m spatial resolution. These data were extracted from the loci of the farmsteads and subjected
to a standardized PCA. The four lowest components are of interest because they are location-constraining,
illustrated by their low eigenvalues compared to their “population variances” when they are mapped to the
full region (Table 12.3). Collectively, they represent less than 18% of the total variance in the data. They also
each appear to isolate very different dimensions relevant to farmstead placement, pointing to the importance
of soils, terrain, hydrology, and the cultural network of roads as the most constraining dimensions pertinent
to farmstead placement as revealed by the farmsteads themselves. The farmsteads exhibit more constancy in
locational variation when measured on these minimum variance components, making them suitable variables for subsequent locational analyses and modelling. GIS map algebra methods were employed to apply
the eigenvectors associated with each component (Table 12.3) to the corresponding standardized data from
throughout the region permitting visualization of these principal dimensions of settlement (Figure 12.3(b))
and an improved understanding of preferred siting contexts.
Conclusion
The analysis of first-order characteristics of archaeological point patterns at the regional level generally
focuses on their relationships with characteristics of the natural environment, although fixed aspects of the
social environment, such as proximities to road networks or central places, have also been considered. A
228 Kenneth L. Kvamme
wide variety of methods have been utilized, parametric and nonparametric, for examining distributional,
central tendency, or variance characteristics of site samples relative to a region of interest. Moreover,
differences may also be examined between specific site types to examine locational variations between
them. Findings can give insights that help address the question first posed by SARG of why archaeological
sites are located in the places we find them (Plog & Hill, 1971). Positive and negative results increase the
knowledge base necessary for building explanatory models of location, and some of those relationships
can yield unanticipated insights. These methods are also important for screening relevant variables in
ALM settings as a start-point in the model-building process. At the same time, ALM models that combine
multiple first-order characteristics may be the best means for characterizing them. Bevan et al. (2013)
employ ALM to “remove the effects” of first-order locational mechanisms from a regional point-pattern
in an effort to better explore second-order characteristics of a settlement system.
The investigation of first-order environmental relationships has clear advantages over studies that
focus on second-order processes of social influences that structure patterns of settlement. For the former,
archaeological samples can be widely spread in non-contiguous survey areas throughout a region. Inferences
can then be drawn from the sample to the larger population of sites that hypothetically exist in the region
of study. This is not true in the study of second-order characteristics where contemporaneity between
settlements and sites needs to be established (making their interaction possible) and broad areas of full-
coverage survey must be considered because all components of the “system” should be exposed. The last
is necessary in order that interactions between the full network of sites, settlements, central places, nearest
neighbours, and the like, can be considered.
Finally, an important consideration in regional studies is the definition of “region” itself. Virtually
all analytical methods establish whether relationships exist between environmental features and archaeo-
logical distributions relative to the region investigated. The nature, size, and breadth of that region must
therefore be carefully considered because the nature of findings depends largely on the definition of
region. They should be defined either relative to the spread of the archaeological distribution in ques-
tion or according to some a priori construct of arguable relevance, such as a watershed, valley, or political
entity. Clearly, more work needs to be conducted in this critical domain.
Acknowledgements
All statistical calculations and associated graphs were generated using R software (The R Project for Statisti-
cal Computing, www.r-project.org/). The GIS operations and all maps were generated using TerrSet, by
Clark Labs at Clark University (https://clarklabs.org/terrset/). Excellent comments and improvements to
this chapter were suggested by the editors.
References
Attwell, M. R., & Fletcher, M. (1987). An analytical technique for investigating spatial relationships. Journal of
Archaeological Science, 14, 1–11.
Berry, K. J., Johnson, J. E., & Mielke, Jr., P. W. (2014). A chronicle of permutation statistical methods: 1920–2000, and
beyond. Cham, Switzerland: Springer.
Bevan, A., Crema, E., Li, X., & Palmisano, A. (2013). Intensities, interactions, and uncertainties: some new approaches
to archaeological distributions. In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological spaces
(pp. 27–52). Walnut Creek, CA: Left Coast Press.
Billman, B. R. (1997). Settlement pattern research in the Americas: Past, present, and future. In B. R. Billman &
G. M. Feinman (Eds.), Settlement pattern studies in the Americas: Fifty years since Virú (pp. 1–5). Washington, DC:
Smithsonian Institution Press.
Browning, D. M., Beaupré, S. J., & Duncan, L. (2005). Using partitioned Mahalanobis D2(k) to formulate a GIS-based
model of timber rattlesnake hibernacula. Journal of Wildlife Management, 69, 33–44.
Chang, K. C. (1968). Settlement archaeology. Palo Alto, CA: National Press Books.
Chapman, J. (2000). Settlement archaeology, theory. In L. Ellis (Ed.), Archaeological method and theory: An encyclopedia
(pp. 551–555). New York: Garland Publishing.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York: John Wiley.
Dunn, J. E., & Duncan, L. (2000). Partitioning Mahalanobis D2 to sharpen GIS classification. In C. A. Brebbia &
P. Pascolo (Eds.), Management information systems 2000: GIS and remote sensing (pp. 195–204). Boston: WIT Press.
Fisher, P., Farrelly, C., Maddocks, A., & Ruggles, C. (1997). Spatial analysis of visible areas from the Bronze Age
cairns of Mull. Journal of Archaeological Science, 24, 581–592.
Fish, S. K., Fish, P. R., Miksicek, C., & Madsen, J. (1985). Prehistoric agave cultivation in Southern Arizona. Desert
Plants, 7, 107–112.
Gaffney, V., & Stančič, Z. (1996). GIS approaches to regional analysis: A case study of the island of Hvar. Ljubljana: Uni-
versity of Ljubljana.
Harris, T. M., & Lock, G. R. (1990). The diffusion of a new technology: A perspective on the adoption of a geo-
graphic information systems within UK archaeology. In K. M. S. Allen, S. W. Green, & E. B. W. Zubrow (Eds.),
Interpreting space: GIS and archaeology (pp. 33–53). London: Taylor & Francis.
Hays, W. L. (1994). Statistics (5th ed.). Fort Worth, TX: Harcourt Brace.
Hodder, I. R., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge: Cambridge University Press.
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression (2nd ed.). New York: John Wiley.
Hunt, E. D. (1992). Upgrading site-catchment analyses with the use of GIS: Investigating the settlement patterns of
horticulturalists. World Archaeology, 24, 283–309.
Jolliffe, I. T. (2002). Principal components analysis (2nd ed.). New York: Springer-Verlag.
Kamermans, H. (2000). Land evaluation as predictive modelling: A deductive approach. In G. Lock (Ed.), Beyond
the map: Archaeology and spatial technologies (pp. 125–146). Amsterdam: IOS Press.
Kantner, J. (1996). Settlement pattern analysis. In B. M. Fagan (Ed.), The Oxford companion to archaeology (pp. 636–
638). Oxford: Oxford University Press.
Kellogg, D. C. (1987). Statistical relevance and site locational data. American Antiquity, 52, 143–150.
Kvamme, K. L. (1985). Determining empirical relationships between the natural environment and prehistoric site
locations: A hunter-gatherer example. In C. Carr (Ed.), For concordance in archaeological analysis: Bridging data struc-
ture, quantitative technique, and theory (pp. 208–238). Kansas City: Westport Publishers.
Kvamme, K. L. (1990). One-sample tests in regional archaeological analysis: New possibilities through computer
technology. American Antiquity, 55, 367–381.
Kvamme, K. L. (1992a). Geographic information systems and archaeology. In G. Lock & J. Moffett (Eds.), Com-
puter applications and quantitative methods in archaeology 1991 (pp. 77–84). BAR International Series S577. Oxford:
Tempus Reparatum.
Kvamme, K. L. (1992b). Terrain form analysis of archaeological location through geographic information systems.
In G. Lock & J. Moffett (Eds.), Computer applications and quantitative methods in archaeology 1991 (pp. 127–136).
BAR International Series S577. Oxford: Tempus Reparatum.
Kvamme, K. L. (1996). Randomization methods for statistical inference in raster GIS contexts. In A. Bietti, A.
Cazzella, I. Johnson, & A. Voorrips (Eds.), The colloquia of the XIII international congress of prehistoric and protohistoric
sciences, vol. 1: Theoretical and methodological problems (pp. 107–114). Forli, Italy: ABACO.
Kvamme, K. L. (2006). There and back again: Revisiting archaeological locational modeling. In M. W. Mehrer & K.
L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 3–38). Boca Raton, FL: CRC Press.
Kvamme, K. L. (in press). Defining and modeling the dimensions of settlement choice: An empirical approach. In
E. Robinson, S. Harris, & B. F. Codding (Eds.), Cultural landscapes and long-term human ecology. Berlin: Springer.
Kvamme, K. L., Stark, M. T., & Longacre, W. A. (1996). Alternative procedures for assessing standardization in ceramic
assemblages. American Antiquity, 61, 116–126.
Levene, H. (1960). Robust tests for equality of variances. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H.
B. Mann (Eds.), Contributions to probability and statistics: Essays in honor of Harold Hotelling (pp. 278–292). Stanford:
Stanford University Press.
Lock, G., & Harris, T. (2006). Enhancing predictive archaeological modeling: Integrating location, landscape, and
culture. In M. W. Mehrer & K. L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 41–62). Boca
Raton, FL: CRC Press.
Maschner, H. D. G. (1996). The politics of settlement choice on the Northwest Coast: Cognition, GIS, and coastal
landscapes. In M. Aldenderfer & H. D. G. Maschner (Eds.), Anthropology, space, and geographic information systems
(pp. 175–189). Oxford: Oxford University Press.
Mehrer, M. W., & Wescott, K. L. (Eds.). (2006). GIS and archaeological site location modeling. Boca Raton, FL: CRC Press.
Mink II, P. B., Stokes, B. J., & Pollack, D. (2006). Points vs. polygons: A test case using statewide geographic infor-
mation. In M. W. Mehrer & K. L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 219–239). Boca
Raton, FL: CRC Press.
Murphy, R. F. (1977). Introduction: The anthropological theories of Julian H. Steward. In J. C. Steward & R. F.
Murphy (Eds.), Evolution and ecology: Essays on social transformation by Julian H. Steward (pp. 1–39). Urbana, IL:
University of Illinois Press.
O’Sullivan, D., & Unwin, D. (2003). Geographic information analysis. New York: John Wiley.
Parsons, J. R. (1972). Archaeological settlement patterns. Annual Review of Anthropology, 1, 127–150.
Pearson, C. E. (1978). Analysis of Late Mississippian settlements on Ossabaw Island, Georgia. In B. D. Smith (Ed.),
Mississippian settlement patterns (pp. 53–80). New York: Academic Press.
Peebles, C. S. (1978). Determinants of settlement size and location in the Moundville phase. In B. D. Smith (Ed.),
Mississippian settlement patterns (pp. 369–416). New York: Academic Press.
Plog, F. (1968). Archaeological surveys: A new perspective (Unpublished master’s thesis). Department of Anthropology,
University of Chicago, Chicago.
Plog, F., & Hill, J. N. (1971). Explaining variability in the distributions of sites. In G. J. Gumerman (Ed.), The distri-
bution of prehistoric population aggregates (pp. 7–36). Prescott, AZ: Prescott College Press.
R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing,
Vienna, Austria. Retrieved from www.R-project.org/
Rotenberry, J. T., Preston, K. L., & Knick, S. T. (2006). GIS-based niche modeling for mapping species’ habitat.
Ecology, 87, 1458–1464.
Shennan, S. (1988). Quantifying archaeology. Edinburgh: Edinburgh University Press.
Shermer, S. J., & Tiffany, J. A. (1985). Environmental variables as factors in site location: An example from the Upper
Midwest. Midcontinental Journal of Archaeology, 10, 215–240.
Steward, J. H. (1937). Ecological aspects of Southwestern society. Anthropos, 32, 87–104.
Trigger, B. G. (1967). Settlement archaeology: Its goals and promise. American Antiquity, 32, 149–160.
Trigger, B. G. (1968). The determinants of settlement patterns. In K. C. Chang (Ed.), Settlement archaeology
(pp. 53–78). Palo Alto, CA: National Press.
Ullah, I. I. T. (2011). A GIS method for assessing the zone of human-environmental impact around archaeological
sites: a test case from the Late Neolithic of Wadi Ziqlâb, Jordan. Journal of Archaeological Science, 38, 623–632.
Verhagen, P., & Whitley, T. G. (2012). Integrating archaeological theory and predictive modeling: A live report from
the scene. Journal of Archaeological Method and Theory, 19, 49–100.
Vita-Finzi, C., & Higgs, E. S. (1970). Prehistoric economy in the Mt. Carmel area of Palestine: Site catchment analy-
sis. Proceedings of the Prehistoric Society, 36, 1–37.
Warren, R. E., & Asch, D. L. (2000). A predictive model of archaeological site location in the Eastern Prairie Pen-
insula. In K. L. Wescott, & R. J. Brandon (Eds.), Practical applications of GIS for archaeologists: A predictive modeling
toolkit (pp. 5–32). London: Taylor and Francis.
Wheatley, D. (1995). Cumulative viewshed analysis: A GIS-based method for investigating intervisibility, and its
archaeological application. In G. Lock and Z. Stančič (Eds.), Archaeology and geographical information systems
(pp. 171–186). London: Taylor and Francis.
Willey, G. R. (1953). Prehistoric settlement patterns in the Virú Valley, Peru. Bureau of American Ethnology Bulletin 155.
Washington, DC: Smithsonian Institution Press.
Winters, H. D. (1969). The Riverton culture: A second millennium occupation in the Central Wabash Valley. Monographs 1.
Springfield, IL: Illinois State Museum.
13
Predictive spatial modelling
Philip Verhagen and Thomas G. Whitley
Introduction
Archaeological predictive modelling can be defined as a set of techniques employed to predict “the
location of archaeological sites or materials in a region, based either on a sample of that region or on
fundamental notions concerning human behavior” (Kohler & Parker, 1986, p. 400). The basic premise
of predictive modelling is that human spatial behaviour is to a large extent predictable, which implies
that the locations where people lived and performed their daily activities can be identified on the basis
of statistical and/or explanatory models.
The roots of predictive modelling can be traced back to the days of New Archaeology, and in particu-
lar the development of Site Catchment Analysis in the 1970s (Vita-Finzi & Higgs, 1970). Archaeologists
became aware that human settlement is intimately linked to its environmental setting, which also implied
that it should be possible to predict the locations suitable for human settlement. Around the same time,
Cultural Resource Management (CRM) was developing in North America through legislation aimed at
the protection of cultural heritage. In the absence of sufficient data to identify all archaeological sites in a
region, predictive modelling answered the need for a more comprehensive mapping of cultural resources
and ways in which to avoid impacts to them by development. Although mainframe-based Geographic
Information Systems (GIS) had already been applied to predictive modelling in a very limited fashion,
the arrival of desktop-based GIS in the late 1980s paved the way for further proliferation of predictive
modelling, particularly in CRM contexts. Publication of seminal works on the theory and methods of
predictive modelling began initially in the USA (Kohler & Parker, 1986; Judge & Sebastian, 1988) and
later also in Europe (Van Leusen & Kamermans, 2005).
Predictive modelling is used in archaeology for two purposes. First, it is a planning aid for CRM in
order to assess the risks of disturbing archaeological remains during development projects. Here predictive
models serve to inform and influence the decision-making processes of planners and to convince them
that developments should take place in the least sensitive areas. They also guide the archaeological inves-
tigations once developments have started, by making reasoned choices on where to concentrate research
efforts within the constraints of available time and money. The cost-effectiveness of the approach has
been proven in CRM, and it has provided a basic degree of protection to zones of high archaeological
potential in those regions where it is included in CRM policies. Second, predictive modelling is applied
as a tool to develop and test scientific models of human locational behaviour. In academic contexts,
therefore, predictive models can be considered as heuristic devices (Verhagen & Whitley, 2012) that can
play an important role in formalizing and quantifying theoretical notions on the development of settle-
ment patterns and land use.
Despite its widespread application, predictive modelling has also encountered substantial criticism
from archaeologists. This is because it will never be able to accurately predict the locations or presence
of all archaeological remains. The models are only as good as the data and theories that have been used
to create them, since we can only extrapolate from the existing state of archaeological knowledge. In that
respect, there is no real difference between theory-driven or data-driven approaches; they both reflect
our existing understandings, or biases, about the human past. For many archaeologists the risk of ‘wrong
predictions’ and their undesired consequences for the protection and investigation of the archaeological
record in CRM is unacceptable. From a scientific point of view, however, discrepancies between predic-
tion and data can be the starting point to develop new theories about site location and/or to direct future
data collection practices, which could also be a guiding principle for decision making in CRM.
Method
Predictive models can be made following two different strategies, usually named ‘inductive’ and ‘deduc-
tive’ (Kamermans & Wansleeben, 1999), but more accurately described as ‘data-driven’ and ‘theory-
driven’ (Wheatley & Gillings, 2002). In data-driven modelling, the locations of known archaeological
sites in a study region are compared to a number of parameters that are considered to be important to
settlement choice (such as slope gradient, soil type, or distance to water), using various statistical assess-
ments within a GIS. The results of this quantitative site location analysis can then be extrapolated to
areas where no archaeological data are yet available, and will thus result in a prediction of site densities
or probabilities for the area to be studied. A popular technique for data-driven predictive modelling
is logistic regression (cf. Warren, 1990; Hudak et al., 2002; Conolly & Lake, 2006, Chapter 8.8). This
is a technique for fitting a prediction curve to a set of observations that is especially suited for vari-
ables that are measured at different scales (nominal, ordinal, interval and/or ratio). Also, fitting to a
logistic rather than to a linear curve has advantages for increasing the statistical contrast between site
and non-site locations (Warren & Asch, 2000), and as such it has been the preferred tool for predictive
modelling for many years. However, many other options are becoming available, including ecological
niche modelling (cf. Kondo, 2015; Banks, 2017), Monte Carlo simulations (cf. Kvamme, 1997; Vana-
cker et al., 2001) and Bayesian statistics (cf. Finke, Meylemans, & Van de Wauw, 2008; Van Leusen,
Millard, & Ducke, 2009).
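As an illustration of the data-driven approach, the sketch below fits a logistic curve to synthetic site/non-site observations by gradient ascent. The two predictors (slope gradient and distance to water) and all values are invented for demonstration, not drawn from any real dataset; a minimal numpy routine stands in for a statistics package:

```python
import numpy as np

# Synthetic illustration only: 'sites' and 'non-sites' described by two
# hypothetical predictors (slope gradient in degrees, distance to water in m).
rng = np.random.default_rng(0)
sites = np.column_stack([rng.normal(3, 1.5, 100),      # sites favour gentle slopes
                         rng.normal(300, 150, 100)])   # ... and proximity to water
non_sites = np.column_stack([rng.normal(8, 3, 100),
                             rng.normal(1200, 400, 100)])
X = np.vstack([sites, non_sites])
y = np.concatenate([np.ones(100), np.zeros(100)])      # 1 = site present

# Standardise the predictors and fit a logistic curve by gradient ascent
# on the log-likelihood.
X = (X - X.mean(axis=0)) / X.std(axis=0)
X = np.column_stack([np.ones(len(X)), X])              # intercept column
beta = np.zeros(3)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ beta))                    # predicted P(site)
    beta += 0.1 * X.T @ (y - p) / len(y)

p = 1 / (1 + np.exp(-X @ beta))
print("mean P(site) at sites:    ", round(float(p[y == 1].mean()), 2))
print("mean P(site) at non-sites:", round(float(p[y == 0].mean()), 2))
```

Applied to full raster layers of the same predictors, the fitted coefficients would yield a continuous probability surface for the whole study region, which can then be classified into probability zones.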
Despite the current state of sophistication of statistical modelling, it is still very difficult to determine
which statistical technique performs best since there are hardly any case studies available where methods are
compared. Additionally, regression and similar statistical analyses require large datasets of previously known
archaeological sites to produce significant results. Therefore, only well-represented site types or behaviours
can be predicted using such techniques, and prior biases in those datasets dramatically affect the outcomes.
The theory-driven approach bypasses most of the statistical complexity by defining theoretical
assumptions about the parameters influencing human spatial behaviour. For example, it can be assumed
that early farming communities preferentially located their settlements in environments well-suited for
agriculture and animal husbandry, which in turn can be related to parameters such as soil quality and
texture, nitrogen content, and moisture potential or drainage. Those characteristics may be embodied
in, and extracted from, soil type classifications which exist within a GIS dataset. Weights are then given
to the parameters based on the nature of the assumptions about the people who may have been living
there, as well as different site functions or behaviours. The weights and variables are then combined into
predictive formulas which are compared to the known archaeological record in order to judge their per-
formance. This approach has the advantage of including more sophisticated theoretical frameworks that
are based on causal explanation; it can include human agency as a factor, and it is not directly dependent
on archaeological datasets (Verhagen & Whitley, 2012). As a result, there are few restrictions on the types
of sites or behaviours which might be predicted. Deciding on a best possible model however still implies
comparing various parameter weights to the actual archaeological data.
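The weighted-additive logic of the theory-driven approach can be sketched in a few lines. The rasters, weights, and class boundaries below are hypothetical stand-ins for expert judgements about early farming settlement:

```python
import numpy as np

# Hypothetical reclassified input rasters for a tiny study area; values run
# from 0 (unfavourable) to 1 (most favourable for early farming settlement).
soil_quality = np.array([[0.9, 0.7, 0.2],
                         [0.8, 0.5, 0.1],
                         [0.6, 0.3, 0.1]])
drainage = np.array([[0.8, 0.8, 0.4],
                     [0.9, 0.6, 0.3],
                     [0.7, 0.5, 0.2]])

# The weights encode a theoretical assumption (invented here) that soil
# quality matters twice as much as drainage for this type of settlement.
weights = {"soil": 2.0, "drainage": 1.0}
potential = (weights["soil"] * soil_quality
             + weights["drainage"] * drainage) / sum(weights.values())

# Classify into low (0), medium (1) and high (2) archaeological potential;
# the class boundaries are likewise assumptions to be tested against data.
classes = np.digitize(potential, bins=[0.4, 0.7])
print(potential.round(2))
print(classes)
```

Judging such a model then means overlaying the classified surface with the known archaeological record and comparing the performance of alternative weight sets.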
Whatever the strategy followed, a number of recurring limitations apply:
1 the archaeological input data are usually not representative of the full archaeological record, and the
archaeological record itself is a biased reflection of human activity in the past; therefore, we cannot
expect that models based on existing archaeological datasets will accurately predict all locations of
past human settlement;
2 the predictor variables used are often based on modern-day environmental datasets, which may not be
at the right level of detail and accuracy for predictive modelling purposes, and may not accurately
reflect the situation in the past;
3 socio-cultural variables are usually not included;
4 the temporal resolution of predictive models is limited, since this is determined by the archaeological
and environmental input datasets used; and
5 testing of predictive models is often done in a haphazard way and in most cases does not involve
a representative field survey; the distinction made between areas of low and high probability has
instead often led to a policy of not surveying the low-probability zones, and in this way, a self-
fulfilling prophecy will be created (Wheatley, 2004).
Predictive modelling has often been criticized for restricting itself to a limited set of ‘environmental’ vari-
ables. This can partly be attributed to the fact that there are relatively few relevant datasets available that
cover large areas. The inputs most often used are derivatives of Digital Elevation Models (in particular
slope and aspect), and, to a lesser extent, topographical, geological and pedological maps. This approach
has been successfully applied in many cases, but the datasets are sometimes used in a very uncritical way.
Socio-cultural variables (such as distance to roads, or to specific archaeological sites), on the other hand,
are much more difficult to implement in predictive models because of the scarcity of relevant data and
lack of quantifiable theoretical models, although there is no inherent barrier to including them (see e.g.
Whitley, Moore, Goel, & Jackson, 2010).
Ideally, the desired degree of accuracy of a predictive model should determine what is needed in terms
of data and knowledge. In practice, however, predictive models are often made on the basis of datasets
that happen to be available, and for this reason they can vary considerably in their accuracy. It is therefore
extremely important that the models are tested, both internally in order to establish the uncertainties
in the parameters and archaeological data used by means of sensitivity analysis, and externally by add-
ing independent, representative archaeological data. Establishing the representativeness of archaeological
datasets however is often very difficult, since in many cases there is insufficient information about the
intensity and methods of survey applied and about the influence of potential biases, such as the visibility
of archaeological remains on the surface. These factors highly influence not just the number of archaeo-
logical sites found but also the types of sites that can be discovered successfully (Verhagen, 2008).
Creating and testing archaeological predictive models is therefore a complex exercise, at the end of
which the results need to be translated into terms that can be easily implemented in CRM and other
planning contexts. Formal statistical assessments do not necessarily play an important role in this. Instead,
a number of explicit and implicit assumptions about the importance of specific archaeological remains,
their state of preservation, and the costs of excavating them are used as well to assess the archaeological and
financial risks of development plans. Thus, creating predictive models is only one stage in the decision-
making process surrounding archaeology in spatial planning, and as such their role in CRM should not
be overemphasized. However, since they are used at the beginning of the planning process and will direct
decision making in subsequent stages, their accuracy should be a major concern to archaeologists, devel-
opers and planners alike.
Case studies
The Mn/Model
A classic example of large-scale agency-supported predictive modelling is known as the ‘Minnesota
Model’ (or Mn/Model in abbreviation – Hudak et al., 2002). The Mn/Model was the first archaeological
predictive model to be applied to an entire US state, and was originally initiated in 1995 by the Minnesota
Department of Transportation with the financial support of the US Federal Highway Administration. It
was also the first widely applied, data-driven model to consider survey bias and depth of deposits in its
application.
The main objective of the Mn/Model was to provide transportation planners with a GIS-based tool
that would help them identify areas likely to contain archaeological sites, so that they could be avoided.
The idea was that these sites could be identified early on in the planning process, thereby saving time and
expense later on when transportation projects were underway. The primary methodological assumptions
of the model were:
1 That only pre-1837 sites could be predicted using the methods employed. The year 1837 marks the
earliest permanent historic-era settlement within Minnesota, and it was largely assumed that historic
Euro-American settlement followed more complex patterns not defined by environmental variables.
2 That separate models were necessary for 24 different ‘environmental regions’ within Minnesota.
Each region was defined based on topographic distinctions, ecological communities, or geomorpho-
logical origins. They each also had a unique set of pre-existing archaeological sites from which the
correlative analyses were drawn.
3 That paleo-landscape and geomorphological modelling would additionally add the ‘third and
fourth’ dimensions of depth and time to the analysis.
4 That issues with sample size and pre-existing survey bias could be overcome by using statistical
techniques. These techniques would allow the generation of appropriate datasets from which cor-
relations could be derived.
The Mn/Model is a set of 24 multiple logistic regression models (one for each environmental subregion)
that each identify correlations between a dependent variable (e.g. site presence/absence) and a wide range
of independent variables (the environmental predictors). In this case, predictor variables were derived
from elevation, watersheds, hydrology, soils, geology, vegetation maps, anthropogenic disturbances (usually
modern), paleoclimate models of temperature, precipitation, geomorphological events, and palynology,
as well as some historical cultural features in limited contexts (Hudak et al., 2002). Known site locations
were evaluated against ‘non-sites’ using these variables in a logistic regression analysis for each
environmental region over three successive phases, each modified in response to data quality or other
issues encountered during the process.
The resulting regional models met or exceeded the performance expected by the modellers, with
an average gain statistic (cf. Kvamme, 1988) of about 0.71 for all regions combined during Phase 3,
which had improved from 0.37 in Phase 1 and 0.68 in Phase 2. Individual gain statistics, though,
varied widely across regions, with some as low as 0.40 and others as high as 0.89, depending on
the number of sites being modelled and the ability of the model to reduce the size of the high/medium
probability areas.
Gain is calculated as follows (Kvamme, 1988):
G = 1 − p_a / p_s
where
p_a = the area proportion of the zone of interest (usually the zone of high probability); and
p_s = the proportion of sites found in the zone of interest.
If the area likely to contain sites in a region is small (the model is very precise), and the sites found in
that area represent a large proportion of the total (the model is very accurate), then we will have a model
with a high gain.
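The calculation is simple enough to express directly; the proportions below are invented for illustration:

```python
def kvamme_gain(area_proportion, site_proportion):
    """Kvamme's gain statistic: G = 1 - p_a / p_s."""
    if site_proportion == 0:
        raise ValueError("no sites in the zone of interest: gain is undefined")
    return 1 - area_proportion / site_proportion

# An invented example: 85% of known sites fall within a high-probability
# zone covering 20% of the study area.
print(round(kvamme_gain(0.20, 0.85), 2))  # 0.76
```

A gain near 1 thus indicates a model that is both precise (small high-probability zone) and accurate (most sites inside it), while a gain near 0 indicates no improvement over chance.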
The authors note that the positive results in some cases can be very misleading due to biased survey,
few known sites in limited environments, and very few actual surveys having been conducted. Its total
cost over a period of seven years was $4.5 million. Nevertheless, the Mn/Model has been held up as a
successful application of archaeological predictive modelling since it is estimated to have saved the State
of Minnesota about $3 million per year, for the first four years of its implementation. Several other states
have followed with their own statewide data-driven archaeological predictive models including North
Carolina and Washington.
LAMAP
A second case study (Carleton, Conolly, & Iannone, 2012) starts from the premise that known archaeological
sites reflect deliberate locational choices; it can therefore be assumed that their characteristics can be used
for predictive modelling purposes by finding the locations that are most similar to them.
The authors developed a new methodology for this purpose, the Locally Adaptive Model of Archaeo-
logical Potential (LAMAP), that employs not just the information from a site’s location itself, but from
a predefined (circular) neighbourhood around the site. The characteristics of these known site location
surroundings are then compared to the whole study region, resulting in a measure of similarity of each
location in the region to the characteristics of the known sites. Such an approach is not completely new:
similar GIS-based analyses had already been undertaken extensively in the south of France since the 1990s
(see Favory, Nuninger, & Sanders, 2012). However, these were not aimed at predictive modelling but only
at analysing site location preferences.
The LAMAP method is implemented by first calculating the frequency of each particular value of
a landscape characteristic, like elevation, within a neighbourhood around a site. Optionally, the model
accommodates distance weighting, so that locations closer to the site will be considered as more impor-
tant than those further away. Then, it is established how probable it is that the observed set of values that
occurs jointly within the site’s neighbourhood is also found elsewhere.
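A drastically simplified, single-variable sketch of this neighbourhood-frequency idea (ignoring LAMAP's optional distance weighting and its per-site treatment; the raster and site locations are synthetic):

```python
import numpy as np

# Synthetic illustration: a hypothetical elevation raster and three
# invented site locations, given as (row, column) cells.
rng = np.random.default_rng(1)
elevation = rng.normal(100, 20, size=(50, 50))
sites = [(10, 12), (30, 8), (25, 40)]
radius = 3  # neighbourhood radius in cells

# 1. Pool the elevation values observed around the known sites.
rows, cols = np.indices(elevation.shape)
near_sites = np.zeros(elevation.shape, dtype=bool)
for r, c in sites:
    near_sites |= (rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2
observed = elevation[near_sites]

# 2. Turn the pooled values into an empirical frequency distribution.
counts, edges = np.histogram(observed, bins=10)
probs = counts / counts.sum()

# 3. Score every cell by how common its elevation is around known sites;
# cells outside the observed range score zero.
bin_idx = np.digitize(elevation, edges) - 1
in_range = (bin_idx >= 0) & (bin_idx < len(probs))
potential = np.where(in_range, probs[np.clip(bin_idx, 0, len(probs) - 1)], 0.0)
print(potential.shape, round(float(potential.max()), 3))
```

The resulting surface expresses each location's similarity to the surroundings of known sites, using only site presence data.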
Carleton et al. (2012) first prepared a test model for a set of Maya sites in Belize, using a conventional
set of environmental parameters and a distance radius of 1 km. The model’s performance, measured with
Kvamme’s gain statistic, was considered to be good, although the testing was only done by holding back
a portion of the site sample (split sampling; see Kvamme, 1988, for more details). Later, they also tested the
model using new field survey data and sites newly identified on Light Detection and Ranging (LiDAR)
images (Carleton et al., 2017). This resulted in a very close correlation between prediction and site occur-
rence. The high success rate coupled to the relatively simple implementation suggests that this method
is a good solution for data-driven predictive modelling on the basis of site presence-only data without
having to resort to using pseudo-absence data.
Coastal Georgia
A third case study is Whitley's theory-driven model of coastal Georgia, which set out to estimate the
inhabitants' energy consumption and expenditure given assumptions about their diet known from the archaeological
record. Through this it would be possible to model a variety of outcomes including the nature of catch-
ments from known sites, dietary preferences and sustainable populations, the evolution from land-based
to maritime diets in the region, seasonal resource stability, and even complex concepts such as resource
competition and social dominance. Additionally, the approach could be used to generate an archaeologi-
cal predictive model for sites, based on their likely function, in non-surveyed areas.
The overall model is designed as an intersection of a series of environmental and habitat models devel-
oped in the GIS. These are weighted-additive predictive models, exactly like the kind used for predicting
archaeological site locations in other situations described above. But, these are intended to predict the
habitat suitability for one specific resource at one specific time of year based on regional biological studies
of that particular organism and existing local, state, and regional habitat models. Fifteen different envi-
ronmental variables are used to develop geospatial models for 37 different forage categories (i.e. faunal
or floral species or groupings), for each month of the year. These models represent suitability as a range
of values from 0 (not suitable) to 1 (highest suitability) (Figures 13.1 and 13.2).
The habitat models are then converted to calorific surfaces based on estimates of species population
size, density, reproductive rates, mortality, and resilience, during each month. A calorific surface is a GIS
layer, which shows a prediction for the number of calories (kCal) one might acquire from any one pixel
(or map unit) from each resource at any given time of year. Instead of a decimal value between 0 and 1,
as in the habitat models, the calorific surfaces represent numbers of calories. By adding them all together,
one gets a total number of predicted calories at every GIS pixel in the study area. The resulting ‘available’
calorific surfaces are then modified into models of ‘returned’ calories by subtracting the calorific ‘costs’
of acquiring the ‘available’ resources (Figures 13.3 and 13.4).
The cost formulas are based on energy expenditures for individuals and families calculated by Thomas
(2008), for each of the resources used in the study. But, they are also tempered by known archaeological
faunal/floral assemblages and the periodic introduction of different technologies, such as the bow-and-arrow,
crop staples, and grain storage. Subtracting calories based on rates of energy
loss through decay and trade, as well as dietary preferences (e.g. personal tastes) leads to the outcome of a
‘selected’ energy model. In short, the available calories are the ones predicted by the habitat models. The
returned calories are the predicted calories but with the costs of accessing and processing them subtracted.
The selected calories are the ones remaining after some have been lost over time from decay, traded away
to someone else, or left uncollected for some other reason.
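The chain from habitat suitability to 'available', 'returned' and 'selected' calories can be sketched as follows; the yields, costs, and loss fraction are placeholder assumptions, not values from the study:

```python
import numpy as np

rng = np.random.default_rng(2)
shape = (40, 40)  # hypothetical study-area grid

# Habitat suitability surfaces (0-1) for two illustrative resources in one
# month; the full model covers 37 forage categories for each month.
deer_suitability = rng.random(shape)
shellfish_suitability = rng.random(shape)

# 'Available' calories: suitability scaled by assumed per-cell yields (kCal).
available = deer_suitability * 50_000 + shellfish_suitability * 20_000

# 'Returned' calories: subtract an assumed acquisition/processing cost
# surface, floored at zero (a cell cannot return negative calories).
cost = 5_000 + 10_000 * rng.random(shape)
returned = np.clip(available - cost, 0, None)

# 'Selected' calories: apply an assumed fraction lost to decay, trade
# and dietary preference.
selected = returned * 0.6
print(round(float(selected.mean()), 1))
```

Summing such surfaces across resources and months yields the total calorific landscape from which catchments, seasonality, and site potential can be read off.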
Ultimately, the objectives of the model were meant to be explanatory, in that they helped answer
questions about periods of site occupation, seasonality, diet sufficiency, patterns of exploitation, regional
trade, and competition. To do so meant applying to the analysis the locations of some 7000 known
archaeological sites, of which only 103 were well-dated domestic sites. Although no predictive model was
actually intended from the outcome, Whitley used a sample area with a total of 308 known archaeological
sites, did a simple unweighted combination of all calorific variables, and split them into three equal parts
creating areas of low, moderate, and high total calorific value. These were then overlaid with the known
archaeological sites and compared to see how many occurred in high probability areas.
This simple analysis showed that even without formal constructs separating out sites by period, func-
tion, or seasonality, or a more sophisticated evaluation of the boundaries between high and low potential,
the gain values were in excess of 0.80, higher than typical data-driven models and unheard of for areas
without high terrain dissection or limitations on water availability. The objective here was not to evaluate
the application of this particular simplistic predictive model for development purposes, but to illustrate
that a theory-driven approach was far more powerful than a data-driven one in predicting how human
behaviour shapes site selection. The gain statistic, even with its inherent flaws, was merely used here as
Figure 13.1 Southern portion of the coastal Georgia study area: maximum available calories for white-tailed
deer (Odocoileus virginianus) for the month of September (ca. 500 BP). A colour version of this figure can be
found in the plates section.
Figure 13.2 Southern portion of the coastal Georgia study area: maximum available calories for all shellfish
species for the month of September (ca. 500 BP). A colour version of this figure can be found in the plates
section.
Figure 13.3 Southern portion of the coastal Georgia study area: returnable calories for all resources combined
for the month of January (ca. 500 BP). A colour version of this figure can be found in the plates section.
Figure 13.4 Southern portion of the coastal Georgia study area: returnable calories for all resources combined
for the month of September (ca. 500 BP). A colour version of this figure can be found in the plates section.
a comparative device with far more expensive data-driven models like the Mn/Model. This approach
was also used in a more formal predictive model successfully applied in parts of Louisiana, Arkansas and
Mississippi (Whitley et al., 2011), but which remained untested in the field since federal funding expired.
Predictive modelling has also been combined with formal models of settlement systems, which require
chronological information at a fine resolution. In the absence of this information, we can therefore only model spatio-temporal patterns
by including uncertainty. Bevan and Wilson (2013) and Paliou and Bevan (2016) assessed the quality
of their models by including simulated additional settlements in the modelled network, based on a
prediction of suitable site locations. By repeating this procedure a large number of times, it could be
established whether the resulting networks’ characteristics were stable or not. In this exercise, conven-
tional predictive modelling was therefore used to support archaeological analysis, rather than the other
way around.
Conclusion
Predictive modelling has a long and controversial history in archaeology. There are almost as many
methods of creating a predictive model as there are actual models out there. Yet we routinely see new
models that rely on a data-driven, correlative approach. These are almost always logistic regression-based
analyses applied to strictly environmental parameters taken straight from the Judge and Sebastian (1988)
playbook. Such techniques have worked well in some situations, but they leave a great deal to be desired
from an explanatory perspective and are routinely criticized for their perceived environmental determin-
ism, along with many other issues. Repeated application of these methods, developed in the 1970s and
1980s, seems to have driven a philosophical wedge between academic and CRM opinions regarding the
value of predictive models (Verhagen & Whitley, 2012). Although there are new and innovative develop-
ments arising every year in archaeological predictive modelling, the general (incorrect) perception is one
of methodological stagnation and theoretical limitations. Changing such a perception will eventually
require new publications that can revisit the theory and methods of predictive modelling, and can put
them into a more modern context.
References
Alden, J. R. (1979). A reconstruction of Toltec period political units in the Valley of Mexico. In C. Renfrew & K. L.
Cooke (Eds.), Transformations: Mathematical approaches to cultural change (pp. 169–200). New York, NY: Academic
Press.
Banks, W. E. (2017). The application of ecological niche modeling methods to archaeological data in order to exam-
ine culture-environment relationships and cultural trajectories. Quaternaire, 28, 271–276.
Bevan, A., & Wilson, A. (2013). Models of settlement hierarchy based on partial evidence. Journal of Archaeological
Science, 40, 2415–2427.
Carleton, W. C., Cheong, K. F., Savage, D., Barry, J., Conolly, J., & Iannone, G. (2017). A comprehensive test of the
Locally-Adaptive Model of Archaeological Potential (LAMAP). Journal of Archaeological Science: Reports, 11, 59–68.
Carleton, W. C., Conolly, J., & Iannone, G. (2012). A Locally-Adaptive Model of Archaeological Potential (LAMAP).
Journal of Archaeological Science, 39, 3371–3385.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge, UK: Cambridge University
Press.
Davies, T., Fry, H., Wilson, A., Palmisano, A., Altaweel, M., & Radner, K. (2014). Application of an entropy maximiz-
ing and dynamics model for understanding settlement structure: The Khabur triangle in the middle Bronze and
Iron ages. Journal of Archaeological Science, 43, 141–154.
Ducke, B., & Kroefges, P. C. (2008). From points to areas: Constructing territories from archaeological
site patterns using an enhanced XTENT model. In A. Posluschny, K. Lambers, & I. Herzog (Eds.), Layers of
perception: Proceedings of the 35th international conference on computer applications and quantitative methods in
archaeology (CAA), Berlin, Germany, April 2–6, 2007, Kolloquien zur Vor-und Frühgeschichte (Vol. 10, p. 243).
Bonn: Dr. Rudolf Habelt GmbH, + CD-ROM. Retrieved from http://proceedings.caaconference.org/
paper/78_ducke_kroefges_caa2007/
Emlen, J. M. (1966). The role of time and energy in food preference. American Naturalist, 100, 611–617.
Favory, F., Nuninger, L., & Sanders, L. (2012). Integration of geographical and spatial archeological concepts for the
study of settlement systems. L’Espace géographique, 41, 295–309.
Finke, P. A., Meylemans, E., & Van de Wauw, J. (2008). Mapping the possible occurrence of archaeological sites by
Bayesian inference. Journal of Archaeological Science, 35, 2786–2796.
Grayson, D. K., & Delpech, F. (1998). Changing diet breadth in the early Upper Palaeolithic of southwestern France.
Journal of Archaeological Science, 25, 1119–1129.
Hames, R., & Vickers, W. (1982). Optimal diet breadth theory as a model to explain variability in Amazonian hunt-
ing. American Ethnologist, 9, 358–378.
Hodder, I. (1974). Some marketing models for Romano-British coarse pottery. Britannia, 5, 340–359.
Hudak, G. J., Hobbs, E., Brooks, A., Sersland, C. A., & Phillips, C. (Eds.). (2002). Mn/model final report 2002: A
predictive model of precontact archaeological site location for the state of Minnesota. St. Paul, MN: Minnesota Department
of Transportation.
Judge, J. W., & Sebastian, L. (Eds.). (1988). Quantifying the present and predicting the past: Theory, method and application
of archaeological predictive modelling. Denver, CO: U.S. Department of the Interior, Bureau of Land Management.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
Kamermans, H., & Wansleeben, M. (1999). Predictive modelling in Dutch archaeology, joining forces. In J. Barceló, I.
Briz, & A. Vila (Eds.), New techniques for old times-CAA98: Computer applications and quantitative methods in archaeol-
ogy (pp. 225–230). Oxford: Archaeopress.
Kohler, T. A., & Parker, S. C. (1986). Predictive models for archaeological resource location. In M. B. Schiffer (Ed.),
Advances in archaeological method and theory (Vol. 9, pp. 397–452). New York: Academic Press.
Kondo, Y. (2015). An ecological niche modelling of Upper Palaeolithic stone tool groups in the Kanto-Koshinetsu
region, eastern Japan. The Quaternary Research, 54, 207–218.
Kvamme, K. L. (1988). Development and testing of quantitative models. In W. J. Judge & L. Sebastian (Eds.), Quanti-
fying the present and predicting the past: Theory, method, and application of archaeological predictive modelling (pp. 325–428).
Denver, CO: U.S. Department of the Interior, Bureau of Land Management Service Center.
Kvamme, K. L. (1997). GIS and statistical inference in Arizona: Monte Carlo significance tests. In I. Johnson &
M. North (Eds.), Archaeological applications of GIS: Proceedings of colloquium II, UISPP XIIIth congress, Forlí, Italy,
September 1996. Sydney: University of Sydney.
MacArthur, R. H., & Pianka, E. R. (1966). On the optimal use of a patchy environment. American Naturalist, 100,
603–609.
Nüsslein, A., Nuninger, L., & Verhagen, P. (in press). To boldly go where no one has gone before: Integrating social
factors in site location analysis and predictive modelling, the hierarchical types map. In J. B. Glover, J. M. Moss, &
D. Rissolo (Eds.), Digital archaeologies, material worlds. Proceedings of the CAA2017 Conference, Atlanta.
O’Connell, J. F., & Hawkes, K. (1984). Food choice and foraging sites among the Alyawara. Journal of Anthropological
Research, 40, 435–504.
Orians, G. F., & Pearson, N. E. (1979). On the theory of central place foraging. In D. J. Horn, R. D. Mitchell, &
C. R. Stairs (Eds.), Analysis of ecological systems (pp. 154–177). Columbus: Ohio State University Press.
Paliou, E., & Bevan, A. (2016). Evolving settlement patterns, spatial interaction and the socio-political organisation
of late prepalatial South-Central Crete. Journal of Anthropological Archaeology, 42, 184–197.
Renfrew, C., & Level, E. (1979). Exploring dominance: Predicting polities from centers. In C. Renfrew & K. L.
Cooke (Eds.), Transformations: Mathematical approaches to cultural change (pp. 145–166). New York, NY: Academic
Press.
Rihll, T. E., & Wilson, A. G. (1987). Spatial interaction and structural models in historical analysis: Some possibilities
and an example. Histoire & Mesure, 2, 5–32.
Rihll, T. E., & Wilson, A. G. (1991). Modelling settlement structures in ancient Greece: New approaches to the polis.
In J. Rich & A. Wallace-Hadrill (Eds.), City and country in the ancient world (Vol. 3, pp. 58–95). London: Routledge.
Rivers, R., Knappett, C., & Evans, T. (2013). What makes a site important? Centrality, gateways and gravity. In
C. Knappett (Ed.), Network analysis in archaeology: New approaches to regional interaction (pp. 125–150). Oxford:
Oxford University Press.
Smith, E. A. (1991). Inujjuamiut foraging strategies: Evolutionary ecology of an arctic hunting economy. New York: Aldine.
Stephens, D. W., & Krebs, J. R. (1986). Foraging theory. Princeton: Princeton University Press.
Predictive spatial modelling 245
Thomas, D. H. (2008). Native American landscapes of St. Catherines Island, Georgia. Anthropological Papers of the
American Museum of Natural History 88.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal
of Risk and Uncertainty, 5(4), 297–323.
Vanacker, V., Govers, G., Van Peer, P., Verbeek, C., Desmet, J., & Reyniers, J. (2001). Using Monte Carlo simulation
for the environmental analysis of small archaeologic datasets, with the Mesolithic in Northeast Belgium as a case
study. Journal of Archaeological Science, 28, 661–669.
Van Leusen, M., & Kamermans, H. (Eds.). (2005). Predictive modelling for archaeological heritage management: A research
agenda. Amersfoort: Rijksdienst voor het Oudheidkundig Bodemonderzoek.
Van Leusen, M., Millard, A. R., & Ducke, B. (2009). Dealing with uncertainties in archaeological prediction. In H.
Kamermans, M. van Leusen, & P. Verhagen (Eds.), Archaeological prediction and risk management: Alternatives to current
practice (pp. 123–160). Leiden: Leiden University Press.
Verhagen, P. (2008). Testing archaeological predictive models: A rough guide. In A. Posluschny, K. Lambers, &
I. Herzog (Eds.), Layers of perception: Proceedings of the 35th international conference on computer applications and quantita-
tive methods in archaeology (CAA), Berlin, Germany, April 2–6, 2007, Kolloquien zur Vor-und Frühgeschichte (Vol. 10,
pp. 285–291). Bonn: Dr. Rudolf Habelt GmbH.
Verhagen, P., Nuninger, L., Bertoncello, F., & Castrorao Barba, A. (2016). Estimating the “memory of landscape” to
predict changes in archaeological settlement patterns. In S. Campana, R. Scopigno, G. Carpentiero, & M. Cirillo
(Eds.), CAA 2015: Keep the revolution going: Proceedings of the 43rd annual conference on computer applications and
quantitative methods in archaeology (pp. 623–636). Oxford: Archaeopress.
Verhagen, P., & Whitley, T. G. (2012). Integrating predictive modelling and archaeological theory: A live report from
the scene. Journal of Archaeological Method and Theory, 19, 49–100.
Vita-Finzi, C., & Higgs, E. S. (1970). Prehistoric economy in the Mount Carmel area of Palestine: Site catchment
analysis. Proceedings of the Prehistoric Society, 36, 1–37.
Wakker, P. P., Timmermans, D. R. M., & Machielse, I. A. (2003). The effects of statistical information on insurance decisions
and risk attitudes. Amsterdam: Department of Economics, University of Amsterdam.
Warren, R. E. (1990). Predictive modeling in archaeology: A primer. In K. M. S. Allen, S. W. Green & E. B. W.
Zubrow (Eds.), Interpreting space: GIS and archaeology (pp. 90–111). London: Taylor and Francis.
Warren, R. E., & Asch, D. L. (2000). Site location in the Eastern Prairie Peninsula. In K. L. Wescott & R. J. Bran-
don (Eds.), Practical applications of GIS for archaeologists: A predictive modeling toolkit (pp. 5–32). London: Taylor and
Francis.
Wheatley, D. (2004). Making space for an archaeology of place. Internet Archaeology, 15. Retrieved from http://
intarch.ac.uk/journal/issue15/wheatley_index.html
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor and Francis.
Whitley, T. G. (2003). Causality and cross-purposes in predictive modeling. In M. der Stadt Wien, R. K. Erbe, &
S. Wien (Eds.), Enter the past: The E-way into four dimensions of cultural heritage. BAR International Series, 1227
(pp. 236–239). Oxford: Archaeopress.
Whitley, T. G. (2005). A brief outline of causality-based cognitive archaeological probabilistic modeling . In M.
van Leusen & H. Kamermans (Eds.), Predictive modelling for archaeological heritage management: A research agenda
(pp. 123–138). Amersfoort: Rijksdienst voor het Oudheidkundig Bodemonderzoek.
Whitley, T. G. (2010). Re-thinking accuracy and precision in predictive modeling. In F. Niccolucci & S. Hermon
(Eds.), Beyond the artifact: Digital interpretation of the past (pp. 312–318). Budapest: Archaeolingua.
Whitley, T. G. (2013). A paleoeconomic model of the Georgia coast (4500 to 300 BP). In V. Thompson & D. H.
Thomas (Eds.), Life among the tides: Recent archaeology of the Georgia bight (pp. 235–285). New York, NY: American
Museum of Natural History Anthropological Papers.
Whitley, T. G., Moore, G., Goel, G., & Jackson, D. (2010). Beyond the marsh: Settlement choice, perception and spatial
decision-making on the Georgia coastal plain. In B. Frischer, J. Webb Crawford, & D. Koller (Eds.), Making his-
tory interactive: Computer applications and quantitative methods in archaeology (CAA): Proceedings of the 37th international
conference, Williamsburg,Virginia, United States of America, March 22–26, 2009 (pp. 380–390). Oxford: Archaeopress.
Whitley, T. G., Moore, G., Jackson, D., Dellenbach, D., Goel, G., Bruce, J., . . . Futch, J. (2011). An archaeological predic-
tive model for the USACE,Vicksburg district: Western Mississippi, Northern Louisiana, and Southern Arkansas. American
246 Philip Verhagen and Thomas G. Whitley
Recovery and Reinvestment Act 2009: Section 110 Compliance Report for the U.S. Army Corps of Engineers,
Vicksburg District: NHPA, Cultural Resources Investigations Technical Report No. 7 (Vol. 3). Atlanta, GA:
Brockington and Associates, Inc.
Winterhalder, B., & Kennett, D. (2006). Behavioral ecology and the transition from hunting and gathering to agricul-
ture. In D. Kennett & B. Winterhalder (Eds.), Behavioral ecology and the transition to agriculture (pp. 1–21). Berkeley:
University of California Press.
14
Spatial agent-based modelling
Mark Lake
Introduction
Spatial agent-based modelling (ABM) is a method of computer simulation that can be used to explore
how the aggregate characteristics of a system – for example a settlement pattern, population dispersal
or distribution of artefacts – arise from the behaviour of artificial agents. In archaeological ABM the
agents are typically individual people or social units such as households. Agent-based modelling is often
presented as part of the toolkit of complexity science (Beekman & Baden, 2005; Epstein & Axtell, 1996),
but it is a very flexible method which can be used in projects informed by many different theoretical
perspectives.
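In outline, the mechanics are simple: individual-level rules are run forward in time and an aggregate property of the resulting system is measured. The deliberately minimal Python sketch below (all names and parameter values are invented for illustration, not drawn from any published model) starts 100 random-walking agents from a single origin and takes mean dispersal as the aggregate characteristic:

```python
import random

GRID = 50  # width/height of a square grid; all values here are illustrative

def step(agents):
    """Move every agent one cell in a random direction (clipped to the grid)."""
    for a in agents:
        dx, dy = random.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        a["x"] = min(GRID - 1, max(0, a["x"] + dx))
        a["y"] = min(GRID - 1, max(0, a["y"] + dy))

def dispersal(agents):
    """Aggregate outcome: mean distance of the agents from their shared origin."""
    cx, cy = GRID // 2, GRID // 2
    return sum(abs(a["x"] - cx) + abs(a["y"] - cy) for a in agents) / len(agents)

random.seed(1)
agents = [{"x": GRID // 2, "y": GRID // 2} for _ in range(100)]
for t in range(200):
    step(agents)
print(dispersal(agents))  # an aggregate pattern produced by individual-level rules
```

Everything interesting in a real ABM lies in replacing the trivial movement rule with theoretically motivated behaviour, but the run-rules-then-measure structure remains the same.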
It should be noted that while the vast majority of agent-based models used in archaeology are explic-
itly spatial, the representation of space is not a necessary feature of ABM (see e.g. Ferber, 1999). Moreover,
many of the issues that arise when using explicitly spatial ABM are essentially the same as those that
apply to the use of raster GIS, or various forms of statistical spatial analysis. For that reason, this chapter
focuses on issues which are specific to the use of ABM, most of which are relevant irrespective of whether
the model is spatial. It is therefore strongly recommended that this chapter be read alongside others in
this handbook. Chapters 2, 3, 7 and 19 may be particularly relevant to the task of preparing spatial input
data, while Chapters 4, 6, 8, 9 and 21 discuss methods that may be relevant to the statistical analysis and
presentation of spatial simulation outputs.
The primary purpose of this chapter is to explain the choices that must be made when designing,
building, experimenting with and disseminating an ABM. Readers seeking a practical tutorial complete
with sample models and code should consult Railsback and Grimm’s excellent (2012) Agent-Based and
Individual-Based Modelling: A Practical Introduction. Readers who are interested in the history and theory
of ABM in archaeology will find up-to-date reviews in Cegielski and Rogers (2016), and Lake (2014,
2015). Additional discussion of the relationship between ABM and archaeological theory can be found
in Aldenderfer (1998), Beekman and Baden (2005), Beekman (2005), Costopoulos (2010), Kohler (2000),
Kohler and van der Leeuw (2007a), McGlade (2005) and Mithen (1994). Useful textbooks on agent-
based modelling include Grimm and Railsback (2005) (aimed at ecologists), the rather briefer Gilbert
(2008) (aimed at sociologists) and Ferber (1999) (aimed at artificial intelligence researchers and computer
scientists).
• Sociality and cognition. Archaeologists have increased the realism of human agents by incorporating
aspects of social interaction. This ranges from agents learning from one another (Kohler, Cockburn,
Hooper, Bocinsky, & Kobti, 2012b; Lake, 2000a; Mithen, 1989; Premo, 2012; Premo & Scholnick,
2011), through simple collective decision-making (Lake, 2000a) to the exchange of goods (Bentley,
Lake, & Shennan, 2005; Kobti, 2012), group formation (Doran, Palmer, Gilbert, & Mellars, 1994;
Doran & Palmer, 1995) and the emergence of leaders (Kohler et al., 2012b). Another way of increas-
ing the realism of agents is to explicitly model learning and memory. These are a feature of a number
of models of hunter-gatherer foraging, including Costopoulos’ (2001) investigation of the impact
of time-discounting, Mithen’s (1989, 1990) model of decision-making in Mesolithic hunting and
Lake’s (2000a) spatial ABM of Mesolithic land-use. The last of these extends to each agent having
its own geographically referenced cognitive map of its environment (Figure 14.1).
• Evolution. In ABMs built to explore change over longer periods of time it may be appropriate for the
population of agents to evolve as a result of agent reproduction involving recombination or mutation
of agent rules (or other attributes that are normally fixed for the lifetime of the agent). Examples
include Premo’s model of hominin prosociality (2005), Kachel, Premo, and Hublin’s (2011) evaluation
of the ‘grandmother hypothesis’ for human evolution, Lake’s (2001b) model of the evolution of the
hominin capacity for cultural learning and Xue, Costopoulos, and Guichard’s (2011) model of the
extent to which tracking the environment too closely can be detrimental in the long term. There are
also a number of ABMs which model the cultural transmission of traits across agent generations, for
example Premo and Kuhn’s (2010) investigation of the effects of local extinctions on culture change
and diversity in the Palaeolithic.
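The evolutionary mechanism described here, reproduction that copies an otherwise fixed agent attribute with occasional mutation, combined with differential reproductive success, can be sketched as follows. The trait (a movement propensity), the fitness function and every parameter value are hypothetical, chosen only to show the machinery:

```python
import random

def reproduce(parent, mutation_sd=0.1):
    """Offspring inherit the parent's rule parameter, with mutation (clipped to [0, 1])."""
    child = dict(parent)
    child["move_prob"] = min(1.0, max(0.0, random.gauss(parent["move_prob"], mutation_sd)))
    return child

def fitness(agent, optimum=0.8):
    """Hypothetical fitness: the closer the trait to an optimum unknown to the agents,
    the more likely the agent is to be chosen as a parent."""
    return 1.0 - abs(agent["move_prob"] - optimum)

random.seed(2)
pop = [{"move_prob": random.random()} for _ in range(200)]
for generation in range(100):
    weights = [fitness(a) for a in pop]
    parents = random.choices(pop, weights=weights, k=len(pop))  # differential reproduction
    pop = [reproduce(p) for p in parents]

mean_trait = sum(a["move_prob"] for a in pop) / len(pop)
print(round(mean_trait, 2))  # the population mean drifts towards the optimum
```

In a published evolutionary ABM the 'fitness' would normally be an emergent outcome of agent behaviour (e.g. energy acquired) rather than an explicit function, but the inherit-mutate-select loop is the same.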
• Environmental change. Many archaeological ABMs include environmental change. One option is
external forcing, where the environment is altered over time to reflect palaeoenvironmental time-
series data. For example, in the Long House Valley ABM (see the case study in this chapter) the
maize yield changes over time following rainfall data, while in Xue et al.'s (2011) model, changes in
productivity are based on ice core data. Another option is to explicitly model the impact of agents
on the environment. For example, early versions of Kohler et al.’s Village ABM reduced yields from
continued farming (2000, 2012a), while recent versions also explicitly model the population growth
of prey species such as deer (Johnson & Kohler, 2012) thereby incorporating reciprocal human-
environment interaction. Additionally, archaeologists interested in the socioecological (Barton, Riel-
Salvatore, Anderies, & Popescu, 2011) dynamics of long-term human environment interaction have
coupled ABMs of human behaviour with geographical information systems or other raster models
of natural processes such as soil erosion (e.g. Barton, Ullah, & Bergin, 2010; Barton, Ullah, & Mita-
sova, 2010; Kolm & Smith, 2012).
Figure 14.1 Schematic illustration of the features of an Agent Based Model (ABM) with cognitive agents,
based on the model described in Lake (2000a). A colour version of this figure can be found in the plates section.
Source: Mapping data ©Crown copyright and database rights 2019 Ordnance Survey (100025252)
Figure 14.2 Example of the realistic rendering of a simulated landscape. A colour version of this figure can
be found in the plates section.
Source: Adapted from Ch’ng and Stone (2006)
• Virtual reality. Ch’ng and Stone (Ch’ng & Stone, 2006; Ch’ng, 2007; Ch’ng et al., 2011) have
combined ABM and gaming engine technology to generate dynamic vegetation models for archae-
ological reconstruction and interactive visualisation of Mesolithic hunter-gatherers foraging in a
landscape now submerged under the North Sea (Figure 14.2).
• Understanding long-term change. The notion that archaeology has much to offer contemporary society
as a science of long-term societal change and human-environment interaction (Johnson, Kohler, &
Cowan, 2005; van der Leeuw & Redman, 2002; van der Leeuw, 2008) has intellectual antecedents in
the mid-20th century programs of cultural ecology and sociocultural evolution (Kohler & van der
Leeuw, 2007a), but we now have better understanding of the importance of non-linearity, recur-
sion and noise in the evolution of living systems, whether that is couched in the language of chaos
(Schuster, 1988), complexity (Waldrop, 1992), evolutionary drive (Allen & McGlade, 1987), contin-
gency (Gould, 1989), niche construction (Odling-Smee, Laland, & Feldman, 2003), or structuration
(Giddens, 1984). Since ABMs explicitly model and give causal force to the micro-level parts (agents)
they are well suited to exploring how potentially non-linear long-term systemic change arises from
the decision-making of agents interacting with and even modifying their physical and social envi-
ronment (see Kohler & van der Leeuw, 2007a; Barton, 2013 for manifestos, Kohler & Varien, 2012
for the history and role of simulation in one long-running socionatural study, and Beekman &
Baden (2005) for a more overtly sociological perspective).
• Inferring behaviour from the archaeological record. ABM can be used in conjunction with ‘middle range
theory’ (Binford, 1977) to help infer what “organisational arrangements of behaviour” (Pierce, 1989,
p. 2) and human decision-making (Mithen, 1988) produced the observed archaeological evidence.
Archaeologists usually make the connection between past behaviour and its expected archaeological
outcome on the basis of “intuition or common sense, ethnographic analogies and environmental
regularities, or in some cases experimental archaeology” (Kohler et al., 2012a, p. 40), but computer
simulation is particularly advantageous for this purpose when the candidate behaviours can no lon-
ger be observed and have no reliable recent historical record. Moreover, simulation makes it possible
to explore the outcome of behaviour aggregated and sampled at the often coarse grained spatial and
temporal resolution of the archaeological record. Good examples of this are Mithen’s (1988, 1990)
use of ABM to generate virtual faunal assemblages resulting from different Mesolithic hunting goals
and Premo’s (2005) spatial ABM of Pleistocene hominin food sharing which revealed that the dense
artefact accumulations at Olduvai and Koobi Fora, long attributed to central place foraging, could alternatively have been formed by routed foraging in a patchy environment.
• Testing quantitative methods. The fact that computer simulation can be used to generate expected
outcomes of known behaviour also makes it well-suited for testing the efficacy of other analyti-
cal techniques. The role of such ‘tactical’ (Orton, 1982) simulations is to provide data resulting
from known behaviour that can then be sampled in ways that mimic the various depositional and
post-depositional processes which determine what evidence we eventually recover. By varying the
behaviour and/or the subsequent degradation of the data it is possible to investigate whether the
analytical technique in question is capable of retrieving a (typically statistical) ‘signature’ which is
unique to the original behaviour. Examples include tests of measures of the quantity of pottery
(Orton, 1982), the efficacy of multivariate statistics (Aldenderfer, 1981b) to differentiate functional
assemblages, the ability of cladistic methods to reconstruct patterns of cultural inheritance (Eerkens,
Bettinger, & McElreath, 2005), the relationship between temporal frequency distributions and pre-
historic demography (Surovell & Brantingham, 2007), the effect of field survey strategy in the recov-
ery of data from battlefields (Rubio-Campillo, María Cela, & Hernàndez Cardona, 2011), and the
robustness of population genetic methods when applied to time-averaged archaeological assemblages
(Premo, 2014).
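A tactical simulation of this kind can be tiny. The sketch below (behaviours, recovery rate and all other parameters are invented for illustration) generates artefact counts per survey cell from two known depositional behaviours, degrades them by partial recovery, and asks whether a simple statistic, the variance-to-mean ratio, still separates the behaviours after degradation:

```python
import random
import statistics

def deposit(clustered, n_artefacts=1000, n_cells=100):
    """Generate artefact counts per cell from a known depositional behaviour."""
    counts = [0] * n_cells
    if clustered:
        hotspots = random.sample(range(n_cells), 5)  # deposition at a few loci
        for _ in range(n_artefacts):
            counts[random.choice(hotspots)] += 1
    else:
        for _ in range(n_artefacts):                 # spatially random deposition
            counts[random.randrange(n_cells)] += 1
    return counts

def survey(counts, recovery_rate=0.3):
    """Mimic depositional/post-depositional loss by thinning each cell's count."""
    return [sum(random.random() < recovery_rate for _ in range(c)) for c in counts]

def vmr(counts):
    """Variance-to-mean ratio: well above 1 suggests clustering, near 1 randomness."""
    return statistics.variance(counts) / statistics.mean(counts)

random.seed(3)
print(vmr(survey(deposit(clustered=True))), vmr(survey(deposit(clustered=False))))
```

Here the 'signature' survives degradation; varying the recovery rate downwards would show at what point it no longer does.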
Method
Having decided on the purpose of an ABM, the next task is to determine what should be included and in what detail (system bounding), followed by how exactly those elements should be modelled (detailed design).
After this it will be necessary to choose software to implement the model and, having implemented it,
verify that it works correctly (in a software sense). As discussed next, creating a computer model can be
informative in itself, but ultimately the purpose is to run experiments, which should be carefully designed
according to the purpose of the model and earlier decisions about system bounding and detailed design.
Finally, the modeller should consider how to disseminate the model to promote reproducible research and
the longer-term advancement of knowledge. Each of these major topics is discussed in turn.
Before turning to these topics individually, it is worth highlighting some general issues which influence the capacity of a model to generate new knowledge (see Lake, 2015 for a more detailed treatment).
• Informative models are generative. It is widely agreed (see Beekman, 2005; Costopoulos, 2009; Kohler,
2000; Premo, 2008) that the explanatory power of a simulation model lies in the fact that it “must
be observed in operation to find out whether it will produce a predicted outcome” (Costopoulos,
2009, p. 273). Models which must be run to determine their outcome are termed ‘generative’ (with
respect to the phenomenon of interest). The challenge when building generative models is to avoid
an infinite regress: imagine the complexity of a model in which social institutions emerge from the
actions of individual people whose selves in turn emerge from explicit modelling of their underlying
neuropsychology, which is in turn modelled as an outcome of the replication and mutation of genes.
The outcomes of a model like this might be so sensitive to chance events that it would effectively
have little or no explanatory power and, in any case, it would very likely be computationally intrac-
table. The solution is to ‘bracket’ or hold constant those aspects of the world thought to be causally
distant from the question at hand. For example, in biology it is possible to gain useful insights into
cycles of mammalian population growth and collapse without modelling atomic vibrations within
the biomolecules that make up muscle fibres. Even sociologists who reject the ontological reality of
social institutions accept that for practical purposes it may be necessary “to assume certain back-
ground conditions which are not reduced to their micro dimensions” (King, 1999, p. 223). Ensuring
that an ABM is “generative with respect to its purpose” (Lake, 2015, p. 25) requires a clear statement
of what question(s) the model is intended to answer in order that it is clear what can be treated as
known and thus included in the model specification, and what is to be explained, and should there-
fore be left to be discovered by running simulations (see also Kohler et al., 2012a).
• There is a trade-off between realism and generality. In practice it is impossible to simultaneously maximise
the generality, realism, and precision of models of complex systems (Levins, 1966). Broadly speaking,
one can have a generalised and probably relatively abstract model which fits many cases but none
of them in every detail, or a more specific and probably more realistic model which fits just one or
a few cases in greater detail. In the case of an ABM greater realism normally entails one or more of
the following:
1 Capturing a larger number of different properties of the modelled entities. For example, does
the environment contain woodland, or is it made up of several different tree species which have
different calorific output when burned?
2 Modelling more of the relationships between different entities and so capturing a larger number
of real-world processes. For example, when a hunter kills prey, does that have no effect on the
subsequent availability of prey, or does it deplete the prey population and, if so, does that in turn
impact on future prey population growth?
3 Less commonly, visual realism in the sense of being rendered in a virtual reality.
There are different views on the relative merits of realism versus generality. Kohler and van der
Leeuw (2007b, p. 3) argue that “A good model is not a universal scientific truth but fits some portion of the real world reasonably well, in certain respects and for some specific purpose”, so
unsurprisingly they suggest that the choice between realism and generality should be made accord-
ing to the scope and purpose of the model. Others see a strong presumption in favour of simplicity
(Premo, 2008; Costopoulos, 2017) on the grounds that: (a) understanding requires reducing complexity
to “intelligible dimensions” (Wobst, 1974, p. 151); (b) it is more parsimonious to discover how much
complexity is necessary to explain the observed phenomenon than it is to assume it from the outset
(Premo, 2007); and (c) models which have not been finely honed to fit a particular case but can
account for a greater diversity of cases have greater explanatory power because they allow one to
predict what should happen in a wider range of circumstances (Costopoulos, 2009).
Environment
It is possible to build an ABM in which the agents are not explicitly situated in any kind of space, although
in archaeology that is largely confined to tactical applications (e.g. Eerkens et al., 2005). Most archaeo-
logical ABM are spatial and the introduction of space requires consideration of three important issues.
• Geometry. Spatial ABM can have very different degrees of geometric specificity (Worboys & Duck-
ham, 2004). A purely topological network of agents explicitly models which agent is connected to
which. Adding edge-weights to the network (see also Brughmans & Peeples, this volume) allows
the modeller to provide information about the relationship between the agents (which could be the
distance between them in Euclidean space or a non-spatial property such as their similarity with
respect to some trait). More commonly, agents are located in Euclidean space, typically by placing
them on a regular grid of cells akin to a GIS raster map. The grid can be ‘empty’, simply serving to
locate agents with respect to one another, or it may contain values representing terrain or some other
aspect of the environment. Gridded environments can be abstract, or they can be a geographically
referenced representation of some part of the earth’s surface. Often the opposite edges of abstract-
gridded environments are joined to form a continuous surface on a torus (doughnut), thereby avoid-
ing edge effects such as a reduction in spatial neighbourhood (see e.g. Premo, 2005).
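Joining opposite edges into a torus is a one-line operation using modular arithmetic. An illustrative sketch of a von Neumann (four-cell) neighbourhood on such a wrapped grid:

```python
def torus_neighbours(x, y, width, height):
    """Von Neumann neighbourhood on a grid whose opposite edges are joined."""
    return [((x + dx) % width, (y + dy) % height)
            for dx, dy in [(-1, 0), (1, 0), (0, -1), (0, 1)]]

# Every cell has the same number of neighbours, so there are no edge effects:
print(torus_neighbours(0, 0, 10, 10))  # → [(9, 0), (1, 0), (0, 9), (0, 1)]
```

On a geographically referenced grid, by contrast, wrapping makes no sense and edge effects must be handled in some other way (e.g. by buffering the study area).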
• Updating. An important consideration is whether the agents’ environment should be updated as the
simulation runs. For example, in a simulation run for 100 years it would probably not be necessary
to update terrain height, whereas it might be appropriate to denude a resource exploited by agents
as and when they ‘harvest’ it. The latter would require a decision about whether, when and how
the resource should regenerate. A decision of this nature will require careful thought about system
bounding because it involves determining whether the resource can simply be ‘reset’ to some fixed
value, or whether it should be set to a new value which is itself the outcome of explicitly modelling
the process of regeneration. The latter blurs the boundary between agents and environment because
in a sense the environment has acquired ‘behaviour’ whose outcome may not be known without
running simulations – it too has become a generative phenomenon.
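The harvest-and-regenerate logic described here can be sketched as follows, using an assumed logistic regrowth rule and arbitrary parameter values; a real model would ground both in the ecology of the resource concerned:

```python
def harvest(cell_value, amount):
    """Agents take up to `amount` from a cell; returns (amount taken, new cell value)."""
    taken = min(cell_value, amount)
    return taken, cell_value - taken

def regenerate(cell_value, growth_rate=0.2, carrying_capacity=100.0):
    """Logistic regrowth: depleted cells recover towards the carrying capacity."""
    return cell_value + growth_rate * cell_value * (1 - cell_value / carrying_capacity)

value = 100.0
taken, value = harvest(value, 60)   # an agent 'harvests' the cell
for _ in range(10):                 # the cell recovers over subsequent timesteps
    value = regenerate(value)
print(taken, round(value, 1))       # recovery is partial, not an instant 'reset'
```

Replacing `regenerate` with a fixed reset to the carrying capacity is the simpler system-bounding choice mentioned above; modelling regrowth explicitly, as here, is what gives the environment 'behaviour' of its own.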
• Input data. The task of populating an ABM environment with appropriate values varies enormously
in magnitude. An abstract model might use a synthetic environment of resource availability in
which the absolute values may be arbitrary but perhaps the environment as a whole is characterised
by a particular property, for example a specific amount of spatial autocorrelation (Lake, 2001b). In
this case, a suitable grid of values can easily be created using GIS or statistical software (see Lloyd &
Atkinson, this volume). At the other extreme are ABMs with environments that represent the real
world at some point in time. The necessary paleoenvironmental reconstruction is often a significant
project in its own right, entailing both fieldwork and modelling (e.g. the case study in this chapter
and also Barton et al., 2010; Kohler et al., 2007; Wilkinson et al., 2007). Interpolating from sparse
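One simple way to produce a synthetic surface with a controlled amount of spatial autocorrelation, of the kind described above, without recourse to GIS software, is to smooth random noise: each pass of neighbour averaging increases the autocorrelation. The sketch below is purely illustrative (grid size, passes and the roughness measure are arbitrary choices):

```python
import random

def noise_grid(n):
    """An n-by-n grid of spatially uncorrelated random values."""
    return [[random.random() for _ in range(n)] for _ in range(n)]

def smooth(grid):
    """One pass of neighbour averaging (toroidal) increases spatial autocorrelation."""
    n = len(grid)
    return [[(grid[i][j]
              + grid[(i - 1) % n][j] + grid[(i + 1) % n][j]
              + grid[i][(j - 1) % n] + grid[i][(j + 1) % n]) / 5
             for j in range(n)] for i in range(n)]

def roughness(grid):
    """Mean absolute difference between horizontal neighbours (lower = smoother)."""
    n = len(grid)
    return sum(abs(grid[i][j] - grid[i][(j + 1) % n])
               for i in range(n) for j in range(n)) / n ** 2

random.seed(4)
g = noise_grid(30)
print(roughness(g) > roughness(smooth(smooth(g))))  # smoothing adds autocorrelation
```

A statistically principled equivalent would generate the surface from a specified variogram or autocorrelation coefficient, as discussed by Lloyd and Atkinson (this volume).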
Agents
ABM is scale-agnostic, so an agent can be any entity which can be treated as an individual in the sense that
it acts as a cohesive whole in respect of the particular research problem (Ferber, 1999). In archaeological
ABM the agents are usually individual people, or groups of people such as households, so the most impor-
tant design decisions usually concern agent goals, behaviour and learning (sociality is discussed later in
the context of collectives). Note that many of the issues discussed here are not relevant to uncomplicated
abstract models such as, for example, ABMs of cultural transmission in which agents simply copy traits
from other agents (e.g. Lake & Crema, 2012).
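A model of this uncomplicated abstract kind, unbiased cultural transmission by copying, fits in a dozen lines. In this sketch (population size, number of copying events and the initial distribution of traits are all arbitrary) trait diversity declines through copying alone, i.e. by drift:

```python
import random

random.seed(5)
N_AGENTS, N_EVENTS = 100, 500
traits = list(range(N_AGENTS))  # initially every agent holds a distinct trait

for _ in range(N_EVENTS):
    copier = random.randrange(N_AGENTS)
    model = random.randrange(N_AGENTS)   # copy a trait from a randomly chosen agent
    traits[copier] = traits[model]

print(len(set(traits)))  # diversity declines through unbiased copying alone
```

Published transmission models elaborate this skeleton with innovation, biased copying, population structure and so on, but the copy-loop core is the same.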
• Attributes, states and behaviour. Attributes are enduring traits which an agent possesses throughout
its lifespan, for example whether it is male or female. In contrast, states change as a result of agent
behaviour and decision-making (e.g. their location, energy reserves), the passage of time (age), or
possibly external agent or environmental impacts (e.g. theft of resources). Whether a given trait
counts as a fixed attribute or variable state depends on the framing of the research question. For
example, consider two different ways of building an ABM to explore the transition from foraging
to farming: endow agents with the decision-making capacity to change their preferred subsistence
strategy (e.g. Bentley et al., 2005), or allow the relative proportion of lifetime foragers and farmers in
the overall population of agents to change as a result of differential reproduction, inter-generational
cultural transmission, or land-use competition (e.g. Angourakis et al., 2014). The former makes the
subsistence strategy a state, the latter an attribute. Which of these is the right approach depends
on the archaeological evidence, the duration of the transition and time-scale of the model, and the
modeller’s views concerning the primacy of individual human agency.
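The distinction can be made concrete in code. In this hypothetical sketch (names and values are illustrative) the subsistence strategy is framed as a lifetime attribute, while location, energy reserves and age are states that change as the simulation runs:

```python
from dataclasses import dataclass

@dataclass
class Forager:
    # Attribute: fixed for the agent's lifetime (here, its subsistence strategy,
    # as in the 'lifetime forager/farmer' framing described above).
    strategy: str
    # States: change through behaviour, the passage of time, or external impacts.
    x: int = 0
    y: int = 0
    energy: float = 100.0
    age: int = 0

    def step(self):
        self.age += 1          # the passage of time
        self.energy -= 5.0     # a cost of living; behaviour would replenish it

a = Forager(strategy="forager")
a.step()
print(a.strategy, a.age, a.energy)  # → forager 1 95.0
```

Reframing the research question, e.g. to allow individual strategy switching, would simply move `strategy` from the fixed attributes into the mutable states.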
• Goals. Agents are autonomous in the sense that their behaviour is directed by their own goals,
which may be different from those of other agents (Ferber, 1999, pp. 9–10). Ordinarily, an agent’s
ultimate goals will be determined by the modeller, but its proximal (immediate) goal at any particular time during the simulation may be variable if it has been endowed with the capacity for
meta decision-making (see Mithen, 1990). In evolutionary ABMs, in which agents differentially
reproduce, the modeller usually determines a set of ultimate goals but does not specify which
individual agent has which goal except perhaps for the first generation. Evolutionary ABMs, in
which the suite of goals can evolve by recombination during agent reproduction, are uncommon
in archaeological ABM.
• Rules. An agent’s behaviour depends on decision-making rules which determine how it ‘thinks’
it can best pursue its goals given the circumstances in which it finds itself. These rules are speci-
fied by the modeller (except in models where they can evolve), but if the model is generative it
will be necessary to run the simulation to discover how the agents actually behave. Ordinarily,
agents are rational in the sense that their decision-making rules ensure a non-random relationship
between their goals, circumstances and behaviour. Rationality in this sense requires that agents
have some measure of the absolute or relative ‘worth’ of the actual or predicted outcomes of differ-
ent behaviours – what biologists term ‘fitness’ and economists term ‘utility’ (Railsback & Grimm,
2012, p. 143). This terminology and the fact that many archaeological ABMs use insights from
behavioural ecology (see Kohler, 2000; Mithen, 1989 for arguments in favour) has led to criticism
of agent decision-making rules on the grounds that they project modern rationality back into
the past (e.g. Clark, 2000; Cowgill, 2000; Shanks & Tilley, 1987; Thomas, 1991). There are two
issues at stake here: (a) is it appropriate to invoke a rationality grounded in modern evolutionary
biology or neoliberal economics, and (b) is it actually necessary to do so when using ABM. This
debate was reviewed by Lake (2004), who argued that ABM can in principle accommodate alterna-
tive rationalities.
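A minimal sketch of such a decision rule, with a hypothetical net energetic return standing in for 'utility' or 'fitness', and invented patch names and values:

```python
# Hypothetical foraging patches: distance from the agent and expected food yield.
patches = [{"name": "marsh", "distance": 2, "yield": 30},
           {"name": "wood", "distance": 5, "yield": 80},
           {"name": "shore", "distance": 9, "yield": 90}]

def net_return(patch, travel_cost=10):
    """'Utility' here is a net energetic return: yield minus the cost of travel."""
    return patch["yield"] - travel_cost * patch["distance"]

# The decision rule: pursue the behaviour (patch) with the highest estimated worth.
best = max(patches, key=net_return)
print(best["name"])  # → wood (80 - 50 beats both 30 - 20 and 90 - 90)
```

An alternative rationality is accommodated by swapping `net_return` for a different worth function (e.g. one privileging proximity to ancestral places), without changing the rest of the model.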
• Agent prediction/learning. In an ABM learning can take place at the level of individual agents and/or
the system as a whole. The latter is discussed in the context of collectives. An individual agent can
be said to learn when it:
1 Discovers what resources are present in the environment as it moves through it. Note that a
cognitivist would require that the agent forms a representation of the environment that is sepa-
rate from the environment itself – a good test of this is whether the agent can ever have incor-
rect knowledge of its environment (perhaps due to the subsequent actions of other agents).
2 Forms a view about something that is not directly observable. For example, the likelihood of
encountering a particular type of animal is not directly observable, but must be inferred from
the number of actual encounters in a given duration and, as a result, different agents could end
up with different estimates based purely on chance. The accuracy of this kind of learning in a
changing environment depends on how much weight agents give to more distant events relative
to less distant ones, where distance could be in either time or space or both (see Costopoulos,
2001; Wren, Xue, Costopoulos, & Burke, 2014).
3 Copies behaviour or obtains knowledge from another agent. Note that use of the term ‘social
learning’ to describe this is intended to emphasise the fact that such learning eschews direct
observation of the environment, not that it necessarily entails a patterned (social) relationship
between the agents involved (Hinde, 1976).
The possibility of explicitly modelling learning means that ABM can be used to build formal quantitative
models in which humans are not perfect all-knowing decision-makers (see Bentley & Ormerod, 2012;
Mithen, 1991; Reynolds, 1987; Slingerland & Collard, 2012).
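The second kind of learning, and the effect of how much weight agents give to recent versus distant events, can be sketched with an exponentially weighted estimate (the encounter history and weights are illustrative):

```python
def update_estimate(estimate, observation, recency_weight):
    """Exponentially weighted estimate: a higher weight tracks recent events closely."""
    return (1 - recency_weight) * estimate + recency_weight * observation

# Two agents observe the same encounter history (1 = animal encountered) but
# weight recent events differently, so they end up with different beliefs.
history = [1, 1, 1, 0, 0, 0, 0, 0]
fast, slow = 0.5, 0.5   # both start from the same prior estimate
for obs in history:
    fast = update_estimate(fast, obs, recency_weight=0.5)
    slow = update_estimate(slow, obs, recency_weight=0.1)

print(round(fast, 3), round(slow, 3))  # → 0.029 0.375
```

The fast learner has all but forgotten the early encounters, which is advantageous when the environment changes quickly but, as Xue et al. (2011) show, can be detrimental in the long term.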
Collectives
Both sociologists (Gilbert, 1995) and archaeologists (Kohler & van der Leeuw, 2007a; Beekman, 2005)
have advocated using ABM to study the emergence of social norms and institutions from the beliefs and
actions of individuals. Emergence is a thorny philosophical problem (see Bedau & Humphreys, 2008)
and readers are referred to Beekman (2005) and Lake (2015) for more detailed discussion of the issues
as they relate to archaeological ABM. Basically, the concept of emergence raises two main questions in
the social sciences. One is whether the apparently recursive relationship between individuals and society
means that social institutions actually exert irreducible causal influence on agents (see Gilbert, 1995, for an
overview). The other question is whether the fact that human agents reason about the emergent proper-
ties of their own societies makes emergence in human systems qualitatively different from emergence in
physical systems (Conte & Gilbert, 1995; Gilbert, 1995).
In practice one can distinguish three kinds of ‘collective’ phenomenon in ABM:
1 Robust population-level patterning in the interactions of individual agents who are not, however,
aware of this patterning.
256 Mark Lake
2 Patterned interaction in which agents are in some sense aware of the pattern and perhaps even adjust
their behaviour accordingly. For example, an agent might consider itself to belong to a group of
agents who share complementary goals, but not actually engage in collective decision-making.
3 Agents contributing to and abiding by collective decision-making. Examples of this kind of strong
collective can be found in archaeological ABM of hunter-gatherer (Lake, 2000a) and small-scale
agricultural societies (Kohler et al., 2012b).
The modeller must decide how far to pre-program collectives or whether to allow them to emerge. The
first kind of collective can readily be obtained by true emergence, whereas the second and third types are
more commonly (Railsback & Grimm, 2012, p. 210) scaffolded by programming agents with additional
characteristics (such as a group ID) and/or programming the characteristics of the collective entities (for
example, specifying the possible states and behaviours of groups even before any agents actually belong
to them). At the present time, archaeological ABM typically offer either emergent collective phenomena,
or collectives with some causal influence over agents, but not both (see Lake, 2015, for a more detailed
assessment).
Treatment of time
Modelling how a process unfolds over time requires decisions about the appropriate temporal intervals
and the scheduling of events.
• Temporal intervals and duration. The temporal intervals (timesteps) should reflect the frequency and
duration of the relevant agent decision-making and behaviour. It is not always necessary to calibrate
a simulation in terms of real-world time: for example, a tactical simulation intended to help develop
measures of drift in cultural evolution might have timesteps which are just abstract generations. The
total duration should reflect the rate at which the outcomes of agent behaviour accumulate to pro-
duce detectable patterns, both in the simulation itself and in the archaeological record (if relevant).
Note that the minimum temporal envelope within which changes in behaviour can be observed in
the archaeological record will often be longer than the duration over which such changes are detect-
able in the simulation results. One of the advantages of ABM is that it can be used to investigate
what the results of ethnographic-scale human behaviours might look like when time-averaged in
the archaeological record (e.g. Premo, 2014), that is to say, what the accumulation of material from
multiple episodes of behaviour might look like when aggregated across the minimum time-span that
archaeologists can differentiate given the effects of post-depositional processes and available dating
techniques.
• Scheduling. An ABM can be event driven, in which case agents individually schedule their own
activities (e.g. Lake, 2000b), or programmed so that all agents undertake activities at the same set
intervals. The latter scenario is much more common, but unless the ABM is being run on specialised
parallel hardware, the simulation will proceed sequentially even if conceptually agents are considered
to be undertaking activities at the same time. In this case, it is good practice to ensure that agents
do not undertake activities in the same order at every timestep, so as to avoid arbitrarily advantaging
or disadvantaging those that come towards the front or back of the execution queue. It will also be
necessary to decide whether or not agents should be aware of the results of the behaviour of other
agents who preceded them in the queue. As an example, agents who are unaware that other agents
have already harvested a resource in the same timestep will base their decision-making on imperfect
knowledge, so the question is whether perfect or imperfect knowledge better captures reality.
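The shuffled activation just described can be sketched as follows. The names (`agents`, `world`, the `act` method) are hypothetical and not tied to any particular framework:

```python
import random

def step(agents, world, rng=random):
    """Advance the simulation by one timestep with randomised activation.

    Shuffling the execution order each timestep avoids systematically
    advantaging agents near the front of the queue.  Because agents act
    sequentially, later agents see a world already modified by earlier
    ones, i.e. they have imperfect knowledge of this round's behaviour.
    """
    order = list(agents)   # copy, so the master list is untouched
    rng.shuffle(order)     # a new random order every timestep
    for agent in order:
        agent.act(world)   # may, e.g., deplete resources others will see
```

Passing an explicit `rng` makes runs reproducible from a seed, which matters later for experimental design.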
Spatial agent-based modelling 257
Computer hardware
Productive ABMs have been run on hardware ranging from laptops to high performance computers
(HPC) offering hardware parallelism. Hardware requirements are a function of the complexity of the
model and the rigour of the experimental design (see the next section). In many cases it is the latter which
poses the greatest challenge – a simulation which completes in one hour becomes a different proposi-
tion if it is necessary to undertake 1000 runs for all possible combinations of three parameters which can
each take ten values! Hardware evolves very rapidly, but one general point worth noting is that simply
increasing the number of cores in a computer does not increase the speed of simulation unless either the
software supports parallel execution of the code, or it is possible to arrange simultaneous execution of
multiple different simulations.
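The combinatorial cost of a rigorous design is easy to underestimate. A minimal sketch of the full factorial sweep described above (the parameter names and values are invented for illustration):

```python
import itertools

# Hypothetical sweep: three parameters, ten values each, as in the text.
param_grid = {
    "birth_rate":  [round(0.01 * i, 2) for i in range(1, 11)],
    "move_cost":   list(range(1, 11)),
    "memory_span": list(range(5, 55, 5)),
}
replicates = 1000  # runs per combination, to average out stochasticity

# Every combination of parameter values (the 'full factorial' design).
combos = list(itertools.product(*param_grid.values()))
total_runs = len(combos) * replicates
# 10 x 10 x 10 combinations, each replicated 1000 times: 1,000,000 runs.
```

A one-hour simulation swept this way needs either drastic simplification, parallel execution of independent runs, or access to HPC.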
Software platforms
Implementation of an ABM invariably requires some computer programming, so the modeller will either
need to learn to program or collaborate with others who can. ABM can be implemented using a variety
of programming languages and software, each of which has pros and cons.
• General purpose programming languages (e.g. C++, Java, Python). These might be a good choice if the
modeller already knows the programming language and the model is relatively simple. An ABM
written in a general purpose compiled language such as C++ is likely to run very fast, but on the
other hand the lack of existing functionality may slow down development of a more complex
model, especially if a graphical user interface (GUI) is required and/or if integration with GIS or statistical software is needed.
• Statistical/mathematical programming languages. To judge from recent examples (e.g. Crema, 2014;
Crema & Lake, 2015) statistical programming languages such as R are probably better suited to
simpler abstract ABM, especially where a GUI is not required. Quantitatively inclined archaeologists
may already be conversant with languages such as R, but the greatest advantage of this approach
is the direct integration of the model into a powerful framework for the statistical analysis of the
simulation results (see for example Bevan, this volume and Crema, this volume), which can greatly
facilitate rigorous experimental design.
• Dedicated simulation frameworks. Dedicated simulation frameworks (e.g. Ascape, Mason, Repast Simphony, SWARM) may provide a ‘drag-and-drop’ graphical model building tool, but in most cases
the modeller will end up writing at least some programming code using an object-oriented lan-
guage such as Objective C, Java or Python. The main advantage of a simulation framework is that
it provides code for functionality such as controlling the simulation, setting parameters, scheduling
agents, drawing them on screen, logging results and often also exchanging data with other software
such as GIS. The most popular frameworks are largely ‘paradigm agnostic’ in that they do not impose
a particular concept of what constitutes an agent or how to model the environment. Additionally,
some frameworks (e.g. Pandora, Repast for HPC) support implementing ABM on high performance
computers. Taken together, these attributes make the popular simulation frameworks well-suited for
implementing complex computationally intensive ABMs.
• Integrated modelling environment. An integrated modelling environment provides a ‘one-stop’ solu-
tion for implementing an ABM by providing a single GUI for writing program code, running
simulations, visualizing and logging the results and even automating multiple runs with different
parameters. The best known is NetLogo, which provides an excellent vehicle for learning ABM
(Railsback & Grimm, 2012 uses it) while at the same time being capable of supporting useful sci-
entific experiments in archaeology (e.g. Premo, 2014). Indeed, a particular advantage of NetLogo
is the built-in support for sensitivity analysis, which facilitates and encourages the experimentation
required to actually learn from an ABM. NetLogo was, however, originally designed around a particular concept of agents and their environment, so it may occasionally (though probably rarely) prove unnatural or even impossible to use it to implement a specific conceptual model.
• Loose coupling entails moving data between the ABM and GIS by saving and importing files that
both can read, typically a real or de facto interchange format such as ESRI’s shapefile and ASCII grid
formats. The most popular simulation frameworks and integrated modelling environments provide
the necessary functionality to achieve this kind of coupling, which generally occurs at the beginning
and end of each simulation.
• Tight coupling involves one or both of two enhancements over loose coupling. One is that the ABM
can directly access the GIS data in its native format by connecting to the geodatabase maintained by
the GIS software. Avoiding the need to convert data into an intermediate format and/or write it to
disk potentially increases the speed of data exchange, thereby facilitating the second enhancement,
which is synchronisation of the ABM and GIS, usually so that the GIS can actually be used to modify
the environment occupied by agents at intervals during the simulation (e.g. Barton et al., 2015).
Tight coupling of this nature generally requires that both the ABM and GIS can be controlled by a
meta-program (typically a Unix shell script or Python script).
• Integration takes tight coupling one step further and dissolves the distinction between the ABM and
GIS software by embedding one in the other. One option is to model environmental change by
implementing the relevant algorithms within the ABM, even to the extent of treating aspects of the
environment (such as woodland) as being made up of agents (individual trees). Another is to modify
the GIS software to implement agent behaviour and dynamic updating of the GIS data (e.g. Lake,
2000b); this requires that the GIS software has a rich scripting language or that its source code is
available for modification (as with open source software).
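As an illustration of loose coupling, a raster produced by an ABM can be saved in the ESRI ASCII grid format, which most GIS packages import directly. This is a minimal sketch; a real model would also need to read such grids back in and to handle projection metadata:

```python
def write_ascii_grid(path, grid, xll=0.0, yll=0.0, cellsize=100.0,
                     nodata=-9999):
    """Write a 2-D list of numbers as an ESRI ASCII grid (.asc).

    The ABM dumps its state in this interchange format at the end of a
    run and the GIS reads the file for display or further analysis --
    the essence of loose coupling.
    """
    nrows, ncols = len(grid), len(grid[0])
    with open(path, "w") as f:
        f.write(f"ncols        {ncols}\n")
        f.write(f"nrows        {nrows}\n")
        f.write(f"xllcorner    {xll}\n")
        f.write(f"yllcorner    {yll}\n")
        f.write(f"cellsize     {cellsize}\n")
        f.write(f"NODATA_value {nodata}\n")
        for row in grid:                 # rows run north to south
            f.write(" ".join(str(v) for v in row) + "\n")
```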
Verification
Verification is the process of ensuring that the ABM program code correctly implements the conceptual
model (Aldenderfer, 1981a). Verification is not intended to determine whether the underlying concep-
tual model is a good model of the world, although it can sometimes reveal flaws of logic, typically where
the conceptual model simply does not specify what should happen under certain circumstances. Readers
should consult Railsback and Grimm (2012), Chapter 6, for practical advice about how to verify ABM
program code.
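One practical verification tactic is to test individual rules in isolation against the conceptual model's verbal statements, paying particular attention to boundary cases. A sketch, using an invented household fission rule:

```python
def should_fission(age, store, fission_age=16, min_store=100):
    """Hypothetical rule: a household fissions only once it has reached
    the fission age AND holds enough stored grain to endow an offshoot.

    Verification asks whether this code matches that verbal statement
    of the conceptual model, including at the exact thresholds.
    """
    return age >= fission_age and store >= min_store

# Verification: exercise the rule at and around its boundaries.
assert should_fission(16, 100)        # exactly at both thresholds
assert not should_fission(15, 500)    # too young, however well stocked
assert not should_fission(40, 99)     # old enough but one unit short
```

Such tests also expose cases the conceptual model fails to specify, for example what should happen when values are missing or negative.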
• Dealing with parameter uncertainty. If there is uncertainty about what parameter values best represent
the state of the world in the past then it will be necessary to run multiple simulations with different
values in order to establish the likelihood of different outcomes (rather as for dealing with stochas-
ticity). Note, however, that the likelihood of different outcomes can only be estimated if attention
is paid to both the range of parameter values and the probability that they are correct, since propor-
tionately more simulations should be run with the more likely parameter values. Consequently, the
parameter values should be drawn from a distribution which reflects the nature of the uncertainty:
for example a uniform distribution if all values are equally plausible, but perhaps a normal distribu-
tion if a certain value is most likely and more distant values increasingly unlikely.
Figure 14.3 Graphed Agent-Based Model (ABM) simulation results which collectively illustrate several aspects of experimental design: (a) plotted points of the same colour and k value differ due to stochastic effects alone; (b) two different parameters s and k are varied; and (c) two different agent rules, “CopyTheBest” and “CopyIfBetter”, are explored. A colour version of this figure can be found in the plates section.
Source: Reproduced with permission from Figure 4 in Crema and Lake (2015)
• Establishing what is possible. If the aim is to establish what could have happened in history under cer-
tain circumstances then it will be necessary to investigate what outcomes are possible given different
assumptions and starting points. As with the case of parameter uncertainty this requires multiple
simulation runs made with parameter values of interest. However, since the aim is not to establish
the likelihood of different outcomes there is no need to attach a probability to different parameter
values.
• Estimating unknown parameters. The aim here turns the conventional approach on its head by making
the parameter values the unknowns that are to be estimated by running simulations. The logic is to
vary the parameters and discover what values most reliably produce simulation results that match
the archaeological record. A good example of this approach is Crema, Edinborough, Kerig, and
Shennan’s (2014) use of simulation to investigate what kind of cultural transmission best explains
observed changes in European Neolithic arrowhead assemblages. Formal models of cultural trans-
mission usually have population size (of ‘teachers’) and innovation rate as important parameters, but
these values are rarely known with certainty and indeed are often quantities that the modeller would
like to infer. Crema et al. adopted an approximate Bayesian computation framework in which they
provided prior probability distributions for these parameters and then ran multiple simulations
which collectively sampled possible combinations of parameter values. By comparing the simulation
results to the observed changes in the archaeological data they were then able to provide posterior
probabilities for the parameters, in other words, to infer which values were more or less likely than
others given both initial knowledge and the results of the simulations.
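Crema et al. used a full approximate Bayesian computation framework; the core logic of the simplest variant, rejection ABC, can be sketched as follows. The simulator, prior and summary statistic here are toy stand-ins, and a normal prior could replace the uniform one where some values are considered more likely than others:

```python
import random

def rejection_abc(simulate, observed, prior_sampler, distance,
                  n_draws=10000, tolerance=1.0, rng=None):
    """Minimal rejection ABC: draw parameters from the prior, simulate,
    and keep only draws whose output lies within `tolerance` of the
    observed data.  The retained draws approximate the posterior."""
    rng = rng or random.Random()
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        if distance(simulate(theta, rng), observed) <= tolerance:
            accepted.append(theta)
    return accepted

# Toy example: infer an unknown rate from one summary statistic of 5.0.
posterior = rejection_abc(
    simulate=lambda mu, rng: rng.gauss(mu, 1.0),       # stand-in model
    observed=5.0,
    prior_sampler=lambda rng: rng.uniform(0.0, 10.0),  # uniform prior
    distance=lambda sim, obs: abs(sim - obs),
    rng=random.Random(42),
)
# The accepted values cluster around 5.0, the value most consistent
# with the 'observed' data given the model.
```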
The basic idea of pattern-oriented modelling (POM) is that it is often relatively easy to ‘tune’ a model to replicate a single dataset comprising just
one variable, but rather more difficult to replicate multiple datasets and/or multiple variables. Achieving
the latter suggests that the model is ‘structurally realistic’. Railsback and Grimm (2012) provide an excel-
lent introduction to POM, but a brief archaeologically oriented example serves to illustrate the concept.
Mithen (1993) built a computer simulation in which human hunting impacted on the population growth
of mammoths. Rather than simply attempt to replicate the decline in the overall mammoth population, he
explicitly modelled the age structure of the mammoth population. This not only provided an additional
point of contact with the archaeological record (one more readily available than overall population size)
but also better captured the real-world causal dynamics – that it might matter whether or not humans
hunted animals of reproductive age.
• Dissemination of the program code and input data. Ideally it should be possible for other researchers to
run the simulation, both to verify the published results and to explore other scenarios. Program
code and data can be disseminated as ‘supplementary material’ hosted alongside published journal
articles, placed on a Web-based hosting service such as GitHub, or perhaps better still, uploaded to
a collective repository such as Open ABM (www.openabm.org/). The ABM program code should
include inline comments to help others understand how it works and should be accompanied by
information about the computational environment required to run it.
• Documentation of the conceptual model. Researchers may be able to infer many aspects of the conceptual
model from the program code itself, but that presupposes that the program is actually an accurate
reflection of the original modeller’s intention and, in any case, it is helpful to have further informa-
tion about assumptions that have been made. The ODD (Overview, Design concepts, and Details)
protocol has been proposed as a standard for describing agent-based models and ODD-style docu-
mentation has been incorporated into the NetLogo integrated simulation environment. The full
specification can be found in Grimm et al. (2006, 2010) but here is the outline:
Overview The purpose of the model (which aspects of reality are included and why?). What are the entities and how are they characterised? What processes are included and when do they occur?
Design concepts For example, is the model intended to produce emergent phenomena? Does it
involve individual or population-level adaptation? Does it include stochastic elements? What is
the nature of any collectives?
Details How is the model initialized? What are the external inputs? A fuller mathematical and/or
verbal description of the model.
• Documentation of the experimental design. In order to reproduce and/or extend published results, other
researchers will also need to know the exact range of parameters ‘swept’ during multiple runs. Any
post-processing of the raw simulation output (for example, the aggregation or averaging of agent
state variables) should also be documented.
Case study
As noted above, the Long House Valley ABM (Dean et al., 2000; Axtell et al., 2002) is a well-known
archaeological model (Kohler et al., 2005) which illustrates many of the features of a modern spatial ABM
(Figure 14.4). There are several reasons for drawing attention to this model as a case study. One is that it
tackles the kind of research question (collapse of societies) that excites interest beyond academe and to that
extent, at least, is therefore a good advertisement for the use of spatial ABM in archaeology. Moreover, and
not unrelated, a version of the model (called “Artificial Anasazi”) is available as part of the standard release
of the popular and easy to install NetLogo ABM software (Stonedahl & Wilensky, 2010b). Consequently,
the interested reader can quite quickly get to the point of running the model, experimenting with it and
ultimately exploring and even modifying the code. Finally – and unusually – this model has a history
Figure 14.4 Comparison of Long House Valley simulation results with archaeological evidence. A colour version of this figure can be found in the plates section.
Source: Adapted with permission from Kohler et al. (2005)
(Swedlund, Sattenspiel, Warren, & Gumerman, 2015) to the extent that it has been re-implemented and
studied by researchers who were not part of the original modelling effort, and this includes an analysis of
what actually causes the model outcomes (Janssen, 2009; Stonedahl & Wilensky, 2010a). This history of
use is an instructive lesson in how to ‘do science’ with archaeological spatial ABM.
Research question
Long House Valley, in northeastern Arizona, was sparsely occupied by hunters and gatherers until the intro-
duction of maize at around 1800 BC initiated the gradual development of substantial permanent settlements
and the Puebloan Anasazi cultural tradition. The valley was abruptly abandoned around AD 1300 and the
population migrated elsewhere. A key question is what caused the abandonment and, in particular, to what
extent it can simply be explained by the onset of climatic deterioration at circa AD 1270.
Three features of Long House Valley make it particularly suitable for the application of ethnographic-
scale spatial ABM. One is that the valley is a topographically discrete entity which, given the focus on
agricultural subsistence, provides a natural ‘edge’ for the simulated world. The second feature is the avail-
ability of very rich and high resolution palaeoenvironmental data which make it possible to estimate the
maize growing potential of every hectare in the valley annually from AD 400–1450. Third, the valley has
been intensively surveyed, so there is relatively complete knowledge of the Puebloan settlement pattern,
much of it dated by dendrochronology. Additionally, it is claimed that ethnographic studies of historic
Pueblo groups can be used to parameterise aspects of the model, such as the nutritional requirements of
agents.
Model design
The two main components of the Long House Valley model are the landscape and agents. The landscape
is a 100 × 100m raster representation of Long House Valley in which each cell is allocated to one of seven
different zones. These differ in their agricultural yield (of maize) and are variably susceptible to changes
in the Palmer Drought Severity index (a measure of the impact of moisture and temperature on crop
growth). Additionally, the model includes a raster map of water sources. In later versions of the model,
variability in soil quality within zones is modelled stochastically by the simple expedient of adding a
random number drawn from a uniform distribution between zero and some upper bound representing
the spatial harvest variance.
Each agent represents a household of five persons. Agents farm one map cell and occupy a separate
unfarmed residential location which must be within 1km of their farmland. Agents have a fission age,
at which they spawn a new household, and an age of death, when they are removed from the model.
In the first version of the model these attributes were the same for all agents, but in later versions some
stochastic heterogeneity was introduced by randomly drawing these values from a uniform distribution
with specified lower and upper bounds. The goal of agents is to grow sufficient maize to meet their
annual requirement for survival. Agents who anticipate falling short search for a new cell to farm as per
the rules in Table 14.1 and, if successful, move there. Agents who exceed their fission age have a chance
of spawning a new household, which takes a fraction of the parent household’s stored maize.
The model is run from AD 800–1350 in annual time steps. At each time step the Palmer Drought
Severity Index is updated, which alters the yield of map cells. The map of water sources is also updated,
which is one of the criteria used by agents attempting to move to a new cell to farm. Agents also pursue
their goals (harvesting maize, possibly relocating and possibly fissioning) once per time step. The result of
iterating these processes is a simulated annual record of population size and settlement location.
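The annual cycle just described might be sketched as follows. This is an illustrative reconstruction, not the actual Artificial Anasazi code; all numeric values and the `landscape` interface are invented:

```python
import random

class Household:
    """Hypothetical sketch of a Long House Valley-style household agent."""
    REQUIREMENT = 800  # kg of maize needed per year (illustrative value)

    def __init__(self, farm_cell, rng):
        self.farm_cell = farm_cell
        self.store = 1000
        self.age = 0
        # Stochastic heterogeneity: life-history attributes drawn from
        # uniform distributions, as in later versions of the model.
        self.fission_age = rng.randint(12, 18)
        self.death_age = rng.randint(28, 38)

    def step(self, landscape, rng):
        """One annual timestep: harvest, maybe relocate, maybe fission."""
        self.age += 1
        # Harvest includes a stochastic term for within-zone variation.
        harvest = landscape.base_yield(self.farm_cell) + rng.uniform(0, 100)
        self.store += harvest - self.REQUIREMENT
        if self.store < self.REQUIREMENT:      # anticipated shortfall
            new_cell = landscape.best_free_cell_near(self.farm_cell)
            if new_cell is not None:
                self.farm_cell = new_cell
        offspring = None
        if self.age >= self.fission_age and rng.random() < 0.125:
            offspring = Household(self.farm_cell, rng)
            offspring.store = self.store / 3   # endow the new household
            self.store -= offspring.store
        died = self.age >= self.death_age
        return offspring, died
```

A driver loop would call `step` for every household each simulated year, add offspring to and remove dead households from the population, and log population size and settlement locations.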
Table 14.1 Rules for choosing new farming and settlement locations (from Axtell et al., 2002, Table 2).
Further details of the model can be found in several sources. The version of the model distributed as
part of the standard NetLogo model library includes an ODD-like description, which can also be viewed
at http://ccl.northwestern.edu/netlogo/models/ArtificialAnasazi. More detail, including tables of agent
attributes and rules in the original model are published in Axtell et al. (2002). Similar information is
provided by Janssen (2009), who additionally also describes certain submodels (for example, how exactly
the agricultural yield is calculated).
Experiments
The first version of the model has 17 parameters, and the model was initially run with values based on
ethnographic accounts of historic Pueblo groups, as per Table 14.2. It was found that with these “base
case” (Axtell et al., 2002) parameter values the model could reproduce qualitative features of the history
of demographic changes and settlement patterns in Long House Valley, but the actual population sizes
were up to six times too large (Axtell et al., 2002; Kohler et al., 2005). Subsequent adjustment of farming
yields to reflect characteristics of prehistoric maize coupled with the introduction of landscape and agent
heterogeneity, as mentioned above, resulted in the model closely matching the historic population sizes
(estimated from room counts).
The experimental design for the version of the model with greater stochasticity entailed calibrating
the model by varying the upper and lower bounds of the stochastic parameters to find the values which
produced the best fit between the simulated and historic population sizes (Axtell et al., 2002). This was
undertaken for both individual runs and for averages of 15 runs, the latter reflecting the fact that runs
with identical parameters can produce different results by chance alone.
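Averaging over runs that differ only in random seed can be sketched as follows; `run_model` here is a toy stand-in for executing the full ABM:

```python
import random
from statistics import mean

def run_model(params, seed, n_steps=550):
    """Toy stand-in for one simulation run, returning a population
    trajectory.  A real run would execute the full ABM with this seed;
    the 'growth' parameter is invented for illustration."""
    rng = random.Random(seed)
    pop, trajectory = 100.0, []
    for _ in range(n_steps):
        pop *= 1 + rng.gauss(params["growth"], 0.05)
        trajectory.append(pop)
    return trajectory

def averaged_runs(params, n_runs=15):
    """Mean trajectory over runs differing only in random seed, as in
    the calibration of the Long House Valley model."""
    runs = [run_model(params, seed) for seed in range(n_runs)]
    return [mean(step_values) for step_values in zip(*runs)]
```

Comparing both individual runs and the averaged trajectory against the target data guards against calibrating to a single, possibly atypical, realisation.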
Janssen (2009) subsequently conducted a further round of experiments on a version of Long House
Valley model re-implemented in NetLogo. He was able to replicate the results reported by Axtell et al.
(2002), although it is interesting to see (Janssen, 2009, Figure 3) that even the calibrated model can pro-
duce quite variable results, some of which do not so convincingly match the qualitative features of the
population history (Figure 14.5). Perhaps more importantly, Janssen (2009, Paragraph 4.1) also conducted
experiments designed specifically to answer the question “What leads to the good fit of the simulation
Table 14.2 Original ‘base’ parameter values for the Long House Valley model (from Axtell et al., 2002, Table 4).
Figure 14.5 Population curves produced by 100 runs of the calibrated Long House Valley Model, differing
only in random seed.
Source: Reproduced under a CC-BY-4.0 license from Janssen (2009)
with the aggregated population data?” He found that the fit between the simulated and historic popula-
tion is primarily a function of landscape carrying capacity rather than parameters determining the lon-
gevity of households or at what age they might fission.
Implications
The best fitting runs of the calibrated model produce annual population sizes that track the estimated
historic values uncannily well up until abandonment of Long House Valley. If Janssen’s analysis is correct,
this may be primarily a function of the quality of the carrying capacity estimates derived from painstaking
palaeoenvironmental research. On the other hand, even the best-fitting runs fail to predict the complete
depopulation of Long House Valley at circa AD 1300 and so all those who have analysed the model are
in agreement that it has convincingly demonstrated that environmental factors alone cannot account
for the abrupt abandonment of the valley. Indeed, Kohler et al. (2005) suggest that archaeologists should
instead look for sociopolitical or ideological drivers of this event. The role that ABM might play in this
next instalment of research is discussed by Janssen (2009).
Conclusion
Archaeological ABM are used for a variety of purposes and vary greatly in their complexity. Twenty –
perhaps even ten – years ago, ABM was almost always computationally ‘cutting edge’ in some way, and
this is still true of some more complex models, especially those requiring high performance computing
and/or generating virtual reality visualisations. On the other hand, many recent archaeological ABMs
have been implemented using well-established software and run on relatively mainstream hardware. This
does not mean that those archaeological ABMs are not computationally demanding, but that hardware
and software are now sufficient to permit greater focus on other issues such as experimental design. The
fact that the technological aspects of ABM have in many cases become less remarkable (literally so in recent publications) suggests that the technique has genuinely come of age as a useful part of the archaeological toolkit. As the technology of ABM becomes ever more accessible it is hoped that this chapter will
help users understand what makes an archaeological ABM scientifically productive.
References
Aldenderfer, M. S. (1981a). Computer simulation for archaeology: An introductory essay. In J. A. Sabloff (Ed.),
Simulations in archaeology (pp. 67–118). Albuquerque: University of New Mexico.
Aldenderfer, M. S. (1981b). Creating assemblages by computer simulation: The development and uses of ABSIM. In
J. A. Sabloff (Ed.), Simulations in archaeology (pp. 11–49). Albuquerque: University of New Mexico.
Aldenderfer, M. S. (1998). Quantitative methods in archaeology: A review of recent trends and developments. Journal
of Archaeological Research, 6, 91–120.
Allen, P., & McGlade, J. (1987). Evolutionary drive: The effect of microscopic diversity. Foundations of Physics, 17,
723–738.
Altaweel, M., Alessa, L., Kliskey, A., & Bone, C. (2010). A framework to structure agent-based modeling data for
social-ecological systems. Structure and Dynamics, 4(1), article 2.
Angourakis, A., Rondelli, B., Stride, S., Rubio-Campillo, X., Balbo, A. L., Torrano, A., . . . Gurt, J. M. (2014). Land use
patterns in Central Asia. Step 1: The musical chairs model. Journal of Archaeological Method and Theory, 21, 405–425.
Aubán, J. B., Barton, C. M., Gordó, S. P., & Bergin, S. M. (2015). Modeling initial neolithic dispersal: The first
agricultural groups in west Mediterranean. Ecological Modelling, 307, 22–31.
Axtell, R. L., Epstein, J. M., Dean, J. S., Gumerman, G. J., Swedlund, A. C., Harburger, J., . . . Parker, M. (2002).
Population growth and collapse in a multiagent model of the Kayenta Anasazi in Long House Valley. Proceedings
of the National Academy of Sciences of the United States of America, 99(Suppl 3), 7275–7279.
Barton, C., Riel-Salvatore, J., Anderies, J., & Popescu, G. (2011). Modeling human ecodynamics and biocultural
interactions in the late Pleistocene of western Eurasia. Human Ecology, 39, 1–21.
Barton, C., Ullah, I., & Mitasova, H. (2010). Computational modeling and Neolithic socioecological dynamics: A
case study from southwest Asia. American Antiquity, 75(2), 364–386.
Barton, C. M. (2013). Stories of the past or science of the future? Archaeology and computational social science. In
A. Bevan & M. Lake (Eds.), Computational Approaches to Archaeological Spaces (pp. 151–178). Walnut Creek, CA:
Left Coast Press.
15
Spatial networks
Tom Brughmans and Matthew A. Peeples
Introduction
Figure 15.1 Four different network data representations of the same hypothetical Mediterranean transport
network: (a) adjacency matrix with edge length (in km) in cells corresponding to a connection; (b) node-link
diagram where edge width represents length (in km) and line colour represents transport type (red lines = sea,
green = river, grey = road); (c) edge list; (d) geographical layout, again coloured by transport type. A colour
version of this figure can be found in the plates section.
Source: Background © Openstreetmap
Spatial network data allow us to directly explore the systematic spatial relationships among nodes,
edges and attributes that would otherwise be difficult to characterize. The abstract transport network
shown in Figure 15.1 provides an instructive example. The different roles played in the Roman transport
system by Cosa and Portus cannot be understood with reference to their spatial locations and proximity
to other towns alone; they also depend on the opportunities afforded by their connections to all other
towns across roads, rivers, and seas. From Portus all other towns can be reached directly in one step over
the transport network, whereas from Cosa two steps are needed to reach either Puteoli or Carthage.
Moreover, the maritime route between Cosa and Portus could come into use or become popular precisely
because the alternative route via Rome is slower. When such dependencies are of interest, spa-
tial network methods, often coupled with GIS analytical tools, can offer extremely valuable approaches.
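The step counts described above can be checked with a plain breadth-first search. The edge list below is a hypothetical reading of Figure 15.1, chosen only so that it is consistent with the description (Portus linked directly to every other town, Cosa linked only to Portus and Rome):

```python
from collections import deque

# Hypothetical edge list consistent with the description of Figure 15.1.
edges = [("Portus", "Cosa"), ("Portus", "Rome"), ("Portus", "Puteoli"),
         ("Portus", "Carthage"), ("Cosa", "Rome")]

def neighbours(edges):
    """Build an undirected adjacency structure from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def steps(adj, source):
    """Breadth-first search: number of edges crossed to reach each town."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

adj = neighbours(edges)
print(steps(adj, "Portus"))  # every other town reachable in one step
print(steps(adj, "Cosa"))    # Puteoli and Carthage take two steps
```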
Before we proceed with the archaeological application of spatial networks, we want to briefly
consider the interchangeable use of the words network and graph. The word graph is more commonly
used in the fields of mathematics, computer science and computational geometry. Indeed, graph the-
ory is a long-established subdiscipline of mathematics and one of the fundamentals of computer sci-
ence (Harary, 1969). In many disciplines where graph theory is applied to real-world phenomena the
term network is used, and this is the case for the two disciplines with the most active traditions of net-
work research: Social Network Analysis (SNA) and statistical physics. However, in practice the terms
graph and network are commonly used interchangeably and we will here consistently use the term
network.
Visibility networks
Another common topic in archaeological spatial network research is the study of visibility, usually represented
as lines-of-sight: the ability of an observer to see an object of interest within a natural or built-up
environment, or to be seen (see Brughmans & Brandes, 2017, for a recent overview). Visibility networks
are typically defined based on line-of-sight data, often derived through GIS analyses (see Gillings & Wheat-
ley, this volume). In line-of-sight networks the set of nodes represents the observation locations and the edges
represent lines-of-sight. A pair of nodes is connected by an edge if a line-of-sight starting at the eye level of
an observer at one observation point can reach the second observation point, i.e. if the line-of-sight is not
blocked by a natural or cultural feature. In some studies, this point-to-point model of visibility is expanded
to landscape scale assessments of viewsheds where the total cumulative area viewable from a given viewpoint
is defined and networks are created based on areas with overlapping viewsheds or when certain key features
are mutually viewable (see O’Sullivan & Turner, 2001; Brughmans & Brandes, 2017; Bernardini & Peeples,
2015). The method is most commonly used to study hypothesised visual signalling networks and communities
sharing visual landmarks, and to explore processes of site positioning and the possible expression of power
relationships through visual control (Bernardini & Peeples, 2015; Brughmans, Keay, & Earle, 2014, 2015,
Brughmans, de Waal, Hofman, and Brandes, 2017; Brughmans & Brandes, 2017; De Montis & Caschili,
2012; Earley-Spadoni, 2015; Fraser, 1980, 1983; Ruestes Bitrià, 2008; Shemming & Briggs, 2014; Swanson,
2003; Tilley, 1994, pp. 156–166). Analyses of visibility network data frequently involve assessments of the
relative importance of different nodes for sending or receiving information or resources across the network,
or evaluations of the likelihood that a given configuration reflects a concern for signalling, defence, or other
factors among the people who built those features.
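As a rough sketch of how a line-of-sight network might be abstracted from elevation data, the code below samples terrain height along the straight line between two observer points and flags the sightline as blocked where the terrain rises above it. A real analysis would use GIS viewshed tools and account for earth curvature and refraction; the grid representation and observer height here are illustrative assumptions:

```python
import itertools

def line_of_sight(dem, a, b, observer_height=1.6):
    """True if an observer at cell a can see an observer at cell b.
    `dem` maps (row, col) -> elevation; the terrain is sampled at unit
    steps along the line and compared against the sightline height."""
    (r0, c0), (r1, c1) = a, b
    n = max(abs(r1 - r0), abs(c1 - c0))
    h0 = dem[a] + observer_height
    h1 = dem[b] + observer_height
    for i in range(1, n):
        t = i / n
        # Nearest cell along the line, and sightline height at that point.
        cell = (round(r0 + t * (r1 - r0)), round(c0 + t * (c1 - c0)))
        sight = h0 + t * (h1 - h0)
        if dem.get(cell, float("-inf")) > sight:
            return False  # terrain blocks the line-of-sight
    return True

def visibility_network(dem, sites, observer_height=1.6):
    """Nodes are observation points; an edge means intervisibility."""
    return [(a, b) for a, b in itertools.combinations(sites, 2)
            if line_of_sight(dem, a, b, observer_height)]
```

On a flat surface every pair of sites is intervisible, while a ridge cell between two sites removes that edge from the network.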
Access analyses
A somewhat different use for network methods in spatial data draws on a body of work referred to as
space syntax (Hillier & Hanson, 1984; Hillier, 1996; for a detailed discussion see Thaler, this volume).
The access analysis approach in space syntax is particularly popular in archaeological research. It uses
network graphs and related visualizations to explore the nature of physical or sometimes visible access
within features, buildings, or larger landscapes. The basic idea behind the approach is that we can think of
discrete spaces being “reachable” from one another through tree-like networks that let us both examine
the overall structure of mutual reachability among spaces and also assess the relative depth (the number
of edges crossed) from one space to another. In this way individual spaces (however they are defined) are
characterized as nodes, and edges are drawn between pairs of nodes that are reachable (i.e. that share a
doorway or are mutually visible). A number of studies have employed space syntax graphs to argue that
tracking or comparing the cultural logics of spatial organization can provide insights into a range of issues
including social organization, public versus private spaces, the distribution of urban services, and social
stratification (see Branting, 2007; Brusasco, 2004; Cutting, 2003; Fairclough, 1992; Ferguson, 1996; Foster,
1989; Grahame, 1997; Wernke, 2012). Analyses of space syntax graphs are often limited to qualitative
assessments, due in part to concerns over incomplete data in archaeological contexts (see Cutting, 2003)
but archaeologists are also starting to take advantage of quantitative tools for assessing the topology of
access networks (e.g. Wernke, 2012; Wernke, Kohut, & Traslaviña, 2017).
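A minimal sketch of the depth calculation used in access analysis, assuming a hypothetical building plan: discrete spaces are nodes, doorways are edges, and the depth of each space from the exterior (the conventional root of the access graph) falls out of a breadth-first search:

```python
from collections import deque

# Hypothetical access graph: nodes are spaces, edges are doorways.
doorways = {
    "outside": ["courtyard"],
    "courtyard": ["outside", "hall", "storeroom"],
    "hall": ["courtyard", "inner room"],
    "storeroom": ["courtyard"],
    "inner room": ["hall"],
}

def depths(graph, root="outside"):
    """Depth of every space = number of doorways crossed from the root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        space = queue.popleft()
        for nxt in graph[space]:
            if nxt not in depth:
                depth[nxt] = depth[space] + 1
                queue.append(nxt)
    return depth

d = depths(doorways)
mean_depth = sum(v for k, v in d.items() if k != "outside") / (len(d) - 1)
print(d)           # the "inner room" lies at depth 3
print(mean_depth)  # higher mean depth suggests a more segregated layout
```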
edges. The methods used to abstract networks from archaeological material cultural data are quite diverse
but often involve the use of geochemically sourced materials or regions (e.g. Golitko, Meierhoff, Fein-
man, & Williams, 2012), or the shared presence or similarities in material cultural assemblages to define
edges among settlements or regions (e.g. Mills et al., 2013). Although the presence and/or weights of
edges in such material cultural networks are typically defined using aspatial data (such as artefact type
frequencies) the samples from which these data are drawn are often associated with spatial locations that
allow for a consideration of the propinquity of social and spatial relations. In many cases, geographic
proximity or other spatial information is used to generate a null model of geographic connections
expected under certain constraints which is then compared to the network based on material cultural
data. For example, Mills and colleagues (2013) created a two-mode network of obsidian distribution in
the late Prehispanic Southwest and compared the obsidian network to geographic expectations based on
the costs of travel across the landscape, to identify times and places where the material networks deviated
from the geographic expectation. Most material cultural networks explored using archaeological data
have a spatial component and such direct comparisons between material and geographic distance are
becoming increasingly common (e.g. Gjesfjeld, 2015; Gjesfjeld & Phillips, 2013; Hill et al., 2015).
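As an illustration of how such material-culture networks can be abstracted, the sketch below derives edges from a Brainerd-Robinson-style similarity of assemblage proportions; the sites, ware types, counts, and threshold are all hypothetical:

```python
import itertools

# Hypothetical artefact counts per site.
assemblages = {
    "Site A": {"red ware": 40, "grey ware": 55, "plain": 5},
    "Site B": {"red ware": 35, "grey ware": 60, "plain": 5},
    "Site C": {"red ware": 5,  "grey ware": 10, "plain": 85},
}

def proportions(counts):
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def similarity(a, b):
    """1 minus half the summed absolute difference in type proportions:
    1.0 = identical assemblages, 0.0 = no types shared (rescaled
    Brainerd-Robinson coefficient)."""
    types = set(a) | set(b)
    return 1 - 0.5 * sum(abs(a.get(t, 0) - b.get(t, 0)) for t in types)

def similarity_network(assemblages, threshold=0.75):
    """Weighted edges between all site pairs above the threshold."""
    props = {s: proportions(c) for s, c in assemblages.items()}
    return [(i, j, similarity(props[i], props[j]))
            for i, j in itertools.combinations(props, 2)
            if similarity(props[i], props[j]) >= threshold]

print(similarity_network(assemblages))  # only Site A-Site B pass the threshold
```

The resulting edge weights could then be compared against inter-site distances or travel costs to look for deviations from a purely geographic expectation, as in the obsidian example above.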
Method
In this section we will introduce some key concepts in spatial network research, commonly applied ana-
lytical techniques and a range of spatial network models.
Figure 15.2 A planar network representing transport routes plotted geographically (a) and topologically (b).
A non-planar social network representing social contacts between communities plotted geographically (c) and
topologically (d). Note the crossing edges in the non-planar network. A colour version of this figure can be
found in the plates section.
Source: Background © Openstreetmap
Network analysis measures are commonly divided into local measures that reveal structural properties
of nodes or small sets of nodes, and global measures that reveal structural properties of the network as a
whole. The most common procedure for creating spatial variants of all these measures is to consider the
physical distance of edges, or any other spatially derived attributes of edges such as transport time or effort
of moving between two places, as a repelling “weight” in the algorithm: the higher the physical distance
between two nodes, the lower the score of the measure.
Local measures include degree, paths, centralities, and a node clustering coefficient. A node’s degree
refers to the number of edges it has, and spatial degree refers to the number of edges weighted by their
summed distance. A path is a sequence of connected node pairs from one node to another in the network.
The shortest path from any one node i to any other node j is the minimum number of connected nodes
between i and j that need to be traversed in order to reach j from i.

Figure 15.3 Examples of three different node centrality measures: (a) nodes scaled by degree centrality,
(b) nodes scaled by betweenness centrality with path segment lengths shown, (c) nodes scaled by closeness
centrality with path segment lengths shown.

A spatial variant of the shortest path
includes the summed distance of all edges on the path as a weight. Centrality refers to a very large number
of network measures that each reflect a node’s importance in the network according to different structural
features, the most popular of which are degree, closeness and betweenness. A node’s closeness centrality
refers to the network or spatial distance from this node over the set of shortest paths to each other node.
A node’s betweenness centrality refers to the number of all shortest paths between all node pairs in the
network that this node is positioned on. A node's clustering coefficient is the proportion of edges that
actually exist out of all edges that could exist between its direct network neighbours, i.e. the density of
the node's direct neighbourhood (see O'Sullivan & Turner, 2001, for a spatial variant applied to total viewsheds).
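Using the Python networkx library, the distance-weighted variants of these local measures can be sketched as follows. The five-town network echoes the transport example above, but the edge lengths in km are illustrative values, not measurements:

```python
import networkx as nx

# Hypothetical five-town network with illustrative edge lengths (km).
G = nx.Graph()
G.add_weighted_edges_from([
    ("Portus", "Rome", 30), ("Portus", "Cosa", 130),
    ("Portus", "Puteoli", 210), ("Portus", "Carthage", 600),
    ("Cosa", "Rome", 140),
], weight="length")

# Degree: the number of edges per node (Portus has the most connections).
print(dict(G.degree()))

# Spatial shortest path: edge lengths act as a repelling weight, so the
# route Cosa-Portus-Puteoli (130 + 210 = 340 km) beats the longer
# alternative via Rome (140 + 30 + 210 = 380 km).
print(nx.shortest_path_length(G, "Cosa", "Puteoli", weight="length"))

# Distance-weighted closeness and betweenness centrality.
print(nx.closeness_centrality(G, distance="length"))
print(nx.betweenness_centrality(G, weight="length"))
```

With these weights, Portus scores highest on betweenness because nearly all shortest paths between other town pairs pass through it.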
Global measures include average degree, degree distribution, density, average shortest path length,
diameter, and network clustering coefficient. The network’s average degree is the average of all nodes’
degree scores. A network’s node degree scores are most commonly explored as a distribution (see the case
study in this chapter for examples). The density is the existing proportion of all edges that could exist
in a network. Spatial networks where the edges are spatially embedded such as transport systems tend
to have very low densities, whereas spatial networks where only the nodes are explicitly embedded such
as artefact similarity networks typically have much higher densities. The average shortest path length is
the average of all shortest path lengths between all node pairs in the network. The network diameter is
the longest shortest path between any pair of nodes in the network. The network clustering coefficient
is the average of all nodes’ clustering coefficient scores.
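The global measures follow directly from edge counts and shortest path lengths. A minimal stdlib Python sketch on an invented toy network (note that average shortest path length and diameter, as computed here, assume the network is connected):

```python
from collections import deque
from itertools import combinations

# Toy connected undirected network (invented), as an adjacency dict.
G = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

def bfs_lengths(G, source):
    """Hop distances from source via breadth-first search."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for w in G[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

n = len(G)
edges = sum(len(nbrs) for nbrs in G.values()) // 2       # each edge counted twice
average_degree = 2 * edges / n
density = edges / (n * (n - 1) / 2)                      # existing / possible edges
pair_lengths = [bfs_lengths(G, u)[v] for u, v in combinations(G, 2)]
average_path_length = sum(pair_lengths) / len(pair_lengths)
diameter = max(pair_lengths)                             # longest shortest path
```

Here the four nodes share four edges out of six possible, giving a density of 2/3 and a diameter of 2.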
Figure 15.4 Examples showing relative and Gabriel graph neighborhood definitions: (a) A is a relative neigh-
bor of B because there are no nodes in the shaded overlap between the circles around A and B, (b) A and B
are not relative neighbors because C falls within the shaded overlap. (c) A and B are Gabriel neighbors because
there are no nodes within the circle with a diameter AB, (d) A and B are not Gabriel neighbors because C falls
within the circle with a diameter AB.
when the same principle is applied to a circular (rather than almond-shaped) region between every pair
of nodes: if no other nodes lie within the circular region with diameter d(i,j) between Ni and Nj then
Ni and Nj are connected in the Gabriel graph (Figure 15.4(c–d)). The concept of relative proximity can
be controlled and varied in an interesting way using the concept of beta skeletons (Kirkpatrick & Radke,
1985). Rather than fixing the diameter of the circle as in the Gabriel graph, the diameter can be varied
using a parameter β. Varying the value of β leads to interesting alternative network structures that are
denser with lower values of β, sparser with higher values of β, and the beta skeleton equals the Gabriel
graph when β = 1 (i.e. when the diameter of the circles equals d(i,j)). These models create planar
networks and have been applied in archaeology to study site and artefact distributions as well as to represent
the theoretical flow of ceramics between settlements (Brughmans, 2010; Jiménez-Badillo, 2012).
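Both neighbourhood definitions from Figure 15.4 come down to simple distance tests. The following stdlib Python sketch uses three hypothetical site coordinates (a β-skeleton generalisation would vary the diameter of the test circle instead of fixing it at d(i,j)):

```python
from itertools import combinations
from math import dist

# Three hypothetical site coordinates: A and B with C between and above them.
points = {"A": (0, 0), "B": (4, 0), "C": (2, 3)}

def gabriel_edges(points):
    """Connect i and j if no other point lies inside the circle with diameter
    d(i,j); k lies inside exactly when d(i,k)^2 + d(j,k)^2 < d(i,j)^2."""
    edges = set()
    for i, j in combinations(points, 2):
        dij2 = dist(points[i], points[j]) ** 2
        if all(dist(points[i], points[k]) ** 2 + dist(points[j], points[k]) ** 2 >= dij2
               for k in points if k not in (i, j)):
            edges.add(frozenset((i, j)))
    return edges

def relative_neighbour_edges(points):
    """Connect i and j if no other point is closer to both of them than they
    are to each other (the almond-shaped region between them is empty)."""
    edges = set()
    for i, j in combinations(points, 2):
        dij = dist(points[i], points[j])
        if all(max(dist(points[i], points[k]), dist(points[j], points[k])) >= dij
               for k in points if k not in (i, j)):
            edges.add(frozenset((i, j)))
    return edges
```

In this configuration C blocks the A-B edge in the relative neighbourhood network but not in the Gabriel graph, illustrating that every relative neighbourhood edge is also a Gabriel edge but not vice versa.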
Delaunay triangulation
A triangulation network aims to create as many triangles as possible without allowing for any crossing
edges and therefore creates planar networks. The Delaunay triangulation specifically is derived from the
Voronoi diagram or Thiessen polygons: a pair of nodes are connected by an edge if and only if their
corresponding tiles in a Voronoi diagram (or Thiessen polygons) share a side. The model has seen wide-
spread application for representing archaeological theories, but mainly for the study of transport systems.
To name just a few, Fulminante (2012) used Delaunay triangulation as a theoretical model for a road and
river transport system between Iron Age towns in Central Italy (Latium Vetus), and Herzog (2013) used
it as a representation of least-cost path networks. Evans and Rivers (2017) apply Delaunay triangulation
for exploring the rise of Greek city-states.
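The defining property of the Delaunay triangulation (equivalent to deriving it from Voronoi adjacency) is that a triangle belongs to it exactly when its circumcircle contains no other point. A brute-force stdlib Python sketch for a small hypothetical point set follows; production work would use a dedicated O(n log n) triangulation routine instead.

```python
from itertools import combinations
from math import dist

# Four hypothetical sites in convex position (not all on one circle).
points = {"A": (0, 0), "B": (4, 0), "C": (4, 3), "D": (0, 2)}

def circumcircle(a, b, c):
    """Centre and squared radius of the circle through a, b, c (None if collinear)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        return None
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2

def delaunay_edges(points):
    """Brute-force Delaunay edges: a triangle is kept iff its circumcircle
    contains no other point (O(n^4), fine for small point sets)."""
    edges = set()
    for i, j, k in combinations(points, 3):
        cc = circumcircle(points[i], points[j], points[k])
        if cc is None:
            continue
        centre, r2 = cc
        if all(dist(points[p], centre) ** 2 >= r2
               for p in points if p not in (i, j, k)):
            edges |= {frozenset((i, j)), frozenset((j, k)), frozenset((i, k))}
    return edges
```

For the four sites above the triangulation keeps the diagonal B-D and drops A-C, yielding the two triangles ABD and BCD.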
Case study
We will illustrate some of the network measures and models introduced in this chapter through an
exploration of the structure of the Roman transport system. By applying a wide range of spatial network
models and methods we will illustrate how interesting insights can be gained by taking a topological as
well as spatial look at a past phenomenon. The following research questions will guide our exploration
of the transport system:
• In what regions is the transport system particularly dense and in what regions is it particularly sparse?
• How important is each urban settlement as an intermediary in the flow of information or goods
between all other settlements?
• How did the Roman transport system structure flows of supplies to the capital of Rome, and which
regions and supplying towns were better positioned in the system to supply Rome?
• Does the Roman transport system reveal a particular spatial structure: nearest-neighbour, relative-
neighbour or maximum distance?
An abstract representation of the Roman transport system will be used here: the Orbis geospatial network
model of the Roman world (Scheidel, 2015; Meeks, Scheidel, Weiland, & Arcenas, 2014). Orbis offers a
static and hypothetical representation of the Roman transport system with limited detail. Therefore, our
present analysis merely aims to explore our research questions within the context of the coarse-grained
structure of the Roman transport system in the second century AD as hypothesised by the Orbis team.
Spatial networks 283
Data
We decided to use the Orbis dataset because it is well-studied and well-known among Roman archaeol-
ogy scholars, it is open access and reusable for research purposes (Meeks et al., 2014), and it provides the
only functional network dataset covering the entire Roman Empire at its largest extent. However, a key
limitation of Orbis is that it is not as detailed as our current knowledge of Roman settlements and routes
allows, precisely because it aims to represent the broad Empire-wide structure of the Roman transport
system in a comparable way. Moreover, the selection of nodes and edges, as well as the distance assigned to
edges, reflect decisions by its creators and should be submitted to sensitivity analyses (which is not within
the scope of this chapter). Finally, Orbis represents a static picture of what the Roman transport system
might have looked like in the second century AD, and does not offer the ability to explore how this
system changed through time. The longitude and latitude of all nodes were cross-checked with the
Pleiades gazetteer of ancient placenames (Bagnall et al., 2018) and corrected where necessary. The resulting
network dataset includes a set of 678 nodes, 570 of which represent urban settlements and the remainder
cultural features such as crossroads or natural features such as capes. The node attributes include the
settlement name and latitude and longitude coordinates. These nodes are connected by a set of 2208 directed links
representing the ability to travel between a node pair in a particular direction. Edge attributes include the
type of transport link (road, river, sea) and the distance in kilometres.
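A dataset of this shape can be held in very simple data structures. The sketch below uses invented records in an Orbis-like form; the field names, coordinates and distances are illustrative only, not the actual Orbis schema or data.

```python
# Invented records in an Orbis-like shape (illustrative only).
nodes = {
    "Roma":    {"lat": 41.89, "lon": 12.49, "kind": "settlement"},
    "Ostia":   {"lat": 41.75, "lon": 12.29, "kind": "settlement"},
    "Puteoli": {"lat": 40.82, "lon": 14.12, "kind": "settlement"},
}
# Directed edges as (source, target, transport type, distance in km).
edges = [
    ("Roma", "Ostia", "road", 25.0),
    ("Ostia", "Roma", "road", 25.0),
    ("Ostia", "Puteoli", "sea", 190.0),
]

# Adjacency view: for each node, the reachable successors with edge attributes.
adjacency = {name: [] for name in nodes}
for src, dst, kind, km in edges:
    adjacency[src].append((dst, kind, km))
```

Keeping edges directed, as Orbis does, allows asymmetric travel (a sea route viable in one direction only, for instance) to be represented by including only one of the two possible directed links.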
Figure 15.5 Network representation of the Orbis network: geographical layout (a, c) and topological layout
(b, d). Node size and colour represent betweenness centrality weighted by physical distance in (a) and (b), and
they represent unweighted betweenness centrality in (c) and (d): the bigger and darker blue the node, the more
important it is as an intermediary for the flow of resources in the network. By comparing (a, b) with (c, d),
note the strong differences in which settlement is considered a central one depending on whether physical
distance is taken into account (a, b) or not (c, d). Edge colours represent edge type: red = sea, green = river,
grey = road. A colour version of this figure can be found in the plates section.
Source: Background © Openstreetmap
However, this unweighted betweenness centrality measure completely ignores physical distance and
considers the traversal of each edge equally: all that is considered is the number of hops over the network
to get from one node to the other. To make this network analysis more representative of the physical
reality of the system we can weigh the edges according to their physical distance, where a shortest path is
now defined as the path between a pair of nodes with the lowest summed distance. Results of the distance
Table 15.1 Top 20 highest ranking towns according to the topological betweenness centrality measure and the
distance weighted betweenness centrality measure. Towns highly ranked according to both measures are highlighted.
Rank  Topological betweenness  Distance-weighted betweenness
1     Messana                  Puteoli
2     Alexandria               Delos
3     Rhodos*                  Hispalis
4     Gades                    Roma
5     Apollonia-Sozousa*       Palantia
6     Olisipo                  Pisae
7     Sallentinum Pr.          Ascalon
8     Flavium Brigantium       Aquileia*
9     Acroceraunia Pr.         Rhodos*
10    Lilybaeum                Isca
11    Civitas Namnetum         Apollonia-Sozousa*
12    Portus Blendium          Lydda
13    Paphos                   Iuliobona
14    Ostia/Portus             Placentia
15    Carthago                 Constantinopolis*
16    Corcyra                  Histria
17    Aquileia*                Ephesus
18    Caralis                  Mothis
19    Sigeion                  Patara
20    Constantinopolis*        Lancia
* Towns appearing in both top-20 lists.
weighted betweenness centrality measure are shown in Table 15.1 and Figure 15.5(a, b). Note how different
the top scoring towns are (Table 15.1): only four towns occur in both measures' top 20 lists. The high
scoring towns are still mostly ports but are now more equally spread throughout the system, often with
one or a few high scoring towns per province (Figure 15.5a). These high scoring towns can be inter-
preted as the most important intermediaries for the flow of goods and information through this abstract
representation of the Roman transport system if we assume that the shortest possible path between towns
was always preferred. The same method can of course be applied to represent other assumptions such as
the shortest path in terms of time or financial cost.
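The difference between the two definitions of "shortest" can be made concrete with a toy example: breadth-first search finds the path with the fewest hops, Dijkstra's algorithm finds the path with the lowest summed distance, and the two can disagree. The network and distances below are invented for illustration (stdlib Python):

```python
import heapq
from collections import deque

# Toy weighted undirected network (distances invented): the two-hop route
# A-B-C sums to 200, while the three-hop route A-D-E-C sums to only 30.
W = {
    "A": {"B": 100, "D": 10},
    "B": {"A": 100, "C": 100},
    "C": {"B": 100, "E": 10},
    "D": {"A": 10, "E": 10},
    "E": {"D": 10, "C": 10},
}

def hops(W, source, target):
    """Unweighted shortest path length (edge count), via breadth-first search."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        if u == target:
            return dist[u]
        for v in W[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return None

def weighted_distance(W, source, target):
    """Distance-weighted shortest path length, via Dijkstra's algorithm."""
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, float("inf")):
            continue
        for v, w in W[u].items():
            if d + w < best.get(v, float("inf")):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return None
```

Here hops(W, "A", "C") is 2 (via B) while weighted_distance(W, "A", "C") is 30 (via D and E): the topologically shortest and physically shortest routes differ, which is exactly why the two betweenness rankings in Table 15.1 disagree so strongly.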
Figure 15.6 Geographical network representation of the Orbis network: geographical layout (a) and topologi-
cal layout (b). Node size and colour represent increasing physical distance over the network away from Rome:
the larger and darker the node, the further away this settlement is from Rome following the routes of the
transport system. Note the fall-off of the results with distance away from Rome structured by the transport
routes rather than as-the-crow-flies distance. Edge colours represent edge type: red = sea, green = river, grey =
road. A colour version of this figure can be found in the plates section.
Source: Background © Openstreetmap
from other towns. These differences can be identified using spatial network methods, by calculating the
shortest paths from all towns to Rome according to the sum of their physical distance.
The results of this analysis (Figure 15.6) reveal, of course, a fall-off with distance away from Rome.
But note that this does not merely represent a fall-off of towns’ scores with as-the-crow-flies distance
from Rome, as could be easily calculated in GIS, but rather with their distance to Rome over the short-
est path of the network. It offers a representation of physical distance morphed and structured by the
Roman transport system. We can observe differences between the outlying regions, like Britain being
closer than much of Syria and Egypt. But a more interesting result is the proximity of areas that became
the earliest overseas provinces: the proximity of Tunisian towns around Carthage, Sardinia, as well as
the relatively short distances to towns in Southern France and Western Spain as compared to much of
Greece, for example. These results also offer an appropriate visualisation of what we know about the
well-documented large-scale and possibly partly state-organised supplies of foodstuffs to Rome from
Tunisia especially from the second century AD onwards, and it highlights the huge organisational efforts
that must have gone into the long distance and equally well-documented transport of foodstuffs from
Southern Spain and, in particular, Egypt.
Network models
The network models discussed earlier in this chapter can be applied to the Orbis settlement distribution
pattern to answer our fourth research question. What spatial structuring does the settlement distribution
included in Orbis reveal? To what extent does the Orbis network align with or deviate from this struc-
turing? Does the Roman transport system reveal a nearest-neighbour, relative-neighbour or maximum
distance structure? We will use global network measures to compare how similar the structures of the
simulated network models are to that of the Orbis network. The models presented in this section were
implemented in NetLogo, a very accessible programming language with an intuitive user-interface and
comprehensive network science and GIS libraries (Wilensky, 1999).
K-nearest-neighbour networks
This model is very sensitive to the proximity of sets of nodes, and reveals clusters of densely settled areas in
the Orbis set of towns (Figure 15.7; Table 15.2). The nearest-neighbour networks with K equals 1 and 2
are very disconnected, although for K equals 2 the global network measures are very similar to those of the
Orbis network, though more clustered (Table 15.2). The network becomes connected with 4-nearest-neighbours
and the 10-nearest-neighbours network emphasises the clusters in areas where the settlement pat-
tern is densest, but both these networks are much denser and more clustered than the Orbis network
(Table 15.2). The degree distributions for these K-nearest-neighbour networks show very little variance.
The lower limit always equals K, and just a few towns have a higher degree than most other towns, a dif-
ference that increases as K increases. In contrast, the degree distribution of the real Orbis network is very
skewed (Figure 15.5): the large majority of towns are connected to fewer than eight other towns, whereas
very few towns have a much higher degree. The towns with the highest degree are important port towns
or large population centres: Delos, Rhodos, Carthago, Ostia/Portus, Lilybaeum, Paphos, Messana, Rome
(the first two in this list have the highest degree, but this is partly caused by the very high density of
nodes in the Aegean area). The K-nearest-neighbour networks clearly do not capture this feature of the
Orbis network. The maritime routes in the Orbis dataset, which cross long distances through the Atlantic
Ocean and the Mediterranean and Black Sea, are also not recreated by this model. However, aspects of the
structure of the terrestrial roads and the dense connections between Aegean islands, as well as the coastal
and riverine connections, are better captured by this model where K equals 4.
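A K-nearest-neighbour network of this kind can be built directly from site coordinates. A stdlib Python sketch on invented coordinates follows; note that because the result is treated as undirected, a node's degree can exceed K when other sites select it, which is why K is the lower limit of the degree distribution.

```python
from math import dist

# Invented site coordinates: a cluster of three sites and a pair of outliers.
sites = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (5, 5), "E": (6, 5)}

def knn_edges(sites, k):
    """Link each site to its k nearest neighbours; treating the result as
    undirected means a site's degree can exceed k when others select it."""
    edges = set()
    for name, xy in sites.items():
        others = sorted((o for o in sites if o != name),
                        key=lambda o: dist(xy, sites[o]))
        for neighbour in others[:k]:
            edges.add(frozenset((name, neighbour)))
    return edges
```

With k = 1 the two clusters in this toy pattern remain disconnected from one another, just as the text notes for the Orbis towns at low K; raising k to 2 already joins them here.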
Table 15.2 Results of global network measures for all tested models and the undirected Orbis network (in bold).
Highlighted results show some similarity in global network measures with the Orbis network.
Eurasian continents and shows some similarities in the density and structure of the terrestrial routes.
However, the degree distribution is normally distributed and there is very little variance in nodes’
degrees. The Gabriel graph similarly shows little variance in its normally distributed degree distribu-
tion, but its triangular structure does succeed in recreating some of the long distance maritime con-
nections. Moreover, it is the only model used here that has an average clustering coefficient close to
that of the Orbis network.
Figure 15.9 Results of the Orbis set of nodes; (a) relative neighbourhood network and (b) Gabriel graph. Node
size represents degree. Insets show degree distributions. Note how the networks, as compared to the results
shown in Figures 15.7 and 15.8, better succeed in representing the shape of the Orbis transport network and
the long-distance maritime routes crossing the Mediterranean.
Conclusion
In this chapter we have introduced spatial networks as consisting of sets of spatially embedded nodes and
edges whose topology is partly restricted by physical space. A strong research tradition in the archaeo-
logical application of spatial networks has focused on a few key themes: transport networks, visibility
networks, space syntax and material culture networks. The most commonly applied local and global
network measures have been introduced, along with a range of fundamental spatial network models.
Many of the methods and models introduced in this chapter were illustrated through a case study
which aimed at exploring the structure of the Roman transport system, as hypothesised by the Orbis
network. Geographical and topological visualisations of the Orbis network revealed complemen-
tary insights into regional differences in transport network density. The use of a distance weighted
betweenness centrality measure identified settlements that are particularly crucial as intermediaries
for the flow of information, people and goods in this system. Calculating the summed distance of
the shortest paths from all settlements to Rome highlighted regional differences in the proximity to
Rome following the transport network, which has implications for their ability to supply foodstuffs
to the capital. Finally, spatial network modelling results suggest that theories about the structure of
the Roman transport system should include nearest-neighbourhood, relative-neighbourhood and
maximum-distance effects, and a preferential attachment effect is hypothesised to be a further key
explanatory factor.
Spatial network applications have a long history in archaeological research, but they have only recently
received more attention in the research traditions at the core of network science: social network analysis
and physics. We believe the strong archaeological research tradition in spatial networks reveals an impor-
tant opportunity for archaeologists to contribute to the future development of spatial network methods
and models and to their multi-disciplinary application. More intense interaction with the broader net-
work science community will in turn lead to a richer toolbox of spatial network methods and models for
archaeologists to let loose on their research topics.
References
Bagnall, R., Talbert, R., Elliot, T., Holman, L., Becker, J., Bond, S., . . . Turner, B. (2018). Pleiades: A Gazetteer of past
places. Retrieved from http://pleiades.stoa.org/
Barthelemy, M. (2011). Spatial networks. Physics Reports, 499(1–3), 1–101. https://doi.org/10.1016/j.physrep.
2010.11.002
Bernardini, W., & Peeples, M. A. (2015). Sight communities: The social significance of shared visual landmarks.
American Antiquity, 80(2), 215–235. https://doi.org/10.7183/0002-7316.80.2.215
Bevan, A., & Wilson, A. (2013). Models of settlement hierarchy based on partial evidence. Journal of Archaeological
Science, 40(5), 2415–2427.
Brandes, U., Robins, G., McCranie, A., & Wasserman, S. (2013). What is network science? Network Science, 1(1), 1–15.
https://doi.org/10.1017/nws.2013.2
Branting, S. (2007). Using an urban street network and a PGIS-T approach to analyze ancient movement. In J. T.
Clark & E. M. Hagenmeister (Eds.), Digital discovery: Exploring new frontiers in human heritage: Proceedings of the 34th
CAA conference, Fargo, 2006 (pp. 87–96). Budapest: Archaeolingua.
Broodbank, C. (2000). An island archaeology of the early Cyclades. Cambridge: Cambridge University Press.
Brughmans, T. (2010). Connecting the dots: Towards archaeological network analysis. Oxford Journal of Archaeology,
29(3), 277–303.
Brughmans, T., & Brandes, U. (2017). Visibility network patterns and methods for studying visual relational Phe-
nomena in archaeology. Frontiers in Digital Humanities: Digital Archaeology, 4(17). https://doi.org/10.3389/
fdigh.2017.00017
Brughmans, T., de Waal, M. S., Hofman, C. L., & Brandes, U. (2017). Exploring transformations in Caribbean
indigenous social networks through visibility studies: The case of late pre-colonial landscapes in East-Guadeloupe
(French West Indies). Journal of Archaeological Method and Theory, 25(2), 475–519. https://doi.org/10.1007/
s10816-017-9344-0
Brughmans, T., Keay, S., & Earl, G. (2014). Introducing exponential random graph models for visibility networks.
Journal of Archaeological Science, 49, 442–454. https://doi.org/10.1016/j.jas.2014.05.027
Brughmans, T., Keay, S., & Earl, G. (2015). Understanding inter-settlement visibility in Iron age and Roman Southern
Spain with exponential random graph models for visibility networks. Journal of Archaeological Method and Theory,
22, 58–143. https://doi.org/10.1007/s10816-014-9231-x
Brughmans, T., & Peeples, M. (2017). Trends in archaeological network research: A bibliometric analysis. Journal of
Historical Network Research, 1(1), 1–24. https://doi.org/10.25517/jhnr.v1i1.10
Brusasco, P. (2004). Theory and practice in the study of Mesopotamian domestic space. Antiquity, 78, 142–157.
Chorley, R. J., & Haggett, P. (1967). Models in geography. London: Methuen.
Collar, A. (2013). Re-thinking Jewish ethnicity through social network analysis. In C. Knappett (Ed.), Network
analysis in archaeology: New approaches to regional interaction (pp. 223–246). Oxford: Oxford University Press.
Collar, A., Coward, F., Brughmans, T., & Mills, B. J. (2015). Networks in archaeology: Phenomena, abstraction,
representation. Journal of Archaeological Method and Theory, 22, 1–32. https://doi.org/10.1007/s10816-014-9235-6
Cutting, M. (2003). The use of spatial analysis to study prehistoric settlement architecture. Oxford Journal of Archaeol-
ogy, 22, 1–21.
De Montis, A., & Caschili, S. (2012). Nuraghes and landscape planning: Coupling viewshed with complex network
analysis. Landscape and Urban Planning, 105(3), 315–324. https://doi.org/10.1016/j.landurbplan.2012.01.005
Doran, J. E., & Hodson, F. R. (1975). Mathematics and computers in archaeology. Cambridge, MA: Harvard University
Press.
Earley-Spadoni, T. (2015). Landscapes of warfare: Intervisibility analysis of early Iron and Urartian fire beacon sta-
tions (Armenia). Journal of Archaeological Science: Reports, 3, 22–30. https://doi.org/10.1016/j.jasrep.2015.05.008
Evans, T. (2016). Which network model should I use? Towards a quantitative comparison of spatial network models
in archaeology. In T. Brughmans, A. Collar, & F. Coward (Eds.), The connected past: Challenges to network studies in
archaeology and history (pp. 149–173). Oxford: Oxford University Press.
Evans, T. S., & Rivers, R. J. (2017). Was Thebes necessary? Contingency in spatial modelling. Frontiers in Digital
Humanities: Digital Archaeology, 4(8), 1–21. https://doi.org/10.3389/fdigh.2017.00008
Fairclough, G. (1992). Meaningful constructions: Spatial and functional analysis of medieval buildings. Antiquity, 66,
348–366.
Ferguson, T. J. (1996). Historic Zuni architecture and society: An archaeological application of space syntax. Papers of the
University of Arizona No. 60. Tucson: University of Arizona Press.
Foster, S. M. (1989). Analysis of spatial patterns in buildings (access analysis) as an insight into social structure:
Examples from the Scottish Atlantic Iron age. Antiquity, 63, 40–50.
Fraser, D. (1980). The cutpoint index: A simple measure of point connectivity. Area, 12(4), 301–304.
Fraser, D. (1983). Land and society in neolithic Orkney. BAR British Series, 117. Oxford: Archaeopress.
Fulminante, F. (2012). Social network analysis and the emergence of central places: A case study from Central Italy
(Latium Vetus). BABESCH, 87, 27–53.
Gjesfjeld, E. (2015). Network analysis of archaeological data from hunter-gatherers: Methodological problems and
potential solutions. Journal of Archaeological Method and Theory, 22(1), 182–205.
Gjesfjeld, E., & Phillips, S. C. (2013). Evaluating adaptive network strategies with geochemical sourcing data: A case
study from the Kuril Islands. In C. Knappett (Ed.), Network analysis in archaeology: New approaches to regional interac-
tion (pp. 281–306). Oxford: Oxford University Press.
Golitko, M., Meierhoff, J., Feinman, G. M., & Williams, P. R. (2012). Complexities of collapse: The evidence of Maya
Obsidian as revealed by social network graphical analysis. Antiquity, 86, 507–523.
Grahame, M. (1997). Public and private in the Roman house: The spatial order of the Casa Del Fauno. In R. Lau-
rence & A. Wallace-Hadrill (Eds.), Domestic space in the Roman world: Pompeii and beyond. Journal of Roman
Archaeology, Supplementary Series, 22 (pp. 137–164). Portsmouth, Rhode Island.
Hage, P., & Harary, F. (1991). Exchange in Oceania: A graph theoretic analysis. Oxford: Clarendon Press.
Hage, P., & Harary, F. (1996). Island networks: Communication, kinship and classification structures in Oceania.
Cambridge: Cambridge University Press.
Harary, F. (1969). Graph theory. Reading, MA and London: Addison-Wesley.
Herzog, I. (2013). Least-cost networks. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos,
I. Romanowska, & D. Wheatley (Eds.), Archaeology in the digital era: Papers from the 40th annual conference of computer
applications and quantitative methods in archaeology (CAA), Southampton, 26–29 March 2012 (pp. 237–248). Amster-
dam: Amsterdam University Press.
Hill, J. B., Peeples, M. A., Huntley, D. L., & Carmack, H. J. (2015). Spatializing social network analysis in the late pre-
contact U.S. Southwest. Advances in Archaeological Practice, 3(1), 63–77. https://doi.org/10.7183/2326-3768.3.1.63
Hillier, B. (1996). Space is the machine. Cambridge: Cambridge University Press.
Hillier, B., & Hanson, J. (1984). The social logic of space. Cambridge: Cambridge University Press.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge: Cambridge University Press.
Irwin, G. J. (1978). Pots and entrepots: A study of settlement, trade and the development of economic specialization
in Papuan prehistory. World Archaeology, 9(3), 299–319.
Isaksen, L. (2007). Network analysis of transport vectors in Roman baetica. In J. T. Clark & E. M. Hagenmeister
(Eds.), Digital discovery: Exploring new frontiers in human heritage: Proceedings of the 34th CAA conference, Fargo, 2006
(pp. 76–87). Budapest: Archaeolingua.
Isaksen, L. (2008). The application of network analysis to ancient transport geography: A case study of Roman Baetica.
Digital Medievalist, 4. Retrieved from www.digitalmedievalist.org/journal/4/isakse
Jenkins, D. (2001). A network analysis of Inka roads, administrative centers, and storage facilities. Ethnohistory, 48(4),
655–687.
Jiménez-Badillo, D. (2012). Relative neighbourhood networks for archaeological analysis. In M. Zhou, I. Romanowska,
Z. Wu, P. Xu, & P. Verhagen (Eds.), Revive the past: Proceedings of computer applications and quantitative techniques in
archaeology conference 2011, Beijing (pp. 370–380). Amsterdam: Amsterdam University Press. Retrieved from http://
proceedings.caaconference.org/files/2011/42_Jimenez-Badillo_CAA2011.pdf
Kirkpatrick, D. G., & Radke, J. D. (1985). A framework for computational morphology. Machine Intelligence and
Pattern Recognition, 2, 217–248.
Knappett, C., Evans, T., & Rivers, R. (2008). Modelling maritime interaction in the Aegean Bronze age. Antiquity,
82(318), 1009–1024.
Mackie, Q. (2001). Settlement archaeology in a Fjordland archipelago: Network analysis, social practice and the built
environment of Western Vancouver Island, British Columbia, Canada since 2,000 BP. BAR International Series
926. Oxford: Archaeopress.
Meeks, E., Scheidel, W., Weiland, J., & Arcenas, S. (2014). ORBIS (v2) network edge and node tables. Stanford Digital
Repository. Retrieved from Http://Purl.stanford.edu/mn425tz9757
Menze, B. H., & Ur, J. A. (2012). Mapping patterns of long-term settlement in Northern Mesopotamia at a large
scale. Proceedings of the National Academy of Sciences of the United States of America, 109(14), E778–E787. https://
doi.org/10.1073/pnas.1115472109
Mills, B. J., Clark, J. J., Peeples, M. A., Haas, W. R., Roberts, J. M., Hill, J. B., . . . Shackley, M. S. (2013, March).
Transformation of social networks in the late pre-hispanic US Southwest. Proceedings of the National Academy of
Sciences of the United States of America, 1–6. https://doi.org/10.1073/pnas.1219966110
O’Sullivan, D., & Turner, A. (2001). Visibility graphs and landscape visibility analysis. International Journal of Geo-
graphical Information Science, 15(3), 221–237. https://doi.org/10.1080/13658810010011393
Pailes, M. (2014). Social network analysis of early classic Hohokam corporate group inequality. American Antiquity,
79(3), 465–486. https://doi.org/10.7183/0002-7316.79.3.465
Peregrine, P. (1991). A graph-theoretic approach to the evolution of Cahokia. American Antiquity, 56(1), 66–75.
Rihll, T. E., & Wilson, A. G. (1987). Spatial interaction and structural models in historical analysis: Some possibilities
and an example. Histoire & Mesure, 2, 5–32.
Ruestes Bitrià, C. (2008). A multi-technique GIS visibility analysis for studying visual control of an Iron age land-
scape. Internet Archaeology, 23. http://intarch.ac.uk/journal/issue23/4/index.html
Scheidel, W. (2015). Orbis: The Stanford geospatial network model of the Roman world. Princeton/Stanford Working
Papers in Classics. https://doi.org/10.1093/OBO/97801953896610075.2
Shemming, J., & Briggs, K. (2014). Anglo-Saxon communication networks. Retrieved from http://keithbriggs.info/
AS_networks.html
Stjernquist, B. (1966). Models of commercial diffusion in prehistoric times. Scripta Minora, 2, 1–43.
Swanson, S. (2003). Documenting prehistoric communication networks: A case study in the Paquimé polity. American
Antiquity, 68(4), 753–767.
Terrell, J. E. (1977). Human biogeography in the Solomon Islands. Fieldiana Anthropology, 68(1), 1–47.
Tilley, C. (1994). A phenomenology of landscape: Places, paths and monuments. Oxford: Berg.
Toussaint, G. T. (1980). The relative neighbourhood graph of a finite planar set. Pattern Recognition, 12(4), 261–268.
https://doi.org/10.1016/0031-3203(80)90066-7
Verhagen, P., Brughmans, T., Nuninger, L., & Bertoncello, F. (2013). The long and winding road: Combining least
cost paths and network analysis techniques for settlement location analysis and predictive modelling. In G. Earl,
T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos, I. Romanowska, & D. Wheatley (Eds.), Archaeology
in the digital era: Papers from the 40th annual conference of computer applications and quantitative methods in archaeology
(CAA), Southampton, 26–29 March 2012 (pp. 357–366). Amsterdam: Amsterdam University Press.
Wernke, S. A. (2012). Spatial network analysis of a terminal prehispanic and early colonial settlement in Highland
Peru. Journal of Archaeological Science, 39(4), 1111–1122. https://doi.org/10.1016/j.jas.2011.12.014
Wernke, S. A., Kohut, L. E., & Traslaviña, A. (2017). A GIS of affordances: Movement and visibility at a planned
colonial town in Highland Peru. Journal of Archaeological Science, 84, 22–39. https://doi.org/10.1016/j.jas.2017.06.004
White, D. A., & Barber, S. B. (2012). Geospatial modeling of pedestrian transportation networks: A case study from
Precolumbian Oaxaca, Mexico. Journal of Archaeological Science, 39(8), 2684–2696. https://doi.org/10.1016/j.
jas.2012.04.017
Wilensky, U. (1999). NetLogo. Retrieved from Http://Ccl.northwestern.edu/Netlogo/. Center for Connected
Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.
16
Space syntax methodology
Ulrich Thaler
Introduction
You may not have heard of Harry Beck (Garfield, 2012, pp. 291–296), but he has probably made your life
easier at one time or another (though nobody asked him to). Beck was working as a technical draughts-
man for the London Underground Signals Office when, in 1931, he submitted a radical new draft, pre-
pared in his spare time, for the London Tube map, which had been inspired by electrical diagrams and
the realization that, once on an Underground train, passengers did not care overly much about physical
distance. You will recognize Beck’s diagram (Figure 16.1(b)) immediately, because this is, in principle,
the Tube map still used today, a design icon printed on tourist mugs and emulated by local transport
authorities worldwide. Although schematic single line diagrams had appeared as early as 1909, the pre-
Beck map of the entire network was still drawn and often also shown superimposed over a road map
of the city (Figure 16.1(a)). What Beck had done, in effect, was to transform a geographical map into
a topological one, forgoing metric properties for relative relationships and thus producing a simplified
model of the system which still retains the properties essential to its users. This, in a nutshell, is what space
syntax methodology (Al Sayed, Turner, Hillier, Iida, & Penn, 2014; Hanson, 1998; Hillier, 1996; Hillier &
Hanson, 1984) aims to do as a statistical topological or network analysis of built contexts at the settlement
or building level, as explained below.
Before looking into details of methodology, however, we should take a brief look at the history of
space syntax and, perhaps, the distinction between space syntax theory and space syntax methodology.
Space syntax was formulated as an approach to the configurational analysis of built contexts at University
College London’s Bartlett School of Architecture from the mid to late 1970s on by a group of researchers
around Bill Hillier and, later, Julienne Hanson. As results from space syntax analyses, of integration in
particular, and real-world observations of movement and traffic flows showed good correlations and due
to its resulting ability to model the global effects and repercussions of local changes within a given spatial
configuration (cf. Hillier, Penn, Hanson, Grajewski, & Xu, 1993), space syntax quickly gained traction as a
predictive tool for architectural and urban planning, fostering an extensive research community. Although
it never became part of archaeology’s methodological mainstream (if such a thing exists), most likely due
to the high quality and density of data it demands, space syntax was also picked up by some archaeologists
(e.g. Gilchrist, 1988; Foster, 1989) with surprising alacrity given the discipline’s usual tendency to adopt
more seasoned approaches from other fields. Indeed, even an early emphatic criticism (Leach, 1978) of space syntax in an archaeological volume (though formulated by a social anthropologist) predated by six years the publication, in 1984, of the volume familiarly known as the ‘Old Testament’ in the space syntax community, Hillier and Hanson’s “The social logic of space” (Hillier & Hanson, 1984).
Figure 16.1 London Tube maps: (a) the 1908 version superimposed on a city plan. (b) the 1933 version featuring H. Beck’s topological redesign.
Source: Figures 16.1(a–b): © TfL from the London Transport Museum collection (ref. nos. 2002/264, 1999/321)
It should also be noted that the analytical techniques we are concerned with here, then labelled
“alpha analysis” and “gamma analysis” for settlement and building level analyses respectively (Hillier &
Hanson, 1984, pp. 90–123, 147–155), were only introduced as one part of a wider intellectual agenda
in “The social logic of space”, a book that drew widely on ethnographic examples and sought to arrive
at fairly broad generalizations on its title matter. Another strong focus besides analytical techniques
was on considerations of how seemingly complex settlements as ‘global’ structures could arise from
specific, but potentially simple ‘local’ rules. Indeed, such “generative syntaxes” appear to have been the
first topic of discussion in the development of space syntax (Hillier, Leaman, Stansall, & Bedford, 1976),
which thus started out from a perspective that can be broadly termed as structuralist. Nonetheless, at
least some degree of appreciation of the recursive relationship of social (acts) and built structures can be
found in “The social logic”, and by the time Hanson and Hillier published their next major volumes on
urban/settlement and architectural/building studies respectively, “Space is the machine” (Hillier, 1996)
and “Decoding homes and houses” (Hanson, 1998), their stance can certainly be characterized as post-
structuralist. This shift, perhaps, might serve as an indication that the analytical techniques developed
under the label of ‘space syntax’ are actually compatible with different theoretical perspectives and frame-
works. Efforts (or at least calls) to align space syntax with a phenomenological perspective (Seamon, 1994,
2003) may provide a particularly interesting illustration of this point for the archaeologist, who is elsewhere
reminded that phenomenological approaches do “not translate well into a formal theory, nor a fixed set
of methodological techniques” (Tilley, 2005, p. 202). Indeed, the continued advocacy of space syntax
as an encompassing theoretical framework by Hillier in particular (e.g. Hillier, 1999a, p. 165, 2008; cf.
Batty, 2004, p. 3) seems to contrast somewhat, at least from the etic perspective of an archaeologist, with
a mainstream within the space syntax community that is strongly oriented towards practical applications
in architecture and urban planning. Emically speaking, it is certainly true that in archaeology itself, as well
as related disciplines such as social anthropology (cf. Dafinger, 2010, pp. 125–127, 134–140), space syn-
tax’s broader theoretical aspirations have mostly been ignored in favour of a pragmatic approach which
conceives of space syntax as a methodological tool. Notwithstanding the inspiration that may be found
in the more abstract considerations presented by Hillier, Hanson and others, the present text is therefore
deliberately framed as an introduction to “space syntax methodology”.
Method
Basic principles
Let us begin with a concrete historical example, the ground floor plan (Figure 16.2(a)) of a 17th century
residence of minor German nobles, Schloss Friedeburg in Saxony-Anhalt (Schwarzberg, 2002). In the
terms of network analysis, each room can be understood as a node and each (inside) door as an edge
connecting two nodes. For a corresponding visual representation, we can inscribe a dot in each room and
link these with straight lines where a door connects two rooms and thus arrive at a topological graph (Fig-
ure 16.2(b)). Since the information content of this graph no longer depends on the nodes’/dots’ relative
position to one another – the relevant relationships are shown by the lines representing the edges – this
graph can then be rearranged. The most commonly used form is the so-called justified graph (Al Sayed
et al., 2014, pp. 13–14; Hillier & Hanson, 1984, pp. 106–108, 149), or j-graph for short (Figure 16.2(c)).
In this, the outside of the building, referred to as the carrier space, is shown as a further node at the root of a dendritic graph in which nodes are arranged in horizontal lines according to their distance from the outside, i.e. the minimum number of doors/edges through which they can be accessed.
Figure 16.2 Schloss Friedeburg, Saxony-Anhalt, Germany. The ground floor of the main residential building (17th c. CE): (a) the state plan of 1930. (b) a simplified plan with points of access marked by arrows, topological graph superimposed. (c) the justified graph with room types, rings and depth from carrier indicated. (d) the path matrix with sums of path lengths.
Source: Figure 16.2(a): Schwarzberg (2002), Figure 1
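The translation of a plan into a graph and the depth values underlying a justified graph can be sketched in a few lines of code. The Python fragment below is a minimal illustration using an invented five-space plan (not Schloss Friedeburg, whose connectivity is not reproduced here): rooms become nodes in an adjacency list, doors become edges, and a breadth-first search from the carrier yields exactly the depth values by which the rows of a j-graph are arranged.

```python
from collections import deque

# Hypothetical miniature plan: the outside carrier, a hall,
# a kitchen-parlour ring and a dead-end bedroom.
edges = [("carrier", "hall"), ("hall", "kitchen"), ("hall", "parlour"),
         ("kitchen", "parlour"), ("parlour", "bedroom")]

# Rooms are nodes; each (inside) door is an undirected edge between two of them.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def depths_from(graph, root):
    """Breadth-first search from `root`; the horizontal rows of a
    justified graph correspond exactly to these depth values."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        n = queue.popleft()
        for nb in graph[n]:
            if nb not in depth:
                depth[nb] = depth[n] + 1
                queue.append(nb)
    return depth

depth = depths_from(graph, "carrier")
# Mean depth of the building as seen from the carrier:
mean_depth = sum(depth.values()) / (len(depth) - 1)
```

In this example the dead-end bedroom lies at depth 3, and the mean depth of the building from the carrier works out at 2.0.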
The j-graph already permits the determination of some numerical indicators of specific spatial prop-
erties. Ease of access from the outside as a quality of a given room or an entire building, for example, is
reflected in its depth or mean depth as numerical indicators.1 In the graph, the depth of each node can
easily be determined by simply numbering the horizontal lines of nodes (for the simple reason that this
is how the graph is organized in the first place), which in turn allows us to calculate the mean depth, a
first important indicator of how accessible the building as a whole is from its surroundings. As to internal
structure and relationships, the number of rings identifiable in the j-graph (and thus the system) offers
a first indicator of route choice options, while each individual room/node can be characterized by, on
the one hand, its connectivity, i.e. the number of immediate links with other nodes, and as an a-, b-,
c- or d-type space (Al Sayed et al., 2014, p. 14; Hanson, 1998, pp. 173–174; Hillier, 1996, pp. 318–320;
Figure 16.2(c)). Spaces of type a display a single edge, i.e. are ‘dead-end’ rooms accessible through but
one door and thus strongly controlled by other spaces. In contrast to the a-type tips of the branches in
a dendritic system, b-type spaces constitute the (stems of) the branches themselves, i.e. they are at least
and indeed most typically two-edged – both literally and figuratively – in that they offer some degree of
control, at least locally, but contribute only moderately to linking up a system in global terms. The latter
is more characteristic of c-type spaces, i.e. those two- or more-edged nodes which form part of a (or more precisely: one single) ring, and d-type spaces, which form part of and thus link two or more rings and
consequently are the main connectors within a building (if, indeed, the configuration contains d-type
spaces at all).
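This four-fold classification can also be automated. The sketch below (again on an invented plan; detecting rings via the fundamental cycles of a spanning tree is a simplification that suffices for small, clearly ringed layouts, and all names are mine) derives a-, b-, c- and d-types from edge counts and ring membership:

```python
from collections import deque

# Hypothetical plan: carrier-vestibule-hall, two rings (hall-a1-a2 and
# hall-b1-b2) sharing the hall, and a dead-end store off a2.
edges = [("carrier", "vestibule"), ("vestibule", "hall"),
         ("hall", "a1"), ("a1", "a2"), ("a2", "hall"),
         ("hall", "b1"), ("b1", "b2"), ("b2", "hall"),
         ("a2", "store")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def fundamental_cycles(graph, root):
    """BFS spanning tree; every non-tree edge closes one fundamental cycle."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        n = queue.popleft()
        for nb in graph[n]:
            if nb not in parent:
                parent[nb] = n
                queue.append(nb)

    def branch(n):  # path from n up to the root
        path = []
        while n is not None:
            path.append(n)
            n = parent[n]
        return path

    cycles, seen = [], set()
    for a in graph:
        for b in graph[a]:
            key = frozenset((a, b))
            if key in seen or parent[a] == b or parent[b] == a:
                continue  # tree edge, or non-tree edge already handled
            seen.add(key)
            pa, pb = branch(a), branch(b)
            shared = set(pa) & set(pb)
            meet = next(n for n in pa if n in shared)  # lowest common ancestor
            cycles.append(set(pa[:pa.index(meet) + 1]) |
                          set(pb[:pb.index(meet) + 1]))
    return cycles

cycles = fundamental_cycles(graph, "carrier")

def space_type(node):
    rings = sum(node in c for c in cycles)
    if len(graph[node]) == 1:
        return "a"                      # dead-end space
    if rings == 0:
        return "b"                      # through-space on no ring
    return "c" if rings == 1 else "d"   # on one ring / linking several

types = {n: space_type(n) for n in graph}
```

Here the hall, which links the two rings, comes out as the only d-type space, while the pass-through vestibule is b-type and the dead ends (carrier, store) are a-type.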
While this categorization permits a first idea of the ‘connectedness’ – a local quality – and furthermore
the ‘centrality’ of a space within the network of spaces – a global quality of (literally) central importance
in space syntax analyses and referred to as integration – a better idea of the latter is gained if we do some
sums. The easiest way to do so (without a computer), but one rarely explicitly discussed (somewhat
surprisingly, unless you consider the ubiquity of computers), is a path matrix (Figure 16.2(d)) in which
the path length or step distance, i.e. the number of edges traversed, between each pair of nodes is noted
(Blanton, 1994, pp. 34–35). This permits the calculation of a sum of path lengths for each space/room.
The most integrated room in a building, i.e. the one most easily reached from all others, will be charac-
terized by the lowest, the least integrated by the highest sum of path lengths. From the sums we can also
calculate the mean path length (or mean distance, MD) for every space and from this and the number of
spaces (k) in a given system, Hillier and Hanson derived a numerical indicator of integration which they
termed ‘relative asymmetry’:
RA = 2 × (MD − 1) / (k − 2)
As the terminology indicates, this measure was intended to compare “how deep the system is from
a particular point with how deep or shallow it theoretically could be” (Hillier & Hanson, 1984, p. 108)
and express this in values from 0 to 1; the word ‘asymmetry’ denotes the non-correspondence of actual
and theoretically possible depth. Higher asymmetry, however, means lesser integration of a space, so that,
somewhat counter-intuitively, relative asymmetry as an indicator of integration – i.e. the spatial quality
we are, after all, interested in – gives higher values for less integrated and lower values for better-integrated
spatial units. The crucial advance over simply doing sums, on the other hand, is captured in the word
‘relative’: asymmetry and integration are now considered in relation to the size of the system, i.e. relative
asymmetry takes into account how big a building any given room is in.
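As a sketch of the procedure just described, the following fragment computes the full path matrix by repeated breadth-first search for the same hypothetical five-space plan used above (carrier, hall, a kitchen-parlour ring, a dead-end bedroom) and derives each space's sum of path lengths, mean depth and relative asymmetry:

```python
from collections import deque

# Hypothetical plan: carrier, hall, kitchen-parlour ring, dead-end bedroom.
edges = [("carrier", "hall"), ("hall", "kitchen"), ("hall", "parlour"),
         ("kitchen", "parlour"), ("parlour", "bedroom")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def step_distances(graph, root):
    """BFS: topological step distance from `root` to every other node."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        n = queue.popleft()
        for nb in graph[n]:
            if nb not in dist:
                dist[nb] = dist[n] + 1
                queue.append(nb)
    return dist

nodes = sorted(graph)
k = len(nodes)
matrix = {n: step_distances(graph, n) for n in nodes}  # the path matrix

measures = {}
for n in nodes:
    total = sum(matrix[n].values())   # sum of path lengths
    md = total / (k - 1)              # mean depth MD
    ra = 2 * (md - 1) / (k - 2)       # relative asymmetry RA
    measures[n] = (total, md, ra)
```

In this example the hall and the parlour share the lowest sum of path lengths (5) and hence the lowest RA (≈0.17), while the carrier and the dead-end bedroom are least integrated (RA ≈ 0.67).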
If instead of a single room, we look at and want to characterise the building in its entirety, not only
can we calculate mean values for depth, relative asymmetry and (as we will see) other numerical indica-
tors, but we can also consider its ‘core’ and ‘genotype’. The integration core is defined as the subset of
the (typically 10%) most highly integrated spaces (Al Sayed et al., 2014, p. 15; Hillier & Hanson, 1984,
p. 115); it may be of interest, e.g., in how far the core penetrates certain sections of a larger building or
bypasses others. The notion of a topological genotype (Hillier & Hanson, 1984, pp. 143–175; Hillier,
Hanson, & Graham, 1987), by contrast, does not aim at a more detailed internal description of a layout,
but a simplified one for comparison with other contexts. If specific room functions are attested across a
sample of buildings in a consistent hierarchy of integration – e.g. kitchen > (read: more highly integrated
than) reception room(s) > work space(s) > bedroom(s) – then this recurrent organizational scheme,
which may be obscured by very different ‘phenotypical’ built forms, is referred to as a building genotype,
which can be characteristic, e.g. of certain building functions and/or cultural or social contexts.
Comparison between buildings, however, also leads us to a crucial methodological difficulty: while
relative asymmetry as a standardised value is easier to work with than sums of path lengths and while
one might thus expect it to facilitate reliable comparisons between different buildings and, in particular,
buildings of different sizes with accordingly very different sums of path lengths, the latter, unfortunately,
is not actually the case; we will address this issue later.
Figure 16.3 (a–c) Simplified ground floor plan of the main residential building of Schloss Friedeburg: (a) with
the non-convex rooms highlighted and a suggestion for (approximately convex) subdivision of non-convex
rooms 1 and 6. (b) with the axial map superimposed and line segments indicated for the longest and most
integrated axial line. (c) with three overlapping isovists and their centre-points indicated. (d) the axial map of
the ground floor of the main residential building of Schloss Friedeburg, with the topological graph of axial
break-up superimposed. (e) examples of diamond-shaped topological graphs. (f) justified graph of the axial
break-up of the ground floor of the main residential building of Schloss Friedeburg.
Source: Figures 16.2(b–d), 16.3–16.4: by the author
of distance between two spatial entities. Instead, the distance between two spatial units is given as the least
sum of angles that need to be turned on a connecting path. Thus, geometrical properties of the spatial
layout under study are reintroduced into the formerly purely topological analysis.
In a similar though perhaps less direct way, the realm of ‘topology simple and pure’ is transcended when
isovists are taken as the nodes in a network analysis, with the isovists’ overlaps establishing the edges of
the network graph (Al Sayed et al., 2014, pp. 27–38; Turner, Doxa, O’Sullivan, & Penn, 2001; Turner &
Penn, 1999). The isovist (Benedikt, 1979) is defined as the volume or, in the present context, area of space
visible from a given point (Figure 16.3(c)); essentially this is what in geographical terms would be called
a viewshed (see Gillings & Wheatley, this volume). In contrast to the number of convex spaces or the
minimum number of axial lines in a given spatial layout, the number of points – each of which allows the
construction of an isovist – within that layout is, of course, infinite. Hence, Visual Graph Analysis (VGA)
starts by superimposing an arbitrary grid over a layout and then constructing isovists of the centre points
of the raster cells. From this point on, analysis again follows the established methodology of ‘classical space
syntax’, yet at least two marked differences between VGA and convex or axial analysis deserve mention.
The first lies in the fact, already alluded to, that the regular grid brings with it a notable degree of sen-
sitivity for metric properties; the metric size of a given room within a building will influence the visual
integration of the points within it (which are, after all, normally intervisible). The second concerns the
way we use the results of VGA; as the centre point of a raster cell is not an intuitively meaningful spatial
entity in the same way as a visual line or – a fortiori and even under the designation ‘convex space’ – a
room is, the interpretation of results from VGA will typically focus less on numerical indicators for
individual spatial units and more on the ‘heat map’ of integration as, perhaps aptly, a visual representation,
in which red/light denotes highly and blue/dark weakly integrated areas. This does not diminish VGA’s
potential for study both at the building and the settlement level.
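The VGA procedure can be prototyped quite directly. The sketch below is a crude stand-in for Depthmap's implementation, with the grid, the sampling-based line-of-sight test and the handling of sightlines that graze corners all simplified: it derives a visibility graph over the open cells of a one-cell-wide L-shaped corridor and computes each cell's visual mean depth.

```python
from collections import deque
from itertools import combinations

# A one-cell-wide L-shaped corridor; '#' marks solid cells, '.' open floor.
grid = ["...",
        "##.",
        "##."]
open_cells = [(r, c) for r, row in enumerate(grid)
              for c, ch in enumerate(row) if ch == "."]

def visible(a, b, samples=64):
    """Crude line-of-sight test: sample points along the segment between the
    two cell centres; blocked if any sample lands in a solid cell. Sightlines
    exactly grazing a corner may be judged conservatively."""
    (r0, c0), (r1, c1) = a, b
    y0, x0, y1, x1 = r0 + 0.5, c0 + 0.5, r1 + 0.5, c1 + 0.5
    for i in range(1, samples):
        t = i / samples
        r, c = int(y0 + (y1 - y0) * t), int(x0 + (x1 - x0) * t)
        if grid[r][c] == "#":
            return False
    return True

# Visibility graph: nodes are cell centres, edges link mutually visible cells.
vgraph = {cell: set() for cell in open_cells}
for a, b in combinations(open_cells, 2):
    if visible(a, b):
        vgraph[a].add(b)
        vgraph[b].add(a)

def mean_visual_depth(start):
    """Mean number of visual steps from `start` to every other open cell."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        n = queue.popleft()
        for nb in vgraph[n]:
            if nb not in depth:
                depth[nb] = depth[n] + 1
                queue.append(nb)
    return sum(depth.values()) / (len(depth) - 1)

md = {cell: mean_visual_depth(cell) for cell in open_cells}
```

The corner cell of the L, which commands both arms, emerges as the most visually integrated point – exactly the kind of pattern an integration ‘heat map’ would highlight.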
To compensate for the dependence of RA on system size, Hillier and Hanson divided it by the corresponding value for a standardized ‘diamond-shaped’ justified graph of the same number of nodes, the so-called D-value, yielding ‘real relative asymmetry’ (RRA = RA/Dk), where:
Dk = (2 × (k × (log₂((k + 2)/3) − 1) + 1)) / ((k − 1) × (k − 2))
Inserting this formula for Dk as well as that for RA in the calculation of RRA, we arrive at:
I_HH = 1/RRA = (2 × (k × (log₂((k + 2)/3) − 1) + 1) × (k − 2)) / (2 × (MD − 1) × (k − 1) × (k − 2))
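In code, the normalisation chain from mean depth through RA and RRA to integration is compact. The following sketch implements the formulas above (the function names are mine, not standard):

```python
import math

def d_value(k):
    """D-value: normalisation factor derived from a 'diamond-shaped'
    justified graph of k nodes."""
    return 2 * (k * (math.log2((k + 2) / 3) - 1) + 1) / ((k - 1) * (k - 2))

def integration_hh(md, k):
    """Hillier-Hanson integration: the reciprocal of RRA = RA / Dk."""
    ra = 2 * (md - 1) / (k - 2)   # relative asymmetry
    return d_value(k) / ra        # 1 / RRA
```

For the five-space example used earlier, a space with MD = 1.25 yields I_HH ≈ 2.11, whereas one with MD = 2.0 yields ≈ 0.53, preserving the intended reading that higher values mean better integration.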
If this seems like a less than perfectly elegant way of arriving at a crucial measure, we should not be
surprised by criticism or efforts at improvement. For axial analysis, these include, e.g. the proposals of an
alternative series of gridded rather than diamond-shaped standard j-graphs for the production of correction factors (Kruger, 1989), and even of an alternative calculation of integration based not on the prob-
lematic RA, but directly on the sum of path lengths (Teklenburg, Timmermans, & van Wagenberg, 1993):
I_Tekl = ln((k − 2)/2) / ln(∑D − k + 1)
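Transcribed directly (note that, as reconstructed here, the formula degenerates for spaces whose sum of path lengths approaches its theoretical minimum in very small systems; the numbers below are purely illustrative):

```python
import math

def integration_tekl(sum_d, k):
    """Teklenburg et al. (1993) integration, computed directly from the
    sum of path lengths (sum_d) of a node in a system of k nodes."""
    return math.log((k - 2) / 2) / math.log(sum_d - k + 1)
```

For k = 6, a space with ∑D = 8 scores higher than one with ∑D = 10, again with larger values indicating better integration.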
Similarly, in the context of VGA, both the revival of P-values (de Arruda Campos & Fong, 2003),
developed by Hillier and Hanson (1984, pp. 113–114, cf. pp. 73, 95) as an alternative to D-values in a
very specific form of analysis of building-to-settlement relationships, but never widely used, and, more
radically, the abandonment of standardized integration measures in favour of simple mean path lengths
(Sailer, 2010, pp. 132–133) have been suggested. To the best of my knowledge, however, such proposed
alternatives seem to have been mostly ignored rather than refuted (let alone accepted and adopted),
leaving the calculation of I a black box that a thriving research community mostly seems loath to open.
Recent discussions of normalisation in the context of angular analysis need to be noted as an exception
(Al Sayed et al., 2014, pp. 77–78, 117; Hillier, Yang, & Turner, 2012), but the results are not transferable
to other types of analysis.
This might encourage us to look at other spatial qualities and their numerical indicators within space
syntax, of which there are a number. We have already encountered, in the introductory section, con-
nectivity as a simple local measure, i.e. the number of other nodes with which a given node shares edges.
Control is another local measure, calculated by assigning, for each space, the reciprocal of its connectiv-
ity to each of its neighbours and then summing up the apportioned values for each space; it is taken to
capture the degree to which a space controls access from other parts of the network to its immediate
neighbours (Hillier & Hanson, 1984, p. 109). The correlation of connectivity and integration (which,
of course, cannot escape potential problems with the latter) is referred to as intelligibility and taken to
describe how far the global connective role of a spatial unit can be inferred from within that space (Al
Sayed et al., 2014, p. 15; Conroy, 2000, pp. 61–88; Hillier, 1996, p. 120).3
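Both local measures amount to little more than bookkeeping over the adjacency list. A sketch, using the same hypothetical cottage plan as before:

```python
# Hypothetical plan: carrier, hall, kitchen-parlour ring, dead-end bedroom.
edges = [("carrier", "hall"), ("hall", "kitchen"), ("hall", "parlour"),
         ("kitchen", "parlour"), ("parlour", "bedroom")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

# Connectivity: the number of immediate neighbours of each space.
connectivity = {n: len(graph[n]) for n in graph}

# Control: each space hands 1/connectivity to every neighbour; a space's
# control value is the sum of the shares it receives.
control = {n: 0.0 for n in graph}
for n in graph:
    for nb in graph[n]:
        control[nb] += 1 / connectivity[n]
```

A useful sanity check: the control values of all spaces always sum to k, since every space distributes exactly one unit among its neighbours.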
Despite their more specific uses, however, none of these measures comes close to integration in
either its apparent predictive potential or in its popularity among researchers, whereas a more seri-
ous ‘contender’ or complement to integration has come to the fore in recent years for some types of
analysis: choice (Al Sayed et al., 2014, pp. 15, 77, 114–115, 117; Freeman, 1977; Hillier & Iida, 2005,
p. 483; Turner, 2007). Like integration, choice is defined in reference to the set of shortest paths
between any pair of nodes within a system. But in contrast to the integration value of a given space,
which takes into account the shortest routes from that space to all others (which are the same as those
from all others to this particular space), its choice value reflects how many shortest routes between
pairs of other nodes pass through the space under consideration; fittingly, ‘betweenness’ has been used
as an alternative designation (Al Sayed et al., 2014, pp. 114, 117; Turner, 2007, p. 540). Consequently,
choice has been advocated as an indicator of the “through-movement potential” (Al Sayed et al.,
2014, pp. 26, 73; Hillier, 2007/2008, p. 2) of a node, i.e. its likelihood of attracting passing traffic, by
contrast with integration which is held to capture a node’s “destination potential” (Hillier, 2007/2008,
p. 2) or “to-movement potential” (Al Sayed et al., 2014, p. 73), i.e. its likelihood of attracting visitors or
simply its accessibility. This reflects, to some degree, the earlier opposition of axial space as connected
with movement and convex space as linked to static activity. It is therefore perhaps not surprising that
choice, considered “descriptive of movement rather than occupation” (Al Sayed et al., 2014, p. 15), is
not usually used in convex analysis, despite the popularity it has gained in forms of axial analysis and
in the context of line segment analysis in particular.
A last methodological aspect that we need to address concerns the contrast between local indicators,
such as connectivity and control, and global indicators, like integration and choice: while the latter make
reference to a space’s relationship with all other spaces within a given layout, the former take into account
only the relationships of a space to its immediate neighbours. There is a third possibility, which is consid-
ering a space’s relationship to all those which fall within a certain radius around it (Al Sayed et al., 2014,
pp. 15, 25, 114; Hillier, 1996, pp. 99–101). A space’s integration in the context of all those spaces within
three topological steps from it, e.g. is referred to as integration at r = 3 or, with a more general term that
can refer to other small radii as well, as local integration (global integration, in this sense, is at r = n, while
local indicators in the strict sense are established at r = 1). While calculating integration locally offers a
useful supplement to global integration values in, e.g. a convex break-up where it can help to establish
independent hubs of circulation, in studies of angular choice the analysis of different metric rather than
topological radii takes on even greater significance in that it promises, at least in present-day contexts,
a means of distinguishing between factors influencing vehicular and pedestrian traffic flows (Al Sayed
et al., 2014, pp. 25, 74). How readily the latter can be transposed into different archaeological contexts
may be debatable, but an example for the usefulness of local convex integration will be given in the fol-
lowing case study.
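Restricting the breadth-first search to a given radius is all that is needed to localise such measures. A sketch, again on the hypothetical cottage plan:

```python
from collections import deque

# Hypothetical plan: carrier, hall, kitchen-parlour ring, dead-end bedroom.
edges = [("carrier", "hall"), ("hall", "kitchen"), ("hall", "parlour"),
         ("kitchen", "parlour"), ("parlour", "bedroom")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def local_mean_depth(graph, root, radius=3):
    """Mean depth of `root`, counting only spaces within `radius` steps;
    also returns the size of that local subsystem."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        n = queue.popleft()
        if depth[n] == radius:
            continue  # do not expand beyond the radius
        for nb in graph[n]:
            if nb not in depth:
                depth[nb] = depth[n] + 1
                queue.append(nb)
    return sum(depth.values()) / (len(depth) - 1), len(depth)
```

At r = 2 from the carrier, the dead-end bedroom drops out of the subsystem and the local mean depth falls from 2.0 (global) to ≈1.67.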
Case study
The Late Bronze Age palace of Pylos in Western Messenia (Blegen & Rawson, 1966), one of the early
state centres of Mycenaean Greece, offers very favourable conditions for space syntax analysis in two
regards in particular: First, a careful study of changes to the building over the course of the 13th century
BC allows us to distinguish – and then analytically compare – an earlier and a later state of the building
(Nelson, 2017, pp. 360–365 Figures 4.7–4.8; Thaler, 2018, pp. 39–59; Wright, 1984; Figure 16.4(a–b)).4
Figure 16.4 The Palace of Pylos (Ano Englianos), Messenia, Greece (13th c. BCE): (a) a simplified plan of the
earlier building state with results of VGA and the shortest convex routes to throne room 6 superimposed as a
partial topological graph. (b) a simplified plan of the later building state with results of VGA and the shortest
convex routes to throne room 6 superimposed as a partial topological graph. (c) a simplified plan of the later
building state with shading indicating areas of convex spaces most easily accessible from the three different
points of access (indicated by arrows) and separate j-graphs for access through each of the latter (grey indicating
the spaces of the service ‘wing’). (d) a simplified plan of the later building state with shading indicating areas
of convex spaces most easily accessible from the three main courts 58, 63 and 88, “A” marking the archive,
“NEB” the Northeastern Building (the presumed clearing-house) and “P” pantries (courts assumed to be
served from these are indicated by subscript nos.). (e) j-g raph of the later building state (grey indicating the
spaces of the service ‘wing’).
Sources: Figures 16.2(b–d), 16.3, 16.4: by the author
Second, a more impressionistic comparison of those building states has already been crucial in formulat-
ing a hypothesis that dominated our understanding of the Pylian palace complex for over two decades, i.e.
the assumption that its architectural development reflected a long-term economic decline (Shelmerdine,
1987; Wright, 1984), in reaction to which, among other things, “changes to the palace [. . .] consistently
[. . .] restrict[ed] access and circulation” (Shelmerdine, 1987, p. 564); sometimes this has even been associ-
ated with defensive considerations in a military sense (e.g. Shelmerdine, 1998, p. 87).
Restriction of access (from outside) and circulation (within) translates readily into space syntax terms
as a significant lowering of, on the one hand, the integration value of the carrier space, i.e. the outside
of the building, and, on the other, mean integration for the entire system. The carrier space does indeed
display a noticeable, if not dramatic, loss of integration, from 0.84 to 0.69 for the convex and 1.27 to 1.11
for the axial break-up; yet even these lowered values remain virtually identical or even slightly higher
than the mean for integration, calculated at 0.70 for the convex and 0.96 for the axial break-up in the
later state. If the carrier is as well integrated with the building as a whole as is the average space within
it, this hardly constitutes a defensive architecture. As to circulation within, the just cited mean values of
integration hardly change at all from the earlier state, for which they are calculated as 0.72 and 1.01 in the convex and axial analysis respectively; in fact, if one of the aforementioned alternative suggestions for
the calculation of integration values (Teklenburg et al., 1993) is followed, this minimal drop is reversed
into an (equally insignificant) rise in integration (Thaler, 2005, p. 327).
This is remarkable not only in that it clearly contradicts (one of the underlying assumptions of) the
decline hypothesis, but also when viewed against relative proportions of space types, particularly the
increase of spaces of type b, 29% in the earlier and 38% in the later building state, and the concomitant
decrease of d-type ring-connectors, which account for 18% of all spaces in the earlier state, but only 9%
later on. It is not ease of movement that decreased, but options of route choice; i.e. circulation was not
restricted, but rather channelled. Channelling of traffic towards distinct routes is an aspect of the grow-
ing architectural differentiation of the palace complex and is particularly evident with regard to what
might be described as its service ‘wing’: If we compare the j-graph for the palace (Figure 16.4(e)) and a
mapping of the areas most quickly (i.e. with the least topological steps) reached from each of its three
points of access (Figure 16.4(c)), then an area of store rooms at and around the back of the palace’s main
building stands out as a coherent subsystem with only few connections to the remainder of the complex.
A j-graph constructed for access only through this ‘tradesmen’s entrance’ (as it was termed in another
earlier and rather perceptive study, Kilian, 1984, p. 43), i.e. omitting the two other access points, shows the
palace as a remarkably deep and inaccessible structure; clearly, the larger (and more representative part) of
the complex was not meant to be accessed from this direction.
If we look at how official visitors were meant to enter a Mycenaean palace (or, at least, the most high-
ranking visitors, since differentiations in rights of access seem to have held great importance), there is a
canonical route through first a propylon and then a courtyard (both elements could be repeated) into
the megaron; inside the megaron itself, there was first an open porch, from which a vestibule could be
accessed which in turn led into the hearth/throne room. Although the latter, numbered as room 6 by
the excavators, was the most integrated a-type space, thus combining an accessible/commanding position
and privacy, in both the earlier and later state of the Pylos palace, it was only in the later state that the
topologically shortest route through the convex map into the throne room came to coincide with the
canonical route just set out (Figure 16.4(a–b)); concomitantly, VGA documents a shift of visual integra-
tion within the large courtyard in front of the main building from its sides to its centre and thus towards
the propylon. This may well be a case of a specific social practice, i.e. a canonical way of approaching
the ruler’s seat, becoming embodied in the (architectural but also highly) social structure of the palace
building, which then, of course, was instrumental in perpetuating it.
Another aspect of the growing differentiation of the palace complex besides such channelling can be
found in the comparison not of routes, but of individual spaces and none are more informative in this
regard than the major courts. While the earlier building state displayed a largely undifferentiated ring of
hypaethral spaces around the main building, three distinct and separate courtyards develop there in the
course of the 13th century, courts 58, 63 and 88 of the later building state (Figure 16.4(d)). Their crucial
role in the palace complex is documented by the fact that, of all convex spaces, the three display the high-
est values for local integration (58 > 63 > 88), connectivity (58 > 63 = 88) and control (58 > 88 > 63).
Clearly, together these were the circulation hubs of the later palace, but nonetheless their roles were not
one and the same, as a comparison between 58 and 63 in particular illustrates.
Court 58 displays a significantly lower depth from the carrier space, indeed it is a mere two steps
from two access points to the palace (out of a total of three, the third being the aforementioned ‘trades-
men’s entrance’) and no route from these two entrances into the deeper sections of the palace can bypass
58. Hardly surprisingly, the palace archive is nearby and what has been identified as a clearing-house for
the redistributive palace economy opens directly onto 58, which can be identified as the interface for
all official outside contacts of both political/ceremonial and formal economic nature (anything other
than deliveries of goods for consumption within the palace, it would seem). And yet, in terms of global
integration, i.e. in its relevance for circulation within the palace, 58 is eclipsed by court 63 as the most
highly integrated convex space of the entire complex. To add insult to injury, both courts can be associ-
ated with pantries containing, among other things, thousands of drinking vessels, presumably employed
during palatial feasts; but, by comparison with the kylikes apparently used in the more deeply sited and
thus apparently more exclusive court 63, those associable with 58 are of noticeably inferior quality, indi-
cating how architectural differentiation could be translated into social differentiations during specific
events hosted within the palace (Bendall, 2004). As to court 88, mapping those areas topologically more
closely associated with this courtyard rather than either 58 or 63 (Figure 16.4(d)) produces a result that is
almost completely congruent with the aforementioned mapping of primary access through the ‘trades-
men’s entrance’; court 88 was the hub of the service ‘wing’ and presumably no more than a staging area
in the context of feasts.
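The three measures compared above are all simple computations on the access graph. The following sketch uses a small hypothetical graph (not the Pylos plan) to show one conventional way of computing them; the D-value normalisation used to convert relative asymmetry into the integration values reported by packages such as depthmapX is omitted for brevity.

```python
# Illustrative access graph: each convex space maps to the set of spaces
# it connects to directly. The graph and values are hypothetical.
from collections import deque

graph = {
    "carrier": {"A", "B"},
    "A": {"carrier", "C"},
    "B": {"carrier", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

def connectivity(g, node):
    """Connectivity: the number of spaces directly linked to a node."""
    return len(g[node])

def control(g, node):
    """Control value (after Hillier & Hanson, 1984): each space shares a
    total 'control' of 1 evenly among its neighbours; a node's control is
    the sum of the shares it receives."""
    return sum(1 / len(g[nb]) for nb in g[node])

def mean_path_length(g, node):
    """Mean step distance from `node` to all other nodes (breadth-first
    search); 'mean depth' in the more conventional terminology."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nb in g[cur]:
            if nb not in dist:
                dist[nb] = dist[cur] + 1
                queue.append(nb)
    k = len(g)
    return sum(dist.values()) / (k - 1)

def relative_asymmetry(g, node):
    """RA = 2(MD - 1)/(k - 2); lower RA means higher integration."""
    k = len(g)
    return 2 * (mean_path_length(g, node) - 1) / (k - 2)
```

On this toy graph, space "C" plays the role of a circulation hub: it has the highest connectivity (3), the highest control (2.0) and the lowest relative asymmetry, i.e. the highest integration.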
Conclusion
Given its underlying (and, it would appear, empirically proven) premise that a three-dimensional Euclid-
ian space can be reduced to a two-dimensional topological one and still be meaningfully analysed in social
terms, it should not surprise us that space syntax methodology is not difficult to apply, both in general
and in particular, i.e. to archaeological contexts (cf. Hacıgüzeller & Thaler, 2014). This ease of applica-
tion is further helped, to no small degree, by the fact that the Bartlett School of Architecture has made
fairly user-friendly analytical software available free of charge for academics, first with Alasdair Turner’s
Depthmap (Turner, 2011; cf. Turner, 2001b, 2004), which largely replaced an earlier bundle of Macintosh-
based programmes, and more recently with Tasos Varoudis’s depthmapX (Varoudis, 2012). It is so simple
(and, indeed, takes so little understanding of the underlying procedures) to produce a plausible(-looking)
output with just a few mouse-clicks, that anyone planning to work with space syntax more extensively
or in some depth should perhaps consider starting with a ‘manual’ analysis or two.
Any prospective user should also keep in mind that space syntax does not produce any meaningful
results in a vacuum. I have elsewhere (Thaler, 2006, 2018, pp. 8–26) proposed an analytical framework
that uses different levels of diachronic stability as a guiding principle to meaningfully relate different
perspectives on the social definition and the archaeological documentation of architectural spaces, includ-
ing space syntax. Yet a more general and two-fold caveat should be emphasized in the present context,
namely that both a critical assessment of the source materials, i.e. the plans intended for analysis, and a
considered contextualization of results are important and, at least in the latter case, indispensable steps in
archaeological studies employing space syntax methods. We are no longer in a position to see whether
the high integration value of space X, Y or Z in any building or settlement under analysis correlates with
actual movement patterns, but have to assume that it does based on the analogy with observations on
present-day contexts. Similarly, we cannot question occupants on what specific spaces are used for and
therefore will have to relate analytical results to archaeological indications of space use. Finds inventories
for specific spaces are the obvious example, but the study of wall-painting locations through space syntax
(Letesson, 2012) – and thus of a category of non-movable, ‘diachronically stable’ finds less prone to pre-
and postdepositional dislocation than most other finds – provides a good illustration that we need not
confine ourselves to the obvious.
That said, the fact remains – or comes into focus even more clearly – that space syntax approaches
entail high demands on archaeological data. This holds true with regard to both ‘coverage’ – understand-
ably, the analysis of incomplete building plans is not an issue widely discussed in the non-archaeological
space syntax community – and level of detail. Some of the most exciting approaches in archaeological
space syntax research, although ones whose potential may not have been fully realized in extant case
studies, are therefore those which explicitly address the weaknesses in archaeological data quality, e.g.
by harnessing the concept of topological genotypes in order to reconstruct incomplete building plans
(Romanou, 2007), by trying to open up the large-scale coverage of geophysical survey to space syntax
analysis (Spence-Morrow, 2009) or by aligning space syntax perspectives with data recovery methods
that promise very fine-grained information on space use in built contexts, such as micro-refuse analysis
(Milek, 2006).
In the light of these latter studies, it could be suggested that the greatest hope for space syntax in
archaeology does not lie in specialist desk- and literature-based studies, though given the inherently
comparative stance of space syntax their potential contribution remains great, but in research designs for
fieldwork that take into account the needs of topological analyses of social space, not in order to ‘cater
for’ specialists, but in order to enlist a further useful tool for the detailed published study that is the aim
and raison d’être of research excavations and surveys.
Notes
1 It should be noted that the use of the term ‘depth’ in the present introductory text differs slightly – and deliber-
ately – from much of the extant literature, where a wider meaning is adopted. Hence, what is here termed simply
‘depth’ may be read as ‘depth from carrier’ in more conventional terms, whereas what in the following will be
referred to as ‘path length’ or ‘step distance’, i.e. the distance between any two nodes in a system, will often be
described simply as ‘depth’. There is a clear logic in the latter usage in that, e.g. the mean depth of (or for) a given
node will indicate how deep the system is from that node, but the more descriptive designation ‘path length’ was
felt to offer a more intuitive appreciation of methodological foundations and is thus preferred here. Correspond-
ingly, in the formulae given in the text ‘MD’ can be read as either ‘mean distance’ (in the terms chosen in this paper)
or ‘mean depth’ (in the more conventional usage). For analytical purposes, calculating the integration (cf. below)
of the carrier will often be a strong alternative to considering the mean depth (from the carrier) of a system.
2 Further illustration of this point can be seen in deliberate divergences from the strict definition of convexity in
Figure 16.3(a). While it seems clear – to the author and hopefully the reader, too – that the insertion of a staircase
in the room labelled ‘1’ breaks the latter up into two separate spaces, the non-convexity of rooms 5 and 9 was
considered too little pronounced to warrant subdividing these spaces. The most arbitrary decision was certainly
to subdivide room 6, which like room 1 houses a staircase in one corner, into two spaces in such a manner that
the larger one, room 6 a, ‘controls’ the door to corridor 3 at the cost of its strict convexity (rather than assign that
control to the ‘nook’ 6 b or sharing it between 6 a and 6 b). In the strictest sense, the floor area inside each door
opening would have to be considered a separate convex space; these connecting spaces would, coincidentally, paral-
lel the ‘connectors’ needed in the early space syntax software ‘Pesh’. Elsewhere, I have used the term “semiconvex
breakup” to indicate “a suitable compromise between the analysis of convex and bounded spaces in a complex
containing both roofed and open areas” (Thaler, 2005, p. 327; cf. Thaler, 2018), but arguably such a designation
could be considered to imply an impractically absolute concept of convexity.
3 In a similar vein, entropy, calculated through a logarithmic formula, and other measures derived from it were
studied, particularly in the context of VGA, for their potential to “give an insight into how ordered the system
is from a location” (Turner, 2001b, p. 9); but while high entropy could be associated with even distributions of
path lengths from a given space to other spaces, a low entropy value for a space could either indicate many other
spaces in close proximity or the opposite, a clustering of spaces at a distance.
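The measure described in this note can be sketched as the Shannon entropy of the distribution of path lengths from one space to all others (the counts below are hypothetical, not data from any cited study):

```python
# Shannon entropy of a path-length distribution. counts[i] is the number
# of spaces at the i-th step distance from the space under consideration.
import math

def path_length_entropy(counts):
    """Return -sum(p * ln p) over the non-empty distance bands."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# An even spread of path lengths gives high entropy:
even = path_length_entropy([5, 5, 5, 5])   # = ln 4
# Clustering gives low entropy, whether the cluster is nearby or far
# away -- exactly the ambiguity noted above:
near = path_length_entropy([18, 1, 1, 0])
far = path_length_entropy([0, 1, 1, 18])
```

Because `near` and `far` yield identical entropy values, the measure alone cannot distinguish a space surrounded by close neighbours from one facing a distant cluster.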
4 The case study briefly set out here is presented in more detail in: Thaler (2005, 2018, pp. 39–185) and Hacıgüzeller
and Thaler (2014). Its results are contextualized in: Thaler (2006, 2018). As also briefly discussed in n. 2, the con-
cept of convexity was applied with a deliberate degree of latitude in this case study, but corresponding terms like
‘convex space’, ‘convex break-up’ etc. are retained in their conventional form in the present introductory text.
References
Al Sayed, K., Turner, A., Hillier, B., Iida, S., & Penn, A. (2014). Space Syntax methodology (4th ed.). London: University
College London, Bartlett School of Architecture.
Batty, M. (2004). CASA working papers: Vol. 75. A new theory of space syntax. Retrieved from www.casa.ucl.ac.uk/
working_papers/paper75.pdf
Bendall, L. (2004). Fit for a king? Hierarchy, exclusion, aspiration and desire in the social structure of Mycenaean
banqueting. In P. Halstead & J. C. Barrett (Eds.), Sheffield studies in Aegean archaeology: Vol. 5. Food, cuisine and
society in prehistoric Greece (pp. 105–135). Oxford: Oxbow Books.
Benedikt, M. L. (1979). To take hold of space: Isovists and isovist fields. Environment and Planning B, 6, 47–65.
Blanton, R. E. (1994). Houses and households: A comparative study. New York, NY: Springer.
Blegen, C. W., & Rawson, M. (1966). The palace of Nestor at Pylos in Western Messenia: Vol. 1. The buildings and their
contents. Princeton: Princeton University Press.
Conroy, R. (2000). Spatial navigation in immersive virtual environments (Doctoral dissertation). Retrieved from www.
thepurehands.org/phdpdf/thesis.pdf
Dafinger, A. (2010). Die Durchlässigkeit des Raums: Potenzial und Grenzen des Space Syntax-Modells aus sozialan-
thropologischer Sicht. In P. Trebsche, N. Müller-Scheeßel, & S. Reinhold (Eds.), Der gebaute Raum: Bausteine einer
Architektursoziologie vormoderner Gesellschaften (pp. 123–142). Münster: Waxmann.
Dalton, N. (2001). Fractional configuration analysis and a solution to the Manhattan problem. In J. Peponis, J. D.
Wineman, & S. Bafna (Eds.), Proceedings of the 3rd international symposium on space syntax (pp. 26.1–26.13). Ann
Arbor: University of Michigan, College of Architecture & Urban Planning.
de Arruda Campos, M. B., & Fong, P. S. P. (2003). A proposed methodology to normalise total depth values
when applying the visibility graph analysis. In J. Hanson (Ed.), 4th international space syntax symposium, confer-
ence, London, 17–19 June 2003 (pp. 35.1–35.10). Retrieved from http://217.155.65.93:81/symposia/SSS4/
fullpapers/35Campos-Fongpaper.pdf
Foster, S. M. (1989). Analysis of spatial patterns in buildings (access analysis) as an insight into social structure:
Examples from the Scottish Atlantic Iron Age. Antiquity, 63, 30–40.
Freeman, L. (1977). A set of measures of centrality based on betweenness. Sociometry, 40, 35–41.
Garfield, S. (2012). On the map: Why the world looks the way it does. London: Profile Books.
Gilchrist, R. (1988). The spatial archaeology of gender domains: A case study of medieval English nunneries. Archaeo-
logical Review from Cambridge, 7, 21–28.
Hacıgüzeller, P., & Thaler, U. (2014). Three tales of two cities? A comparative analysis of topological, visual and
metric properties of archaeological space in Malia and Pylos. In E. Paliou, U. Lieberwirth, & S. Polla (Eds.), Topoi:
Berlin studies of the ancient world: Vol. 18. Spatial analysis in past built environments: Proceedings of the international and
interdisciplinary workshop (pp. 203–262). Berlin: de Gruyter.
Hanson, J. (1998). Decoding homes and houses. Cambridge: Cambridge University Press.
Hillier, B. (1996). Space is the machine: A configurational theory of architecture. Cambridge: Cambridge University Press.
Hillier, B. (1999a). Guest editorial: The need for domain theories. Environment and Planning B, 26, 163–167.
Hillier, B. (1999b). The hidden geometry of deformed grids: Or, why space syntax works, when it looks as though
it shouldn’t. Environment and Planning B, 26, 169–191.
Hillier, B. (2007/2008). Using DepthMap for urban analysis: A simple guide on what to do once you have an analysable map
in the system. Unpublished manuscript, Bartlett School of Architecture, University College London, London, UK.
Hillier, B. (2008). Space and spatiality: What the built environment needs from social theory. Building Research &
Information, 36, 216–230.
Hillier, B., & Hanson, J. (1984). The social logic of space. Cambridge: Cambridge University Press.
Hillier, B., Hanson, J., & Graham, H. (1987). Ideas are in things: An application of the space syntax method to dis-
covering house genotypes. Environment and Planning B, 14, 363–385.
Hillier, B., & Iida, S. (2005). Network and psychological effects in urban movement. In A. G. Cohn & D. M. Mark
(Eds.), Lecture notes in computer science: Vol. 3693. Proceedings of spatial information theory: International conference
(pp. 475–490). Berlin: Springer-Verlag. Retrieved from http://eprints.ucl.ac.uk/1232/
Hillier, B., Leaman, A., Stansall, P., & Bedford, M. (1976). Space syntax. Environment and Planning B, 3, 147–185.
Hillier, B., Penn, A., Hanson, J., Grajewski, T., & Xu, J. (1993). Natural movement: Or, configuration and attraction
in urban pedestrian movement. Environment and Planning B, 20, 29–66.
Hillier, B., Yang, T., & Turner, A. (2012). Normalising least angle choice in Depthmap: And how it opens up new
perspectives on the global and local analysis of city space. Journal of Space Syntax, 3, 155–193.
Kilian, K. (1984). Pylos – Funktionsanalyse einer Residenz der späten Palastzeit. Archäologisches Korrespondenzblatt,
14, 37–48.
Kruger, M. J. T. (1989). On node and axial grid maps: Distance measures and related topics. Paper presented at the European
Conference on the Representation and Management of Urban Change, Cambridge.
Leach, E. (1978). Does space syntax really “constitute the social”? In D. R. Green, C. Haselgrove, & M. Spriggs
(Eds.), British archaeological reports: International series: Vol. 47ii. Social organisation and settlement: Contributions from
anthropology, archaeology and geography (pp. 385–401). Oxford: BAR Publishing.
Letesson, Q. (2012). ‘Open day gallery’ or ‘private collections’? An insight on Neopalatial wall paintings in their spatial
context. In D. Panagiotopoulos & U. Günkel-Maschek (Eds.), Aegis: Vol. 5. Minoan realities: Approaches to image,
architecture, and society in the Aegean Bronze Age (pp. 27–61). Louvain-la-Neuve: Presses universitaires de Louvain.
Milek, K. (2006). Houses and households in early Icelandic society: Geoarchaeology and the interpretation of social space (Doctoral
dissertation). Retrieved from www.dropbox.com/sh/qirb8c8m5o7lisa/AAB7UD7cwNIYAF9WDTJqVB_6a
Nelson, M. C. (2017). The architecture of Epano Englianos, Greece. In F. A. Cooper & D. Fortenberry (Eds.), Brit-
ish archaeological reports: International series: Vol. 2856. The Minnesota Pylos project, 1990–98 (pp. 283–418). Oxford:
BAR Publishing.
Romanou, D. (2007). Residence design and variation in residential group structure: A case study, Mallia. In R. West-
gate, N. R. E. Fisher, & J. Whitley (Eds.), Building communities: House, settlement and society in the Aegean and beyond:
Proceedings of a conference held at Cardiff University, 17–21 April 2001 (pp. 77–90). London: British School at Athens.
Sailer, K. (2010). The space-organisation relationship: On the shape of the relationship between spatial configuration and collective
organisational behaviours (Unpublished doctoral dissertation). TU Dresden, Dresden.
Schwarzberg, H. (2002). Zu Geschichte und baulicher Entwicklung von Schloß Friedeburg im Mansfelder Land.
Burgen und Schlösser in Sachsen-Anhalt: Mitteilungen der Landesgruppe Sachsen-Anhalt der Deutschen Burgenvereinigung,
11, 217–238.
Seamon, D. (1994). The life of the place: A phenomenological commentary on Bill Hillier’s theory of space syntax.
Nordisk Arkitekturforskning, 7, 35–48.
Seamon, D. (2003). Review of the book Space is the machine, by B. Hillier. Environmental and Architectural Phenomenol-
ogy, 14(3), 6–8.
Shelmerdine, C. W. (1987). Architectural change and economic decline at Pylos. Minos, 20–22, 557–568.
Shelmerdine, C. W. (1998). The palace and its operations. In J. L. Davis (Ed.), Sandy Pylos: An archaeological history
from Nestor to Navarino (pp. 81–96). Austin, TX: University of Texas Press.
Spence-Morrow, G. (2009). Analyzing the invisible syntactic interpretation of archaeological remains through
geophysical prospection. In D. Koch, L. Marcus, & J. Steen (Eds.), Proceedings of the 7th international space syntax
symposium (pp. 106.1–106.10). Stockholm: KTH Royal Institute of Technology.
Stöger, H. (2011). Archaeological studies Leiden University: Vol. 24. Rethinking Ostia: A spatial enquiry into the urban society
of Rome’s imperial port-town. Leiden: Leiden University Press.
Teklenburg, J. A. F., Timmermans, H. J. P., & van Wagenberg, A. F. (1993). Space syntax: Standardised integration
measures and some simulations. Environment and Planning B, 20, 347–357.
Thaler, U. (2005). Narrative and syntax: New perspectives on the Late Bronze Age palace of Pylos, Greece. In A. van
Nes (Ed.), Space Syntax: 5th international symposium (pp. 323–339). Amsterdam: Techne Press.
Thaler, U. (2006). Constructing and reconstructing power: The palace of Pylos. In J. Maran, C. Juwig, H. Schwen-
gel, & U. Thaler (Eds.), Geschichte: Forschung und Wissenschaft: Vol. 19. Constructing power: Architecture, ideology and
social practice (pp. 93–116). Münster: LIT Verlag.
Thaler, U. (2018). Universitätsforschungen zur prähistorischen Archäologie: Vol. 320. me-ka-ro-de: Mykenische Paläste als
Dokument und Gestaltungsrahmen frühgeschichtlicher Sozialordnung. Bonn: Habelt.
Tilley, C. (2005). Phenomenological archaeology. In C. Renfrew & P. Bahn (Eds.), Archaeology: The key concepts
(pp. 201–207). London: Routledge.
Turner, A. (2000). CASA working papers:Vol. 23. Angular analysis: A method for the quantification of space. Retrieved from
http://discovery.ucl.ac.uk/1368/
Turner, A. (2001a). Angular analysis. In J. Peponis, J. D. Wineman, & S. Bafna (Eds.), Proceedings of the 3rd interna-
tional symposium on space syntax (pp. 30.1–30.11). Ann Arbor: University of Michigan, College of Architecture &
Urban Planning.
Turner, A. (2001b). Depthmap: A program to perform visibility graph analysis. In J. Peponis, J. D. Wineman, &
S. Bafna (Eds.), Proceedings of the 3rd international symposium on space syntax (pp. 31.1–31.9). Ann Arbor: University
of Michigan, College of Architecture & Urban Planning.
Turner, A. (2004). Depthmap 4: A researcher’s handbook. London: University College London, Bartlett School of Gradu-
ate Studies. Retrieved from http://eprints.ucl.ac.uk/2651/
Turner, A. (2007). From axial to road-centre lines: A new representation for Space Syntax and a new model of route
choice for transport network analysis. Environment and Planning B, 34, 539–555.
Turner, A. (2011). UCL Depthmap: Spatial network analysis software (Version 10.14) [Computer software]. London:
University College London, VR Centre of the Built Environment.
Turner, A., Doxa, M., O’Sullivan, D., & Penn, A. (2001). From isovists to visibility graphs: A methodology for the
analysis of architectural space. Environment and Planning B, 28, 103–121.
Turner, A., & Penn, A. (1999). Making isovists syntactic: Isovist integration analysis. Proceedings of the 2nd International
Symposium on Space Syntax, Universidade de Brasília, Brazil. Retrieved from www.vr.ucl.ac.uk/publications/
turner1999-000.html
Varoudis, T. (2012). DepthmapX – Multi-platform spatial network analysis software [Computer software]. London:
University College London and The Bartlett School of Architecture. Retrieved from http://varoudis.github.io/
depthmapX/
Wright, J. C. (1984). Changes in the form and function of the palace at Pylos. In C. W. Shelmerdine & T. G.
Palaima (Eds.), Pylos comes alive: Industry and administration in a Mycenaean palace (pp. 19–29). New York: Fordham
University Press.
Plate 13.2 Southern portion of the coastal Georgia study area: maximum available calories for all shellfish
species for the month of September (ca. 500 BP).
Plate 13.3 Southern portion of the coastal Georgia study area: returnable calories for all resources combined
for the month of January (ca. 500 BP).
Plate 13.4 Southern portion of the coastal Georgia study area: returnable calories for all resources combined
for the month of September (ca. 500 BP).
Plate 14.1 Schematic illustration of the features of an Agent Based Model (ABM) with cognitive agents, based
on the model described in Lake (2000a).
Plate 14.3 Graphed Agent Based Model (ABM) simulation results which collectively illustrate several aspects
of experimental design: (a) plotted points of the same colour and k value differ due to stochastic effects alone;
(b) two different parameters σ and k are varied; and (c) two different agent rules, "CopyTheBest" and
"CopyIfBetter", are explored.
Plate 14.4 Comparison of Long House Valley simulation results with archaeological evidence.
Plate 15.1 Four different network data representations of the same hypothetical Mediterranean transport
network. (a) adjacency matrix with edge length (in km) in cells corresponding to a connection; (b) node-link
diagram where edge width represents length (in km); please refer to the colour plate for a breakdown by
transport type (red lines = sea, green = river, grey = road); (c) edge list; (d) geographical layout; once
again, please refer to the colour plate for a breakdown by transport type.
Plate 15.2 A planar network representing transport routes plotted geographically (a) and topologically (b). A
non-planar social network representing social contacts between communities plotted geographically (c) and
topologically (d). Note the crossing edges in the non-planar network.
Plate 15.5 Network representation of the Orbis network: geographical layout (a, c) and topological layout (b,
d). Node size and colour represent betweenness centrality weighted by physical distance in (a) and (b), and
they represent unweighted betweenness centrality in (c) and (d): the bigger and darker blue the node, the more
important it is as an intermediary for the flow of resources in the network. By comparing (a, b) with (c, d), note
the strong differences in which settlement is considered a central one depending on whether physical distance
is taken into account (a, b) or not (c, d). Edge colours represent edge type: red = sea, green = river, grey = road.
Plate 15.6 Geographical network representation of the Orbis network: geographical layout (a) and topological
layout (b). Node size and colour represent increasing physical distance over the network away from Rome: the
larger and darker the node, the further away this settlement is from Rome following the routes of the trans-
port system. Note the fall-off of the results with distance away from Rome structured by the transport routes
rather than as-the-crow-flies distance. Edge colours represent edge type: red = sea, green = river, grey = road.
Plate 17.7 (a) The cumulative viewshed generated by summing the binary viewsheds of the 17 coin hoard loca-
tions depicted in Figure 17.4 with a maximum view radius of 6,880 m. Colours at the red end of the green-red
scale indicate locations from which higher numbers of mounds are visible. (b) The total viewshed calculated for
the entire study region (in this case, the convex hull depicted in Figure 17.4 with a 500 m buffer – 8,938 view-
point locations). This encodes views-from the individual viewpoints, where the red end of the green-red scale
indicates those locations from which a larger area is modelled as being visible. (c) The above analysis repeated
with viewpoint/target offsets adjusted to encode views-to the viewpoint locations.
Plate 17.8 Viewsheds generated for each of the tower-kivas. The green zone represents the view from ground
level and the red the top of the tower. Blue dots indicate Puebloan archaeological sites in the landscapes of the
tower kivas. The radiating buffers extend for 20 km around each site – the maximum viewing range used for
the analyses (Kantner & Hobgood, 2016, Figure 3).
Plate 20.1 Multisensor (8 sensors) vertical magnetic gradient survey with SENSYS at Pella, northern Greece. Left
image indicates the original data suffering from various spikes, traverse striping and grid mismatching. Right image
indicates the results of processing that tried to remove those specific effects.
Plate 20.2 Application of the Fast Fourier Transform (FFT) power spectrum analysis of the magnetic data
obtained from the Bronze Age cemetery of the Békés Koldus-Zug site cluster – Békés 103 (BAKOTA proj-
ect). The depth of the various targets (h) is easily determined by measuring the slope of the power spectrum
at different segments and dividing it by 4π (Spector & Grant, 1970). The radially averaged spectrum was cal-
culated and used to separate the magnetic signals coming from deep sources (h=2.87 m) and shallow sources
(h=0.73 m) below the sensor. The spectrum was also used as a guide to define a bandwidth filter in order to
eliminate the sources with wavenumber more than 550 radians/m and less than 100 radians/m respectively,
and enhance the magnetic signal coming from the potential archaeological structures.
Plate 20.6 Results of a seismic refraction survey at the area of the assumed ancient port of Priniatikos Pyrgos in East Crete,
Greece: (a) 2D image representing the depth to the bedrock, which reaches about 40 m below the current surface (bluish
colors). The black dots represent the position of the geophones along the seismic transects. The area has been completely
covered by alluvium deposits and other conglomerate formation fragments as a result of past landslide and tectonic activity.
The interpretation of the velocity of propagation of the acoustic waves revealed the spatial distribution of (b) the alluvium
deposits at the top (velocity of 491 m/sec), (c) the lower and upper terrace deposits (velocity of 1830 m/sec), (d) the medium
depth sandstones and conglomerates (velocity of 2400 m/sec) and (e) the deeper weathered limestone or cohesive conglom-
erates (velocity of 4589 m/sec) (Sarris, Papadopoulos, & Soupios, 2014).
Plate 20.7 Results of the geophysical surveys at Velestino Mati. The magnetic data (a) indicates the nucleus of the settlement
at the west top of the magoula with some expansion towards the east top. A number of high dipolar magnetic anomalies are
associated with burnt daub foundations that were also confirmed from the Electromagnetic Induction (EMI) soil magnetic
susceptibility (b) and the soil resistance data (c). Magnetic susceptibility also confirmed the existence of enclosures around
the tell.
Plate 20.8 Results of the geophysical surveys at Almyriotiki. The magnetic data (a) presented a clear image of the
internal planning of the settlement: Burnt daub structures follow a circular orientation around the top of the tell.
The houses expand further to the south, where some weaker magnetic anomalies representing stone houses with
internal divisions are also present. An irregular wide ditch system encloses the settlement from the east and the
north and it is confirmed from the EMI magnetic susceptibility (b) and soil conductivity measurements (c). The
high soil conductivity to the north coincides with an area susceptible to periodic flooding. The above were also
confirmed from the soil viscosity measurements (d) as an indicator of the soil permittivity.
Plate 20.9 Results of the geophysical surveys at Almyros 2. The magnetic data (a) depict clearly the concentration
of burnt daub structures at the centre of the tell, expanding further to the south. The settlement is surrounded by
a double ditch system, which is confirmed by both EMI magnetic susceptibility (b) and soil conductivity data (c).
A number of breaks in this double enclosure are most probably associated with multiple entrances to the settle-
ment. Soil conductivity seems also to increase outside the settlement to the south and west directions (north to
the top), namely in the area which is most susceptible to flooding.
Plate 21.5 The TimeMap Data Viewer (TMView) (from Johnson and Wilson 2003, 127).
Plate 21.6 Sample frame from an animation in the ‘Up In Flames’ study that combined synchronised animated
density graphs (produced in the R Software Environment) with animated density maps (produced in ArcGIS).
Plate 21.8 Time-GIS (TGIS) screenshot showing dates symbolised according to temporal topology. The
colour coding is according to the temporal topological relationship between each date and the currently
selected time period.
Plate 23.1 A 3D model of the house of Caecilius Iucondus visualized through Unity 3D.
Plate 23.5 3D Model of the plaster head (21666) after conservation. The model was generated using Agisoft
PhotoScan pro version 1.2.6. with acquisition campaign and processing done by Nicoló Dell’Unto.
Plate 24.1 A citation network of results returned from a Google Scholar search of ‘geographic + visualization’,
using Ed Summers’ python package ‘Etudier’. “Google Scholar aims to rank documents the way researchers do,
weighting the full text of each document, where it was published, who it was written by, as well as how often
and how recently it has been cited in other scholarly literature.” https://scholar.google.com/intl/en/scholar/
about.html. The results give us a sense of the most important works by virtue of these citation patterns. Thus,
MacEachren, Boscoe, Hau, & Pickle (1998); Slocum et al. (2001); Brewer, MacEachren, Abdo, Gundrum, &
Otto (2000); Crampton (2002); Howard & MacEachren (1996) are most functionally important in tying
scholarship together. This is not the same thing as being the most often cited work. Rather, these are the works
whose ideas bridge otherwise disparate clumps; they are most central.
Plate 24.2 Citation analysis using Summers’ Etudier package, from a Google Scholar Search for ‘data + soni-
fication’. Colour indicates works with similar patterns of citation; size indicates central works that tie scholarship
together. This is not the same thing as ‘most cited’. On this reading, one should begin with Madhyastha and
Reed (1995); Wilson and Lodha (1996); Zhao, Plaisant, Shneiderman, and Duraiswami (2004); De Campo
(2007); Zhao, Plaisant, Shneiderman, and Lazar (2008).
17
GIS-based visibility analysis
Mark Gillings and David Wheatley
Introduction
What do archaeologists mean by visibility and why are they so interested in it?
Archaeologists have long recognised that the visual properties of landscape locations (and configurations
of such) were sometimes significant factors in the structuring of past activities and as a consequence
analyses of visibility have become common within landscape-based archaeological research. Whether
carried out through rich description, simple mapping or formal modelling and statistical analysis, the
visualisation, exploration and interrogation of visual patterns have increasingly relied upon the function-
ality of GIS (Jerpåsen, 2009).
The thoroughness with which visual properties of landscape have been investigated has varied widely.
At its simplest it has taken the form of anecdote or simple description, perhaps noting that a given loca-
tion was visually-commanding or acknowledging that visual relationships played a role in the organisa-
tion of a given landscape (e.g. Cummings & Pannett, 2005; Bongers, Arkush, & Harrower, 2012). More
methodologically informed studies have sought to interrogate the relationships observed, grounding such
investigations in explicit bodies of theory. Some have taken a broadly functionalist approach, in which
observed visual properties have been related to how the landscape operated or was used in the past. Good
examples concern questions of defensibility and the assumed inter-visibility of watchtowers and signal-
ling systems (e.g. Gaffney & Stančič, 1991; Sakaguchi, Morin, & Dickie, 2010). Others have adopted a
more experiential stance, approaching visibility as a perceptual act carried out by an animal in an envi-
ronment. A variety of theoretical frameworks have been deployed in these latter studies, ranging from
Gibson’s theory of direct perception and his concept of affordance (Llobera, 1996; see also Webster, 1999;
Gillings, 2012) to Higuchi’s landscape theories and proposed visual indices (Wheatley & Gillings, 2000;
Ogburn, 2006; Ruestes Bitriá, 2008; Murrieta-Flores, 2014; Williamson, 2016). Many of these studies
have drawn direct inspiration from landscape phenomenology, which seeks to investigate the relational
properties of looking and seeing in order to understand how people comprehended and engaged with the
world around them. This might involve the deliberate positioning of a sequence of monuments in order
to ‘dominate’ views or effect a gradually unfolding visual choreography of concealment and revelation,
314 Mark Gillings and David Wheatley
or the careful positioning of motifs on a rock-art panel (e.g. Tilley, 1994, 2004; Cummings & Pannett,
2005; Rennell, 2012).
From the previous paragraph, it will be clear that ‘visibility’ has been characterised by archaeologists
in a variety of ways: a property (functional, aesthetic or otherwise) inherent to certain locations; some-
thing that manifests only in a network or configuration of locations; or an act of perception on the part
of an animal in an environment. In turn these characterisations have resulted in different methodologies
and interpretative strategies, ranging from explicit attempts to simulate (or replicate) individual acts of
looking and seeing, to the identification, extraction and interrogation of visual linkages and, finally, more
abstracted analyses of global visual affordance.
Each of these engages its own set of concerns and considerations, and they often differ in terms of the
degree of quantitative rigour that is demanded of a given analysis. Compare, for example, approaches that
simply aim to map and describe visual zones (e.g. Risbøl, Petersen, & Jerpåsen, 2013) with studies that
instead seek to assess the statistical veracity of the visual patterns observed in order to explain locational
choices in the past (e.g. Lopez-Romero de la Aleja, 2008; Lake & Ortega, 2013; Wright, MacEachern, &
Lee, 2014). The focus of analyses can also be direct, insofar as the analysis of visibility is the desired end-
product of the research (e.g. Loots, 1997), or more circumspect, the resultant analysis merely an ingredi-
ent (or step) in a more complex programme of analysis or exploration (e.g. Llobera, 2003; Paliou, 2013).
What they all share is a series of core concerns with questions of scale, clarity, acuity and verisimilitude, with
the particular emphasis shifting as a result of the type of analysis being attempted.
Method
This is effectively the complement of the viewshed (or less elegantly a ‘how-far-out-of-view-shed’)
and can be used to assess how hidden locations appear to be. These basic ingredients (the LOS, the
viewshed and its variants) lie at the heart of routine GIS-based visibility studies. One point of note is
that in urban studies, and within the field of space syntax (see Thaler this volume), the terms ‘Isovist’
and ‘Isovist Field’ have usually been used to describe maps that are functionally equivalent to the view-
shed (Benedikt, 1979; Turner, Doxa, O’Sullivan, & Penn, 2001) although sometimes calculated on a
slightly different basis.
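The ‘how hidden’ complement of the viewshed can be made concrete with a short Python sketch (a deliberately simplified illustration of our own: a brute-force scan over a tiny altitude matrix, nearest-cell sampling and distances in cell units, not the algorithm of any particular GIS package). For each cell it records the extra height a target there would need in order to come into view:

```python
import numpy as np

def above_ground_level(dem, viewer, offset=1.65):
    """For each cell, the extra height a target there would need in order
    to come into view from `viewer` (0 where the cell is already visible):
    the 'how-far-out-of-view' complement of the binary viewshed."""
    rows, cols = dem.shape
    out = np.zeros((rows, cols))
    r0, c0 = viewer
    h0 = dem[r0, c0] + offset                 # eye height at the viewpoint
    for r1 in range(rows):
        for c1 in range(cols):
            n = int(max(abs(r1 - r0), abs(c1 - c0)))
            need = 0.0
            for i in range(1, n):             # sample along the sight line
                t = i / n
                terrain = dem[int(round(r0 + t * (r1 - r0))),
                              int(round(c0 + t * (c1 - c0)))]
                # target height at which the sight line just grazes this cell
                h_target = h0 + (terrain - h0) / t
                need = max(need, h_target - dem[r1, c1])
            out[r1, c1] = max(need, 0.0)
    return out

# A ridge of height 5 hides the ground beyond it from a viewer at (0, 0);
# agl reports how far a target there would have to rise to be seen.
agl = above_ground_level(np.array([[0.0, 0.0, 5.0, 0.0, 0.0]]), (0, 0))
```

Summing such layers over a sample of viewpoints gives the maps of ‘hiddenness’ discussed later in the chapter.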
Figure 17.2 (a) Conceptualisation of the basic line-of-sight (LOS) algorithm: LOS between two locations in an
altitude matrix can be established by comparing the height of each cell that intersects the line with the height
of the line at that location, interpolating where necessary. (b) Note that view-to and view-from are not neces-
sarily reciprocal because they represent different assumptions about the location of the viewer. An R3 viewshed
algorithm essentially repeats this calculation for every cell in the altitude matrix (except the viewpoint) and
records the result(s) for each cell.
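The test sketched in Figure 17.2 can be written in a few lines of Python (again a simplified sketch: nearest-cell heights rather than interpolated grid intersections, cell-unit distances, and illustrative terrain values of our own devising). Note how placing the eye-height offset at different ends of the line reproduces the non-reciprocity shown in Figure 17.2(b):

```python
import numpy as np

def line_of_sight(dem, viewer, target, viewer_offset=1.65, target_offset=0.0):
    """True if `target` is in view from `viewer` on the altitude matrix `dem`.
    The terrain height is compared with the height of the sight line at
    roughly cell-sized steps; a production implementation would interpolate."""
    (r0, c0), (r1, c1) = viewer, target
    h0 = dem[r0, c0] + viewer_offset          # eye height at the viewing end
    h1 = dem[r1, c1] + target_offset          # height of the observed point
    n = int(max(abs(r1 - r0), abs(c1 - c0)))
    for i in range(1, n):                     # sample between the endpoints
        t = i / n
        terrain = dem[int(round(r0 + t * (r1 - r0))),
                      int(round(c0 + t * (c1 - c0)))]
        if terrain > h0 + t * (h1 - h0):      # terrain rises above the line
            return False
    return True

# View-to and view-from need not be reciprocal, because the eye-height
# offset sits at a different end of the line in each case:
dem = np.array([[5.0, 4.5, 0.0, 0.0, 0.0]])
print(line_of_sight(dem, (0, 0), (0, 4)))   # knoll-top viewer sees the ground cell
print(line_of_sight(dem, (0, 4), (0, 0)))   # ground-level viewer cannot see back
```

An R3 viewshed amounts to repeating this test for every cell of the altitude matrix.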
Concentric Sweep and XDraw, often in creative combination. In each case the aim is to trade accuracy
for speed of calculation, which is why the latter approaches are termed approximate.
In brief, the R2 approach optimises R3 by reducing the number of individual LOS calculations that
need to be carried out (Figure 17.3). It does this by first running an LOS from the viewpoint to each
location on the horizon or study area boundary. It then works outwards from the viewpoint along each
of the LOS, determining the elevations of the grid intersections crossed for each grid cell adjacent to the
line. By calculating the gradients in each case the status of the intersection can be determined (in-view
or out-of-view) and this is then assigned to the closest grid cell (Larsen, 2015, pp. 26–30; Kaučič & Žalik,
2002, p. 179). Where cells receive multiple approximations (as multiple LOS may pass by) it is the closest
that determines the visibility status (Franklin & Ray, 1994, p. 6).
In much the same way as R2, Sweep (Haverkort, Toma, & Zhuang, 2008; van Kreveld, 1996) and Radar
(Ben-Moshe, Carmi, & Katz, 2008) approaches rely upon approximating the visibility status of cells
along the line of a given LOS, in this case rotating the LOS around the selected viewpoint like the second
hand of a clock. Many other algorithms exist, both variants of the above (such as Distribution and Con-
centric sweeping) and innovative horizon based approaches such as the XDraw algorithm (Franklin &
Ray, 1994; Wang, Robinson, & White, 2000) and it is important to be aware that research into optimised
viewshed algorithms continues apace (e.g. Izraelevitz, 2003; Xu & Yao, 2009; Ferreira, Andrade, Magal-
hães, Franklin, & Pena, 2014). The key point to stress here is that different algorithms are available that –
in an attempt to balance speed of computation with accuracy – will produce output of varying quality. In
any given GIS analysis it is therefore important to be aware of the particular strengths and weaknesses of
the algorithm being deployed. Unfortunately, outside of artificial (and carefully controlled) test datasets,
assessing the accuracy of a given LOS determination or viewshed can be extremely difficult and this has
been exacerbated by a tendency for analysts to treat one algorithm (invariably black-boxed but usually an
R3 implementation) as a de-facto standard against which optimised algorithms can be tested (e.g. Fisher,
1993; Kaučič & Žalik, 2002). As we discuss below, GIS-based visibility determinations are perhaps best
treated as probabilities established using imperfect models rather than actualities. As a consequence, the
Figure 17.3 The concept of the R2 viewshed algorithm which operates by: (a) generating a ‘view horizon’ (the
white box), noting the visibility of each cell on that horizon, and storing the elevation angle of view to the
observer a1, a2 etc.; (b) expanding the horizon by one cell and calculating new angles of view B for each cell
on the new view horizon; (c) the angle of intersection with the previous horizon A is inferred (from a2 and
a3) and the new angle B is compared with A to determine whether a new cell is visible or not.
field testing of results is still strongly recommended although, whilst very much in the spirit of approaches
such as landscape phenomenology, for reasons that will be discussed next this also presents difficulties.
2D or 3D?
In landscape and urban studies, visibility has traditionally been mapped as a 2D property of landscape in
the form of a map (generally stored as a 2D matrix). More recently, there has been a growing interest in
the investigation of the 3D properties of visibility fields. At its simplest, this has acknowledged the vertical
dimension of landscape through the analysis of Above Ground Level (AGL) metrics and the calculation of
vertical visibility indices (Nutsford, Reitsma, Pearson, & Kingham, 2015); however, these are still recorded
as 2D matrices of values, albeit related to 3D properties of space.
In highly complex environments, such as the interior of buildings or urban settings, this simplification
can be too restrictive and 3D matrices of visibility values which record variation on three, rather than
two, axes become important. Instead of an altitude matrix, fully 3D methods analyse 3D models which
can represent the complex forms of rooms and buildings, permitting the representation of archways, for
example. This has led to the development of the 3D isovist (e.g. Suleiman, Joliveau, & Favier, 2013) and
to visibility analyses conducted within 3D modelling systems. A good example can be found in the work
of Paliou (2011, 2013, 2014), in which the visibility of wall paintings within complex buildings and from
the outside of buildings (through windows and doorways) was formally analysed to gain insights into the
consumption of Theran mural art.
GIS-based visibility analysis 319
Figure 17.4 Buffering to avoid edge effects. The map depicts a group of Roman coin hoards in the Don
Valley in northern England. A series of visibility analyses were carried out in order to determine whether the
hoards were preferentially placed in relatively concealed (or hidden) locations. (a) In the centre is the convex
hull bounding the group of hoard locations. (b) Assuming a maximum view radius of 3,440 m (corresponding
to Ogburn’s (2006) limit of normal 20/20 vision for a 1 m wide object) we would need to process the area
included in this buffer to avoid edge effects. (c) If we increased this to 6,880 m (the limit of human acuity for
a 1 m wide object) we would need to extend our processing area accordingly – in this case to the outer buffer.
Hoard data taken from Bland et al. (2019).
Source: (Incorporates data © Crown database right 2007. An Ordnance Survey/(EDINA) supplied service.)
is sufficiently buffered to negate such effects. For example, if your maximum viewing range is set to
6,880m – the absolute limit of human resolution and recognition acuity under ideal conditions (Ogburn,
2006, p. 410) – the DEM used to carry out the analysis needs to cover an area equivalent to the study area
plus a 6,880m-wide buffer, and results in the buffer zone can be discarded. An alternative to buffering
would be to use a scaling factor to account for the loss of available area as we approach the edge of the
region. For example, we could apply a factor ranging from 1 (for viewpoint locations with no edge effect)
through 0.5 (for locations on the edge of the DEM) to 0.25 (for those in the corners). It should be noted
that this would only offer a partial solution as whilst it would correct the scaling error, the accuracy of
the parameter estimates would still be compromised (Figure 17.5).
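The disc-overlap logic behind those scaling factors can be checked numerically; the following sketch (function name and sampling scheme are ours) estimates the fraction of the maximum-view disc that actually falls inside the DEM:

```python
import numpy as np

def edge_correction(viewpoint, extent, radius, n=100_000, seed=0):
    """Fraction of the maximum-view disc around `viewpoint` lying inside
    the DEM rectangle `extent` = (xmin, ymin, xmax, ymax), estimated by
    sampling points uniformly within the disc. 1.0 means no edge effect;
    the value falls towards 0.5 on an edge and 0.25 in a corner."""
    rng = np.random.default_rng(seed)
    r = radius * np.sqrt(rng.random(n))        # uniform over the disc's area
    theta = 2.0 * np.pi * rng.random(n)
    x = viewpoint[0] + r * np.cos(theta)
    y = viewpoint[1] + r * np.sin(theta)
    xmin, ymin, xmax, ymax = extent
    return float(((x >= xmin) & (x <= xmax) & (y >= ymin) & (y <= ymax)).mean())

extent = (0, 0, 100_000, 100_000)              # a 100 km square DEM, in metres
print(edge_correction((50_000, 50_000), extent, 6_880))   # interior: 1.0
print(edge_correction((50_000, 0), extent, 6_880))        # on an edge: ~0.5
print(edge_correction((0, 0), extent, 6_880))             # in a corner: ~0.25
```

As the text notes, dividing by such a factor corrects the scaling error but not the degraded reliability of the estimate itself.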
It is important to understand that the crisp, binary viewshed (or definitive LOS determination) is a
simplified model of visibility within a virtual environment, and so we should perhaps regard the results of
such analyses more as theoretical possibilities than definitive statements of in-view or out-of-view. One
particularly elegant solution to this was proposed by Fisher (1994, 1995) through the notion of the probable
and fuzzy viewshed, though these important ideas have not gained traction within archaeological studies
(for notable exceptions see Nackaerts & Govers, 1997; Ruggles & Medyckyj-Scott, 1996). Fisher set out a
probabilistic model of the errors in DEMs, including a term for spatial autocorrelation (because errors in
DEMs are spatially correlated), and advocated using a Monte Carlo approach involving repeated visibility
determinations on different simulated DEMs. The resulting viewshed estimates were then summed
to produce the final output in the form of mapped view probability (the ‘probabilistic viewshed’) (Fig-
ure 17.6). Fuzzy viewsheds, according to Fisher’s definition, employ fuzzy set theory to incorporate acuity
effects (Loots, Nackaerts, & Waelkens, 1999) (see de Runz & Fusco, this volume). Rather than being clear
cut, the edges of a given viewshed are instead graded between 1 and 0 depending upon distance from the
viewpoint. Needless to say, probable and fuzzy viewshed analyses can also be combined.
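Fisher’s Monte Carlo procedure can be sketched as follows. This is a deliberately reduced illustration: DEM error is simulated as independent per cell (Fisher’s own model added spatial autocorrelation), the viewshed is a brute-force R3-style scan, and all names and parameter values are ours:

```python
import numpy as np

def binary_viewshed(dem, viewer, offset=1.65):
    """Brute-force (R3-style) binary viewshed: one LOS test per cell."""
    rows, cols = dem.shape
    r0, c0 = viewer
    h0 = dem[r0, c0] + offset
    out = np.zeros((rows, cols), dtype=bool)
    for r1 in range(rows):
        for c1 in range(cols):
            n = int(max(abs(r1 - r0), abs(c1 - c0)))
            out[r1, c1] = all(
                dem[int(round(r0 + i / n * (r1 - r0))),
                    int(round(c0 + i / n * (c1 - c0)))]
                <= h0 + i / n * (dem[r1, c1] - h0)
                for i in range(1, n))
    return out

def probable_viewshed(dem, viewer, rmse=3.0, n_sims=50, seed=0):
    """Fisher-style probable viewshed: perturb the DEM with simulated
    error, recompute the binary viewshed each time, and average."""
    rng = np.random.default_rng(seed)
    acc = np.zeros(dem.shape)
    for _ in range(n_sims):
        acc += binary_viewshed(dem + rng.normal(0.0, rmse, dem.shape), viewer)
    return acc / n_sims            # per-cell probability of being in view

dem = np.zeros((15, 15))
dem[7, :] = 4.0                    # a ridge crossing the toy study area
prob = probable_viewshed(dem, (2, 7), rmse=1.0, n_sims=40)
# cells this side of the ridge are (almost) always in view; cells beyond
# it come into view only in the rare simulations where the ridge 'drops'
```

The same loop, with a stochastic vegetation model perturbing the surface instead of (or as well as) elevation error, implements the vegetation approach suggested below.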
A further issue is that the vast majority of the DEMs on which we base our analyses reflect contem-
porary topography which may have changed (through e.g. erosion, deposition, or later activity) from the
Figure 17.6 (a) A binary viewshed generated from the prehistoric post setting at Avebury depicted in Fig-
ure 17.1 (circled in white) shown over a shaded relief model. (b) A probabilistic viewshed calculated from the
same location (digital elevation model (DEM) errors are modelled as normally distributed with a root mean
square error (RMSE) of 3 m). White areas represent 100% probability, with the probability declining as the
shading becomes darker.
Source: (Incorporates data © Crown database right 2007. An Ordnance Survey/(EDINA) supplied service.)
period we are interested in. DEMs generally also lack the vegetation which may significantly reduce the
level of visibility possible within a landscape, and which can introduce a strong seasonal dynamic. Whilst
contemporary vegetation may be represented in LiDAR data (for example) or can be added to a ‘bare’
DEM (as a series of height stands), it cannot model sufficient detail to identify where views through sparse
stands of trees are possible (particularly in winter), for example. A more obvious problem is not knowing
where, exactly, vegetation grew in antiquity. Here a Monte Carlo approach similar to that used by Fisher
for modelling DEM errors (see earlier) may be fruitful: a stochastic model of the vegetation distribution
can be used to simulate many different modified DEM-plus-vegetation surfaces, and the results summed
to produce an estimate of the probability of view, given the vegetation model.
conditions (fog, mist, haze) may place restrictions on the level of visibility and may change at different
times or seasons (requiring an estimate of the refractive index of the environment to be input).
The most obvious parameters that need to be considered are elevation offsets to represent the height
of the viewing and observed locations. For example, to generate a viewshed for the region visible from
a walkway atop a defensive structure would require an offset to be entered for the viewer height at the
origin location, which will in this case be an estimate of the eye-height of an individual plus the height
of the walkway they are standing on (e.g. Mitcham, 2002). It is also important to remember that LOS
determinations cannot be assumed to be reciprocal (Figure 17.2(b)) so that, to extend the earlier example,
if we are interested instead in establishing where in the landscape a guard on the walkway could be visible
from, we would also need to enter an offset for every other location (i.e. each other cell in the DEM) to
represent the eye-height of the potential viewer looking towards the walkway. This issue was recognised
in some of the earliest systematic archaeological studies of visibility (e.g. Fraser, 1983) and confirmed in
a GIS context by Fisher (1996, p. 1298). Loots (1997) even proposed that two distinct terms were needed
to describe viewsheds depending on what was being modelled: projective (views-from) and reflective
(views-to) though this terminology does not appear to have been widely adopted. A conventional and
widely used estimate of the height of the human eye above the ground is 1.65m, although that should be
considered for each context as, clearly, it may not be an appropriate figure (or may not be the only appro-
priate figure) for a given human population, and would be inappropriate for modelling, for example, the
visual field of children or seated viewers. In some circumstances, we may also wish to model the direction
in which the viewer is looking. If not instructed otherwise, most algorithms will rather artificially assume
that a viewer will rotate through the full 360 degrees whilst viewing at all possible elevations of the head,
much like a terrestrial laser scanner. Needless to say, people rarely ‘look’ in that way. Fortunately, viewing
angles in the horizontal and vertical planes can usually be controlled.
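Restricting the horizontal viewing angle can be done as a simple post-processing mask on any binary viewshed; a sketch of our own (bearing convention and names are illustrative, not those of any particular package):

```python
import numpy as np

def field_of_view_mask(shape, viewer, facing_deg, fov_deg):
    """Boolean mask of cells within a horizontal field of view: True where
    the bearing from `viewer` lies within +/- fov_deg/2 of `facing_deg`
    (degrees clockwise from grid north, i.e. towards decreasing row index)."""
    rows, cols = np.indices(shape)
    dy = viewer[0] - rows                      # positive towards grid north
    dx = cols - viewer[1]                      # positive towards grid east
    bearing = np.degrees(np.arctan2(dx, dy)) % 360.0
    diff = np.abs((bearing - facing_deg + 180.0) % 360.0 - 180.0)
    mask = diff <= fov_deg / 2.0
    mask[viewer] = True                        # the viewer's own cell stays in
    return mask

# A viewer at the centre of a 5 x 5 grid facing east with a 90-degree
# field of view; AND-ing this mask with a binary viewshed restricts the
# result to the direction in which the viewer is looking.
mask = field_of_view_mask((5, 5), (2, 2), facing_deg=90.0, fov_deg=90.0)
```

Vertical (elevation-angle) limits can be imposed in the same way using the vertical angle from the viewer to each cell.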
Acuity is more difficult to model, despite the fact that a number of robust metrics are available to
allow the limits of vision to be established for different distances and scales of target object (Wheatley &
Gillings, 2000; Ogburn, 2006). Being able to see something and being able to recognise what it is that
you are looking at are not necessarily the same thing and we must assume that the quality of eyesight in
the past would have varied in much the same way that it does today. As we have already noted, a host of
dynamic effects such as the weather and air quality can equally impact upon the distance that we can see
and the maximum viewing distance we establish can have a marked impact on the final results generated
(Figure 17.6). It is also the case that there is more to acuity than simply distance, as some targets are easier
to see than others (large, distinctively coloured, moving versus small, camouflaged and still). Although
there are methods for accounting for some of these factors (e.g. the ‘Fuzzy Viewsheds’ discussed earlier),
these factors bring into question the usefulness of many simple binary visibility analyses. With the excep-
tion of horizons, fields of view do not abruptly end, and the precise viewshed possible on different days
and for different viewers will vary.
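A graded, distance-decayed weighting in the spirit of the fuzzy viewshed can be sketched using Ogburn’s distances for a 1 m-wide target; the linear ramp between the two thresholds is our simplification, not Fisher’s exact membership function:

```python
import numpy as np

def acuity_weight(distance, clear=3440.0, limit=6880.0):
    """Graded (fuzzy) visibility weight for a 1 m-wide target: 1.0 out to
    `clear` (Ogburn's 20/20 recognition distance, metres), falling
    linearly to 0.0 at `limit` (his outer limit of acuity)."""
    d = np.asarray(distance, dtype=float)
    return np.clip((limit - d) / (limit - clear), 0.0, 1.0)

weights = acuity_weight([1000, 3440, 5160, 6880, 9000])
# -> 1.0, 1.0, 0.5, 0.0, 0.0 respectively
```

Multiplying a binary viewshed by such weights (computed from a distance surface) replaces the crisp edge with a graded margin, which is precisely the behaviour the preceding paragraph argues for.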
The upshots of all of this are twofold. First, all LOS and viewshed calculations must be carefully con-
sidered, and then the parameters modelled through the appropriate selection of variables such as offsets
and maximum view distances. Second, great care should be taken in the selection and preparation of the
DEM upon which they are based.
of interest to create a new map. That may be a super-set of the viewsheds in which cells that can see
(or can be seen from) one or more of the locations of interest are identified (often called a Multiple
Viewshed) or they may be numerical summaries in which the cells encode the frequency of locations
of interest that are in-view. A consistent terminology for these has not emerged although, where the
number of viewsheds being summed is relatively small (e.g. fewer than 100) and the viewsheds relate to a set of related
locations, these tend to be referred to as Cumulative Viewsheds (Wheatley, 1995; though see Lake,
Woodman, & Mithen, 1998) and statistical analysis of these as Cumulative Viewshed Analysis (CVA)
(see Figure 17.7).
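The two summaries differ only in the operation used to combine the individual binary layers; with toy arrays standing in for real GIS output, the distinction is one line of numpy each (an illustrative sketch, not any package’s workflow):

```python
import numpy as np

# Three toy binary viewsheds, one per location of interest (values are
# random stand-ins for real viewshed output):
rng = np.random.default_rng(42)
viewsheds = rng.random((3, 6, 6)) > 0.5     # shape: (n_sites, rows, cols)

multiple = viewsheds.any(axis=0)      # 'Multiple Viewshed': seen from >= 1 site
cumulative = viewsheds.sum(axis=0)    # per-cell count of in-view sites

# `cumulative` records, for every cell, how many of the locations it is
# (inter)visible with; statistical comparison of site locations against
# this surface is the basis of Cumulative Viewshed Analysis.
```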
Where the goal is to obtain the visibility characteristics of an entire landscape such that, ideally, each
cell in the DEM is iteratively treated as a separate viewpoint, a range of terms have been proposed for the
resulting map: Total Viewsheds (Llobera, 2003); Inherent Viewsheds (Llobera, Wheatley, Steele, Cox, &
Parchment, 2010); visual exposure density (Berry, 1993, p. 169); visibility index (Olaya, 2009, p. 157);
viewgrid, dominance-viewgrid (Lee & Stucky, 1998, p. 893); affordance viewshed (Gillings, 2009); vis-
ibility fields (Eve & Crema, 2014); and visibility surfaces (Caldwell, Mineter, Dowers, & Gittings, 2003).
Incidentally, the same operations can be carried out with AGL layers in order to produce maps of summed
‘hiddenness’ in relation to the sample of viewpoints. The utility of the “Affordance Viewshed” (to pick
one) is twofold – it provides a visual and quantitative summary of some visual property of the landscape
(usually area-of-view) that can reveal subtle patterns in the opportunities afforded by the landscape with
respect to visibility. It also represents the statistical population of that property against which hypothesis
testing of groups of locations is possible, enabling significant patterns to be more rigorously identified
(Figure 17.7).
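At its simplest a Total Viewshed is a loop over every cell; the brute-force sketch below (our own, computationally naive version, which is exactly why the optimised algorithms discussed earlier matter at landscape scale) records area-of-view per cell:

```python
import numpy as np

def line_of_sight(dem, a, b, offset=1.65):
    # True if the ground at cell b is visible from an eye `offset` above cell a
    (r0, c0), (r1, c1) = a, b
    h0, h1 = dem[r0, c0] + offset, dem[r1, c1]
    n = int(max(abs(r1 - r0), abs(c1 - c0)))
    return all(dem[int(round(r0 + i / n * (r1 - r0))),
                   int(round(c0 + i / n * (c1 - c0)))] <= h0 + i / n * (h1 - h0)
               for i in range(1, n))

def total_viewshed(dem, offset=1.65):
    """Area-of-view for every cell: each cell of the DEM is treated in
    turn as a viewpoint and the size of its viewshed recorded."""
    rows, cols = dem.shape
    out = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            out[r, c] = sum(line_of_sight(dem, (r, c), (rr, cc), offset)
                            for rr in range(rows) for cc in range(cols))
    return out

dem = np.zeros((8, 8))
dem[3:5, 3:5] = 5.0          # a central knoll interrupting sight lines
tv = total_viewshed(dem)     # the statistical population against which a
                             # sample of site locations can then be tested
```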
Taking inspiration from Graph theory, an alternative approach is to integrate multiple LOS deter-
minations into a visibility network (e.g. De Montis & Caschili, 2012; Brughmans, Keay, & Earle, 2015;
Brughmans, Waal, Hofman, & Brandes, 2018; Van Dyke, Bocinsky, Windes, & Robinson, 2016; for a
detailed methodological discussion see Brughmans & Brandes, 2017). In these studies individual loca-
tions (viewpoints) are linked by edges if an LOS exists between them. In the same way as viewsheds these
viewpoints can correspond to particular sites of interest or simply locations in the landscape. The edges
linking inter-visible points can be coded in terms of factors such as direction and can also be weighted.
Once generated, the configuration and density of the resultant network can then be analysed to explore
factors such as centrality, neighbourhood size, degree of clustering and mean shortest path length. The
networks can also be used to derive second-order graphs (e.g. where locations lacking direct LOS can be
connected if they share other visible locations) (O’Sullivan & Turner, 2001; Brughmans & Brandes, 2017,
pp. 7–9). As such they offer a powerful set of heuristics and an intriguing alternative to more standard
raster approaches.
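A minimal visibility network can be built directly from pairwise LOS tests; the sketch below (toy terrain and hypothetical site locations of our own) stores the network as an adjacency matrix and derives degree and the second-order linkage described above:

```python
import numpy as np
from itertools import combinations

def line_of_sight(dem, a, b, offset=1.65):
    # eye `offset` above the terrain at BOTH ends: observer-to-observer LOS
    (r0, c0), (r1, c1) = a, b
    h0, h1 = dem[r0, c0] + offset, dem[r1, c1] + offset
    n = int(max(abs(r1 - r0), abs(c1 - c0)))
    return all(dem[int(round(r0 + i / n * (r1 - r0))),
                   int(round(c0 + i / n * (c1 - c0)))] <= h0 + i / n * (h1 - h0)
               for i in range(1, n))

dem = np.zeros((10, 10))
dem[:, 5] = 8.0                               # a ridge splits the toy landscape
sites = [(2, 1), (7, 2), (3, 8), (8, 8)]      # hypothetical site locations

k = len(sites)
A = np.zeros((k, k), dtype=int)               # intervisibility adjacency matrix
for i, j in combinations(range(k), 2):
    A[i, j] = A[j, i] = line_of_sight(dem, sites[i], sites[j])

degree = A.sum(axis=0)                        # how many sites each site can see
# second-order links: pairs without a direct LOS that share a visible third site
second = (A @ A > 0) & (A == 0)
np.fill_diagonal(second, False)
```

In practice such matrices are usually handed to a dedicated graph library for the centrality, clustering and path-length measures mentioned above.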
There are also a host of descriptive properties that can be extracted from viewsheds in order to furnish
information about the perceptual character of a given view. For example, shape, compactness, directional-
ity, eccentricity and degree of fragmentation to name but a few (e.g. Aguiló & Iglesias, 1995; Wheatley &
Gillings, 2000; Llobera, 2003; Trick, 2004). In the case of Cumulative Viewsheds and Total Viewsheds,
these can be treated as visibility surfaces and subjected to a range of geomorphometric analyses in order
to extract descriptive parameters such as roughness and texture (Olaya, 2009; Gillings, 2015). Neighbour-
hood based analyses are also possible in order to move beyond discretely bounded viewpoints to consider
instead the visual properties of chunks of landscape (Brughmans, van Garderen, & Gillings, 2018).
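Compactness is among the simplest of these descriptive properties to compute on a raster viewshed. One common formulation (the isoperimetric quotient; our choice here, not necessarily that of the studies cited) compares area with perimeter:

```python
import numpy as np

def compactness(viewshed):
    """Isoperimetric compactness 4*pi*A / P^2 of a binary viewshed:
    1.0 for a perfect disc, lower for elongated or fragmented views.
    Area A = cell count; perimeter P = count of exposed cell edges."""
    v = viewshed.astype(bool)
    padded = np.pad(v, 1)                      # False border around the map
    perimeter = 0
    for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        neighbour = np.roll(padded, shift, axis=(0, 1))[1:-1, 1:-1]
        perimeter += int((v & ~neighbour).sum())   # edges facing out-of-view
    return 4.0 * np.pi * v.sum() / perimeter ** 2

square = np.zeros((8, 8), dtype=bool)
square[2:6, 2:6] = True
strip = np.zeros((3, 20), dtype=bool)
strip[1, 2:18] = True
print(compactness(square))     # ~0.785 (pi/4): a compact, block-like view
print(compactness(strip))      # ~0.174: an elongated, corridor-like view
```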
As well as end-products in their own right, viewsheds can be fed forward into GIS-based studies of
other phenomena. When applied to the entire landscape, Total Viewsheds offer what is in effect a location-independent global index of visibility that can serve as a key ingredient in the study of landscape
affordances such as visual prominence (Llobera, 1996, 2001; Gillings, 2009; Bernardini, Barnash, & Wong,
2013), concealment (Gillings, 2015) and visual exposure (Llobera, 2003). They can also enrich analyses
of properties not immediately thought of as visual such as liminality (e.g. Gillings, 2017) and movement
(e.g. Murrieta-Flores, 2014), the latter through the use of viewsheds as frictions in the generation of
cost-surfaces and least-cost pathways (Lee & Stucky, 1998; Lu, Zhang, & Fan, 2008; Lock, Kormann, &
Pouncett, 2014; see also Herzog, this volume).
Case study
Figure 17.8 Viewsheds generated for each of the tower-kivas. The green zone represents the view from
ground level and the red the top of the tower. Blue dots indicate Puebloan archaeological sites in the landscapes
of the tower kivas. The radiating buffers extend for 20 km around each site – the maximum viewing range
used for the analyses (Kantner & Hobgood, 2016, Figure 3). A colour version of this figure can be found in
the plates section.
tower kiva would be discernible. To assess the signalling station hypothesis, intervisibility between the two
sites was assessed. In practice two viewsheds were generated for each site, one based upon a 1.7m high
viewer on the ground surface and the second placing the same viewer on top of the tower, reconstructed
to a height of 12m (i.e. 13.7m in total). The complete lack of either direct intervisibility or overlap
between the viewsheds generated (where a relay station could have been placed) led the researchers to
conclude that signalling was not a function.
POINTS TO CONSIDER: it could be argued that the problem of reciprocity would have been better
addressed by also controlling target height offsets – for example, where a viewer on a tower (i.e.
viewing location offset = 13.7m) could see a viewer on a tower (i.e. target offset also of 13.7m). Given
the historically documented use of smoke as a signalling mechanism, a sensitivity analysis would also
have been prudent. This could have taken the form of repeated viewshed calculations with increasing
target heights to ascertain how high a smoke plume would have had to rise in order to have been
visible. Incidentally, the same information could be extracted directly from an AGL layer.
To analyse the defensive hypothesis, a sensitivity analysis was carried out to assess the extent to which
vertical height enhanced the location’s ability to see over long distances (thus providing critical early-
warning of any approach). In practice a series of viewsheds were generated with an incremental increase
in viewing location height from 1.7m (i.e. on the ground) to 17.6m in steps of 2m. By measuring the area
of viewshed within a series of radiating 1km bands (Buffers) around each site the results demonstrated
that beyond a distance of 7km (Kin Ya’a) and 9km (Haystack) increases in maximum viewing distance
were negligible, suggesting that defence was not the driver for constructing the tower kivas.
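The incremental-height procedure can be emulated on a toy one-dimensional transect (the brute-force viewshed, terrain and parameter values are our own illustration; the published study used full 2-D DEMs, 1 km bands and a 20 km radius):

```python
import numpy as np

def viewshed(dem, viewer, offset):
    """Brute-force binary viewshed from `viewer` with eye height `offset`."""
    rows, cols = dem.shape
    r0, c0 = viewer
    h0 = dem[r0, c0] + offset
    out = np.zeros((rows, cols), dtype=bool)
    for r1 in range(rows):
        for c1 in range(cols):
            n = int(max(abs(r1 - r0), abs(c1 - c0)))
            out[r1, c1] = all(
                dem[int(round(r0 + i / n * (r1 - r0))),
                    int(round(c0 + i / n * (c1 - c0)))]
                <= h0 + i / n * (dem[r1, c1] - h0)
                for i in range(1, n))
    return out

dem = np.zeros((1, 20))                       # a 1-D transect, heights in metres
dem[0, 6] = 3.0                               # a low rise east of the viewpoint
viewer = (0, 0)
rows, cols = np.indices(dem.shape)
dist = np.hypot(rows - viewer[0], cols - viewer[1])   # distance in cell units

for offset in (1.7, 3.7, 5.7):                # incrementally raise the viewer
    vs = viewshed(dem, viewer, offset)
    # viewshed area within radiating distance bands (5-cell bands here)
    bands = [int(vs[(dist >= lo) & (dist < lo + 5)].sum()) for lo in (0, 5, 10, 15)]
    print(offset, bands)
```

On this toy transect the ground immediately beyond the rise stays hidden at every height tested, while the far bands only reappear once the viewer is raised sufficiently; plotting band areas against offset is exactly the kind of sensitivity curve the case study reports.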
POINTS TO CONSIDER: no detail was given with regard to target offsets – i.e. what observers
in the tower kivas were expected to provide early warning of. Were they looking for approach-
ing individuals (on foot or horseback for example) or dust clouds generated by groups of such?
Either way, appropriate target heights could have been factored into the analyses. Likewise, the
maximum viewing distance could have been calibrated to reflect the maximum range at which a
person could be discerned against the background. The assumption is that 20km was used again,
but how likely is it that objects 20m in height and 10m in width would have been sneaking up on
the tower kiva?
To investigate the question of status, and enhanced visual presence, views-to the tower kivas were inves-
tigated and cross-referenced against the number of contemporary sites in the surrounding landscape.
The aim was to determine to what extent the tower-structures were a prominent part of daily life in the
landscape; i.e. afforded a tangible visual presence. The results were argued to demonstrate that construc-
tion of the tower kiva resulted in 34% more sites within 5km of Kin Ya’a being able to see the great house
complex. At Haystack the figure was 12.5%. In each case increases beyond 5km were extremely low. This
result was used to argue that the role of the tower kivas was to enhance the visibility of the monumental
great house complex within its immediate community, thus reinforcing its status as a social focus and
political centre. Indeed, the authors likened them to minarets or church steeples.
POINTS TO CONSIDER: once again offsets and acuity are key factors that are not described in the
case-study. There are also other ways in which local visual prominence could have been assessed.
For example, using the workflows described by Llobera (1996) and Bernardini et al. (2013) or
perhaps a views-to Total Viewshed to assess the extent to which the locations of the monument
complexes were more (or less) visible than any other locations in the immediate landscape. Is
there preferential clustering of sites in certain parts of the visual envelope? Does the envelope itself
have any inherent directionality? At present the population of sites surrounding the tower kivas is
also presented as a uniform, homogeneous block (all sites – dots on the map – are the same) with no
sense of chronology or temporality (they were all seemingly in place and active at the same time).
A network approach here would enable the researchers to determine precisely which kinds of site
make up the 34% and 12.5% increases effected by the tower kiva’s height. This approach has been
taken by Van Dyke et al. (2016) in their analyses of great house location through the creation of
inter-visibility networks they term ‘viewnets’. It would also be prudent to take a careful look
at sites that are spatially close yet visually excluded.
This research demonstrates nicely the value of GIS-based approaches to the carefully structured analysis
and exploration of visibility. A series of hypotheses, grounded in clear bodies of archaeological theory,
have been posited and carefully explored. The questions and suggestions we have raised as to where the
study might go next are designed to stress the key point that rather than an end in themselves, the results
of GIS-based visibility studies are always best thought of as the first stage in the analytical process, generating as many provocative and productive questions as they answer.
Conclusion
Looking ahead
Whilst the full range of possible visibility heuristics was defined at an early stage in the development of
GIScience (e.g. Nagy, 1994; Aguiló & Iglesias, 1995) the traditional barrier to realising the full potential of
computational approaches to visibility has been the time it takes to generate them. With the introduction
of optimised algorithms and more efficient architectures this has now been overcome. Archaeology has also
worked hard to address early criticisms of visibility analyses from within the discipline itself that highlighted
the lack of any robust theoretical (or indeed archaeological) rationale for carrying them out (e.g. Rennell,
2012; Brughmans et al., 2015; Gillings, 2017). Whilst the research field is currently a vibrant one, we would
argue that three essential developments need to take place if GIS-based visibility research in archaeology is to
reach its full potential. First, we need to unite what has been a rather fragmented field (based upon 30 or so
years of often piecemeal proof-of-method experiments and developments) into a single place with a coher-
ent nomenclature. In this way we will at the very least shift the balance between true innovation and the
repeated re-discovery of the same good ideas more towards the former (Gillings, 2017). Second, the veracity
of the myriad tweaks and refinements to simple, binary visibility analyses that have been proffered needs
to be evaluated and assessed. Finally, in order to prevent analyses from becoming formulaic and limiting, the
trend towards treating the results of a given visibility analysis as merely the first stage in a process of analysis
rather than the end-product, needs to be embraced and encouraged.
References
Aguiló, M., & Iglesias, E. (1995). Landscape inventory. In E. Martínez-Falero & S. González-Alonso (Eds.), Quantita-
tive techniques in landscape planning (pp. 47–85). Boca Raton: CRC.
Benedikt, M. L. (1979). To take hold of space: Isovists and Isovist fields. Environment and Planning B: Urban Analytics
and City Science, 6, 47–65.
Ben-Moshe, B., Carmi, P., & Katz, M. J. (2008). Approximating the visible region of a point on terrain. GeoInformatica, 12(1), 21–36.
Bernardini, W., Barnash, A., & Wong, M. (2013). Quantifying visual prominence in social landscapes. Journal of
Archaeological Science, 40, 3946–3954.
Berry, J. K. (1993). Beyond mapping: Concepts, algorithms and issues in GIS. Fort Collins: GIS World Books.
Bland, R., Chadwick, A., Haselgrove, C., Mattingly, D., Rogers, A., & Taylor, J. (2019). Iron age and Roman coin hoards
in Britain. Oxford: Oxbow.
Bongers, J., Arkush, E., & Harrower, M. (2012). Landscapes of death: GIS-based analyses of chullpas in the western
Lake Titicaca basin. Journal of Archaeological Science, 39, 1687–1693.
Brughmans, T., & Brandes, U. (2017). Visibility network patterns and methods for studying visual relational Phenom-
ena in archeology. Frontiers in Digital Humanities, 4, 1–17. https://doi.org/10.3389/fdigh.2017.00017
Brughmans, T., Keay, S., & Earl, G. (2015). Understanding inter-settlement visibility in Iron age and Roman Southern
Spain with exponential random graph models for visibility networks. Journal of Archaeological Method and Theory,
22(1), 58–143.
Brughmans, T., van Garderen, M., & Gillings, M. (2018). Introducing visual neighbourhood configuration for total
viewsheds. Journal of Archaeological Science, 96, 14–25. https://doi.org/10.1016/j.jas.2018.05.006
Brughmans, T., Waal, M. S. de, Hofman, C. L., & Brandes, U. (2018). Exploring transformations in Carib-
bean indigenous social networks through visibility studies: The case of late pre-colonial landscapes in East-
Guadeloupe (French West Indies). Journal of Archaeological Method and Theory, 25(2), 475–519. doi:10.1007/
s10816-017-9344-0
Caldwell, D. R., Mineter, M. J., Dowers, S., & Gittings, B. M. (2003). Analysis and visualisation of visibility surfaces
[Poster]. Retrieved from www.geocomputation.org/2003/Papers/Caldwell_Paper.pdf
Chapman, H. (2006). Landscape archaeology and GIS. Stroud: Tempus.
Cummings, V., & Pannett, A. (2005). Island views: The settings of the chambered cairns of southern Orkney. In
V. Cummings & A. Pannett (Eds.), Set in Stone: New approaches to Neolithic monuments in Scotland (pp. 14–24).
Oxford: Oxbow.
De Floriani, L., & Magillo, P. (1994). Visibility algorithms on triangulated digital terrain models. International Journal of Geographical Information Systems, 8(1), 13–41.
De Montis, A., & Caschili, S. (2012). Nuraghes and landscape planning: Coupling viewshed with complex network
analysis. Landscape and Urban Planning, 105, 315–324.
Eve, S., & Crema, E. (2014). A house with a view? Multi-model inference, visibility fields, and point process analysis
of a Bronze Age settlement on Leskernick Hill (Cornwall, UK). Journal of Archaeological Science, 43, 267–277.
Ferreira, C., Andrade, M. V., Magalhães, S. V., Franklin, W. R., & Pena, G. C. (2014). A parallel algorithm for viewshed
computation on grid terrains. Journal of Information and Data Management, 5, 171–180.
Fisher, P. F. (1993). Algorithm and implementation uncertainty in viewshed analysis. International Journal of Geographi-
cal Information Systems, 7(4), 331–347.
Fisher, P. F. (1994). Probable and fuzzy models of the viewshed operation. In M. F. Worboys (Ed.), Innovations in
GIS: Selected papers from the First national conference on GIS research UK (pp. 161–175). London: Taylor and Francis.
Fisher, P. F. (1995). An exploration of probable viewsheds in landscape planning. Environment and Planning B: Planning
and Design, 22, 527–546.
Fisher, P. F. (1996). Extending the applicability of viewsheds in landscape planning. Photogrammetric Engineering &
Remote Sensing, 62(11), 1297–1302.
Franklin, W. R., & Ray, C. K. (1994). Higher isn’t necessarily better: Visibility algorithms and experiments. In
T. Waugh & R. Healey (Eds.), Advances in GIS research: Sixth international symposium on spatial data handling
(pp. 751–770). Edinburgh: Taylor and Francis.
Fraser, D. (1983). Land and society in neolithic Orkney. British Series 117. Oxford: British Archaeological Reports.
Gaffney, V., & Stančič, Z. (1991). GIS approaches to regional analysis: A case study of the island of Hvar. Ljubljana: Znanst-
veni inštitut, Filozofske fakultete.
Gillings, M. (2009). Visual affordance, landscape and the megaliths of Alderney. Oxford Journal of Archaeology, 28(4),
335–356.
Gillings, M. (2012). Landscape phenomenology, GIS and the role of affordance. Journal of Archaeological Method and
Theory, 19(4), 601–611. https://doi.org/10.1007/s10816-012-9137-4
330 Mark Gillings and David Wheatley
Gillings, M. (2015). Mapping invisibility: GIS approaches to the analysis of hiding and seclusion. Journal of Archaeologi-
cal Science, 62, 1–14. http://dx.doi.org/10.1016/j.jas.2015.06.015
Gillings, M. (2017). Mapping liminality: Critical frameworks for the GIS-based modelling of visibility. Journal of
Archaeological Science, 84, 121–128. http://dx.doi.org/10.1016/j.jas.2017.05.004
Haverkort, H., Toma, L., & Zhuang, Y. (2008). Computing visibility on terrains in external memory. ACM Journal
of Experimental Algorithmics, 13, Article 1.5, 1–23.
Hengl, T., & Evans, I. S. (2009). Mathematical and digital models of the land surface. In T. Hengl & H. I. Reuter (Eds.), Geomorphometry: Concepts, software, applications (pp. 31–63). Amsterdam: Elsevier.
Hurtado, F., Löffler, M., Matos, I., Sacristán, V., Saumell, M., Silveira, R., & Staals, F. (2013). Terrain visibility with
multiple viewpoints. In L. Cai, S. Cheng, & T. Lam (Eds.), Algorithms and computation: ISAAC 2013: Lecture notes
in computer science (Vol. 8283, pp. 317–327). Berlin: Springer.
Izraelevitz, D. (2003). A fast algorithm for approximate viewshed computation. Photogrammetric Engineering & Remote
Sensing, 69(7), 767–774.
Jerpåsen, G. B. (2009). Application of visual archaeological landscape analysis: Some results. Norwegian Archaeological
Review, 42(2), 123–145.
Kantner, J., & Hobgood, R. (2016). A GIS-based viewshed analysis of Chacoan tower kivas in the US Southwest:
Were they for seeing or to be seen? Antiquity, 90(353), 1302–1317.
Kaučič, B., & Žalik, B. (2002). Comparison of viewshed algorithms on regular spaced points. In A. Chalmers (Ed.),
Proceedings of the 18th spring conference on computer graphics (pp. 177–183). New York: ACM.
Lake, M., & Ortega, D. (2013). Compute-intensive GIS visibility analysis of the settings of prehistoric stone circles.
In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological space (pp. 213–242). Walnut Creek: Left
Coast Press.
Lake, M., & Woodman, P. (2003). Visibility studies in archaeology. Environment & Planning B: Planning and Design,
30, 689–707.
Lake, M., Woodman, P. E., & Mithen, S. (1998). Tailoring GIS Software for Archaeological Applications: An example
concerning viewshed analysis. Journal of Archaeological Science, 25, 27–38.
Larsen, M. V. (2015). Viewshed algorithms for strategic positioning of vehicles. Norwegian Defence Research Establishment
(FFI): FFI-Rapport 2015/01300.
Lee, J., & Stucky, D. (1998). On applying viewshed analysis for determining least-cost paths on Digital Elevation
Models. International Journal of Geographical Information Science, 12(8), 881–905.
Llobera, M. (1996). Exploring the topography of mind: GIS, social space and archaeology. Antiquity, 70, 612–622.
Llobera, M. (2001). Building past landscape perception with GIS: Understanding topographic prominence. Journal
of Archaeological Science, 28, 1005–1014.
Llobera, M. (2003). Extending GIS-based visual analysis: The concept of the visualscape. International Journal of
Geographical Information Science, 17(1), 25–48.
Llobera, M., Wheatley, D., Steele, J., Cox, S., & Parchment, O. (2010). Calculating the inherent visual structure of a
landscape (inherent viewshed) using high-throughput computing. In F. Niccolucci & S. Hermon (Eds.), Beyond
the artefact: Digital interpretation of the past: Proceedings of CAA2004, Prato, 13–17 April 2004 (pp. 146–151). Buda-
pest: Archaeolingua.
Lock, G., Kormann, M., & Pouncett, J. (2014). Visibility and movement: Towards a GIS-based integrated approach.
In S. Polla & P. Verhagen (Eds.), Computational approaches to the study of movement in archaeology: Theory, practice and
interpretation of factors and effects of long term landscape formation and transformation (pp. 23–42). Topoi – Berlin Studies
of the Ancient World/Topoi – Berliner Studien der Alten Welt, 23. Berlin: De Gruyter.
Loots, L. (1997). The use of projective and reflective viewsheds in the analysis of the Hellenistic City defence system
at Sagalassos, Turkey. Archaeological Computing Newsletter, 49, 12–16.
Loots, L., Nackaerts, K., & Waelkens, M. (1999). Fuzzy viewshed analysis of the Hellenistic City defence system
at Sagalassos, Turkey. In L. Dingwall, S. Exon, V. Gaffney, S. Laflin, & M. van Leusen (Eds.), Archaeology in the
age of the internet, CAA97: Computer applications and quantitative methods in archaeology: Proceedings of the 25th
anniversary conference, University of Birmingham, April 1997. BAR International Series, 750 [CD-ROM]. Oxford:
Archaeopress.
Lopez-Romero Gonzalez de la Aleja, E. (2008). Characterising the evolution of visual landscapes in the late prehistory
of south-west Morbihan (Brittany, France). Oxford Journal of Archaeology, 27(3), 217–239.
Lu, M., Zhang, J., & Fan, Z. (2008). Least visible path analysis in raster terrain. International Journal of Geographical
Information Science, 22(6), 645–656.
Mitcham, J. (2002). In search of a defensible site: A GIS analysis of Hampshire Hillforts. In D. Wheatley, G. Earl, & S. Poppy (Eds.), Contemporary themes in archaeological computing (pp. 73–79). Oxford: Oxbow Books.
Murrieta-Flores, P. (2014). Developing computational approaches for the study of movement: Assessing the role of
visibility and landscape markers in terrestrial navigation during Iberian Late Prehistory. In S. Polla & P. Verhagen
(Eds.), Computational approaches to the study of movement in archaeology: Theory, practice and interpretation of factors and
effects of long term landscape formation and transformation (pp. 99–132). Topoi – Berlin Studies of the Ancient World/
Topoi – Berliner Studien der Alten Welt, 23. Berlin: De Gruyter.
Nackaerts, K., & Govers, G. (1997). A non-deterministic use of a DEM in the calculation of viewsheds. Archaeological
Computing Newsletter, 49, 3–11.
Nagy, G. (1994). Terrain visibility. Computers and Graphics, 18(6), 763–773.
Nutsford, D., Reitsma, F., Pearson, A., & Kingham, S. (2015). Personalising the viewshed: Visibility analysis from the
human perspective. Applied Geography, 62, 1–7.
Ogburn, D. E. (2006). Assessing the level of visibility of cultural objects in past landscapes. Journal of Archaeological
Science, 33, 405–413.
Olaya, V. (2009). Basic land surface parameters. In T. Hengl & H. I. Reuter (Eds.), Geomorphometry: Concepts, software,
applications (pp. 141–169). Amsterdam: Elsevier.
O’Sullivan, D., & Turner, A. (2001). Visibility graphs and landscape visibility analysis. International Journal of Geo-
graphical Information Science, 15(3), 221–237.
Paliou, E. (2011). The communicative potential of Theran murals in Late Bronze Age Akrotiri: Applying viewshed analysis in 3D townscapes. Oxford Journal of Archaeology, 30(3), 30–33.
Paliou, E. (2013). Reconsidering the concept of visualscapes: Recent advances in three-dimensional visibility analysis.
In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological spaces (pp. 243–263). Walnut Creek: Left
Coast Press.
Paliou, E. (2014). Visibility analysis in 3D built spaces: A new dimension to the understanding of social space.
In S. Polla, U. Lieberwirth, & E. Paliou (Eds.), Spatial analysis and social spaces: Interdisciplinary approaches to
the interpretation of prehistoric and historic built environments (pp. 91–114). Berlin: De Gruyter. http://dx.doi.
org/10.1515/9783110266436.91
Rennell, R. (2012). Landscape, experience and GIS: Exploring the potential for methodological dialogue. Journal of
Archaeological Method and Theory, 19(4), 510–525.
Reuter, H. I., Hengl, T., Gessler, P., & Soille, P. (2009). Preparation of DEMs for geomorphometric analysis. In T.
Hengl & H. I. Reuter (Eds.), Geomorphometry: Concepts, software, applications (pp. 141–169). Amsterdam: Elsevier.
Risbøl, O., Petersen, T., & Jerpåsen, G. (2013). Approaching a mortuary monument landscape using GIS- and ALS-generated 3D models. International Journal of Heritage in the Digital Era, 2(4), 509–525.
Ruestes Bitriá, C. (2008). A multi-technique GIS visibility analysis for studying visual control of an Iron age land-
scape. Internet Archaeology, 23. http://intarch.ac.uk/journal/issue23/4/
Ruggles, C., & Medyckyj-Scott, D. (1996). Site location, landscape visibility, and symbolic astronomy: A Scottish case
study. In H. D. G. Maschner (Ed.), New methods, old problems: Geographic information systems in modern archaeologi-
cal research (pp. 127–146). Carbondale: Center for Archaeological Investigations Occ. Paper No. 23, Southern
Illinois University.
Sakaguchi, T., Morin, J., & Dickie, R. (2010). Defensibility of large prehistoric sites in the Mid-Fraser region on the
Canadian Plateau. Journal of Archaeological Science, 37(6), 1171–1185.
Suleiman, W., Joliveau, T., & Favier, E. (2013). A new algorithm for 3D isovists. In S. Timpf & P. Laube (Eds.),
Advances in spatial data handling: Advances in geographic information science (pp. 157–173). Berlin: Springer.
Tilley, C. (1994). A phenomenology of landscape. London: Berg.
Tilley, C. (2004). The materiality of stone: Explorations in landscape phenomenology. London: Berg.
Toma, L. (2012). Viewsheds on terrains in external memory. SIGSPATIAL Special, 4(2), 13–17.
Trick, S. (2004). Bringing it all back home: The practical visual environments of Southeast European Tells. Internet
Archaeology, 16. https://doi.org/10.11141/ia.16.7
Turner, A., Doxa, M., O’Sullivan, D., & Penn, A. (2001). From isovists to visibility graphs: A methodology for the
analysis of architectural space. Environment and Planning B: Urban Analytics and City Science, 28(1), 103–121.
Van Dyke, R., Bocinsky, R. K., Windes, T. C., & Robinson, T. J. (2016). Great houses, shrines, and high places: Inter-
visibility in the Chacoan world. American Antiquity, 81(2), 205–230.
van Kreveld, M. (1996). Variations on sweep algorithms: Efficient computation of extended viewsheds and class
intervals. In M. J. Kraak & M. Molenaar (Eds.), Proceedings of the 7th international symposium on spatial data handling
(pp. 15–27). Delft: TU Delft.
Wang, J., Robinson, G., & White, K. (2000). Generating viewsheds without using sightlines. Photogrammetric Engineer-
ing & Remote Sensing, 66(1), 87–90.
Webster, D. (1999). The concept of affordance and GIS: A note on Llobera (1996). Antiquity, 73, 915–917.
Wheatley, D. W. (1995). Cumulative viewshed analysis: A GIS-based method for investigating intervisibility, and
its archaeological application. In G. Lock & Z. Stančič (Eds.), Archaeology and geographical information systems: A
European perspective (pp. 171–186). London: Taylor and Francis.
Wheatley, D. W., & Gillings, M. (2000). Vision, perception and GIS: Developing enriched approaches to the study of
archaeological visibility. In G. Lock (Ed.), Beyond the map (pp. 1–27). Amsterdam: IOS Press.
Williamson, C. G. (2016). Mountain, myth and territory: Teuthrania as focal point in the landscape of Pergamon. In
J. McInerney & I. Sluiter (Eds.), Valuing landscape in classical antiquity: Natural environment and cultural imagination
(pp. 70–100). Leiden: BRILL.
Wright, D. K., MacEachern, S., & Lee, J. (2014). Analysis of feature intervisibility and cumulative visibility using
GIS, Bayesian and spatial statistics: A study from the Mandara mountains, Northern Cameroon. PLoS One, 9(11),
e112191. doi:10.1371/journal.pone.0112191
Xu, Z., & Yao, Q. (2009). A novel algorithm for viewshed based on digital elevation model. 2009 Asia-Pacific Confer-
ence on Information Processing, 2, 294–297.
18
Spatial analysis based on cost functions
Irmela Herzog
Introduction
Many archaeologists are no longer satisfied with presenting distribution maps; instead, they aim to identify the patterns of movement that explain how the people and artefacts of the period considered reached the sites (e.g. Rademaker, Reid, & Bromley, 2012). For most regions and periods, the distribution of artefacts provides the most important evidence for the movement of people. Archaeological remains of ancient paths, roads and shipwrecks are fairly rare and in most cases indicate only small sections of the original trajectory. Moreover, dating land routes is often difficult owing to their continuous use after initial creation and the absence of diagnostic finds. However, the archaeological record of human movement can sometimes be supplemented by historical sources.
Each model of past movement based on historical and archaeological evidence nowadays relies, implicitly or explicitly, on a cost function estimating the costs of movement in terms of time, calories or some other currency for the study area and period considered. Evidence for the popularity of such approaches in archaeology comes not only from numerous case studies published since 2000 but also from several sessions at the annual Computer Applications in Archaeology (CAA) conference dealing with this subject, as well as from two edited volumes with contributions focusing solely on least-cost methods or applications (Polla & Verhagen, 2014; White & Surface-Evans, 2012).
Two popular GIS-based areas of spatial analysis in archaeology are based on cost functions: site catch-
ments and least-cost paths. A site catchment is the region accessible from a site, and often archaeological
studies analyse the resources within this region (Conolly & Lake, 2006, p. 214). A least-cost path (LCP)
is, ideally, the route that minimizes the costs of movement between two given locations (Conolly & Lake, 2006, pp. 252–255, 294).
In fact, the most basic application of a cost function is the generation of a least-cost site catchment
(LCSC). The LCSC includes all areas that can be reached by expending less than a user-selected cost limit.
The term isochrone is often used for the boundary when costs are measured in terms of time. According
to Wheatley and Gillings (2002, p. 159), the concept of LCSC was derived from defining the exploitation
territory of a site. Beyond the boundary of this territory the costs of exploitation exceed the benefit.
This concept is closely related to time geography introduced by Mlekuz (2013) into archaeological least-
cost modelling. Site catchment analysis for foraging societies mainly focuses on the types and quantities
of resource areas within each catchment zone (e.g. Surface-Evans, 2012). Most publications presenting
site catchments for sedentary agrarian cultures study the potentials for crop production, in terms of soil,
topography, slope etc. (e.g. Korczyńska, Cappenberg, & Kienlin, 2015). Wheatley and Gillings (2002,
p. 160) refer to a widely cited paper published in 1970 suggesting a cost limit of a 1-hour walk for a sedentary agricultural site and 2 hours for a herding/hunting community. LCSCs derived from several cost limits may be appropriate if each settlement is surrounded by rings of different utilisation, as described in the early 19th century by von Thünen’s model of rural land use (Waugh, 2002, pp. 471–475).
For instance, Posluschny (2010) uses the popular Tobler hiking function (Tobler, 1993) with two time limits, 60 and 15 minutes, the latter delimiting the area of daily farming activities. For Posluschny’s
study area in southwestern Germany, early Iron Age settlements with overlapping catchments at the 15-minute scale are probably not contemporary; catchment overlap may thus indicate problems either with the dating or with the selected cost limit. The aim of Gaffney and Stančič (1992) was to define realistic, mutually exclusive exploitation areas for the seven principal hillforts on the island of Hvar, Croatia; to this end, they calculated LCSCs based on a 90-minute walking-time limit. For a project reconstructing land-use patterns
of sites, catchments may define the survey area (Peeples, Barton, & Schmich, 2006). Comparing catch-
ment sizes may provide insights into the function of settlements. For instance, the study of Posluschny
(2010) mentioned above compares the catchment sizes of early Iron Age princely sites with those of non-princely settlements of the same period and concludes that agriculture was more important for the ordinary settlements. Site catchment analysis was introduced by processual archaeology (Conolly & Lake, 2006, p. 209) with a focus on economic costs, although it is also possible to include social aspects such as visibility or taboo zones in a cost model (Table 18.1). Lee and Stucky (1998) provide a comprehensive overview of approaches for including viewsheds in least-cost calculations. Because the LCSC comprises all LCPs requiring the catchment’s cost limit or less, it can be regarded as the ‘potential path area’ (Mlekuz, 2013).
In archaeological studies, LCPs often provide reconstructions of ancient routes or route sections (e.g. Chapman, 2006, pp. 110–111; Herzog, 2013e; Rademaker et al., 2012; Verhagen & Jeneson, 2012); for example, Rogers, Collet, and Lugon (2015) calculate LCPs in an attempt to predict high mountain passes used in prehistoric times. LCPs may also be applied to identify the principal factors governing the
construction of known roads or road segments (e.g. Bell & Lock, 2000; Fovet & Zakšek, 2014; Güimil-
Fariña & Parcero-Oubiña, 2015; van Lanen, 2017, pp. 123–134). If LCPs coincide with known roads
only after forcing the LCP to visit an intermediate location, this is evidence for the importance of this
additional node (e.g. Güimil-Fariña & Parcero-Oubiña, 2015).
In most studies, a set of points is connected by LCPs (e.g. Canosa-Betés, 2016). Alternatively, LCPs in
all directions can be constructed starting from a given site, resulting in focal mobility networks (Fábrega
Álvarez & Parcero Oubiña, 2007; Herzog, 2013c; Llobera, Fábrega-Álvarez, & Parcero-Oubiña, 2011;
Lynch & Parcero-Oubiña, 2017). If the movement costs depend on topography, topographic data (mostly
a digital elevation or surface model) with adequate resolution is required, whereas in ancient cities, each
house is a barrier and should be modelled accordingly (Branting, 2007). Often the reconstruction of past
routes by LCPs is the basis of further research (e.g. Hudson, 2012).
Many cultural groups preferred site locations close to ancient roads or paths (e.g. Fovet & Zakšek, 2014); therefore, successful road reconstruction often allows predictive modelling of road-related sites such as mansiones, i.e. resting places along Roman roads, but also of archaeological features such as rock art or burial mounds. Another focus has been on what might be termed ‘natural’ pathways in a given
landscape. An early example of a site-prediction approach based on pathway reconstruction using a cost model was presented by Bellavia (2002), who sought to derive “natural pathways” from a digital elevation
Table 18.1 Cost components applied in selected archaeological least-cost studies published in 2010 or later.
Study | Slope-based cost function | Additional cost components
Canosa-Betés (2016) | Tobler (1993); walker cost function of Llobera and Sluckin (2007); Herzog (2013a, based on Minetti, Moia, Roi, Susta, and Ferretti, 2002) | 4 categories of water courses (breadth: 200 m, 150 m, 50 m, and 25 m); no extra costs for possible locations of fords or bridges
Fovet and Zakšek (2014) | 3rd degree polynomial based on Minetti et al. (2002) | Visibility, based on a variable similar to sky view
Güimil-Fariña and Parcero-Oubiña (2015) | Tobler (1993); Pandolf, Givoni, and Goldman (1977); Herzog (2013a, based on Minetti et al., 2002); walker cost function of Llobera and Sluckin (2007) | Penalty for crossing rivers equivalent to ascending a 15° gradient
Groenhuijzen and Verhagen (2017) | Velocity estimate derived from Pandolf et al. (1977), assuming constant values for metabolic rate, weight and load | Terrain coefficients based on Soule and Goldman (1972); coefficient 20 for rivers and streams
Herzog (2013e) | Vehicle cost function, Herzog (2013a) | Avoiding wet soils including streams; lower costs for fords
Korczyńska et al. (2015) | Tobler (1993) | none
Van Lanen (2017) | Slope classes based on natural breaks; slopes > 10% are considered impassable | Terrain classification: factor 1.2 for higher sandy heath land, 1.8 for lower wetlands; groundwater level
Lynch and Parcero Oubiña (2017) | Walker cost function of Llobera and Sluckin (2007) | Impedance factor 2 for areas from which no high mountain top is visible
Posluschny (2010) | Tobler (1993) | none
Rademaker et al. (2012) | Pandolf et al. (1977) with various values for variables W, L, V (see Table 18.2) | Terrain coefficients based on Soule and Goldman (1972)
Rogers et al. (2015) | Tobler (1993); alternatively: Swiss 15th degree polynomial | Landcover
Surface-Evans (2012) | Tobler (1993) | none
Verhagen and Jeneson (2012) | Tobler (1993) | Alternative to slope: visibility based on low-pass filtered openness
model (DEM) in several areas of the UK including Stonehenge. Some evidence has also been published
for animals travelling on least-effort routes (Ganskopp, Cruz, & Johnson, 2000), so if early hunters followed the paths of animals, as has been suggested by some authors (e.g. Whitley & Burns, 2008), they most probably walked on LCPs.
An appropriate cost function determines the costs of movement in the region studied and should take
the means of transportation available to the people living at that time into account, e.g. the use of pack
or draft animals, wheeled vehicles or boats. Nearly all archaeological case studies applying cost functions
include the cost factor slope, often combined with factors depending on soil, land use, the presence of
streams, or visibility (Table 18.1; cf. Herzog, 2014b for some additional archaeological LCSC and LCP
publications).
Method
Overview
The initial step in LCSC and LCP calculation is deciding on the principal factors governing movement costs and establishing an appropriate cost function that combines the costs of these factors.
The next step is the creation of an accumulated cost surface (ACS), that is, a raster grid storing the costs
of movement from the origin to every other cell in the raster grid. The ACS is normally calculated by
spreading out from the origin and accumulating the costs of the cells as each is visited. For LCSC, the
origin is the site location. For LCPs, the origin is one of the two locations to be connected. The LCSC
is derived from the ACS by stopping the spreading process for cells whose accumulated costs exceed the
predefined cost limit. These cells form the boundary of the catchment. Alternatively, an isoline at the
cost limit value may be derived from the ACS. The LCP is derived from the ACS by backtracking from
the target location to the origin. These three steps will be described in more detail in the next sections.
Finally, some validation and analysis of the stability of the outcomes should be included in each study creating LCSCs or LCPs; this is discussed in the Conclusion below.
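The spreading process described above can be sketched as a Dijkstra-style accumulation over a raster grid. This is a minimal illustration, not the implementation used in any of the studies cited: the 3 × 3 grid, the eight-cell neighbourhood and the simple isotropic cost model (entering a cell costs that cell's value, scaled by the square root of 2 for diagonal moves) are assumptions made for the sketch.

```python
import heapq

def accumulated_cost_surface(cost, origin):
    """Spread out from `origin`, accumulating movement costs.

    `cost` is an isotropic cost grid (list of lists); the returned
    accumulated cost surface (ACS) stores, for every cell, the minimum
    cost of reaching it from the origin."""
    rows, cols = len(cost), len(cost[0])
    acs = [[float("inf")] * cols for _ in range(rows)]
    acs[origin[0]][origin[1]] = 0.0
    candidates = [(0.0, origin)]              # the candidate set
    while candidates:
        c, (r, q) = heapq.heappop(candidates)
        if c > acs[r][q]:
            continue                          # stale entry, skip it
        for dr in (-1, 0, 1):                 # visit all 8 neighbours
            for dq in (-1, 0, 1):
                if dr == 0 and dq == 0:
                    continue
                nr, nq = r + dr, q + dq
                if 0 <= nr < rows and 0 <= nq < cols:
                    step = cost[nr][nq] * (2 ** 0.5 if dr and dq else 1.0)
                    if c + step < acs[nr][nq]:
                        acs[nr][nq] = c + step
                        heapq.heappush(candidates, (c + step, (nr, nq)))
    return acs

grid = [[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]]
acs = accumulated_cost_surface(grid, (1, 1))  # origin = centre cell
```

An LCSC then follows by retaining only those cells whose ACS value stays below the chosen cost limit.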
Figure 18.1 Cost functions estimating walking time; on the x-axis downhill slopes are negative.
most paths are used in both directions so that a cost function averaging the costs of movement in
both directions seems appropriate in many situations. By averaging, the asymmetric cost curve is
converted to a symmetric curve (Herzog, 2013a, Figure 18.2). Likewise, if the load carried by a
descending walker, pack animal or vehicle differs from the load on the way up, the cost functions
used should vary accordingly. Note that the slope-dependent cost functions do not include the costs
for climbing stairs or ladders.
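The averaging step can be illustrated with the widely used Tobler (1993) hiking function. Treating time per kilometre as the reciprocal of walking speed is a common convention; the sketch below, with slope expressed as a dimensionless gradient, is not a reproduction of any cited author's own code.

```python
import math

def tobler_speed(slope):
    """Walking speed in km/h for a gradient dh/dx (Tobler, 1993)."""
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))

def time_cost(slope):
    """Hours needed to cover one kilometre at the given gradient."""
    return 1.0 / tobler_speed(slope)

def symmetric_cost(slope):
    """Average of uphill and downhill costs (cf. Herzog, 2013a):
    the result is the same whichever direction the path is walked."""
    return 0.5 * (time_cost(slope) + time_cost(-slope))

# Tobler's curve is asymmetric: a 10% climb costs more time than
# the corresponding descent ...
assert time_cost(0.10) > time_cost(-0.10)
# ... whereas the averaged cost no longer depends on direction.
assert symmetric_cost(0.10) == symmetric_cost(-0.10)
```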
Several publications provide terrain factors that model reduced speed or energy consumption of a
walker (Table 18.3). In many least-cost studies, water is considered a barrier; for instance, Rogers et al. (2015) select a factor of 5 for traversing water courses and 499.5 for water bodies. In general, the effort
needed for crossing a stream depends on many factors including width, depth and current (Langmuir,
2004, pp. 185–199).
The formula of Pandolf et al. (1977) shows one way of combining cost components, i.e. slope and a
terrain factor. It consists of a term depending on weight and load only (estimating the energy consump-
tion of standing) plus the extra energy required for movement, and only the latter term is multiplied by
the terrain coefficient. The formula presented by Givoni and Goldman (1971), which is also used by Soule and Goldman (1972), is the product of the terrain coefficient and a factor depending on weight, load,
velocity and slope. Alternatively, a (weighted) sum of the two or more cost components may be applied
(e.g. Fovet & Zakšek, 2014), if the cost components are independent. Additional approaches for combin-
ing cost components and their drawbacks are discussed by Herzog (2013a).
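This structure, a standing term plus a movement term of which only the latter is multiplied by the terrain coefficient, can be written out explicitly. The sketch below reproduces the Pandolf et al. (1977) equation in its commonly cited form (W = body weight in kg, L = load in kg, V = velocity in m/s, G = gradient in per cent, eta = terrain coefficient; the result is a metabolic rate in watts); the parameter values in the example are arbitrary.

```python
def pandolf(W, L, V, G, eta):
    """Metabolic rate in watts after Pandolf et al. (1977), as commonly
    cited. Only the movement term is scaled by the terrain coefficient
    eta; the standing/load term is not."""
    standing = 1.5 * W + 2.0 * (W + L) * (L / W) ** 2
    movement = eta * (W + L) * (1.5 * V ** 2 + 0.35 * V * G)
    return standing + movement

# An unloaded 70 kg walker at 1.2 m/s on level ground:
base = pandolf(70.0, 0.0, 1.2, 0.0, 1.0)
# Doubling the terrain coefficient scales only the movement term,
# so the standing component (1.5 * 70 = 105 W) is unaffected:
rough = pandolf(70.0, 0.0, 1.2, 0.0, 2.0)
```

Doubling eta thus doubles only the movement-related share of the energy cost, exactly as described above.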
The time and energy consumption required for walking a path also depend on factors that vary throughout the year and on weather conditions: rain, muddy paths due to recent rain, snow, storm, fog,
Figure 18.2 Cost functions estimating walking time: uphill and downhill costs are averaged.
high humidity, very high or very low temperatures may slow down progress considerably. Moreover,
movement costs depend on the sex, age, weight, load and fitness of the walker as well as on the number of
hikers in the walking group. Therefore, the stability of any least-cost result should be analysed by varying
the model parameters.
Estimating the costs of movement by boat or ship is even more difficult than estimating the costs of
walking, due to differences in boat or ship technology, seasonal variations, currents and substantial changes
of the rivers or coastlines since the period considered. Some estimates for the costs of water transport
provided by different studies can be found in Herzog (2014a).
Animals play different roles in path creation: according to Lay (1992, pp. 6–7), the first human ways had an animal-path origin (cf. Whitley & Burns, 2007). Moreover, in some areas special paths for herds existed in the past. Horse riding and the use of pack or draft animals also had an impact on the velocity of the traveller. It is very difficult to find appropriate cost functions taking animal movement into account due to
the large variety within each species (e.g. oxen or horses) and the high number of possible species to be
considered (Ganskopp & Vavra, 1987).
Table 18.3 Published terrain factors for cost functions measuring time (unit: hour) or energy consumption (unit:
joule) of a walker. Note: ‘m asl’ refers to metres above sea level.
1.00 Blacktop roads and improved dirt paths hour MIDE: París Roche (2008), p. 11
1.00 Pavement (cement) hour de Gruchy, Caswell, and Edwards (2017)
1.03 Lawn grass hour de Gruchy et al. (2017)
1.19 Loose beach sand hour de Gruchy et al. (2017)
1.24 Disturbed ground (former stone quarry) hour de Gruchy et al. (2017)
1.25 horse riding paths, flat trails and meadows hour MIDE: París Roche (2008), p. 11
1.35 Tall grassland (with thistle and nettles) hour de Gruchy et al. (2017)
1.50 Open space above the treeline i.e. 2000 m asl hour Rogers et al. (2015)
1.67 Bad trails, stony outcrops and river beds hour MIDE: París Roche (2008), p. 11
1.67 Off-path hour Tobler (1993)
1.79 Bog hour de Gruchy et al. (2017)
2.00 Off-path areas below the treeline including hour Rogers et al. (2015)
pastures, forests, heathland, beaches etc.
2.50 Rock hour Rogers et al. (2015)
5.00 Swamp, water course hour Rogers et al. (2015)
1.00 Asphalt/blacktop joule de Gruchy et al. (2017)
1.10 Dirt road or grass joule de Gruchy et al. (2017)
1.20 Hard-surface road joule Givoni and Goldman (1971)
1.20 Light brush joule de Gruchy et al. (2017)
1.30 Ploughed field joule de Gruchy et al. (2017)
1.50 Ploughed field joule Givoni and Goldman (1971)
1.50 Heavy brush joule de Gruchy et al. (2017)
1.60 Hard-packed snow joule de Gruchy et al. (2017)
1.80 Swampy bog joule de Gruchy et al. (2017)
1.80 Sand dunes joule Givoni and Goldman (1971)
2.10 Loose sand joule de Gruchy et al. (2017)
Figure 18.3 Simple example of an isotropic cost grid (left) and the corresponding accumulated cost surface
(ACS) (right). The origin of the accumulation process is the centre of the cost grid (cost value = 10).
[Flowchart of the ACS accumulation algorithm:
Step 2: Find the cell centre with the lowest ACS value in the candidate set. This is now the current position.
Step 3: For each neighbouring cell centre of the current position, check and, where appropriate, update its ACS value. Once all neighbours have been checked: if the candidate set is empty, the algorithm ends; otherwise proceed to Step 2.]
For LCSCs, the algorithm has to be modified: in Step 3, only those cell centres whose ACS value is below the predefined cost limit are inserted in the candidate set.
For LCP generation modifications of the algorithm are also needed. Firstly, whenever a new ACS value
is assigned to a cell centre in Step 3, the backlink for this cell is stored, i.e. the current position. Secondly,
if the target of the LCP is selected in Step 2, the LCP is generated by connecting the backlinks from the
target to the origin. An optional procedure may save computation time: Initially the costs of the most
direct connection to the target may be calculated. Only those possible candidates should be inserted in
the set, for which the sum of the current ACS value and the minimum costs for the straight-line distance
to the target are below this initial cost limit.
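The accumulation procedure with backlinks can be sketched as a standard Dijkstra loop over the cell graph. This is a minimal isotropic sketch, assuming 8 neighbours and the common convention that a move costs its length times the mean of the two cell values; neither detail is fixed by the text, and the function names are illustrative.

```python
import heapq
import math

def dijkstra_acs(cost, origin, target=None):
    """Accumulated cost surface (ACS) over an isotropic cost grid.
    Each cell links to its 8 nearest neighbours; a move costs its
    length times the mean of the two cell values. Backlinks are
    stored so a least-cost path (LCP) can be read off afterwards."""
    rows, cols = len(cost), len(cost[0])
    acs = {origin: 0.0}
    backlink = {}
    heap = [(0.0, origin)]             # candidate set, keyed by ACS value
    while heap:
        d, cur = heapq.heappop(heap)   # Step 2: lowest ACS value
        if d > acs.get(cur, math.inf):
            continue                   # stale heap entry
        if cur == target:
            break                      # early exit once the target is reached
        r, c = cur
        for dr in (-1, 0, 1):          # Step 3: check all neighbours
            for dc in (-1, 0, 1):
                if dr == dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                step = math.hypot(dr, dc)
                new = d + step * (cost[r][c] + cost[nr][nc]) / 2.0
                if new < acs.get((nr, nc), math.inf):
                    acs[(nr, nc)] = new
                    backlink[(nr, nc)] = cur
                    heapq.heappush(heap, (new, (nr, nc)))
    return acs, backlink

def read_lcp(backlink, origin, target):
    """Connect the backlinks from the target to the origin."""
    path = [target]
    while path[-1] != origin:
        path.append(backlink[path[-1]])
    return path[::-1]

# 5 x 5 grid of uniform cost 10, origin in the centre (cf. Figure 18.3)
grid = [[10] * 5 for _ in range(5)]
acs, back = dijkstra_acs(grid, (2, 2))
print(acs[(2, 4)])                    # two horizontal moves: 2 * 1 * 10 = 20
print(read_lcp(back, (2, 2), (0, 0)))
```

The optional target-based pruning mentioned above would add a straight-line lower bound to `new` before pushing, discarding candidates that cannot beat the initial direct estimate.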
The results of the LCP algorithm are independent of the units of measurement chosen, i.e. they do
not change if all costs are multiplied by a constant factor. So applying the multiplier of 0.8 for horse-
riding suggested by Tobler (1993) will result in the same calculated paths as the initial formula for hikers.
Similarly, the LCPs generated for male and female walkers based on one of the formulas proposed by
Irmischer and Clarke (2017; no. 3 in Table 18.2) do not differ.
Both the conversion of vector data to a cost raster and the subsequent conversion of this raster to a
graph may produce unexpected results. These issues are illustrated in Figures 18.5 and 18.6. Figure 18.5(a)
shows an isotropic cost grid with a linear barrier (cost value of 100) in an area of uniform costs (white
cells are assigned a cost value of 10). The grey cells indicate three ways of converting the barrier to raster
cell values: including all cells whose centre is within a 5, 7.5 and 10 m distance from the line.
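The centre-within-radius conversion can be sketched in a few lines of Python; this is an illustrative reconstruction, not the chapter's own code, using a point-to-segment distance test for each cell centre.

```python
import math

def dist_point_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def rasterise_barrier(rows, cols, cell, seg, radius, cost_bg=10, cost_bar=100):
    """Assign the barrier cost to every cell whose CENTRE lies within
    `radius` of the line; all other cells keep the background cost."""
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            centre = ((c + 0.5) * cell, (r + 0.5) * cell)
            row.append(cost_bar if dist_point_segment(centre, *seg) <= radius else cost_bg)
        grid.append(row)
    return grid

# vertical barrier at x = 52 m on a 10 m grid: a 5 m radius catches one
# column of centres, a 7.5 m radius catches two (cf. Figure 18.5(a))
seg = ((52.0, 0.0), (52.0, 100.0))
narrow = rasterise_barrier(10, 10, 10.0, seg, 5.0)
wide = rasterise_barrier(10, 10, 10.0, seg, 7.5)
print(sum(v == 100 for v in narrow[0]), sum(v == 100 for v in wide[0]))
```

The example makes the sensitivity concrete: the same vector line yields barriers of different raster widths depending on the chosen radius.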
Figure 18.5 (a) Simple isotropic cost grid, (b) possible moves starting at the origin, (c) traversing the barrier
cells by long moves, (d) subdividing long moves.
Spatial analysis based on cost functions 343
Figure 18.6 ACS results based on the cost grid shown in Figure 18.5(a). The outcomes of an inadequate barrier radius of merely 5 m are shown in (a–c); for (d–f) an adequate barrier radius of 7.5 m was
chosen. The N values indicate the number of nearest neighbouring cells that can be reached from the origin
without detour. Images (g) and (h) illustrate the impact of different N values by grids showing the differences
in accumulated costs.
Three ways of converting grid cells to a graph have been used in archaeological least-cost calculations:
(a) linking each cell with its 8 nearest neighbours (queen moves in Figure 18.5(b–d)), (b) ensuring that
all cells within a 24 cell neighbourhood can be reached without detour (queen and knight moves), or
(c) connecting to cells in all directions in a 48 cell neighbourhood (all lines starting from the origin in
Figure 18.5(b)).
Figure 18.5(c) illustrates how a simple diagonal move may traverse the 5 m radius barrier without
paying due costs; knight moves can jump over the 7.5 m radius barrier, and the long 3–1 and 3–2 moves cross the 10 m radius barrier without touching a high-cost cell. This issue can be avoided by cutting the long moves into two or three sub-moves respectively, as indicated by the lines with arrows in
Figure 18.5(d). The cost values of the cut points are the weighted averages of the two values stored in the
cells connected by the arrow lines, with the weights depending on the distance to the cut point.
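The cut-point rule can be sketched for a single knight move. This is illustrative code under one reading of the rule: the midpoint of a (1,2) move lies on the edge between two cells, so its cost value is the distance-weighted average of those two cells (here both equidistant, so a plain mean), and each sub-move is costed by the mean of its endpoint values.

```python
import math

def knight_move_cost(grid, a, b):
    """Cost of a knight move a -> b, cut at its midpoint so that the
    intervening high-cost cells are paid for (cf. Figure 18.5(d))."""
    (ar, ac), (br, bc) = a, b
    mid_r, mid_c = (ar + br) / 2.0, (ac + bc) / 2.0
    # the two cells whose centres flank the midpoint of the move
    f1 = (math.floor(mid_r), math.floor(mid_c))
    f2 = (math.ceil(mid_r), math.ceil(mid_c))
    v_cut = (grid[f1[0]][f1[1]] + grid[f2[0]][f2[1]]) / 2.0
    half = math.hypot(br - ar, bc - ac) / 2.0
    return (half * (grid[ar][ac] + v_cut) / 2.0
            + half * (v_cut + grid[br][bc]) / 2.0)

open_grid = [[10, 10, 10], [10, 10, 10]]
barrier = [[10, 100, 10], [10, 10, 10]]   # high-cost cell in the move's way
print(knight_move_cost(open_grid, (0, 0), (1, 2)))   # sqrt(5) * 10, approx. 22.4
print(knight_move_cost(barrier, (0, 0), (1, 2)))     # approx. 72.7: the barrier is paid
```

Without the cut, both moves would cost the same, since neither endpoint touches the barrier cell.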
Figure 18.6(a–c) depicts the ACS based on the cost grid with the 5 m radius barrier shown in Fig-
ure 18.5(a). Due to the diagonal moves that jump over the barrier, some ACS cells beyond the barrier
have a lower accumulated cost value than any barrier cell. This unwanted effect is avoided with the 7.5 m
barrier (Figure 18.6(d–f)). The cut point implementation of the long moves ensures that due costs are
paid for the barrier. The correct shortest paths from the origin to the cells in the western half of the cost grid are straight lines, so that cost distance and straight-line distance should coincide in this area, i.e. cells
of equal cost distance from the origin should form a semicircle in the west. The image for N=48 shows
more of a semi-circular structure than the N=8 image. In fact, by increasing the number of move direc-
tions, the detours necessary for reaching the target locations in general are made smaller. This is illustrated
by the difference images in Figure 18.6(g) and 18.6(h): in the uniform terrain area the largest difference
is about 6 m, while the distance of the corresponding cells to the origin is about 77 m. So with N=8, the largest detour is about 7.8 percent of the true shortest distance. This elongation error decreases substantially
for N=24 (Figure 18.6(h)): in the uniform area, it is about 1.5 m on covering a straight line distance of
63 m, i.e. about 2.4 percent of the true shortest distance (Herzog, 2013b).
Figure 18.7 Small digital elevation model (DEM) with a cell size of 10 m and a constant slope value of 10%
with the corresponding ACS for the cost function Q(ŝ) = 1 + (ŝ/š)², with š = 10 (cf. no. 9 in Table 18.2).
Various algorithms for calculating slope exist (Conolly & Lake, 2006, pp. 191–192; Lock & Pouncett,
2010; Wheatley & Gillings, 2002, pp. 120–121) providing different outcomes. Moreover, confusion of
units for measuring slope may result in unrealistic ACS grids. This is why deriving slope directly from
the DEM in the process of ACS calculation is recommended as illustrated in Figure 18.7.
Even with an isotropic slope-dependent cost function, the direction of movement is important. The
move from the centre cell with an altitude of 100 to the north has a slope of 10 percent, accumulating costs of 2, whereas the moves to the east or west remain at the same altitude, accumulating only half
of the costs. The moves to the diagonal cells are longer, resulting in a lower slope than for the cells in the main directions (ŝ = 10/√2 ≈ 7.1), but still a higher accumulated cost value. Figure 18.8 presents another
visualisation of anisotropic movement on different gradients with constant slope and different cost
functions.
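The worked example above can be reproduced in a few lines. The slope in percent is taken along the direction of each move, and the cost of a move is Q(ŝ) scaled by the move length in cell units; that scaling is an assumption, but it is consistent with the values quoted above (2 uphill, 1 on the level, about 2.12 on the diagonal).

```python
import math

def move_cost_Q(dem, cell_size, a, b, s_crit=10.0):
    """Per-move cost for the quadratic function Q(s) = 1 + (s/s_crit)^2
    (cf. no. 9 in Table 18.2), with the slope s (in percent) derived
    directly from the DEM along the direction of the move."""
    (ar, ac), (br, bc) = a, b
    length_cells = math.hypot(br - ar, bc - ac)
    run = length_cells * cell_size                     # horizontal length (m)
    slope_pct = 100.0 * abs(dem[br][bc] - dem[ar][ac]) / run
    return length_cells * (1.0 + (slope_pct / s_crit) ** 2)

# 10 m cells on a uniform 10 % gradient (cf. Figure 18.7)
dem = [[101.0] * 3, [100.0] * 3, [99.0] * 3]
print(move_cost_Q(dem, 10.0, (1, 1), (0, 1)))  # uphill move, 10 % slope: cost 2.0
print(move_cost_Q(dem, 10.0, (1, 1), (1, 2)))  # level move along the contour: cost 1.0
print(move_cost_Q(dem, 10.0, (1, 1), (0, 2)))  # diagonal: lower slope, longer move
```

Because slope is computed per move, the anisotropy of movement falls out automatically even though Q itself is symmetric in s.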
According to the approach presented in Figure 18.7, movement on the contour line of a DEM is
equivalent to the movement on level ground. For contour lines of a steep gradient, this is only true if
some construction work has been done to create the path (Figure 18.9). For informal routes that did not
involve any building effort, very steep gradients may be considered as barriers. It is important to note that, with the exception of contour-line routes, construction work such as the removal of outcrops or the building of bridges and tunnels is mostly not included in ACS generation. Clearly, the outcome of the
approach outlined above depends on the accuracy and resolution of the DEM (Herzog & Posluschny,
2011).
An anisotropic cost grid may be combined with an isotropic grid by multiplying or adding accu-
mulated cost values (Herzog, 2013a). The terrain factors listed in Table 18.3 suggest multiplication, and
multiplication is independent of the resolution of the cost grids. However, several authors prefer adding cost components (e.g. Fovet & Zakšek, 2014); often a weighted sum of cost grids is created (e.g. Whitley & Burns, 2008). For modelling anisotropic costs such as currents or wind directions, more complex
approaches are required (Collischonn & Pilar, 2000; Indruszewski & Barton, 2005, 2007).
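A small arithmetic sketch shows why multiplication is resolution-independent while an additive per-move penalty is not: splitting a move in two leaves a multiplicative combination unchanged but doubles a fixed additive penalty. The numbers are illustrative, not from the chapter.

```python
# A 100 m move with a slope-dependent cost of 2.0 per metre and a
# terrain factor of 1.5, costed whole and split into two 50 m halves.
whole = 100 * 2.0 * 1.5
split = 50 * 2.0 * 1.5 + 50 * 2.0 * 1.5
print(whole == split)        # True: a multiplicative factor is resolution-independent

# A fixed additive penalty of 30 per move instead depends on how the move is cut:
whole_add = 100 * 2.0 + 30
split_add = (50 * 2.0 + 30) + (50 * 2.0 + 30)
print(whole_add, split_add)  # 230.0 260.0: the split move pays the penalty twice
```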
After deciding on the relevant cost model, the main difficulty for LCSC generation is choosing a cost limit. For single farmsteads with a crop-based economy, the farm sizes listed in the study by Kerig (2008) may provide some guidance: the minimum farmland is 2 hectares, and the maximum is 4 to 5 hectares if all work is to be carried out by humans. With oxen, larger farmlands of up to 10 hectares can be ploughed; horses allow ploughing even larger plots, up to 33 hectares.
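As a rough level-ground yardstick for turning such farm sizes into catchment cost limits, one can ask what circle radius encloses a given area; this simple conversion is my illustration, not part of Kerig's study.

```python
import math

def radius_for_area_ha(hectares):
    """Radius in metres of a circle enclosing the given area
    (1 ha = 10,000 m^2)."""
    return math.sqrt(hectares * 10000.0 / math.pi)

for ha in (2, 5, 10, 33):
    print(ha, "ha ->", round(radius_for_area_ha(ha)), "m")
```

So even the largest, horse-ploughed holdings correspond to level-ground radii of only a few hundred metres, which puts the 250 m cost-limit steps used below into perspective.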
Figure 18.8 ACS for different slope dependent cost functions (nos. 1, 6, and 9 in Table 18.2) on three gra-
dients, with an additional barrier as in Figure 18.5(a) (radius 7.5 m). For each cost function, the costs vary
from 0 at the origin, depicted in white, to the largest accumulated cost value depicted in black. (a) Ericson &
Goldstein, (b) Tobler, (c) Q(12).
Figure 18.10 clearly shows that the LCP may deviate from the true shortest path depending on N (for
details see Herzog, 2013b). By increasing N, calculations will be rendered more accurate but the com-
putation time will increase as well. The figure shows LCPs created with Dijkstra’s algorithm for paths in
both directions, based on the ACS grids presented in Figure 18.6(d–f). In an area of uniform costs such
as the western part of the cost grid displayed in Figure 18.5(a), the optimal path is a straight line. But for
N=8 and uniform costs, only the LCPs in the eight directions considered coincide with the straight line; an example is the LCP to target no. 3. But for paths in other directions, such as to target no. 5, the LCP deviates from the correct shortest path (dotted line in Figure 18.10(a)); this deviation decreases when increasing N. Moreover, two different LCPs are depicted for most targets: the return path accumulates the same costs.
Figure 18.9 A path built to be level on a steep slope in the hilly area east of Cologne, Germany.
Case study
In the hilly rural area southwest of Cologne quite a few Roman sites including the remains of farms
(villae rusticae), temples and roads have been recorded. The aim of the case study is to compare the site
catchments of a farm (a villa in Blankenheim) and a temple (known as Görresburg). The cost model
derived from the Roman roads in this area is applied to the catchments. Although the movement patterns of travelling on a Roman road may differ from small-scale movement within a site catchment, no
data concerning the latter is readily available. The first step is to identify the principal factors governing
the construction of Roman roads in this area (Figure 18.11).
In this study area, large parts of the Roman road known as the Agrippa Road have been recorded by
aerial photography, ALS data and some small-scale excavations (Grewe, 2004, 2007; Horn, 2014, map
p. 169).1 Another Roman road section in this area was proposed and verified by Hagen (1931, p. 176).
The Roman road section suggested by Schneider (1879, p. 21) relies mainly on straight-line sections of
roads that were still in use in the mid-19th century and passes a known Roman road site; therefore it is also tentatively included in this set of Roman roads.
Figure 18.10 Least-Cost Paths (LCPs) (black lines) from the origin in the centre to five different targets. The outcome of the LCP algorithm depends on the number of nearest neighbours that can be reached without detour (N) and the width of the barrier ((a) width = 5 m; (b, e, g) width = 7.5 m; (c, f, h) width = 10 m; cf. Figure 18.6).
Unfortunately, landscape reconstruction is beyond
the scope of this small case study, so Figure 18.11 shows some modern features, mostly roads such as a
modern motorway east of the site labelled “Roman road remains” in the northeast of the map.
When discussing the Agrippa Road, Grewe (2004) pointed out that Roman roads avoided steep slopes in order to allow horse- or oxen-drawn carts to proceed. According to Grewe, the slopes of Roman roads in
the Rhineland normally do not exceed 8 percent although at some exceptional locations 16 to 20 percent
have been recorded. Due to the slope restrictions for Roman roads, the cost factor slope is an obvious
choice. Slope is derived from the two DEMs available for the study area (Table 18.4).
Figure 18.11 The study area southwest of Cologne covering approximately 13 × 10 km.
Table 18.4 DEM data provided by the ordnance survey institution (Geobasis NRW) responsible for this part of
Germany
Experience with another hilly region in the Rhineland suggested including another cost factor that
models streams as barriers (Herzog, 2013a, 2013b, 2013c, 2013e). This was tested for DEM10: a buffer
with a radius of 7.5 m was created for the streams and isotropic costs of 5 assigned to the cell centres
within the buffer (Figure 18.12: iso = 5). All streams in the neighbourhood of the Roman routes con-
sidered belong to the class “width below 3 m”, therefore a uniform penalty for traversing streams is
considered appropriate. The LCPs derived from this cost model agree quite well with the Roman road
proposed by Hagen, but deviate from the route suggested by Schneider. With respect to the Agrippa
Road, the results are not very convincing. Modifying the penalty for crossing streams does not change
the outcome, because the number of stream crossings for the initial LCPs and the Agrippa Road is about
the same. An alternative is a model derived from the soil map that takes the wet soils in the stream val-
leys into account. Moreover, ford or bridge locations were digitized from the maps created in the years
1846–1847 and assigned costs of 2. Based on this model, the LCP generated from the slope-dependent
cost function with a critical slope of 6 percent reconstructs the Agrippa Road somewhat better than the
other LCPs (Figure 18.12).
Figure 18.12 LCPs based on the formulas by Tobler, Irmischer, and Herzog/Minetti (see nos. 1, 3, and 11 in
Table 18.2) as well as quadratic slope dependent cost functions (see no. 9 in Table 18.2) combined with costs
for traversing water courses and/or wet soils.
Moreover, the LCPs based on Tobler’s hiking function were generated, and though they do not pay
penalties for crossing water, they agree quite well with some of those derived from the vehicle cost func-
tion with streams modelled as barriers. The LCPs generated from a cost model combining the Irmischer
off-road cost function with penalties for wet soils except at ford locations are often more direct than the
rest of the LCPs presented and often do not reconstruct the known roads as well as these.
After the first sobering results, Görresburg and the Roman smelting site were included as additional
possible origins besides Pt1 (cf. Figure 18.11). This allows testing whether the road made a detour on purpose to pass these sites.
The Agrippa Road and the Q(10) LCP are of similar length, but the total of elevation differences
derived from a trail elevation profile is considerably lower for the LCP (height change in Table 18.5).
So neither minimising height change nor avoiding streams and wet soils was the principal factor governing the construction of the Agrippa Road. Long sections of the Q(10) LCP run in the stream valleys, whereas the Agrippa Road, after crossing a stream, immediately climbs to more elevated terrain
(Figure 18.13(a)). Viewsheds probably did not play an important role in this forest area, but a method for
calculating local visual prominence can be applied to altitude data for identifying elevated areas (Llobera,
2003; Figure 18.13(b)).
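Llobera (2003) discusses several ways of quantifying topographic prominence; one simple variant, sufficient to illustrate the idea and consistent with the signed values in Table 18.5, is the cell's elevation minus the mean elevation within a given radius. The following sketch is this simplified proxy, not Llobera's exact measure.

```python
import math

def local_prominence(dem, cell_size, r, c, radius):
    """Simplified local prominence: the cell's elevation minus the mean
    elevation of all cells whose centres lie within `radius`. Positive
    values mark elevated terrain, negative values valley floors."""
    rows, cols = len(dem), len(dem[0])
    reach = int(radius // cell_size)
    vals = []
    for dr in range(-reach, reach + 1):
        for dc in range(-reach, reach + 1):
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and math.hypot(dr, dc) * cell_size <= radius):
                vals.append(dem[rr][cc])
    return dem[r][c] - sum(vals) / len(vals)

# a small ridge on 10 m cells: the centre row stands above its surroundings
dem = [[100, 100, 100],
       [110, 110, 110],
       [100, 100, 100]]
print(local_prominence(dem, 10.0, 1, 1, 15.0))   # positive: ridge crest
print(local_prominence(dem, 10.0, 0, 1, 15.0))   # negative: below the ridge
```

Such a grid of signed prominence values is exactly what a cost multiplier for low-prominence areas (as used below) can be attached to.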
Table 18.5 Comparison of the Agrippa Road section and the LCP generated based on the cost function Q(10) with
a critical slope of 10 percent (see no. 9 in Table 18.2) combined with a penalty factor of 5 for crossing streams. For
the two routes, the percentage in each prominence category is given.
[Table body not fully recoverable; column headings: length (km), height change, and prominence categories –6.3 to –1.0, –1.0 to 0.0, 0.0 to 1.0 and 1.0 to 5.2 for one route, and –15.0 to –2.0, –2.0 to 0.0, 0.0 to 2.0 and 2.0 to 13.4 for the other.]
Figure 18.13 (a) Comparison of the Q(10) LCPs with the Agrippa road, (b) comparison of the local promi-
nence for these two routes (white = low, black = high prominence) (c) LCPs with increased isotropic costs in
areas of low prominence.
Table 18.5 clearly shows that the Agrippa Road avoids areas of low prominence. Therefore LCPs with
different cost multipliers (w values in Figure 18.13(c)) attributed to low prominence areas were calculated. The cost multiplier 2 generated the best results combined with slope-dependent cost functions that assign lower costs to steep slopes than Q(10). Such cost functions tend to generate straight road sections
typical for Roman roads. But omitting the slope-dependent cost component produces LCPs that are not
as close to the Agrippa Road as the LCPs highlighted in Figure 18.13(c).
Only the LCPs starting at the Roman iron smelting site coincide well with the Agrippa Road, suggesting that this site determined the layout of the road to some extent. Figure 18.13(c) also shows that
the choice of the DEM can have an impact on the result when considering the LCPs connecting Pt1
and Pt2. However, the more important LCPs connecting the Roman iron smelting site with Pt3 coincide
quite well independent of the DEM chosen.
Based on the Q(14) cost function and a cost multiplier of 2 for low prominence areas, LCSCs were
calculated for the temple on the Görresburg and the villa near Blankenheim (Figure 18.14). Cost limits
are measured in terms of moving on level ground, without stepping into low prominence areas, and are
given in multiples of 250 m. With respect to sizes, the two different sets of catchments do not differ
substantially (Table 18.6). South of the Görresburg hill, excavations found a Roman settlement (Horn,
2014, pp. 196–197), so agricultural use was probably important for both locations.
This case study covering only a small area mainly suggests hypotheses to be tested in larger areas with
a larger number of Roman sites. It should be noted that the cost model for Roman roads found in this
example bears similarity to that found by Verhagen and Jeneson (2012), who dealt with a Dutch Roman
road section close to the German border.
Figure 18.14 The best performing LCPs for the models considered and the Least-Cost Site Catchments
(LCSCs) derived from the Q(14) cost model for the Görresburg temple and the villa.
Conclusion
Validation, assessing the accuracy, and analysis of the stability of the outcomes
For a convincing application of a cost function for LCSC or LCP generation, analysing the archaeological
or historical evidence and some validation is required. For instance, Garmy et al. (2005) mention that the
cost function chosen reproduces already known old footpaths in their study region.
GPS trails (Márquez-Pérez et al., 2017) and walking experiments (Kondo et al., 2011) can provide
some data for validating the cost function applied in the study area considered, but in general, modern
people are not as used to walking as people in past times. Energy expenditure might be overestimated by
cost functions based on modern measurements because the energy consumption of walking or running
is lower for people used to walking or running most of the day compared to that of modern white-collar workers (Pontzer et al., 2012).
If the aim of an LCP study is to reconstruct a known road, the similarity of the LCP to the known route can be assessed by determining the proportion of the LCP that lies within a buffer distance
from the known road (Goodchild & Hunter, 1997). Applications of this simple measure of similarity in
archaeological LCP studies were published by Canosa-Betés (2016), Güimil-Fariña and Parcero-Oubiña
(2015) as well as Lynch and Parcero-Oubiña (2017).
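The Goodchild and Hunter (1997) measure is easy to sketch: sample the LCP and count the fraction of points within the buffer distance of the known road polyline. The code below is an illustrative implementation with invented test geometry, not any published routine.

```python
import math

def dist_point_polyline(p, line):
    """Distance from point p to a polyline given as a vertex list."""
    best = math.inf
    for a, b in zip(line, line[1:]):
        (px, py), (ax, ay), (bx, by) = p, a, b
        dx, dy = bx - ax, by - ay
        seg2 = dx * dx + dy * dy
        t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
        best = min(best, math.hypot(px - (ax + t * dx), py - (ay + t * dy)))
    return best

def within_buffer_fraction(lcp_points, known_road, buffer_m):
    """Goodchild & Hunter (1997): the proportion of (sampled) LCP
    points lying within `buffer_m` of the known road."""
    hits = sum(dist_point_polyline(p, known_road) <= buffer_m for p in lcp_points)
    return hits / len(lcp_points)

road = [(0.0, 0.0), (100.0, 0.0)]   # known road: a straight east-west line
# an LCP that hugs the road for its first half, then strays 40 m away
lcp = [(float(x), 5.0 if x < 50 else 40.0) for x in range(0, 101, 10)]
print(within_buffer_fraction(lcp, road, 25.0))   # 5 of 11 points are in the buffer
```

In practice the result depends on the chosen buffer width and sampling density, so both should be reported alongside the similarity figure.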
If the location of the roads to be reconstructed is not known, validation often relies on road indica-
tor sites, for example grave monuments or mile stones that can be found close to Roman roads (e.g.
Güimil-Fariña & Parcero-Oubiña, 2015). On a larger scale, archaeologists often assume that settlements
are located close to main roads. For instance, Lynch and Parcero-Oubiña (2017) calculated the distance
from each site in their study area to the closest calculated path. If the road indicator sites are clustered
(e.g. Güimil-Fariña & Parcero-Oubiña, 2015), statistical tests relying on independent observations are problematic. Therefore, validation based on such sites is not straightforward.
Finally, validation of LCP results by survey is another possibility (e.g. Rademaker et al., 2012; Rogers
et al., 2015). However, the number of finds in the vicinity of old routes that can be discovered by field walking is limited: Rogers et al. (2015) detected only one artefact that was older than 200 years during two days of prospection. The remains of minor roads such as sunken lanes may lead to inadequate
conclusions. Moreover, continuous use of routes until today and stray finds are issues in LCP validation
by field survey.
Some conclusions
A wide variety of cost functions for walkers is available, but validated cost functions for the movement
of pack or draft animals as well as for water transport can rarely be found. Moreover, footpaths following
animal tracks might exhibit a large variation of preferred slopes, because the study by Ganskopp and Vavra
(1987) shows that different species prefer different slopes: the average slopes of sites utilised by cattle, feral
horses, mule deer, and bighorn were 5.8, 11.2, 15.7, and 42.5% respectively within one study area. For
non-pedestrian transport further research is required to provide reliable cost functions.
Some authors of archaeological LCP studies believe that the selection of the cost model is of
minor importance (Bellavia, 2002; Verhagen & Jeneson, 2012). But many publications present quite
different LCP results for different slope-dependent cost functions (e.g. Canosa-Betés, 2016; Güimil-
Fariña & Parcero-Oubiña, 2015; Rademaker et al., 2012, Plate 2). Often, LCPs derived from several
cost models coincide only in areas where this route is the obvious choice such as mountain passes or
flat areas.
Most archaeological LCP and LCSC studies rely on software created by somebody else. Gietl, Doneus,
and Fera (2008) showed some time ago that it is often not possible to recreate the LCP results of one soft-
ware package with another. Some of the LCP software used by Gietl and his colleagues has been improved
in the last decade, but there are still substantial differences in their potential for modelling anisotropic
friction and movement steps in more than eight directions.
Note
1 VIA Erlebnisraum Römerstraße, Stationen [VIA adventure area Roman road, stops] (www.erlebnisraum-
roemerstrasse.de/stationen).
References
Bell, T., & Lock, G. (2000). Topographic and cultural influences on walking the Ridgeway in later prehistoric times.
In G. Lock (Ed.), Beyond the map: Archaeology and spatial technologies (pp. 85–100). Amsterdam: IOS Press.
Bellavia, G. (2002). Extracting “Natural Pathways” from digital elevation model: Applications to landscape archaeo-
logical studies. In G. Burenhult & J. Arvidson (Eds.), Archaeological informatics: Pushing the envelope. CAA, 2001,
BAR International Series, 1016, 5–12. Oxford: Archaeopress.
Branting, S. (2007). Using an urban street network and a PGIST approach to analyze ancient movement. In J.
Clark & E. Hagemeister (Eds.), Digital discovery: Exploring new frontiers in human heritage: Computer applications and
quantitative methods in archaeology: Proceedings of the 34th conference, Fargo, United States, April 2006 (pp. 99–108).
Budapest: Archaeolingua.
Canosa-Betés, J. (2016). Border surveillance: Testing the territorial control of the Andalusian defense net-
work in center-south Iberia through GIS. Journal of Archaeological Science: Reports, 9, 416–426. doi:10.1016/j.
jasrep.2016.08.026
Chapman, H. (2006). Landscape archaeology and GIS. Stroud: Tempus Publishing.
Collischonn, W., & Pilar, J. V. (2000). A direction dependent least cost path algorithm for roads and canals. Interna-
tional Journal of Geographical Information Science, 14(4), 397–406.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge Manuals in Archaeology.
Cambridge, UK: Cambridge University Press.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2001). Introduction to algorithms (2nd ed.). Cambridge, MA:
The MIT Press and McGraw-Hill Book Company.
de Gruchy, M., Caswell, E., & Edwards, J. (2017). Velocity-based terrain coefficients for time-based models of human
movement. Internet Archaeology, 45. doi:10.11141/ia.45.4
Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269–271.
doi:10.1007/BF01386390
Ericson, J. E., & Goldstein, R. (1980). Work space: A new approach to the analysis of energy expenditure within site
catchments. Anthropology UCLA, 10(1&2), 21–30.
Fábrega Álvarez, P., & Parcero Oubiña, C. (2007). Proposals for an archaeological analysis of pathways and movement.
Archeologia e Calcolatori, 18, 121–140.
Fovet, É., & Zakšek, K. (2014). Path network modelling and network of aggregated settlements: A case study in
Languedoc (Southeastern France). In Polla & Verhagen (2014, pp. 43–72).
Gaffney, V., & Stančič, Z. (1992). Diodorus Siculus and the island of Hvar, Dalmatia: Testing the text with GIS. In
G. Lock & J. Moffet (Eds.), Computer applications and quantitative methods in archaeology 1991. BAR International
Series, 577, 113–125. Oxford: Tempvs Reparatvm.
Ganskopp, D., Cruz, R., & Johnson, D. E. (2000). Least-effort pathways?: A GIS analysis of livestock trails in rugged
terrain. Applied Animal Behaviour Science, 68, 179–190.
Ganskopp, D., & Vavra, M. (1987). Slope use by cattle, feral horses, deer, and bighorn sheep. Northwest Science, 61(2),
74–81.
Garmy, P., Kaddouri, L., Rozenblat, C., & Schneider, L. (2005). Logiques spatiales et “systèmes de villes” en Lodévois
de l’Antiquité à la période moderne. [Spatial investigations concerning village distributions in Lodévois from
antiquity until modern times]. In Temps et espaces de l’homme en société, analyses et modèles spatiaux en archéologie.
XXVème rencontres internationales d’archéologie et d’histoire d’Antibes (pp. 1–12). Editions APDCA.
Gietl, R., Doneus, M., & Fera, M. (2008). Cost distance analysis in an alpine environment: Comparison of differ-
ent cost-surface modules. In A. Posluschny, K. Lambers, & I. Herzog (Eds.), Layers of perception: Proceedings of the
35th international conference on computer applications and quantitative methods in archaeology (CAA), Berlin, Germany,
April 2–6, 2007. Kolloquien zur Vor-und Frühgeschichte, 10, (p. 342, full paper on CD). Bonn: Rudolf Habelt.
Givoni, B., & Goldman, R. (1971). Predicting metabolic energy cost. Journal of Applied Physiology, 30, 429–433.
Goodchild, M. F., & Hunter, G. J. (1997). A simple positional accuracy measure for linear features. International Journal
of Geographical Information Science, 11(3), 299–306. doi:10.1080/136588197242419
Grewe, K. (2004). Alle Wege führen nach Rom – Römerstraßen im Rheinland und anderswo [All roads lead to Rome:
Roman roads in the Rhine area and elsewhere]. In H. Koschik (Ed.), Alle Wege führen nach Rom: Internationales
Römerstraßenkolloquium Bonn. Materialien zur Bodendenkmalpflege im Rheinland, 16, 9–42.
Grewe, K. (2007). Die Agrippastraße zwischen Köln und Trier. [The Agrippa road between Cologne and Trier]. In
Erlebnisraum Römerstraße Köln-Trier. Erftstadt-Kolloquium, 31–64.
Groenhuijzen, M. R., & Verhagen, P. (2017). Comparing network construction techniques in the context of
local transport networks in the Dutch part of the Roman limes. Journal of Archaeological Science: Reports, 15,
235–251.
Güimil-Fariña, A., & Parcero-Oubiña, C. (2015). “Dotting the joins”: A non-reconstructive use of least cost paths
to approach ancient roads: The case of the Roman roads in the NW Iberian Peninsula. Journal of Archaeological
Science, 54, 31–44. doi:10.1016/j.jas.2014.11.030
Hagen, J. (1931). Römerstraßen der Rheinprovinz. [Roman roads of the Rheinprovinz] Erläuterungen zum
Geschichtlichen Atlas der Rheinprovinz, Publikationen der Gesellschaft für Rheinische Geschichtskunde
XII 8 (2nd ed.).
Herzog, I. (2013a). Theory and practice of cost functions. In F. Contreras, M. Farjas, & F. J. Melero (Eds.), Fusion of
cultures: Proceedings of the 38th annual conference on computer applications and quantitative methods in archaeology, Granada,
Spain, April 2010. BAR International Series, 2494, 375–382, Granada. Oxford: Archaeopress.
Herzog, I. (2013b). The potential and limits of optimal path analysis. In A. Bevan & M. Lake (Eds.), Computational
approaches to archaeological spaces (pp. 179–211). Walnut Creek, CA: Left Coast Press.
Herzog, I. (2013c). Least-cost networks. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos, I.
Romanowska, & D. Wheatley (Eds.), Archaeology in the Digital Era: CAA 2012: Proceedings of the 40th annual confer-
ence of computer applications and quantitative methods in archaeology (CAA) (pp. 237–248). Amsterdam: Amsterdam
University Press.
Herzog, I. (2013d). Calculating accessibility. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos,
I. Romanowska, & D. Wheatley (Eds.), Archaeology in the Digital Era, Volume II: CAA 2012: Proceedings of the
40th annual conference of computer applications and quantitative methods in archaeology (CAA), Southampton, 720–734.
Retrieved from http://dare.uva.nl/cgi/arno/show.cgi?fid=545855
Herzog, I. (2013e). Medieval mining sites, trade routes, and least-cost paths in the Bergisches Land, Germany. In P.
19
Processing and analysing
satellite data
Tuna Kalaycı
Introduction
The human production of landscapes requires substantial labour investment. Settlement mounds, field
boundaries, cairns, water canals, road systems and many other anthropogenic features create a cultural
signature within natural space. Furthermore, human disturbance of natural soil results in differences in
mineralogy, chemical constituents, soil moisture, soil structure, particle size and organic material content.
As a consequence, archaeologists can still trace ancient landscapes today. In this respect, satellite remote
sensing has increasingly provided better and more efficient tools for the detection, identification, and
understanding of past human activities – with spatial consequences.
Human intentionality results in cultural features with shapes which rarely occur naturally; for example,
ancient canals and roadways may follow long, linear paths and defensive forts have regular forms. Archae-
ological features are also structurally different from the remainder of the landscape and at times they are
significantly obtrusive within a view; settlement mounds were not only places for habitation, for example,
but also structures against which one could physically and/or socially orientate oneself. Indeed, some are
visible from space and they continue to guide archaeologists today. But it is not only monumental struc-
tures that leave detectable traces. When archaeological features degrade and lose their original forms, they
may create moisture differences with the natural environment in which they are situated; ancient ditches
now filled with loose soil have differential water retention capabilities, so that they can be detected
in satellite imagery as soil marks.
The main axiom in archaeological remote sensing is that human-made features are detectable in satel-
lite imagery since they create a measurable data contrast with their immediate surroundings. There are
two contrast types: direct and proxy (Beck, 2007). Through direct contrast a sensor registers electromag-
netic data from an archaeological feature (or its remnants) which remains visible on the surface so that
the emitted/reflected energy has clear archaeological origins. Proxy contrast reading, on the other hand,
is materialized when a sensor reads data which is only indirectly related to an archaeological feature. This
type of data is formed by the interaction between a cultural feature and its natural setting, such as crop
marks.
Direct or proxy, the level of contrast determines the success of a remote sensing analysis. The degree
of separation between cultural and natural (i.e. contrast) is determined by a significantly large number
of variables.1 A simplified but representative list might include local geological, geographical and geo-
morphological background conditions; chemical and physical composition of archaeological materials;
mechanics of degradation; time of data acquisition; and sensor resolutions.
The use of high-resolution imagery in archaeology goes back as early as the beginning of the 20th
century, with pictures of Stonehenge taken from a military balloon in September 1906 revealing the
value of high-altitude imagery for archaeology (Barber, 2012). Since that date, aerial photographs have
been widely used in archaeological studies (Brophy & Cowley, 2005; Riley, 1987; Wilson, 2000). On the
other hand, collecting and accessing aerial imagery is restricted in some parts of the world, or the tradition of practising aerial archaeology may not exist at all, and thus only a handful of images may exist
for archaeological use.2 Space-borne imagery has become a competent alternative to aerial archaeology
thanks to the constant increase in the spatial resolutions of satellite systems.
With resolutions of less than 1 metre, Very High Resolution (VHR) Earth observation satellites can
now resolve archaeological features in great detail, as has been reflected in the number of related studies
(e.g. Deroin, Téreygeol, & Heckes, 2011; Garrison et al., 2008; Malinverni, Pierdicca, Bozzi, Colosi, &
Orazi, 2017; Scardozzi, 2012). In particular, scholars have been using VHR imagery extensively for moni-
toring cultural heritage in areas where political turmoil and social unrest prohibit travelling to sites (e.g.
Casana & Panahipour, 2014; Lasaponara, Danese, & Masini, 2012).
While numerous research projects make use of high-spatial-resolution true-colour satellite imagery
(where the software emulates how we see the earth’s surface using the visible portion of the electro-
magnetic spectrum), it is also possible to go beyond the visible spectra and exploit unique relationships
between materials and the ways in which electromagnetic energy is reflected/emitted from them. This is
to say that materials respond differentially to different portions of the electromagnetic spectrum. There-
fore, by determining unique ‘material signatures’ it is theoretically possible to discriminate archaeologi-
cal sites from non-sites and thoroughly map archaeological features using multispectral analysis3 (e.g.
Oltean & Lauren, 2012).
Between 1982 and 2012, the Landsat 4 and 5 satellites carrying Thematic Mapper (TM) multispec-
tral sensors spearheaded Earth observation, providing invaluable datasets for archaeologists from as early
as the 1980s, especially useful for better understanding environments (Clark, Garrod, & Pearson, 1998;
Cooper, Bauer, & Cullen, 1991; Pope & Dahlin, 1989). Later studies continued this trend, making use
of other satellite systems with higher spatial or temporal resolutions (Kouchoukos, 2001; Schmid, Koch,
DiBlasi, & Hagos, 2008), as well as finer spectral resolutions – also called hyperspectral data (Alexakis,
Sarris, Astaras, & Albanakis, 2009; Savage, Levy, & Jones, 2012). In the latter work, scholars focus on
specific parts of the spectrum to investigate how particular archaeological features are depicted in sensor
data (Altaweel, 2005) or they use mathematical operations to put different parts of the spectrum together
(also called indexing) in order to highlight the differential properties of materials for easier detection
(Agapiou, Hadjimitsis, & Alexakis, 2012; Lasaponara & Masini, 2006). Finally, researchers make use of
derived variables from satellite sensing data; digital elevation models (DEMs) in particular having been
widely used (Hritz & Wilkinson, 2006; Siart, Bubenzer, & Eitel, 2009).
Method
[Figure 19.1 diagram; axis labels: amplitude, direction of propagation, wavelength.]
Figure 19.1 The electric field (E – dashed line) remains perpendicular to the magnetic field (H – solid line)
as the electromagnetic energy propagates.
Satellite remote sensing rests on measuring electromagnetic energy reflected and/or emitted from the surface of the Earth; this energy is shaped by the physical characteristics and chemical compositions of (archaeological) features. The main source of reflection/
emission is the Sun, as it produces a full spectrum of electromagnetic radiation (from gamma rays to
long waves).
The radiation travelling to the Earth at the speed of light is composed of electric (E) and magnetic (H)
fields. These two components are perpendicular to each other (Figure 19.1). The electrical field (E) var-
ies in amplitude in the direction of propagation and the magnetic field (H) propagates in-phase with the
electrical field (Jackson, 1962, p. 204). The characteristics of the electromagnetic radiation, and thus the
ways in which it is finally reflected back and/or emitted from surface matter, can be defined through its
wavelength (λ, the distance from one crest to the next), frequency (ν, the number of crests passing a fixed
point at a given period of time), and amplitude (the height of each crest, also called spectral irradiance).
The speed of electromagnetic energy (c) is constant at 299,792 km/sec (in vacuum) and the relationship
between the wavelength and the frequency at a given time is:
c = λ × ν (19.1)
This is also to say that the wavelength and frequency of radiation are inversely proportional to each other, since the speed of light (as measured in a vacuum) is constant; as frequency increases (or wavelength decreases), the energy of the radiation increases.
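Equation 19.1 can be checked numerically; a minimal Python sketch (the 550 nm example wavelength is an assumed illustration, not a value from the text):

```python
# Illustrative check of Equation 19.1: c = wavelength * frequency.
C = 299_792_458.0  # speed of light in a vacuum, m/s

def frequency_from_wavelength(wavelength_m: float) -> float:
    """Return the frequency (Hz) for a wavelength given in metres."""
    return C / wavelength_m

# Green light at an assumed 550 nm wavelength:
print(f"{frequency_from_wavelength(550e-9):.3e} Hz")  # on the order of 5.45e14 Hz
```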
The parameters of electromagnetic radiation are helpful for describing the portions of the spectrum
(Figure 19.2). The most significant part of the spectrum for humans is visible light with a wavelength
range from (approximately) 400 nanometres to 700 nanometres (1 nm = 10⁻⁹ m), a narrow limit set by
human visual physiology. The other major categories relevant to remote sensing are infrared and micro-
wave, all of which have their own unique ways of interaction with the surface material and, thus, have
designated areas of scholarly interest (Campbell & Wynne, 2011, Chapters 17–21).
Once the light starts travelling in the atmosphere it interacts with its immediate environment (gases,
particles, differential temperature, etc.) and scatters, refracts, and is absorbed in numerous ways until it
Figure 19.2 A simplified depiction of electromagnetic radiation. Visible light, which is mainly made up of red, green and blue, constitutes only a small fraction of the full spectrum.
reaches the ground. Therefore, atmospheric conditions can greatly affect sensor readings. Sensors rely-
ing solely on the energy of the Sun (in the form of electromagnetic radiation) are called passive sensors
and they can only operate within specific atmospheric windows (Campbell & Wynne, 2011, pp. 45–48).
Active satellite sensors, however, have the capability of emitting their own energy (mainly in the microwave portion of the spectrum), and they detect the portion of that radiation backscattered from the surface. Active
sensor measurements can penetrate through clouds and can operate even at night4 (Chapman & Blom,
2013). Both passive (e.g. Agapiou, 2017; Schuetter et al., 2013) and active sensor types (e.g. Holcomb &
Shingiray, 2007; Stewart, 2017) have a wide range of applications in archaeology, but their coupled use
also attracts attention (e.g. Stewart, Oren, & Cohen-Sasson, 2018).
Sensor design
Modern sensor data is collected in digital format (for a historical analogue counterpart, see the case study
section on CORONA) and is delivered ready for processing and analysis through computer software.
CCD (Charged-Coupled Device) and CMOS (Complementary Metal Oxide Semiconductor) are con-
sidered to be important digital sensor designs for the collection of digital imagery. Nevertheless, optical-
mechanical systems are still the dominant sensor types and provide a robust technology for collecting data
across the electromagnetic spectrum (Campbell & Wynne, 2011, Chapter 4).
As the satellite bus passes over a region of interest, the optical-mechanical sensor scans the surface
of the Earth and collects electromagnetic energy. On-board filtering (i.e. grating) splits (i.e. diffracts)
radiation into different segments (also called spectral channels), resulting in an orderly arrangement of
wavelengths. Next, detectors convert channelled radiation into a series of electric currents as a representa-
tion of land surface brightness at different wavelengths. Finally, electrical signals are amplified and data is
recorded digitally after conversion (Lillesand, Kiefer, & Chipman, 2004, Chapter 5).
Sensor resolutions
In this complex engineering design, there are some key parameters which are significant determinants
in archaeological applications. First and foremost is the spatial resolution, which reflects the ability of a
sensor to capture geometric details of the area or object under investigation, where each pixel represents
a corresponding area on the ground (Jensen, 2014, pp. 14–16). For instance, a 15-metre sensor resolution indicates that the value of a pixel results from averaging the electromagnetic energy reflected/emitted from a ground surface area of 15 metres by 15 metres. Therefore, the smaller the unit of
ground area, the more detailed the final dataset. A very generic rule of thumb in remote sensing is that the
spatial resolution of a sensor should be half the size of the object of interest. While very high-resolution
sensor data (0.5–2.0 metres) can provide detailed geometric descriptions of archaeological features, high
acquisition costs and limited ground coverage impede common scientific usage.
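The half-the-object-size rule of thumb above can be expressed as a trivial check (a sketch; the feature and pixel sizes are illustrative):

```python
# Rule-of-thumb check: to resolve a feature, the sensor's spatial resolution
# (pixel size) should be at most half the size of the feature of interest.

def is_resolvable(feature_size_m: float, pixel_size_m: float) -> bool:
    return pixel_size_m <= feature_size_m / 2

print(is_resolvable(30.0, 15.0))  # a 30 m mound with a 15 m sensor: True
print(is_resolvable(10.0, 15.0))  # a 10 m feature with the same sensor: False
```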
Spectral resolution is determined by the width of the spectral channels. A narrower channel is a
more accurate descriptor of collected radiation as it focuses on a smaller portion of the electromagnetic
spectrum (Jensen, 2014, p. 14). A narrower width, however, also requires a larger number of channels (or
bands in sensor data) to meaningfully represent the spectrum. A multi-spectral sensor is one that collects
data beyond the visible spectra, such as Landsat-8 with 11 bands, or Sentinel-2A with 12 bands. When
the sensor is designed with higher spectral resolution, it is called hyper-spectral, such as EO-1 Hyperion
with more than 200 bands (Savage et al., 2012).
Radiometric resolution is the ability of a sensor to distinguish finer levels of brightness; it represents the smallest detectable energy difference (Jensen, 2014, p. 18). A coarse radiometric resolution indicates
that limited levels of data are recorded in smaller computer data bits (2-bits or 4-bits), usually resulting in
a high-contrast image. Fine resolution is the representation of ground brightness in greater detail (8-bits
or beyond), resulting in many levels of brightness.
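The relationship between bit depth and the number of recordable brightness levels can be sketched directly:

```python
# Number of distinct brightness levels a sensor can record at a given bit
# depth: levels = 2 ** bits. A 2-bit sensor records only 4 levels (a
# high-contrast image); an 8-bit sensor records 256.

def brightness_levels(bits: int) -> int:
    return 2 ** bits

for bits in (2, 4, 8, 11):
    print(f"{bits:>2}-bit -> {brightness_levels(bits)} levels")
```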
Temporal resolution is the frequency with which the sensor visits the same area of interest in a given
time period (Jensen, 2014, pp. 17–18). A fine resolution means there is available data with close time
intervals, so it is possible to detect and/or monitor changes in the landscape (e.g. before looting and after
looting (Lasaponara et al., 2012)). A finer resolution also increases the chances of acquiring the best scene
from a region, such as one without cloud coverage obstructing the view.
[Figure 19.3 diagram; legible labels: data pre-processing, data analysis, image classification.]
Figure 19.3 The main steps in a satellite remote sensing analysis. The workflow has a hierarchical structure.
Evaluation of results may reset the workflow until a satisfactory solution is achieved.
Problem definition
A project should start with an in-depth definition of the problem as this determines the next steps of the
analysis. For instance, a mapping project’s needs are completely different from those of remote-sensing-
based paleoenvironmental modelling. The former might be conducted by a single researcher using an
online viewing platform (such as Google Earth) and a budget computer with stable internet connection,
while the latter might need the collaboration of multiple researchers from different disciplines working
with large sets of remote sensing data on a computer with high processing power.
Desired outcomes should be clearly defined, since it might not be possible to easily alter the scope
during the workflow; because the process is hierarchical, a change in one step would require changes
in previous steps. An explicit designation of the project boundary will help during data acquisition and
will prevent later additions which would require rolling back the workflow. Finally, it is beneficial to
assign a specific coordinate system at the outset of the project, as later conversions between coordinate
systems and projection types might lead to spatial mismatches with other geographic datasets of the
project.
Data acquisition
Governmental agencies and companies in the private sector provide remotely sensed data. For publicly
funded satellite systems (e.g. Landsat, Sentinel) it is customary to have open access through an online
portal (e.g. https://earthexplorer.usgs.gov/; https://scihub.copernicus.eu/) or through project proposals
which are subject to evaluation. Private sector launches usually involve very high-resolution systems (e.g.
WorldView-3, RapidEye) and data can be purchased directly or via third-party vendors.
Data pre-processing
Pre-processing is a crucial step for correcting data issues which are not strictly related to the relation-
ship between the ground and its radiometric manifestations on satellite sensors. Eliminating these issues
increases the chances of highlighting usually subtle contrasts between features and their backgrounds.
Furthermore, pre-processing improves data integrity and interoperability. Acquired data might already
be processed at different levels (Arvidson et al., 1986, p. 5), but the researcher may still need to take into
consideration possible sensor malfunctions, atmospheric effects on measurements, and spatial referenc-
ing problems. Data pre-processing may be divided into two major groups: radiometric and geometric
pre-processing.
Radiometric pre-processing involves adjusting the digital values of sensor data. Optical-mechanical
malfunctions might lead to artificial data stripes or gaps due to sensor design issues, failures in scan line
collectors, eventual degradation of detectors, saturation, or problems during data downlink from satellite
to ground stations. Some of these issues might be corrected using statistical techniques ranging from
histogram equalization to interpolation (Pringle, Schmidt, & Muir, 2009; Rakwatin, Takeuchi, & Yasuoka,
2007).
The atmosphere can have considerable effects on remote sensing data collection. A sensor registers
radiation not only from the ground, but also from atmospheric scattering. There are three main types
of corrections: image-based, modelling-based, and corrections based on ground data (Hadjimitsis et al.,
2010). Among these, image-based methodologies are the most common. In this methodology, features
with known/assumed spectral values are used for an adjustment through Dark Object Subtraction (DOS).
Water, for instance, absorbs solar radiation, and ideally water bodies should be spectrally registered at zero
values. Therefore, any value other than zero can be attributed to the atmospheric column over that water
body. Subtracting this value from the whole scene provides a simple approximation for atmospheric cor-
rection. Further modifications and improvements to DOS have been suggested (Chavez, 1996).
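The image-based DOS adjustment described above can be sketched in a few lines of numpy (the toy array stands in for one band of a scene; operational workflows run on full rasters and often refine the dark value, e.g. following Chavez, 1996):

```python
import numpy as np

# Sketch of Dark Object Subtraction (DOS): the scene's darkest value is
# assumed to be atmospheric path radiance and is subtracted from every pixel.

def dark_object_subtraction(band: np.ndarray) -> np.ndarray:
    dark_value = band.min()          # assumed dark object, e.g. deep water
    corrected = band - dark_value    # remove the atmospheric offset
    return np.clip(corrected, 0, None)

band = np.array([[12, 15, 40],
                 [13, 60, 22],
                 [90, 12, 18]], dtype=float)
print(dark_object_subtraction(band))  # the darkest pixels become 0
```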
Geometric pre-processing involves manipulating a remote sensing scene so that it is integrable with
other spatial data and represents detected ground geometry more accurately. Resampling is one of the
most common geometric pre-processing techniques. It is especially necessary when the ground sampling size (i.e. the pixel resolution) of a dataset does not match that of other imagery, since such a discrepancy in resolution between different rasters can, for instance, prohibit researchers from applying algebraic operations. Resampling is performed by interpolation. The most common techniques are nearest-neighbour, bilinear and cubic convolution. Each interpolation technique has its advantages and
disadvantages (Zitová & Flusser, 2003). Resampling also creates the foundation for the ortho-rectification
process, where the effects of terrain are removed from the scene and features are represented in their true
positions (Leprince, Barbot, Ayoub, & Avouac, 2007).
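A minimal sketch of nearest-neighbour resampling, the simplest of the interpolation techniques mentioned above (the 2 × 2 raster is an illustrative stand-in for a coarse band being matched to a finer grid):

```python
import numpy as np

# Nearest-neighbour resampling: each output pixel takes the value of the
# closest input pixel. Here a 2x2 raster is resampled to 4x4 so that its
# pixel size matches a finer-resolution dataset.

def resample_nearest(raster: np.ndarray, out_rows: int, out_cols: int) -> np.ndarray:
    rows, cols = raster.shape
    r_idx = np.arange(out_rows) * rows // out_rows
    c_idx = np.arange(out_cols) * cols // out_cols
    return raster[np.ix_(r_idx, c_idx)]

coarse = np.array([[1, 2],
                   [3, 4]])
print(resample_nearest(coarse, 4, 4))  # each input pixel becomes a 2x2 block
```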
Georeferencing is the process of matching image coordinates with Earth coordinates (Ground Control
Points, or GCPs) so that features represented on imagery can be accurately located on the ground for fur-
ther analysis and evaluation (Longley, Goodchild, Maguire, & Rhind, 2005, pp. 109–126). It is routine to
acquire satellite data already in georeferenced form: for example, locational information might be embed-
ded in an image within the header file or come as an auxiliary file. It is also possible that the researcher
may need the camera parameters and flight information in the metadata so that the scene can be treated
photogrammetrically for a geometric correction. Finally, locational information may be available through
Rational Polynomial Coefficients (RPCs), where the relationship between the image and the ground is mathematically described by ratios of polynomial functions, usually third-order with 20 coefficients each (Tao & Hu, 2001). The RPCs
are preferred when camera vendors do not publicly reveal the parameters of their sensors. Furthermore,
RPCs can be used for any sensor since the mathematical description between an image and the ground
is system independent. Finally, RPCs are considered to be a fast solution to photogrammetric problems
(Dowman & Dolloff, 2000).
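As a simplified illustration of georeferencing, the sketch below fits a six-parameter affine transform to GCPs by least squares; the coordinates are invented for the example, and RPC-based models are considerably more complex:

```python
import numpy as np

# Fit a 2-D affine transform (6 parameters) mapping image (col, row)
# coordinates to ground (x, y) coordinates from ground control points (GCPs).
# The GCP values below are invented for illustration.

def fit_affine(img_pts: np.ndarray, ground_pts: np.ndarray) -> np.ndarray:
    """Return a 2x3 matrix A so that ground ~= A @ [col, row, 1]."""
    n = img_pts.shape[0]
    design = np.hstack([img_pts, np.ones((n, 1))])   # rows of [col, row, 1]
    coeffs, *_ = np.linalg.lstsq(design, ground_pts, rcond=None)
    return coeffs.T                                   # shape (2, 3)

img = np.array([[0, 0], [100, 0], [0, 100], [100, 100]], dtype=float)
gnd = np.array([[500000, 4200000], [503000, 4200000],
                [500000, 4197000], [503000, 4197000]], dtype=float)
A = fit_affine(img, gnd)
x, y = A @ np.array([50.0, 50.0, 1.0])
print(x, y)  # the centre pixel maps to the centre of the ground extent
```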
Data analysis
Satellite remote sensing has a wide variety of applications, ranging from agriculture to urban planning
and from disaster management to wetland mapping. Therefore, the objectives of research applications are
diverse, resulting in a vast body of literature on data analysis. The discussion below only provides selected
information, with a focus on the techniques which archaeologists employ most frequently.
Panchromatic sharpening is a data fusion technique in which a higher resolution panchromatic band is
used to increase the spatial resolution of a multi-spectral band5 (Lasaponara & Masini, 2012a). Sharpening
is usually performed with transformations (e.g. Intensity-Hue-Saturation), statistical methods (e.g. Principal Component Analysis), or algebraic operations (e.g. Brovey). The final result is a spatially improved
spectral layer which is ready for visual investigation. However, it is important to note that pan-sharpening
is a form of data manipulation, so the altered layers may not be suitable for further quantitative analysis
and modelling.
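As an illustration of the algebraic approach, a minimal sketch of the Brovey transform is given below (the function name is illustrative, and the multi-spectral stack is assumed to be already resampled to the panchromatic grid):

```python
import numpy as np

def brovey_sharpen(ms, pan, eps=1e-9):
    """Brovey pan-sharpening sketch. `ms` is a (bands, rows, cols)
    multi-spectral stack resampled to the panchromatic grid; `pan` is
    the (rows, cols) panchromatic band. Each output band is the input
    band rescaled by the ratio of pan to the band sum, injecting the
    spatial detail of pan while preserving the band ratios."""
    ms = np.asarray(ms, dtype=float)
    pan = np.asarray(pan, dtype=float)
    total = ms.sum(axis=0) + eps  # avoid division by zero
    return ms * (pan / total)
```

Note that the sum of the sharpened bands equals the panchromatic band, which is exactly why the output is no longer radiometrically faithful to the original sensor values.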
Image enhancement is the rearrangement of image brightness levels with the aim of increasing the
level of contrast between archaeological features and their natural backgrounds. Enhancement strictly
depends on the data distribution of a particular scene, so there is no rule of thumb for its implementa-
tion. Radiometric approaches include, but are not limited to, contrast enhancement, linear stretching,
histogram equalization, and density slicing. Like panchromatic sharpening, image enhancement variably
alters sensor data values, so it is not advised to use enhanced images in further data analysis if the aim is
to physically model feature signatures.
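A minimal sketch of one such radiometric approach, a percentile-based linear stretch, is given below (illustrative only; any image processing package offers equivalent routines):

```python
import numpy as np

def linear_stretch(band, low_pct=2, high_pct=98):
    """Percentile-based linear contrast stretch: maps the chosen
    percentile range of the input brightness values onto 0..255,
    clipping the tails. Purely for display; the radiometry is altered,
    so the output should not feed further physical modelling."""
    band = np.asarray(band, dtype=float)
    lo, hi = np.percentile(band, [low_pct, high_pct])
    stretched = (band - lo) / max(hi - lo, 1e-9) * 255.0
    return np.clip(stretched, 0, 255).astype(np.uint8)
```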
Image classification is the process of assigning classes to pixels/objects in such a way that the separation
is meaningful and final classes form internally consistent homogeneous entities. In this respect, a classi-
fier is the algorithmic implementation of a formal mathematical/statistical method applied to a remotely
sensed dataset with the aim of providing a robust separation between classes. Due to the complexity of
this process, there have been countless efforts to provide the most powerful classifier, and thus, the remote
sensing literature is rich with a variety of applications. It is important to emphasize that no single clas-
sification method or algorithm is superior to any other, as each remotely sensed scene is unique. Owing
to the same complexity, there is no consensus on how to categorize approaches to the classification prob-
lem. However, considering the latest developments in computational technologies, a comparison between
pixel-based and object-based classifications can be proposed (Duro, Franklin, & Dubé, 2012).
Pixel-based classification methodologies involve the analysis of the spectral properties of pixels
in isolation and disregard potential spatial and contextual relationships among neighbouring pixels.
These methodologies can be further divided into two major categories. In unsupervised classification (e.g.
k-means, ISODATA), the spectral characteristics of classes are not known a priori, and an algorithm
divides the scene into clusters based on a statistical criterion. In supervised classification (e.g. parallelepiped, maximum likelihood), there is some information about potential classes based on ground work,
expert knowledge, or other geospatial analysis. This information, in turn, is used to train a classification
algorithm for the detection and delineation of classes with spectral properties similar to those of the training set.
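A minimal sketch of unsupervised pixel-based classification with k-means is given below (illustrative only; operational classifiers such as ISODATA or library implementations add convergence tests and empty-cluster handling):

```python
import numpy as np

def kmeans_classify(pixels, k=2, iters=20, seed=0):
    """Minimal unsupervised (k-means) pixel classification sketch.
    `pixels` is an (n, bands) array of spectral values, each pixel
    treated in isolation; returns a class label per pixel."""
    rng = np.random.default_rng(seed)
    X = np.asarray(pixels, dtype=float)
    # Initialize cluster centres from randomly chosen pixels
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each pixel to its nearest spectral centre
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centres as the mean spectrum of each cluster
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return labels
```

Note that the algorithm only sees the spectral vectors: neighbouring pixels carry no weight, which is precisely the limitation that object-based approaches address.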
Object-based classification (or object-based image analysis, OBIA) methodologies decompose a
remotely sensed scene into unclassified image objects (also called “object primitives”) using a segmenta-
tion process (Szeliski, 2010, Chapter 5). Through image segmentation each pixel is flagged so that pixels
with the same flag form a group. A pixel shares a specific characteristic with other pixels in the same
group. Finally, statistical analysis is used to determine the characteristics of image objects (shape, size,
colour, texture, and context) for a final classification.
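The flagging step can be illustrated with a toy segmentation that groups 4-connected pixels of a binary mask into object primitives (a deliberately simple sketch; real OBIA segmentation operates on multi-band similarity rather than a binary mask):

```python
import numpy as np
from collections import deque

def label_objects(mask):
    """Toy segmentation sketch: groups 4-connected True pixels of a
    binary mask into unclassified image objects ('object primitives').
    Each pixel receives a flag (label); pixels sharing a flag form one
    object whose shape and size statistics can feed a final
    classification."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue  # pixel already flagged
        current += 1
        queue = deque([(r, c)])
        labels[r, c] = current
        while queue:  # breadth-first flood fill of the object
            y, x = queue.popleft()
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    queue.append((ny, nx))
    return labels, current
```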
Archaeological feature detection can be considered a specific form of image classification, since the
ultimate aim of feature detection is to delineate pixels with specific anthropogenic meanings surrounded
by the pixels of natural background. One of the benefits of satellite remote sensing in archaeology is that
it enables wide-area detection of archaeological sites and features (Casana, 2014). Also, with its ability
to collect data beyond the visible portion of the electromagnetic spectrum, it highlights features which
are not immediately observable (Oltean & Lauren, 2012). Combined, these two attributes
open up the possibility of identifying and documenting archaeological material in a given area in its
entirety. Furthermore, advancements in information technologies – but more so the introduction of very
Processing and analysing satellite data 367
Evaluation
The final step in a generic archaeological remote sensing project is the evaluation of results. Evaluation
can be qualitative, based on visual investigation, or the results can be compared quantitatively with
another dataset (or against a set threshold value) to assess the viability of the preceding pre-processing and analysis steps.
Evaluation therefore provides feedback for the overall model, and as a result the researcher may need to
re-run the data analysis with different parameters, employ other algorithms, consider using different sensor
data, or even re-evaluate the research question.
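A simple quantitative evaluation of a classified output against reference data, for instance, is the overall accuracy (a minimal sketch; fuller assessments use a confusion matrix and per-class statistics):

```python
import numpy as np

def overall_accuracy(predicted, reference):
    """Share of pixels whose predicted class matches an independent
    reference dataset. A value below a set threshold would send the
    analyst back to re-tune the processing chain."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    return float((predicted == reference).mean())
```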
Case study
One special form of high-spatial- and temporal-resolution satellite sensor pre-dates many of the
sensors available today and has been used extensively in archaeological studies. The CORONA spy satellite was developed as part of the US intelligence program (1960 to 1972) during the Cold War era (Day,
Logsdon, & Latell, 1999). Because of this historicity, panchromatic CORONA images provide snapshots of
archaeological landscapes prior to recent large constructions, industrial agriculture, and urban expan-
sion. The impact of such land-use/land-cover change on the preservation of ancient material culture
is immense, and in many cases, there is complete loss (Casana, Cothren, & Kalayci, 2012). Among
many CORONA missions, the declassified series with Keyhole-KH 4A and 4B designators provide the
most suitable data for archaeological research, since they offer high-spatial-resolution imagery (2.74m
and 1.83m at nadir, respectively) and have considerably larger ground coverage (17 × 232 and 14 ×
188 kilometres, respectively).6 Traditionally, CORONA is well-known for its applications in Meso-
potamia, but there are also other examples from Central Asia (Goossens, Wulf, Bourgeois, Gheyle, &
Willems, 2006), China (Min, 2013), Egypt (Moshier & El-Kalani, 2008), Greece (Kalayci, 2014), and
India (Conesa et al., 2015).
CORONA has been most useful when integrated with other satellite systems for the exploration
of archaeological landscapes. To give a few examples, Richason and Hritz (2007) visually compare
CORONA imagery with later-dated sensors and detect ancient canals before their destruction by modern
land-use practices. Menze and Ur (2012) propose coupling CORONA with multi-spectral satellite data in order
to build a multi-temporal classification methodology with the intention of exploring long-term settle-
ment patterns. Parcak, Mumford, and Childs (2017) compare CORONA with later-generation very
high-resolution sensors in order to assess landscape scale changes.
Details of a specific case study will further highlight the use of CORONA in archaeological research.
The Bronze Age landscapes of Upper Mesopotamia contain a distinct archaeological feature called hollow ways (Ur, 2003, 2017).
Figure 19.4 A CORONA scene (DS1102–1025DF007, December 1967) showing the location and extent of hollow ways radiating from Tell Brak, Syria. The terminal points of hollow ways can be mathematically modelled in order to estimate the area of agricultural production around sites.
Wilkinson (1994) convincingly argues that hollow ways were formed due to the
repetitive movement of flocks between settlements and open pasture. While on the move, flocks were kept
in groups in order to minimize damage to agricultural fields. Once the terminal point was passed, animals
were allowed to disperse in the landscape. Therefore, the end points of hollow ways can be considered as
the markers of agricultural production boundaries (Figure 19.4).
Following this theory, Kalayci (2016) explores the relationship between the size of a settlement –
as a proxy of its ancient population – and its crop production potential. The CORONA imagery is
used to map the hollow ways and determine the boundaries of agricultural production from a sample
of Bronze Age sites. Next, he proposes using a combination of precipitation reconstructions and a
rainfall-dependent, remotely sensed vegetation growth model. This model relies on the use of Normalized Difference Vegetation Index (NDVI) values which are calculated from the Advanced Very High
Resolution Radiometer (AVHRR) (Figure 19.5). Finally, he estimates how much yield a site might have
produced within its production boundaries and compares these metrics with corresponding settlement
areas.
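The index itself, NDVI = (NIR - Red)/(NIR + Red), is straightforward to compute; a minimal sketch (illustrative function name) is:

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red)/(NIR + Red).
    Values approaching +1 indicate vigorous vegetation; bare soil and
    water fall near zero or below. Inputs are reflectance (or DN)
    arrays on the same grid, e.g. the red and near-infrared AVHRR
    bands used in the case study."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)
```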
Figure 19.5 Multi-spectral data analysis with vegetation indices provides a detailed and dynamic representa-
tion of agricultural landscapes. These models surpass static descriptions of agro-economic zones, which are
usually based on strict assumptions about productivity. The circles in the figure show production boundaries
of Bronze Age settlements.
Figure 19.6 Scatter plots revealing the strength of the relationship between settlement size and estimates of
total production. The plot (a) shows a weak relationship when estimated production values are directly com-
pared with settlement size. However, when a biennial fallowing strategy is introduced for settlements smaller
than 50 hectares (b), the relationship becomes much stronger.
When settlement area and total production estimates are compared directly, the relationship between
these two variables appears to be weak (correlation coefficient, Pearson’s r=0.3) (Figure 19.6(a)). How-
ever, when a biennial fallowing is introduced to the model for settlements smaller than 50 hectares, the
correlation coefficient rises to 0.85 (Figure 19.6(b)). Thanks to the coupling of CORONA with multi-
spectral satellite data analysis, the study provides a dynamic representation of land-use practices and chal-
lenges normative assumptions about population pressure and food production practices.
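The correlation statistic used in the comparison can be computed directly; a minimal sketch of Pearson's r is:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient, the statistic
    used to compare settlement area with estimated total production.
    Ranges from -1 (perfect inverse) to +1 (perfect direct relation)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x - x.mean(), y - y.mean()
    return (xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum())
```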
Conclusion
Satellite remote sensing is changing the ways in which scholars undertake landscape archaeology projects.
Today there are specialized books dedicated solely to satellite remote sensing applications in archaeology
(Comer & Harrower, 2013; Lasaponara & Masini, 2012b; Parcak, 2009). As a result, satellite remote sens-
ing in archaeology is moving beyond methodological progression and emerging as its own sub-discipline.
It is now possible to document archaeological landscapes in their entirety and purposefully cascade scales
of analysis from site to regional to supra-regional levels, in both space and time.
Very high-resolution satellite imagery can now be acquired at relatively low prices. Governmental
agencies provide open access for advanced multi-spectral (e.g. SENTINEL-2) and hyper-spectral (e.g.
Earth Observing One-Hyperion) data. In particular, the use of online geospatial imagery platforms (e.g.
Google Earth) in archaeological research is a significant shift. These viewing platforms provide free access
to high-resolution satellite imagery, at times, in time series form, making it possible to detect changes in
the landscape. In the field, viewing high-resolution satellite data on smartphones with GPS capabilities is
now almost a standard routine. Scholarship in this field has made astonishing progress since the early
pioneering studies (e.g. Behrens & Sever, 1991), which clearly manifests itself in the growing number of
scholarly publications (Agapiou & Lysandrou, 2015).
Semi- or fully automated detection of archaeological features in satellite imagery has been attracting
interest from scholars, and it appears that this trend will continue as data resolutions improve, software
attains more user-friendly interfaces, and custom algorithms are developed (e.g. Zingman, Saupe, Penatti, &
Lambers, 2016). However, new research domains are also emerging thanks to big data analytics (see also
Green, this volume). Remote sensing archaeologists have started to explore cloud computing opportuni-
ties in order to access petabytes of global-scale data with parallel computational power at the server side
(Agapiou, 2017; Liss, Howland, & Levy, 2017).
Also moving in parallel with the advancements in internet technologies, crowdsourcing for the
analysis of remotely sensed data is a further step towards citizen science in archaeology. Among notable
examples of this approach, the GlobalXplorer Project (www.globalxplorer.org) has been the most visible
in the public sphere, while other initiatives continue to provide web platforms for online participatory
projects, e.g. TerraWatchers (www.terrawatchers.org).
A continuous critical reading of geospatial technologies – and especially of satellite remote sensing – is
a matter of the utmost importance. Satellite remote sensing brings exceptional insights into archaeological
questions and opens up innovative research avenues, especially in the landscape domain. However, there
are important concerns relating to the science and technology of this advancement, and it is vital to recog-
nize and understand how power mediates through remote sensing instruments, algorithms and software.7
There is a long list to be critically examined, but only two issues will be briefly highlighted here. First, the
relationship between the military industrial complex and archaeology should be unearthed (Hamilakis,
2009). This relationship has materialized through the use of geospatial tools as they constitute dual-use
technologies (Pollock, 2016, p. 220). In particular, the CORONA spy satellite system, discussed in the
Case Study section, exemplifies how military surveillance ideology has seeped into archaeological practice.
The second critique is woven around the transformation of surveillance into
scientific objectivity, which finds theoretical underpinnings in Haraway (1988): the findings of a remote
sensing archaeologist are sterilized through electromagnetic energy, disciplined by regular pixels, bounded
by the imagery, and testable for its accuracy at the very least. Archaeology, on the other hand, is messy.
Notes
1 The delineation of sites is not only a methodological problem in satellite remote sensing. There has been no
consensus on the epistemological and ontological status of the archaeological site (Dunnell, 1992). Hence, the
separation between a cultural deposit and its background as observed from space is intrinsically a subjective
archaeological narrative despite its strictly empirical nature.
2 Recent advancements in Remotely Piloted Aircraft Systems (RPAS, also known as drones or Unmanned Aerial
Vehicles (UAVs)) have created an alternative to conventional aerial photography in archaeology. These alternatives, however, still suffer from system constraints, such as limited flight time, payload weight, and sensitivity to
weather conditions.
3 The determination of unique ‘material signatures’ of archaeological sites and features suffers from two major
issues. First, there is no clear way to define the boundary between sites and non-sites, if such a boundary can even
be said to exist (see endnote 1). Second, in very rare cases a pixel is composed of a single homogeneous material.
Almost always, a pixel value is the average of reflected/emitted radiation from various surface features. Therefore,
a ‘material signature’ of a feature exists only at a hypothetical level. Signal estimation of material contributions to a
pixel value (i.e. spectral mixture analysis) is one of the main research avenues in remote sensing (Adams, Smith, &
Johnson, 1986; Somers, Asner, Tits, & Coppin, 2011).
4 Active Radar sensor data is also based on electromagnetic theory, but sensor conceptualization and data analysis
require a different approach than do multi-spectral sensors (Wiseman & El-Baz, 2007). For brevity of discussion,
the remainder of the text is written with multi-spectral systems in mind.
5 A panchromatic band is the combination of red, green and blue bands and sometimes the near-infrared band.
Therefore, during the scan a panchromatic sensor needs less radiation to register a value than does a multi-spectral
sensor. This is to say that relative to multi-spectral sensors, panchromatic sensors can collect the same energy from
a smaller portion of the ground, translating into finer spatial resolution.
6 A photogrammetrically corrected comprehensive CORONA KH4-B inventory is hosted at http://corona.cast.
uark.edu (accessed January 2019).
7 In similar fashion, Geographic Information Systems (GIS) have been facing similar critiques (e.g. Gaffney &
Van Leusen, 1995; Wickstead, 2009). It appears that the discussion is now settled (or has been marginalized!) as
one considers the widespread and somewhat uncritical use of geospatial technologies, although efforts to offer
informed practice continue (Gillings, 2012).
References
Adams, J. B., Smith, M. O., & Johnson, P. E. (1986). Spectral mixture modeling: A new analysis of rock and soil types
at the Viking Lander 1 site. Journal of Geophysical Research: Solid Earth, 91(B8), 8098–8112.
Agapiou, A. (2017). Remote sensing heritage in a petabyte-scale: Satellite data and heritage Earth Engine© applica-
tions. International Journal of Digital Earth, 10(1), 85–102.
Agapiou, A., Hadjimitsis, D. G., & Alexakis, D. D. (2012). Evaluation of broadband and narrowband vegetation
indices for the identification of archaeological crop marks. Remote Sensing, 4(12), 3892–3919.
Agapiou, A., & Lysandrou, V. (2015). Remote sensing archaeology: Tracking and mapping evolution in European
scientific literature from 1999 to 2015. Journal of Archaeological Science: Reports, 4, 192–200.
Alexakis, D., Sarris, A., Astaras, T., & Albanakis, K. (2009). Detection of Neolithic settlements in Thessaly (Greece)
through multispectral and hyperspectral satellite imagery. Sensors, 9(2), 1167–1187.
Altaweel, M. (2005). The use of Aster satellite imagery in archaeological contexts. Archaeological Prospection, 12,
151–166.
Arvidson, R., Billingsley, R., Chase, R., Chavez, P., Devirian, M., Estes, J., . . . Rossow, W. (1986). Report of the EOS
data panel on the data and information system,Vol. IIa of NASA TM-87777. Washington, DC: National Aeronautics
and Space Administration (NASA).
Barber, M. (2012). A history of aerial photography and archaeology: Mata Hari’s glass eye and other stories. Swindon: English
Heritage.
Beck, A. R. (2007). Archaeological site detection: The importance of contrast. In Proceedings of the 2007 annual con-
ference of the Remote Sensing and Photogrammetry Society (pp. 307–312). Red Hook, NY: The Remote Sensing and
Photogrammetry Society.
Behrens, C. A., & Sever, T. L. (Eds.). (1991). Applications of space-age technology in anthropology. Mississippi:
NASA.
Brophy, K., & Cowley, D. (2005). From the air: Understanding aerial archaeology. London: The History Press Ltd.
Campbell, J. B., & Wynne, R. H. (2011). Introduction to remote sensing. New York: Guilford Publications.
Casana, J. (2014). Regional-scale archaeological remote sensing in the age of Big Data: Automated site discovery vs.
brute force methods. Advances in Archaeological Practice, 2(3), 222–233.
Casana, J., Cothren, J., & Kalayci, T. (2012). Swords into ploughshares: Archaeological applications of CORONA
satellite imagery in the Near East. Internet Archaeology, 32.
Casana, J., & Panahipour, M. (2014). Satellite-based monitoring of looting and damage to archaeological sites in
Syria. Journal of Eastern Mediterranean Archaeology & Heritage Studies, 2(2), 128–151.
Chapman, B., & Blom, R. G. (2013). Synthetic aperture radar, technology, past and future applications to archaeology.
In D. C. Comer & M. J. Harrower (Eds.), Mapping archaeological landscapes from space (pp. 113–131). New York:
Springer Science & Business Media.
Chavez, P. S. (1996). Image-based atmospheric corrections-revisited and improved. Photogrammetric Engineering and
Remote Sensing, 62(9), 1025–1035.
Clark, C. D., Garrod, S. M., & Pearson, M. P. (1998). Landscape archaeology and remote sensing in Southern Mada-
gascar. International Journal of Remote Sensing, 19(8), 1461–1477.
Comer, D. C., & Harrower, M. J. (2013). Mapping archaeological landscapes from space (Vol. 5). New York: Springer
Science & Business Media.
Conesa, F. C., Madella, M., Galiatsatos, N., Balbo, A. L., Rajesh, S. V., & Ajithprasad, P. (2015). CORONA photographs
in monsoonal semi-arid environments: Addressing archaeological surveys and historic landscape dynamics over
North Gujarat, India. Archaeological Prospection, 22(2), 75–90.
Cooper, F. A., Bauer, M. E., & Cullen, B. C. (1991). Satellite spectral data and archaeological reconnaissance in West-
ern Greece. In C. A. Behrens & T. L. Sever (Eds.), Applications of space-age technology in anthropology (pp. 63–79).
Mississippi: NASA.
Day, D. A., Logsdon, J. M., & Latell, B. (Eds.). (1999). Eye in the sky: The story of the CORONA spy satellites. Wash-
ington, DC: Smithsonian Institution Press.
De Laet, V., Paulissen, E., & Waelkens, M. (2007). Methods for the extraction of archaeological features from very
high-resolution IKONOS-2 remote sensing imagery, Hisar (southwest Turkey). Journal of Archaeological Science,
34(5), 830–841.
Deroin, J.-P., Téreygeol, F., & Heckes, J. (2011). Evaluation of very high to medium resolution multispectral satel-
lite imagery for geoarchaeology in arid regions: Case study from Jabali, Yemen. Journal of Archaeological Science,
38(1), 101–114.
Dowman, I., & Dolloff, J. T. (2000). An evaluation of rational functions for photogrammetric restitution. International
Archives of Photogrammetry and Remote Sensing, 33(B3/1; PART 3), 254–266.
Dunnell, R. C. (1992). The notion site. In J. Rossignol & L. Wandsnider (Eds.), Space, time, and archaeological landscapes
(pp. 21–41). New York: Plenum Press.
Duro, D. C., Franklin, S. E., & Dubé, M. G. (2012). A comparison of pixel-based and object-based image analysis
with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-5 HRG
imagery. Remote Sensing of Environment, 118, 259–272.
Gaffney, V., & Van Leusen, M. (1995). Postscript-GIS, environmental determinism and archaeology: A parallel text. In
G. R. Lock & Z. Stancic (Eds.), Archaeology and geographical information systems: A European perspective (pp. 367–383).
London: Taylor and Francis.
Garrison, T. G., Houston, S. D., Golden, C., Inomata, T., Nelson, Z., & Munson, J. (2008). Evaluating the use of
IKONOS satellite imagery in lowland Maya settlement archaeology. Journal of Archaeological Science, 35(10),
2770–2777.
Gillings, M. (2012). Landscape phenomenology, GIS and the role of affordance. Journal of Archaeological Method and
Theory, 19(4), 601–611.
Goossens, R., Wulf, A., Bourgeois, J., Gheyle, W., & Willems, T. (2006). Satellite imagery and archaeology: The
example of CORONA in the Altai mountains. Journal of Archaeological Science, 33(6), 745–755.
Hadjimitsis, D. G., Papadavid, G., Agapiou, A., Themistocleous, K., Hadjimitsis, M. G., Retalis, . . . Clayton, C. R. I.
(2010). Atmospheric correction for satellite remotely sensed data intended for agricultural applications: Impact
on vegetation indices. Natural Hazards and Earth System Sciences, 10(1), 89–95.
Hamilakis, Y. (2009). The “war on terror” and the military: Archaeology complex: Iraq, ethics, and neo-colonialism.
Archaeologies, 5(1), 39–65.
Haraway, D. (1988). Situated knowledges: The science question in feminism and the privilege of partial perspective.
Feminist Studies, 14(3), 575–599.
Holcomb, D. W., & Shingiray, I. L. (2007). Imaging radar in archaeological investigations: An image processing
perspective. In J. Wiseman & F. El-Baz (Eds.), Remote sensing in archaeology (pp. 11–45). New York: Springer.
Hritz, C. A., & Wilkinson, T. J. (2006). Using shuttle radar topography to map ancient water channels in Mesopo-
tamia. Antiquity, 80(308), 415–424.
Jackson, D. J. (1962). Classical electrodynamics. New York: John Wiley & Sons, Inc.
Jahjah, M., & Ulivieri, C. (2010). Automatic archaeological feature extraction from satellite VHR images. Acta
Astronautica, 66(9–10), 1302–1310.
Jensen, J. R. (2014). Remote sensing of the environment: An Earth resource perspective. Essex: Pearson Education Limited.
Kalayci, T. (2014). A review on the potential use of CORONA images of Greece. In Proceedings of the Computer
Applications in Archaeology conference (CAA-GR) (pp. 55–63). Rethymno: Crete.
Kalayci, T. (2016). Settlement sizes and agricultural production territories: A remote sensing case study for the Early
Bronze Age in Upper Mesopotamia. Science and Technology of Archaeological Research, 2(2), 217–234.
Kouchoukos, N. (2001). Satellite images and the representation of near Eastern landscapes. Near Eastern Archaeology,
1–2, 80–91.
Larsen, S. O., Trier, D., & Solberg, R. (2008). Detection of ring shaped structures in agricultural land using high
resolution satellite images. In Proceedings of the pixels, objects, intelligence: Geographic object based image analysis for the
21st century conference (pp. 81–86). Alberta, Canada.
Lasaponara, R., Danese, M., & Masini, N. (2012). Satellite-based monitoring of archaeological looting in Peru. In
Satellite remote sensing (pp. 177–193). New York: Springer Science.
Lasaponara, R., & Masini, N. (2006). Identification of archaeological buried remains based on the Normalized Dif-
ference Vegetation Index (NDVI) from Quickbird satellite data. IEEE Geoscience and Remote Sensing Letters, 3(3),
325–328.
Lasaponara, R., & Masini, N. (2012a). Pan-sharpening techniques to enhance archaeological marks: An overview.
In R. Lasaponara & N. Masini (Eds.), Satellite remote sensing: A new tool for archaeology (pp. 87–109). New York:
Springer Science.
Lasaponara, R., & Masini, N. (Eds.). (2012b). Satellite remote sensing: A new tool for archaeology (Vol. 16). New York:
Springer Science.
Lauricella, A., Cannon, J., Branting, S., & Hammer, E. (2017). Semi-automated detection of looting in Afghanistan
using multispectral imagery and principal component analysis. Antiquity, 91(359), 1344–1355.
Leprince, S., Barbot, S., Ayoub, F., & Avouac, J.-P. (2007). Automatic and precise orthorectification, coregistration,
and subpixel correlation of satellite images, application to ground deformation measurements. IEEE Transactions
on Geoscience and Remote Sensing, 45(6), 1529–1558.
Lillesand, T. M., Kiefer, R. W., & Chipman, J. W. (2004). Remote sensing and image interpretation. New Jersey: John
Wiley & Sons.
Liss, B., Howland, M. D., & Levy, T. E. (2017). Testing Google Earth Engine for the automatic identification and
vectorization of archaeological features: A case study from Faynan, Jordan. Journal of Archaeological Science: Reports,
15, 299–304.
Longley, P. A., Goodchild, M., Maguire, D. J., & Rhind, D. W. (2005). Geographic information systems and science. West
Sussex: John Wiley & Sons.
Malinverni, E. S., Pierdicca, R., Bozzi, C. A., Colosi, F., & Orazi, R. (2017). Analysis and processing of nadir and stereo
VHR Pleiadés images for 3D mapping and planning the land of Nineveh, Iraqi Kurdistan. Geosciences, 7(3), 80.
Menze, B. H., & Ur, J. A. (2012). Mapping patterns of long-term settlement in Northern Mesopotamia at a large
scale. Proceedings of the National Academy of Sciences, 109(14), E778–E787.
Min, L. (2013). Archaeological landscapes of China and the application of CORONA images. In Mapping archaeological
landscapes from space (Vol. 5, pp. 45–54). New York: Springer Science & Business Media.
Moshier, S. O., & El-Kalani, A. (2008). Late bronze age paleogeography along the ancient Ways of Horus in North-
west Sinai, Egypt. Geoarchaeology, 23(4), 450–473.
Oltean, I. A., & Lauren, L. A. (2012). High-resolution satellite imagery and the detection of buried archaeological
features in ploughed landscapes. In R. Lasaponara & N. Masini (Eds.), Satellite remote sensing: A new tool for archaeol-
ogy (Vol. 16, pp. 291–305). New York: Springer Science.
Parcak, S. (2009). Satellite remote sensing for archaeology. Abingdon, UK: Routledge.
Parcak, S., Mumford, G., & Childs, C. (2017). Using open access satellite data alongside ground based remote sensing:
An assessment, with case studies from Egypt’s delta. Geosciences, 7(4), 94.
Pollock, S. (2016). Archaeology and contemporary warfare. Annual Review of Anthropology, 45, 215–231.
Pope, K. O., & Dahlin, B. (1989). Ancient Maya wetland agriculture: New insights from ecological and remote sens-
ing research. Journal of Field Archaeology, 16, 87–106.
Pringle, M. J., Schmidt, M., & Muir, J. S. (2009). Geostatistical interpolation of SLC-off Landsat ETM+ images.
ISPRS Journal of Photogrammetry and Remote Sensing, 64(6), 654–664.
Rakwatin, P., Takeuchi, W., & Yasuoka, Y. (2007). Stripe noise reduction in MODIS data by combining histogram
matching with facet filter. IEEE Transactions on Geoscience and Remote Sensing, 45(6), 1844–1856.
Richason, B. F., & Hritz, C. (2007). Remote sensing and GIS use in the archaeological analysis of the central Mesopo-
tamian plain. In J. Wiseman & F. El-Baz (Eds.), Remote sensing in archaeology (pp. 283–325). New York: Springer.
Riley, D. N. (1987). Air photography and archaeology. Philadelphia: University of Pennsylvania Press.
Savage, S. H., Levy, T. E., & Jones, I. W. (2012). Prospects and problems in the use of hyperspectral imagery for
archaeological remote sensing: A case study from the Faynan copper mining district, Jordan. Journal of Archaeo-
logical Science, 39(2), 407–420.
Scardozzi, G. (2012). Integrated methodologies for the archaeological map of an ancient city and its territory: The
case of Hierapolis in Phrygia. In R. Lasaponara & N. Masini (Eds.), Satellite remote sensing: A new tool for archaeology
(pp. 129–156). New York: Springer Science.
Schmid, T., Koch, M., DiBlasi, M., & Hagos, M. (2008). Spatial and spectral analysis of soil surface properties for an
archaeological area in Aksum, Ethiopia, applying high and medium resolution data. CATENA, 75(1), 93–101.
Schuetter, J., Goel, P., McCorriston, J., Park, J., Senn, M., & Harrower, M. (2013). Autodetection of ancient Arabian
tombs in high-resolution satellite imagery. International Journal of Remote Sensing, 34(19), 6611–6635.
Siart, C., Bubenzer, O., & Eitel, B. (2009). Combining digital elevation data (SRTM/Aster), high resolution satellite
imagery (Quickbird) and GIS for geomorphological mapping: A multi-component case study on Mediterranean
karst in Central Crete. Geomorphology, 112(1), 106–121.
Somers, B., Asner, G. P., Tits, L., & Coppin, P. (2011). Endmember variability in spectral mixture analysis: A review.
Remote Sensing of Environment, 115(7), 1603–1616.
Stewart, C. (2017). Detection of archaeological residues in vegetated areas using satellite synthetic aperture radar.
Remote Sensing, 9(2), 118.
Stewart, C., Oren, E., & Cohen-Sasson, E. (2018). Satellite remote sensing analysis of the Qasrawet archaeological
site in North Sinai. Remote Sensing, 10(7), 1090.
Szeliski, R. (2010). Computer vision: Algorithms and applications. London: Springer.
Tao, C. V., & Hu, Y. (2001). Use of rational function model for image rectification. Canadian Journal of Remote Sens-
ing, 27(6), 593–602.
Ur, J. A. (2003). CORONA satellite photography and ancient road networks: A Northern Mesopotamian case study.
Antiquity, 77(295), 102–115.
Ur, J. A. (2017). WorldMap: Hollow ways in Northern Mesopotamia. Boston, MA: Harvard Dataverse. Retrieved from
http://worldmap.harvard.edu/maps/14984
Wickstead, H. (2009). The uber-archaeologist: Art, GIS and the male gaze revisited. Journal of Social Archaeology, 9(2),
249–271.
Wilkinson, T. J. (1994). The structure and dynamics of dry-farming states in Upper Mesopotamia. Current Anthro-
pology, 35(5), 483–520.
Wilson, D. R. (2000). Air photo interpretation for archaeologists. London: The History Press Ltd.
Wiseman, J., & El-Baz, F. (2007). Remote sensing in archaeology. New York: Springer.
Zingman, I., Saupe, D., Penatti, O. A. B., & Lambers, K. (2016). Detection of fragmented rectangular enclosures in
very high resolution remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 54(8), 4580–4593.
Zitová, B., & Flusser, J. (2003). Image registration methods: A survey. Image and Vision Computing, 21(11), 977–1000.
20
Processing and analysing
geophysical data
Apostolos Sarris
Introduction
Despite the specificities of each geophysical technique, the goal is always to maximize the information content of the measurements taken and to transform the registered signals into a simple, clear and accurate form that allows their direct interpretation in relation to the suspected targets and the properties of the surrounding soil matrix. Since subsurface soil interactions depend on different environmental and climatic variables and on anthropogenic interventions, specific processing must be applied to highlight the information that these factors partially obscure. Filtering of geophysical data is therefore essential, because experimental data contain external noise that masks the essential content of the original measurements.
The processing of geophysical data follows a more or less standard flow, consisting of the preprocessing
and editing of the raw data, the basic processing routines based on signal or image processing, the applica-
tion of more sophisticated filters and algorithms, and the creation of maps for visualization purposes. Cer-
tain pre-processing steps are required to improve the quality and accuracy of the original measurements.
Simple processing, such as pre-amplification of the signal and noise reduction, can be achieved with
digital filtering. A number of standard processing techniques can be used for all kinds of data (magnetic,
resistance, conductivity, etc.) acquired with a normal sampling strategy within a grid (mapping mode).
More specialized processing is required for techniques that deal with data related to Ground Penetrating
Radar (GPR), Electrical Resistivity Tomography (ERT), seismic techniques, Electromagnetic Induction
(EMI) techniques and microgravity measurements.
The sections below present a short review of the particular processes utilised for the most commonly
applied techniques. The aim has been to provide an indicative, rather than exhaustive account, since more
detailed discussions of individual methods can be found elsewhere. For example, a thorough summary of
the most sophisticated processing techniques applied to principally magnetic archaeological survey data
has been given by Scollar, Weidner, and Segeth (1986) and Scollar, Tabbagh, Hesse, and Herzog (1990,
pp. 491–513). The fundamental methodologies on GPR data processing are also described in Jol (2009)
and Conyers (2013).
Method
Preprocessing techniques
De-spiking of the original data is necessary in the initial stage of processing. Point-like random interfer-
ence (expressed as extreme values in the data) caused by the existence of isolated features, instrumental
malfunctions or by improper measuring conditions, needs to be removed and usually replaced by inter-
polated values, based on a subset (array) of the surrounding measurements. This is usually achieved by
the continuous shifting of the filter to the next subset of the observed data (moving window method).
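The de-spiking step can be sketched as follows; this is a minimal illustration, assuming the survey grid is held as a 2D array, with an illustrative threshold convention (a reading is replaced when it deviates from the local median by more than a few local standard deviations), not the convention of any particular software package:

```python
import numpy as np

def despike(grid, win=3, nsigma=3.0):
    """Replace point-like spikes with the median of a moving window.

    A reading is flagged as a spike when it deviates from the median of
    its (win x win) neighbourhood by more than nsigma local standard
    deviations; the threshold convention here is illustrative.
    """
    padded = np.pad(grid, win // 2, mode="edge")
    out = grid.copy()
    rows, cols = grid.shape
    for i in range(rows):
        for j in range(cols):
            window = padded[i:i + win, j:j + win]
            med = np.median(window)
            sd = window.std()
            if sd > 0 and abs(grid[i, j] - med) > nsigma * sd:
                out[i, j] = med  # replace the spike with the local median
    return out

# A flat background with one instrumental spike:
data = np.zeros((5, 5))
data[2, 2] = 100.0
clean = despike(data)
```

The moving window is shifted continuously across the grid, so only isolated extreme values are replaced while genuine broad anomalies, which dominate their neighbourhood statistics, are left untouched.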
A situation frequently encountered during the zig-zag acquisition of measurements (i.e. where mea-
surements are logged up and down adjacent transect lines) is the defect of traverse striping, namely a
different average value existing along alternating transects according to the alignment of the sensor (most
common in fluxgate magnetometry data due to an improper alignment of the sensor). The reduction
of the values along each transect to a common value removes the striping effect and at the same time
can contribute to the matching of adjacent grids. Common to the zig-zag mode of surveying is also the
problem of staggering effects, namely the slight misalignment of the measurements along the alternat-
ing transects, depending on the survey pace (mainly in the automatic mode of surveying). This kind of
displacement has to be corrected accordingly by shifting the values to their exact position.
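The removal of traverse striping amounts to reducing every traverse to a common mean, and the stagger correction to shifting alternate traverses; a minimal sketch, assuming each row of the array is one traverse (the constant one-sample shift in `deshift` is a simplification, since in practice the shift is estimated from the survey pace):

```python
import numpy as np

def destripe(grid, target=0.0):
    """Reduce every traverse (row) to a common mean value.

    Removes the alternating-stripe defect of zig-zag surveys by
    subtracting each traverse's own mean and adding back the target level.
    """
    row_means = grid.mean(axis=1, keepdims=True)
    return grid - row_means + target

def deshift(grid, shift=1):
    """Correct staggering by shifting every other traverse.

    Simplified: a constant shift in samples and circular wrap-around;
    real corrections estimate the shift per survey and pad the ends.
    """
    out = grid.copy()
    out[1::2] = np.roll(out[1::2], -shift, axis=1)
    return out

# Two traverses logged in opposite directions with a heading offset:
g = np.array([[1.0, 1.0, 1.0],
              [5.0, 5.0, 5.0]])
flat = destripe(g)
```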
The above processes are usually followed by reduction of the data to a common reference value (e.g.
the average background resistance for soil resistance data or the 0-mean for vertical magnetic gradient
data). The reduction of the measurements to a common reference level also contributes to the matching
of the measurements that have been carried out in different grids (grid matching or grid equalization).
In situations where surveys are carried out over a long period of time or because of incorrect balancing
of the instruments, the background values of different grids can be shifted with the consequence that the
grid results can be difficult to match. In soil resistivity/conductivity surveys, the mismatching of grids
can also result from changes of climatic conditions between grids and the repositioning and separation of
the remote probes. In magnetic surveys, mismatching of grids originates due to the balancing of the sen-
sors at different locations or due to diurnal variations in the magnetic field. After achieving an optimum
matching among grids, the entire geophysical map of a site is processed as a whole (Sarris, 1992; Geoscan
Research, 2005) (Figure 20.1).
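Grid matching can be sketched in the same spirit; this minimal version shifts each grid so that its mean sits at a common reference level, which assumes the background dominates each grid (in practice, edge statistics between adjacent grids or more robust measures may be preferred):

```python
import numpy as np

def equalize_grids(grids, reference=0.0):
    """Match adjacent survey grids by shifting each one to a common
    background level (here, each grid's own mean moved to `reference`).

    `grids` is a list of 2D arrays, one per survey grid.
    """
    return [g - g.mean() + reference for g in grids]

a = np.full((2, 2), 10.0)   # grid balanced on one day
b = np.full((2, 2), 14.0)   # same soil, instrument re-balanced later
a_eq, b_eq = equalize_grids([a, b])
```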
Main convolution
All the techniques that measure the total magnitude of a physical quantity include an arbitrary amount
of background noise, usually systematic in nature, due mainly to the geological nature of the site and
the top soil uniformity. Alongside these broad, long-range regional trends originating from the under-
lying geology are instrumental random errors or noise spiking signals, which are also often present in
the geophysical maps. Filters of different ranges with the appropriate cut-off thresholds can be used
to isolate anomalies of interest. Most of the filtering treatment of geophysical data from archaeologi-
cal sites is performed in the spatial domain. Here, low pass filters use a high value threshold in order
to permit long-range geological features to pass, whereas high pass filters use a low cut-off value to
Figure 20.1 Multisensor (8 sensors) vertical magnetic gradient survey with SENSYS at Pella, Northern Greece.
Left image indicates the original data suffering from various spikes, traverse striping and grid mismatching.
Right image indicates the results of processing that tried to remove those specific effects. A colour version of
this figure can be found in the plates section.
allow only the high values to pass. A large search radius (or large dimensions of the moving window)
of a low pass smoothing filter can result in the virtual elimination of small anomalies and noise. In
archaeological surveys, where some of the anomalies are weak compared to the general background,
a small moving window is suggested (3×3 or 5×5), whereas for very large surveys, larger windows can
be employed to eliminate any regional trends. Band pass filters are also used to allow a certain range
(band) of values to pass by combining a broad filter to reject the regional trends and a narrow filter to
remove high values.
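The moving-window filters described above can be sketched as follows; a minimal illustration in which the low pass filter is a plain window mean, the high pass filter is the residual of the raw data from its low pass version, and the band pass filter chains a broad rejection with a narrow smoothing (window sizes are illustrative):

```python
import numpy as np

def low_pass(grid, win=3):
    """Moving-window mean: smooths noise, keeps long-range trends."""
    padded = np.pad(grid, win // 2, mode="edge")
    out = np.empty_like(grid, dtype=float)
    for i in range(grid.shape[0]):
        for j in range(grid.shape[1]):
            out[i, j] = padded[i:i + win, j:j + win].mean()
    return out

def high_pass(grid, win=3):
    """Raw data minus its low-pass version (the residual)."""
    return grid - low_pass(grid, win)

def band_pass(grid, broad=9, narrow=3):
    """Reject regional trends with a broad filter, then smooth the
    residual with a narrow one to suppress high-frequency noise."""
    return low_pass(grid - low_pass(grid, broad), narrow)

g = np.full((6, 6), 5.0)  # a featureless background for checking
```

On a uniform background the high pass and band pass responses are zero, as expected: only departures from the local mean survive the residual.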
Residual filtering, by means of subtracting each reading from the mean of measurements, both along
a profile (line equalization) or throughout the map, can eliminate overall trends and emphasize subtle
anomalies. Low pass filtered residuals (the difference of low pass data from raw data) are used to achieve
an enhancement of magnetic data by reducing high frequency noise. On the other hand, high pass filters
(e.g. by means of measuring the curvature of the map and assigning +/– values for +/– curvature and a
0 value at inflection points) can outline the edges of features and increase the degree of details of certain
anomalies, in a similar way to residual filtering (MacLeod & Dobush, 1990).
Similar to the above, the total magnetic field measurements suffer from regional magnetic trends,
which are capable of masking short-range magnetic anomalies. The regional trends can be approximated
by either employing a single dipole approximation of the geomagnetic field or calculating the latitudinal
and longitudinal gradient components. In the majority of archaeological prospection surveys, a simple
trend analysis and removal of a least square fitted surface from the actual data are sufficient due to the
small extents of the areas surveyed. As a consequence of the removal of trend, regional patterns are
reduced and the local anomalies are emphasized in a similar way as the computation of high pass residuals
(Kearey & Brooks, 1984, pp. 188–189). In the frequency domain, the removal of trend is a necessary pre-
treatment procedure prior to the Fourier transformation of the raw data, since linear trends and non-zero
means of the data increase the power of the spectrum in the lower wavenumber range.
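The least-squares trend removal mentioned above can be sketched as fitting a first-order (planar) surface and subtracting it; a minimal illustration, with sample coordinates taken as grid indices:

```python
import numpy as np

def remove_trend(grid):
    """Fit a planar surface z = a + b*x + c*y by least squares and
    subtract it, leaving the residual local anomalies."""
    ny, nx = grid.shape
    y, x = np.mgrid[0:ny, 0:nx]
    A = np.column_stack([np.ones(grid.size), x.ravel(), y.ravel()])
    coef, *_ = np.linalg.lstsq(A, grid.ravel(), rcond=None)
    trend = (A @ coef).reshape(ny, nx)
    return grid - trend, trend

# A pure regional gradient leaves no residual:
y, x = np.mgrid[0:4, 0:5]
regional = 2.0 + 0.5 * x + 0.1 * y
residual, trend = remove_trend(regional)
```

The same routine applied before a Fourier transformation removes the linear trend and non-zero mean that would otherwise inflate the low-wavenumber end of the spectrum.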
Dean (1958) and Scollar (1970) have summarized the advantages of the frequency analysis of potential
magnetic data. In the frequency domain, where signals are converted to frequencies, operations are less
intensive, and the effects of convolution processes are more obvious than in the spatial domain, where
processing is applied directly to the pixel of an image. In the frequency (or wavenumber) domain, different
frequency filters are applied to enhance a certain range of frequencies. The process involves the transforma-
tion from the space domain (x-y plane) to the frequency domain (u-v plane), by a two dimensional Fourier
series bounded by both upper and lower frequencies due to the finite nature of the area and the sampling
size of the survey. Thus, the Fourier series becomes a representation of the different frequencies that form
the potential field data. Due to the finite nature of magnetic potential data, instead of a Fourier series, a Fou-
rier transform (usually a Fast Fourier Transform, FFT) can be used to generate a correspondence between
spatial and frequency domain data (Scollar, 1970, p. 11; Jahne, 1991, p. 53). The FFT conserves the image
information and the inverse transform can fully recover the spatial representation of data. In the frequency
domain, each point of the grid (image) is represented by the amplitude (relative position) and the phase of
a periodic function. Features are resolved into different frequency components, including regional trends,
local influences and noise, adding a new perspective to the two dimensional spatial data.
The power spectrum (i.e. the squared amplitude of the Fourier components) of the transformation
indicates the amplitude of the wavenumbers and carries slightly less information due to the loss of the
phase component. In general, deep geological structures are responsible for generating low amplitude
anomalies of large horizontal extent and low horizontal gradients (Mares, 1984, p. 127). Thus, the power
spectrum of the FFT regional (geological) component is represented in the lower wavenumber (k) range
of the spectrum (Figure 20.2). It has to be noted that the low wavenumbers (k) correspond to low fre-
quencies (f) and large wavelengths (λ). On the other hand, local features in shallower strata are character-
ized by high amplitude anomalies, of small horizontal extent and more abrupt changes of their horizontal
gradients. These high frequency features are best represented in the higher wavenumber range of the FFT
power spectrum of the magnetic data (high k, high f, small λ). The bimodal nature of the energy spec-
trum with respect to the wavelength permits the distinction between shallow and deep magnetic sources,
which are usually separated by a neutral (no energy) wavelength band (Alldredge, Van Voorhis, & Davis,
1963). The power spectrum can also be used for approximating the horizontal dimensions of the subsur-
face targets, and estimates of the depth of the target features can be calculated from the decay rate of the
power spectrum (Bhattacharyya, 1966, p. 98). Although Fourier analysis is limited by the spectral overlap
of the anomalies, some more information can be obtained by the examination of the phase component of
the anomalies (Zurflueh, 1967, p. 1017). In total, FFT and the frequency processing of magnetic potential
data are useful in resolving specific anomalies, changing the effective field inclination, calculating the
vertical derivatives of the total magnetic field intensity and identifying areas of interest. A summary of
the different frequency filtering techniques is given by Telford, Geldart, and Sheriff (1990, pp. 107–110).
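A wavenumber band pass of the kind used to separate shallow archaeological sources from deep geological ones (compare the 100 to 550 radians/m band of Figure 20.2) can be sketched as follows; a minimal illustration, assuming square pixels of spacing `dx` and data from which any linear trend has already been removed:

```python
import numpy as np

def wavenumber_bandpass(grid, dx, k_low, k_high):
    """Keep only Fourier components with radial wavenumber
    k_low <= |k| <= k_high (radians per metre).

    dx is the sample spacing in metres; a hard (boxcar) cut-off is used
    here for simplicity, whereas practical filters taper the edges.
    """
    ny, nx = grid.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    k = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    F = np.fft.fft2(grid)
    mask = (k >= k_low) & (k <= k_high)
    return np.real(np.fft.ifft2(F * mask))

# Two superimposed sinusoidal "anomalies" of distinct wavenumbers:
x = np.arange(64)
high_k = np.cos(2 * np.pi * 8 * x / 64)   # |k| ~ 0.79 rad/m (shallow-like)
low_k = np.cos(2 * np.pi * 2 * x / 64)    # |k| ~ 0.20 rad/m (regional-like)
grid = np.tile(high_k + low_k, (16, 1))
kept = wavenumber_bandpass(grid, dx=1.0, k_low=0.5, k_high=1.0)
```

The filter passes the short-wavelength component untouched and rejects the long-wavelength regional one, which is exactly the separation the bimodal power spectrum motivates.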
The selective two-dimensional filtering of high or low wavenumber components enhances the cor-
responding shallow or deep features associated with those components (Kearey & Brooks, 1984, p. 169).
Figure 20.2 Application of the Fast Fourier Transform (FFT) power spectrum analysis of the magnetic data
obtained from the Bronze Age cemetery of the Békés Koldus-Zug site cluster – Békés 103 (BAKOTA proj-
ect). The depth of the various targets (h) is easily determined by measuring the slope of the power spectrum
at different segments and dividing it by 4π (Spector & Grant, 1970). The radially averaged spectrum was cal-
culated and used to separate the magnetic signals coming from deep sources (h=2.87 m) and shallow sources
(h=0.73 m) below the sensor. The spectrum was also used as a guide to define a bandwidth filter in order to eliminate the sources with wavenumbers above 550 radians/m or below 100 radians/m,
and enhance the magnetic signal coming from the potential archaeological structures. A colour version of this
figure can be found in the plates section.
High frequency anomalies from shallow sources can be enhanced by high pass filters. Low pass filtering
can eliminate high frequency effects created by inhomogeneities close to the magnetometer sensor or
in the vicinity of the electrodes in an electrical resistance survey (Pattantyus, 1986, p. 566). In order to
achieve a separation of regional and residual anomalies with frequency filters, the spectral contents of the
anomalies from the two different depths are assumed to be somewhat different.
In magnetic or EMI data, any change in the height of the sensor from the ground, due to deep plough-
ing, changing of operators or even imperfect balancing of the sensors, may cause striping effects either
within a grid or among grids. This will result in periodic noise that is removed by examination of the
power spectrum of the data, which specifies the degree of periodicity and the filter that has to be applied
to remove the effects of striping.
In upward and downward field continuation, the potential field is calculated above or below the plane
of measurement in order to give emphasis to deeper or shallower features correspondingly (Sheriff, 1989,
p. 141). It can enhance or depress the regional or residual component of the maps and even reduce topo-
graphic and sensor height variation effects (Telford et al., 1990, p. 106). Upward continuation of the mag-
netic field is applied in order to emphasize the regional trends originating from deep geological sources.
The opposite effect is achieved by downward continuation of the magnetic potential data, resulting in
an attenuation of the deep sources and accentuation of near surface targets. The successive application of
downward continuation is able to resolve the overlapping anomalies of adjacent features, give an estimate
of their vertical extent and enhance the high wavenumber components of the magnetic map, namely
those related to the shallow subsurface features. Downward continuation is accurate enough up to a depth
of two to three times the separation between the source and the sensor (Bhattacharyya, 1965, p. 857).
When dealing with total magnetic field measurements, the calculation of the vertical derivatives can
emphasize the outline of features present near the inflection points of the magnetic field. Thus, the verti-
cal derivatives are used as a measure of the curvature of the magnetic map, the large values of which cor-
respond to the shallow targets. Similarly, the reduction of magnetic potential data to the pole has been one
of the standard geophysical processing steps in eliminating the distortion of the magnetic map due to the
obliquity of the magnetic field, thus locating the magnetic anomalies right above the sources responsible
for them. The frequency response of the filter (Scollar et al., 1990, p. 494; Spector, 1975) depends on a
factor for the direction of the measurement and a factor for the direction of magnetization. The method
produces a simpler and more symmetrical appearance of extensive features, suppresses weak anomalies
and removes the general distortion of the magnetic maps.
transects, an alternative solution is to work with FFT de-striping as in the gridded data. This however
requires that any gaps within the survey are filled with interpolated values, as the FFT does not accept
measurement voids or non-orthogonal area shapes (Kalayci & Sarris, 2016).
Interpolation procedures are also problematic in non-gridded data, as the proximity of the sensors, the overlap of measurements from different sensors, the non-equal balance of the sensors, the tilting of the
sensors during the survey and other practical factors introduce inconsistencies among the measurements
obtained. For this reason, different functions (such as a convex hull or an a-shape algorithm) are applied
to create a buffer of a piecewise linear simple curve associated with the distribution of the finite set of
measurements (Bernardini & Bajaj, 1997). The particular buffers indicate the overlap of the regions, leading to the elimination of spatially close measurements. Then, interpolation algorithms (such as Inverse Distance Weighted (IDW), kriging, minimum curvature or nearest neighbor; see Conolly, this volume; Lloyd & Atkinson, this volume) can be used to create the resulting map.
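Of these, IDW is the simplest to sketch; a minimal illustration for scattered, non-gridded readings, with the distance power as a tunable parameter (2 is a common choice) and a small epsilon so that target points coinciding with samples return the sample value:

```python
import numpy as np

def idw(xy, values, targets, power=2.0, eps=1e-12):
    """Inverse Distance Weighted interpolation of scattered measurements.

    xy: (n, 2) sample coordinates; values: (n,) readings;
    targets: (m, 2) points of the output grid.
    """
    d = np.linalg.norm(targets[:, None, :] - xy[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)  # nearer samples weigh more
    return (w @ values) / w.sum(axis=1)

# Interpolate a point midway between two readings:
xy = np.array([[0.0, 0.0], [2.0, 0.0]])
vals = np.array([10.0, 20.0])
mid = idw(xy, vals, np.array([[1.0, 0.0]]))
```

Kriging and minimum curvature follow the same scattered-to-grid pattern but replace the distance weights with a fitted variogram model or a smoothness constraint, respectively.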
soil strata. Different 2D/3D resistivity inversion algorithms (Loke & Barker, 1996; Loke, 2004) can be
applied and most of them partition the subsurface into a number of layers, which are also subdivided to a
number of individual elements, which are allowed to have an independent resistivity value. These inver-
sion algorithms run iteratively, continuously changing the resistivity of these elements until they achieve
the best possible match between the theoretical values of the electrode configurations with the observed
measurements. The divergence between the actual measured values and the theoretical model is evalu-
ated through the calculation of the Root Mean Square (RMS) error. The stability of the RMS is a sign
of the best matching between the calculated and apparent resistivity values and leads to a more accurate
reconstruction of the distribution of the subsurface resistivity (Papadopoulos et al., 2011).
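The RMS divergence that steers the iterations can be sketched as follows; a minimal illustration using a relative (percent) RMS, which is one common convention among inversion codes rather than a fixed standard:

```python
import numpy as np

def rms_misfit(observed, calculated):
    """Percent RMS divergence between observed apparent resistivities
    and those predicted by the current inversion model; iteration is
    typically stopped once this value stabilises."""
    observed = np.asarray(observed, dtype=float)
    calculated = np.asarray(calculated, dtype=float)
    return 100.0 * np.sqrt(np.mean(((observed - calculated) / observed) ** 2))

obs = np.array([100.0, 200.0, 400.0])   # measured apparent resistivities
calc = np.array([110.0, 180.0, 400.0])  # forward-modelled values
err = rms_misfit(obs, calc)
```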
In order to avoid large inconsistencies in the inversion model, it is necessary to eliminate any extreme measurements, which are caused by poor contact resistances of the electrodes or by connectivity problems; their removal confines the dynamic range of the resistivity values. Topographic corrections are also required
to define the actual location of the electrodes and create a more realistic image of the pseudo-section in
terms of the existing terrain.
Having a number of parallel ERT transects, it is further possible to create a 3D model of the apparent
resistivity and from this to extract slices of the horizontal distribution of the resistivity with increasing
depth (Figure 20.3), in a similar way to the depth slices of the GPR survey (Papadopoulos, Tsourlos,
Tsokas, & Sarris, 2006, 2007).
Figure 20.3 (a) 3D resistivity model. (b) Three dimensional distribution of the calculated apparent resistivity
results from model A. (c) Pseudo-3D slices of the resistivity resulting from the 2D inversions along the X, Y
and XY axes. (d) 3D resistivity model from the three dimensional inversion. Due to the wide range of the
resistivity values, a logarithmic scale is used (Papadopoulos et al., 2006).
Figure 20.4 An example of processing approaches applied to a radargram obtained at Lechaion archaeological
site with an antenna of 250MHz: (a) Raw data without any processing, (b) application of Dewow, (c) Spreading
Exponential Compensation (SEC) gain with an attenuation of 6.16, start gain of 2.56 and maximum gain of
542, (d) application of automatic gain with a window width of 1.5 and maximum gain of 500, (e) application
of regional background removal, (f) migration process, (g) application of a high pass filter, and (h) envelope
transformation.
on the adjacent traces in the spatial direction (spatial filters) or along the traces in the vertical direction
(down to trace or vertical filters) (Annan, 2009; Sensors & Software, 2013).
Having processed each radargram, a Hilbert Transform computes the instantaneous amplitude, con-
verts it to a magnitude value based upon which time slices are extracted. Usually, the specific process is
accompanied by a migration process, which reduces the reflectors’ hyperbolas into point like sources.
Having an estimate of the velocity of propagation of the EM signal through the specific ground condi-
tions, time slices are transformed to depth slices, namely maps indicating the horizontal distribution of
the reflectors (reflector amplitude) with increasing depth. In depth slices, intense values indicate strong
changes of high reflectivity, namely changes of the electrical properties of the soil strata. For the accurate determination of the velocity of propagation of the EM waves (~0.10 m/ns on average), Common
Mid-Point (CMP) or Wide Angle Reflection and Refraction (WARR) GPR surveys are needed. Due to
the compactness of the antennas used, this is not a common practice in most archaeological prospection
surveys; instead an estimate of the velocity is calculated through hyperbola shape fitting to one of the
registered reflectors (Sensors & Software, 2015, pp. 87–92).
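The envelope (instantaneous amplitude) computation and the time-to-depth conversion can be sketched as follows; a minimal illustration of just these two steps, omitting the dewow, gain and migration stages of a full radargram pipeline (Figure 20.4), with the Hilbert transform implemented via the FFT:

```python
import numpy as np

def envelope(trace):
    """Instantaneous amplitude of a GPR trace via the analytic signal
    (a Hilbert transform implemented with the FFT)."""
    n = len(trace)
    F = np.fft.fft(trace)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0       # double positive frequencies
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(F * h))

def time_to_depth(t_ns, v=0.10):
    """Two-way travel time (ns) to depth (m) for a propagation velocity
    v in m/ns; 0.10 m/ns is the average value quoted in the text."""
    return v * t_ns / 2.0

n = 256
trace = np.cos(2 * np.pi * 16 * np.arange(n) / n)  # synthetic oscillation
env = envelope(trace)
```

For a pure oscillation the envelope is flat, which is the point of the transform: it converts the oscillating reflection into a magnitude value from which time (and then depth) slices can be extracted.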
Depth slices are usually exported to 3D visualization software to create a volumetric representation of
the reflectors’ distribution. Through the isolation of the weak background noise, it is possible to separate
the most intense reflectors and have a 3D visualization of their extent.
Figure 20.5 Gravity residual anomalies recorded above two tombs (Tombs 4 (above) and 8 (below)) of the
Roman cemetery on the Koutsongila Ridge at Kenchreai on the Isthmus of Corinth, Greece. The centres
of the tomb chambers are located approximately at the middle of the transects. According to the result-
ing graphs, it is estimated that both tombs have a width of about 4.5–5 m. The gravity signature of tomb
T4 is better presented compared to the one of tomb T8, probably, because T4 is located within a more
homogeneous geological unit (valley fill deposits), whereas T8 is located at the border between the valley
fill deposits and conglomerate outcrops that extend to the central section of the ridge. Both tombs have
created a well-defined gravity anomaly with at least 0.04–0.08 mGal maximum variation with respect to
the average background (Sarris et al., 2007).
approaches or simplified sensor position (taking into account the actual location of the sensor) or down-
ward continuation (taking into account the height differences of the sensor) corrections (Panisova &
Pasteka, 2009). The gravitational effect of the surrounding walls of the structures are modelled through
finite elements or prisms (having a specified thickness, density and height) reaching a range of about
10–100 mGal (Debeglia & Dupont, 2002).
employed in reflection seismic surveys. Poor source coupling due to rough surface conditions, together with the effects of the weathering layer and the topography of the survey area, attenuates the seismic signals and produces weak reflections. Signals registered by the geophones represent a time series, as the acoustic waves are
recorded at different times depending on the distance of the geophones from the source. The effect of
multiple reflections in a layer (ringing effect), which is often encountered in the seismic measurements,
is addressed through a deconvolution process, which collapses the wavelets to their approximate point
locations. As the travel times of the acoustic waves are recorded with respect to the distance offset, time
(Normal Move Out – NMO) corrections are applied based on a primary velocity analysis to separate the
uncorrelated coherent noise (multiples) from the primary reflections. Frequency multiples can be further
removed through CMP stacking, which accounts for increasing the signal to noise ratio. Frequency-
wavenumber migration is the final main processing stage as it reduces the hyperbolas of the diffraction
signals to their actual point-like locations (Yilmaz, 1976; Hatton, Worthington, & Makin, 1986). Trace
to trace decorrelation attenuates even further the background noise. At the end of the processing and
based on the modeling of each seismic profile, transects are integrated to provide a 3D layered model of
the subsurface (Figure 20.6).
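The NMO correction at the heart of this sequence follows directly from the hyperbolic travel-time relation; a minimal sketch of the travel-time equation and the shift applied before CMP stacking, assuming a single constant stacking velocity:

```python
import numpy as np

def nmo_time(t0, offset, velocity):
    """Normal Move-Out travel time: two-way time of a reflection at a
    geophone `offset` metres from the source, for zero-offset time t0
    (s) and stacking velocity (m/s): t(x) = sqrt(t0^2 + x^2 / v^2)."""
    return np.sqrt(t0 ** 2 + (offset / velocity) ** 2)

def nmo_shift(t0, offset, velocity):
    """Correction subtracted from each trace so the same reflector
    aligns at t0 across all offsets before CMP stacking."""
    return nmo_time(t0, offset, velocity) - t0

# Reflector at t0 = 0.1 s, geophone 30 m from the source, v = 500 m/s:
tx = nmo_time(0.1, 30.0, 500.0)
```

After this shift the primary reflections align across offsets and stack constructively, while multiples, whose move-out does not match the primary velocity, are attenuated.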
Figure 20.6 Results of a seismic refraction survey at the area of the assumed ancient port of Priniatikos Pyrgos
in East Crete, Greece: (a) 2D image representing the depth to the bedrock, which reaches about 40 m below
the current surface (bluish colors). The black dots represent the position of the geophones along the seismic
transects. The area has been completely covered by alluvium deposits and other conglomerate formation frag-
ments as a result of past landslide and tectonic activity. The interpretation of the velocity of propagation of
the acoustic waves revealed the spatial distribution of (b) the alluvium deposits at the top (velocity of 491 m/
sec), (c) the lower and upper terrace deposits (velocity of 1830 m/sec), (d) the medium depth sandstones and
conglomerates (velocity of 2400 m/sec) and (e) the deeper weathered limestone or cohesive conglomerates
(velocity of 4589 m/sec) (Sarris, Papadopoulos, & Soupios, 2014). A colour version of this figure can be found
in the plates section.
Case study
A number of geophysical approaches have been carried out in the past years aiming to explore the inter-
nal organization of a number of Neolithic settlements in the area of Thessaly, Central Greece, and their
environs. Thessaly is the locus of a high density of mounded (magoules) and flat settlements which played
a critical role in the origins of the Neolithic in Europe. The limited number of systematic excavations and of large-scale non-invasive surveys has been a serious obstacle to understanding the spatial
organization of these agricultural villages. In order to explore the structural diversity and similarities of
the settlements, a manifold geophysical survey and airborne/space-borne remote sensing approach was
implemented to map a number of them. The geophysical surveys made use of handheld (Bartington
G601) and cart-based multisensor magnetometry (SENSYS GmbH MX Compact system carrying 8
fluxgate gradiometer sensors), GPR (Sensors & Software Noggin Plus unit with a 250MHz antenna), soil
resistance (Geoscan Research RM85) and EMI (Geophex GEM-2, GF Instrument CMD Explorer, and
Geonics EM-31) techniques, accompanied by analysis of the chemical and magnetic properties of soil
samples. Differential GPS (DGPS) units were employed for navigation of the multisensor cart and the
EMI surveys. Processing of the data followed the pipeline that has been outlined in the above sections.
In total, within the auspices of the IGEAN project (Innovative Geophysical Approaches for the Study of
Early Agricultural Villages of Neolithic Thessaly), 21 Neolithic settlements were scanned covering a total
area of more than 70 hectares (Sarris et al., 2017).
Processing of the magnetic data was able to map the architectural structures and the layout of the
sites. Intense magnetic anomalies were caused by burnt (intentionally?) daub houses, whereas the stone
foundations exhibited weaker magnetic values but stronger GPR or resistivity signals, for example at
Velestino-Mati (Figure 20.7). The stone foundations most probably belong to different occupation
phases and indicate either re-occupation or expansion and growth of the settlement, as at Almyriotiki
(Figure 20.8). Different layouts were distinguished, following either a circular arrangement on mounded
settlements as at Almyros 2 (Figure 20.9), Almyriotiki and Velestino-Mati, or rectangular planning in
flat settlements. Magnetic data were also revealing with respect to the clusters/neighborhoods within the
settlement and the differences in terms of house size and orientation. The extent of most of the settle-
ments was revealed through magnetic techniques and most of the detected enclosures were confirmed
to have ditches through the EMI measurements (increased magnetic susceptibility and soil conductivity).
Geophysical results did not resolve the ambiguity of the function of the enclosures, whether defensive (Runnels et al., 2009), social (Halstead, 1999), or for burials or water storage (Pappa & Besios, 1999). However, flooding simulation modeling based on the analysis of satellite-derived DEMs suggested that the ditches might have acted as a counter-measure against periodic flooding episodes. This hypothesis was strengthened
Figure 20.7 Results of the geophysical surveys at Velestino Mati. The magnetic data (a) indicates the nucleus
of the settlement at the west top of the magoula with some expansion towards the east top. A number of high
dipolar magnetic anomalies are associated with burnt daub foundations that were also confirmed from the
Electromagnetic Induction (EMI) soil magnetic susceptibility (b) and the soil resistance data (c). Magnetic
susceptibility also confirmed the existence of enclosures around the tell. A colour version of this figure can be
found in the plates section.
by the EMI measurements that indicated high conductivity of the soils in particular sections outside the
ditches, providing further support of a flood farming production strategy as was originally suggested by
Van Andel and Runnels (1995). Breaks along the perimeter of the (sometimes multiple and even concen-
tric) enclosures were also revealed designating the entrances to the settlements.
The above techniques made an exceptional contribution to the study of the Neolithic landscapes of the region of Thessaly, producing new evidence on the spatial characteristics of the settlements, their extent and internal structure, while raising new questions about their social organization and the exploitation of the surrounding land.
Conclusion
Geophysical data processing offers an unlimited number of choices that depend on the method of
measurement adopted, survey conditions, instrumentation and the configuration of the sensors, the
physical properties measured, the location and properties of the targets, and the various automatic or
semi-automatic algorithms that are able to enhance the original signals. Whatever the case, the common denominator remains the interpretation of the measurements and the recognition of subsurface features; even as the topic is increasingly approached through more automated techniques such as machine learning and pattern recognition, the interpretation of geophysical features will remain one of the main challenges of shallow-depth archaeological prospection. The interpretation process cannot always be considered irrefutable; it will always depend on the geological, soil and surface characteristics, the target’s preservation condition, our experience, and our theoretical models and hypotheses. To this
Figure 20.8 Results of the geophysical surveys at Almyriotiki. The magnetic data (a) presented a clear image
of the internal planning of the settlement: Burnt daub structures follow a circular orientation around the top
of the tell. The houses expand further to the south, where some weaker magnetic anomalies representing stone
houses with internal divisions are also present. An irregular wide ditch system encloses the settlement from the east and the north, and is confirmed by the EMI magnetic susceptibility (b) and soil conductivity measurements (c). The high soil conductivity to the north coincides with an area susceptible to periodic flooding. The above were also confirmed by the soil viscosity measurements (d), an indicator of soil permittivity.
A colour version of this figure can be found in the plates section.
Figure 20.9 Results of the geophysical surveys at Almyros 2. The magnetic data (a) clearly depict the concentration of burnt daub structures at the centre of the tell, expanding further to the south. The settlement is surrounded by a double ditch system, which is confirmed by both EMI magnetic susceptibility (b) and soil conductivity data (c). A number of breaks in this double enclosure are most probably associated with multiple entrances to the settlement. Soil conductivity also seems to increase outside the settlement to the south and west (north is to the top), namely in the area most susceptible to flooding. A colour version
of this figure can be found in the plates section.
end, it will continue to be a joint undertaking involving all the different disciplines and experts that are
concerned with unravelling the secrets of our past that are hidden beneath the surface of our planet.
References
Alldredge, L. R., Van Voorhis, G. D., & Davis, T. M. (1963). A magnetic profile around the world. Journal of Geophysi-
cal Research, 68, 3679–3692.
Annan, A. P. (2009). Electromagnetic principles of ground penetrating radar. In H. M. Jol (Ed.), Ground penetrating
radar theory and applications (pp. 1–40). Amsterdam: Elsevier.
Bernardini, F., & Bajaj, C. L. (1997). Sampling and reconstructing manifolds using alpha-shapes. Computer Science Techni-
cal Reports. Department of Computer Sciences, Purdue University, 1–11.
Bevan, B., & Kenyon, J. (1975). Ground penetrating radar for historical archaeology. MASCA Newsletter, 11(2), 2–7.
Bhattacharyya, B. K. (1965). Two-dimensional harmonic analysis as a tool for magnetic interpretation. Geophysics,
30(5), 829–857.
Bhattacharyya, B. K. (1966). Continuous spectrum of the total magnetic field anomaly due to a rectangular prismatic
body. Geophysics, 31(1), 97–121.
Burger, H. R., Sheehan, A. F., & Jones, C. H. (2006). Introduction to applied geophysics: Exploring the shallow subsurface. New York: W. W. Norton & Company.
Conyers, L. B. (2013). Ground-penetrating radar for archaeology (3rd ed.). Series Editors: L. B. Conyers & K. L. Kvamme. Geophysical Methods for Archaeology No. 4. Lanham, MD: AltaMira Press.
Dean, W. C. (1958). Frequency analysis for gravity and magnetic interpretation. Geophysics, (23), 97–127.
Debeglia, N., & Dupont, F. (2002). Projet de Réalisation d’un Nouveau Réseau Gravimétrique Français – Liaisons Gravimé-
triques Entre des Bases du Réseau Français de Référence RGF83 et des Bases Absolues Récentes, BRGM/RP-51502-FR.
Edwards, L. S. (1977). A modified pseudosection for resistivity and induced polarization. Geophysics, (42), 1020–1036.
Geoscan Research. (2005). Instruction manual 1.97 (Geoplot 3). Bradford: Geoscan Research.
Green, R. (1974). The seismic refraction method: A review. Geoexploration, 12(4), 259–284.
Halstead, P. (1999). Neighbours from Hell? The household in Neolithic Greece. In P. Halstead (Ed.), Neolithic society
in Greece (pp. 77–95). Sheffield: Sheffield Academic Press.
Hatton, L., Worthington, M. H., & Makin, J. (1986). Seismic data processing theory and practice. Oxford: Blackwell
Scientific Publications.
Jahne, B. (1991). Digital image processing: Concepts, algorithms, and scientific applications. Berlin: Springer–Verlag.
Jol, H. M. (2009). Ground penetrating radar theory and applications. Amsterdam: Elsevier.
Kalayci, T., & Sarris, A. (2016). Multi-sensor geomagnetic prospection: A case study from Neolithic Thessaly, Greece.
Remote Sensing, 8(11), 966, doi: 10.3390/rs8110966, OPEN ACCESS: www.mdpi.com/2072-4292/8/11/966/html
Kearey, P., & Brooks, M. (1984). An introduction to geophysical exploration. Oxford: Blackwell Scientific Publications.
Lessard, Y. A. (1981). Simulation of magnetic surveying techniques to study the effects of diurnal variations. Senior Project
CS498. Lincoln: Dept. of Computer Science, University of Nebraska-Lincoln.
Loke, M. H. (2004). Tutorial: 2-D and 3-D electrical imaging surveys. Retrieved from www.geoelectrical.com
Loke, M. H., & Barker, R. D. (1996). Rapid least-squares inversion of apparent resistivity pseudo-sections using quasi-Newton method. Geophysical Prospecting, 44(1), 131–152.
MacLeod, I. N., & Dobush, T. M. (1990). Geophysics: More than numbers: Processing and presentation of geophysical data.
4th National Outdoor Action Conference on Aquifer Restoration, Ground Water Monitoring and Geophysical
Methods, 14–17 May, Las Vegas, Nevada (pp. 1081–1095).
Mares, S. (1984). Introduction to applied geophysics. Prague: D. Reidel Publishing Company.
Panisova, J., & Pasteka, R. (2009). The use of microgravity technique in archaeology: A case study from the
St. Nicolas Church in Pukanec, Slovakia. Contributions to Geophysics and Geodesy, 39(3), 237–254.
Papadopoulos, N. G., Tsourlos, P., Papazachos, C., Tsokas, G. N., Sarris, A., & Kim, J.-H. (2011). An algorithm for
the fast 3-D resistivity inversion of surface electrical resistivity data: Application on imaging buried antiquities.
Geophysical Prospecting, (59), 557–575.
Papadopoulos, N. G., Tsourlos, P., Tsokas, G. N., & Sarris, A. (2006). 2D and 3D resistivity imaging in archaeological
site investigation. Archaeological Prospection, 13(3), 163–181.
Papadopoulos, N. G., Tsourlos, P., Tsokas, G. N., & Sarris, A. (2007). Efficient ERT measuring and inversion strategies
for 3D imaging of buried antiquities. Near Surface Geophysics, 5(6), 349–362.
Pappa, M., & Besios, M. (1999). The neolithic settlement at Makriyalos, Northern Greece: Preliminary report on the
1993–1995 excavations. Journal of Field Archaeology, 26(2), 177–195.
Parasnis, D. S. (1997). Principles of applied geophysics. London: Chapman & Hall.
Pattantyus, M. A. (1986). Geophysical results in archaeology in Hungary. Geophysics, 51(3), 561–567.
Reford, M. S. (1980). History of geophysical exploration, magnetic method. Geophysics, 45, 1640–1658.
Reynolds, J. M. (2011). An introduction to applied and environmental geophysics (2nd ed.). Chichester: John Wiley &
Sons Ltd.
Runnels, C. N., White, C., Payne, C., Wolff, N. P., Rifkind, N. V., & LeBlanc, S. A. (2009). Warfare in Neolithic
Thessaly: A case study. Hesperia, 78(2), 165–194.
Sarris, A. (1992). Shallow depth geophysical investigation through the application of magnetic and electric resistance techniques
(Ph.D. Dissertation). University of Nebraska-Lincoln, Dept. of Physics and Astronomy, Lincoln, USA: A Bell &
Howell Company.
Sarris, A., Dunn, R. K., Rife, J. L., Papadopoulos, N., Kokkinou, E., & Mundigler, C. (2007). Geological and geo-
physical investigations in the Roman cemetery at Kenchreai (Korinthia), Greece. Journal of Archaeological Prospec-
tion, (14), 1–23.
Sarris, A., Kalayci, T., Simon, F.-X., Donati, J., Garcia, C. C., Manataki, M., . . . Stamelou, E. (2017). Opening a
new frontier in the Neolithic settlement patterns of Eastern Thessaly, Greece. In A. Sarris, E. Kalogiropoulou,
T. Kalayci, & L. Karimali (Eds.), Communities, landscapes, and interaction in Neolithic Greece: Proceedings of international
conference, Rethymno 29–30 May 2015 (pp. 27–48). Ann Arbor, MI: International Monographs in Prehistory.
Sarris, A., Papadopoulos, N., & Soupios, S. (2014). Contribution of geophysical approaches to the study of priniatikos
pyrgos. In B. P. C. Molloy & C. N. Duckworth (Eds.), A cretan landscape through time: Priniatikos pyrgos and environs
(pp. 61–69). Oxford: BAR International Series 2634.
Scollar, I. (1970). Fourier transform methods for the evaluation of magnetic maps. Prospezioni Archeologiche, (5), 9–41.
Scollar, I., Tabbagh, A., Hesse, A., & Herzog, I. (1990). Archaeological prospecting and remote sensing. Cambridge: Cam-
bridge University Press.
Scollar, I., Weidner, B., & Segeth, K. (1986). Display of archaeological magnetic data. Geophysics, 51(3), 623–633.
Scott, J. H., & Markiewicz, R. D. (1990). Dips and chips-PC programs for analyzing seismic refraction data. Proceedings
of SAGEEP 1990, Golden, Colorado (pp. 175–200).
Sensors & Software. (2013). EKKO_Project. Mississauga, Canada: Sensors & Software.
Sensors & Software. (2015). LineView. Mississauga, Canada: Sensors & Software.
Sheriff, R. E. (1989). Geophysical methods. New Jersey: Prentice Hall.
Simon, F.-X., Sarris, A., Thiesson, J., & Tabbagh, A. (2015). Mapping of quadrature magnetic susceptibility/magnetic
viscosity of soils by using multi-frequency EMI. Journal of Applied Geophysics, 120, 36–47.
Simon, F.-X., Tabbagh, A., Donati, J., & Sarris, A. (2018). Permittivity mapping in the VLF-LF range using a multi-frequency EMI device: First tests in archaeological prospection. Near Surface Geophysics, 17(1), 27–41.
Spector, A. (1975). Application of aeromagnetic data for porphyry copper exploration in areas of volcanic cover. 45th Annual
International Meeting of the Society of Exploration Geophysicists, 15 October, Denver, Colorado.
Spector, A., & Grant, F. S. (1970). Statistical models for interpreting aeromagnetic data. Geophysics, 35(2), 293–302.
Szymczyk, M., & Szymczyk, P. (2013). Preprocessing of GPR Data. Image Processing & Communication, 18(2–3), 83–90.
Telford, W. M., Geldart, L. P., & Sheriff, R. E. (1990). Applied geophysics (2nd ed.). Cambridge: Cambridge University
Press.
Van Andel, T. H., & Runnels, C. (1995). The earliest farmers in Europe. Antiquity, 69(264), 481–500.
Weymouth, J. W. (1976). A magnetic survey of the Walth Bay site (39WW203). Lincoln, Nebraska: Midwest Archaeological Center, National Park Service, U.S. Department of the Interior.
Weymouth, J. W., & Lessard, Y. A. (1986). Simulation studies of diurnal corrections for magnetic prospection. Pros-
pezioni Archeologiche, (10), 37–47.
Yilmaz, O. (1976). A short note on deep seismic sounding in Turkey. Journal of Geophysical Society of Turkey, (3), 54–58.
Zurflueh, E. G. (1967). Applications of two dimensional linear wavelength filtering. Geophysics, 32(6), 1015–1035.
21
Space and time
James S. Taylor
Introduction
Concepts of spatiotemporality
Concepts of ‘space’ and ‘time’ are fundamental to the discipline of archaeology, which deals with the distribution of human material culture at various scales through the timespan of human existence (see Lucas,
2005). More broadly, the integrated nature of spatiotemporality has been well established in science for
over a century, since the acceptance of the theory of relativity (Einstein, 1905). Since then, as Daly
and Lock outlined in their comprehensive review of the subject – “Timing is Everything” – (Daly &
Lock, 1999, p. 289), over the course of the 20th century a robust corpus of theoretical literature has developed across the humanities, especially in anthropology (Evans-Pritchard, 1939; Lévi-Strauss, 1948, 1961;
Bloch, 1977; Bourdieu, 1977; Fabian, 1983; Gell, 1992), sociology (Giddens, 1984), philosophy (McTaggart, 1908; Heidegger, 1953; Husserl, 1966) and of course geography (Carlstein, Parkes, & Thrift, 1975;
Hägerstrand, 1975; Carlstein & Thrift, 1978; Parkes & Thrift, 1978; Soja, 1989; Harvey, 1991; Soja, 1996).
Of particular interest to archaeology are those conceptions of temporalities that relate to similar spatial
concepts of landscape and place (as espoused by Soja, 1989; Harvey, 1991; adapted by Ingold, 1993). It is
commonly accepted that there are many alternative forms of temporal perception, generally based upon
the perspective of the observer (whether that be the emic agent of a ‘past society’ or the etic archaeologist; see Headland, Pike, & Harris, 1990, and, for an in-depth discussion, Taylor, 2016, pp. 43–44).
Daly and Lock also note clear engagement by many theoretical archaeologists with the way in which
“constructs of time can be relevant to and applied in archaeology” and “how archaeology can contrib-
ute to the overall understanding of time as it relates with humans and human processes” (Daly & Lock,
1999, p. 289). They specifically highlight works by Bradley (1991), Clark (1992), Ingold (1993), Barrett
(1994), Gosden (1994), Thomas (1996), Terrell and Welsch (1997), Bradley (1998), and Frachetti (1998);
but one might also factor in work by Braudel (1972), Leone (1978), Braudel (1980), and Shanks and Til-
ley (1987) as well as Bailey (1983, 1987, 2007, 2008), much of which is neatly discussed by Lucas in his
Archaeology of Time (2005).
Despite this longstanding disciplinary awareness of the relevance of time, temporality and the affor-
dances of spatiotemporal computing, fully integrated spatiotemporal synthesis remains uncommon
within archaeological narratives. However, it is perhaps more significant that, even as digital methods continue to gain traction within archaeology, the integration of spatial and temporal data, and their subsequent analysis, remains relatively simplistic, and even somewhat elusive, within the fundamental data structures that underpin the discipline.
From a computational angle, the integration of time or temporal data within Database Management
Systems (DBMS) and Geographic Information Systems (GIS) has also been discussed in detail, both in relation to and outside of the discipline of archaeology (see for example Langran, 1989, 1992; Roddick &
Patrick, 1992; Lock & Daly, 1998; Abraham & Roddick, 1999; Daly & Lock, 1999; Peuquet, 2002; Green,
2011a, 2011b; De Roo, Bourgeois, & Maeyer, 2013; Taylor, 2016). However, there has been a distinct lack of development in this area within commonly used GIS software and, given the long history of the development of GIS and spatial technologies (dating back to the 1960s), any sort of integrated temporal functionality has been a relatively recent innovation.
Given the potential ways in which temporality might be addressed in computational terms (see following), the reasons for this are not entirely clear. It may be that the specific spatiotemporal requirements of archaeologists are too niche to warrant the sustained research and development of integrated spatiotemporal technologies. This seems unlikely, as the need for integrated spatiotemporal analysis of data is not unique to archaeology, and similar issues apply to other disciplines (for example geography, the environmental and social sciences, and project management within the engineering, construction and planning industries all have complex spatiotemporal data requirements). More likely, the apparent simplicity of current computational approaches to space and time is rooted in one fundamental issue: it is complicated. Constructing a computational spatiotemporal system and/or data structure that serves the varied and subtle analytical needs of these disciplines is difficult and has seen little real-world development.
A key aspect is that within computational spatial technologies Space is by definition Cartesian in its
conception, and therefore framed (or perhaps limited) by the Euclidean geometric and algebraic tools
most commonly used to describe and understand it (maps, coordinates, projections, etc.). This approach
to space has been extended into our understanding of time (at least within the constraints of spatial
technologies) which is commonly understood in terms of “cartographic time” (see Langran, 1992,
pp. 28–29). This is a distinctly ‘Newtonian’ view of time as a linear fourth Cartesian dimension that
flows from past infinity to future infinity and can be measured separately from the other three spatial
dimensions (ibid.; Taylor, 2016). Thus, by extension, time is most commonly defined by GIS practitioners, in archaeology and elsewhere, as an extension of a spatial Cartesian system: a measurable fourth dimension. This approach to time fits neatly with an archaeological (and more general social) concept of
a linear, chronological time, and usually culminates in a form of numeric spatiotemporal data (i.e. lists of
coordinates, dates and timestamps).
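In data-structure terms, this ‘fourth coordinate’ model reduces to records of three spatial values plus one temporal value. A minimal sketch (field names and values entirely hypothetical) of such a spatiotemporal record:

```python
from dataclasses import dataclass

@dataclass
class SpatioTemporalPoint:
    """A location with time appended as a fourth, linearly measurable coordinate."""
    x: float  # easting
    y: float  # northing
    z: float  # elevation
    t: int    # timestamp at some granularity, e.g. a calendar year (negative = BCE)

# A hypothetical find-spot dated to c. 5300 BCE: three Cartesian coordinates
# plus a single numeric temporal value, exactly as described in the text.
find = SpatioTemporalPoint(x=412350.0, y=4362900.0, z=71.5, t=-5300)
```

The limitation discussed below is already visible here: `t` can express an instant or, at best, a range, but not relative or fuzzy chronologies.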
This definition of time clearly privileges the linear, sequential, ‘measurable’ temporality that also neatly fits within the dominant relational data structures that we use to manage our spatiotemporal data.
For most archaeologists, time is almost a fourth coordinate, generally manifesting as some sort of timestamp at a relevant level of granularity (a broad archaeological period, or a year or range of years given by an absolute dating technique). This is a hugely simplistic way of dealing with temporality. Perhaps
the challenge when seeking an integrated form of spatiotemporality comes in defining a more nuanced
and abstract temporality, reflecting the presence of multiple real-world temporalities. As geographers and
spatial theorists have increasingly sought to characterise a sense of ‘Place’ as opposed to ‘Space’, so perhaps
there is a need to distinguish a different sense of ‘Temporality’ from the common linear, numerical con-
cept of ‘time’ itself (see for example Soja, 1989; or Ingold, 1993). Archaeology, as a discipline predomi-
nantly concerned with the subtleties of these concepts, has much to offer.
Table 21.1 Summary of the three key computational approaches to the integrated conceptual modeling of spatiotemporality in spatial technologies.
for an archaeological audience, however she outlined four popular conceptions of computational spatio-
temporality, detailing their pros and cons at a pragmatic level (Langran & Chrisman, 1988, p. 11; Langran,
1992, pp. 37–44). These, alongside a further event-based approach posited by Peuquet and Duan (1995),
are based upon the simplified codification of a discrete (linear) temporal attribute, or chronon: a single “nondecomposable unit of time with an arbitrary duration” (Snodgrass, 1992, p. 24) – a second, a minute, or a year, for example.
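Snodgrass’s chronon can be sketched as integer division of a continuous time value by an arbitrary unit length; the choice of a year-long chronon below is illustrative only:

```python
def to_chronon(timestamp_days: float, chronon_days: float = 365.25) -> int:
    """Map a continuous time value onto a discrete chronon index.

    The chronon is 'nondecomposable': every moment within one unit
    collapses to the same index, so finer distinctions are lost.
    """
    return int(timestamp_days // chronon_days)

# Two events 100 days apart fall in the same year-long chronon,
# while an event over a year later falls in the next one.
same = to_chronon(10.0) == to_chronon(110.0)
later = to_chronon(400.0)
```

The arbitrariness of the duration is the point: the chronon fixes the finest temporal resolution the data model can ever express.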
Method
Implementation
Despite a well-established theoretical discourse on how time might function alongside space within
spatial technologies, and with some notable prototypes, there still remains no fully functional T-GIS.
Peuquet’s (2002) review and critique of the state of these spatiotemporal conceptual models offers some
insight as to why. She notes that the extension of both conventional relational (or otherwise) DBMS and
spatial data models to “include temporal data, or vice versa, will [. . .] result in forms of implementation
that are both complex and voluminous”, particularly if one wants to capture the nuances of “temporal
interrelationships, such as temporal coexistence of specific entities or relative temporal configuration of
various events that are not explicitly stored” (Peuquet, 2002, p. 307). Indeed, even Peuquet’s own conceptual answer to this, the ‘Event-Based Spatio-Temporal Data Model’ (ESTDM), which (despite being problematic) proposes a more versatile and efficient approach to temporal modelling (Peuquet & Duan, 1995), has seen only limited (if any) realisation within current GIS technologies.
Outside of the relatively small, predominantly research-driven corpus of geographic examples, there has also been relatively little published work to date on the integration of space and time within GIS or spatial technologies (compared to other research and development innovations in
spatial technologies), and certainly no literature advocating a standard or best practice for the structure of
spatiotemporal data. There is no broad research into a common methodology for its analysis (still less within
archaeological applications). Despite the huge amount of potential in this field of research, there remains
much work to be done.
That said, things are beginning to move on as processing power has become faster, and software more
sophisticated. Work on more integrated spatiotemporal data management has continued in the wider
commercial GIS industry. Most notably, ESRI (Environmental Systems Research Institute) began to
improve the functionality of temporal data in its 2010 software release, ArcGIS 10, with the addition of a fairly straightforward ‘time slider’ (but see also the discussion of ‘space-time cubes’ later in this chapter). These
tools facilitate temporal animation (based upon start and end date attributes, effectively time-stamps, of
objects within the geodatabase) in order to visualise the evolution of features in a geodatabase. Again
reliant on chronon-based data, this approach is closely related to basic time-slicing techniques (outlined
earlier) and as such, is predominantly useful for the consideration of time instants (single events) and
extents (features with lifespans). Such functionality again privileges temporal models rooted in absolute time, which are less able to cope with the vague period ranges and fuzzy boundaries of our absolute dating methods (ceramic spot dates, radiocarbon probability ranges, etc.), or with the relative chronologies that dominate archaeological temporalities.
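The time-slider behaviour described above reduces to a simple predicate over start/end attributes. A minimal sketch (attribute names and values hypothetical) of the selection such a slider performs for each animation frame:

```python
def features_at(features, t):
    """Return features whose [start, end] lifespan contains instant t —
    the query a GIS time slider runs for each frame of an animation."""
    return [f for f in features if f["start"] <= t <= f["end"]]

# Hypothetical geodatabase rows, each carrying start/end timestamps
# (years, negative = BCE) — i.e. chronon-based absolute time.
rows = [
    {"id": "ditch_1", "start": -5500, "end": -5200},
    {"id": "house_3", "start": -5300, "end": -5100},
]
active = features_at(rows, -5250)  # both features coexist at 5250 BCE
```

The weakness the text identifies is plain here: anything that cannot be reduced to two numbers — a radiocarbon probability range, a ‘later than context X’ relation — simply has no place in this scheme.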
Ultimately, all of these conceptual approaches and software solutions are problematic for one fundamental reason: time is still not truly represented as a continuum, but as a list of events or chronons
that represent incremental changes to space. The temporal data is effectively constrained by its location
and is still not dealt with as a free and discrete entity. Generally these spatial systems have one thing in
common in terms of the way they handle time: they all adopt an approach that requires the tabulation
of temporal data. Rather than being fully integrated, time is simply appended as metadata to spatial data
sets; an issue which, by definition, prohibits the development of a true T-GIS (Langran, 1992, pp. 11–12).
Consequently, from a spatiotemporal perspective our main methodological challenge currently boils
down to one simple question: how do we code time in order to fully integrate it with our spatial data?
From a temporal perspective the most literal and difficult way to implement this concept is through
the temporal “voxelisation” of spatial data, where rasterised two-dimensional data is converted into a
three-dimensional voxel structure (a regular grid of 3D cells), “in which the height of the voxels is a time
interval” (Lin & Mark, 1991, p. 987), as opposed to the more conventional use of voxel height to represent
a Cartesian z-coordinate. Hypothetically, it has been suggested that interpolation between the ‘original
data based time slices’ could be used to construct (or re-construct) missing temporal layers, i.e. gaps in
the data (Lin & Mark, 1991, p. 987). Daly and Lock (1999) highlight the inevitable questions about what
would make “appropriate interpolation techniques”.
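Stripped of the GIS machinery, the voxel idea stacks raster time slices so that ‘height’ is a time interval, and interpolates to fill missing layers. A minimal sketch using linear interpolation (one choice among many, and, as Daly and Lock caution, not necessarily an ‘appropriate’ one for archaeological data):

```python
def interpolate_layer(layer_a, layer_b, frac):
    """Linearly interpolate a missing time slice between two known raster
    layers (2D lists of cell values). frac is the position of the missing
    slice between layer_a (0.0) and layer_b (1.0)."""
    return [
        [a + (b - a) * frac for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]

# A hypothetical 2x2 raster surveyed at t=0 and t=2; reconstruct the
# missing t=1 slice as the midway 'voxel layer'.
t0 = [[0.0, 2.0], [4.0, 6.0]]
t2 = [[2.0, 4.0], [6.0, 8.0]]
t1 = interpolate_layer(t0, t2, 0.5)
```

Linear interpolation assumes smooth, continuous change between snapshots — precisely the assumption that abrupt depositional events violate.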
To date, therefore, most experimentation in large-scale voxel modeling has tended to focus upon the
field of commercial geological prospection and the interpolation of subsurface geology for analysis at
the landscape level (see, for example, van der Meulen et al., 2013, for a good example of this approach).
Recently, however, the process of making space-time cubes has also become more attainable, since major
software releases have begun to incorporate tools for their production (alongside a suite of other tempo-
ral tools), and there is a growing trend in experimentation with the potential of voxel modelling within
archaeology (see, for example, Landeschi, 2018).
Related to the space-time cube, Langran notes that “the trajectory of a two-dimensional object through time creates a worm-like pattern in this phase space” (Langran, 1992, p. 37) – a space-time path
(STP). Specifically, this concept builds upon the earlier ‘Time Geography’ of Hägerstrand (1967; see
also for example Kraak, 2003; Miller, 2005; Yu, 2006; Miller & Bridwell, 2009). Halls and Miller (1995,
1996) articulate a similar approach to modeling temporality along a space-time path. They suggest that
a data object’s ‘lifespan’ can be represented as a mathematical curve, or ‘worm’, which can be viewed as
a ‘temporal arc’, constrained by a series of ‘temporal nodes’, or ‘todes’, which can influence the trajectory of the worm. In real terms, is it possible to repurpose the three-dimensional utility of modern GIS to represent time as the third variable in a three-dimensional model (Lock & Harris, 1997)? Again, one
Cartesian dimension (height) is generally sacrificed so that time can be represented as the third axis of a
two-dimensional spatial dataset (Daly & Lock, 1999, p. 288). This approach has been effectively deployed
in a number of cases, perhaps most evocatively by Kwan (2002a, 2002b; see also Kwan, 2008; Kwan &
Ding, 2008), in her efforts to visualise the everyday social geographies of individuals (outlined in the case
studies in this chapter).
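A space-time path is simply a polyline in (x, y, t): one spatial axis is sacrificed so that the third dimension carries time, and an object’s position at any instant can be read off the ‘worm’ by interpolating along its temporal arc. A minimal sketch, with hypothetical coordinates:

```python
def position_at(path, t):
    """Given a space-time path as time-sorted (x, y, t) vertices, linearly
    interpolate the object's location at instant t along its 'worm'."""
    for (x0, y0, t0), (x1, y1, t1) in zip(path, path[1:]):
        if t0 <= t <= t1:
            f = (t - t0) / (t1 - t0)
            return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
    raise ValueError("instant lies outside the path's lifespan")

# An object moving east, then north, over a 20-unit lifespan.
stp = [(0.0, 0.0, 0), (10.0, 0.0, 10), (10.0, 5.0, 20)]
midpoint = position_at(stp, 5)  # halfway along the first temporal arc
```

Each vertex acts like one of Halls and Miller’s ‘todes’: a constraint through which the worm must pass.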
STPs have been used to great effect in the approach adopted by ‘Feminist GIS’ geographer Kwan
(2008) in her study of the experiences of Muslim women in Columbus, Ohio, within the context of a
post ’9/11’ USA. It highlights the potential of STPs in the visualisation of qualitative aspects of spatio-
temporality using spatial technologies. Kwan’s approach clearly demonstrates that it is possible to adapt
some of the conceptual approaches for visualising time and temporality in GIS, to explore an implicit
and somewhat intangible social context of spatiotemporality. This may have far reaching implications for
archaeology, where there may be considerable potential for exploring the lifespan of archaeological finds
(see, for example, Kraak, 2003).
Figure 21.2 Example of Langran’s ‘Snapshot Approach’. In this case ‘snapshot’ (Si) presents a particular ‘world
state’ at time (ti).
Source: redrawn by Neil Gevaux after Peuquet & Duan, 1995, p. 9; Langran, 1992. Note that the temporal distance between ‘snapshots’ need not be uniform.
criticised as being restrictive as it also constrains temporality to known points in time: “the events that
change one state to the next” are not explicitly recorded (Langran, 1992, p. 39). This is a linear approach,
and while much temporal data is distinctly non-linear in character, this is not explicit when visualised as
a sequence of time-slices (Halls & Miller, 1996, p. 12).
Although the case studies in the following section deploy more sophisticated approaches, utilising
bespoke software, it is remarkably straightforward to produce a simple time-slice sequence in any GIS
through the creation of a simple ‘time’ field in the attribute table of a spatial dataset. This is done regularly with archaeological data sets to define and produce archaeological period maps and phased plans of excavation data, for example – which are, in effect, sequent snapshots.
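That workflow amounts to a group-by on the period attribute. A sketch (period labels hypothetical) of how a feature set splits into sequent-snapshot layers:

```python
from collections import defaultdict

def phase_plans(features, period_field="period"):
    """Split a feature set into per-period 'snapshot' layers keyed on an
    attribute field — the usual way phased plans are produced in a GIS."""
    layers = defaultdict(list)
    for f in features:
        layers[f[period_field]].append(f)
    return dict(layers)

finds = [
    {"id": 1, "period": "Early Neolithic"},
    {"id": 2, "period": "Late Neolithic"},
    {"id": 3, "period": "Early Neolithic"},
]
plans = phase_plans(finds)  # one 'time slice' layer per period label
```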
Event-based modeling
The remaining methods of spatiotemporal modeling might be distinguished from the others, as being
‘event oriented’ (see Table 21.1). Like all of the other approaches, time is distilled into a spatial ‘attribute’, which can be symbolised accordingly, and discrete changes in space through time can be modeled as discrete temporal events, usually in a separate (but linked) data table (Langran, 1992, p. 44). In the ‘Base State
With Amendments’ approach a single spatial layer forms the so-called ‘base state’ of a specific geographic
region, and subsequent ‘amendments’ to this image are superimposed (Langran, 1992, pp. 39–41).
‘Space-Time Composite Modeling’ builds upon this method by “flattening” all the temporal data into
a single layer, where formal coding and topology are utilised to reconstruct the temporal sequence. This
can manifest either as a raster-based “temporal grid” solution or a vector-based approach (Langran, 1992,
pp. 43 & 46–47; see also Hazelton, 1991; Kelemis, 1991). The former sees a ‘temporal list’ attached to each
pixel, representing a specific location on a spatially registered grid (Figure 21.3). In the latter, polygon
‘entities’ are imbued with inherent temporal attributes representing incremental change (Figure 21.4)
distinct from their neighbours (Langran, 1992, p. 47). With the polygon approach it is technically pos-
sible to generate new ‘regions’ from the intersection of superjacent polygons holding information about
the ‘change’ between them (see Lin & Mark, 1991).
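Langran’s raster ‘temporal grid’ can be sketched as a mapping from cell coordinates to a variable-length list of time-stamped amendments; the state of any cell at any instant is reconstructed by replaying its list (all names and values here are illustrative):

```python
def cell_state(temporal_grid, cell, t):
    """Replay a cell's (time, value) change list up to instant t — the
    space-time composite 'flattens' all temporal change into such lists."""
    state = None
    for when, value in sorted(temporal_grid.get(cell, [])):
        if when <= t:
            state = value
    return state

# Each cell stores only its own changes, avoiding snapshot redundancy.
grid = {
    (0, 0): [(1990, "forest"), (2005, "urban")],
    (0, 1): [(1990, "forest")],  # this cell never changed
}
then = cell_state(grid, (0, 0), 2000)  # before the amendment
now = cell_state(grid, (0, 0), 2010)   # after the amendment
```

Note that only the changes are stored, which is the efficiency gain over snapshots — but, as discussed below, the *process* producing each change is nowhere represented.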
These event-based approaches have the advantage over conventional snapshot approaches in that they
only store temporal data related to specific locations, also reducing data redundancy (Peuquet & Duan,
1995, p. 10). However, they afford little insight into the process of change itself (Daly & Lock, 1999,
p. 288).
Figure 21.3 Example of Langran’s ‘Temporal Grid’ solution – here a temporal grid is created and a variable
length ‘list’ is attached to each grid cell to denote successive changes.
Source: redrawn by Neil Gevaux after Langran, 1992, p. 46
Figure 21.4 Example of Langran’s ‘Amendment Vector Approach’ – showing urban encroachment where
urban encroachment is represented as a base state (left) with incremental amendment vectors.
Source: redrawn by Neil Gevaux after Langran, 1992, p. 40
Case studies
The following case studies can be related to the conceptual models outlined above. As noted, given the scale of digital practice in archaeology, there are few who have fully engaged with computing space and time. Consequently, there are relatively few robust archaeological case studies.
Figure 21.5 The TimeMap Data Viewer (TMView) map space. A colour version of this figure can be found
in the plates section.
Source: from Johnson, 2002a
vector] snapshots at known points in time, and a series of transitions between these snapshots” (Johnson,
1997). It allows geographically registered historical features, maps and satellite imagery to be superimposed and animated in an event-based system. TimeMap is not a topological system: it does not record the relationships between features in space and time, but simply their locations (Johnson, 1997, p. 6).
As such, TimeMap is not a true ‘spatiotemporal system’: its primary function is the dynamic representation of the past, with limited capability for more complex spatiotemporal analysis. More recently, the addition of temporal functionality in ArcGIS (including a time slider and temporal animation function) has meant that the notion of dynamic mapping pioneered by the TimeMap project is rapidly becoming an integral spatiotemporal tool in modern GIS.
spatiotemporal animations to present the results of this collaborative study as a form of prototype
‘visual biography’, [that would be] more dynamic and nuanced than conventional phasing, that
might be used to underpin and illustrate a social narrative of the building.
(Taylor et al., 2015, p. 127)
In the absence of absolute dates for every stratigraphic unit in the sequence, the project took an
approach that involved parsing through the stratigraphic matrix of the structure (Harris, 1989), to define
a minimum number of stratigraphic events that could be cross-correlated and coded in relation to one
another, such that each event serves as a relative start/end ‘node’, allowing individual units to be defined
in terms of their lifespan, or ‘temporal arc’ (Taylor et al., 2015, pp. 133–146). This ‘temporally-enabled’
spatial data allowed the in-built time-slider functionality of ArcGIS to visualise the spatial data as a series
of dynamic animations that could be symbolised using any of the other data linked to the stratigraphic
units as attribute tables. The result is a powerful spatial visualisation that can be linked to other visualisa-
tions of statistical approaches (such as density, or cumulative frequency of material culture) in order to
demonstrate a wide variety of aspects of the material culture and depositional sequence through time
(Figure 21.6). Although the project was a pilot study, the overall goal was to produce a tool to illustrate
and underpin rich multidisciplinary “visual narratives” about the depositional sequence and its relation-
ship to the material culture it yields, and the overall ‘life-cycle’ of the structure (Chadwick, 2001; after
Lucas, 2001; Taylor et al., 2015, pp. 146–148).
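The relative start/end ‘node’ logic described above can be illustrated with a minimal sketch (unit names are invented, and stratigraphic depth stands in for the project's cross-correlated event nodes): each unit's ‘temporal arc’ runs from its own position in the Harris-matrix ordering to the point at which a later unit seals it.

```python
# unit -> set of units stratigraphically earlier than it
earlier_than = {
    "floor_1": set(),
    "hearth": {"floor_1"},
    "floor_2": {"floor_1"},
    "collapse": {"hearth", "floor_2"},
}

def depth(unit, memo={}):
    """Length of the longest chain of predecessors: a relative start node."""
    if unit not in memo:
        preds = earlier_than[unit]
        memo[unit] = 0 if not preds else 1 + max(depth(p) for p in preds)
    return memo[unit]

# A unit's arc runs from its start node to the earliest start node of any
# unit that seals it, or to the end of the sequence if nothing does.
max_depth = max(depth(u) for u in earlier_than)
arcs = {}
for u in earlier_than:
    sealing = [v for v, preds in earlier_than.items() if u in preds]
    end = min(depth(v) for v in sealing) if sealing else max_depth + 1
    arcs[u] = (depth(u), end)

print(arcs)
# {'floor_1': (0, 1), 'hearth': (1, 2), 'floor_2': (1, 2), 'collapse': (2, 3)}
```

Feeding arcs of this kind to a time slider lets units appear and disappear in the correct relative order even in the absence of calendar dates for every unit.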
Figure 21.6 Sample frame from an animation in the ‘Up In Flames’ study that combined synchronised ani-
mated density graphs (produced in the R Software Environment) with animated density maps (produced in
ArcGIS). A colour version of this figure can be found in the plates section.
Source: from Taylor et al., 2015, p. 145
Figure 21.7 Diagram showing the relationship between the various inputs and outputs of the ‘OH_FET’ urban
fabric model (social use, space and time) and the dynamics of the potential analytical outputs.
Source: from Rodier & Saligny, 2010, p. 32
Figure 21.8 Time-GIS (TGIS) screenshot showing dates symbolised according to temporal topology. The
colour coding is according to the temporal topological relationship between each date and the currently
selected time period. A colour version of this figure can be found in the plates section.
Source: from Green, 2011b, p. 217
layer falling within a selected period, based upon the percentage overlap between the date’s range and the
selected period (see Figure 21.8) (Green, 2011b).
The development of this TGIS has helped to establish ‘aoristic’ or ‘fuzzy’ approaches to handling
archaeological chronological data (see also Fusco & de Runz, this volume), such as those implemented in
recent work considering the way in which time and space might be addressed in large complex archaeo-
logical datasets (e.g. those generated by the Portable Antiquities Scheme – PAS), whereby
rather than assigning artefacts to relative typochronological phases (e.g. the appropriate coin period
or pottery phase), this approach considers the probability that the objects under consideration
belong to one or more time-slices of equal (or less commonly unequal) length (e.g. 50 years) across
a given study period.
(Cooper & Green, 2017).
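The aoristic weighting that Cooper and Green describe can be sketched in a few lines of Python (function and parameter names are illustrative): an object's date range is distributed across fixed-length time slices in proportion to the overlap between the range and each slice, rather than being assigned wholesale to a single phase.

```python
def aoristic_weights(start, end, study_start, study_end, slice_len=50):
    """Return {slice_start_year: weight} for an object dated start-end."""
    duration = end - start
    weights = {}
    s = study_start
    while s < study_end:
        # overlap between the object's date range and this time slice
        overlap = max(0, min(end, s + slice_len) - max(start, s))
        if overlap:
            weights[s] = overlap / duration
        s += slice_len
    return weights


# A coin dated AD 60-160, in 50-year slices across a study period AD 0-300:
w = aoristic_weights(60, 160, 0, 300)
print(w)   # {50: 0.4, 100: 0.5, 150: 0.1}
```

Summing these weights across many objects yields an aoristic density per time slice, which is the basis of the probabilistic distribution maps applied to large composite datasets such as the PAS material.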
Conclusion
Recent releases of many off-the-shelf GIS packages have seen the inclusion of increasingly complex spa-
tiotemporal functionality suggesting that the tide is turning with regards to the way in which time and
temporality might be integrated into spatial datasets. For archaeologists, the importance of the spatiotem-
poral integration of archaeological data cannot be overstated, particularly as the discipline continues to
benefit from recent innovations in the Bayesian modelling of radiocarbon dates (Bayliss, Bronk-Ramsey,
Van der Plicht, & Whittle, 2007; Bayliss et al., 2015; Whittle, Richardson, Healy, Alton, & Bayliss, 2011).
Taken together, the various models and methodological implementations presented in the discussion
above highlight a number of ways to embed temporal data within GIS and spatial technologies.
However, the methodologies discussed in this chapter tend to share a strong focus on codifying time
so it fits within common relational data structures (complete with the inherent restrictions of linear spa-
tiotemporality and tabular structure), either as an integral attribute of spatial data, or as a layer of related
data that can be analysed and visualised separately. Even with renewed focus on digital technologies in
recent decades, and radical advancements in computer science, we still have not achieved even the simplest fully
functional TGIS as defined by Langran (1992) or Peuquet (2002). Ultimately, in order for true spatiotemporal
analysis to progress in archaeology there is a distinct need to rethink the underlying data structures
that drive spatial technologies, and carry out more concerted research into how space and time might be
integrated to inform our analysis and outputs (see discussion in De Roo et al., 2013, pp. 619–620).
the flexibility in defining object relationships that OO GIS and databases provide has a tremendous
potential for redefining how archaeologists can manage temporal variables. The possibility exists
for unique temporal relationships to be constructed, unfettered by predetermined categories (for
the basis of OO is the construction of the categories and the extents of the relationships that can
exist between them).
(Daly & Lock, 1999, p. 289)
In traditional relational database models (which are much easier to design and implement) the archaeo-
logical entity is represented by a table and relationships between archaeological entities (temporal or
otherwise) are reflected in the relationships between the database tables. By contrast object-oriented
databases focus upon modelling the archaeological entity as an object, which can “participate in events”.
This means they are defined both by what they are and what they do (Richards, 1998, p. 333). As such
temporal information may be embedded within the objects themselves by definition. The capacity for
object entities to inherit properties from the entities of which they are composed offers a potentially
seamless transition between different temporal scales and granularities.
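A minimal sketch of this object-oriented principle (class names, entity names and dates are all hypothetical): an entity carries its own temporal extent, and a composite entity without explicit dates inherits the temporal envelope of its parts, allowing movement between granularities (unit, phase, building) without re-coding.

```python
class ArchEntity:
    """An archaeological entity that embeds its own temporal extent."""

    def __init__(self, name, start=None, end=None):
        self.name, self.start, self.end = name, start, end
        self.parts = []

    def add_part(self, part):
        self.parts.append(part)

    def lifespan(self):
        # an entity without its own dates inherits the envelope of its parts
        if self.start is not None and self.end is not None:
            return (self.start, self.end)
        spans = [p.lifespan() for p in self.parts]
        return (min(s for s, _ in spans), max(e for _, e in spans))


# Illustrative composite: a building derives its lifespan from its events
# (negative numbers stand for years BC; the dates are invented).
building = ArchEntity("Building A")
building.add_part(ArchEntity("construction", -6500, -6450))
building.add_part(ArchEntity("occupation", -6450, -6300))
building.add_part(ArchEntity("burning event", -6300, -6290))
print(building.lifespan())   # (-6500, -6290)
```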
More recently, archaeologists have been experimenting with ‘Semantic Web’ technologies, which are reliant
on a ‘graph data’ structure, using subjects, predicates and objects (often referred to as triples). Semantic
Web data is mapped to appropriate controlled vocabularies, thesauri and/or ontologies, allowing interop-
erability with other data mapped to the same authoritative structure (Wright, 2011, p. 13). Within
archaeology, data is often mapped to the domain ontology known as the CIDOC (International Council
for Documentation) Conceptual Reference Model (CRM). CIDOC-CRM is the ISO standard for the
cultural heritage domain, and may prove to be a particularly useful means of coding sophisticated multi-
layered temporal information into spatial objects (Taylor & Wright, 2012), potentially offering a more
holistic and interoperable form of spatiotemporality. This ultimately raises the question of whether
there may be potential in this suite of technologies for handling a more elegant integration of different
types of temporal information within a single data structure (such as absolute and relative temporal data),
perhaps drawing upon the complex nuances offered by the ‘Allen operators’ (Allen, 1983). These operators
define 13 base relations for temporal reasoning that capture the relationship between a pair of intervals,
as tabulated in Table 21.2.
Table 21.2 Summary of the seven baseline temporal operators of Allen’s interval algebra (1983), which, along
with their inversions, define a total of 13 relationships between two temporal intervals.
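For illustration, the 13 relations can be implemented directly from their interval-endpoint definitions (a sketch assuming proper intervals with start < end; the six inverse relations are derived by swapping the arguments):

```python
def allen_relation(a, b):
    """Classify two intervals a=(a1,a2), b=(b1,b2) into one of the 13
    Allen relations; inverses are named '<base>-inverse'."""
    (a1, a2), (b1, b2) = a, b
    base = {
        "before":   a2 < b1,
        "meets":    a2 == b1,
        "overlaps": a1 < b1 < a2 < b2,
        "starts":   a1 == b1 and a2 < b2,
        "during":   b1 < a1 and a2 < b2,
        "finishes": b1 < a1 and a2 == b2,
        "equals":   a1 == b1 and a2 == b2,
    }
    for name, holds in base.items():
        if holds:
            return name
    # none of the 7 base relations held, so the inverse of one must
    return allen_relation(b, a) + "-inverse"


print(allen_relation((0, 5), (5, 9)))   # meets
print(allen_relation((2, 4), (0, 9)))   # during
print(allen_relation((0, 9), (2, 4)))   # during-inverse (i.e. 'contains')
```

Exactly one relation holds for any pair of proper intervals, which is what makes the algebra attractive for reasoning over relative chronologies such as stratigraphic sequences.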
Extensions to the CRM, including the CRM-EH and particularly the CRM-archaeo, have begun
to define and model these operators, and take into account other forms of temporal models (including
‘Spacetime volume’). In experimenting with the application of the CRM, considerable work has been
done demonstrating how dates and timespans (instances and intervals) can be aligned at a disciplinary
level for use with Semantic Web modeling (Binding, 2011). Further research has also been conducted
considering ways to semantically handle spatial data (Wright, 2011; Doerr & Hiebel, 2013; Hiebel,
Doerr, & Eide, 2013; Hiebel, Doerr, Hanke, & Masur, 2014; Hiebel, Doerr, & Eide, 2016) and to some
extent stratigraphic data (Cripps, Greenhalgh, Fellows, May, & Robinson, 2004; Cripps & May, 2010;
Tudhope, Binding, Jeffrey, May, & Vlachidis, 2011). This work has culminated in the construction of
some interesting prototypes for geo-based Semantic Web applications (see, for example, the Pelagios
Commons project: http://commons.pelagios.org; Isaksen, Barker, Simon, & de Soto, 2014; Barker, Simon,
Isaksen, & de Soto Cañamares, 2016) and robust efforts to define broader temporal definitions in order
to facilitate the semantic interoperability of temporal data (see, for example, the PeriodO project: http://
perio.do; Rabinowitz, 2014; Rabinowitz, Shaw, Buchanan, Golden, & Kansa, 2016). However, despite
this, there is considerable work to be done to make these technologies user friendly enough for wider
implementation. These fields of research will undoubtedly have an important impact upon the future of
‘spatio-temporal technologies’.
commensurate need to continue to refine space/time in this sense from a basic software development
perspective. Perhaps what practitioners in archaeological spatial technologies should really be striving
for, given the importance of various temporalities to the discipline, is a move towards a more inferred,
interpretative spatiotemporality, taking into account past perceptions of time, including the landscapes
and taskscapes of Ingold (1993), or the narrative biographical concepts of temporality explored by the
likes of Lucas (2005), Yamin (1998, 2001), Beaudry (1998), and King (2006). These qualitative analytical
methods resonate with trends laid out in the growing body of ‘critical GIS’ literature that has emerged
since the mid-1990s, in response to post-modern critiques of GIS technologies (Pickles, 1995), and the
consolidation of a discrete and complementary field of GIScience (for a review of this critique and history
of this sub-discipline see Elwood, 2006; O’Sullivan, 2006; Pavlovskaya, 2006). Much of the discourse of
critical GIS highlights the obvious tension between the ease of producing more conventional ‘representational’
outputs based upon Euclidean spatial and temporal data constructs and the difficulties of using
GIS to offer more fluid, qualitative and interpretative ‘non-representational’ spatiotemporal outputs. This
is further echoed in recent calls for a more non-representational approach to applied GIS in archaeology,
which seek to understand the world as being “spatio-temporally contingent”, where “the past [is not]
understood as a frozen and pre-given entity [. . .] but rather as something that continuously melts down
and is remade in the present” (Hacιgüzeller, 2012, p. 255). Perhaps the potential affordances of the ‘graph
data’ approaches emerging from research into semantic ontologies and the still underexploited ‘O-O’ DB
systems will ultimately help archaeologists to visualise a more complex, socially oriented and integrated
spatiotemporality.
References
Abraham, T., & Roddick, J. F. (1999). Survey of spatio-temporal databases. GeoInformatica, 3(1), 61–99.
Allen, J. F. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11), 832–843.
Andresen, J., & Madsen, T. (1992). Data structures for excavation recording: A case of complex information manage-
ment. In C. U. Larsen (Ed.), Sites & monuments: National archaeological records (pp. 49–67). Copenhagen: National
Museum of Denmark.
Andresen, J., & Madsen, T. (1996a). IDEA: The integrated database for excavation analysis. In H. Kamermans &
K. Fennema (Eds.), Interfacing the past: Computer applications and quantitative methods in archaeology CAA95
(pp. 3–14). Leiden: Analecta Praehistorica Leidensia.
Andresen, J., & Madsen, T. (1996b). Dynamic classification and description in the IDEA. Archeologia e Calcolatori,
7, 591–602.
Bailey, G. N. (1983). Concepts of time in quaternary prehistory. Annual Review of Anthropology, 12, 165–192.
Bailey, G. N. (1987). Breaking the time barrier. Archaeological Review from Cambridge, 6, 5–20.
Bailey, G. N. (2007). Time perspectives, palimpsests and the archaeology of time. Journal of Anthropological Archaeol-
ogy, 26, 197–223.
Bailey, G. N. (2008). Time perspectivism: Origins and consequences. In S. Holdaway & L. Wandsnider (Eds.), Time
in archaeology: Time perspectivism revisited (pp. 13–30). Utah: Utah University Press.
Barker, E., Simon, R., Isaksen, L., & de Soto Cañamares, P. (2016). The pleiades gazetteer and the pelagios project.
In M. L. Berman, R. Mostern, & H. Southall (Eds.), Placing names: Enriching and integrating gazeteers (pp. 97–109).
Bloomington: Indiana University Press.
Barrett, J. C. (1994). Fragments from antiquity: An archaeology of social life in Britain, 2900–1200 BC. Oxford: Blackwell.
Bayliss, A., Brock, F., Farid, S., Hodder, I., Southon, J., & Taylor, R. E. (2015). Getting to the bottom of it all: A
bayesian approach to dating the start of Çatalhöyük. Journal of World Prehistory, 28(1). https://doi.org/10.1007/
s10963-015-9083-7
Bayliss, A., Bronk-Ramsey, C., Van der Plicht, J., & Whittle, A. (2007). Bradshaw and bayes: Towards a timetable for
the Neolithic. Cambridge Archaeological Journal, 17(1), 1–28.
Beaudry, M. C. (1998). Farm journal: First person, four voices. Historical Archaeology, 32(1), 20–33.
Berggren, Å., Dell’Unto, N., Forte, M., Haddow, S., Hodder, I., Issavi, J., . . . Taylor, J. (2015). Revisiting reflex-
ive archaeology at Çatalhöyük: Integrating digital and 3D technologies at the trowel’s edge. Antiquity, 88,
433–448.
Bezzi, A., Bezzi, D., Francisci, D., & Gietl, R. (2006). L’utilizzo di Voxel in campo archeologico. Paper given at Settimo
Meeting Degliutenti Italiani di GRASS, febrarrio, Genova, Italy.
Binding, C. (2011). Relatively speaking: Temporal alignment for archaeological data. Paper given at KCL, London,
PELAGIOS Project Workshop.
Bloch, M. (1977). The past and the present in the present. Man, 12, 278–292.
Bourdieu, P. (1977). Outline of a theory of practice. Cambridge: Cambridge University Press.
Bradley, R. (1991). Ritual, time, and history. World Archaeology, 23, 209–219.
Bradley, R. (1998). The significance of monuments. London: Routledge.
Braudel, F. (1972). The Mediterranean and the Mediterranean world in the age of Philip II. New York: Harper and Row.
Braudel, F. (1980). On history. London: Weidenfield & Nicolson.
Carlstein, T. D. P., Parkes, D., & Thrift, N. (Eds.). (1975). Human activity and time geography. London: Unwin Hyman.
Carlstein, T. D. P., & Thrift, N. (Eds.). (1978). Timing space and spacing time, Vol. II: Human activity and time geography.
London: Edward Arnold Ltd.
Castleford, J. (1992). Archaeology, GIS and the time dimension: An overview. In G. Lock & J. Moffett (Eds.), CAA91.
Computer applications and quantitative methods in archaeology 1991 (BAR International Series S577) (pp. 95–106).
Oxford: Tempus Reparatum.
Chadwick, A. M. (2001). What have the post-processualists ever done for us? Towards an integration of theory and practice;
and radical field archaeologies. Paper given at University of York, Interpreting Stratigraphy Group Meeting, York
(pp. 9–36).
Clark, G. (1992). Space, time, and man: A prehistorian’s view. Cambridge: Cambridge University Press.
Cooper, A., & Green, C. (2017). Big questions for large, complex datasets: Approaching time and space using composite object assemblages. Internet Archaeology, 45. https://doi.org/10.11141/ia.45.1
Crema, E. R. (2011). Aoristic approaches and voxel models for spatial analysis. In E. Jerem, F. Redő, & V. Szeverényi
(Eds.), On the road to reconstructing the past: Computer applications and quantitative methods in archaeology (CAA): Pro-
ceedings of the 36th international conference, Budapest, April 2–6, 2008 (pp. 179–186) (CD-ROM 199–106). Budapest:
Archeaeolingua.
Cripps, P., Greenhalgh, A., Fellows, D., May, K., & Robinson, D. (unpublished). Ontological modelling of the work of the
centre for archaeology, September 2004.
Cripps, P., & May, K. (2010). To OO or not to OO? Revelations from defining an ontology for an archaeological
information system. In F. Niccolucci & S. Hermon (Eds.), Beyond the artefact – Digital interpretation of the past.
Computer applications and quantitative methods in archaeology (CAA). Proceedings of the 32nd international conference,
Prato, Italy, 13–17 April 2004 (pp. 57–61). Budapest: Archaeopress.
Daly, P. T., & Lock, G. R. (1999). Timing is everything: Commentary on managing temporal variables in geographic information
systems. Paper given at Barcelona, Computer Applications and Quantitative Methods in Archaeology: CAA98,
March 1998 (pp. 287–293).
De Roo, B., Bourgeois, J., & Maeyer, P. D. (2013). On the way to a 4D archaeological GIS: State of the art, future
directions and need for standardization. Proceedings of the 2013 Digital Heritage International Congress, 2.
Doerr, M., & Hiebel, G. (unpublished). CRMgeo: Linking the CIDOC CRM to GeoSPARQL through a spatiotemporal
refinement, April 2013.
Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17, 891.
Elwood, S. (2006). Critical issues in participatory GIS: Deconstructions, reconstructions, and new research directions.
Transactions in GIS, 10(5), 693–708.
Evans-Pritchard, E. (1939). Nuer time reckoning. Africa, 12, 189–216.
Fabian, J. (1983). Time and the other: How anthropology makes its object. New York: Columbia University Press.
Feder, J. (1993). Museums index: An object oriented approach to the design and implementation of a data driven
data base management system. In J. Andresen, T. Madsen, & I. Scollar (Eds.), Computer applications and quantitative
methods in archaeology: CAA 1992 (pp. 221–228). Aarhus: Aarhus University Press.
Frachetti, M. (1998). Two times for val camonica (unpublished Masters Dissertation). University of Cambridge,
Cambridge.
Gell, A. (1992). The anthropology of time: Cultural constructions of temporal maps and images. Oxford: Berg.
Giddens, A. (1984). The constitution of society: Outline of the theory of structuration. Cambridge: Polity Press.
Gosden, C. (1994). Social being and time. Oxford: Blackwell.
Green, C. (2011a). Winding Dali’s clock: The construction of a fuzzy temporal-GIS for archaeology. BAR International Series
2234. Oxford: BAR Publishing.
Green, C. (2011b). It’s about time: Temporality and intra-site GIS. In E. Jerem, F. Redö, & V. Szeverényi (Eds.), CAA
2008: On the road to reconstructing the past. Budapest: Archaeolingua.
Hacιgüzeller, P. (2012). GIS, critique, representation and beyond. Journal of Social Archaeology, 12(2), 245–263.
Hägerstrand, T. (1967). Innovation diffusion as a spatial process. Chicago, IL: The University of Chicago Press.
Hägerstrand, T. (1975). Survival and arena: On the life history of individuals in relation to their geographic environment. In T. Carlstein, D. Parkes, & N. Thrift (Eds.), Human activity and time geography. London: Unwin Hyman.
Halls, P. J., & Miller, A. P. (1995). Moving GIS into the fourth dimension . . . or the case for todes. Paper given at the GIS
Research UK 1995 Conference, Department of Surveying, University of Newcastle upon Tyne. Newcastle upon
Tyne (pp. 41–43).
Halls, P. J., & Miller, A. P. (1996). Of todes and worms: An experiment in bringing time to ArcInfo. Paper given at the ESRI
European Users Conference. Watford (pp. 1–15).
Harris, E. C. (1989). Principles of archaeological stratigraphy (2nd ed.). London: Academic Press.
Harvey, D. (1991). The condition of postmodernity: An enquiry into the origins of cultural change. Oxford: Blackwell.
Hazelton, N. W. J. (1991). Integrating time, dynamic modelling and geographic information systems: Development
of four-dimensional GIS (unpublished Ph.D). Melbourne: Department of Surveying and Land Information, The
University of Melbourne.
Headland, T., Pike, K., & Harris, M. (1990). Emics and etics: The insider/outsider debate. Newbury Park, CA: Sage
Publications.
Heidegger, M. (1953). Being and time. New York: State University of New York Press.
Hiebel, G., Doerr, M., & Eide, Ø. (2013). CRMgeo: Integration of CIDOC CRM with OGC standards to model spatial
information. Paper given at CAA 2013, 41st Conference in Computer Applications and Quantitative Methods in
Archaeology. Perth, Australia.
Hiebel, G., Doerr, M., & Eide, Ø. (2016). CRMgeo: A spatiotemporal extension of CIDOC-CRM. International
Journal on Digital Libraries, 18(4), 271–279. https://doi.org/10.1007/s00799-016-0192-4
Hiebel, G., Doerr, M., Hanke, K., & Masur, A. (2014). How to put archaeological geometric data into context?
Representing mining history research with CIDOC CRM and extensions. International Journal of Heritage in the
Digital Era, 3(3), 557–578. doi: 10.1260/2047-4970.3.3.557
Husserl, E. (1966 [1887]). The phenomenology of internal time consciousness. Bloomington: INA, Midland Books.
Ingold, T. (1993). The temporality of the landscape. World Archaeology, 25, 152–174.
Isaksen, L., Barker, E., Simon, R., & de Soto, P. (2014). Pelagios and the emerging graph of ancient world data. WebSci’14
Proceedings of the ACM Conference on Web Science, 22–26 June, Bloomington, IN, USA (pp. 197–201).
Johnson, I. (1997). Mapping the fourth dimension: The TimeMap project. Paper given at University of Birmingham,
Computer Applications and Quantitative Methods in Archaeology: CAA97, 1999, 82/CDROM.
Johnson, I. (2002a). Contextualising archaeological information through interactive maps. Internet Archaeology, 12.
https://doi.org/10.11141/ia.12.9
Johnson, I. (2002b). Mapping the humanities: The whole is greater than the sum of its parts. In Proceedings of digital
resources in the humanities. Sydney: Research Institute of Humanities and Social Sciences.
Johnson, I. (2003). Aoristic analysis: Seeds of a new approach to mapping archaeological distributions through time.
In Proceedings of the computer applications into archaeology conference (pp. 448–452). BAR International Series 1227.
Oxford: BAR Publishing.
Johnson, I. (2004, July/August). Putting time on the map: Using Time map for map animation and web delivery.
Geoinformatics, 26–29.
Johnson, I., & Wilson, A. (2002). The TimeMap Kiosk: Delivering historial images in a spatio-temporal context. Paper given
at Gotland, Sweden, Computer Applications and Quantitative Methods in Archaeology: CAA 2001, April 2001
(pp. 71–78).
Johnson, I., & Wilson, A. (2003). The Time map project: Developing time-based GIS display for cultural data. Journal
of GIS in Archaeology, 1, 123–135.
Kelemis, J. (1991). Time and space in geographic information: Toward a four-dimensional spatiotemporal data model
(unpublished PhD). Pennsylvania: The Pennsylvania State University.
King, J. A. (2006). Historical archaeology, identities and biographies. In D. Hicks & M. C. Beaudry (Eds.), The
Cambridge companion to historical archaeology. Cambridge: Cambridge University Press.
Kraak, M. J. (2003). The space-time cube revisited from a geovisualization perspective. In ICC 2003: Proceedings of
the 21st international cartographic conference: Cartographic renaissance, 1988–1996. Durban, South Africa: International
Cartographic Association (ICA).
Kwan, M.-P. (2002a). Time, information technologies, and the geographies of everyday life. Urban Geography, 23(5),
471–482.
Kwan, M.-P. (2002b). Feminist visualization: Re-Envisioning GIS as a method in feminist geographic research. Annals
of the Association of American Geographers, 92(4), 645–661.
Kwan, M.-P. (2008). From oral histories to visual narratives: Re-presenting the post-September 11 experiences of the
Muslim women in the USA. Social & Cultural Geography, 9(6), 653–669.
Kwan, M.-P., & Ding, D. (2008). Geo-Narrative: Extending geographic information systems for narrative analysis in
qualitative and mixed-method research. The Professional Geographer, 60(4), 443–465.
Landeschi, G. (2018). Rethinking GIS, three-dimensionality and space perception in archaeology. World Archaeology,
1–16. https://doi.org/10.1080/00438243.2018.1463171
Langran, G. (1989). A review of temporal database research and its use in GIS applications. International Journal of
Geographical Information Systems, 3, 215–232.
Langran, G. (1992). Time in geographic information systems. London: Taylor & Francis.
Langran, G., & Chrisman, N. R. (1988). A framework for temporal geographic information. Cartographica, 25(3),
65–99.
Lefebvre, B. (2009). How to describe and show dynamics of urban fabric: Cartography and chronometry? Paper given at Wil-
liamsburg, Virginia, USA., CAA 2009, 22–26 March 2009 (pp. 1–15).
Lefebvre, B., Rodier, X., & Saligny, L. (2008). Understanding urban fabric with the OH_FET model based on social
use, space and time. Archeologia e Calcolatori, 19, 195–214.
Leone, M. (1978). Time in American archaeology. In C. Redman (Ed.), Social archaeology: Beyond subsistence and dat-
ing. London: Academic Press.
Levi-Strauss, C. (1948). Race et Histoire. Paris: UNESCO.
Levi-Strauss, C. (1961). Tristes Tropiques (A world on the wane). London: Hutchinson.
Lieberwirth, U. (2008). Voxel-based 3D GIS: Modelling and analysis of archaeological stratigraphy. In B. Frischer &
A. Dakouri-Hild (Eds), Beyond illustration: 2D and 3D digital tools for discovery in archaeology, British archaeological
reports international series 1805 (pp. 250–271). Oxford: Archaeopress.
Lin, H., & Mark, D. (1991). Spatio-Temporal INtersection (STIN) and volumetric modeling. GIS/LIS Proceedings, 28
October–1 November, Atlanta, Georgia, USA (pp. 982–991).
Lock, G. R., & Daly, P. T. (1998). Looking at change, continuity and time in GIS: An example from the Sangro Valley,
Italy. In J. A. Barceló, I. Briz, & A. Vila (Eds.), Computer applications and quantitative methods in archaeology: CAA98
(pp. 259–264). Barcelona: Archaeopress.
Lock, G. R., & Harris, T. M. (1997). Analyzing change through time within a cultural landscape: Conceptual and
functional limitations of a GIS approach. In P. Sinclair (Ed.), Urban origins in Eastern Africa, world archaeological
congress, one world archaeology series. London: Routledge.
Lucas, G. (2001). Critical approaches to fieldwork: Contemporary and historical archaeological practice. London: Routledge.
Lucas, G. (2005). The archaeology of time. Abingdon, Oxon.: Routledge.
Madsen, T. (2003). ArchaeoInfo: Object-oriented information system for archaeological excavations. Paper given at Vienna,
Computer Applications and Quantitative Methods in Archaeology, 8–12 April.
McTaggart, J. M. E. (1908). The unreality of time. Mind, 17, 457–474.
Miller, H. J. (2005). A measurement theory for time geography. Geographical Analysis, 37(1), 17–45.
Miller, H. J., & Bridwell, S. A. (2009). A field-based theory for time geography. Annals of the Association of American
Geographers, 99(1), 49–75.
Mlekuž, D. (2010). Time geography, GIS and archaeology. In F. Contreras & F. J. Melero (Eds.), CAA 2010 “fusion
of cultures” proceedings of the 38th conference on computer applications and quantitative methods in archaeology granada,
Spain, April 2010 (BAR International Series 2494). Oxford: Archaeopress.
Orengo, H. A. (2013). Combining terrestrial stereophotogrammetry, DGPS and GIS-based voxel modelling in the
volumetric recording of archaeological features. ISPRS Journal of Photogrammetry and Remote Sensing, 76, 49–55.
https://doi.org/10.1016/j.isprsjprs.2012.07.005
O’Sullivan, D. (2006). Geographical information science: Critical GIS. Progress in Human Geography, 30(6), 783.
Parkes, D., & Thrift, N. (1978). Putting time in its place. In T. Carlstein, D. Parkes, & N. Thrift (Eds.), Timing space
and spacing time,Vol. 1: Making sense of time (pp. 119–129). London: Edward Arnold, Ltd.
Pavlovskaya, M. (2006). Theorizing with GIS: A tool for critical geographies?. Environment and Planning A, 38(11),
2003–2020.
Peuquet, D. J. (2002). Representations of space and time. New York: The Guilford Press.
Peuquet, D. J., & Duan, N. (1995). An Event-Based Spatio-Temporal Data Model (ESTDM) for temporal analysis
of geographical data. International Journal of Geographical Information Systems, 9(1), 7–24.
Pickles, J. (Ed.). (1995). Ground truth: The social implications of geographic information systems. New York: Guildford Press.
Rabinowitz, A. (2014). It’s about time: Historical periodization and linked ancient world data. ISAW Papers, 7(22).
http://dlib.nyu.edu/awdl/isaw/isaw-papers/7/rabinowitz/
Rabinowitz, A., Shaw, R., Buchanan, S., Golden, P., & Kansa, E. (2016). Making sense of the ways we make sense of the past:
The PeriodO Project. Bulletin of the Institute of Classical Studies, 59(2), 42–55. doi:10.1111/j.2041-5370.2016.12037.x
Renolen, A. (1997). Temporal maps and temporal geographical information systems (Review of Research), Department of
Surveying and Mapping, The Norwegian Institute of Technology.
Richards, J. (1998). Recent trends in computer applications in archaeology. Journal of Archaeological Research, 6(4),
331–382.
Roddick, J. F., & Patrick, J. D. (1992). Temporal semantics in information systems, A survey. Information Systems,
17(3), 249–267.
Rodier, X., & Saligny, L. (2010). Modélisation des objets historiques selon la fonction, l’espace et le temps pour
22
Challenges in the analysis of
geospatial ‘big data’
Chris Green
Introduction
It has become a cliché, although that does not make it any less true, to state that we now live in the
information age and are all members of an information society (Huggett, 2012, p. 539). Archaeology as
a discipline is going through its own 21st century computing revolution (Levy, 2014), with many of its
‘grand challenges’ requiring sophisticated computer modelling and large-scale synthetic research. The
greatest potential payoff in this regard will arise from systematic exploitation of data resulting from the
explosion in archaeological investigation that has taken place in many countries (especially in Europe
and North America) since the mid to late 20th century (Figure 22.1), due in part to laws protecting
archaeological resources (that are particularly relevant in the context of commercial projects) (Kintigh
et al., 2014, p. 879). This is because data possess two key characteristics: they can be processed again
and again without losing value (although repeatedly reprocessing already-processed data will inevitably
cause deterioration in quality due to the Second Law of Thermodynamics [Groot, 2017, p. 208]), and their
value arises from what they may reveal in aggregate, since recombining data in novel ways can trigger
new insights (Gattiglia, 2015, pp. 113, 117).
Our immersion in data has triggered concerns about information overload (Huggett, 2012, p. 539),
both within archaeology as a discipline and society as a whole. Archaeologists, despite some concerns to
the contrary, are not alone in having to deal with data that is difficult to reconcile or that means differ-
ent things to different people. A good example from medical science is the International Classification
of Diseases (ICD) database maintained by the World Health Organisation, which is subject to constant
modification and to constant local reinterpretation according to conditions and opinions at the medi-
cal ‘coalface’. The standards defined therein mean different things to different users and any impression
of international standardisation is only really skin deep (Bowker & Star, 1999). These are issues that are
similarly true of archaeological datasets, which are equally subject to the creative reinterpretation of so-
called standards and the metamorphosis of ‘agreed’ terminologies over time.
At the time of writing archaeology stands on the brink of an immense challenge, which we must not
fail to embrace: to dive in and attempt to reinterpret our models on a grand spatial and temporal scale
(Cooper & Green, 2016; see Atici, Kansa, Lev-Tov, & Kansa, 2013). As stated, archaeological data has been
multiplying at an increasing rate and we should not let concerns about data quality and data coherency
unduly hold us back.

Figure 22.1 Archaeological investigations recorded over time in England from the 17th century until 2013
as in the National Record of the Historic Environment Excavation Index (based upon data extracted from
Historic England, 2011). The picture is certainly not complete, but is representative of the immense increase
in archaeological investigation seen from the mid to late 20th century.

If we wish to justify our continuing existence in a world of increasing pressure on archaeological resources,
both in material and personnel terms, then we need to continually strive to
demonstrate that our data can produce exciting new models and insights into the past (and potentially
the future) of humanity. Part of the solution to answering that challenge lies in big data analytics and,
more particularly, analytics of geospatial big data (McCoy, 2017, p. 74):
The need for larger and more integrated geospatial data and analyses cross-cuts virtually all of our
goals and aspirations . . . These require us to produce data and results that are scientific (testable,
replicable), authentic (a faithful representation of the archaeological record and the human past),
and ethical (protects cultural resources).
Big data
Big data is a buzzword that has achieved little consensus as to its meaning, which varies significantly
between disciplines. Its definitions should be relative, not absolute (Gattiglia, 2015, p. 114), and should be
related to the resources available to those dealing with the data in question. The threshold for data that
is challenging due to its size or complexity will be much lower within a less well-resourced discipline
such as archaeology (e.g. compared to earth sciences) (Austin & Mitcham, 2007, p. 12). McCoy defines
geospatial big data as datasets which include locational information and which exceed the capacities of
hardware/software/human resources, a criterion he argues is not yet met by archaeological datasets, with
the exception of some remotely sensed data (2017, p. 74). By contrast, Gattiglia suggests that
big data could alternatively be taken to mean working with the maximum amount of data available or
useful to approach answering a question (‘big data as all data’), which would mean working not neces-
sarily just on a broad scale but (in combination or alternatively) also working at an intense level of detail
(2015, p. 114). In essence, we can see in this a distinction between two broad categories of big data:
datasets that are large but primarily numeric (referred to here as ‘scientific’ big data) and datasets that
are potentially not so large but which contain a complex mix of numeric and textual detail (referred to
here as ‘human’ big data).
For example, an excavation site archive constructed today might include any of the following (amongst
many others): standardised context sheets describing individual archaeological features, photographs,
photogrammetric models, section drawings and plans, diaries, possibly videos, scientific samples and their
quantifications, quantifications of finds, etc. This complexity is only added to when moving up the chain
of interpretation towards synthetic projects and regional archaeological records. Furthermore, there are
a multiplicity of different data storage formats and data structures, and a whole spectrum of attitudes
towards data accessibility (McCoy, 2017; Dam, Austin, & Kenny, 2010; Kintigh, 2006; Snow et al., 2006).
Although what data can reveal in aggregate is one of its great strengths, data aggregation of this type of
material can amplify the volume of meaningless noise (Wesson & Cottier, 2014, p. 2) and will almost
always entail increased levels of ‘messiness’ (Gattiglia, 2015, p. 114).
Working with ‘human’ big data of this type means that we have to accept this degree of messiness,
simply because it becomes impossible to achieve the high levels of accuracy seen (at least on the surface) in
traditional methods (Gattiglia, 2015, p. 114): Harris’s (2013) eighth law of data quality states that a smaller
emphasis on data quality can sometimes enable bigger data-driven insights, meaning that using a larger
amount of lower-quality data can sometimes be better than using smaller amounts of higher-quality data.
There are several ways in which we can mitigate these problems of ‘messiness’: through better planning
before data capture in the case of newly acquired datasets (Wesson & Cottier, 2014, p. 2; Huggett, 2012,
p. 540; Austin & Mitcham, 2007, p. 23); through thorough exploration of the histories and topographies
of existing datasets to better understand their qualities (Cooper & Green, 2016, pp. 283–284; Huggett,
2012, pp. 539–540, 546–547); through documentation of dataset quality in standalone reports (McCoy,
2017, pp. 91–92); through use of common ontologies and controlled vocabularies (e.g. Binding, May, &
Tudhope, 2008; Vlachidis, Binding, Tudhope, & May, 2010); and through the application of big data ana-
lytics to highlight anomalies in datasets and to verify data quality (Gattiglia, 2015, p. 118).
‘Human’ big data, then, is less about the size of datasets themselves than about the capacity to aggre-
gate, search, and cross-reference multiple large datasets (boyd & Crawford, 2012, p. 663). Performing
analyses of this nature requires the increased usage of automated processing methodologies, as it rapidly
becomes impossible within reasonable time and cost constraints to process data using manual method-
ologies. This is thus the first definition of ‘human’ big data used here following the discussion above:
datasets that are too complex and/or large to process without the use of computer algorithms/scripts.
Further, applying the big data paradigm to complex alphanumeric datasets involves a shift away from
traditional hypothesis-driven approaches towards evidence-based and data-driven approaches, which
are perhaps better able to avoid researcher-derived biases. In this approach, hypotheses and models
come after data analysis, born from data rather than born from theory (Hey, Tansley, & Tolle, 2009;
Gattiglia, 2015, pp. 115–116; Kitchin, 2014, pp. 5–7; also see case study later in this chapter). Therefore,
this is the second complementary definition of ‘human’ big data used here: analyses conducted in an
exploratory manner and then used to construct models and hypotheses after the (initial) data processing
event. This does not preclude further data analysis to test models once they have been constructed, of
course. As Kitchin suggested (2014, p. 2):
Big data analytics enables an entirely new epistemological approach for making sense of the world;
rather than testing a theory by analysing relevant data, new data analytics seek to gain insights ‘born
from the data’.
The rest of this chapter will discuss methods that can be applied to ‘human’ big data, followed by a case
study. In summary, ‘human’ big data is defined here as (a) datasets that are too large to be processed using
manual methodologies that are (b) analysed using an exploratory data-driven paradigm. Examples of this
type of study in archaeology are rare due to the only recent availability of large collated datasets, tools,
and sufficient computer processing power. Embracing these ideas is an essential next step for large scale
synthesis projects in archaeology, as projects conducted using conventional methodologies are rapidly
becoming impossible within the budget constraints of the discipline. For example, the recent Rural Settle-
ment of Roman Britain (RSRB) project at the University of Reading (Smith, Allen, Brindle, & Fulford,
2016) produced a rich and detailed dataset relating to 3,652 rural Roman period sites in England and
Wales excavated over the last few decades (Allen et al., 2015). The production of this dataset involved the
employment of three (later four) full time postdoctoral staff over a period of around four years, reading
reports and entering data into a database. Yet the National Record of the Historic Environment Exca-
vation Index (Historic England, 2011) suggests that at least 9,000 Roman sites have been subjected to
excavation in England since 1990 (a figure that should be considered an underestimate, although many
of these sites will not be ‘rural’), so their dataset would struggle to claim to be ‘complete’ (as impressive
as it is). As excavation and investigation of archaeological sites continues to take place at a rapid pace, we
have probably already reached the point where a project like RSRB could not now be undertaken without
the extensive use of data harvesting algorithms and other big data methodologies. A notable example of
a (very) large-scale geospatial ‘human’ big data project would be Endangered Archaeology in the Middle
East and North Africa (EAMENA), which has been attempting to gather data on archaeology across a
massive swathe of the planet (largely through remote sensing methods), in order to understand and pro-
tect vulnerable archaeological resources (Rayne et al., 2017); however, EAMENA is only able to function
by operating with a much larger team and budget than a project such as RSRB. Even so, the progress
of the EAMENA team would still undoubtedly be much more rapid with the availability of improved
algorithms to handle pattern recognition from remote sensed data and algorithms that ‘read’ and extract
information from textual sources through natural language processing.
Method
There are many different ways in which one can approach complicated large datasets computationally.
Some applicable methods have been discussed in detail elsewhere in this volume (see, for instance, Bevan,
this volume; Conolly, this volume; Hacıgüzeller, this volume; Lloyd & Atkinson, this volume) and a few
others will be briefly outlined here.
(see case study in this chapter), as it will minimise problems caused by multiple counting of the same
feature(s). There are many ways of implementing the spatial binning of a dataset (e.g. using ‘Generate
Tessellation’ followed by ‘Summarize Within’ in ArcGIS Pro), so no detailed examples will be given here.
Hexagons are increasingly preferred over squares for sampling tessellations because the human brain is
less prone to spotting false linear alignments in hexagonally binned data, and because neighbouring cells
are simpler to calculate (Birch, Oom, & Beecham, 2007).
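Although GIS tools such as those named above handle the tessellation internally, the core point-to-hexagon assignment is straightforward to sketch. The Python fragment below is a minimal illustration only, not the implementation of any particular GIS tool: it assigns point coordinates to pointy-top hexagonal cells via axial coordinates, then collates records on a presence/absence basis per cell. The record field names and the 1 km cell size in the usage example are invented for the illustration.

```python
import math
from collections import defaultdict

def hex_bin(x, y, size):
    """Assign a point (x, y, in metres) to the pointy-top hexagon of
    circumradius `size` containing it; returns the axial (q, r) index."""
    # Convert map coordinates to fractional axial coordinates.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Round in cube coordinates so the result is always a valid cell.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return (rx, rz)

def presence_by_hex(records, size):
    """Collate records into hexagonal bins on a presence/absence basis
    per site type (hypothetical 'x', 'y', 'type' record fields)."""
    bins = defaultdict(set)
    for rec in records:
        bins[hex_bin(rec["x"], rec["y"], size)].add(rec["type"])
    return bins
```

Because each record is reduced to membership of a cell, two datasets that record the same feature at slightly different coordinates will usually contribute only a single "presence" to the shared cell, which is the property that makes binning attractive for multiply counted data.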
Exploratory statistics
Statistical analyses are inevitably key to the big data approach. These potentially include a vast number of
techniques applicable to particular problems or datasets, although exploitation of these is not yet common
in archaeology. An important theme within these statistical approaches is the data-driven paradigm based
on exploratory statistics and in line with the second definition of ‘human’ big data used above. Instead
of hypothesising and modelling a relationship between a large dataset and a specific set of other selected
variables and then testing that model statistically, the data-driven approach can involve testing many
relationships, both singly and in combination with a much larger set of potentially explanatory variables,
with little initial user selection other than gathering the maximum possible available input data sources.
A notable example of this methodology is the Exploratory Regression tool in ArcGIS (Rosenshein,
Scott, & Pratt, 2011), which tests a dataset against any number of potential explanatory variables using
Ordinary Least Squares (OLS) regression (see Hacıgüzeller, this volume). In each iteration, the dataset is
tested against an increasing number of combined other variables (generally from one up to a maximum
set by the user). At the end of the process, a report is generated which can be used to assess how well
each modelled set of variables explains variation in the original dataset. However, each extra iteration
massively increases the run time for the tool, which limits the usability of the technique on current stan-
dard desktop hardware. Techniques such as this possess immense potential for building stronger models
to explain spatial variation in archaeological data, however, and escape from some (but not all) of the
problems associated with creator subjectivity in conventional modelling exercises.
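The logic of such a tool can be sketched in a few lines. The fragment below is an illustrative stand-in for exploratory regression in general, not the ArcGIS implementation; the candidate variable names in the usage example are hypothetical. It fits an Ordinary Least Squares model for every combination of explanatory variables up to a user-set maximum and ranks the models by adjusted R².

```python
import itertools
import numpy as np

def exploratory_ols(y, X, names, max_vars=2):
    """Regress y on every combination of the candidate explanatory
    variables (columns of X) up to max_vars variables at a time,
    returning (adjusted R-squared, variable names) ranked best first."""
    n = len(y)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    results = []
    for k in range(1, max_vars + 1):
        for combo in itertools.combinations(range(X.shape[1]), k):
            # Design matrix with an intercept column.
            A = np.column_stack([np.ones(n), X[:, combo]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ beta
            r2 = 1.0 - float(resid @ resid) / ss_tot
            adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
            results.append((adj_r2, [names[i] for i in combo]))
    return sorted(results, key=lambda t: t[0], reverse=True)
```

The number of fitted models grows combinatorially with the maximum model size, which is exactly why run times balloon as each extra iteration is added.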
Smits & Friis-Christensen, 2007; Yue, Gong, Di, He, & Wei, 2011). These developments go hand in hand
with new and improved cyber infrastructures for archaeological geospatial information (Kintigh, 2006;
Snow et al., 2006; e.g. Meghini et al., 2017; Elliott, Heath, & Muccigrosso, 2014), which would ideally be
referenced against schema designed to aid semantic interoperability (such as CIDOC-CRM; cf. Doerr,
2003; Binding et al., 2008), and which should themselves no longer be seen as simply an additional tool
in the archaeologist’s toolbox, but rather as the key system that allows us to unlock meaning from within
our information (Llobera, 2011, p. 217).
Case study
The English Landscapes and Identities project (EngLaId) ran from 2011 to 2016 at the University of
Oxford. It sought to construct syntheses using archaeological information for all of England from the
Middle Bronze Age (c.1500 BC) to the Domesday Book (AD 1086). Fundamentally, it was structured
around the reuse of legacy data collected from over eighty regional and national organisations. The
main project database of Bronze Age to early medieval archaeological sites (including single findspots
and records of uncertain date) contained over 900,000 records, of which around 800,000 were usable
in a spatial analysis context (with the others either lacking any spatial information or being erroneously
extracted data relating to incorrect time periods).

Figure 22.2 Comparison of different spatial bins used by the EngLaId (The English Landscapes and Identities)
project at the same scale.

With a core day-to-day project team of four postdoctoral researchers (plus PhD students, administrative
support, etc.), it was clearly not possible to
attempt any level of analysis or cleaning of all of this data on a manual record-by-record basis. As such,
although more detailed studies were conducted on case study areas, at a national level all synthesis and
spatial analysis had to involve automated processing methods. Although the main project database was
only around 3.5GB in terms of storage usage, the database definitely passed the “too complex for manual
processing” test for ‘human’ big data.
The major issue in terms of cleaning this data involved the identification of double-counted (or more)
records from multiple datasets. This was particularly problematic due to the combination of national
and regional level archaeological records, with many entities being included in more than one of these
source databases. The only reliable way to completely solve this problem, due to the lack of any common
identifiers recorded in most of the datasets, would have been to compare the records one-by-one against
all other nearby records on a map, a task made even more difficult due to spatial imprecision for some
of the entities. This would have been an impossible task within the time allowed and with the number
of people employed upon the project. As such, although the comparison was undertaken for small areas
of the country, the overall national synthesis was achieved using the spatial binning method discussed
briefly above.
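For small areas, the kind of record-by-record screening described above can be partially automated. The sketch below is hypothetical (invented field names and tolerance) and is not the EngLaId procedure, which instead sidestepped the problem through binning; it flags likely double-counted records from two source datasets by spatial proximity and shared site type, using a coarse grid index so that each record is only compared with records in neighbouring grid cells rather than with the whole of the other dataset.

```python
from collections import defaultdict

def candidate_duplicates(records_a, records_b, tolerance=100.0):
    """Flag pairs of records (one from each dataset) lying within
    `tolerance` metres of each other and sharing a site type --
    candidates for double-counted entities needing manual review."""
    # Index dataset B by coarse grid cell of side `tolerance`.
    index = defaultdict(list)
    for rec in records_b:
        cell = (int(rec["x"] // tolerance), int(rec["y"] // tolerance))
        index[cell].append(rec)
    pairs = []
    for rec in records_a:
        cx, cy = int(rec["x"] // tolerance), int(rec["y"] // tolerance)
        # A record within `tolerance` must lie in the same or an
        # adjacent grid cell, so only nine cells need checking.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for other in index[(cx + dx, cy + dy)]:
                    d2 = (rec["x"] - other["x"]) ** 2 + (rec["y"] - other["y"]) ** 2
                    if d2 <= tolerance ** 2 and rec["type"] == other["type"]:
                        pairs.append((rec["id"], other["id"]))
    return pairs
```

Such a screen can only propose candidates; given the spatial imprecision of many entities noted above, confirming or rejecting each pair would still require human judgement.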
To allow maximum flexibility in both spatial visualization and spatial analysis, three different sets of
spatial bins were constructed and the data collated using them on a presence/absence basis for 120 differ-
ent types of archaeological site and 22 different broad categories of archaeological find (Figure 22.2). The
coarsest bins were a set of regular hexagons where any one vertex was 5km from its second nearest neigh-
bour. For England, this resulted in 6,598 cells, which allowed for very rapid display and filtering of data.
However, these bins were clearly too coarse for most visualization (other than images designed to appear
at a very small size on a page, e.g. Figure 22.3) and for all analytical purposes. The second set of bins was a
set of hexagons with a 3km vertex to second nearest vertex resolution.

Figure 22.3 Artefacts recorded by the Portable Antiquities Scheme (PAS) displayed using 100 year time-slices
from 1500 BC to AD 999. The data has been binned into 5 km hexagons and probabilities of each artefact
falling into each time-slice in each hexagon calculated and then summed.

This resulted in 17,922 cells, which were of sufficiently fine spatial resolution for most of our national
level mapping purposes, particularly
for images presented at around 20cm × 20cm size (Figure 22.4). However, again, these bins were probably
too coarse to attempt any robust statistical analyses. The final set of bins used for the project were 1×1km
squares, slightly offset from the 1,000m divisions of the Ordnance Survey (OS) National Grid to remove
quadruple counting (due to the procedure applied) of records at the origin point of OS kilometre grid
squares (a situation reflecting lack of spatial precision in many cases, rather than records that actually fell
on grid cell origin points).

Figure 22.4 Map produced using 3 km hexagons showing early medieval evidence for field systems.

This resulted in 136,767 cells, which were less useful for visualization purposes
(due to their small size compared to the scale of England and due to the perceptual issues associated with
square tessellations noted above), but which allowed reasonably robust statistical comparison of EngLaId
data against a great number of other variables of potentially explanatory nature. We found on the whole
that step changes in variables at the boundary lines between regional datasets made any problematic data
readily apparent (Figure 22.4 shows the border between Cornwall and Devon in southwest England very
clearly due to categorisation differences between the datasets maintained by the two local authorities)
(Cooper & Green, 2016, p. 294). The binned EngLaId data can be explored online (EngLaId team, 2016)
with the different sizes of spatial bin being displayed at different spatial scales/zoom levels.
Following the end of the project, further analyses have been attempted using data collated using
hectare (100×100m squares) spatial bins. This resulted in 13,478,926 cells for all of England. Process-
ing this data as a vector dataset is very intensive on conventional computer hardware and becomes very
time-consuming. This can be mitigated by moving from a vector to a raster data model, but this makes
cross-referencing between the different variables more difficult. Attempting to apply the Exploratory
Regression tool mentioned earlier to this full dataset is essentially impossible on conventional computer
hardware, as it only works with vector data and the vector dataset is too large to process at all (even when
converted to points). As such, sub-sampling is necessary to produce any results, which can be done by
performing the analysis just on the bins containing archaeological material and a random sub-sample
of the bins containing no archaeological material (of roughly equivalent numbers to those containing
archaeology): this reduces the number of cells to around 700,000 from over 13 million. However, by
sub-sampling in this way, we are losing some of the explanatory potential of the dataset taken as a whole:
clearly, the ideal for big data analytics should be to attempt analysis on the fullest possible datasets (Gat-
tiglia, 2015, p. 114).
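A sub-sampling step of this kind is simple to make reproducible. The fragment below is a minimal sketch with an invented `count` field, not the project's actual code: it keeps every bin containing archaeological material plus a fixed-seed random sample of empty bins of at most equal number.

```python
import random

def subsample_bins(bins, seed=1):
    """Retain every bin containing archaeological material plus a
    random sample of empty bins of (at most) equal size, reducing
    the dataset to something regression tooling can handle."""
    occupied = [b for b in bins if b["count"] > 0]
    empty = [b for b in bins if b["count"] == 0]
    # Fixed seed so the sub-sample (and any analysis built on it)
    # can be reproduced exactly.
    rng = random.Random(seed)
    sample = rng.sample(empty, min(len(occupied), len(empty)))
    return occupied + sample
```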
One of the source datasets utilised by the EngLaId project, and one particularly susceptible to a
geospatial big data approach, was the Portable Antiquities Scheme (PAS) database (British Museum,
2013–2018). This consists (at the time of writing) of over 800,000 records relating to over 1.3 million
archaeological objects reported (mostly) by members of the public. The EngLaId team approached the
PAS data in a number of different ways (see Cooper & Green, 2017 for more detail), but one relevant
example is shown in Figure 22.3. The production of this set of maps involved: (a) calculating the percent-
age probability of each record falling within a series of time-slices based upon its assigned start and end
dates; (b) multiplying those probabilities by the quantity of objects represented by each record; (c) calculating
which hexagon each record falls within spatially; (d) summing each value for each time-slice for each 5km
hexagon; (e) attaching the summed probabilities to the spatial dataset for the hexagons; (f) producing maps.
All of this can be relatively simply achieved using scripts produced in Python (or another programming
language) possibly alongside using tools provided in widely available GIS packages (for the spatial parts of
the procedure). This is just one example of how relatively simple but computationally intensive analyses
can produce interesting new results that would have taken many weeks of work to produce using con-
ventional methods.
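Steps (a)–(e) can indeed be sketched in a few lines of Python. In the fragment below, the record field names, the assumption that a record's date is uniformly distributed between its start and end dates, and the `assign_hex` lookup function are all illustrative rather than the project's actual implementation.

```python
from collections import defaultdict

def slice_probabilities(start, end, slice_starts, width=100):
    """(a) Probability of a record falling in each time-slice, assuming
    its date is uniformly distributed between start and end (end > start)."""
    span = end - start
    return [max(0, min(end, s + width) - max(start, s)) / span
            for s in slice_starts]

def aoristic_sums(records, slice_starts, assign_hex, width=100):
    """(b)-(e): weight the slice probabilities by each record's object
    quantity, find its hexagon, and sum per hexagon per time-slice."""
    sums = defaultdict(lambda: [0.0] * len(slice_starts))
    for rec in records:
        probs = slice_probabilities(rec["start"], rec["end"],
                                    slice_starts, width)
        cell = assign_hex(rec["x"], rec["y"])
        for i, p in enumerate(probs):
            sums[cell][i] += p * rec["quantity"]
    return sums
```

The per-hexagon sums can then be joined back onto the hexagon geometries in a GIS for step (f), map production.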
Conclusion
It could be argued that, in the more human focussed disciplines such as archaeology, big data analytics
is less about the quantity of data and less about specific analytical techniques, and more about taking
a particular perspective when undertaking our analyses. That perspective can: involve gathering the
maximum possible material to approach our questions (“big data as all data”, Gattiglia, 2015); be about
understanding the different characters of our datasets (Cooper & Green, 2016; Huggett, 2012); be
about building exploratory models that avoid building in too many pre-existing assumptions; and be
about accepting data-driven workflows that do not generally start from a pre-defined and overly
restrictive hypothesis or set of hypotheses. Many of the techniques described in this chapter could be
applied to very large or complex datasets, and new techniques will undoubtedly come along in future
that will also be of use. Also, computer hardware continues to become more powerful per unit of
money spent, aiding in processing large datasets, although associated data curation costs do not neces-
sarily become cheaper over time (Austin & Mitcham, 2007, p. 12).
However, the key point here is that discussion of software and hardware solutions is not the alpha
and the omega of discourse in computational archaeology: more important to the future of the discipline
is discussion about the nature of our data, its definition, representation, and manipulation (Llobera, 2011,
p. 219). Kitchin wrote (2014, p. 2):
The challenge of analysing big data is coping with abundance, exhaustivity and variety, timeliness
and dynamism, messiness and uncertainty, high relationality, and the fact that much of what has
been generated has no specific question in mind or is a by-product of another activity.
We have to embrace all of the data available to us, despite concerns about subjectivity (after all, almost all
archaeological data is the result of interpretation on some level). We can aid in this by better understand-
ing the character and histories of our datasets through thorough documentation (i.e. metadata creation)
and by developing ways of working with data that allow us to include more data in our models (Cooper &
Green, 2016, pp. 296–297), perhaps by representing data quality using fuzzy membership criteria, such
as treating temporal information probabilistically (e.g. Green, 2011; Crema, 2012; see Fusco & de Runz,
this volume). Gattiglia asked (2015, p. 118):
Is archaeology ready to move towards data-led research, and to accept predictive and probabilistic
techniques?
Acknowledgements
The case study discussed in this chapter was carried out as part of the European Research Council funded
English Landscape and Identities project (Grant Number 269797). The project team’s thanks go to all of
the bodies that supplied data to us, most notably Historic England, the Portable Antiquities Scheme, and
England’s Historic Environment Record offices. At the time of writing, the EngLaId data can be explored
at: http://englaid.arch.ox.ac.uk
References
Allen, M., Brindle, T., Smith, A., Richards, J. D., Evans, T., Holbrook, N., . . . Blick, N. (2015). The rural settlement
of Roman Britain: An online resource. Archaeology Data Service. Retrieved from https://doi.org/10.5284/1030449
Atici, L., Kansa, S. W., Lev-Tov, J., & Kansa, E. C. (2013). Other people’s data: A demonstration of the imperative of
publishing primary data. Journal of Archaeological Method and Theory, 20(4), 663–681. https://doi.org/10.1007/
s10816-012-9132-9
Austin, T., & Mitcham, J. (2007). Preservation and management strategies for exceptionally large data formats: “Big Data”.
Retrieved from https://archaeologydataservice.ac.uk/research/bigData.xhtml
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 34–43.
Binding, C., May, K., & Tudhope, D. (2008). Semantic interoperability in archaeological datasets: Data mapping and
extraction via the CIDOC CRM. In B. Christensen-Dalsgaard, D. Castelli, B. Ammitzbøll Jurik, & J. Lippincott
(Eds.), Research and advanced technology for digital libraries (Vol. 5173, pp. 280–290). Berlin, Heidelberg: Springer
Berlin Heidelberg. https://doi.org/10.1007/978-3-540-87599-4_30
Birch, C. P. D., Oom, S. P., & Beecham, J. A. (2007). Rectangular and hexagonal grids used for observation,
experiment and simulation in ecology. Ecological Modelling, 206(3–4), 347–359. https://doi.org/10.1016/j.
ecolmodel.2007.03.041
Bowker, G. C., & Star, S. L. (1999). Sorting things out: Classification and its consequences. Cambridge, MA: MIT Press.
boyd, D., & Crawford, K. (2012). Critical questions for Big Data: Provocations for a cultural, technological, and
scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/13691
18X.2012.678878
McCoy, M. D. (2017). Geospatial Big Data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94. https://doi.org/10.1016/j.jas.2017.06.003
Meghini, C., Niccolucci, F., Felicetti, A., Ronzino, P., Nurra, F., Papatheodorou, C., . . . Hollander, H. (2017).
ARIADNE: A research infrastructure for archaeology. Journal on Computing and Cultural Heritage, 10(3), 1–27.
https://doi.org/10.1145/3064527
Niccolucci, F. (2017). Documenting archaeological science with CIDOC CRM. International Journal on Digital Libraries, 18(3), 223–231. https://doi.org/10.1007/s00799-016-0199-x
Opitz, R., & Limp, W. F. (2015). Recent developments in High-Density Survey and Measurement (HDSM) for
archaeology: Implications for practice and theory. Annual Review of Anthropology, 44(1), 347–364. https://doi.org/10.1146/annurev-anthro-102214-013845
Rayne, L., Bradbury, J., Mattingly, D., Philip, G., Bewley, R., & Wilson, A. (2017). From above and on the ground:
Geospatial methods for recording endangered archaeology in the Middle East and North Africa. Geosciences, 7(4),
100. https://doi.org/10.3390/geosciences7040100
Rosenshein, L., Scott, L., & Pratt, M. (2011). Exploratory regression: A tool for modeling complex phenomena. Retrieved
from www.esri.com/news/arcuser/0111/files/exploratory.pdf
Smith, A., Allen, M., Brindle, T., & Fulford, M. (2016). The rural settlement of Roman Britain: New visions of the countryside
of Roman Britain (Britannia Monograph 29). London: The Roman Society.
Smits, P., & Friis-Christensen, A. (2007). Resource discovery in a European spatial data infrastructure. IEEE Transactions on Knowledge and Data Engineering, 19(1), 85–95. https://doi.org/10.1109/TKDE.2007.250587
Snow, D. R., Gahegan, M., Giles, C. L., Hirth, K. G., Milner, G. R., Mitra, P., & Wang, J. Z. (2006). Cybertools and
archaeology. Science, 311(5763), 958–959. https://doi.org/10.1126/science.1121556
Vlachidis, A., Binding, C., Tudhope, D., & May, K. (2010). Excavating grey literature: A case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources. Aslib Proceedings, 62(4/5), 466–475. https://doi.org/10.1108/00012531011074708
Wesson, C. B., & Cottier, J. W. (2014). Big sites, big questions, Big Data, big problems: Scales of investigation and
changing perceptions of archaeological practice in the southeastern United States. Bulletin of the History of Archaeology, 24, 16. https://doi.org/10.5334/bha.2416
Yue, P., Gong, J., Di, L., He, L., & Wei, Y. (2011). Integrating semantic web technologies and geospatial catalog ser-
vices for geospatial information discovery and processing in cyberinfrastructure. GeoInformatica, 15(2), 273–303.
https://doi.org/10.1007/s10707-009-0096-1
23
The analytical role of 3D realistic computer graphics
Nicoló Dell’Unto
Introduction
In the last decade, the introduction and use of high-resolution, textured, 3D surface models (3D realistic
surface models) in archaeology has been widely presented and discussed in the literature (Callieri et al.,
2011; De Reu et al., 2013; Campana, 2014; Opitz & Limp, 2015; Dell’Unto et al., 2016). These applications have recently come into common use and have proven very efficient in supporting archaeological interpretation.
Three-dimensional realistic models display aspects of the original materials that are usually dif-
ficult (or impossible) to represent with more traditional approaches, and allow the construction of
3D spatial simulations for visualising the original position of objects and contexts in great detail. The
introduction of these approaches within archaeological practice deepens our comprehension of a site by making explicit the complex relations that exist among the many fragmented pieces of information retrieved during a field investigation (Dell’Unto, Landeschi, Apel & Poggi, 2017). Although 3D realistic models have broad applications in archaeology, this chapter focuses on their use to support archaeological field practice; specifically, it provides an overview
of how these applications and their associated data can be employed to support spatial analysis and
interpretation.
A brief history
Three-dimensional models have always been considered an innovative tool in cultural heritage research,
and since their introduction, the possibilities offered by these technologies have brought forth new
perspectives in archaeology (Reilly, 1989; Reilly, 1991; Forte & Siliotti, 1997). In the 1990s, Colin Renfrew discussed how 3D models forced archaeologists into more logical explanations, thereby providing
researchers with the opportunity to experiment with different interpretation methods (Renfrew, 1997).
However, at that time, several obstacles including the high costs of the technology and the complexity
of merging and using those types of information within the framework of more traditional investiga-
tion practice prevented the spread of these techniques throughout the archaeological community. The
creation and use of 3D models required the mastery of specific skills which were not traditionally part of any archaeological training, and thus, despite their relevance, their uptake remained slow.
In the last few years, the technological development of active and passive sensors, such as laser scanning and image-based 3D modelling techniques, and their diffusion at relatively low cost have allowed for the exponential spread of realistic 3D surface models, providing archaeologists with the opportunity to start
using these instruments and techniques in support of field investigations. This phenomenon was probably
due to a number of key events: (a) the development and diffusion of low cost techniques and instruments,
(b) the initiation of a methodological and theoretical discussion concerning the use of the third and
fourth dimensions in support of archaeological interpretation, and (c) the accomplishment of significant
experimentation where the introduction of 3D acquisition technology proved to be crucial in interpreting the site. Despite their obvious advantages, the introduction of these methods in the field raised concerns among practitioners, who warned of the risk of losing intellectual engagement with the material being recorded (Giuliani, 2008). For all their apparent realism and accuracy, 3D data are still the product of a complex interpretation process (Jeffrey, 2015; Garstki, 2016), and their affordances therefore depend strongly on the research goals posed before their creation (Dell’Unto, 2016).
Since their first application in archaeology, 3D modelling techniques were meant to be used within
the framework of a geodatabase, and in relation to different types of spatial information (Ducke, Score, &
Reeves, 2011). Thus, they would provide researchers with the opportunity to (a) perform complex opera-
tions of data visualisation and editing, (b) identify 3D patterns as a result of specific queries, (c) import
and handle new types of information, such as 3D volumes, and (d) enable the visualisation of information
through different types of devices, such as immersive or stereoscopic systems. Despite the lack of a single
system which includes all these functions, a number of experiments have focused on using 3D visual tech-
nologies and 3D realistic modelling techniques to create 3D palimpsests for performing different kinds of
spatial analysis (Bevan et al., 2014; Magnani & Schroder, 2015; Garstki, Arnold, & Murray, 2015; Opitz,
2014; Callieri et al., 2011; Forte, Dell’Unto, Issavi, Onsurez, & Lercari, 2012; Dellepiane, Dell’Unto, Callieri, Lindgren, & Scopigno, 2013; Dell’Unto, 2014; Roosevelt, Cobb, Moss, Olson, & Ünlüsoy, 2015). The
results of these experiments have proven the capability of 3D realistic surface models, combined with 3D
visualisation systems, to generate new ways of engaging with archaeological information. These findings
provide the scientific community with the opportunity to initiate a discussion about the impact that these
approaches will have on spatial analysis and, more specifically, about how the combination of 3D models
and different types of archaeological data can be used to identify new patterns.
Figure 23.1 A 3D model of the house of Caecilius Iucundus visualized through Unity 3D. A colour version
of this figure can be found in the plates section.
Source: Picture: Kennet Ruona
archaeologists and specialists with the opportunity to ‘walk across’ their simulations and review the infor-
mation previously collected directly in 3D space (Forte, Dell’Unto, Jonsson, & Lercari, 2015; Lercari,
Shiferaw, Forte, & Kopper, 2017).
Game engine systems are considered very promising because they can be used in immersive simulation environments, through which a more natural interaction with the scene is possible. By
using information from archaeo-botanical or geological studies, game engines have been used for simu-
lating and visualizing complex past ecosystems (Huyzendveld et al., 2012) and for the study of the use of
space in antiquity through the employment of virtual characters (Der Manuelian, 2013). However, despite
their increased application in scientific environments, in order to be used to support the field investiga-
tion process, these systems need to be transformed into “sandboxes” where data can be easily imported,
queried, analysed, manipulated, and visualized in spatial relation to other datasets and/or users in real time.
These systems are very powerful, and their future use and diffusion in support of field archaeological
practice is strongly dependent on the implementation of tools for spatial analysis.
Three-dimensional geographic information systems (3D GISs) have also been employed for managing
3D realistic models in support of archaeological practice. Traditional 2D GISs are considered to be the
most influential visualization tools for managing and analysing archaeological data, and in many countries,
they represent the standard for archaeological documentation (Allen, Green, & Zubrow, 1990; Lock &
Stancic, 1995; Wheatley & Gillings, 2002; Chapman, 2006; Conolly & Lake, 2006). Three-dimensional
GIS allow 3D surface models to be merged with more traditional datasets, making the employment of
this visualization instrument a more logical choice for archaeologists.
Although these platforms are not in themselves a novelty, the recent development and integration into these systems of tools oriented towards 3D visualization (the editing, querying and analysis of high-resolution point clouds such as Light Detection and Ranging (LiDAR) data, 3D surface models and 3D volumes) has revealed new scenarios for their use in archaeological field practice. 3D GIS allow for the management of data produced by different excavation approaches and allow users to interact with their data in completely novel ways (Figure 23.2). Currently, these systems have the capabilities to host and display
the results of multiple types of analysis in three dimensions (surface analysis, 3D spatial distributions, 3D
visual scapes, etc.), thus providing the basis for opening discussions concerning best practice with regard
to fieldwork recording strategies.
Most of the current archaeological field projects developed using these new versions of 3D GIS have
employed ESRI (Environmental Systems Research Institute) ArcGIS. This allows 3D realistic surface mod-
els to be directly imported and visualized in spatial relationships with more traditional data sets. ESRI’s
ArcGIS takes advantage of fast rendering speeds to explore in a 3D environment the information detected
and recorded in the field (ESRI, 2010).
Figure 23.2 3D GIS for the documentation of Insula V 1 at Pompeii. Vertical and horizontal drawings are made directly in the system using the 3D surface model of the house of Caecilius Iucundus as a 3D reference.
More recently, ESRI introduced ArcGIS Pro, which provides advanced 3D editing tools and allows the
importing, managing and further analysing of 3D models with higher resolution. The real time imple-
mentation of 3D realistic surface models within a 3D GIS used in support of an excavation affords new
ways of interacting with the dataset collected in the field. The adoption of these systems during excava-
tion as the primary tool for documentation is excellent for exploring 3D information in a non-linear
way, making the excavation process more contextual and reflexive, and providing archaeologists with a
more complete overview of the complex information retrieved on site (Berggren et al., 2015; Landeschi,
Nilsson, & Dell’Unto, 2016; Dell’Unto et al., 2017).
Despite the recent introduction of new 3D web-based GIS tools (Jensen, 2018; Galeazzi & Rissetto, 2018; Scopigno, Callieri, Dellepiane, Ponchio, & Potenziani, 2017), desktop 3D GIS platforms still represent the most widespread solution for testing new methodological and theoretical approaches in the field. An interesting
example of this can be found in the study of Stora Förvar, Sweden, a prehistoric cave excavated in arbitrary spit layers at the end of the 19th century. Despite the important information retrieved as a result of
the field investigation, due to the excavation method adopted at that time it was not possible to recon-
struct the stratigraphic sequence of the site. However, by using a 3D GIS to simulate in three dimensions
the relationship between the volumes of the spit layers and elements of the artefact assemblages retrieved
during different excavations, as shown in Figure 23.3, it was possible to partially redefine the stratigraphic sequence of the site, opening up new interpretative scenarios (Landeschi, 2018; Landeschi et al., 2016; Landeschi et al., 2018).
Figure 23.3 A virtual simulation of Stora Förvar. The volumes of spit layers excavated during the 19th century were reconstructed by combining the original drawings and the 3D surface model of the cave made by laser scanning technology.
Source: 3D acquisition: Stefan Lindgren; 3D visualization: Giacomo Landeschi and Victor Lundström
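The Stora Förvar simulation rests on a simple spatial operation: testing which reconstructed spit-layer volume each georeferenced find falls inside. The sketch below illustrates that containment query in the simplest possible form, with axis-aligned boxes standing in for the true reconstructed volumes; the layer extents, find identifiers and coordinates are all invented for illustration and are not the project's data.

```python
from dataclasses import dataclass

@dataclass
class SpitLayer:
    name: str
    lo: tuple  # (x, y, z) minimum corner of the reconstructed volume
    hi: tuple  # (x, y, z) maximum corner

def contains(layer, p):
    """Axis-aligned containment test for a 3D point."""
    return all(layer.lo[i] <= p[i] <= layer.hi[i] for i in range(3))

def assign_finds(finds, layers):
    """Map each find id to the first reconstructed spit volume containing it."""
    return {fid: next((ly.name for ly in layers if contains(ly, p)), None)
            for fid, p in finds.items()}

# Invented example: two stacked spits and three georeferenced finds.
layers = [SpitLayer("spit_1", (0, 0, 1.0), (10, 5, 1.5)),
          SpitLayer("spit_2", (0, 0, 0.5), (10, 5, 1.0))]
finds = {"F1": (2.0, 1.0, 1.2),   # inside spit_1
         "F2": (3.0, 2.0, 0.7),   # inside spit_2
         "F3": (20.0, 1.0, 1.2)}  # outside every reconstructed volume
assignment = assign_finds(finds, layers)
```

In a real 3D GIS the volumes are irregular meshes rather than boxes, but the logic is the same: the stratigraphic position of a find is recovered by intersecting its coordinates with the reconstructed layer geometry.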
Method
Since the first introduction of 3D GIS in archaeology, a number of projects have invested resources in
experimenting with the implementation of 3D realistic surface models in support of field investigations.
Research projects like Gabii Goes Digital (Opitz & Johnson, 2016), Çatalhöyük (Taylor et al., 2018),
Uppåkra (Callieri et al., 2011), 3D Digging (Forte, 2014) and the Kaymakçı Archaeological project
(Roosevelt et al., 2015) were among the first to start testing the limits and potential of 3D data in relation
to more traditional archaeological field recording systems. While the discussion initially focused mainly on aspects such as the sustainability and management of these types of information in support of site documentation (Forte et al., 2012; Roosevelt et al., 2015), the diffusion and systematic use of this approach in the field has allowed the focus to shift towards the discovery of new archaeological results
and their implications for theory and methodology. The question which characterizes this second phase
is as follows: how crucial is the use of these technologies for the definition of research questions and for
the identification of new archaeological information? While logistical aspects (such as the integration of these instruments and techniques into current field investigation activities) can be easily addressed and discussed, understanding the impact of these new methods at a methodological and theoretical level will not be immediate; it requires a deeper analysis of the effects of these recording and visualisation systems on the ways archaeologists interpret the past.
The creation of archaeological documentation is not only a mechanical procedure for recording the
geometrical characteristics of archaeological materials, but it is also a process through which archaeolo-
gists gain a deeper understanding of the contexts and features being recorded (Giuliani, 2008). On one
hand, 3D recording techniques have been shown to be faster and to produce very accurate and realistic representations of the materials retrieved on site (Scopigno et al., 2017). On the other hand, the idea that the
use of 3D realistic surface models can simply substitute conventional recording could create a dangerous
imbalance between the process of data documentation and intellectual engagement with the material
being recorded (Powlesland, 2016). As a consequence, in order to make 3D realistic surface models sig-
nificant elements in the interpretation process, it is important to generate data that reflect and emphasize
the observations accrued as a result of an active engagement with the material retrieved on site.
For this reason, when introducing 3D realistic surface models as part of field documentation, it is
necessary to do the following:
1 Assess the limits and potential of this type of data in relation to its use during archaeological
interpretation.
2 Design a data management system capable of establishing robust spatial relationships between 3D
realistic models and the rest of the documentation produced on site.
3 Carefully define the metadata which will be used to describe the 3D realistic models.
4 Re-define the site logistics in order to ensure that these data are created only after accurate analysis
of the materials retrieved in the field. Despite their characteristics, 3D surface models do not have
the capacity to reproduce all aspects of the original materials.
A methodological approach which takes into account the limits and potential of these types of data in
relation to the different spatial datasets produced as a result of field analysis has the potential to signifi-
cantly affect the impact that this documentation approach will have on the final interpretation of the
site. Integrating 3D surface models within the framework of field investigation activities is not an easy
task. More traditional documentation approaches do not include or manage such datasets, and for this
reason, 3D models produced (and used) during the excavation are often employed mainly to extract two-dimensional geometric representations of sections and maps of the site.
Considering the potential of these data for supporting field interpretation and spatial analysis, in the
past years, the Lund University Digital Archaeology Laboratory (DARKLab) focused on developing field
strategies for the use of 3D data in the context of archaeological field investigation. This activity aimed
to advance understanding of how the employment of a 3D system for recording and visualizing ongo-
ing field activities impacted the archaeological understanding of space. Since the method was developed,
a number of experiments on different case studies were initiated, and various approaches were tested.
Among the instruments and techniques for producing 3D surface models, image based 3D modelling
techniques proved to be the most efficient tools to employ in support of archaeological field investigation
(Callieri et al., 2011; Dellepiane et al., 2013).
The method relies on the integration and use of 3D realistic surface models in a 3D GIS for the
documentation and interpretation of contexts and features detected in the field. In particular, the process
consists of the following steps:
1 A digital camera is used to capture images of the context from multiple perspectives; using image-based modelling techniques (Agisoft PhotoScan), the pictures are then processed and a 3D textured surface model created.
2 Once generated, the 3D model is georeferenced using ground control points, imported into the geodatabase and visualized (in the field) in a 3D GIS platform (ESRI ArcScene or ArcGIS Pro).
3 Working directly in the trench on a tablet PC, 3D polylines are used to draw contexts and features using the 3D surface model (and the 3D GIS) as a geometrical reference.
4 A relational database (connected to the 3D polylines) is employed to record textual information (Figures 23.2 and 23.4).
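The georeferencing in step 2 amounts to estimating a 3D similarity transform (scale, rotation, translation) that maps model-space ground control points onto their surveyed world coordinates. The sketch below is a minimal, generic illustration of that computation using the closed-form Umeyama solution; it is not the routine actually used by PhotoScan or ArcGIS, and all coordinates are invented.

```python
import numpy as np

def similarity_transform(src, dst):
    """Estimate scale s, rotation R, translation t with dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points, e.g. ground control
    points in model space and their surveyed world coordinates.
    Closed-form Umeyama solution via SVD of the cross-covariance matrix.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)                    # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))   # guard against reflections
    D = np.array([1.0, 1.0, d])
    R = (U * D) @ Vt                                    # U @ diag(D) @ Vt
    s = (S * D).sum() / ((src_c ** 2).sum() / len(src)) # optimal isotropic scale
    t = mu_d - s * R @ mu_s
    return s, R, t

# Invented check: world frame = 2x scale, 90-degree rotation about z, offset.
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
gcp_model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
gcp_world = 2.0 * gcp_model @ Rz.T + np.array([100.0, 200.0, 50.0])
s, R, t = similarity_transform(gcp_model, gcp_world)
vertices_world = s * gcp_model @ R.T + t   # same transform applied to any mesh vertices
```

With noise-free correspondences the transform is recovered exactly; with real GCP measurements the same formula returns the least-squares best fit.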
The possibility of simultaneously exploring features and contexts both in the real world (looking, touch-
ing, changing perspective) and in the 3D GIS (where the contexts can be visualized in real time and in
spatial relation to materials and features removed in previous steps of the investigation) allows for simula-
tion of different 3D spatial scenarios and the exploration, in real time and within field investigation, of
different hypotheses.
To be successful, this approach requires the definition of acquisition strategies which maintain a good
balance between the number of pictures taken and processed on site and a proper 3D representation of the
contexts as exposed in the field. Due to the unpredictable nature of archaeological investigation, the field
acquisition approach adopted for recording is a delicate step, and it is usually defined after a discussion among
the field archaeologists operating on site. Each acquisition should aim to properly represent the material
identified in the field to support archaeological interpretation and spatial analysis. To be effective in the field,
3D surface models must be used in synergy with different types of datasets. Their creation and use in support
of ongoing field investigations allow for a number of simulations that are impossible to achieve with more
traditional documentation/visualization methods. Once imported into the 3D GIS, the models can be used
for (1) visualizing, exploring and analysing (in 3D and in spatial relation) contexts and materials exposed and removed at different steps of the investigation (Lingle et al., 2015), (2) detecting new archaeological information by identifying 3D patterns as the result of specific spatial queries (Wilhelmson & Dell’Unto, 2015) or (3) analysing the site morphology in order to detect new archaeological contexts and features.
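A simple instance of the spatial queries mentioned in (2) is a three-dimensional proximity search, returning every recorded find within a given radius of a feature of interest. The following sketch uses NumPy only; the find coordinates and the "hearth" feature centroid are invented for illustration.

```python
import numpy as np

def within_radius(points, centre, r):
    """Indices of 3D points lying within Euclidean distance r of centre."""
    d = np.linalg.norm(points - np.asarray(centre, dtype=float), axis=1)
    return np.flatnonzero(d <= r)

# Invented find coordinates (x, y, z in metres on a local site grid).
finds = np.array([[10.2, 4.1, 98.6],
                  [10.9, 4.0, 98.5],
                  [15.0, 9.0, 97.0]])
hearth = (10.5, 4.0, 98.5)                # hypothetical feature centroid
near = within_radius(finds, hearth, 0.5)  # indices of finds within 0.5 m
```

The same query run against a full geodatabase, with the vertical dimension included, is what allows 3D clusters around features to emerge that a 2D plan view would flatten away.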
Figure 23.4 The different steps undertaken in the field to record, visualize and interpret contexts and features
detected during field investigation.
Case studies
Çatalhöyük, Turkey
The Çatalhöyük Research Project (www.catalhoyuk.com/) is an important point of reference for the early integration and use of 3D realistic surface models in support of archaeological field investigation. The project has a long history of engagement with digital 3D technology (Tringham &
Stevanović, 2012; Berggren et al., 2015; Taylor et al., 2018), and since the end of the 1990s, the use of 3D
models has been integrated into this research activity as a tool which ‘allows for experimentation with
different ways of experiencing the site’ (Hodder, 2000, p. 8).
With the diffusion of 3D recording instruments, a number of activities were started on site to assess
the limitations and the potential of the data in supporting field practices. Following experiments developed within the framework of the 3D Digging project (Forte et al., 2012), three-dimensional recording techniques were adopted, customized and further developed by different teams of specialists
operating on site (Knüsel, Haddow, Sadvari, Dell’Unto, & Forte, 2013; Lercari & Lingle, 2016). After a period of transition and experimentation, the team of field archaeologists was introduced to the data generated, which were used in combination with GIS platforms and tablets; since 2013, 3D realistic surface models have been systematically employed by excavation teams to record spaces, buildings and features
at different steps of the investigation. To integrate the documentation system adopted on site, a number of protocols and workflows to support field investigation were developed (Taylor et al., 2018).
Figure 23.5 3D model of the plaster head (21666) after conservation. The model was generated using Agisoft PhotoScan Pro version 1.2.6, with the acquisition campaign and processing done by Nicoló Dell’Unto. A colour version of this figure can be found in the plates section.
Once in the
GIS, 3D realistic surface models allowed archaeologists to interact with the information in a completely
different way, helping them to gain a more accurate and holistic overview of the contexts and materials
retrieved on-site (Dell’Unto, 2016). The opportunity to generate and use 3D realistic surface models in
the field proved, on some occasions, to be crucial to understanding the complex relationships that were
revealed between contexts and structures (Carpentier, 2015). Specifically, during the excavation season of
2015, a painted plaster head was retrieved in the North Area within Building 132 (Figures 23.5 and 23.6).
Due to its instability, the plaster head was removed before it could be studied in spatial relation to the
rest of the structure. However, using the 3D archive produced since 2013, it was possible to examine the plaster head as it had been in situ (Lingle et al., 2015). A previously stored realistic 3D surface model showed the plaster head in its original position, at the junction of several walls and spaces in the southwest corner of the building. At the same junction, a crawl space
carved into a wall was also located. Unfortunately, due to a layer of plaster which covered the structures at
the time when the 3D model was created, it was not possible to understand the spatial relationship between
the plaster head and the different parts of the building. For this reason, a new 3D model of the plaster head was made after the work of the conservation lab and was virtually re-positioned in its original location using the 3D realistic surface models as geometric and spatial references, as shown in Figure 23.6.
The 3D model of the painted plaster head created after conservation was geo-referenced and imported into the project geodatabase. Then, using a 3D GIS platform, it was visualized in three dimensions to simulate the spatial relationship between the different contexts retrieved at different times during the excavation process. This 3D spatial simulation allowed the plaster head to be re-examined in relation to the rest of the building, showing that the feature was oriented towards the crawl-hole rather than towards the building itself (Lingle et al., 2015).
The use of this methodology allows the different contexts exposed and removed during the different phases of the excavation to be selected and visualized in a three-dimensional, geo-referenced space. It allowed the plaster head to be contextualised within the construction sequence of the building (Lingle et al., 2015), proving that this approach is capable of bridging the work of the several specialists working on site and of obtaining results otherwise impossible to achieve. The Çatalhöyük experience has shown how, in order to be useful to archaeological field research, 3D realistic surface models need to be acquired regularly and with a specific strategy in mind.
Figure 23.6 (a) 3D realistic surface model of Building 132 which displays the plaster head when still in situ. (b) Spatial simulation of the 3D plaster head after conservation re-integrated in its original position within the 3D GIS.
Source: Acquisition campaign and 3D models created by Jason Quinlan, Marta Perlinska and Nicoló Dell’Unto
Kämpinge, Sweden
Another case study where 3D realistic models were used in support of field investigations from their
beginning is the archaeological site of Kämpinge, Sweden. Since 2014, the site has been investigated by
the Institute of Archaeology and Ancient History at Lund University and is one of a group of Middle and Late Mesolithic coastal sites in the Öresund region dating from 8500–6000 cal BP, which belong to the Kongemose and Ertebølle cultures (Brinch Petersen, 2015). Due to the site’s complexity, since the
first excavation season, 3D realistic surface models were employed to record all the contexts and features
retrieved. The documentation method employed was based on an extensive use of image-based 3D
reconstruction techniques combined with a Real-Time Kinematic (RTK) Global Positioning System
(GPS) to spatially document contexts and materials retrieved during the field investigation campaign.
Once generated, the 3D models were georeferenced, imported and made available within a 3D GIS plat-
form to be used by the excavators in situ as 3D geometrical references for drawing the contexts in 3D
directly in the trench before their removal (Dell’Unto et al., 2017). The geodatabase developed during the
excavation was designed to support single context recording; all the information concerning the context
description was recorded electronically and at the trowel’s edge, as shown in Figure 23.7.
Figure 23.7 3D documentation constructed for the archaeological field investigation of Kämpinge, Sweden.
(a, b, c) Trench N09–04 3D recorded and visualized inside the 3D GIS used on site. (d) 3D models of Trench
06 made during the excavation season 2014 and 2015.
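A single-context geodatabase of the kind described above can be pictured as a thin relational layer linking each 3D geometry drawn in the GIS to its context record. The SQLite sketch below is a hypothetical, much-simplified schema; the table names, columns and sample rows are invented and are not those of the Kämpinge database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE context (
    context_id   INTEGER PRIMARY KEY,
    description  TEXT,
    excavated_on TEXT
);
-- one row per 3D polyline drawn against the surface model in the GIS
CREATE TABLE geometry3d (
    geom_id    INTEGER PRIMARY KEY,
    context_id INTEGER REFERENCES context(context_id),
    wkt        TEXT  -- e.g. 'LINESTRING Z (...)'
);
""")
con.execute("INSERT INTO context VALUES (101, 'ash lens, single context', '2014-07-02')")
con.execute("INSERT INTO geometry3d VALUES (1, 101, 'LINESTRING Z (10 4 98.6, 10.4 4.1 98.6)')")

# Join a context sheet back to its 3D geometry.
rows = con.execute("""
    SELECT c.context_id, c.description, g.wkt
    FROM context c JOIN geometry3d g USING (context_id)
""").fetchall()
```

Keeping the descriptive record and the 3D geometry in separate but linked tables is what lets a context sheet written "at the trowel's edge" be recalled later, in spatial relation to everything excavated before and after it.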
The opportunity to visually simulate the excavation process in three dimensions allowed for a better understanding of the actions performed by different teams, thus providing a more holistic overview of the relationships between the contexts already removed by the excavation process (Dell’Unto
et al., 2017). Across the years, the system proved to be capable of reflecting and emphasizing observations
which accrued as a result of an active engagement with the contexts and materials retrieved on site, trig-
gering an interpretation process among different recording agents which occurred at the same time in
the real and in the virtual world. Among the many results obtained by using this approach in the field, the most interesting was the way the system affected the excavation strategy: increasing, not reducing, the level of intellectual engagement with the material being investigated. During the excavation, archaeologists carefully reviewed in three dimensions aspects of the trenches from previous years by combining in a single 3D space the contexts and artefacts that were retrieved at different times by different teams.
The advantages of this approach were visible from the first excavation season, when six artefacts
identified as a flint flake, a flint axe core, a piece of worked amber, a fragmented disc in black slate, a
flint “Trollsten” and a retouched flint flake were documented in 3D whilst still in situ (Apel, Leffler,
Landeschi, & Dell’Unto, 2015). The 3D models were georeferenced, imported, visualized and stored in
the 3D geodatabase as part of the documentation strategy. During the second excavation season, a larger
trench was opened in the same location by a different team to further investigate the contexts identified
in the previous year. This team employed a tablet PC to visualize the 3D contexts and artefacts retrieved
the year before. The ability to use the 3D information stored in the geodatabase for testing multiple 3D
simulations allowed the archaeologists to gain a clear spatial understanding of the relationship between
the sequence of contexts detected the year before and the new materials. Specifically, the use of this
methodology allowed for the review of the exact position of the artefacts previously excavated and for
a spatial comparison with the contexts and artefacts undergoing active excavation. The possibility of
simulating in three dimensions the spatial relationship of those contexts as if they were exposed at the
same time allowed for a deeper level of understanding of the complex relations retrieved across the years
(Dell’Unto et al., 2017).
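The season-to-season review described above presupposes that every find and context carries georeferenced 3D coordinates plus season metadata, so that material excavated in different years can be recombined in one virtual space. The sketch below illustrates that idea only; the record layout, sample values and proximity query are hypothetical, not the project's actual geodatabase schema:

```python
import math
from dataclasses import dataclass

@dataclass
class Find3D:
    label: str      # e.g. "flint flake"
    x: float        # georeferenced coordinates (metres)
    y: float
    z: float
    season: int     # excavation year

def within_radius(finds, x, y, z, radius):
    """Return finds from any season within `radius` metres of a 3D point,
    so material excavated in different years can be reviewed together."""
    hits = [f for f in finds if math.dist((f.x, f.y, f.z), (x, y, z)) <= radius]
    return sorted(hits, key=lambda f: f.season)

# Illustrative records only, not real Kämpinge data.
finds = [
    Find3D("flint flake", 10.0, 5.0, 1.2, 2014),
    Find3D("worked amber", 10.3, 5.1, 1.1, 2014),
    Find3D("retouched flake", 10.1, 4.9, 1.0, 2015),
    Find3D("pottery sherd", 40.0, 30.0, 0.8, 2015),
]

nearby = within_radius(finds, 10.0, 5.0, 1.1, radius=1.0)
```

Querying around a newly exposed context then returns the 2014 and 2015 finds side by side, ordered by season, which is the essence of the cross-season comparison described in the text.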
Conclusion
Before the diffusion of 3D acquisition instruments and techniques, 3D models were rarely used by
field archaeologists. The very few 3D reconstructions that were presented in the literature at the end of
the 1990s were created mainly by private companies whose development focused on public outreach
(Frischer, 2008).
Producing realistic 3D models for use in archaeological investigation is no longer an obstacle.
Advances in 3D acquisition techniques and computational resources today allow the acquisition and
processing of realistic 3D models as an integral part of the archaeological excavation process. More prob-
lematic seems to be the lack of data visualization systems (see also Eve & Graham, this volume) that are
customized for enhancing the real potential of this information, as well as the absence of routines aimed
at instructing archaeologists on how and when to use these systems to support their interpretations.
Three-dimensional realistic surface models can carry a substantial amount of information and can be
linked to very large bodies of heterogeneous data, both spatial and non-spatial.
To date, despite their resolution and accuracy, the large number of isolated 3D models has
had a very limited impact on wider site interpretation. The establishment of data management systems
capable of contextualizing such data will eventually augment the use of 3D realistic models within the
framework of more complex spatial queries, increasing their impact on the current interpretation process.
456 Nicoló Dell’Unto
The introduction of such complex systems and data while the interpretation process is ongoing is not
an easy task. Therefore, an important question to be considered is how and when we should engage with
these systems and data in the field. What needs to be taken into account when creating this information
and what kind of affordances do these data need to have in order to be useful for interpreting the past?
Despite the results so far achieved within different field projects, 3D realistic surface models have only
just begun to be integrated into archaeological field practice. To date, those applications have provided
archaeologists with the opportunity to experiment with investigative approaches capable of discovering
information that was previously impossible to detect. Due to their characteristics, 3D realistic surface
models have proven easy to integrate into more traditional documentation practices, functioning as a
robust palimpsest that supports on-site discussion and highlights otherwise elusive details. This approach allows for the management and analysis
of archaeological data in three dimensions, making it possible to simulate in the field the multi-temporal
actions performed during the excavation process, gluing together the fragmented data retrieved during
different years.
The introduction of 3D realistic surface models, together with different types of 3D data, 3D volumes
and point clouds, will bring important changes to the way archaeologists collect and use information
retrieved as the result of field activities, affecting not just the final outcome of the interpretation process
but also the way the process is constructed. Moreover, the increased production of 3D surface models
will lead to the creation of large 3D archives of archaeological data, which will further affect the way
information will be transmitted and used among the community of practitioners.
Acknowledgements
The work described in this paper was generously supported by the Birgit och Sven Håkan Ohlssons
Foundation and the DARKLab, Laboratoriet för Digital Arkeologi, Lund University, Sweden. The author
wishes to thank the Humanities Laboratory, Lund University, for the opportunity to use instruments and
facilities to perform parts of the experiments described in the text. I wish to thank James Taylor and
Giacomo Landeschi for endless discussions on this topic, the editors of this volume for the great feedback
on this text, the Catalhoyuk Research Project, the 3D Digging Project and The Kämpinge Project for
the opportunity to develop these experiments and participate in productive discussions.
References
Allen, K. M. S., Green, S. W., & Zubrow, E. B. W. (1990). Interpreting space: GIS and archaeology. London: Taylor and
Francis.
Apel, J., Leffler, J., Landeschi, G., & Dell’Unto, N. (2015). Stenålderslokalen vid Kämpinge 24:2, Räng Socken (RAÄ
4:1). Skåne – Säsongen 2015 Excavation report. Lunds Universitet. Lund, Sweden.
Berggren, Å., Dell’Unto, N., Forte, M., Haddow, S., Hodder, I., Issavi, J., . . . Taylor, J. (2015). Revisiting reflexive
archaeology at Çatalhöyük: Integrating digital and 3D technologies at the trowel’s edge. Antiquity, 89, 433–448.
Bevan, A., Li, X., Torres, M., Green, S., Xia, Y., Zhao, K., & Rehren, T. (2014). Computer vision, archaeological clas-
sification and China’s terracotta warriors. Journal of Archaeological Science, 49, 249–254.
Brinch Petersen, E. (2015). Diversity of Mesolithic Vedbæk. Acta Archaeologica, 86(1). Oxford: Wiley.
Callieri, M., Dell’Unto, N., Dellepiane, M., Scopigno, R., Soderberg, B., & Larsson, L. (2011). Documentation and
interpretation of an archeological excavation: An experience with dense stereo reconstruction tools. In M. Del-
lepiane, S. Serna, H. Rushmeier, L. Van Gol, & F. Nicolucci (Eds.), VAST the 11th international symposium on virtual
reality archaeology and cultural heritage (pp. 33–40). Prato: Eurographics.
The analytical role of 3D graphics 457
Campana, S. (2014). 3D modeling in archaeology and cultural heritage-theory and best practice. In S. Campana &
F. Remondino (Eds.), 3D surveying and modeling in archaeology and cultural heritage theory and best practices (pp. 7–13).
Oxford: BAR International Series.
Carpentier, F. (2015). “Buildings 6, 24 and 17” (Çatalhöyük 2015 Archive Report, report by the Çatalhöyük Research
Project 44–8). Retrieved from Catalhoyuk Research Project www.catalhoyuk.com/research/archive_reports
Chapman, H. (2006). Landscape archaeology and GIS. Stroud: Tempus.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology (Cambridge manuals in archaeology). Cam-
bridge, UK: Cambridge University Press.
Dellepiane, M., Dell’Unto, N., Callieri, M., Lindgren, S., & Scopigno, R. (2013). Archaeological excavation monitor-
ing using dense stereo matching techniques. Journal of Cultural Heritage, Elsevier, 14(3), 201–210.
Dell’Unto, N. (2014). The use of 3D models for intra-site investigation in archaeology. In S. Campana & F. Remon-
dino (Eds.), 3D surveying and modeling in archaeology and cultural heritage theory and best practices (pp. 151–158).
Oxford: BAR International Series.
Dell’Unto, N. (2016). Using 3D GIS platforms to analyse and interpret the past. In M. Forte & S. Campana (Eds.),
Digital methods and remote sensing in archaeology: Archaeology in the age of sensing (pp. 305–322). Cham, Switzerland:
Springer.
Dell’Unto, N., Landeschi, G., Apel, J., & Poggi, G. (2017). 4D recording at the trowel’s edge: Using three-dimensional
simulation platforms to support field interpretation. Journal of Archaeological Science: Reports, Elsevier, 12, 632–645.
Dell’Unto, N., Landeschi, G., Leander Touati, A. M., Dellepiane, M., Callieri, M., & Ferdani, D. (2016). Experienc-
ing ancient buildings from a 3D GIS perspective: A case drawn from the Swedish Pompeii Project. Journal of
Archaeological Method and Theory, Springer, 23(1), 73–94.
De Reu, J., Plets, G., Verhoeven, G., De Smedt, P., Bats, M., Cherretté, B., . . . De Clercq, W. (2013). Towards a
three-dimensional cost effective registration of the archaeological heritage. Journal of Archaeological Science, 40,
1108–1121.
Der Manuelian, P. (2013). Giza 3D: Digital archaeology and scholarly access to the Giza pyramids: The Giza project at
Harvard University. In A. Addison, G. Guidi, L. De Luca, & S. Pescarin (Eds.), Proceedings of digital heritage 2013.
Digital Heritage International Congress (pp. 727–734). Marseille: IEEE – Institute of Electrical and Electronics
Engineers Inc.
Ducke, B., Score, D., & Reeves, J. (2011). Multiview 3D reconstruction of the archaeological site at Weymouth from
image series. Computers & Graphics, 35, 375–382.
ESRI. (2010). What’s new in ArcGIS 3D Analyst 10, February 2010. Resource document. Retrieved from http://help.
arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00qp0000000z000000.htm. Accessed 20 March 2018.
Forte, M. (2014). 3D archaeology: New perspectives and challenges: The example of Çatalhöyük. Journal of Eastern
Mediterranean Archaeology and Heritage Studies, 1, 1–29.
Forte, M., Dell’Unto, N., Issavi, J., Onsurez, L., & Lercari, N. (2012). 3D archaeology at Çatalhöyük. International
Journal of Heritage in the Digital Era, 1, 351–377.
Forte, M., Dell’Unto, N., Jonsson, K., & Lercari, N. (2015). Interpretation process at Çatalhöyük using 3D. In I.
Hodder, & A. Marciniak (Eds.), Assembling Çatalhöyük (Vol. 1). Themes in Contemporary Archaeology, Vol. 1.
Leeds: Maney.
Forte, M., & Silotti, A. (1997). Virtual archaeology. London: Thames and Hudson Ltd.
Frischer, B. (2008). From digital illustration to digital heuristics. In B. Frischer & A. Dakouri-Hild (Eds.), Beyond
illustration: 2D and 3D digital technologies as tools for discovery in archaeology. Oxford: Archaeopress and BAR Inter-
national Series.
Galeazzi, F., & Richards-Rissetto, H. (2018). Editorial introduction: Web-based archaeology and collaborative research. Journal
of Field Archaeology, 43(sup1), S1–S8. doi:10.1080/00934690.2018.1512701
Garstki, K., Arnold, B., & Murray, M. (2015). Reconstituting community: 3D visualization and early Iron Age social
organization in the Heuneburg mortuary landscape. Journal of Archaeological Science, 54, 23–30.
Garstki, K. (2016). Virtual representation: The production of 3D digital artifacts. Journal of Archaeological Method and
Theory, 24, 726–750.
Giuliani, C. F. (2008). Prefazione. In M. Bianchini (Ed.), Manuale di rilievo e di documentazione digitale in archeologia
(pp. 9–12). Roma: Aracne editrice.
Hodder, I. (2000). Developing a reflexive method in archaeology. In I. Hodder (Ed.), Towards reflexive method in
archaeology: The example at Çatalhöyük (pp. 3–14). Cambridge: McDonald Institute for Archaeological Research.
Huyzendveld, A. H., Di Ioia, M., Ferdani, D., Palombini, A., Sanna, V., Zanni, S., & Pietroni, E. (2012). The virtual
museum of the tiber valley. In A. Grande & V. Bendicho (Eds.), Proceedings of III Congreso International de Arquelo-
gia e Informatica Grafica, Patrimonio e Innovation [proceedings of the III international congress of archaeology and computer
graphics, heritage and innovation] (pp. 97–101). Sevillia, Spain: Virtual Archaeology Review.
Jeffrey, S. (2015). Challenging heritage visualization: Beauty, aura and democratisation. Open Archaeology, 1, 144–152.
Jensen, P. (2018). Semantically enhanced 3D: A web-based platform for spatial integration of excavation documenta-
tion at Alken Enge, Denmark. Journal of Field Archaeology, S43, 1–14.
Knüsel, C. J., Haddow, S. D., Sadvari, J. D., Dell’Unto, N., & Forte, M. (2013). Bioarchaeology in 3D: Three-dimensional
modeling of human burials at Neolithic Çatalhöyük. Poster presented at 82nd meeting of the American Association of
Physical Anthropologists in Knoxville, TN, 9–13 April. https://doi.org/10.13140/RG.2.2.18447.69287
Landeschi, G. (2018). Rethinking GIS, three-dimensionality and space perception in archaeology. World Archaeology.
https://doi.org/10.1080/00438243.2018.1463171
Landeschi, G., Apel, J., Lundström, V., Storå, J., Lindgren, S., & Dell’Unto, N. (2018). Re-enacting the sequence:
Combined digital methods to study a prehistoric cave. Archaeological and Anthropological Sciences. https://doi.
org/10.1007/s12520-018-0724-5
Landeschi, G., Nilsson, B., & Dell’Unto, N. (2016). Assessing the damage of an archaeological site: New contribu-
tions from the combination of Image-based 3D modelling techniques and GIS. Journal of Archaeological Science:
Reports, 10, 431–440.
Lercari, N., & Lingle, A. M. (2016). Çatalhöyük digital preservation project (Çatalhöyük 2016 Archive Report, report by
the Çatalhöyük Research Project). Retrieved from Catalhoyuk Research Project www.catalhoyuk.com/research/
archive_reports
Lercari, N., Shiferaw, E., Forte, M., & Kopper, R. (2017). Immersive visualization and curation of archaeological
heritage data: Çatalhöyük and the Dig@IT App. Journal of Archaeological Method and Theory, 25(2), 368–392.
Lingle, A., Dell’Unto, N., Der, L., Doyle, S., Killackey, K., Klimowicz, A., Tung, B. (2015). Painted plaster head (Çat-
alhöyük 2015 Archive Report, report by the Çatalhöyük Research Project 44–8). Retrieved from Catalhoyuk
Research Project www.catalhoyuk.com/research/archive_reports
Lock, G., & Stancic, Z. (1995). The impact of GIS on archaeology: A European perspective. New York: Taylor and Francis.
Magnani, M., & Schroder, W. (2015). New approaches to modeling the volume of earthen archaeological features:
A case-study from the Hopewell culture mounds. Journal of Archaeological Science, 64, 12–21.
Opitz, R. (2014). Three dimensional field recording in archaeology: An example from Gabii. In B. R Olson &
W. R. Caraher (Eds.), 3D imaging in Mediterranean archaeology (pp. 64–73). North Dakota: The Digital Press, the
University of North Dakota.
Opitz, R., & Johnson, T. D. (2016). Interpretation at the controller’s edge: Designing graphical users interfaces for
the digital publication of the excavations at Gabii (Italy). Open Archaeology, 2, 1–17.
Opitz, R., & Limp, W. F. (2015). Recent developments in High-Density Survey and Measurement (HDSM) for
archaeology: Implications for practice and theory. Annual Review of Anthropology, 44(1), 347–364.
Powlesland, D. (2016). 3Di enhancing the record, extending the returns, 3D imaging from free range photography
and its application during the excavation. In H. Kamermans, et al. (Eds.), The three dimensions of archaeology proceed-
ings of the XVII UISPP world congress (1–7 September 2014, Burgos, Spain) volume 7/sessions A4b and A12 (pp. 13–32).
Oxford: Archaeopress.
Reilly, P. (1989). Data visualization in archaeology. IBM Systems Journal, 28(4), 569–579.
Reilly, P. (1991). Towards a virtual archaeology (pp. 133–139). CAA 18. Oxford: Archaeopress.
Renfrew, C. (1997). Foreword. In M. Forte & A. Silotti (Eds.), Virtual archaeology. London: Thames and Hudson Ltd.
Roosevelt, C., Cobb, P., Moss, E., Olson, B., & Unluso, S. (2015). Excavation is destruction digitization: Advances in
archaeological practice. Journal of Field Archaeology, 40(3), 263–380.
Scopigno, R., Callieri, M., Dellepiane, M., Ponchio, F., & Potenziani, M. (2017). Delivering and using 3D model on
the web: Are we ready? Virtual Archaeology Review, 8(17), 1–9.
Taylor, J., Issavi, J., Berggren, Å., Lukas, D., Mazuccato, C., Tung, B., & Dell’Unto, N. (2018). “The rise of the
machine”: The impact of digital tablet recording in the field at Çatalhöyük. Internet Archaeology, 47.
Tringham, R., & Stevanović, M. (2012). Last house on the hill: BACH area reports from Çatalhöyük, Turkey. Monumenta
Archaeologica, Vol. 27. Los Angeles: Cotsen Institute of Archaeology (UCLA) Press.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor and Francis.
Wilhelmson, H., & Dell’Unto, N. (2015). Virtual taphonomy: A new method integrating excavation and post-
processing of human remains. American Journal of Physical Anthropology, 157(2), 305–321.
von Schwerin, J., Richards-Rissetto, H., Remondino, F., Agugiaro, G., & Girardi, G. (2013). The MayaArch3D project:
A 3D WebGIS for analyzing ancient architecture and landscapes. Literary and Linguistic Computing, 28(4).
24
Spatial data visualisation
and beyond
Stuart Eve and Shawn Graham
Introduction
Deformed visions
The attempt to appreciate the sensory worlds of others, distant in time and place necessitates an unlearning:
that we subject to scrutiny our sensory education, of which the prejudice towards vision is only one part.
(Gosden, 2001, p. 166)
I want to propose a theory and practice of a Deformed Humanities. A humanities born of broken, twisted
things. And what is broken and twisted is also beautiful, and a bearer of knowledge. The Deformed Humani-
ties is an origami crane – a piece of paper contorted into an object of startling insight and beauty.
(Sample, 2012)
Vast computational power promises us that rainbow's end we've been chasing: the ability to experience,
visualise and explore the past as it was. Even if we couch that desire in caveats, still, the desire remains.
There is nothing wrong with this desire; what is wrong is to pretend that it does not exist.
We have to consider that our digital sense – that extended cognition that overlays and permeates space
(knowing what friends are up to miles away because of constant social media updates; the ability to be
guided through traffic congestion via constantly updated maps; the sense of loss that occurs when there
is no wi-fi signal) is part of the sensorium that archaeologists must now contend with. Let us then begin
with this newest sense, and consider the ways it can intersect with physical space especially when that sense
is dependent on these ephemeral, ghostly, haunted objects that ‘send [our] social relations off down a new
path, not through any intention on the part of the object, but through its effects on the sets of social
relations attached to various forms of sensory activity’ (Gosden, 2001, p. 165).
For instance – many Wikipedia articles contain geographic metadata. They are articles about a par-
ticular place. What tool would we reach for to understand this geographic coverage? A map, of course,
replete with dots or other icons. But Wikipedia exists in its own digital space(s), social and informatic,
spaces that overlie real world space. Several years ago we built ‘Historical Friction’ (Graham & Eve, 2013),
a web-toy app extended from Ed Summers’ Ici (Summers, 2016). Summers’ app took the geolocation
from a user’s device and returned the list of Wikipedia articles geocoded to nearby places. ‘Historical
Friction’ by contrast vocalised that list with several computerised voices from text-to-speech synthesizers.
The denser a locale, the greater the cacophony of computers yelling at the listener. The web-toy was not
a pleasant experience. It depended on the user pulling out the ear-buds, taking off the headphones, and
seeing the place with new eyes in the revelatory silence.
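The pipeline behind such a web-toy can be sketched in a few lines of Python: ask the MediaWiki geosearch endpoint for articles geocoded near the device's position, then scale the cacophony by how many come back. The API parameters below are the real MediaWiki geosearch ones, but the density-to-voices mapping is our own hypothetical simplification, not the code of 'Historical Friction' itself:

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def geosearch_url(lat, lon, radius_m=500, limit=50):
    """Build a MediaWiki geosearch query for articles geocoded near a point."""
    params = {
        "action": "query",
        "list": "geosearch",
        "gscoord": f"{lat}|{lon}",
        "gsradius": radius_m,   # metres; the API accepts 10-10000
        "gslimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"

def cacophony(titles, max_voices=8):
    """Map article density to a number of simultaneous synthesised voices:
    the denser the locale, the louder the place shouts at the listener."""
    return min(max_voices, len(titles))

url = geosearch_url(51.5007, -0.1246)   # illustrative coordinates
```

Fetching `url` returns a JSON list of nearby article titles; feeding each title to a text-to-speech voice, with `cacophony` deciding how many speak at once, recreates the din the web-toy produced in dense locales.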
Digital data is there for us to reach out and experience. It is not safely confined to a computer in the
lab. It permeates space. Experiencing it can be like Sample’s origami crane. In this chapter we gesture
towards ways we might usefully deform archaeological spatial data, thinking especially about sound and
vision.
Archaeological vision
The opening passage of Stephanie Moser’s exploration of the birth of archaeological visualisation states:
It is no surprise that archaeology – a discipline that is centered on the study of material culture –
relies heavily on a large suite of visual products to record, interpret, and present its findings to
professional and public audiences.
(Moser, 2012, p. 292)
She goes on to define visualisation as follows: “On the one hand, it refers to the products that result
from graphically representing archaeological materials and on the other it refers to the process of
interpretation embodied in this visual translation” (Moser, 2012, p. 295). This is also true of spatial visualisations.
When one thinks of spatial data visualisation the ‘map’ is immediately brought to mind. Cartography
and map-making have been at the centre of how we visualise the world for millennia (see Andrienko &
Andrienko, 2006; Slocum et al., 2008; Kraak & Ormeling, 2013; Tyner, 2014; Gillings, Hacıgüzeller, &
Lock, 2019a for overviews). As archaeologists we draw plans and sections and put countless dots on maps (the
meditative nature of manually drawing such plans has recently been celebrated by Caraher, 2015, as slow
archaeology). We explore and record an archaeological site horizontally and vertically using complicated
(but also familiar) notation such as the hachure or stippling. As well as using this abstract symbology we
produce more ‘accurate’ products like photographs of our trenches and the landscapes in which we are
working. We convert the electrical impulses from our geophysical equipment to colours and hues to help
us visualise the resistance of the soil to electricity. We capture signals from satellites and convert them to
a precise location on the planet which we then represent by a dot or the node in a line on a map. This
volume itself is replete with precisely this kind of visualisation.
These techniques are all very familiar to the archaeologist and each one has a vast amount of literature
that can be examined, questioned and challenged. There is no space within this short chapter to
do justice to a detailed exploration of each of these methods; however, it is fair to say that the majority
of spatial visualisations created by archaeologists are currently created using Geographic Information
Systems (GIS). These visualisations tend to be presented as 2D plans or maps effectively recreating the
drawn record, albeit with clearer symbology and layout. Perhaps as a result of this digital proxy for the
hand-drawn record, visualisation of space using GIS has traditionally been seen as a by-product of a deeper
spatial analysis, what Ebert calls the “read-only mode of GIS” (2004, p. 320). This view has recently been
challenged by Gupta and Devillers, who argue that “visualisation encourages the use of our cognitive
abilities (rather than equations and algorithms) to process information and generate new knowledge”
(2017, p. 855). The degree to which our orthodox visualization techniques nurture and encourage such
an engagement is currently moot.
Archaeological spatial visualisation also has to take account of the temporal dimension. For example,
we present results from surveys that took place at specific times, representing artefacts that were
deposited sometimes thousands of years apart. Any spatial visualisation that we create must necessar-
ily deal with both spatial and temporal uncertainty (see Fusco & de Runz this volume). Unfortunately,
current GIS software “typically enables navigation of the spatial and thematic dimensions, but it does
not offer effective exploration of the temporal dimension” (Gupta & Devillers, 2017, p. 876). The
overwhelming majority of archaeological spatial visualisation is performed through the medium of the
cartographic product, be that a set of time-series diagrams, an interactive 2D or 3D GIS interface with
a ‘time-slider’ to explore the temporal aspect or a simple map showing points, lines and polygons. This
often means that the data has to be simplified to fit the requirements of the available tools, rather than
encouraging an exploration of different forms of visualisation.
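One way a time-slider could honour temporal uncertainty, rather than simplify it away, is to store each feature with a dated range and draw it whenever that range overlaps the slider window. A hypothetical sketch of the overlap test (not how any particular GIS package implements its slider):

```python
from dataclasses import dataclass

@dataclass
class DatedFeature:
    name: str
    earliest: int   # terminus post quem: years BCE negative, CE positive
    latest: int     # terminus ante quem

def visible_in_window(features, start, end):
    """A feature is drawn whenever its dating *range* overlaps the slider
    window, so chronological uncertainty stays visible instead of being
    collapsed to a single midpoint date."""
    return [f for f in features if f.earliest <= end and f.latest >= start]

# Illustrative features with deliberately broad, overlapping date ranges.
features = [
    DatedFeature("ditch fill", -800, -400),
    DatedFeature("Roman brooch", 50, 150),
    DatedFeature("post hole", -100, 100),
]

iron_age = visible_in_window(features, -700, -1)
```

Sliding the window to 100–200 CE would drop the ditch fill but keep the brooch and the post hole, because the latter's range straddles the era boundary; a midpoint-only slider would hide exactly this kind of ambiguity.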
As Gillings, Hacıgüzeller and Lock state, “there is nothing wrong with maps that are argumentative,
discordant, disruptive, playful, provocative or simply beautiful . . . . if novel connections and relations can
only be built [through these methods] then that is how it will have to be” (2019b, pp. 11–12). Beyond
the traditional map or plan, other forms of spatial visualisation exist that enable us to approach archaeo-
logical data in different ways. These include the novel presentation of statistical analyses, such as Martin
Sterry’s work (2018) which uses the Hue-Saturation-Value (HSV) colour wheel to visualise results of
multi-dimensional correspondence analysis of pottery use in Roman Britain. Recent advances in web-
based technology have allowed annotated interactive 3D virtual reality visualisations of LiDAR and other
data to be presented through online portals such as SketchFab (see https://sketchfab.com/markwalters).
It is now even possible to 3D print scale models of landscapes or artefacts and ‘visualise’ them haptically
(Neumüller, Reichinger, Rist, & Kern, 2014; Di Franco, Camporesi, Galeazzi, & Kallmann, 2015).
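Sterry's use of the HSV colour wheel can be illustrated with the standard library alone: place each assemblage on the first two correspondence-analysis axes, then map its angle round the origin to hue and its distance from the origin to saturation. This mapping is our own schematic reading of the technique, not Sterry's actual procedure:

```python
import colorsys
import math

def ca_point_to_rgb(x, y, max_radius=1.0):
    """Map a 2D correspondence-analysis score to a colour: the angle round
    the origin becomes hue, the distance from the origin becomes saturation,
    so similar assemblages get similar colours and extreme ones get vivid ones."""
    hue = (math.atan2(y, x) % (2 * math.pi)) / (2 * math.pi)  # 0..1 round the wheel
    sat = min(1.0, math.hypot(x, y) / max_radius)             # washed-out near origin
    return colorsys.hsv_to_rgb(hue, sat, 1.0)                 # full brightness

# A point far out on the positive first axis comes back fully saturated red;
# a point at the origin (an unremarkable assemblage) comes back white.
r, g, b = ca_point_to_rgb(1.0, 0.0)
```

The appeal of the scheme is that two dimensions of a multi-dimensional analysis are read off at a glance from a single symbol colour, rather than from paired scatterplots.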
Perhaps unsurprisingly, an analysis of the available literature on visualisation suggests that archae-
ologists are only the tip of the spatial visualisation iceberg (see for instance MacEachren et al., 1998;
Slocum et al., 2001; Brewer, MacEachren, Abdo, Gundrum, & Otto, 2000; Crampton, 2002; Howard &
MacEachren, 1996, whose work, from a network-theoretic point of view, ties the scholarship of geo-
graphic visualisation together).
If we perform the same quick computational reading of the citation knowledge graph (a network-analytic
reading of the results of a Google Scholar search for ‘archaeological + data + visualization’, so
as to see data visualization beyond archaeological GIS; Figure 24.1) and look for the articles that tie the
network together (taking that as a signal that the ideas contained therein bridge scholarship), we find a
very strong focus on Virtual Reality work (Acevedo, Vote, Laidlaw, & Joukowsky, 2001; Vote, Acevedo,
Laidlaw, & Joukowsky, 2002; Allen et al., 2004; Van Dam, Laidlaw, & Simpson, 2002; Forte, Dell’Unto,
Issavi, Onsurez, & Lercari, 2012). If we are not doing GIS and if we are not drawing plans or plotting
dots, we are building virtual reality (VR); our debt to archaeological photography and the ‘visual’ aspect
of visualisation seems clear.
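'Tying the network together' can be made precise as betweenness centrality: the share of shortest paths in the graph that pass through a given node. A self-contained sketch using Brandes' algorithm, run on an invented toy graph in which a single paper bridges two otherwise separate citation clusters (a real analysis would instead load the graph that 'Etudier' exports):

```python
from collections import deque

def betweenness(adj):
    """Brandes' betweenness centrality for an undirected, unweighted
    graph given as {node: [neighbours]}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}; sigma[s] = 1    # shortest-path counts
        dist = {v: -1 for v in adj}; dist[s] = 0
        queue = deque([s])
        while queue:                                  # BFS from source s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                                  # accumulate dependencies
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: c / 2 for v, c in bc.items()}          # undirected: halve

# Two citation clusters joined only through one bridging paper (invented names).
graph = {
    "gis1": ["gis2", "gis3", "bridge"], "gis2": ["gis1", "gis3"],
    "gis3": ["gis1", "gis2"],
    "vr1": ["vr2", "vr3", "bridge"], "vr2": ["vr1", "vr3"],
    "vr3": ["vr1", "vr2"],
    "bridge": ["gis1", "vr1"],
}
scores = betweenness(graph)
```

The bridging paper scores highest even though it has the fewest links: every shortest path between the two literatures must pass through it, which is exactly the sense in which MacEachren et al. (1998) and the other works named above are 'most central' without being most cited.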
Hamilakis (2014, p. 22) has argued that photography emerged as the medium of capitalism in the
19th century, in that photographs themselves became a kind of currency, a new form of visual
economy (citing Sekula, 1981; Poole, 1997). This autonomous and disembodied sense of vision was
quickly adopted in archaeology, making archaeology a “device of modernity” (Hamilakis, 2014, p. 9).
Archaeology’s privileging of the visual therefore is also complicit in the ontological admixture that
Hamilakis describes between aesthetics and politics, in that both circle around what is permitted to be
sensed, experienced, and appreciated, and by whom: consensus versus dissensus (2014, p. 415). The tools
and techniques of computational approaches to archaeology merely replicate this consensus. And yet,
archaeology is about that full-bodied sensuous engagement with the things and environments of the world,
at the trowel’s edge, from which we craft the past. This tension, Hamilakis tells us, is the wedge with
Figure 24.1 A citation network of results returned from a Google Scholar search of ‘geographic + visual-
ization’, using Ed Summers’ python package ‘Etudier’. “Google Scholar aims to rank documents the way
researchers do, weighting the full text of each document, where it was published, who it was written by, as
well as how often and how recently it has been cited in other scholarly literature.” https://scholar.google.
com/intl/en/scholar/about.html. The results give us a sense of the most important works by virtue of
these citation patterns. Thus, MacEachren, Boscoe, Hau, & Pickle (1998); Slocum et al. (2001); Brewer,
MacEachren, Abdo, Gundrum, & Otto (2000); Crampton (2002); Howard & MacEachren (1996) are most
functionally important in tying scholarship together. This is not the same thing as being the most often cited
work. Rather, these are the works whose ideas bridge otherwise disparate clumps; they are most central. A
colour version of this figure can be found in the plates section.
which we might insert a more fully sensorial engagement in archaeology (2014, p. 9). “There is no
perception which is not full of memories”, he reminds us, going on to assert that “ . . . it is my conviction
that all academic writing should become evocative, merging scholarly discourses with mnemonic and
autobiographical accounts” (2014, pp. 9–10). For Hamilakis, the merging of different ways to sense the world (Ingoldian
knots, perhaps, of lives lived, Ingold, 2015) means that all sensorial experience is synesthesia (Hamilakis,
2014, pp. 410–411).
Hamilakis also argues that “The human individual, especially as perceived and enacted in Western
capitalist modernity, is not the most appropriate unit of analysis for an archaeology of the senses. This
is not only because, as anthropological accounts have shown, human persons can be conceptualized and
embodied in diverse ways . . . . More important[ly] such an analytical category is inappropriate because
sensorial experience is activated at the moment of a transcorporeal encounter; this is an encounter among
human bodies, between human bodies and the bodies of other beings, and between human bodies and
objects, things, and environments” (Hamilakis, 2014, p. 411). This echoes Ingold in his discussion of the
“life of lines” (2015) where he argues not for assemblages, but for correspondences. The concept of the
‘assemblage’ is ‘too static’ because it does not allow for the frictions or tensions that bind things together.
Lines do – for they knot and twist and respond to one another. Meaning is not built from juxtaposed
blocks, but from movement along a line, where it bunches up on encountering other lines. As we shall
see later, if we cannot use the experienced senses of a human individual of today as a direct proxy for past
senses, perhaps we can instead use our present senses to create and experience new things about the past.
Hamilakis only deals with digital media briefly, regarding them as merely another prosthesis for
thought. Yet digital media is itself active and has a kind of agency (a way to effect change in the world)
in a way other classes of materials genuinely do not. Moreover, digital media bring another actor into the
mix, for digital work is a correspondence between user, machine, and programmer. Digital synaesthesia
emerges from this knotting. To work in a digital medium, to work with computational tools and semi-
autonomous software agents requires the performance of tacit knowledge and experience. We respond to
the machine and it in turn responds to us. We may call it a ‘black box’, which only serves to show that the
result is a deformance (portmanteau of ‘deform’ and ‘performance’), a making strange and an estrange-
ment from the sand and dirt and flies of the excavation. But if we recognize that computation is a kind of
knotted performance, then we should recognize also that computation returns an emotional connection
to this data, to remind us that data is always a proxy for human lives lived. And so it is not without ethical
consequences. The decisions we take in a digital medium, given the nature of computation (whose fun-
damental action is to copy), get multiplied in their effects. Copying implies connection, a tangled web of
articulations. Hence, the choice of representation (whether visual or aural), or form (or indeed, whom to
cite!), when there is a choice to be made (as there always is), is a force multiplier. Computation entangles
us, knots us, in networks/meshworks/filigrees of time and space. Computation expands our senses and
at the same time our entanglements with the world (cf. Hodder, 2012).
As archaeologists we are still very much at the edge of exploring the potential of the full sensorium
(see Mlekuz, 2004; Frieman & Gillings, 2007; Eve, 2014; Primeau & Witt, 2017) and especially so when
attempting to ‘visualise’ the past and the results of our spatial analyses (see, for example, Murdoch & Davies, 2017, on whether or not VR reconstructions could be spiritually affective). As the Internet of Things (Xia, Yang, Wang, & Vinel, 2012; Kopetz, 2011) becomes a reality, our concept of what a computer is has become more complicated. Our laptops, our smartphones, our GPS devices and even our toasters (Engadget, 2018) are connected to the internet at all times, blurring the boundaries between the real world that we inhabit and the virtual world that we visit via our devices. Currently,
however, a paradigm shift is occurring within the computer science sector towards ‘spatial computing’
(Shekhar, Feiner, & Aref, 2015). Spatial computing recognises this digital kinaesthesia, encompassing “the
Spatial data visualisation and beyond 465
ideas, solutions, tools, technologies, and systems that transform our lives by creating a new understand-
ing of locations – how we know, communicate, and visualise our relationship to locations and how we
navigate through them” (ibid., 72). Historically archaeologists have only embraced some aspects of spatial
computing, most notably geographic information systems (Conolly & Lake, 2006) and spatial statistics
(Wheatley & Gillings, 2000). These technologies and methods are now as familiar to the archaeologist
as the trowel – but spatial computing needs to meet the challenges and embrace the opportunities of
constantly emerging and evolving technologies. This includes the sheer quantity of data being collected
(see Green, this volume; McCoy, 2017; Cooper & Green, 2016, for discussions of [geospatial] big data in
archaeology), but also the evolving concept of space as represented within the computing environment.
Traditional GIS deals with points, lines, polygons and rasters in a very abstracted way, yet there is now
a “ . . . need for new algorithms, as well as cooperation between users and the cloud, full 3D position
and orientation (pose) estimation of people and devices, and registration of physical and virtual things”
(Shekhar et al., 2015, p. 77). We are developing the technology to capture human bodies and archaeo-
logical objects with full degrees of freedom and can represent them in virtual space (Eve, 2018a). As we
will go on to demonstrate, we can now take our GIS objects or the results of our statistical analyses and
present and explore them in the real locations of real reality, rather than just on the screen of a computer.
The familiar 2D or 2.5D representations of the printed map or illustration can become a real 3D world overlaid on the actual environment, one that we can engage with and embody.
The ‘embodied GIS’ was first introduced by Stuart Eve (Eve, 2012, 2014, 2017) to formalise the use of
Augmented Reality (AR) technology within archaeology. Augmented reality is a form of mixed reality
that takes digital data and blends it with the real world. Augmented reality “ . . . allows a user to work in
a real world environment while visually receiving additional computer-generated or modelled informa-
tion to support the task at hand” (Schnabel, Wang, Seichter, & Kvan, 2007, p. 4). George Papagiannakis
and colleagues produced one of the best-known cultural heritage AR applications, centred on the site of
Pompeii (Papagiannakis et al., 2004, 2005; Papagiannakis & Magnenat-Thalmann, 2007). Using a special
see-through video headset along with dynamic modelling of the real and virtual world, Papagiannakis
and his team were able to insert virtual characters into various real buildings within Pompeii and guide
the visitors through a narrative as they walked through the site. A recent example of the use of AR in
archaeology was a ‘Pokémon Go’ meet up in the city of Chester orchestrated by Big Heritage and Nian-
tic Labs in 2017. Users of the Pokémon Go app were guided around the historical sites in the hope of
hunting virtual creatures (Pokémon) while learning more about the history of the city (Zeroghan, 2017).
Both of these examples overlay digital data on physical spaces, but in the context of our discussions of
Hamilakis’ work, it is worth remembering when using AR:
[T]he introduction of the virtual elements should be kept to a minimum and, in contrast, the land-
scape itself should provide the bulk of the experience – the way in which steep slopes tire you; the
shelter gained from standing in the lee of a hill; the smells of the flowers; the sound of the birdsong;
and the views and perspectives that open and close as you explore the landscape.
(Eve, 2017, para. 3.3)
These are powerful modalities to explore. Yet they depend on proprietary software and hardware that are clunky to handle and awkward in the field. The embodied GIS and our entangled digital kinaesthetic
sense can (and should) involve haptic full-body engagements (TeslaSuit, 2018), olfactory stimulation
(Eve, 2018b), gustatory stimulation (Iwata et al., 2004) or even direct electrical stimulation of nerve cells
across the body (Delazio et al., 2018). However, leaving aside the low-hanging fruit of the visual, one of the easiest and most accessible ways at present of evoking this digital kinaesthesia, and of exploring and
466 Stuart Eve and Shawn Graham
Figure 24.2 Citation analysis using Summers’ Etudier package, from a Google Scholar Search for ‘data + soni-
fication’. Colours indicate works that have similar patterns of citation; size indicates central works that tie scholarship together. This is not the same thing as ‘most cited’. On this reading, one should begin with Madhyastha and
Reed (1995); Wilson and Lodha (1996); Zhao, Plaisant, Shneiderman, and Duraiswami (2004); De Campo
(2007); Zhao, Plaisant, Shneiderman, and Lazar (2008). A colour version of this figure can be found in the
plates section.
presenting data is through the creative manipulation of aural data points across and within spaces as we
demonstrate in our method and case studies.
As long ago as 1994, John Krygier was arguing for the use of sound and ‘auralisation’ to represent
geographic data, pointing to even earlier work in the 1950s (Pollack & Ficks, 1954) on the use of sound
to represent multivariate data. A citation network analysis (Figure 24.2) shows that Krygier’s work has not penetrated into archaeology to any great degree, and so we re-introduce ideas of sonification into
this space. In particular, he points to the use of sound coupled with animation, to indicate uncertainty:
Maps tend to be ‘totalising’ creatures: variations in uncertainty and quality are smoothed over to
create an orderly, homogeneous graphic. On one hand, this is why maps are so useful, and it is
obvious that maps enable us to deal with our uncertain and messy world by making it look more
certain and tidy. Yet it seems important that some sense of the uncertainty or quality of the repre-
sented data be available . . . The purpose of maps, remember, is to impose order, not to accurately
represent chaos. Further, there is only so much visual headroom on a display: using visual variables
to display uncertainty may have the effect of limiting the display of other data variables.
(Krygier, 1994, p. 161)
This concern with uncertainty fits well with the ‘fuzziness’ that a digital synaesthesia would promote, and the kinds of ‘deformance’ or ‘brokenness’ that digital humanities theoreticians like Mark Sample (2012) argue for. We turn then to sonification as a method, and to case studies of simple ways in which some of this brokenness can be returned to our archaeological geographies.
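The intuition behind Figure 24.2 – that the works which tie scholarship together are not simply the most cited – can be sketched as a betweenness calculation over a citation graph. The five-node graph below is an invented illustration (Summers’ Etudier package builds the real graph from Google Scholar results); this is a minimal, brute-force sketch rather than a production algorithm:

```python
# A toy betweenness calculation over a hypothetical citation network.
from collections import deque
from itertools import permutations

EDGES = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]  # invented citations
NODES = sorted({n for edge in EDGES for n in edge})

def neighbours(node):
    return [b for a, b in EDGES if a == node] + [a for a, b in EDGES if b == node]

def shortest_paths(src, dst):
    """Collect every shortest path from src to dst by breadth-first search."""
    found, best = [], None
    queue = deque([[src]])
    while queue:
        path = queue.popleft()
        if best is not None and len(path) > best:
            break                         # only longer paths remain in the queue
        if path[-1] == dst:
            best = len(path)
            found.append(path)
            continue
        for nxt in neighbours(path[-1]):
            if nxt not in path:
                queue.append(path + [nxt])
    return found

def betweenness(node):
    """Fraction of shortest paths between other node pairs passing through node."""
    score = 0.0
    for s, t in permutations([n for n in NODES if n != node], 2):
        paths = shortest_paths(s, t)
        score += sum(node in p[1:-1] for p in paths) / len(paths)
    return score

central = max(NODES, key=betweenness)     # the work to 'begin with'
```

On this toy graph the central node is ‘C’, because paths between the outlying works run through it; betweenness, not raw citation count, is what picks out the ‘central works’ of the figure.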
Method
There is a deep history and literature on archaeoacoustics and soundscapes that tries to capture the sound
of a place as it was (see, for instance, Wall, 2018, on the creation of St. Paul’s, or Veitch, 2017, on ancient Ostia). But we are attempting to sonify spatial datasets – to visualise them with sound, in
situ. This is not so much a recreation of the sounds of the past, but instead a way of exploring our data
about the past. For example, where we might look at a graphical representation of a scatter of flints over a field – the visual device distancing us from the abstract notion of flint counts – we can instead move through that field wearing headphones, retrieving our location from GPS, and hear the
changes in the data, hear the hotspots (and perhaps more importantly notice the absences of sound) as
we walk. The resulting aural experience is a literal ‘deformance’ that makes us hear modern layers of the
past in a new way.
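That walking-and-listening loop can be sketched in a few lines of Python. Everything here is an invented illustration – the grid of flint counts, the cell size and origin, and the two-octave pitch range are all assumptions, and a real implementation would sample a georeferenced raster against projected GPS fixes:

```python
# Sketch: snap each GPS fix to a grid cell and map its flint count to a pitch.
CELL_SIZE = 5.0                 # metres per raster cell (hypothetical)
ORIGIN_E, ORIGIN_N = 0.0, 0.0   # easting/northing of the grid corner (hypothetical)

# Hypothetical flint counts per 5 m cell (row 0 = southern edge).
FLINT_COUNTS = [
    [0, 0, 2, 8, 3],
    [0, 1, 5, 12, 4],
    [0, 0, 1, 6, 2],
]

def pitch_for_position(easting, northing):
    """Return a MIDI note for a position, or None (silence) off-grid
    or where no flint was recorded."""
    col = int((easting - ORIGIN_E) // CELL_SIZE)
    row = int((northing - ORIGIN_N) // CELL_SIZE)
    if not (0 <= row < len(FLINT_COUNTS) and 0 <= col < len(FLINT_COUNTS[0])):
        return None
    count = FLINT_COUNTS[row][col]
    if count == 0:
        return None                       # the telling absence of sound
    base, span = 48, 24                   # map counts into two octaves, C3 up
    top = max(max(row_) for row_ in FLINT_COUNTS)
    return base + round(span * count / top)

# A short 'walk' eastwards across the field: one GPS fix per cell.
walk = [pitch_for_position(e, 7.5) for e in (2.5, 7.5, 12.5, 17.5, 22.5)]
```

Zero-count cells return `None` rather than a low note, so the absences in the data are heard as silence, which is precisely the point made above.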
As Graham (2016) outlines:
Sonification is the practice of mapping aspects of the data to produce sound signals. In general, a
technique can be called ‘sonification’ if it meets certain conditions. These include reproducibil-
ity (the same data can be transformed the same ways by other researchers and produce the same
results) and what might be called intelligibility – that the ‘objective’ elements of the original data
are reflected systematically in the resulting sound.
Last and Usyskin (2015) have undertaken a number of experiments to test how humans react to soni-
fication of datasets and what kinds of tasks this method can achieve. Their results show that even listeners
with no formal training in music can perceive useful distinctions in the data. These distinctions included
common data visualisation tasks such as classification and clustering.
Because music is sequential and has a duration, and because time-series data is likewise sequential and evolves over time, Last and Usyskin argue that such data is particularly well suited to sonification (2015, p. 424). In many approaches to sonification, ‘parameter mapping’ is used to match a certain data series to various auditory
dimensions (in our flint example, the amount of flint present in a location might be matched to the pitch
of the sound – the higher the pitch the greater the concentration of flint). Rasterised GIS datasets, by their
very definition, are continuous surfaces of data, and every point of space has a value. Therefore, when
we move through the space represented by that raster, physically walking over the field of flint scatters, it
can be considered similar to panning the mouse pointer over the raster of flint concentrations. The data
is continuous and so sonification of that data is quite appropriate. We journey through the space, at the
same time as journeying through the soundscape created by and from that data.
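A minimal sketch of such a parameter mapping, following the flint example (higher pitch for greater concentration). The transect values and the MIDI pitch range are invented; the mapping itself is deterministic, which is what makes the result reproducible in the sense quoted from Graham (2016) above:

```python
# Sketch: linearly scale a data series onto a pitch range (MIDI note numbers).
def map_to_pitches(values, lo=48, hi=72):
    """Deterministically map each value onto [lo, hi] as MIDI notes.
    The same data always yields the same notes - the reproducibility
    that makes this sonification rather than mere music."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                      # a flat series sounds as a single drone
        return [lo] * len(values)
    scale = (hi - lo) / (vmax - vmin)
    return [round(lo + (v - vmin) * scale) for v in values]

transect = [0, 2, 5, 12, 7, 1]            # hypothetical counts along a walk
notes = map_to_pitches(transect)
```

Walking the transect then amounts to playing `notes` in sequence; panning a mouse across the raster would sample the same surface and, by the same mapping, produce the same sounds.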
There is also an effect whereby our expectations of what the sound ‘is’ or ‘represents’ cause us literally to
hear sounds that are not there. A typical example involves flattening all of the instruments and voices in
a pop song into a MIDI file, and then playing that MIDI file as a piano solo. If one is already familiar with
the song, one can hear the ‘voice’ singing. If not, the sound is unintelligible noise. This effect is sometimes
called an ‘auditory hallucination’ (cf. Koebler, 2015). This example shows how, in any representation of
data we can hear/see what is not, strictly speaking, there. We fill the holes with our own expectations.
The sonification of the flint example is subject to the same spatial resolution issues as a more traditional
visualisation: the resulting soundscape will change depending on whether we use a 5 m pixel resolution (picking up the smaller variations in the data) or a 25 m pixel resolution (playing only the broader trends). The same is true of
any visualisation; it is perhaps just more apparent when we consider sound. Thus, as with all methods of
visualisation, we need to be critically self-aware, and foreground that reflection as part of our analysis.
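The resolution effect described above can be made concrete: block-averaging a hypothetical sequence of 5 m cell values into 25 m cells removes exactly the small variations that the finer soundscape would have played. A minimal one-dimensional sketch:

```python
# Sketch: coarsen a 1D 'raster' transect by block-averaging its cell values.
def resample(values, factor):
    """Block-average a series of cell values, coarsening the raster by
    the given factor (e.g. 5 m cells -> 25 m cells when factor=5)."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values), factor)
    ]

fine = [0, 8, 1, 9, 2, 3, 3, 3, 3, 3]     # 5 m cells: jumpy, then steady
coarse = resample(fine, 5)                 # 25 m cells: the jumps average away
```

Sonified, the `fine` series leaps about before settling; the `coarse` series plays only two level tones, so the listener would never know the first 25 m were volatile.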
Case studies
The areas of the cemetery that are visually empty are suddenly transformed into areas containing a
vast number of voices of the dead. There is a common belief that it is bad luck or disrespectful to
walk over somebody’s grave, therefore the ‘empty’ paths that were previously seen as ‘safe’ places
to walk, suddenly become areas that are superstitiously liminal.
(Eve, 2017, para. 4.2)
The experiment also raised issues about power and control in the cemetery and how that is reflected by
the placement of the graves. In contrast to the cacophony produced by the pauper pits, when you move
closer to a larger, expensive grave monument the cacophony is reduced to just one or two voices – as
the expensive graves have been placed to stand apart from the other graves. The voices of the rich and
powerful are heard as clearly in death as they were in life. We would argue that this social stratification
and also the affective nature of using sounds and voices to represent the pauper graves would not be so
obvious if we were looking at a simple visualisation on a screen or printed on a map.
Conclusion
Within this chapter, we have shown that the visualisation of spatial data is not limited to dots on a map, or hachures on an archaeological plan – instead we have demonstrated that opening archaeological data to other sensory modalities might expand our understandings of the past in new
ways. The traditional methods of visualising our data have much merit and should not be discarded: they are familiar, and because of that familiarity they are easy to understand and often easy to produce using modern software. But we would argue that we are now at the point in the development of spatial
computing where we can explore our data in parallel using different interfaces and different sensory
modalities.
We have used examples of the sonification of data as one way into accessing these different modalities.
Whilst the software and hardware to sonify data are still not mainstream, they are presently developed enough to enable researchers to begin to use them (much more so than, for instance, olfactory or gustatory interfaces). Not all data is suitable for sonification, in the same way that not all data is suitable for visualisation
in a scatter chart or a raster surface. Nevertheless, we have shown that sonification can become another vector for knowledge mobilisation. Like a stylised visual map, it is not a passive representation of
the archaeological data, but a performance of the data that gestures beyond itself, to conjure up other
associations, meanings, and emotions.
As available technology and methods progress, we are going to be able to move beyond the simple
map or distribution chart and begin to experience our data with our bodies, with multiple senses. We
are going to be able to experience our datasets in situ, as we walk through an archaeological site or land-
scape – and we are not just going to see the patterns change; we are going to hear, feel, taste and smell them. Spatial data visualisation is no longer visualisation at all; it is an embodied experience that uses
multiple sensory modalities to represent the same underlying datasets, each modality telling its own story
and revealing its own unique patterns.
References
Acevedo, D., Vote, E., Laidlaw, D. H., & Joukowsky, M. S. (2001). Archaeological data visualization in VR: Analysis of
lamp finds at the Great Temple of Petra, a case study. Proceedings of the conference on Visualization’01 (pp. 493–496).
IEEE Computer Society.
Allen, P., Feiner, S., Troccoli, A., Benko, H., Ishak, E., & Smith, B. (2004). Seeing into the past: Creating a 3D modeling
pipeline for archaeological visualization. 3D Data Processing, Visualization and Transmission, 2004. 3DPVT 2004.
Proceedings. 2nd International Symposium on (pp. 751–758). IEEE.
Andrienko, N., & Andrienko, G. (2006). Exploratory analysis of spatial and temporal data: A systematic approach. New
York: Springer Science & Business Media.
Brewer, I., MacEachren, A. M., Abdo, H., Gundrum, J., & Otto, G. (2000). Collaborative geographic visualization: Enabling
shared understanding of environmental processes. Information Visualization, 2000. InfoVis 2000. IEEE Symposium
on (pp. 137–141). IEEE.
Caraher, W. (2015). Slow archaeology. North Dakota Quarterly, 80(2), 43–52.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Cooper, A., & Green, C. (2016). Embracing the complexities of “Big Data” in archaeology: The case of the
English landscape and identities project. Journal of Archaeological Method and Theory, 23(1), 271–304. https://doi.
org/10.1007/s10816-015-9240-4
Crampton, J. W. (2002). Interactivity types in geographic visualization. Cartography and Geographic Information Science,
29(2), 85–98.
Croxall, B., & Warnick, Q. (2017). Failure | digital pedagogy in the humanities | MLA commons. Retrieved April 29,
2018, from https://digitalpedagogy.mla.hcommons.org/keywords/failure/
De Campo, A. (2007). Toward a data sonification design space map. Atlanta: Georgia Institute of Technology.
Delazio, A., Nakagaki, K., Klatzky, R. L., Hudson, S. E., Lehman, J. F., & Sample, A. P. (2018). Force jacket:
Pneumatically-actuated jacket for embodied haptic experiences. In Proceedings of the 2018 CHI Confer-
ence on Human Factors in Computing Systems (pp. 320:1–320:12). New York, NY, USA: ACM. https://doi.
org/10.1145/3173574.3173894
Di Franco, P. D. G., Camporesi, C., Galeazzi, F., & Kallmann, M. (2015). 3D printing and immersive visualization for
improved perception of ancient artifacts. Presence, 24(3), 243–264.
Ebert, D. (2004). Applications of archaeological GIS. Canadian Journal of Archaeology/Journal Canadien d'Archéologie,
319–341.
Engadget. (2018). The world now has a smart toaster. Retrieved April 26, 2018, from www.engadget.com/2017/01/04/
griffin-connects-your-toast-to-your-phone/
Eve, S. (2012). Augmenting phenomenology: Using augmented reality to aid archaeological phenomenology in the
landscape. Journal of Archaeological Method and Theory, 19(4), 582–600. https://doi.org/10.1007/s10816-012-9142-7
Eve, S. (2014). Dead men’s eyes: Embodied GIS, mixed reality and landscape archaeology. BAR British Series 600. Oxford:
Archaeopress.
Eve, S. (2017). The embodied GIS: Using mixed reality to explore multi-sensory archaeological landscapes. Internet
Archaeology, (44). https://doi.org/10.11141/ia.44.3
Eve, S. (2018a). Losing our senses, an exploration of 3D object scanning. Open Archaeology, 4(1), 114–122. https://
doi.org/10.1515/opar-2018-0007
Eve, S. (2018b). A dead man’s nose: Using smell to explore the Battle of Waterloo. In D. Medway, K. McLean, C. Perkins, & G. Warnaby (Eds.), Designing with smell: The practices, techniques and challenges of olfactory creation (In
Press). London: Routledge.
Eve, S., Hoffman, K., Morgan, C., Pantos, A., & Kinchin-Smith, S. (2014). Voices recognition paradata document. Retrieved
February 3, 2016, from www.heritagejam.org/s/VoicesRecognitionParadata.pdf
Favro, D., & Johanson, C. (2010). Death in motion: Funeral processions in the Roman forum. Journal of the Society
of Architectural Historians, 69(1), 12–37.
Foo, B. (2018). music-lab-scripts: Scripts for generating music. Python. Retrieved from https://github.com/beefoo/music-
lab-scripts (Original work published 2014).
Forte, M., Dell’Unto, N., Issavi, J., Onsurez, L., & Lercari, N. (2012). 3D archaeology at Çatalhöyük. International
Journal of Heritage in the Digital Era, 1(3), 351–378.
Frieman, C., & Gillings, M. (2007). Seeing is perceiving? World Archaeology, 39(1), 4. https://doi.
org/10.1080/00438240601133816
Gillings, M., Hacıgüzeller, P., & Lock, G. (Eds.). (2019a). Re-mapping archaeology: Critical perspectives, alternative mappings.
New York, NY: Routledge.
Gillings, M., Hacıgüzeller, P., & Lock, G. (2019b). On maps and mapping. In M. Gillings, P. Hacıgüzeller, & G. Lock
(Eds.), Re-mapping archaeology: Critical perspectives, alternative mappings (pp. 1–16). New York, NY: Routledge.
Gosden, C. (2001). Making sense: Archaeology and aesthetics. World Archaeology, 33(2), 163–167.
Graham, S. (2015). Listening to Watling Street. Retrieved from www.heritagejam.org/2015exhibitionentries/2015/9/18/listening-to-watling-street-dr-shawn-graham
Graham, S. (2016). The sound of data (a gentle introduction to sonification for historians). Programming Historian.
Retrieved from https://programminghistorian.org/lessons/sonification
Graham, S., & Eve, S. (2013). Historical friction. Retrieved February 3, 2016, from https://github.com/shawngraham/
historicalfriction
Gupta, N., & Devillers, R. (2017). Geographic visualization in archaeology. Journal of Archaeological Method and
Theory, 24, 852–885.
Hamilakis, Y. (2014). Archaeology and the senses: Human experience, memory, and affect. Cambridge: Cambridge Uni-
versity Press.
Hodder, I. (2012). Entangled: An archaeology of the relationships between humans and things. New Jersey: John Wiley &
Sons.
Howard, D., & MacEachren, A. M. (1996). Interface design for geographic visualization: Tools for representing reli-
ability. Cartography and Geographic Information Systems, 23(2), 59–77.
Ingold, T. (2015). The life of lines. Abingdon, UK: Routledge.
Iwata, H., Yano, H., Uemura, T., & Moriya, T. (2004, March). Food simulator: A haptic interface for biting. In IEEE
virtual reality 2004 (pp. 51–57). IEEE.
Koebler, J. (2015, December 18). The strange acoustic phenomenon behind these wacked-out versions of pop songs.
Retrieved April 29, 2018, from https://motherboard.vice.com/en_us/article/kb7agw/the-strange-acoustic-
phenomenon-behind-these-wacked-out-versions-of-pop-songs
Kopetz, H. (2011). Internet of things. In Real-time systems (pp. 307–323). New York: Springer.
Kraak, M.-J., & Ormeling, F. J. (2013). Cartography: Visualization of spatial data. Abingdon, UK: Routledge.
Krygier, J. B. (1994). Chapter 8: Sound and geographic visualization. In A. M. MacEachren & D. R. F. Taylor
(Eds.), Modern cartography series (Vol. 2, pp. 149–166). Academic Press. https://doi.org/10.1016/B978-0-
08-042415-6.50015-6
Laino, F. (2014). 2014 entries. Retrieved February 3, 2016, from www.heritagejam.org/2014-entries/
Last, M., & Usyskin, A. (2015). Listen to the sound of data. In Multimedia data mining and analytics (pp. 419–446).
New York: Springer.
MacEachren, A. M., Boscoe, F. P., Haug, D., & Pickle, L. W. (1998). Geographic visualization: Designing manipulable
maps for exploring temporally varying georeferenced statistics. In Information Visualization, 1998. Proceedings. IEEE
Symposium on (pp. 87–94). IEEE.
Madhyastha, T. M., & Reed, D. A. (1995). Data sonification: Do you see what I hear? IEEE Software, 12(2), 45–56.
Marsillo, C. (2018). Ottawa love stories. Retrieved from https://ottlovestories.wordpress.com/
McCoy, M. D. (2017). Geospatial Big Data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94.
Mlekuz, D. (2004, November 11). Listening to landscapes: Modelling past soundscapes in GIS. Retrieved
November 16, 2010, from http://intarch.ac.uk/journal/issue16/mlekuz_index.html
Moser, S. (2012). Early artifact illustration and the birth of the archaeological image. Archaeological Theory Today,
292–322.
Murdoch, M., & Davies, J. (2017). Spiritual and affective responses to a physical church and corresponding virtual
model. Cyberpsychology, Behavior, and Social Networking, 20(11), 702–708. https://doi.org/10.1089/cyber.2017.
0249
Neumüller, M., Reichinger, A., Rist, F., & Kern, C. (2014). 3D printing for cultural heritage: Preservation, accessibil-
ity, research and education. In 3D research challenges in cultural heritage (pp. 119–134). Berlin, Heidelberg: Springer.
https://doi.org/10.1007/978-3-662-44630-0_9
Papagiannakis, G., & Magnenat-Thalmann, N. (2007). Mobile augmented heritage: Enabling human life in ancient
Pompeii. International Journal of Architectural Computing, 5(2), 396–415.
Papagiannakis, G., Schertenleib, S., O’Kennedy, B., Arevalo-Poizat, M., Magnenat-Thalmann, N., Stoddart, A., &
Thalmann, D. (2005). Mixing virtual and real scenes in the site of ancient Pompeii. Computer Animation and
Virtual Worlds, 16(1), 11–24.
Papagiannakis, G., Schertenleib, S., Ponder, M., Arévalo, M., Magnenat-Thalmann, N., & Thalmann, D. (2004).
Real-time virtual humans in AR sites. Proceedings of IEE Visual Media Production 2004 (pp. 273–276). Stevenage,
Hertfordshire: IEE.
Pollack, I., & Ficks, L. (1954). Information of elementary multidimensional auditory displays. The Journal of the
Acoustical Society of America, 26(2), 155–158.
Poole, D. (1997). Vision, race, and modernity: A visual economy of the Andean image world. Princeton, NJ: Princeton
University Press.
Primeau, K. E., & Witt, D. E. (2017). Soundscapes in the past: Investigating sound at the landscape level. Journal of
Archaeological Science: Reports, 19, 875–885.
Sample, M. (2012). Notes towards a deformed humanities. Retrieved April 29, 2018, from www.samplereality.
com/2012/05/02/notes-towards-a-deformed-humanities/
Schnabel, M. A., Wang, X., Seichter, H., & Kvan, T. (2007). From virtuality to reality and back. In Proceedings of the
IASDR 2007 conference. Hong Kong: The Hong Kong Polytechnic University.
Sekula, A. (1981). The traffic in photographs. Art Journal, 41(1), 15–25. https://doi.org/10.1080/00043249.1981
.10792441
Shekhar, S., Feiner, S. K., & Aref, W. G. (2015). Spatial computing. Communications of the ACM, 59(1), 72–81. https://doi.
org/10.1145/2756547
Slocum, T. A., Blok, C., Jiang, B., Koussoulakou, A., Montello, D. R., Fuhrmann, S., & Hedley, N. R. (2001). Cogni-
tive and usability issues in geovisualization. Cartography and Geographic Information Science, 28(1), 61–75. https://
doi.org/10.1559/152304001782173998
Slocum, T. A., McMaster, R. B., Kessler, F. C., & Howard, H. H. (2008). Thematic cartography and
geographic visualization. New Jersey: Prentice Hall.
Sterry, M. (2018). Multivariate and spatial visualisation of archaeological assemblages. Internet Archaeology, (50).
https://doi.org/10.11141/ia.50.15
Summers, E. (2016). ICI: Edit Wikipedia pages near you. JavaScript. Retrieved from https://github.com/edsu/ici
(Original work published 2013).
Teslasuit. (2018). Teslasuit: Full body haptic suit. Retrieved April 27, 2018, from https://teslasuit.io/
Tyner, J. A. (2014). Principles of map design. New York: Guilford Publications.
Van Dam, A., Laidlaw, D. H., & Simpson, R. M. (2002). Experiments in immersive virtual reality for scientific visu-
alization. Computers & Graphics, 26(4), 535–555.
Veitch, J. (2017). Soundscape of the street: Architectural acoustics in Ostia. In E. Betts (Ed.), Senses of the empire:
Multisensory approaches to Roman culture. Abingdon-On-Thames: Taylor & Francis.
Vote, E., Acevedo, F. D., Laidlaw, D. H., & Joukowsky, M. S. (2002). Discovering petra: Archaeological analysis in
VR. IEEE Computer Graphics and Applications, 22(5), 38–50.
Wall, J. (2018). Recovering lost acoustic spaces: St. Paul’s cathedral and Paul’s churchyard in 1622. Retrieved April 27, 2018,
from www.digitalstudies.org/articles/10.16995/dscn.58/
Walters, M. (2018). Mark Walters on Sketchfab. Retrieved April 27, 2018, from https://sketchfab.com/markwalters
Wheatley, D., & Gillings, M. (2000). Vision, perception and GIS: Developing enriched approaches to the study of
archaeological visibility. In G. Lock (Ed.), Beyond the map (pp. 1–27). Amsterdam: IOS Press.
Wilson, C. M., & Lodha, S. K. (1996). Listen: A data sonification toolkit. Atlanta: Georgia Institute of Technology.
Xia, F., Yang, L. T., Wang, L., & Vinel, A. (2012). Internet of things. International Journal of Communication Systems,
25(9), 1101.
Zeroghan. (2017, July 28). Interview with Dean Paton, Big Heritage: Pokémon GO at the Chester Heritage Festival. Retrieved April
27, 2018, from https://pokemongohub.net/post/interview/interview-dean-paton-big-heritage-pokemon-go-
chester-heritage-festival/
Zhao, H., Plaisant, C., Shneiderman, B., & Duraiswami, R. (2004). Sonification of geo-referenced data for auditory informa-
tion seeking: Design principle and pilot study. Atlanta: ICAD.
Zhao, H., Plaisant, C., Shneiderman, B., & Lazar, J. (2008). Data sonification for users with visual impairment: A case
study with georeferenced data. ACM Transactions on Computer-Human Interaction (TOCHI), 15(1), 4.
Index
Note: page numbers in italics indicate figures and page numbers in bold indicate tables on the corresponding
pages.
field data processing 30–33, 31–33
first-order effect 158
first-order spatial intensity 61, 62
Fish, P. R. 221
Fish, S. K. 221
Fisher, P. F. 170, 320, 322
Fletcher, M. 213
Flory, P. J. 77
Fort, J. 130, 131, 132, 132, 142
Fourier series 380, 381, 383
Fox, C. 7, 85
Frachetti, M. 408
Franklin, J. 160
F-ratio 219
Frisch, H. L. 77
F-statistic 140, 151
F-test 223
Fulminante, F. 282
functional uncertainty 185
Fusco, J. 178
fuzzy sets: case studies on 175–188; conclusions on 188–189; FGISSAR (Fuzzy Geographic Information System for Spatial Analysis in aRchaeology) 175–188, 176–185, 187–188; introduction to 169–171; method of 172–174, 172–175

Gabaix, X. 78, 79, 80
Gabriel graphs 280–281, 281, 287–289, 288, 289, 291
Gadsby, D. 36
Gaffney, V. 215, 334
Gajewski, K. 163
Gangodagamage, C. 138
Ganskopp, D. 352
Gardner, R. H. 118
Garmy, P. 352
Garrad, C. 32
Gattiglia, G. 432, 441
Geissenklösterle see Anatomically Modern Humans (AMH)
generalised least squares 141
Geographical Information Science (GISci) 410
geographically weighted regression (GWR) 161–162
geographic information systems (GIS) 1, 11, 18, 27, 156, 231, 409; 3D (see three-dimensional models); agent-based modelling (ABM) and 258; archaeological vision and 461–462; embodied 465; GRASS 80, 83, 89–90; local spatial analysis and 157–158; purposive selection and 50; in regional environmental relationships analysis 215–217, 221, 222; sampling and 50; temporal 410, 411–412, 419–421, 421; visibility analysis based on (see visibility analysis, GIS-based)
geophysical data: case study on 396–398, 397–405; conclusions on 398, 406; creation of maps from 383; electrical resistivity tomography (ERT) 383–384, 384; electromagnetic induction measurements 382, 392; ground penetrating radar 385–389, 386–389; interpretation of 396; introduction to 376; magnetic potential field measurements 379–382, 381; main convolution 377–379, 380; method of using 377–396; micro-gravity measurements 390–391, 391; multi-sensor magnetic measurements 382–383; preprocessing techniques 377; processing of shallow depth 377–383, 378, 381; seismic measurements 392–393, 393–395
georeferencing 365
Georgia Coast model 236–242, 238–241
geostatistics: Ballyhenry rath, County Antrim, Northern Ireland 103–106, 104–112; basis of 95–96; case studies in 103–114; Coins of Allectus 106–114, 109–115; conclusions on 114–116; conditional simulation in 102–103; introduction to 93–94; method in 94–103; scale and spatial structure in 94–95; simple kriging (SK) in 100–102; variogram in 96–99, 96–100
Getis, A. 160
Getis’ local Gi* 161
Gietl, R. 353
Gilbert, N. 247
Gillings, M. 20, 93, 333, 462
GISSAR (Geographic Information System for Spatial Analysis in aRchaeology) 175
GitHub 34, 36
Givoni, B. 338
Gkiasta, M. 162
Global Navigation Satellite System positioning 30
Global Positioning Systems (GPS) 21, 94; field data processing 31–32, 31–32; Real-Time Kinematic (RTK) 454, 454–455
global spatial network analysis 278–290, 279
GlobalXplorer Project 370–371
Goldman, R. 338
Gonçalves, C. 135, 142
Google Scholar 462, 463
Gosden, C. 408, 460
Graham, S. 467, 469
Graph theory 323
GRASS GIS 80, 83, 89–90
Green, C. 20, 23, 419–420, 421
Grewe, K. 348
grey literature 23
Grimm, V. 247, 258, 261, 262
Groenhuijzen, M. R. 353
ground penetrating radar (GPR) 385–389, 386–389
Guattari, F. 10
Guichard, F. 249
Güimil-Fariña, A. 352
Guiot, J. 254
Gupta, N. 461

Hacıgüzeller, P. 462
Hage, P. 281
Hagen, J. 348
Hägerstrand, T. 414
Haggett, P. 9
Index 479
simple kriging (SK) 100–102
simple random sampling 44, 44
Simultaneous Autoregressive Model (SAR) 141–142
Siolas, A. 171
site catchment analysis (SCA) 9, 213–214
slow archaeology 461
Smith, J. 36
Smithwick, E. A. H. 118
Society for American Archaeology 36
Solberg, R. 367
sonification: archaeological vision and 461–467, 463, 466; case studies on 468–470; conclusions on 470–471; introduction to 460–467, 463, 466; method for 467–468
Soule, R. 338
soundscape 467–468
Southwestern Archaeological Research Group (SARG) 213
space: archaeology and 1–3; concepts of 12; representational 3, 12–13
space and time: case studies on 416–421; concepts of spatiotemporality and 408–409; conceptual computer modeling of 410–411, 411; conclusions on 421–424, 423; dynamic mapping of 417, 417–418; event-based modeling 415, 416; implementation of models of 411–412; introduction to 408–411, 411; object lifespan approach to 418–419, 419, 420; spatiotemporal model of 412–415, 413, 415–416; temporal-GIS and 410; towards an archaeological temporal GIS of 419–421, 421
space syntax methodology: basic principles of 298–300, 299; case study on 305–308, 306; conclusions on 308–309; introduction to 296–298, 297; nodes and edges in 301–303, 302; qualities and indicators in 303–305
space-time cube 412–414, 413
space-time path (STP) 414
spatial analysis 8–11, 12–13
Spatial Analysis in Archaeology 9, 17
Spatial Archaeology 9
spatial archaeology, concept of 3–8
spatial binning method 434–435
spatial expansion method 162
spatial intensity 60–63, 62
spatial interactions 63–67, 64, 65, 66
spatial material culture networks 276–277
spatial narrative 8–11
Spatial Regression Models for the Social Sciences 151
spatial sampling see sampling
spatial structure in archaeology see geostatistics
spatial thinking 12–13
Spearman’s rank correlation coefficient 136
Squier, E. G. 6, 8
Srivastava, P. K. 171
Stančič, Z. 215, 334
Stanley, E. H. 118
Stark, M. T. 219
state historic preservation offices (SHPOs) 22–23
stationarity 155–156
Steele, J. 162
Stephan, E. 161
Sterry, M. 462
Steward, J. H. 7, 56, 212
Stockmayer, W. H. 77
Stolar, J. 156
stratified sampling 44, 45, 52, 52–53
strontium 193
Strupler, N. 34
Stucky, D. 334
Study of Archaeology, A 8
Stukeley, W. 5
Styring, A. 161
Suarez, R. 254
summed probability distribution of radiocarbon dates (SPDRD) 162–165, 164
Sweet, R. 5
systematic sampling 44, 44–45
Systems Theory 8
Tabbagh, A. 376, 392
Taylor, J. S. 418
Taylor, W. 8
t-conorms 173
technological determinism 13
Teltser, P. 20
Temporal-GIS 410, 411–412, 419–421, 421
Terradas, X. 114
Terrell, J. E. 408
Thematic Mapper (TM) multispectral sensors 360
Theodorakopoulou, K. 94
Thiesson, J. 392
thin plate spline (TPS) interpolation 122–123, 123
Thomas, D. H. 56, 237
Thomas, J. 408
Thomsen, C. 6
three-dimensional models: brief history of 444–445; case studies on 451–455, 452–454; Çatalhöyük, Turkey 451–455, 452–454; conclusions on 455–456; introduction to 444–449, 446–448; Kämpinge, Sweden 454, 454–455; method for 449–450, 451; visualized in archaeological field practice 445–449, 446–448
Tiffany, J. A. 214
tight coupling of ABM and GIS 258
Tilley, C. 11, 408
time see space and time
Time in Geographic Information Systems 410
TimeMap 417, 417–418
time-slicing 414–415, 415
t-norms 173
Tobler, W. 336, 349
triangulated irregular network (TIN) 121, 314
triangulation networks 121, 282, 314
Trier, D. 367