
Archaeological Spatial Analysis

Effective spatial analysis is an essential element of archaeological research; this book is a unique guide
to choosing the appropriate technique, applying it correctly and understanding its implications both
theoretically and practically.
Focusing upon the key techniques used in archaeological spatial analysis, this book provides the authoritative, yet accessible, methodological guide to the subject which has thus far been missing from the corpus. Each chapter opens with a richly referenced introduction to the particular technique, followed by a detailed description of the methodology, an archaeological case study illustrating the application of the technique, and conclusions that point to the implications and potential of the technique within archaeology.
The book is designed to function as the main textbook for archaeological spatial analysis courses at undergraduate and postgraduate level, while its user-friendly structure also makes it suitable for self-learning by archaeology students as well as researchers and professionals.

Mark Gillings is a Professor of Archaeology in the Department of Archaeology & Anthropology at Bournemouth University. His research interests concentrate upon the productive spaces that emerge through
the integrated study of landscape, archaeological theory and digital archaeology, with a particular focus upon
the potentials of all things geospatial and virtual. Much of his recent research has centred upon the prehistoric
landscapes of south-western Britain, and the relationships that animated the complex, multi-scalar motleys of
monumental structures and traces of everyday dwelling that characterise this region.

Piraye Hacıgüzeller is a senior postdoctoral researcher at the Ghent Centre for Digital Humanities and the
Archaeology Department of Ghent University. Her research interests are the theory and practice of digital
archaeology and, more generally, digital humanities, particularly geospatial data visualisation, management and analysis. She is the co-editor of a recent book on archaeological mapping, Re-mapping Archaeology: Critical Perspectives, Alternative Mappings (Routledge, 2018).

Gary Lock is Emeritus Professor of Archaeology at the University of Oxford where he has spent 35 years
teaching and researching several areas of archaeology. One of his specialisms is the British Iron Age, especially
hillforts, and he was Co-PI of the Atlas of Hillforts of Britain and Ireland. His other main area of interest is
computer applications in archaeology, especially GIS and spatial archaeology, in which he has published several
books. He has recently retired as Chair of the Computer Applications in Archaeology conference.
Archaeological Spatial Analysis
A Methodological Guide

Edited by Mark Gillings, Piraye Hacıgüzeller and Gary Lock
First published 2020
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
52 Vanderbilt Avenue, New York, NY 10017
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2020 selection and editorial matter, Mark Gillings, Piraye Hacıgüzeller and Gary Lock;
individual chapters, the contributors
The right of Mark Gillings, Piraye Hacıgüzeller and Gary Lock to be identified as the authors
of the editorial material, and of the authors for their individual chapters, has been asserted in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any
form or by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying and recording, or in any information storage or retrieval system,
without permission in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
British Library Cataloguing-­in-­Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-­in-­Publication Data
A catalog record for this book has been requested
ISBN: 978-0-815-37322-3 (hbk)
ISBN: 978-0-815-37323-0 (pbk)
ISBN: 978-1-351-24385-8 (ebk)
Typeset in Bembo
by Apex CoVantage, LLC.
To Nurol Hacıgüzeller, my mum, with gratitude and love . . .

Piraye

To Mike Fletcher who sparked my interest in statistics many years ago and to Jude for continuing support and love.

Gary

To Steve Dockrill and Glyn Goodrick. Legends.

Mark
Contents

List of figures x
List of tables xxiv
List of contributors xxvi

1 Archaeology and spatial analysis 1
Mark Gillings, Piraye Hacıgüzeller and Gary Lock

2 Preparing archaeological data for spatial analysis 17
Neha Gupta

3 Spatial sampling 41
Edward B. Banning

4 Spatial point patterns and processes 60
Andrew Bevan

5 Percolation analysis 77
M. Simon Maddison

6 Geostatistics and spatial structure in archaeology 93
Christopher D. Lloyd and Peter M. Atkinson

7 Spatial interpolation 118
James Conolly

8 Spatial applications of correlation and linear regression 135
Piraye Hacıgüzeller

9 Non-stationarity and local spatial analysis 155
Enrico R. Crema

10 Spatial fuzzy sets 169
Johanna Fusco and Cyril de Runz

11 Spatial approaches to assignment 192
John Pouncett

12 Analysing regional environmental relationships 212
Kenneth L. Kvamme

13 Predictive spatial modelling 231
Philip Verhagen and Thomas G. Whitley

14 Spatial agent-based modelling 247
Mark Lake

15 Spatial networks 273
Tom Brughmans and Matthew A. Peeples

16 Space syntax methodology 296
Ulrich Thaler

17 GIS-based visibility analysis 313
Mark Gillings and David Wheatley

18 Spatial analysis based on cost functions 333
Irmela Herzog

19 Processing and analysing satellite data 359
Tuna Kalaycı

20 Processing and analysing geophysical data 376
Apostolos Sarris

21 Space and time 408
James S. Taylor

22 Challenges in the analysis of geospatial ‘big data’ 430
Chris Green

23 The analytical role of 3D realistic computer graphics 444
Nicoló Dell’Unto

24 Spatial data visualisation and beyond 460
Stuart Eve and Shawn Graham

Index 475
Figures

2.1 Sources and types of errors in data collection and compilation, data processing and data
usage that result in final global error, adapted from Hunter and Beard (1992). 19
2.2 The archaeological workflow in terms of a computational pipeline from data
acquisition to unpublished data, and re-­use. Problems of quality impact data in each
stage. Black boxes in this workflow occur wherever archaeologists employ software
and tools whose code is unavailable to review and modify and that do not enable
documentation of transformations. 20
2.3 Parsing options in OpenRefine for a file in Keyhole Markup Language (KML). 28
2.4 A spreadsheet style interface on OpenRefine that shows information in columns. 29
2.5 The cleaned version of the file ready with coordinates for mapping. 29
2.6 The cleaning sequence or ‘recipe’ for converting KML into comma-separated value (CSV) format. The code can be exported, modified and re-used. 30
2.7 An overview of the location and estimated size of the study area in Saint-­Pierre, France. 31
2.8 A map showing points from two surveys collected with a total station. The location of the total station (the origin) is represented as a star; the survey of archaeological features on the surface is marked in green, and the survey of topography in brown. 32
2.9 The survey points overlaid on a scanned map that is geo-­rectified to WGS-­UTM 21.
A Python script was developed to enable rotation and transformation of points in
a local coordinate system to a global coordinate system (UTM) using two known
coordinate pairs. 33
3.1 Examples of random and systematic spatial samples using points, rectangles, and
transects as the sample elements. (a) random point sample, (b) systematic transect
sample (walking north), and (c) systematic, stratified, unaligned sample of small squares. 44
3.2 Example of a random Probability Proportional to Size (PPS) sample of agricultural
fields used as sampling elements. Any field that contains one or more of the random
points is included in the sample (hatched). Note how larger fields are over-­represented,
but this may have practical advantages in fieldwork in terms of survey costs. 46
3.3 Map of the survey region of the Ayl to Ras an-­Naqab Survey in southern Jordan, with
three strata and 500 m × 500 m sample elements. 52
3.4 Map of a portion of the Wadi Quseiba survey region in northern Jordan, showing the
ephemeral stream channels and the population of landscape elements or “polygons”
that constituted the sampling frame for Stratum 2 of this survey. 54
3.5 Decline in the Relative Standard Error (RSE) of micro-­refuse counts with increasing
sample size in the use of sequential sampling in Wadi Ziqlab. Sampling stopped after
the three-­point slope was less than 0.03 for three consecutive measures of RSE. 56
4.1 Examples of the first-­order spatial intensity of a point pattern and its summaries: (a) a
random point distribution (n = 100, the study area is notionally 10 × 10 map units
in size), (b) a quadrat count of the same, (c) the histogram of observed quadrat counts
and the expected Poisson distribution if the pattern is random, (d) an inhomogeneous
point distribution where the intensity of points is higher in the top-right corner,
(e) a quadrat count of the same, (f) the histogram of observed quadrat counts and the
expected Poisson distribution if the pattern is random, (g) kernel density estimate of
the inhomogeneous pattern in (d) using a Gaussian kernel with a standard deviation
of 0.5 map units, (h) the same, but with a kernel standard deviation of 1 map unit, and
(i) the same, but with a kernel standard deviation of 2 map units. 62
4.2 Three hypothetical distributions (a–c) and how they manifest as K functions (d–f) and
pair correlation functions (g–i), the x-­axes in (d–i) are in metres and refer to the radius of
the circles around each point within which the respective K or Pair Correlation Function
(PCF) statistic is calculated; the critical envelope encompasses 95% of 999 simulations. 64
4.3 Multi-­scale second-­order effects: (a) a simulated process with small-­scale regularity
and medium-­scale clustering (b) the pair correlation function of a (the x-­axis is in
metres and refers to the radius of the circles around each point within which the pair
correlation statistic [y-­axis] is calculated; the critical envelope encompasses 95% of 999
simulations). 66
4.4 Mapping the spatial probability of a point subset: (a) a hypothetical example of 200 points
that combines the random and inhomogeneous point patterns from Figures 4.1 (a, d).
(b) the local spatial probability (out of 1) of finding a point from the inhomogeneous
surface in Figure 4.1d, with a much higher probability to the top-right, (c) UK Portable
Antiquities Scheme data showing Iron Age gold and silver coins of ‘Dobunni’ style,
and (d) the local spatial probability of finding gold coins with an area of much higher
probability on the western borders. 68
4.5 Archaeological survey evidence from the Greek island of Antikythera: (a) individual
houses and field huts of approximately 19th–early 20th century AD date, (b) surface pottery of approximately 19th–early 20th century AD date collected during
fieldwalking of the whole island, (c) access to flat land (a count of how many flatland
cells are within a radius of 500m), and (d) distance to the nearest large freshwater
spring (square root-­transformed). 72
4.6 PCFs for three stages of model fitting, calculated in the same way for both buildings
and surface pottery (the x-axis is in metres and refers to the radius of the circles around
each point within which the pair correlation statistic [y-­axis] is calculated; the critical
envelope encompasses 95% of 999 simulations): (a–b) observed PCF and 95% critical
envelope constructed from random simulations of a homogeneous Poisson process
(i.e. a null model of complete spatial randomness); (c–d) the same as (a–b), but with
envelopes now constructed from conditional random simulations from a fitted first
order model using the two covariates in Figure 4.5(c–d); and (e–f) the same as (c–d),
but with envelopes now constructed from simulations conditioned on both the two
first order covariates and an additional clustering model. 73
5.1 Discrete City Clustering Algorithm applied to population density in the UK. This shows the step-by-step identification of a cluster on a given lattice. The top left panel shows a populated lattice; in the top right, a cell is chosen as the starting point and its immediate neighbours are then incorporated. In the final bottom right quadrant the process has been reapplied to those neighbours as well. 78
5.2 Continuum City Clustering Algorithm (CCA). With Continuum CCA, the technique
is applied in a continuous space (as opposed to a lattice) and neighbours are defined
as falling within a given radius ‘l’. The technique is applied sequentially, starting with
an arbitrarily selected point, and is then applied repeatedly to the newly included
neighbours until the cluster grows no more. 79
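The cluster-growth procedure described in Figures 5.1 and 5.2 is straightforward to sketch in code. The following is a minimal, illustrative Python implementation of the continuum variant only; the function name, point coordinates and radius are hypothetical and not drawn from the chapter:

```python
import math
from collections import deque

def continuum_cca(points, radius):
    """Grow clusters from arbitrary starting points, repeatedly absorbing all
    unassigned neighbours within `radius` until each cluster grows no more."""
    unassigned = set(range(len(points)))
    clusters = []
    while unassigned:
        seed = unassigned.pop()
        cluster, frontier = [seed], deque([seed])
        while frontier:
            i = frontier.popleft()
            # neighbours of i that have not yet been claimed by any cluster
            found = [j for j in unassigned
                     if math.dist(points[i], points[j]) <= radius]
            for j in found:
                unassigned.discard(j)
            cluster.extend(found)
            frontier.extend(found)
        clusters.append(sorted(cluster))
    return clusters

# Two hypothetical groups of sites separated by more than the 1.5-unit radius.
sites = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
print(continuum_cca(sites, radius=1.5))  # two clusters: [0, 1, 2] and [3, 4]
```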
5.3 Percolation transition plot – max. cluster size vs. percolation radius. 81
5.4 Percolation cluster transitions for Domesday settlement. Evolution of the largest cluster
in the percolation process of Domesday settlement, overlaid on the transition plot (as
in Figure 5.3). Maps of the clusters at the distance threshold for each transition are
depicted. Each vector point's colour represents cluster membership: two or more nodes close enough to each other form part of the same cluster. 83
5.5 Domesday vill clusters at 3km and 2.9km overlaid on English coastline and
Domesday counties. 84
5.6 Domesday vill and 19th-­century settlement clusters. (a) Domesday vill clusters at
3.2km overlaid on coastline and Domesday counties, generated from Domesday
vill datasets provided by Stuart Brookes; (b & c) Roberts and Wrathmell’s 19th
Century Settlement Nucleation dataset at 3km and at 3.5km overlaid by Roberts and
Wrathmell’s central province. 84
5.7 Hillfort clusters in Britain, at (a) 34km, (b) 12km and (c) 9km percolation radius. 86
5.8 Central Wales cluster at 9km with sites plotted according to area, and the rivers
Wye and Severn. 87
5.9 Cotswold cluster at 10km radius with sites plotted according to area, and the rivers
Wye, Severn and Thames. 88
6.1 Transect with paired points selected for lags of 1 and 2 units. 96
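The lag-based pairing shown in Figure 6.1 underlies the experimental variogram. As a rough sketch (not the chapter's code, and with hypothetical transect values), the semivariance at an integer lag h is half the mean squared difference of all point pairs h steps apart:

```python
def semivariance(values, lag):
    """Experimental semivariance for a regular transect at an integer lag:
    gamma(h) = sum((z[i] - z[i+h])**2) / (2 * number_of_pairs)."""
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

heights = [10.0, 10.4, 10.9, 11.8, 12.1, 12.9]  # hypothetical transect heights
for h in (1, 2):  # the lags of 1 and 2 units shown in Figure 6.1
    print(h, round(semivariance(heights, h), 3))  # semivariance grows with lag
```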
6.2 Selection of paired observations for directional variograms. 97
6.3 Omnidirectional experimental variogram with fitted model. 98
6.4 Bounded variogram model. 99
6.5 Location of GPS measurements at Ballyhenry rath. 104
6.6 Experimental variogram of GPS measured heights, Ballyhenry rath. 105
6.7 Experimental directional variogram of GPS measured heights, Ballyhenry rath. 106
6.8 Experimental detrended directional variogram of GPS measured heights, Ballyhenry rath. 107
6.9 Experimental detrended variogram of GPS measured heights, Ballyhenry rath, with
fitted model (Bessel model with a sill of 0.257 and a range of 10.379 m). 108
6.10 Elevation estimates (in metres), derived using kriging with a trend model. 109
6.11 Kriging variances. 110
6.12 ‘2.5D’ representation of (a) kriged elevations and (b) conditionally simulated values
(viewed from the southwest). 111
6.13 Radiate of Allectus: C mint percentages in 5 km grid cells. 112
6.14 Directional variogram of C mint percentages. 113
6.15 Directional variogram of C mint percentages: 90º clockwise from north (east-­west);
with fitted model. 114
6.16 Kriged map of C mint percentages. 115
7.1 Simple interpolation examples: (a) with two known point values, using linear
interpolation estimates D = 15; (b) adding a third sample location and using inverse
distance weighted squared interpolation estimates D = 12.5. 121
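The inverse distance weighting estimate behind examples like Figure 7.1(b) reduces to a short weighted average. A minimal sketch, assuming hypothetical sample coordinates and values rather than those of the figure:

```python
import math

def idw(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target` from (x, y, z) samples;
    weights are 1/d**power, so nearby samples dominate the estimate."""
    num = den = 0.0
    for x, y, z in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z  # target coincides with a sample point
        w = 1.0 / d ** power
        num += w * z
        den += w
    return num / den

samples = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0), (2.0, 3.0, 12.0)]  # hypothetical
print(round(idw(samples, target=(2.0, 1.0)), 2))
```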
7.2 Spline as a concept: (a) regularized with high weighting, allowing the interpolation
estimates to exceed the z-­values to maintain smoothness at points marked by arrows;
(b) a tension spline, which adheres to the original data values at the expense of smoothness. 123
7.3 A variogram showing increasing variance between samples of values drawn from
increasing distances apart. After a distance of 60 m there is no increase in variance. 124
7.4 Anisotropy in a hypothetical sample of semi-­regularly spaced test units. The isolines
depict sherd counts in 5-­sherd intervals, illustrating how the rate of change is greater
on the north-­south axis than on the east-­west axis. 125
7.5 Archaeological point sample. (a) location of samples and artifact counts; (b) sample
with border-­area edge correction. The random test samples are designated by an ‘x’; the
building location by the rectangle. 127
7.6 Visual differences in the surfaces of nine interpolation methods at 1 m resolution. RMS
errors for each model are provided in Table 7.2. 129
7.7 Interpolation example modified from Salisbury (2013, Figure 5). Note that the
high resolution (small pixel size) has exceeded the limits of the original data. The
interpolation is thus unstable and noisy where there is higher local variance, for
example in the area north of the ‘trample zone’ (arrows). 131
7.8 Interpolation example modified from Fort (2015, Figure 1). An interpolated surface
model of radiocarbon dates from Neolithic sites (black dots) depicting the space-­time
process of the spread of agriculture across Europe. Note the areas indicated by arrows
in the southwest and northeast of the model showing where data sparseness causes
instability in estimates. 132
8.1 Cost-­distance surface with 150 km isopleths having (a) Buran Kaya III (BK);
(b) Geissenklösterle (GEISSE); (c) Krems-­Hundssteig (KRE-H) as origin sites. 143
8.2 Linear regression models created with the Ordinary Least Squares (OLS) method
to determine the association between the time difference for the appearance of the
Gravettian techno-­complex at different sites and their least-­cost distance to three origin
sites. (a) model with Buran Kaya III as origin; (b) model with Geissenklösterle as origin;
(c) model with Krems-­Hundssteig as origin. 145
8.3 (a) Map illustrating the difference between x value (i.e. least-­cost distance in
kms) at each location and average x value in order to give an indication of spatial
autocorrelation. (b) Map illustrating the difference between y value (i.e. time difference
in years) at each location and average y value in order to give an indication of spatial
autocorrelation. 147
8.4 Map showing residuals at each location in order to give an indication of spatial
autocorrelation. 149
9.1 Screenshot depicting the distribution of radiocarbon dates available from the Canadian
Archaeological Radiocarbon Database, version 2.1. 157
9.2 Simulated point patterns with associated observed and expected L function (a variant of
Ripley’s K function where the theoretical expectation of Complete Spatial Randomness
(CSR) is a straight line): (a) homogeneous Poisson process; (b) clustered point process;
(c) spatially inhomogeneous Poisson process with different intensities between left
and right sides of the window of analysis; (d) second-­order spatial heterogeneity with
a combination of regular and clustered patterns. The function suggests aggregation
(clustering) when the observed L function is above the expected value and segregation
(regular spacing) when below. 159
9.3 Lithic distribution analysis from the Sebkha Kelbia survey, Tunisia, showing contrasting
results between global and local bivariate L functions of stone tools divided by their
raw material (Gafsa flint vs. flint sourced elsewhere): (a) distribution of the analysed
stone tools (filled circle: Gafsa sourced flint; hollow circle: flint sourced from elsewhere);
(b) bivariate L function showing significant segregation between the two classes
between 20 and 320 meters (MC: Monte-­Carlo); (c) local bivariate L function scale
showing evidence of aggregation at 100 meters (black dots indicate location of Gafsa
sourced flints with a statistically significant proportion of neighbours composed by flint
sourced from elsewhere). 161
9.4 Local spatial permutation test of the summed probability distribution of radiocarbon dates
(SPDRD) from Neolithic Europe showing locations with higher or lower geometric
growth rates than the expectation from the null hypothesis (i.e. spatial homogeneity in
growth trajectories) at the transition period between 6500–6001 and 6000–5501 cal BP. 164
10.1 Illustration of a possible fuzzy definition of the concept ‘young’ for humans. 172
10.2 Interpretation examples of membership values, and illustration of the α-­cut concept
in the case where the fuzzy set is representing a possibility distribution. For instance,
the domain value subset [v9, v10] is the core (the α-­cut A1) of the fuzzy set and
means very possible, while [min, max] is the support (the α-­cut A0) and means almost
impossible. Any domain values outside [min, max] are impossible. 173
10.3 Illustration of a fuzzy wooded area. Experts determine 3 main areas: area 1, where the
concept of the wooded area is plainly respected (the α-­cut A1); area 2, which encloses
area 1, where the concept is partially respected (the α-­cut A0.5); and area 3, which
encloses the two previous areas, representing the limits for which the concept can at least
partially be defined (the α-­cut A0). Considering several α-­cuts (at least 2, A1 and A0), the
membership degree of each domain value can be obtained: it is at least the highest degree
of the α-­cuts it belongs to and can potentially be obtained by spatial interpolation as well. 174
10.4 Illustration of a connected spatial α-­cut. 175
10.5 Membership functions of two fuzzy periods/dates (f and g); a, b, c and d are the areas
of non-­overlap. 176
10.6 Visualization of Roman streets in Reims that predate the period “around 200 AD”, displayed according to the confidence we have in the results. 177
10.7 Visualization of entities which have an activity dated to the “Middle of the 2nd
century” and which belong to site PC 88. 178
10.8 Simulated map of Reims’ streets during the 3rd century AD generated from the fuzzy
spatiotemporal data stored in FGISSAR (Fuzzy Geographic Information System for
Spatial Analysis in aRchaeology) with an adaptation of a pattern recognition method
(the Hough Transform). The darker the object, the higher the possibility of its presence
during the 3rd century AD. 179
10.9 Spatiotemporal trajectories and imperfection of archaeological data in Syrian Arid
Margins during the Bronze Age. 180
10.10 The four steps of the proposed methodology. 180
10.11 Detecting local spatial configurations from reliable and unreliable archaeological sites
with Local Indicators of Spatial Association. 181
10.12 Delineation of ‘sub-­spaces’ from waterways demonstrated on a small area of the Arid
Margins. 182
10.13 Fuzzy sets framework set up to estimate each sub-­space’s potential of attracting
settlement throughout the studied area. 183
10.14 Application of the fuzzy sets framework to each sub-­space. 184
10.15 Mapping each sub-­space’s fuzzy sets category as in Figure 10.14. 184
10.16 Mapping the attractiveness of and possibilities of finding settlements at the sub-­spaces
using fuzzy set estimates and survey intensity levels. 185
10.17 From type-­1 to type-­2 fuzzy sets: introducing the reliability of archaeological sites in
fuzzy set calibration. 187
10.18 Type-­2 fuzzy set settlement estimates for Early Bronze Age IV Arid Margins
comparing results for ‘cluster’ sites and ‘outlier’ sites. 188
11.1 Expected 87Sr/86Sr ratios in the immediate vicinity of Annaghmare based on the BASr
baseline for Ireland after Snoeck et al. (2019) (top: median; bottom: median absolute
deviation from the median). The black dots indicate the locations of the modern plant
samples used to generate the biologically available strontium (BASr) baseline and the
white dots indicate outliers for the Silurian sandstone, greywacke and shale (Formation 49). 199
11.2 Boxplot showing the variation in the observed 87Sr/86Sr ratios of modern plants
from the outcrops of Silurian sandstone, greywacke and shale (Formation 49) in the
Annaghmare region. The grey shaded area shows the 87Sr/86Sr ratios that lie within 1
median absolute deviation (MAD) of the median. Samples with 87Sr/86Sr ratios less than
3 MAD below the median or greater than 3 MAD above the median can be considered
to be non-­locals. 200
11.3 Focal medians for 5km BASr catchments calculated from the BASr baseline for Ireland
(left) and residuals between expected 87Sr/86Sr ratios based on the BASr catchments and
the observed 87Sr/86Sr ratio for Individual A2 (right). Locations from which Individual
A2 could have originated are shown in white. Locations from which Individual A2 is
unlikely to have originated are shown in blue (more depleted) and orange (more enriched). 201
11.4 Expected 87Sr/86Sr ratios based on the BASr baseline for mainland Britain (after Snoeck et al., 2018, Figure 2) (left) and expected δ18O values based on the groundwater baseline for the United Kingdom and Republic of Ireland (after Darling et al., 2003, Figure 6) (right). The expected δ18O values have been converted from δ18Ow values to δ18Op values using the equation from Daux et al. (2008) to allow direct comparison with the observed isotope value for Burial K at Duggleby Howe. 203
11.5 Probability density surface and maximum likelihood estimations showing the locations
where Burial K from Duggleby Howe could have originated from based on the
observed 87Sr/86Sr ratio, the observed δ18O value and both the observed 87Sr/86Sr ratio
and δ18O value. The geographic assignments based on dual isotope tracers are unduly
influenced by one of the isotopes (oxygen), raising further questions about the utility of
oxygen as a tracer isotope. 205
12.1 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope
in 3 equal-­area categories, (b) slope data showing rock pile and background samples
with boxplot and cumulative graphs, (c) slope data showing rock pile and check
dam samples with boxplot and cumulative graphs, (d) slope data and rock piles with
cumulative graphs, (e) sampling distribution of 9,999 sample means from the region
with indication of realized sample mean. 222
12.2 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope
in eight categories with midpoints plotted against rock pile density, (b) rock piles
and slopes within a circumscribed study region with cumulative distribution plots,
(c) elevation, slopes, and a logistic regression model for rock piles based on these data. 224
12.3 A northwest Arkansas historic data set from 1892: (a) the 18 × 27 km study region
with 589 historic farmsteads and roads plotted over topography with towns outlined,
(b) maps of the four principal components of historic settlement with central values of
legend indicating most preferred locations. 226
13.1 Southern portion of the coastal Georgia study area: maximum available calories for
white-­tailed deer (Odocoileus virginianus) for the month of September (ca. 500 BP). 238
13.2 Southern portion of the coastal Georgia study area: maximum available calories for all
shellfish species for the month of September (ca. 500 BP). 239
13.3 Southern portion of the coastal Georgia study area: returnable calories for all resources
combined for the month of January (ca. 500 BP). 240
13.4 Southern portion of the coastal Georgia study area: returnable calories for all resources
combined for the month of September (ca. 500 BP). 241
14.1 Schematic illustration of the features of an Agent Based Model (ABM) with cognitive
agents, based on the model described (Lake, 2000a). 249
14.2 Example of the realistic rendering of a simulated landscape. 250
14.3 Graphed Agent Based Model (ABM) simulation results which collectively illustrate
several aspects of experimental design: (a) plotted points of the same colour and k value
differ due to stochastic effects alone; (b) two different parameters s and k are varied;
and (c) two different agent rules, “CopyTheBest” and “CopyIfBetter” are explored. 260
14.4 Comparison of Long House Valley simulation results with archaeological evidence. 263
14.5 Population curves produced by 100 runs of the calibrated Long House Valley Model,
differing only in random seed. 266
15.1 Four different network data representations of the same hypothetical Mediterranean
transport network. (a) adjacency matrix with edge length (in km) in cells
corresponding to a connection; (b) node-link diagram where edge width represents length (in km); please refer to the colour plate for a breakdown by transport type (where red lines = sea, green = river, grey = road); (c) edge list; (d) geographical layout.
Once again, please refer to the colour plate for a breakdown of transport type. 274
15.2 A planar network representing transport routes plotted geographically (a) and
topologically (b). A non-­planar social network representing social contacts between
communities plotted geographically (c) and topologically (d). Note the crossing edges
in the non-­planar network. 278
15.3 Examples of three different node centrality measures: (a) nodes scaled by degree
centrality, (b) nodes scaled by betweenness centrality with path segment lengths shown,
(c) nodes scaled by closeness centrality with path segment lengths shown. 279
15.4 Examples showing relative and Gabriel graph neighborhood definitions: (a) A is a
relative neighbor of B because there are no nodes in the shaded overlap between the
circles around A and B, (b) A and B are not relative neighbors because C falls within
the shaded overlap. (c) A and B are Gabriel neighbors because there are no nodes
within the circle with a diameter AB, (d) A and B are not Gabriel neighbors because C
falls within the circle with a diameter AB. 281
15.5 Network representation of the Orbis network: geographical layout (a, c) and topological
layout (b, d). Node size and colour represent betweenness centrality weighted by
physical distance in (a) and (b), and they represent unweighted betweenness centrality
in (c) and (d): the bigger and darker blue the node, the more important it is as an
intermediary for the flow of resources in the network. By comparing (a, b) with (c, d),
note the strong differences in which settlements are considered central depending on whether physical distance is taken into account (a, b) or not (c, d). 284
15.6 Geographical network representation of the Orbis network: geographical layout (a) and
topological layout (b). Node size and colour represent increasing physical distance over
the network away from Rome: the larger and darker the node, the further away this
settlement is from Rome following the routes of the transport system. Note the fall-­off
of the results with distance away from Rome structured by the transport routes rather
than as-­the-­crow-­flies distance. 286
15.7 Nearest neighbour network results of the Orbis set of nodes. Node size represents
degree. Insets show degree distributions. Note how the network only becomes
connected into a single component when assuming 4-­nearest-­neighbours. 288
15.8 Maximum distance network results of the Orbis set of nodes. Node size represents
degree. Insets show degree distributions. Note how the network only becomes
connected into a single component when assuming 440 km as the maximum distance. 290
15.9 Results of the Orbis set of nodes; (a) relative neighbourhood network and (b) Gabriel
graph. Node size represents degree. Insets show degree distributions. Note how the
networks, as compared to the results shown in Figures 15.7 and 15.8, better succeed in
representing the shape of the Orbis transport network and the long-­distance maritime
routes crossing the Mediterranean. 291
16.1 London Tube maps: (a) the 1908 version superimposed on a city plan. (b) the 1933
version featuring H. Beck’s topological redesign. 297
16.2 Schloss Friedeburg, Saxony-­Anhalt, Germany. The ground floor of the main residential
building (17th c. CE): (a) the state plan of 1930. (b) a simplified plan with points of
access marked by arrows, topological graph superimposed. (c) the justified graph with
room types, rings and depth from carrier indicated. (d) the path matrix with sums of
path lengths. 299
16.3 (a–c) Simplified ground floor plan of the main residential building of Schloss
Friedeburg: (a) with the non-­convex rooms highlighted and a suggestion for
(approximately convex) subdivision of non-­convex rooms 1 and 6. (b) with the axial
map superimposed and line segments indicated for the longest and most integrated
axial line. (c) with three overlapping isovists and their centre-­points indicated. (d) the
axial map of the ground floor of the main residential building of Schloss Friedeburg,
with the topological graph of convex break-­up superimposed. (e) examples of
diamond-­shaped topological graphs. (f) justified graph of the axial break-­up of the
ground floor of the main residential building of Schloss Friedeburg. 302
16.4 The Palace of Pylos (Ano Englianos), Messenia, Greece (13th c. BCE): (a) a simplified
plan of the earlier building state with results of VGA and the shortest convex routes to
throne room 6 superimposed as a partial topological graph. (b) a simplified plan of the
later building state with results of VGA and the shortest convex routes to throne room
6 superimposed as a partial topological graph. (c) a simplified plan of the later building
state with shading indicating areas of convex spaces most easily accessible from the
three different points of access and separate j-graphs for access through each of the
latter. (d) a simplified plan of the later building state with shading indicating areas of
convex spaces most easily accessible from the three main courts 58, 63 and 88, “A”
marking the archive, “NEB” the Northeastern Building (the presumed clearing-­house)
and “P” pantries (courts assumed to be served from these are indicated by subscript
nos.). (e) J-graph of the later building state. 306
17.1 A single, binary viewshed designed to explore the visual impact of a 5 m high wooden
post that had been erected at the Avebury monument complex in Wiltshire, England.
A substantial structure, the post had been raised in the early Neolithic period at a
location that would eventually be traversed by a megalithic setting of paired standing
stones. The viewshed identifies those areas of the Avebury landscape where a 1.65 m
high viewer could have theoretically seen the post. 315
17.2 (a) Conceptualisation of the basic line-­of-­sight (LOS) algorithm: LOS between two
locations in an altitude matrix can be established by comparing the height of each cell
that intersects the line with the height of the line at that location, interpolating where
necessary. (b) Note that view-­to and view-­from are not necessarily reciprocal because
they represent different assumptions about the location of the viewer. An R3 viewshed
algorithm essentially repeats this calculation for every cell in the altitude matrix (except
the viewpoint) and records the result(s) for each cell. 317
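The line-of-sight comparison conceptualised in Figure 17.2(a) can be illustrated in a few lines. This is a simplified sketch over a one-dimensional profile of hypothetical cell heights; a full R3 implementation would interpolate heights along the ray to every cell of the altitude matrix:

```python
def line_of_sight(profile, cell_size, viewer_offset=1.65, target_offset=0.0):
    """Return True if the last cell of `profile` (ground heights along the ray
    from viewer to target) is visible from the first cell."""
    z0 = profile[0] + viewer_offset          # eye height above the viewpoint
    xt = (len(profile) - 1) * cell_size      # horizontal distance to target
    zt = profile[-1] + target_offset
    for i, ground in enumerate(profile[1:-1], start=1):
        x = i * cell_size
        line_z = z0 + (zt - z0) * x / xt     # sight-line height at this cell
        if ground > line_z:                  # terrain rises above the sight line
            return False
    return True

# Hypothetical 10 m cells: a 12 m knoll midway along the ray blocks the view.
print(line_of_sight([100, 101, 112, 101, 100], cell_size=10))  # False
print(line_of_sight([100, 101, 100, 100, 100], cell_size=10))  # True
```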
17.3 The concept of the R2 viewshed algorithm which operates by: (a) generating a ‘view
horizon’, noting the visibility of each cell on that horizon, and storing the elevation
angle of view to the observer a1, a2 etc.; (b) expanding the horizon by one cell and
calculating new angles of view B for each cell on the new view horizon; (c) the angle
of intersection with the previous horizon A is inferred (from a2 and a3) and the new
angle B is compared with A to determine whether a new cell is visible or not. 318
17.4 Buffering to avoid edge effects. The map depicts a group of Roman coin hoards in
the Don Valley in northern England. A series of visibility analyses were carried out
in order to determine whether the hoards were preferentially placed in relatively
concealed (or hidden) locations. (a) In the centre is the convex hull bounding
the group of hoard locations. (b) Assuming a maximum view radius of 3,440 m
(corresponding to Ogburn’s (2006) limit of normal 20/20 vision for a 1 m wide
object) we would need to process the area included in this buffer to avoid edge effects.
(c) If we increased this to 6,880 m (the limit of human acuity for a 1 m wide object)
we would need to extend our processing area accordingly – in this case to the outer buffer. 319
17.5 Using a scaling factor to compensate for edge effects. 320
17.6 (a) A binary viewshed generated from the prehistoric post setting at Avebury depicted
in Figure 17.1 (circled in white) shown over a shaded relief model. (b) A probabilistic
viewshed calculated from the same location (digital elevation model (DEM) errors are
modelled as normally distributed with a root mean square error (RMSE) of 3 m).
White areas represent 100% probability, with the probability declining as the shading
becomes darker. 321
17.7 (a) The cumulative viewshed generated by summing the binary viewsheds of the 17
coin hoard locations depicted in Figure 17.4 with a maximum view radius of 6,880 m.
Colours at the red end of the green-­red scale indicate locations from which higher
numbers of mounds are visible. (b) The total viewshed calculated for the entire study
region (in this case, the convex hull depicted in Figure 17.4 with a 500 m buffer –
8,938 viewpoint locations). This encodes views-­from the individual viewpoints, where
the red end of the green-­red scale indicates those locations from which a larger area is
modelled as being visible. (c) The above analysis repeated with viewpoint/target offsets
adjusted to encode views-­to the viewpoint locations. 324
17.8 Viewsheds generated for each of the tower-­kivas. The green zone represents the view
from ground level and the red the top of the tower. Blue dots indicate Puebloan
archaeological sites in the landscapes of the tower kivas. The radiating buffers extend
for 20 km around each site – the maximum viewing range used for the analyses. 326
18.1 Cost functions estimating walking time; on the x-­axis downhill slopes are negative. 338
18.2 Cost functions estimating walking time: uphill and downhill costs are averaged. 339
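Several of the walking-time functions compared in these figures are easy to reproduce. As one example, here is a sketch of the commonly quoted form of Tobler's hiking function, assuming slope is expressed as a mathematical gradient dh/dx (negative downhill); the helper names are ours, not the chapter's:

```python
import math

def tobler_speed(slope):
    """Tobler's hiking function: walking speed in km/h for mathematical slope
    dh/dx; the offset makes a gentle 5 % descent the fastest case."""
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))

def walking_hours(distance_km, slope):
    """Estimated walking time for a map distance at a constant slope."""
    return distance_km / tobler_speed(slope)

print(round(tobler_speed(0.0), 2))         # about 5.04 km/h on level ground
print(round(tobler_speed(-0.05), 2))       # 6.0 km/h at the optimal downhill slope
print(round(walking_hours(1.0, 0.10), 2))  # uphill kilometres take notably longer
```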
18.3 Simple example of an isotropic cost grid (left) and the corresponding accumulated cost
surface (ACS) (right). The origin of the accumulation process is the centre of the cost
grid (cost value = 10). 341
18.4 Dijkstra’s algorithm applied to a cost grid. 341
18.5 (a) Simple isotropic cost grid, (b) possible moves starting at the origin, (c) traversing the
barrier cells by long moves, (d) subdividing long moves. 342
18.6 (a–f) Depict ACS results based on the cost grid shown in Figure 18.5(a). The outcomes
of an inadequate barrier radius of merely 5 m are shown in (a–c). For (d–f) an
adequate barrier radius of 7.5 m was chosen. The N values indicate the number of
nearest neighbouring cells that can be reached from the origin without detour. Images
(g) and (h) illustrate the impact of different N values by grids showing the differences
in accumulated costs. 343
18.7 Small digital elevation model (DEM) with a cell size of 10 m and a constant slope
value of 10% with the corresponding ACS for the cost function Q(ŝ) = 1 + (ŝ/š)²,
with š = 10. 344
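The quadratic cost function named in the Figure 18.7 caption can be evaluated directly; a minimal sketch (the slope values below are hypothetical):

```python
def quadratic_cost(slope_percent, critical_slope=10.0):
    """Q(s) = 1 + (s/s_crit)**2: relative cost of traversing a cell at a given
    percent slope; the cost of level movement doubles at the critical slope."""
    return 1.0 + (slope_percent / critical_slope) ** 2

for s in (0, 5, 10, 20):  # cost is 1 on level ground and 2 at the 10 % slope
    print(s, quadratic_cost(s))
```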
18.8 ACS for different slope dependent cost functions (nos. 1, 6, and 9 in Table 18.2) on
three gradients, with an additional barrier as in Figure 18.5(a) (radius 7.5 m). For each
cost function, the costs vary from 0 at the origin, depicted in white, to the largest
accumulated cost value depicted in black. (a) Ericson & Goldstein, (b) Tobler, (c) Q(12). 345
18.9 A path built to be level on a steep slope in the hilly area east of Cologne, Germany. 346
18.10 Least-Cost Paths (LCPs) (black lines) from the origin in the centre to five different
targets. The outcome of the LCP algorithm depends on the number of nearest
neighbours that can be reached without detour (N) and the width of the barrier
((a) width = 5 m; (b, e, g) width = 7.5 m; (c, f, h) width = 10 m; cf. Figure 18.6). 347
18.11 The study area southwest of Cologne covering approximately 13 × 10 km. 348
18.12 LCPs based on the formulas by Tobler, Irmischer, and Herzog/Minetti as well as
quadratic slope dependent cost functions combined with costs for traversing water
courses and/or wet soils. 349
18.13 (a) Comparison of the Q(10) LCPs with the Agrippa road, (b) comparison of the local
prominence for these two routes (white = low, black = high prominence) (c) LCPs
with increased isotropic costs in areas of low prominence. 350
18.14 The best performing LCPs for the models considered and the Least-Cost Site
Catchments (LCSCs) derived from the Q(14) cost model for the Görresburg temple
and the villa. 351
19.1 The electric field remains perpendicular to the magnetic field as the electromagnetic
energy propagates. 361
19.2 A simplified depiction of electromagnetic radiation. Visible light, which is mainly made up of red, green and blue, constitutes only a small fraction of the full spectrum. 362
19.3 The main steps in a satellite remote sensing analysis. The workflow has a hierarchical
structure. Evaluation of results may reset the workflow until a satisfactory solution is
achieved. 363
19.4 A CORONA scene (DS1102–1025DF007, December 1967) showing the location and
extent of hollow ways radiating from Tell Brak, Syria. The terminal points of hollow
ways can be mathematically modelled in order to estimate the area of agricultural
production around sites. 368
19.5 Multi-­spectral data analysis with vegetation indices provides a detailed and dynamic
representation of agricultural landscapes. These models surpass static descriptions of
agro-­economic zones, which are usually based on strict assumptions about productivity.
The circles in the figure show production boundaries of Bronze Age settlements. 369
19.6 Scatter plots revealing the strength of the relationship between settlement size and
estimates of total production. The plot (a) shows a weak relationship when estimated
production values are directly compared with settlement size. However, when a
biennial fallowing strategy is introduced for settlements smaller than 50 hectares (b),
the relationship becomes much stronger. 369
20.1 Multisensor (8 sensors) vertical magnetic gradient survey with SENSYS at Pella, North
Greece. The left image shows the original data, suffering from various spikes, traverse striping and grid mismatching; the right image shows the results of processing intended to remove those specific effects. 378
20.2 Application of the Fast Fourier Transform (FFT) power spectrum analysis of the
magnetic data obtained from the Bronze Age cemetery of the Békés Koldus-­Zug
site cluster – Békés 103 (BAKOTA project). The depth of the various targets (h) is
easily determined by measuring the slope of the power spectrum at different segments
and dividing it by 4π (Spector & Grant, 1970). The radially averaged spectrum was
calculated and used to separate the magnetic signals coming from deep sources (h=2.87 m)
and shallow sources (h=0.73 m) below the sensor. The spectrum was also used as a
guide to define a bandwidth filter in order to eliminate the sources with wavenumber
more than 550 radians/m and less than 100 radians/m respectively, and enhance the
magnetic signal coming from the potential archaeological structures. 381
20.3 (a) 3D resistivity model. (b) Three dimensional distribution of the calculated apparent
resistivity results from model A. (c) Pseudo-­3D slices of the resistivity resulting from
the 2D inversions along the X, Y and XY axes. (d) 3D resistivity model from the three
dimensional inversion. Due to the wide range of the resistivity values, a logarithmic
scale is used. 384
20.4 An example of processing approaches applied to a radargram obtained at Lechaion
archaeological site with an antenna of 250MHz: (a) Raw data without any processing,
(b) application of Dewow, (c) Spreading Exponential Compensation (SEC) gain with
an attenuation of 6.16, start gain of 2.56 and maximum gain of 542, (d) application of
automatic gain with a window width of 1.5 and maximum gain of 500, (e) application
of regional background removal, (f) migration process, (g) application of a high pass
filter, and (h) envelope transformation. 386
20.5 Gravity residual anomalies recorded above two tombs (Tombs 4 (above) and 8 (below))
of the Roman cemetery on the Koutsongila Ridge at Kenchreai on the Isthmus of
Corinth, Greece. The centres of the tomb chambers are located approximately at the
middle of the transects. According to the resulting graphs, it is estimated that both
tombs have a width of about 4.5–5 m. The gravity signature of tomb T4 is more clearly defined than that of tomb T8, probably because T4 is located within a
more homogeneous geological unit (valley fill deposits), whereas T8 is located at the
border between the valley fill deposits and conglomerate outcrops that extend to the
central section of the ridge. Both tombs have created a well-­defined gravity anomaly
with at least 0.04–0.08 mGal maximum variation with respect to the average background. 391
20.6 Results of a seismic refraction survey at the area of the assumed ancient port of
Priniatikos Pyrgos in East Crete, Greece: (a) 2D image representing the depth to
the bedrock, which reaches about 40 m below the current surface. The black dots
represent the position of the geophones along the seismic transects. The area has
been completely covered by alluvium deposits and other conglomerate formation
fragments as a result of past landslide and tectonic activity. The interpretation of the
velocity of propagation of the acoustic waves revealed the spatial distribution of (b) the
alluvium deposits at the top (velocity of 491 m/sec), (c) the lower and upper terrace
deposits (velocity of 1830 m/sec), (d) the medium depth sandstones and conglomerates
(velocity of 2400 m/sec) and (e) the deeper weathered limestone or cohesive
conglomerates (velocity of 4589 m/sec). 393
20.7 Results of the geophysical surveys at Velestino Mati. The magnetic data (a) indicates the
nucleus of the settlement at the west top of the magoula with some expansion towards
the east top. A number of high dipolar magnetic anomalies are associated with burnt
daub foundations that were also confirmed from the Electromagnetic Induction (EMI)
soil magnetic susceptibility (b) and the soil resistance data (c). Magnetic susceptibility
also confirmed the existence of enclosures around the tell. 397
20.8 Results of the geophysical surveys at Almyriotiki. The magnetic data (a) presented a
clear image of the internal planning of the settlement: Burnt daub structures follow
a circular orientation around the top of the tell. The houses expand further to the
south, where some weaker magnetic anomalies representing stone houses with internal
divisions are also present. An irregular wide ditch system encloses the settlement
from the east and the north and it is confirmed from the EMI magnetic susceptibility
(b) and soil conductivity measurements (c). The high soil conductivity to the north
coincides with an area susceptible to periodic flooding. The above were also confirmed
from the soil viscosity measurements (d) as an indicator of the soil permittivity. 399
20.9 Results of the geophysical surveys at Almyros 2. The magnetic data (a) depict clearly
the concentration of burnt daub structures at the centre of the tell, expanding
further to the south. The settlement is surrounded by a double ditch system, which
is confirmed by both EMI magnetic susceptibility (b) and soil conductivity data
(c). A number of breaks in this double enclosure are most probably associated with
multiple entrances to the settlement. Soil conductivity seems also to increase outside
the settlement to the south and west directions (north to the top), namely in the area
which is most susceptible to flooding. 403
21.1 Lin and Mark's conceptual data models, where raster datasets perhaps based upon spatial interpolation (SI)/generalisation (SG) methods are converted into voxels, which may in turn be manipulated through temporal interpolation (TI)/generalisation (TG) methods. The image highlights the difference between 2D and 3D raster data (voxels) in GIS. 413
21.2 Example of Langran’s ‘Snapshot Approach’. In this case ‘snapshot’ (Si) presents a
particular ‘world state’ at time (ti). 415
21.3 Example of Langran’s ‘Temporal Grid’ solution – here a temporal grid is created and a
variable length ‘list’ is attached to each grid cell to denote successive changes. 416
21.4 Example of Langran's ‘Amendment Vector Approach’ – urban encroachment is represented as a base state (left) with incremental amendment vectors. 416
21.5 The TimeMap Data Viewer (TMView) map space. 417
21.6 Sample frame from an animation in the ‘Up In Flames’ study that combined
synchronised animated density graphs (produced in the R Software Environment) with
animated density maps (produced in ArcGIS). 419
21.7 Diagram showing the relationship between the various inputs and outputs of the ‘OH_
FET’ urban fabric model (social use, space and time) and the dynamics of the potential
analytical outputs. 420
21.8 Time-­GIS (TGIS) screenshot showing dates symbolised according to temporal
topology. The colour coding is according to the temporal topological relationship
between each date and the currently selected time period. 421
22.1 Archaeological investigations recorded over time in England from the 16th century
until 2013 as in the National Record of the Historic Environment Excavation Index
(based upon data extracted from Historic England, 2011). The picture is certainly not
complete, but is representative of the immense increase in archaeological investigation
seen from the mid to late 20th century. 431
22.2 Comparison of different spatial bins used by the EngLaId (The English Landscapes and
Identities) project at the same scale. 437
22.3 Artefacts recorded by the Portable Antiquities Scheme (PAS) displayed using 100 year
time-­slices from 1500 BC to AD 999. The data has been binned into 5 km hexagons
and probabilities of each artefact falling into each time-­slice in each hexagon calculated
and then summed. 438
22.4 Map produced using 3 km hexagons showing early medieval evidence for field systems. 439
23.1 A 3D model of the house of Caecilius Iucundus visualized through Unity 3D. 446
23.2 3D GIS for the documentation of insula VI at Pompeii. Vertical and horizontal
drawings are made directly in the system based on the 3D surface model of the house
of Caecilius Iucundus as a 3D reference. 447
23.3 A virtual simulation of Stora Förvar. The volumes of spit layers excavated during the
19th century were reconstructed by combining the original drawings and the 3D
surface model of the cave made by laser scanning technology. 448
23.4 The different steps undertaken in the field to record, visualize and interpret contexts
and features detected during field investigation. 451
23.5 3D Model of the plaster head (21666) after conservation. The model was generated
using Agisoft PhotoScan Pro version 1.2.6, with acquisition campaign and processing
done by Nicoló Dell’Unto. 452
23.6 (a) 3D realistic surface model of building 131 which displays the plaster head when still
in situ. (b) spatial simulation of the 3D plaster head after conservation re-­integrated in
its original position within the 3D GIS. 453
23.7 3D documentation constructed for the archaeological field investigation of Kämpinge, Sweden. (a, b, c) Trench N09–04 3D recorded and visualized inside the 3D GIS used on site. (d) 3D models of Trench 06 made during the excavation seasons 2014 and 2015. 454
24.1 A citation network of results returned from a Google Scholar search of ‘geographic
+ visualization’, using Ed Summers’ Python package ‘Etudier’. “Google Scholar aims
to rank documents the way researchers do, weighting the full text of each document,
where it was published, who it was written by, as well as how often and how recently
it has been cited in other scholarly literature.” The results give us a sense of the most
important works by virtue of these citation patterns. Thus, MacEachren, Boscoe,
Hau, & Pickle (1999); Slocum et al. (2001); Brewer, MacEachren, Abdo, Gundrum, &
Otto (2000); Crampton (2002); Howard & MacEachren (1996) are most functionally
important in tying scholarship together. This is not the same thing as being the most
often cited work. Rather, these are the works whose ideas bridge otherwise disparate
clumps; they are most central. 463
24.2 Citation analysis using Summers’ Etudier package, from a Google Scholar Search for
‘data + sonification’. Colours indicate works that have similar patterns of citation; size indicates central works that tie scholarship together. This is not the same thing as ‘most
cited’. On this reading, one should begin with Madhyastha and Reed (1995); Wilson
and Lodha (1996); Zhao, Plaisant, Shneiderman, and Duraiswami, (2004); De Campo
(2007); Zhao, Plaisant, Shneiderman, and Lazar (2008). 466
Tables

4.1 Nearest neighbour (NN) test for the three hypothetical distributions in Figure 4.2(a–c)
(with an edge correction applied, as proposed by Donnelly (1978)). 65
4.2 Summary results of final models (after adjustment for the correlation of the clustering
component). 74
7.1 Interpolation methods and parameters used in the analysis. 128
7.2 Ranked RMS results by method and resolution. 128
8.1 Early Gravettian calibrated Accelerator Mass Spectrometry (AMS) dates of sites
included in the study together with least-­cost path distances from the three earliest sites
to the sites included in each correlation and regression. 144
8.2 Details of calculations for the numerator of Equation 8.1 where Geissenklösterle is
taken as the origin. 146
8.3 Details of calculations for the Autocorrelated Errors Model (ρ = 0.36, Equation 8.11). 150
11.1 Geographic assignments for Burial K from Duggleby Howe ranked by highest
probability density and lowest Euclidean distance, using regions based on National
Character Areas (England), National Landscape Character Areas (Wales) and
Landscapes of Scotland (Scotland). 206
12.1 Descriptive slope (percent grade) statistics for rock piles, check dams, and background
samples. 223
12.2 Global correlation matrix (left) for the four variables measured in the agricultural field
complex and logistic regression parameter estimates (right) indicating multivariate
relationships between rock pile presence and the four variables. 225
12.3 Lowest four principal components derived from 10 environmental variables in historic
Northwest Arkansas with largest absolute coefficients of eigenvectors shown in
boldface for interpretive purposes. 227
14.1 Rules for choosing new farming and settlement locations (from Axtell et al., 2002, Table 2). 265
14.2 Original ‘base’ parameter values for the Long House Valley model (from Axtell et al.,
2002, Table 4). 266
15.1 Top 20 highest ranking towns according to the topological betweenness centrality
measure and the distance weighted betweenness centrality measure. Towns highly
ranked according to both measures are highlighted. 285
15.2 Results of global network measures for all tested models and the undirected Orbis
network (in bold). Highlighted results show some similarity in global network
measures with the Orbis network. 289
18.1 Cost components applied in selected archaeological least-­cost studies published in 2010
or later. 335
18.2 Slope-­dependent cost functions, with ŝ percent slope, and s = ŝ/100 mathematical
slope. If Δd (or ΔD) is missing in the cost formula, the result of the cost formula is
to be multiplied by the distance covered. Rows 1 to 7 list cost functions estimating
time, the formulae in rows 10 to 12 estimate energy consumption. The cost functions
listed in rows 8 and 9 measure abstract cost units, which can best be understood by
comparing estimates resulting from movement on a gradient with that on level ground. 337
18.3 Published terrain factors for cost functions measuring time (unit: hour) or energy
consumption (unit: joule) of a walker. Note: ‘m asl’ refers to metres above sea level. 340
18.4 DEM data provided by the ordnance survey institution (Geobasis NRW) responsible
for this part of Germany. 348
18.5 Comparison of the Agrippa Road section and the LCP generated based on the cost
function Q(10) with a critical slope of 10 percent (see no. 9 in Table 18.2) combined
with a penalty factor of 5 for crossing streams. For the two routes, the percentage in
each prominence category is given. 350
18.6 Areas included within the LCSCs in hectares. 351
21.1 Table summarising the three key computational approaches to the integrated
conceptual modeling of spatiotemporality in spatial technologies. 411
21.2 Table summarising the seven baseline temporal operators of Allen’s interval algebra
(1983), which, along with their inversions, define a total of 13 relationships between
two temporal intervals. 423
Contributors

Peter M. Atkinson – Lancaster University

Edward B. Banning – University of Toronto

Andrew Bevan – University College London

Tom Brughmans – University of Barcelona

James Conolly – Trent University

Enrico R. Crema – University of Cambridge

Cyril de Runz – BDTLN, LIFAT, University of Tours

Nicoló Dell’Unto – Lund University

Stuart Eve – University of Leicester

Johanna Fusco – UMR IMBE (Aix-­Marseille University, FR); UMR 7300 ESPACE (University of Nice, FR)

Mark Gillings – Bournemouth University

Shawn Graham – Carleton University

Chris Green – University of Oxford

Neha Gupta – University of British Columbia

Piraye Hacıgüzeller – Ghent University


Irmela Herzog – The Rhineland Commission for Archaeological Monuments and Sites

Tuna Kalaycı – Foundation for Research and Technology, Hellas

Kenneth L. Kvamme – University of Arkansas

Mark Lake – University College London

Christopher D. Lloyd – Queen’s University Belfast

Gary Lock – University of Oxford

M. Simon Maddison – Independent Researcher

Matthew A. Peeples – Arizona State University

John Pouncett – University of Oxford

Apostolos Sarris – Foundation for Research and Technology, Hellas

James S. Taylor – University of York

Ulrich Thaler – Römisch-­Germanisches Zentralmuseum

Philip Verhagen – Vrije Universiteit Amsterdam

David Wheatley – University of Southampton

Thomas G. Whitley – Sonoma State University


1
Archaeology and spatial analysis
Mark Gillings, Piraye Hacıgüzeller and Gary Lock

Part 1 – archaeology and space


This book comprises twenty-­three detailed chapters describing key spatial analytical techniques and
their application to archaeology. As the title of the book suggests the focus is on methodology, and the
chapters herein cover a range of techniques, both established and emerging. Although the emphasis is
on practice – the how to do it – it is crucial to stress from the very start that underlying any application
of these techniques must be the why we do it. Each chapter in the volume offers an introduction cover-
ing the background of that particular technique. Here we present some thoughts on the development
of ‘spatial archaeology’ more generally and why we think it is fundamental to much of what we do as
practicing archaeologists.
We start by considering the centrality of space to everyday life and archaeology as a discipline, and
open up this discussion further by laying out some of the relations that archaeological space finds itself
entangled with, such as time, practice and representation. Following this, we offer a brief historical over-
view of the development of spatial analysis in archaeology. This explores the contribution of the early
antiquarians, through the formulation and zenith of formal spatial techniques in the late 1950s to early
1980s, their fall from favour and then second coming in the 1990s due to the introduction of a range
of spatial technologies, not least Geographic Information Systems (GIS). This is then followed by a con-
sideration of why concepts of space and spatiality underlie much archaeological thought, what can be
called ‘spatial thinking’ in archaeology, and how this relates to what is understood by ‘spatial analysis’. The
chapter concludes with a careful consideration of what it means to think spatially in order to foreground
the goal of the volume as a whole, which is to make a positive contribution to the on-­going development
of archaeological spatial literacy at a time of significant theoretical and methodological transformation.
Being human embodies space and spatial relationships within a material world and just as this applied
to people living in the past, so it applies to those of us concerned with trying to understand those past lives
through their remaining material residues. Most, if not all, archaeological material has a spatial component
and it is not surprising, therefore, that spatial thinking and spatial analysis have been a central archaeological
endeavour since the beginnings of the discipline. While some other social sciences and humanities dis-
ciplines, particularly history, have claimed a fairly recent ‘spatial turn’ (Bodenhamer, Corrigan, & Harris,
2010; Warf & Arias, 2009), archaeology in all its changing forms has always incorporated an implicit or
explicit acceptance that space and spatial relationships are a fundamental part of ‘doing archaeology’. It
has also been highly proactive in co-­opting and developing the methodological tools needed to explore
this spatiality, culminating in the rich variety of approaches available to us today; a consequence of devel-
opments in theory and practice alongside changing analytical and technological opportunities.
In fact, space, spatiality and spatial awareness are such fundamental parts of being human that we often
take them for granted at the bodily level of moving through and experiencing the world. Developments
in digital geospatial technology and the increasingly pervasive presence of locational media have increased
this familiarity further still (see Wood, 2012, p. 280). It is this very familiarity that risks blinding us to
the spatial formations of social life and to how we actively manipulate space and spatial relationships to
shape the world around us through the activities, behaviours and structures that give our lives meaning.
These spatial manipulations and interventions are what make us distinctive and different to other cultural
groups, offering both social cohesion and social exclusion at the same time. This assumption of essential
human spatiality is what underlies much traditional ‘spatial archaeology’ – the isolation and interpretation
of spatial patterns within archaeological evidence that relate archaeological activity in the present to the
generative processes in the past that we are interested in.
But what are some of the complex relations that describe archaeological space and spatialities?
Through which relations are we able to ‘do’ spatial archaeology and interpret human-­space interactions
in the past? We briefly lay out five of them here. For one, space in archaeology is linked to time (see Tay-
lor, this volume). Prior to the mid-­20th century, cultural evolutionary and cultural historical approaches
in archaeology explicitly privileged time over space, working with long time scales, grand themes
and trends (see Trigger, 1998), echoing a modernist discursive practice (Roberts, 2012, p. 14). Yet, argu-
ably, archaeologists realised relatively early on that notions of space and notions of time were intimately
linked and these were often treated as tacit conceptual axes along which analyses and interpretations were
structured, often alongside a further axis such as ‘form’ (Spaulding, 1960), sociality and the social (Soja,
1996), or materials and material relations (Conneller, 2011; Lucas, 2012, pp. 167–168).
Secondly, space in archaeology is about mobility, rendering movement across space a key focus of
archaeological and anthropological inquiry (e.g. Hammer, 2014; Richards-­Rissetto & Landau, 2014;
Snead, Erickson, & Darling, 2009; cf. Verhagen, Nuninger, & Groenhuijzen, 2019). Regardless of its
context and spatial scale, traversing space affords imaginations about what is to come, reflections on
what is left behind and memories of places. As such, it connects time and space with living. After all,
human life is a temporal process that unfolds with the formation of places through movement and
through the material and immaterial traces that movement leaves behind (Ingold, 1993; see Atkin-
son & Duffy, 2019; McCormack, 2008). Yet how can a preoccupation with spatial movement and all
of the terms that come with it (e.g. flows, networks and liquidity, often used to describe and analyse
conditions of late capitalism, and to construct sweeping grand narratives of globalisation) give hope
to archaeologists trying to come to terms with “specific, tangible materialities of particular times and
places” (Dalakoglou & Harvey, 2012, p. 459)? It appears that thinking through space with movement
is a rather different approach than looking at movement to observe and describe space: while the latter
involves an examination of selected outcomes including spatial patterns in order to trace antecedent
causes, the former attempts to follow forward moment-to-moment spatial formations intently (see Ingold,
2011, pp. 6–7; Knappett, 2011).
Thirdly, space in archaeology is about stories and daily practices (such as practices of gathering, com-
position, alignment and reuse) that typically form spatial assemblages with archaeologically traceable
material dimensions (McFarlane, 2011, p. 649; see Seigworth, 2000; Thrift, 2008). The iterative processes
of creating these assemblages and relations are in fact processes of place-­making, processes of dwelling.
They involve assembling relations between humans, non-­humans, materials, immaterials, and animate
and inanimate things that continually produce the character and history of places (see also McFarlane,
2011). Places come about then through routine repetition “that is permeated by the past and the pres-
ent, and oriented toward the future” (Resina & Wulf, 2019, p. vii); they come about through enactment
and performance of relations anew almost daily. What about telling and re-­telling engaging qualitative
and quantitative archaeological stories about these places? They can be considered as just another act of
re-­performing spatial relations that re-­assemble archaeological places in the present.
Fourthly, space in archaeology is as much about absences as about presences. Archaeological forma-
tion processes often hinder us from asking certain questions about spatialities as spatial relations between
things or things themselves can be absent from the archaeological contexts in question. Often, archaeolo-
gists also lack an adequate appreciation of what is actually missing. Since such uncertainties are endemic
to archaeology, archaeological practices involve and remain open to new ways of incorporating them
within archaeological writing and analysis about space. Nonetheless, the presence of materials remains
almost exclusively the single origin of signification and meaning in archaeological contexts. What about
‘archaeologically empty spaces’, i.e. spaces devoid of materials that can be directly associated with past
human activities? As Löwenborg (2018, p. 37) stresses “[a]n archaeologist should not assume that ‘empti-
ness’ is a random prehistoric phenomenon. The saying ‘the absence of evidence is not the evidence of
absence’ certainly applies to archaeology”. As such, ‘empty spaces’ in archaeology should be subject to
description, representation and interpretation as much as any other archaeological space. Their status
in archaeology today is in fact an effect of common signification and meaning-­making practices in the
discipline. That is, there is nothing inherently meaningful or meaningless about ‘empty spaces’; their
treatment simply reflects how we do archaeology today.
Finally, space in archaeology is about the challenges of representing it and increasingly so in the digital
age. As discussed in Lefebvre’s (1991, p. 38) detailed treatment of space, “representations of space” form
one of the three fundamentals that structure spatial understanding. Lefebvre conceptualizes “representa-
tions of space” as the space of planners, scientists and engineers who attempt to identify “spatial practice”
and “representational spaces” with it. In archaeology, it is through the process of engaging with evidence
that we conceive representations of space, i.e. the interpretative constructs of the excavation plans, dis-
tribution maps and spatial models used to represent and explain spatial and social relationships. At the
same time it is through these representations of space that we attempt to link past spatial practices with
Lefebvre’s “representational spaces”, i.e. the lived spaces of past people. Put differently, representing space
in archaeology is a generative act, a process of constructing how archaeologists get to know, experience,
understand and deal with space. And as post-­representational cartography has made clear, every engage-
ment with representations of space (such as ‘using’ maps) re-­creates those representations as well as the
spaces that are represented (Hacıgüzeller, 2017). As such, creating representations of space via GIS or any
other tool, and using them in archaeology are not simple acts; they are processes with great consequences
that need to be identified and talked about (Wood, 2010).

Part 2 – towards spatial archaeology


To understand the current status of spatial analysis in archaeological research, it is important to consider,
albeit briefly, its disciplinary development. In the overview that follows we focus mainly on British devel-
opments from 16th century antiquarianism to the development of archaeology as a discipline towards
the end of the 19th and through the 20th centuries. Related to this, although with important differences,
is North American archaeology, which is touched upon here but described in detail by Willey and
Sabloff (1993). Within the range of wider geographical and cultural contexts, we readily acknowledge
that the history of archaeology is a complex international one which must incorporate many different
national traditions (Schnapp, 1996; Boast, 2009). As a result, what we offer here is undoubtedly partial
and selective.
We start at the very birth of the discipline, because some of the core themes that shape current spatial
research in archaeology had their origins in the very first phases of archaeological enquiry; not least the
tensions that exist between empiricism and synthesis. For archaeology to become an evidence-­based dis-
cipline it needed not just quantities of evidence but for that information to be structured, ordered, cata-
logued and made accessible. Increasing amounts of evidence require effective synthesis and interpretation
in order to produce narratives which provide an understanding of the past. Other key themes that have
been woven into them have concerned the relationship between space and time and the representation
and reasoning of change through time and across space. Another important element is that of scale – at
its simplest level the details of and associations between artefacts, sites and landscape. The interacting
scales of empiricism and synthesis range from the details of artefacts to enable typologies and dating, to
the recording of sites, landscapes and broad geographical regions.
A defining characteristic of archaeology, and of much antiquarian activity, is what we can broadly
call fieldwork. Whether this is investigating the small-­scale relationships available through excavation
or the larger views offered by landscape, the physical remains of the past have demanded an apprecia-
tion of spatial relationships explained through reasoning and representation. Those early explorers of the
past established methods of recording that have formed the basis for more recent methodologies, some
of which are still central to the writing of archaeology today, a good example being distribution maps
(Wickstead, 2019). Similarly the recording of excavations through plans and sections, and the later incor-
poration of stratigraphical relationships, have provided the basis for spatial thinking at that scale since the
early days of the discipline.
In 1533 John Leland was commissioned to travel the kingdom to “make a search for England’s Antiq-
uities” and record the “places wherein Records, Writings and secrets of Antiquity were reposed” (Chan-
dler, 1993). His recordings of historical documents, artefacts and places were assembled into his ‘Itinerary’,
a large collection of notes that offered a remarkable account of his nine years of travelling the land with
the intention of producing ‘a map’ of Great Britain. The importance of Leland’s contribution in terms of
methodology is that he escaped from the library and went on the road. Although claimed as the ‘father
of English topography’ Leland’s itinerary is in fact a ‘map in words’ for it contains very few graphical
representations and he never produced a map as originally intended. Spatial relationships between towns,
villages, archaeological sites and other points of interest are described in his notes by measurements of
distance and compass directions.
The next great work of English antiquarianism, William Camden’s Britannia, first published in Latin in
1586, was in many ways the realisation of Leland’s dream (Schnapp, 1996, p. 141). Even with the addition
of over 50 maps, however, the Britannia is still primarily a work of narrative, where text incorporates spa-
tial descriptions and understandings and the maps provide mainly a reference for location. For Schnapp
(1996, p. 154), Camden typified the British interest in archaeological cartography, the “description of
landscape and listing of monuments”. For the beginnings of spatial archaeology in the modern sense of
recording, representing and interpreting individual sites and then doing the same within their landscape
settings, however, we have to wait until the middle of the 17th century and the work of John Aubrey.
Aubrey’s approach to landscape and archaeological sites was somewhat different to those of Leland and
Camden whose studies and writings were very much in the tradition of continental Renaissance human-
ism (Sweet, 2004). He can be seen as an important part of the historical revolution that paralleled the sci-
entific revolution based on collecting evidence, questioning, interpreting and validating (Hunter, 1975).
This involved a shift in spatial thinking so that the accurate planning of earthworks and other (mainly
prehistoric) archaeological sites became tools for analysis rather than just for display, in his own words
“comparative antiquitie writt upon the spott from the monuments themselves” (Hunter, 1975, p. 181).
His belief was that material remains could illuminate the past which should not be bound by texts alone.
A generation after Aubrey, and influenced by his writings, William Stukeley continued and developed
this tradition of fieldwork-­based recording, analysis and interpretation (Piggott, 1985); his extensive
fieldwork between the years 1718 and 1725 has had a lasting archaeological legacy. His acceptance of
the importance of spatial relationships is inherent within his accurate plans but also in his other forms
of spatial representation. He produced many ‘prospect’ views, natural style pen and wash drawings of
monuments in their landscape setting often annotated although his most innovative technique was the
“circular view” (Peterson, 2003). This representation is removed from vertical measured plans and land-
scape prospect views and shows the 360 degree horizon around a particular point with landscape features
and archaeological sites integrated.
In many ways Stukeley stood at a crossroads in the development of archaeological spatial thinking
and the resulting recording methods and interpretative representations. Aubrey had acknowledged that
what he did was chorography, literally “place writing”, discerning the past from the present through an
intimate knowledge of landscape where all aspects of human activity and history are recorded with the
emphasis on producing narrative so that any spatial considerations were supporting information. In con-
trast, Stukeley was the first secretary of the newly formed Society of Antiquaries of London, an act in itself
which announced antiquarianism as a nascent discipline in its own right rather than being just one of
the pantheon of interests covered by the Royal Society. Sweet (2004) sees this as the ‘Battle of the Books’
between the ‘ancients’ and the ‘moderns’, which by the early 18th century saw an established difference
between history which was rooted in the study of texts, inscriptions and coins, and antiquarians who
rather than partaking in “gentlemanly learning”, i.e. the Classical texts, were concerned with record-
ing and analysis in the belief that understanding could be gleaned from the material remains themselves
(Sweet, 2004, p. 8).
This move towards an antiquarianism as the foundation for archaeology is nowhere better demon-
strated than by Sir Richard Colt Hoare’s The Ancient History of Wiltshire (1812 and 1821). Colt Hoare’s
methodology, as well as the resulting publications, are important for being the first integration of large-­
scale landscape survey with systematic targeted excavation. Colt Hoare describes himself as an “historian
and topographer” and his intentions are clear from the start, ‘we speak from facts not theory’ he claims in
the Introduction. As with earlier works, the maps and plans were integrated with rich textual description
and some novel techniques were favoured, for example the three-­dimensional plan of a barrow group
justified by:

a large group of twenty-­seven tumuli, which, being so thickly clustered, could not be numbered
sufficiently distinct on the general map: I have therefore had them engraved on a separate plate,
which will explain, better than any verbal description, the different forms of the barrows which
compose this group.
(Colt Hoare, 1812, p. 121)

The period up until 1840 has been characterised as ‘speculative’ in the development of American
archaeology (Willey & Sabloff, 1993) and mirrors the speculation witnessed in Europe but with the
added challenge of explaining the Native Americans, who they were and where they had come from.
Replacing the early descriptive writings following the first European contact, by the opening of the
19th century explorers and travellers crossing North America were systematically recording the topog-
raphy, flora and fauna of the new landscapes that confronted them. As in Europe this ‘natural scientific’
approach often included archaeology and a major focus at this time was the mounds and earthworks
west of the Appalachians, especially in Ohio and surrounding areas, which gave rise to the “mound
builder debate” (Trigger, 1989, p. 104). The essence of this was whether the Native Americans encoun-
tered at that time were descendants of the mound builders or whether a ‘lost race’ had built them, although
the importance in terms of spatial understandings is that the monuments provided complex cultural
remains in the form of earthworks which could be mapped. The first creditable attempt at this was
the work of Caleb Atwater (1820) a local postmaster in Ohio who surveyed many of the monuments
and produced accurate scale plans with descriptive detail, an impressive achievement considering the
almost total lack of a precedent.
His important paper published in 1820 was in the first volume of the Transactions of the newly
formed American Antiquarian Society and marked the transition from the speculative to the classificatory-
descriptive period (1840 to 1914) (Willey & Sabloff, 1993), epitomised by the Society’s call in 1799 for
the recording of archaeological remains by “accurate plans, drawings and descriptions” (Willey & Sabloff,
1993, p. 32). Although Atwater’s interpretations were completely fanciful and favoured more advanced
mound builders who had since left the area rather than the existing local groups, his recording was
insightful and certainly responded to the Society’s call for accuracy. Twenty-­five years after Atwater’s pub-
lication, Ephraim G. Squier and Edwin H. Davis began systematic fieldwork which would build on this
and result in what remains today the primary source for these monuments (Squier & Davis, 1848¹). The
quality of their field recording far surpassed anything previous, combining scaled plans, cross-­sections, the
integrated recording of landscape and archaeological features, and two-­dimensional attempts at represent-
ing elevation and topography.
In both North America and Europe through the first half of the 19th century the collection of infor-
mation, the recording and mapping of sites and the collection of artefacts, increased rapidly thus creating
the need for more ordered ways of classifying the material. The need for temporal refinement increased
in importance as it became obvious that the evidence spanned long periods of time that needed to be
sub-­divided and ordered, and between the years 1820 and 1870 much endeavour in Europe, and Scan-
dinavia in particular, was focussed on chronology (Gräslund, 1987). This included both methodology
and the resulting chronological schemes with the development of the 3-­Age system and its refinement
and application by Christian Thomsen, Jens Worsaae, Oscar Montelius and others laying the surviving
foundations for prehistoric archaeology world-­wide. Although the emphasis of these developments lay
heavily on the empirical study of large quantities of artefacts, there was an important spatial element to
the work. Thomsen’s original method, and Worsaae’s subsequent confirmation through fieldwork, relied
on accurate recording of ‘find associations’, the spatial relationships of finds within excavated contexts.
At the landscape scale the spatial implication was that sites could now be relatively dated through the
establishment of typologies, either of the sites themselves or through artefacts associated with them.
The decades either side of the opening of the 20th century on both sides of the Atlantic saw the
professionalisation of archaeology with the opening of museums and university departments and the
resulting establishment of networks for communicating ideas through travel, meetings and publications.
Chronological schemes were well developed by now, for example, Worsaae’s for much of European
prehistory, a remarkable feat based on extensive travel and the empirical study of many thousands of
artefacts. Within this milieu the concept of ‘cultural groups’, ‘cultural units’ or ‘cultural stages’ developed,
spread and was modified in various ways although in general terms it represented what became known as
‘culture-­history’, characterised as a shift of interest from the artefacts to the people who made and used
them; a shift from establishing chronology to writing history (Trigger, 1989, Chapter 5). Although the
term ‘culture’ was applied in American archaeology at this time it was ill-­defined and not an accepted
methodological tool (Willey & Sabloff, 1993, p. 89), the main developments in this respect took place in
Europe and especially through the work of Vere Gordon Childe (Green, 1981).
Childe used the idea of cultural groups combined with Montelius’s typologies and through an
extensive program of travelling and recording artefacts and sites produced the first synthesis of Euro-
pean prehistory (1925). His approach offered little in terms of spatial representation as it is based on
severe space-­time reductionism; any spatial considerations are in the accompanying text and occasional
distribution maps and these are usually at the regional scale so lack detail. Space is conceptualised as a
passive background across which people, ideas and cultures diffuse and migrate. On both continents the
Childean reductionist chart with time categorised along one axis and space along the other, together with
the associated writing of culture-­history, had a profound and lasting effect. Gordon Willey’s Introduction to
American Archaeology (1966) and Christopher Hawkes’ ABC of the British Iron Age (1959) are both based
on such spatial minimalism and the idea of an ‘archaeological culture’ still lingers in some quarters.
Incorporated into these early approaches are two important elements, firstly possible relationships
between past people and their environment, an ecological focus, and secondly how spatial differentiation
and spatial relationships can tell something about human relationships, a social focus. This theme has
developed into what we today call ‘landscape archaeology’ and its importance within the discipline is clear
throughout this book; this is landscape as a spatial metaphor.
The foundations for the representation and understanding of spatial distributions and relationships
through the use of distribution maps were established by antiquarians although it was in the first half of
the twentieth century that the technique was developed to incorporate added interpretative power. One
of the pioneers of this ‘geographical’ approach was Cyril Fox who is best known for his Personality of
Britain (Fox, 1959, originally 1932). Fox argued that geology, topography, form of coastline, climate and
vegetation all combine to have a profound effect on the areas of occupation and cultural attributes of
people; in effect this was historical geography. These enriched distribution maps provided the interpre-
tative power for Fox especially through the addition of temporal sequence added by phased maps thus
enabling comparative change through time based on spatial comparisons. Fox himself recognised various
issues within spatial thinking of the time so that, for example, ‘massed maps’ use a variety of symbols to
convey ‘cultural complexity’ although it is the ‘resultant patterns that are of interest’ and when detail is
important to the argument a ‘special map’ is provided, for example of a specific pottery type. Fox’s own
words illustrate how a distribution map, a relatively simple form of spatial representation, can be used to
construct a sensitive appreciation of human spatiality through imaginative spatial reasoning:

the keys to understanding [the landscape] were the major landmarks; our traveller steered his way
along plateau and spur, past barrow and cairn and stone circle, by the sight of successive mountain
tops. So guided he reached his goal . . . .
(Fox, 1959, p. 91 (originally 1932)).

Central to developments in North America were Julian Steward (1950) and his ideas and methodolo-
gies of ‘cultural ecology’ and ‘area studies’ (Kerns, 2003). As an example of this approach, and of great
importance in the development of spatial thinking linked to large-­scale fieldwork, especially the organ-
isation and collection of survey data and its interpretation, is the Virú Valley Project, Peru (Willey, 1953,
1974). The thinking behind this project, its methodologies and interpretive framework have had a lasting
influence on both sides of the Atlantic. It was innovative in many ways, indeed Gordon Willey himself
described it as ‘experimental’, although at the time he did not realise its potential. While chronol-
ogy and ‘pottery sequences’ were still the accepted focus of fieldwork, the objectives here were ambitious
and innovative from the outset with the intention of identifying individual sites, recording each one and
reconstructing past landscapes so as to: “reconstruct cultural institutions as reflected in settlement configu-
rations”, with “settlement patterns” defined as “the way man [sic] disposed himself over the landscape”
as shown by dwellings and community life reflected by environment, technology and the “institutions
of social interaction” (Willey, 1953). The relationships between settlements, pyramids and cemeteries, for
example, are represented by arrows and dotted lines indicating social groups while the overall limit for
agricultural activity is also delineated. Spatial relationships between sites are no longer a descriptive extra
to understanding each individual site but are central to the wider focus of landscape and the understand-
ing of social and cultural life at different scales. Comparing this to, for example, Squier and Davis, it
can be seen that spatial thinking is developing into a more formal spatial analysis that goes beyond the
inherent locational relationships.
As early as 1948 Walter Taylor’s book A Study of Archaeology was suggesting that archaeology needed to
go beyond the collecting and ordering of data and to some extent predicted the positivism of the 1960s
by arguing for interpretation based on the repeated reworking of hypotheses. It was Willey and Phillips
(1958, although this is based on two papers from 1953 and 1955) who built on Taylor’s proposals and
argued for theory based on rigorous methodology as the way forward for archaeology stating that the
discipline lacked ‘a systematic body of concepts and premises constituting archaeological theory’ (Willey &
Phillips, 1958, p. 1, their emphasis). They suggest that archaeology should be more science than history
and there is a need for cross-­cultural generalisations so that rather than the existing concern solely with
“the nature and position of unique events in space and time”, its ultimate purpose should be “the discov-
ery of regularities that are in a sense spaceless and timeless” (Willey & Phillips, 1958, p. 2).

Part 3 – from spatial analysis to spatial narrative and back again


While very little of Willey and Phillips’s proposals and examples was explicitly spatial, the formalisation
of their methodology and the call for explicit theoretical discussion heralded a new way of thinking that
was to affect all areas of archaeology. The 1960s and 1970s was a period which took notice of Willey
and Phillips and introduced and developed revolutionary new methodologies, techniques and theoreti-
cal frameworks on both sides of the Atlantic. These encompassed thinking about archaeology generally,
including space, spatial analysis and spatio-­cultural interpretations. Not least in terms of influence was
the work of Lewis Binford who rejected the subjectivity of culture-­historical approaches and advocated
analysis and interpretation based on the philosophy and methods of the natural sciences (1964, 1965).
This chimed with the prevailing positivism of the 1960s in the West, a future fuelled by nuclear power,
the unbridled promise of computer technology and, of course, in archaeology the interpretative power
offered by the new radiocarbon dating. Rapidly subsumed within these new approaches was (General)
Systems Theory where the ‘system’ to be modelled was some unit of culture, for example a village or
a hillfort, defined by a series of social and cultural variables within sub-­systems, such as religion, mate-
rial culture and economy, which could be affected by positive and negative feedback loops. Underlying
this was a reliance on statistics and computers (Hymes, 1965) not least because these systems could be
‘activated’ as simulations offering a range of ‘solutions’ which could never be reached otherwise (Doran,
1970). Aspects of probability theory, data analysis and classification were covered in mathematical detail
for the first time for an archaeological audience by Doran and Hodson (1975). It is noticeable within this
work how little is overtly spatial other than a critique of spatial models which they claim are “too simple
and limited in scope to do much more than add an air of objective rigour . . . .” (Doran & Hodson, 1975,
p. 292). An important, and lasting, element of this new way of thinking was the introduction of formal
(probability) sampling (Redman, 1974) which epitomised the claim to objectivity compared to subjective
judgement sampling which had prevailed previously.
A doyen of these new approaches in Britain was David Clarke whose early book Analytical Archaeol-
ogy (1968) starts as it means to go on: “Archaeology is an undisciplined empirical discipline. A discipline
lacking a scheme of systematic and ordered study based upon declared and clearly defined models and
rules of procedure” (Clarke, 1968, p. xv). What follows is a detailed exploration of systems theory applied
to different scales of cultural unit. While the spatial component is represented by often minimalist dia-
grams, the underlying interpretative theory is explained in depth in the text. Spatial data, relationships and
interpretative concepts are often represented through models and modelling which are implicit within
Clarke’s approach although it was four years later that these were made explicit through his collection
of papers in Models in Archaeology (1972a). This reinforced archaeology’s long-standing relationship with
geography, not least their parallel claimed revolutions in quantification and geographical spatial thinking;
the volume was a direct analogue of Chorley and Haggett’s Models in Geography (1967), whose editors saw models as “. . .
constituting a bridge between the observational and theoretical levels” (Chorley & Haggett, 1967, p. 24).
Typical of the approaches within Clarke’s book are the formal locational modelling techniques based
on forms of economic theory, particularly from the German geographical tradition. For example, Hod-
der’s (1972) analysis of Romano-­British settlement uses Central Place Theory and Thiessen polygons to
interrogate the spatial relationships between different sized settlements based on the ‘services’ they pro-
vided. Ellison and Harriss (1972) apply Site Catchment Analysis to the location of sites within a landscape
wherein the ‘resources’ have been classified according to their agricultural potential. By quantifying the
resources around sites of different periods shifts between pastoral and arable could be claimed. Clarke’s
own paper in the volume (1972b) is an innovative approach to the analysis of a single settlement, the
Iron Age village of Glastonbury, important for its multi-­scalar approach. Starting with ‘modular units’
of a single house and associated structures, the analysis then moves to the site’s area and then the region
bringing in added spatial and attribute data. A temporal element is added through an ‘economic cycle
model’ which represents the annual agricultural cycle based on an infield, outfield and wasteland. There
is no doubt that these approaches meet the criteria of spatial thinking by providing a conceptual and
analytical framework, by providing analysis and communication alongside the manipulation, interpreta-
tion and explanation of spatial data. These elements are brought together in Clarke’s influential Spatial
Archaeology (1977a), where in the opening chapter (Clarke, 1977b) he presents a thorough explication
of spatial theories and methods of the time incorporating four underlying general theories that attempt
to move beyond description to explanation: anthropological, economic, social physics and statistical. He
draws an important distinction between ‘quasi-­deductive’ non-­formal spatial approaches which are based
on empirical visual interpretation of patterning, and formal modelling based on quantitative analysis.
Interestingly though, even within this strongly positivist framework Clarke does offer a word of caution,
“the underlying theory has been criticised as too ideal in its disregard for non-­economic factors and the
fact that ‘cost’ is at least in part a culturally conditioned and relative threshold” (Clarke, 1977b, p. 19).
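To give a concrete flavour of the locational modelling described above, the following minimal sketch derives Thiessen territories of the kind Hodder (1972) employed, here for a handful of invented settlement coordinates using Python with the SciPy library; the site locations and map units are purely illustrative and are not drawn from his study.

```python
import numpy as np
from scipy.spatial import Voronoi

# Hypothetical settlement coordinates (arbitrary map units)
sites = np.array([[2.0, 3.0], [5.0, 8.0], [9.0, 2.5],
                  [6.5, 4.0], [1.5, 7.5]])

vor = Voronoi(sites)

# Each site's Thiessen territory is the set of locations closer to it
# than to any other site; an index of -1 marks a region left unbounded
# at the margin of the point scatter.
for i, region_index in enumerate(vor.point_region):
    region = vor.regions[region_index]
    if -1 in region:
        print(f"Site {i}: unbounded territory at the study-area edge")
    else:
        vertices = vor.vertices[region]
        print(f"Site {i}: territory polygon with {len(vertices)} vertices")
```

In a real analysis the territories would be clipped to the study region and confronted with independent evidence such as site hierarchies or ‘services’; the bare geometry nonetheless expresses the model’s central assumption, namely that territories are defined by proximity alone.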
An important volume which in many ways represents the zenith of spatial quantification during the
1970s is Hodder and Orton’s Spatial Analysis in Archaeology (1976). Acknowledging the importance of
the distribution map as the accepted method of displaying locations and relationships between them,
the book proceeds through a wide range of statistical techniques that extend their interpretative power
based on hypothesis testing possibilities – “the methods aid the testing of hypotheses about spatial pro-
cesses, allow large amounts of data to be handled, and enable predictions to be made about the location,
importance and functioning of sites” (Hodder & Orton, 1976, p. 241). The importance of this volume
is that, in comparison to the others mentioned above, it offers an effective ‘how to do it’ manual which
is light on theoretical discussion and justification, making it a popular choice for working archaeologists.
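To illustrate the kind of hypothesis-testing technique the book assembles, the following minimal sketch computes the classic Clark and Evans nearest-neighbour ratio for a set of invented point coordinates (Python with NumPy and SciPy). Edge corrections, such as Donnelly’s, are omitted for brevity, and nothing here reproduces Hodder and Orton’s own worked examples.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(42)
points = rng.uniform(0, 10, size=(50, 2))  # 50 hypothetical site locations
area = 10.0 * 10.0                         # invented study-area size

# k=2 so that each point's nearest neighbour excludes the point itself
distances, _ = cKDTree(points).query(points, k=2)
observed_mean = distances[:, 1].mean()

# Expected mean nearest-neighbour distance under complete spatial randomness
density = len(points) / area
expected_mean = 1.0 / (2.0 * np.sqrt(density))

R = observed_mean / expected_mean
print(f"Clark-Evans R = {R:.2f} (R < 1 clustered, R ~ 1 random, R > 1 dispersed)")
```

Such a ratio is a prompt rather than an answer: a value departing from 1 invites questions about the generative processes, and the taphonomic filters, that produced the pattern.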
By the early 1980s cracks were beginning to appear in the acceptance and validity of the quantitative
revolution in both archaeology and geography. In the rush towards post-­modernist humanism, the scien-
tific method and its valorised claims of objectivity gave way to a wide range of approaches and theoretical
stances based on more subjective and person-­centred understandings of space, place and how to interpret
spatial relationships. Rather than take a strictly chronological approach to these developments, we would
instead like to identify a number of generic trends that emerged from the complex web of sometimes
tensioned and contradictory approaches that have been grouped under labels such as post-­processualism,
post-­structuralism, phenomenology, non-­representational theory and, most recently, new materialism
(Thomas, 2015).
Perhaps the most striking element of this critique was the explicit questioning of the ‘space’ that lay at
the heart of the spatial archaeology that had flourished under the aegis of the New Archaeology. Despite
its professed reaction to the culture historical paradigm it sought to disrupt, when it came to space, the
work of both the culture historians and processualists was predicated upon an uncritical acceptance of
the ontological status of space as a self-­evident given. This served as a universal, a-­priori container (in a
Kantian sense) with a 3-dimensional geometry or an inert 2-dimensional background that could be
measured using standard units. This Cartesian space was neutral and external – the space of the Meso-
lithic was identical to the space of the Bronze Age, which was identical to the space of the Roman period
etc. Crucially, they were all identical to the space of the modern archaeologist. Patterns in this universal
space could be measured and analysed creating a link to the activities that originally created them. These
patterns may well have been distorted by a range of taphonomic and formation processes, but space itself
had no role to play, it had no agency.
The critiques initiated by post-­processualism posed a simple, yet radical, question. What if space is not a
universal, a-­priori backdrop to human existence? What if it is not (and never has been) a mute canvas upon
which past activities left tell-­tale traces that could be mapped and statistically summarised? What if the
space of the Mesolithic was different to the space of the Bronze Age, which was different to the space of the
Roman period etc. and all were different to the space of the modern archaeologist? Further, if space was not
immutable and neutral, then at any given time rather than a single monolithic space, there may have been
a host of competing spaces differentiated on grounds of politics, gender and authority. Spaces could also
be active and agential, exerting power through deft manipulation and careful configuration (e.g. Foucault’s
“heterotopia” (1986)). A growing interest in Phenomenology led to an active concern with embodied
spatial understandings (rather than the kinds of spatial knowledge gained through abstract representations)
in which knowledge of the world emerged through meaningful engagement within it (Tilley, 1994). This
led to a broader questioning of the role and necessity of traditional archaeological maps. Why position
such abstract representational schema between archaeologists and the materials they sought to study?
In this context, anthropological and ethnographic studies (many of which themselves drew inspira-
tion from Phenomenology) shed light upon a range of non-­western spatial understandings that could
serve as analogues, or points of departure, for interpreting the pre-­modern worlds of much archaeologi-
cal enquiry. They also questioned the reality of modernity’s claims on space as a monolithic, universal,
a-­priori container by revealing the sheer complexity of spatial understandings that actually exist in the
modern west (Rappaport, 1994) and the myriad ‘species of space’ through which we live our own lives
(Perec, 1997). Emerging from this work was a shift in focus from space to place. Places were portrayed as
key locales, rich in social meaning and significance, around which everyday life was anchored (Cresswell,
2004; Feld & Basso, 1996). Archaeologists had spent years marking dots (meaningful places) on maps with
space providing the situational context. In the ‘platial’ archaeologies and anthropologies of the 1990s,
space was instead argued to gain its very existence from this configuration of places (Nixon, 2006). This
nodal interpretation of place, and humanist assumptions that locations could only become meaningful
once they had been drizzled with cultural significance, in turn came under scrutiny. Drawing inspiration
from the philosophical work of Deleuze and Guattari (1988, see also Bonta & Protevi, 2006), a static
nodal depiction of place was replaced by a more fluid and dynamic conceptualisation that identifies
places as tangles and knots in the ongoing flow of life (Ingold, 2011, pp. 145–155; Thrift, 1996). In such
work, places and spaces are profoundly emergent and relational – people do not create them by adding a
dollop of ‘culture’ to otherwise neutral locations, but instead it is the very relations that the location itself
is folded up within that allow significance to emerge.
In short, archaeological studies in the 1990s inspired by then contemporary spatial thinking in other
disciplines (e.g. social anthropology, sociology and human geography) questioned the very foundations of
the spatial archaeology project of the 1970s – namely that “lurking beneath the distribution of dots on a
map was a spatial process and causality to be discovered” (Tilley, 1994, p. 9). The result was a questioning
of the key tools used by spatial archaeologists, such as projected maps and the battery of formal statistical
and mathematical tools used to analyse them. If space was a fluid, emergent, profoundly relational and
highly contextual phenomenon, then the identification, representation and analysis of spatial patterns
posed significant challenges that in turn required new methods to address. However, whilst these theoreti-
cal critiques were undoubtedly powerful, they were not matched by any commensurate development of
new methodologies. Upon rejection of these formal approaches to a large extent a methodological void
was created. Although Tilley’s phenomenology is only one strand of post-processual thinking, his com-
ment applies generally – “there is and can be no clear-­cut methodology arising from [post-­processual
thinking] to provide a concise guide to empirical research” (Tilley, 1994, p. 11). As he explained later
(Tilley, 2008), formal methodology was not seen as being necessary as the aim was to provide a thick text
narrative, almost a return to the antiquarian’s chorography, that could be reinterpreted and re-­written
within an ongoing spiral of changing understanding. So here we have an important difference between
spatial analysis and spatial narrative both of which involve concepts, albeit very different, of space and
processes of reasoning. One key difference concerns the tools we use to represent spatial phenomena;
although spatial data underlie spatial narratives there is no interpretative requirement for them to be
explicitly depicted. Although this situation is undoubtedly changing, particularly with regard to the
potential of representational schema such as maps (see papers in Gillings, Hacıgüzeller & Lock, 2019), the
methodological developments in spatial archaeology that did take place alongside the theoretical ruptures
sketched above were largely computational and took their theoretical inspiration predominantly from
the New Archaeology. As a result rather than running hand-­in-­hand with this dynamically evolving
theoretical landscape, they ran in parallel.
In methodological terms, this most recent phase of spatial archaeology has been characterised by the
increasing importance of spatial technologies not least Geographic Information Systems (GIS). Since the
early days of GIS in archaeology (Allen, Green, & Zubrow, 1990) their potential has been recognised
within two important areas – firstly the management, integration and display of increasingly large and
complex forms of spatial data and, secondly, their potential within the area of spatial analysis. The rapid
adoption of GIS since the late 1980s, in various forms and in various ways, has had a major impact on
archaeology such that their use is now almost taken for granted. Even so, the use of GIS in archaeology
always has been, and still is (see Howey & Brouwer Burg, 2017; Verhagen, 2018), somewhat contentious
at the theoretical level due to the branching developmental pathway alluded to above, although the attrac-
tions of the technology are usually seen to outweigh any restrictions or disadvantages. Whilst at their
most strident, these arguments centred on accusations of a return to positivism and the inability of the
technology to respond to humanist, subjective understandings of place and landscape, the last decade has
seen a slow but welcome convergence. This has been characterised by a willingness on the part of spatial
analysts to engage with developments in archaeological theory, and a new openness to the possibilities of
spatial representations such as in the case of maps (e.g. Aldred & Lucas, 2019) and the methodological
possibilities offered by technologies such as GIS on the part of theorists (e.g. Fowler, 2013).

Part 4 – thinking and analysing spatially


As should by now be clear, the techniques discussed in the following twenty-­three chapters represent the
culmination of some 300 or so years of spatial investigation within archaeology. In addition, whilst they
are ‘scientific’ insofar as they involve the rigorous application of precise and repeatable methodologies,
their application does not presuppose (or dictate) a single theoretical position. That is, put simply, the
use of Cartesian co-­ordinates does not have to signal the uncritical adoption of a modernist ontology.
Fortunately, archaeology’s long and detailed engagement with space and spatiality has meant strong levels
of spatial awareness, knowledge and literacy on the part of its practitioners and it is to the further devel-
opment and refinement of this that the current volume is directed.
If we needed to sketch some of the characteristics of effective spatial thinkers, the criteria identified
by the U.S. National Academies (2006, p. 20) offer a useful place to start:

1 They know where, when, how and why to think in spatial terms.
2 They practice spatial thinking in an informed way: they have a broad and deep knowledge of spatial con-
cepts and spatial representations, a command over spatial reasoning using a variety of spatial ways of
thinking and acting, and well-­developed capabilities for using supporting tools and technologies.
3 They adopt a critical stance insofar as they: can evaluate the quality of spatial data based on its source and
its likely accuracy and reliability; can use spatial data to construct, articulate, and defend a line of
reasoning or point of view in solving problems and answering questions; and can evaluate the valid-
ity of arguments based on spatial information.

The implications of this for archaeology are twofold. First, that spatial thinking is something that has
to be learnt, developed and supported. Second, that the use of spatial methods, techniques and tech-
nologies alone does not make the user a spatial thinker. The uncritical use of computer software, for
example, the push-­button solution of generating a viewshed, does not meet the three criteria above
for determining spatial intelligence (Lock & Pouncett, 2017). In helping researchers to develop such
an ability, the U.S. National Academies (2006, p. 12) also highlighted three fundamental elements, and
we have taken inspiration from this threefold schema in structuring the volume as a whole, as well as the
individual chapters that make it up:

1 Concepts of space: providing the conceptual and analytical framework within which data can be integrated,
related and structured into a whole – as already mentioned, the issue here is often characterised as the
difference between ‘space’ and ‘place’, two very different concepts. The former is considered to be
objective, a blank background or container within which human action takes place. The fact that
these actions can be mapped, measured and analysed within a co-ordinate system is largely unquestioned
and taken as given. ‘Place’ on the other hand is a culturally constituted locale embedded with mean-
ing through the human actions and experiences that happen there (Cresswell, 2004); it is relative
compared to absolute space. It has long been recognised in geography and archaeology that spatial
technologies are designed to work with absolute objective space and, therefore, are often challenged
by concepts of place (Curry, 1998).
2 Tools of representation: providing the forms within which structured information can be stored, analysed, compre-
hended and communicated – the issue here is encapsulated within the post-­modern ‘crisis of representa-
tion’, including how to represent those aspects of human experience that cannot be ‘scientifically’
measured and plotted. The data structures of technologies such as GIS are based on spatial primitives
(point, line, polygon plus cell-­based coverages and attribute data). These are designed to represent a cer-
tain view of the world, one which is at odds with understandings not based on empirical objectivism.
3 Processes of reasoning: providing the means of manipulating, interpreting and explaining the structured
information – the issue here can be characterised as one of methodology, and equally a lack of it and
a need for it. Here we can differentiate between explicit and implicit methodology. What we can
call implicit methodology is particularly important within GIS and is incorporated into the general
critique of “technological determinism” (Huggett, 2000). Specifically, the procedures which under-
lie GIS operations involve pre-determined algorithms, for example line-of-sight, least-cost-path and
interpolating a digital elevation model (DEM). These incorporate ‘black box’ logics and algorithms
not immediately available to the user but fundamental in influencing the interpretation arrived at
and often chosen unknowingly (see the sketch following this list).
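
To show what such a ‘black box’ conceals, the minimal sketch below implements a deliberately simplified line-of-sight test over an invented elevation grid (Python with NumPy). Production GIS viewsheds differ in how they sample the terrain and whether they correct for earth curvature and refraction, which is precisely the point: those choices are made for, rather than by, the user.

```python
import numpy as np

def line_of_sight(dem, observer, target, observer_height=1.7):
    """Crude visibility test: sample cells along the straight line from
    observer to target and check whether any intermediate cell rises
    above the sight line."""
    (r0, c0), (r1, c1) = observer, target
    n = max(abs(r1 - r0), abs(c1 - c0))
    rows = np.linspace(r0, r1, n + 1).round().astype(int)
    cols = np.linspace(c0, c1, n + 1).round().astype(int)
    z0 = dem[r0, c0] + observer_height  # eye level above the observer cell
    z1 = dem[r1, c1]
    for step in range(1, n):
        sightline_z = z0 + (z1 - z0) * step / n  # sight-line height here
        if dem[rows[step], cols[step]] > sightline_z:
            return False  # intervening terrain blocks the view
    return True

# Invented 4 x 4 elevation grid (metres); the 30 m peak sits on the diagonal
dem = np.array([[10, 10, 10, 10],
                [10, 30, 12, 12],
                [10, 12, 12, 12],
                [10, 10, 10, 11]], dtype=float)
print(line_of_sight(dem, (0, 0), (3, 3)))  # -> False: the peak intervenes
```

Every design decision here – the sampling of the line, the observer height, the strict inequality – has an interpretative consequence, and each is typically hidden behind a single menu command in a GIS.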

An objective of the current volume, albeit ambitious, is to make archaeologists better spatial thinkers,
and as a result better spatial analysts. It is concerned with formal techniques of spatial analysis. This is a
poorly defined term that is sometimes conflated with spatial thinking, although to be represented herein
it must involve precise and repeatable methodology, these days usually computer-­based. Spatial analysis is
subsumed within spatial thinking and, as the following quote demonstrates, cannot be divorced from the
interplay between quantitative and qualitative approaches, formal and informal aspects:

Spatial analysis exists at the interface between the human and the computer, and both play important
roles. The concepts that humans use to understand, navigate, and exploit the world around them are
mirrored in the concepts of spatial analysis. So [a comprehensive discussion on spatial analysis] will
often appear to be following parallel tracks – the track of human intuition on the one hand, with all
its vagueness and informality, and the track of the formal, precise world of spatial analysis on the other.
(de Smith, Goodchild, & Longley, 2018; our emphasis).

More specifically, the objective of the current volume is to encourage spatial thinking on the part of
archaeologists by detailing a range of contemporary spatial analytical techniques in as accessible a fash-
ion as possible. The remit given to the authors was to structure their chapter in a clear and consistent
fashion. Each starts with an introduction to the technique, explaining why it is important and how it
has been used before. The next section in each chapter explains the methodology, followed by one or
more case-­studies applying the technique while the conclusion indicates future directions and possibili-
ties. Whilst the nature of some of the topics made it more difficult to follow this structure to the letter
than others, we think the results have been worth the effort, providing an accessible summary of each
technique alongside relative inter-­chapter harmony. The chapters are embedded within contemporary
practice and are, therefore, to a large extent computer-­based and quantitative, although we avoid lengthy
discussions on software solutions which would rapidly age the book since they tend to come and go
quickly. What is crucial to stress is that the quantitative focus of the book does not indicate a return to
the positivism described above but rather a maturing, realistic acceptance of the importance and poten-
tial of these methods and related technologies. We believe that the application of quantified, formal
methods can be used in an exploratory way – not least for developing more nuanced understandings
of space and spatiality – and that any results produced are not ‘the answer’ but rather the starting point
for a process of interpretation and understanding of the past. As a final note, we hope that the methods
explained in this book will encourage and guide archaeologists in undertaking spatial analysis for some
years to come.
Note
1 This work was re-published in 1998 by the Smithsonian Institution as a 150th Anniversary Edition with an extensive
introduction by D. J. Meltzer.

References
Aldred, O., & Lucas, G. (2019). The map as assemblage: Landscape archaeology and mapwork. In M. Gillings,
P. Hacıgüzeller, & G. Lock (Eds.), Re-­mapping archaeology: Critical perspectives, alternative mappings (pp. 19–36).
London: Routledge.
Allen, K. M. S., Green, S. W., & Zubrow, E. B. W. (Eds.). (1990). Interpreting space: GIS and archaeology. London:
Taylor and Francis.
Atkinson, P., & Duffy, D. M. (2019). Seeing movement: Dancing bodies and the sensuality of place. Emotion, Space
and Society, 30, 20–26.
Atwater, C. (1820). Description of the antiquities discovered in the state of Ohio and other western states. Transac-
tions and Collections of the American Antiquarian Society, 1, 105–267. (1997 re-­published by Arthur W. McGraw).
Binford, L. R. (1964). A consideration of archaeological research design. American Antiquity, 29, 425–441.
Binford, L. R. (1965). Archaeological systematics and the study of culture process. American Antiquity, 31(2), 203–210.
Boast, R. (2009). The formative century, 1860–1960. In B. Cunliffe, C. Gosden, & R. Joyce (Eds.), The Oxford hand-
book of archaeology (pp. 47–70). Oxford: Oxford University Press.
Bodenhamer, D. J., Corrigan, J., & Harris, T. M. (Eds.). (2010). The spatial humanities: GIS and the future of humanities
scholarship. Bloomington: Indiana University Press.
Bonta, M., & Protevi, J. (2006). Deleuze and geophilosophy. Edinburgh: Edinburgh University Press.
Chandler, J. (1993). John Leland’s Itinerary: Travels in Tudor England. Stroud: Alan Sutton Publishing.
Childe, V. G. (1925). The dawn of European civilization. London: Kegan Paul.
Chorley, R., & Haggett, P. (1967). Models in geography. London: Methuen.
Clarke, D. (1968). Analytical archaeology. London: Methuen.
Clarke, D. (Ed.). (1972a). Models in archaeology. London: Methuen.
Clarke, D. (1972b). A provisional model of an Iron Age society and its settlement system. In D. Clarke (Ed.), Models
in archaeology (pp. 801–869). London: Methuen.
Clarke, D. (Ed.). (1977a). Spatial archaeology. New York: Academic Press.
Clarke, D. (1977b). Spatial information in archaeology. In D. Clarke (Ed.), Spatial archaeology (pp. 1–32). New York:
Academic Press.
Colt Hoare, R. (1812 and 1821). The ancient history of Wiltshire (2 Vols.). Republished 1975 by EP Publishing and
Wiltshire County Library.
Conneller, C. (2011). An archaeology of materials. London: Routledge.
Cresswell, T. (2004). Place: A short introduction. Oxford: Blackwell Publishing.
Curry, M. R. (1998). Digital places: Living with geographic information technologies. London: Routledge.
Dalakoglou, D., & Harvey, P. (2012). Roads and anthropology: Ethnographic perspectives on space, time and (Im)
mobility. Mobilities, 7(4), 459–465.
Deleuze, G., & Guattari, F. (1988). A thousand plateaus. London: Athlone Press.
de Smith, M., Goodchild, M., & Longley, P. (2018). Geospatial analysis: A comprehensive guide (6th ed.). Retrieved May
2019, from www.spatialanalysisonline.com/
Doran, J. (1970). Systems theory, computer simulations and archaeology. World Archaeology, 1, 289–298.
Doran, J., & Hodson, F. (1975). Mathematics and computers in archaeology. Edinburgh: Edinburgh University Press.
Ellison, A., & Harriss, J. (1972). Settlement and land use in the prehistory and early history of southern England:
A study based on locational models. In D. Clarke (Ed.), Models in archaeology (pp. 911–962). London: Methuen.
Feld, S., & Basso, K. H. (Eds.). (1996). Senses of place. Santa Fe: SAR Press.
Foucault, M. (1986). Of other spaces. Diacritics, 16(1), 22–27.
Fowler, C. (2013). The emergent past. Oxford: Oxford University Press.
Fox, C. (1959). The personality of Britain: Its influence on inhabitant and invader in prehistoric and early historic times (4th
ed., originally published 1932). Cardiff: National Museum of Wales.
Gillings, M., Hacıgüzeller, P., & Lock, G. (Eds.). (2019). Re-­mapping archaeology: Critical perspectives, alternative mappings.
London: Routledge.
Gräslund, B. (1987). The birth of prehistoric chronology. Cambridge: Cambridge University Press.
Green, S. (1981). Prehistorian: A biography of V. Gordon Childe. Bradford-­on-­Avon: Moonraker Press.
Hacıgüzeller, P. (2017). Archaeological (digital) maps as performances: Towards alternative mappings. Norwegian
Archaeological Review, 50(2), 149–171.
Hammer, E. (2014). Local landscape organization of mobile pastoralists in southeastern Turkey. Journal of Anthropo-
logical Archaeology, 35, 269–288.
Hawkes, C. (1959). The ABC of the British Iron Age. Antiquity, 33, 170–182.
Hodder, I. (1972). Locational models and the study of Romano-­British settlement. In D. Clarke (Ed.), Models in
archaeology (pp. 887–910). London: Methuen.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. New Studies in Archaeology, 1. Cambridge: Cambridge
University Press.
Howey, M. C. L., & Brouwer Burg, M. (2017). Assessing the state of archaeological GIS research: Unbinding analyses
of past landscapes. Journal of Archaeological Science, 84, 1–9.
Huggett, J. (2000). Computers and archaeological culture change. In G. Lock & K. Brown (Eds.), On the theory and
practice of archaeological computing (pp. 5–22). Oxford: Oxford University Committee for Archaeology.
Hunter, M. (1975). John Aubrey and the realm of learning. London: Duckworth.
Hymes, D. (Ed.). (1965). The use of computers in anthropology. London: Mouton.
Ingold, T. (1993). The temporality of landscape. World Archaeology, 25(2), 152–174.
Ingold, T. (2011). Being alive: Essays on movement, knowledge and description. London: Routledge.
Kerns, V. (2003). Scenes from the high desert: Julian Steward’s life and theory. Urbana: University of Illinois Press.
Knappett, C. (2011). Networks of objects, meshworks of things. In T. Ingold (Ed.), Redrawing anthropology: Materials,
movements, lines (pp. 45–63). Farnham: Ashgate.
Lefebvre, H. (1991). The production of space. Oxford: Blackwell’s.
Lock, G., & Pouncett, J. (2017). Spatial thinking in archaeology: Is GIS the answer? Journal of Archaeological Science,
84, 129–135. https://doi.org/10.1016/j.jas.2017.06.002
Löwenborg, D. (2018). Knowledge production with data from archaeological excavations. In I. Huvila (Ed.), Archae-
ology and archaeological information in the digital society (pp. 37–53). Abingdon: Routledge.
Lucas, G. (2012). Understanding the archaeological record. Cambridge: Cambridge University Press.
McCormack, D. P. (2008). Geographies for moving bodies: Thinking, dancing, spaces. Geography Compass, 2(6),
1822–1836.
McFarlane, C. (2011). The city as assemblage: Dwelling and urban space. Environment and Planning D: Society and
Space, 29(4), 649–671.
Nixon, L. (2006). Making a landscape sacred. Oxford: Oxbow.
Perec, G. (1997). Species of spaces and other pieces. London: Penguin.
Peterson, R. (2003). William Stukeley: An eighteenth century phenomenologist? Antiquity, 77(296), 394–400.
Piggott, S. (1985). William Stukeley: An eighteenth century antiquarian (2nd ed.). London: Thames and Hudson.
Rappaport, A. (1994). Spatial organization and the built environment. In T. Ingold (Ed.), Companion encyclopedia of
anthropology (pp. 460–502). London: Routledge.
Redman, C. L. (1974). Archaeological sampling strategies. Modular Publications in Archaeology, 55. New York:
Addison-­Wesley.
Resina, J. R., & Wulf, C. (2019). Repetition, recurrence, returns: How cultural renewal works. Lanham, MD: Lexington
Books.
Richards-­Rissetto, H., & Landau, K. (2014). Movement as a means of social (re)production: Using GIS to measure
social integration across urban landscapes. Journal of Archaeological Science, 41, 365–375.
Roberts, L. (2012). Mapping cultures: A spatial anthropology. In L. Roberts (Ed.), Mapping cultures: Place, practice,
performance (pp. 1–25). Houndmills, Basingstoke, Hampshire and New York: Palgrave Macmillan.
Schnapp, A. (1996). The discovery of the past. London: British Museum Press.
Seigworth, G. (2000). Banality for cultural studies. Cultural Studies, 14(2), 227–268.
Snead, J. E., Erickson, C. L., & Darling, J. A. (2009). Landscapes of movement: Trails, paths, and roads in anthropological
perspective. Philadelphia: University of Pennsylvania Museum of Archaeology and Anthropology.
Soja, E. (1996). Thirdspace: Journeys to Los Angeles and other real and imagined places. Oxford: Blackwell’s.
Spaulding, A. (1960). The dimensions of archaeology. In G. E. Dole & R. L. Carneiro (Eds.), Essays in the science of
culture in honor of Leslie A. White (pp. 437–456). New York: Crowell.
Squier, E. G., & Davis, E. H. (1848). Ancient monuments of the Mississippi Valley. Smithsonian Contributions to Knowl-
edge, 1. Washington. (Re-­published in 1998 with an extended Introduction by D. J. Meltzer).
Steward, J. (1950). Area research: Theory and practice. New York: Social Science Research Council.
Sweet, R. (2004). Antiquaries: The discovery of the past in eighteenth century Britain. London: Hambledon and London.
Thomas, J. (2015). The future of archaeological theory. Antiquity, 89(348), 1287–1296.
Thrift, N. J. (1996). Spatial formations. London and Thousand Oaks, CA: Sage.
Thrift, N. J. (2008). Non-­representational theory: Space, politics, affect. London: Routledge.
Tilley, C. (1994). The phenomenology of landscape: Places, paths and monuments. Oxford: Berg.
Tilley, C. (2008). Phenomenological approaches to landscape. In B. David & J. Thomas (Eds.), Handbook of landscape
archaeology (pp. 271–276). Walnut Creek: Left Coast Press.
Trigger, B. G. (1989). A history of archaeological thought. Cambridge: Cambridge University Press.
Trigger, B. G. (1998). Sociocultural evolution: Calculation and contingency. Malden, MA: Blackwell Publishers.
US National Academies. (2006). Learning to think spatially: GIS as a support system in the K-­12 curriculum. Washington:
The National Academies Press. Retrieved March 2019, from www.nap.edu/read/11019/chapter/1
Verhagen, P. (2018). Spatial analysis in archaeology: Moving into new territories. In C. Siart, M. Forbriger & O.
Bubenzer (Eds.), Digital geoarchaeology: New techniques for interdisciplinary human-­environmental research (pp. 11–25).
Cham: Springer International Publishing.
Verhagen, P., Nuninger, L., & Groenhuijzen, M. R. (2019). Modelling of pathways and movement networks in
archaeology: An overview of current approaches. In P. Verhagen, J. Joyce, & M. R. Groenhuijzen (Eds.), Finding
the limits of the limes: Modelling demography, economy and transport on the edge of the Roman Empire (pp. 217–249).
Cham: Springer International Publishing.
Warf, B., & Arias, S. (2009). The spatial turn: Interdisciplinary perspectives. London and New York: Routledge.
Wickstead, H. (2019). Cults of the distribution map: Geography, utopia and the making of modern archaeology.
In M. Gillings, P. Hacıgüzeller, & G. Lock (Eds.), Re-­mapping archaeology: Critical perspectives, alternative mappings
(pp. 37–72). London: Routledge.
Willey, G. R. (1953). Prehistoric settlement patterns in the Virú Valley, Peru. Bulletin No. 155. Washington: Bureau of
American Ethnology.
Willey, G. R. (1966). An introduction to American archaeology. Englewood Cliffs, NJ: Prentice-Hall.
Willey, G. R. (1974). The Virú Valley settlement pattern study. In G. R. Willey (Ed.), Archaeological researches in retrospect
(pp. 149–178). Cambridge: Winthrop Publishers.
Willey, G. R., & Phillips, P. (1958). Method and theory in American archaeology. Chicago: University of Chicago Press.
Willey, G. R., & Sabloff, J. A. (1993). A history of American archaeology (3rd ed.). New York: W.H. Freeman and
Company.
Wood, D. (2010). Rethinking the power of maps. New York: The Guilford Press.
Wood, D. (2012). The anthropology of cartography. In L. Roberts (Ed.), Mapping cultures: Place, practice, performance
(pp. 280–303). Houndmills, Basingstoke, Hampshire and New York: Palgrave Macmillan.
2
Preparing archaeological data
for spatial analysis
Neha Gupta

Introduction
In the final pages of Spatial Analysis in Archaeology, Ian Hodder and Clive Orton (1976, p. 245) remarked
that the ‘slow collection of large bodies of reliable data, [. . .] will allow spatial processes to be better
understood’. This scholarly work made explicit spatial concepts in the field of archaeology, and drew
attention to cross-­disciplinary conversations that archaeologists can have with geographers and social
scientists. Published in 1976, Hodder and Orton’s remarks might seem simple and banal, yet they under-
score two key facets in archaeology that hold true in today’s digital data-­r ich environment; first, that
archaeologists will re-­use archaeological data that were collected by other scholars at different times,
who employed different methods, tools and technologies; and second, that for archaeology to engage in
meaningful conversations on complex phenomena, we must have ‘large bodies of reliable data’ (emphasis
mine) which can be understood as a reliable archaeological database (Gupta & Devillers, 2017, p. 857).
Hodder and Orton (1976, p. 244) further note that spatial analytic techniques go ‘hand-­in-­hand’ with
the ‘collection of better data’ and remark on the value of ‘very detailed information’ but they do not
explicitly describe what reliable means, and how research design and the goals of a particular project
are linked to data quality, or what role data quality might play in the analysis, interpretation and
re-­use of archaeological data.
Archaeologists increasingly face a scenario where the re-­use of archaeological data, particularly the
processing and analysis of digital archaeological data, is posing pointed challenges to the practice of
archaeology (Kansa, Kansa, & Arbuckle, 2014; Huggett, 2015). The social life of archaeological data
typically extends beyond specialists and is best understood in relation to local communities and society
as a whole. The changing relationship between archaeology and society is reflected in scholarship on
the abuse and misuse of archaeology (Silberman, 1989; Kohl & Fawcett, 1995; Meskell, 2005; Kohl,
Kozelsky, & Ben-­Yehuda, 2007), as well as on challenging inaccurate views of the human past (Wylie,
2002; Trigger, 2006). Archaeology (and archaeological data), therefore, can serve both public and
scholarly goals.
Recent interests in the preparation of archaeological data for further use, and the quality of archaeo-
logical data are influenced by two broader developments, namely; the growing use of digital and geo-
spatial tools and technologies in data acquisition (Dibble & McPherron, 1988; Levy & Smith, 2007); and
second, the exponential growth of communication tools, particularly Web 2.0 technologies that facilitate
collaboration and can encourage exchange and sharing of data between scholars, institutions and non-­
specialists (Kansa, 2011). The apparent democratization of archaeological data has renewed concerns
over the privacy of archaeological sites and the sharing of sensitive locational information (Bampton &
Mosher, 2001; Sitara & Vouligea, 2014).
Growing awareness of a digital data-­r ich environment (Bevan & Lake, 2013) has spurred calls for
Open Science in archaeology (Marwick, 2017), practices that aim to enable ‘reproducibility’, generate
‘scripted workflows’, ‘version control’, ‘collaborative analysis’, and encourage public availability of pre-
prints and data (Marwick et al., 2017, p. 8). Efforts in open archaeology are premised on the belief that
the ‘analytical pipeline’ in archaeological research has not been available for scholars to examine, critique
and re-­use, a situation that impacts the range and scope of archaeology (Marwick, 2017, p. 424). This
situation is reflected in the prevailing use of ‘point-­and-­click’ commercial software that obscures underly-
ing algorithms and assumptions. Moreover, archaeologists typically do not document (or do not report)
sufficient information on how and why particular decisions were made during cleaning and analysis, a
situation that presents significant challenges in replicating analytical methods and results, even when data
are available for re-­use.
Understanding the role of quality information in archaeology is pressing as digital geospatial data
acquired in the field are increasingly compiled with existing digitized information and together they are
combined into computational pipelines (Snow et al., 2006; Kintigh, 2006). Archaeologists are accumu-
lating large amounts of data through ‘real-­time’ digital documentation in the field (Vincent, Kuester, &
Levy, 2014), ‘mobilizing the past’ (Averett, Counts, & Gordon, 2016) and promoting ‘transparency’ in
field collection (Strupler & Wilkinson, 2017). Typically paperless, these efforts are thought to minimize
redundancy and human-­introduced errors in the recording of archaeological sites and archaeological data
(Austin, 2014) and potentially shorten the time interval between stages in the archaeological workflow
(Roosevelt, Cobb, Moss, Olson, & Ünlüsoy, 2015).
To manage, store and analyse these large amounts of digital archaeological data, archaeologists typically
harness geospatial technologies such as commercial Geographic Information Systems (GIS). However,
these spatial databases are known to have poor error management, a situation that can result in error
propagation that impacts subsequent analysis and the final result (Figure 2.1) (Hunter & Beard, 1992).
The widespread use of GIS in archaeology therefore can constrain broader assessments of archaeological
methods and the quality of data in terms of interpretation and re-­use. Moreover, processing of data and
analysis within computational pipelines is rarely documented and shared (Costa, Beck, Bevan, & Ogden,
2013, p. 450), limiting what is known on procedures and transformations and the overall quality of
sources (Evans, 2013).
In this chapter, I discuss the preparation of archaeological data for spatial analysis and re-­use from the
perspective of spatial data quality in the archaeological workflow. I draw from scholarship on geospatial
data quality to shed light on data-­centric issues in archaeology. I show that archaeological spatial analysis
can be improved through better documentation procedures and transformations of archaeological data
and discuss how these practices can facilitate deeper understanding of archaeological methods and prac-
tice and open new forms of research in archaeology. I argue that documentation on data cleaning and
tidying procedures, and version control can enable more rigorous research practice, and attune archaeolo-
gists to data-­centric imperfections in archaeological data.
Uncertainty in geographical data is the recognition that there exists a difference between a complex
reality and our conceptualization and measurement of that reality (Plewe, 2002). Our conceptualization
of reality is necessarily a generalization and abstraction, and thus an imperfect model. Uncertainty in
this model can be described as having three dimensions: space, time and theme.

Figure 2.1 Sources and types of errors in data collection and compilation, data processing and data usage that
result in final global error, adapted from Hunter and Beard (1992).

A map, for example, is
an imperfect model of a complex reality; a full scale (1:1) map of a town could never be rolled out, nor
would we make good use of such a document. Within this framework, we can better understand for
each of the three dimensions, elements of quality, including error, accuracy, precision, consistency and
completeness. Greater awareness of sources and causes of error and uncertainty can enable us to represent
and manage imperfections within spatial databases, which in turn can facilitate greater confidence in the
interpretation of digital archaeological data.
Quality issues are present in all data and throughout the research process, a situation that impacts
the interpretation of archaeological data and decision-­making (Figure 2.2). The preparation of
archaeological data for analysis is tied to data quality which, in turn, is related to research design
and the archaeologist’s intended purpose for those data. Data quality can be understood in terms of
internal and external. Internal quality is the ‘level of similarity between data produced’ and the ‘ideal
data’ (data without error) or ‘control data’. The ideal data are based on a set of specifications or rules
and requirements that define how objects will be represented, which geometries will represent each
type of object, the attributes that describe them and possible values for these attributes (Devillers &
Jeansoulin, 2006, p. 38). Therefore, when a result differs to some degree from what these specifications
lead us to expect, we have error, imprecision and incompleteness. External quality relates to data that were
produced and how they meet the needs of a particular user. Whereas data quality elements can be
measured separately for space, time and theme, an assessment of overall data quality requires care-
ful consideration on all three dimensions because they are interdependent. Greater emphasis is now
placed on ‘fitness for use’, which shifts focus to the needs of particular users and their intended use
of data (Chrisman, 2006).
Figure 2.2 The archaeological workflow in terms of a computational pipeline from data acquisition to unpub-
lished data, and re-­use. Problems of quality impact data in each stage. Black boxes in this workflow occur
wherever archaeologists employ software and tools whose code are unavailable to review and modify and that
do not enable documentation of transformations.

Scholarship on quality issues in archaeology ranges from ‘quality assurance’ (Banning, Hawkins,
Stewart, Hitchings, & Edwards, 2017), and ‘quality standard’ (Willems & Brandt, 2004) in field surveys to
verifying and validating the quality of computational models (Burg, Peeters, & Lovis, 2016), the use of
statistical techniques to address spatio-­temporal uncertainty (Zoghlami, de Runz, Akdag, & Pargny, 2012;
Kolar, Macek, Tkáč, & Szabó, 2015; cf. Fusco & de Runz this volume) and temporal uncertainty (Green,
2011; Bevan, Crema, Li, & Palmisano, 2013; Crema, 2012) to the quality of 3-­dimensional photogram-
metric models (Porter, Roussel, & Soressi, 2016). While fruitful, these efforts typically offer immediate
solutions for project-­specific problems and emphasize quantification techniques themselves, overlooking
data quality issues in processing digital archaeological data and their future usage.
Recent interest in the quality of digital archaeological data shifts the focus away from positional
accuracy as the primary concern in archaeology, as is reflected in works such as Dunnell, Teltser, and Ver-
cruysse (1986), Dibble and McPherron (1988), Wheatley and Gillings (2002), Heilen, Nagle, and Altschul
(2008), Atici, Kansa, Lev-­Tov, and Kansa (2013), Evans (2013), Kansa et al. (2014), Wilshusen, Heilen,
Catts, de Dufour, and Jones (2016), Cooper and Green (2016) and McCoy (2017). Managing and shar-
ing digital geospatial data are encouraging archaeologists to think in terms of data-­intensive methods and
‘big data’, i.e. data that are characterized by volume, velocity, variety, veracity, visualization and visibility
(McCoy, 2017, p. 76; Green, this volume). Large sets of data, such as those that are generated through
multiple field seasons, are greatly impacted by error (Dunnell et al., 1986). A similar scenario might be
described for data collected through highly complex projects that involve the integration of multiple
sources of information that are heterogeneous spatially, temporally and thematically.
To improve data documentation and quality, Kansa et al. (2014) place emphasis on editorial and col-
laborative review of archaeological collections. They suggest that data cleaning early in the archaeologi-
cal workflow can avoid costly investments in terms of person hours, and publication delays later in the
process, particularly when the archaeologist who collected the data is not available to encode and link
individual field documents. Re-­use and comparison of existing digital archaeological collections is high-
lighting data ‘accuracy, reliability and completeness’ (Evans, 2013, p. 20), and the challenges in integrating
diverse data that do not have clear documentation (Cooper & Green, 2016). Most importantly, recent
scholarship greatly expands what archaeologists consider pertinent to the quality of data, and draws atten-
tion to elements such as error, accuracy, precision, consistency and completeness in digital archaeological
collections. These efforts reflect growing intellectual interest in re-­use of research data.
Research data management initiatives are now supported by governments. In the United States and
Canada as well as other countries, publicly funded research projects are required to lay out data man-
agement plans that ensure academic outputs are prepared for preservation and re-­use (National Science
Foundation, 2017; Tri-­Agency, 2016). The Canadian Tri-­Agency Statement for Principles on Digital
Data Management, for example, includes an overview of the responsibilities of researchers, research com-
munities, research institutions and research funders, as well as best practices in data management planning
throughout the research project lifecycle. These broader developments are encouraging archaeologists to
share digital archaeological data over Web-based platforms. Data publishers such as Open
Context, and digital repositories such as the US-based Digital Archaeological Record (tDAR), the UK-based
Archaeology Data Service (ADS), and the Advanced Research Infrastructure for Archaeological Dataset
Networking in Europe (ARIADNE) offer new opportunities for data re-­use. These efforts reflect greater
control over metadata in archaeology and the potential for new forms of collaborative research.

Method
Consideration of data quality is invariably linked to research design and the goals of a particular project.
Until recently, spatial data quality in archaeology seemed to refer to positional accuracy which is com-
monly associated with tools and technologies such as Global Positioning Systems (GPS) and remotely
sensed imagery (Wheatley & Gillings, 2002). Emphasis on locational information comes as no surprise,
given archaeological interests in field-­based research and because of requirements in Cultural Resource
Management (CRM) and planning to inventorise archaeological sites (Heilen et al., 2008). Archaeologists
made great efforts to better model the spatial dimension in digital archaeological data, overlooking the
temporal dimension or chronology (Llobera, 2007; Rabinowitz, 2014). Yet archaeological data have spa-
tial, temporal and thematic dimensions, all of which must be considered in any evaluation of data quality,
and especially when data-­intensive methods are employed.
Archaeological field data are typically complemented with terrestrial imagery (e.g. ground penetrat-
ing radar, aerial, and satellite imagery) and the recovery of portable artefacts such as potsherds, tools and
skeletal material. Archaeological documentation of surface features such as earthworks, field walls, monu-
ments, pathways, rock images, and subsurface ones such as hearths, camps, and dwellings and their spatial
relationships with other recovered material culture can be thought of as a collection. In this conceptual-
ization, the archaeological database is differentiated from the archaeological record. The latter refers to material
culture that exists, whether it has been recovered or awaits investigation (Gupta & Devillers, 2017). The
archaeological database consists of collections that archaeologists have successfully recovered at different
times and places, and can be thought of as an imperfect model of a complex reality. A growing, more
reliable archaeological database can facilitate insights into human history.
In practice, archaeologists are increasingly digitizing and integrating new archaeological data with
archaeological collections stored in local and national repositories for combined analysis (Kintigh et al.,
2015). Yet repositories are themselves a product of the society in which they were created, and thus,
social, political, cultural and historical circumstances influence them. One might consider, for example,
why a particular collection is chosen for digitization, and how and why specific classes of data within
that collection are preserved and curated. These decisions can impact subsequent study of these research
data. Data-­intensive methods that integrate different collections take on their assumptions and limitations
(Atici et al., 2013), in addition to uncertainties in any new data (Allison, 2008).
Moreover, at some point in the life of archaeological data, regardless of their acquisition through
research or regulatory projects, they will be in the hands of experts who do not have direct access to
the original data collectors, their ‘contextual knowledge’ and field journals. In practice, the person who
acquired data on-­site during archaeological fieldwork typically also encodes them for further use. There-
fore, the encoder had pre-­existing knowledge of the spatial relationships in the data that enable linkages
between individual documents. Ideally, the same archaeologist analyses, interprets and presents results, a
situation that is typical in small academic projects. In the case of regulatory or CRM archaeology, digital
archaeological data, once acquired, might be transferred to data analysts and data managers.
With site information, aerial photographs, geophysical readings, and topographic surveys, an
archaeologist might prepare derived data products such as digital elevation models and files that store
the location, shape and attributes of archaeological features (points, lines, polygons). These data are
processed and analysed within a computational pipeline, the results of which are used to produce a
synthetic document that receives some form of peer review either as a technical report, or a scholarly
publication (Van der Linden & Webley, 2012). Such documents, particularly those produced under
regulatory frameworks, in turn can be the basis upon which scholars and policy makers make decisions
that impact local communities and society as a whole. Yet, in many cases, although not all, the data
themselves are not subject to review (Gobalet, 2001; Roebroeks, Gaudzinski-­Windheuser, Baales, &
Kahlke, 2017), nor is quality information on research data necessarily made explicit (McCoy, 2017,
pp. 4–5). This oversight can ‘hide serious logical and empirical faults in the underlying assumptions’
in archaeological practice (e.g. in CRM archaeology, the failure to detect archaeological sites despite
100% or full-­coverage survey) (Heilen et al., 2008, p. 1.1). This situation, however, does not mean that
the quality of data did not matter or that these data cannot be repurposed. Rather, recent scholarship
suggests that archaeologists are concerned about the quality of data, and have criteria upon which they
base their level of confidence. It should come as no surprise that insights into data management and
field methods are often gained through repurposing of existing data.
For example, Wells (2011) presents the integration of archaeological information from four state
historic preservation offices (SHPOs) in the United States. The SHPOs included in the study were
Kentucky, Illinois, Indiana and Missouri and each office stored and maintained archaeological data in
a GIS. The author notes that archaeological site records in each spatial database included similar basic
information. Wells examines the format, projection and coordinate system of location information (e.g.
polygon shapefile in Lambert conformal conic, measurements in feet) to assess interoperability across the
four sources. To bridge the four sources, the author devised six categories of attributes, including loca-
tion information, site identification, site type, definitions of one specific cultural affiliation, the quality of
previous investigations and an assessment of site informational quality (2011). Although Wells does not
explicitly define ‘quality’, he had clear criteria upon which to evaluate archaeological information such as
cultural affiliations and Mississippian culture change. Specifically, he bases the strength of these ‘ontologi-
cal definitions’ on two factors, namely: first, the level of investigation at a site, as it offers consideration on how
far an ontological characterization can be extended (maximum intensity of previous investigations), and
second, the diversity of data structures used to represent the diversity of investigative approaches. Most
fundamentally, this approach highlights thematic information in assessing overall data quality. Wells shows
that careful evaluation of spatial and thematic accuracy enables thoughtful integration and meaningful
repurposing of archaeological site records.
In an era of cyber-­infrastructures, scholars are increasingly interested in ‘grey literature’, unpublished
reports prepared by professional archaeologists under regulatory frameworks as a source of archaeologi-
cal information. Some scholars have shed light on the ‘accuracy, reliability and completeness’ (Evans,
2013, p. 20) of these unpublished documents. In his examination of three sources on archaeological
field investigations in England – the National Monuments Record Excavation Index, the Archaeo-
logical Investigations Project, and Online Access to the Index of Archaeological Investigations – Evans
(2013) suggests that greater efforts are necessary to understand the limitations of unpublished reports.
The author examines and compares the three national spatial databases, and like Wells, considers
archaeological site records or ‘events’ within them. In his study, Evans (2013, p. 26) devised overarch-
ing nomenclature to incorporate the range of terminology that describes on-­site investigations such as
‘post-­determination/research’, ‘evaluation’ and ‘excavation’. The author then analysed the frequency
of reporting across these investigative approaches between 1990 and 2007. Evans also examined each
source for records on one county (Staffordshire) to ascertain gaps in coverage between them, challeng-
ing perceptions that national databases are complete and ‘authoritative’ (2013, p. 32). He concluded
that such meta-analyses highlight the uneven distribution of archaeological investigations, identifying
regions where investigations have been overlooked and where accepted data standards have not been
implemented (2013, pp. 21–22).
Similarly, in his examination of ‘integrative databases’, McCoy (2017, p. 77) remarks that ‘clear biases’
are evident when distribution of site records is presented on a map. The author defines integrative data-
bases as those that ‘continuously take in new information’, usually from a variety of sources, distinguish-
ing them from ‘archival databases’. Archival databases are those that ‘grow by accretion of distinct datasets’
(e.g. tDAR, ADS) (McCoy, 2017, pp. 75–76). The author suggests that the quality of geospatial data can
be thought of as ‘how well the dataset conforms to established best practices’ (2017, p. 78). To this end,
McCoy (2017, pp. 91–92) has proposed a ‘standalone quality report’ that describes the archaeological
geospatial data and how they were derived for tasks such as research, assessment and documentation. This
quality report would be a supplement to technical information in metadata. While potentially fruitful,
we currently do not know how effective metadata and quality reports are in archaeology, to what degree
quality information minimizes misuse of digital archaeological data and/or potentially enables their
re-­use.
In the English Landscapes and Identities project, Cooper and Green (2016, p. 289) seek to integrate
diverse ‘secondary digital datasets’ from national and regional archaeological repositories (Green, this
volume). These data include GIS-­based vector files, associated documents in portable document format
and spreadsheets, and in one case, records that were downloaded from a website (Cooper & Green, 2016,
p. 300). The authors treated ‘multiple and varied representations’ of an archaeological entity within dif-
ferent sources ‘as if it is accurate’ (Cooper & Green, 2016, p. 292). The authors note that the source data
have ‘diverse histories, contents and structures’ and are ‘riddled with gaps, inconsistencies and uncertain-
ties’ (Cooper & Green, 2016, p. 294). The authors do not offer insight into what they consider to be
‘inconsistencies and uncertainties’ or what impact error and imperfections have on potential re-­use of
these data. They do, however, suggest that thematic information in site records can shed light on spatial
relationships between archaeological sites, particularly when analysed at the ‘national level’. The authors
remark that at this scale of analysis, spatial precision becomes less important and emphasis shifts to the
‘spatial character’ of structures such as field systems including their length and orientation.
Heilen et al. (2008) examine data quality in American archaeology from the perspective of CRM
projects. In their study of military installations for the Department of Defense, they focused on survey
reliability, site location recording and site boundaries. They note that whilst overall accuracy of site loca-
tion recording improved with the use of GPS, this brought other concerns to the fore. They suggest that
with definitions and standards ‘came the expectation that data collected at different times, by different
contractors, would be equivalent in quality’ (p. 5.1). The authors remark that this assumption has resulted
in sites being ‘mischaracterized’. For example, small artifact scatters that were recorded at a location were
later identified as large village sites, and some sites were missed entirely. Furthermore, they highlight key
issues in the management of inventory data; specifically, that location information on archaeological sites
can be accurate, yet, the ‘size, shape, depths and importance’ changes with ‘environmental conditions’ and
‘academic debate’, suggesting the complexity of delineating these attribute values (2008, pp. 5.6–5.7).
They observe the problematic practice of deletion of ‘repeated’ site records in favour of the most recent
site inventory record. In this context, the authors recommend detailed records on the history of site dis-
covery and recording that include the equipment used, as well as details on field methods such as survey
intervals, transect size and shapes, shovel-­test design, and observations on erosion and visibility at the time
of field documentation. While the authors do not discuss how data managers would interpret this infor-
mation or how such an evaluation would impact the quality of inventory data, they do draw attention
to thematic and temporal accuracy in digital archaeological data. These efforts underscore institutional
practices as a factor in data quality in archaeology.
Data preparation is based on data quality, which in turn is fundamentally tied to research design and
a user’s intended purpose, as I have described above. In order for archaeologists to share the preparation
of high quality data, methods and analytic techniques, we must document how our data transformed
from one state to another. As noted, in digital environments, the use of commercial software within the
archaeological workflow often means that we impose ‘black boxes’ that prevent us from examining and
modifying underlying algorithms and code. In this context, a black box can refer to an instrument, device
or software that receives an input and delivers an output, yet its internal workings are unknown or poorly
understood by a scholar. This situation can cast doubt on the received output. When archaeologists give
up the opportunity to critically evaluate and improve existing tools and technologies, we, in effect, limit
the aims and scope of archaeology (Marwick, 2018). Much effort is put toward cleaning and tidying data
to make them machine understandable and re-­usable, yet these investments are lost in a computational
pipeline that is closed to scholarly review, modification and development.
In data-­intensive archaeology, three concepts are of prime importance; namely, scripted workflows,
versioning, and open and collaborative research processes. These concepts are central in preparing
archaeological data for data analysis and can be facilitated by data cleaning tools and techniques that offer
‘recipes’ or replicable steps. Some of these techniques are applicable specifically to geospatial data while others
have a broader scope. These tools and techniques in turn, can create opportunities to disassemble black
boxes in archaeology’s computational pipeline. Documentation of data transformation and code sharing
can enable more rigorous archaeological research, while also opening intellectual space for collabora-
tion across disciplinary boundaries. I show how archaeologists might employ tools such as OpenRefine,
languages such as Python and versioning systems such as GitHub to document, manage and share digital
archaeological data and code. I emphasize that whilst these technologies might change, the goal remains
the same: dismantling black boxes in archaeology.
Scripted workflows
Scripted workflows are a way to document the research process which, in turn, can help dismantle black boxes
in archaeology. A script is typically a simple text file that consists of instructions to initiate and complete
tasks in a computational environment. These instructions can be combined with other instructions to
complete different tasks within the archaeological research process, or workflow. For example, an archaeolo-
gist might write a script to transform geographic coordinates (latitude, longitude) to Universal Transverse
Mercator (northing, easting) coordinates based on some specifications, save these transformed data to a
new file and display them on a map. In this case, the script serves not only as instructions for computa-
tional tasks, but also as a ‘very high-­resolution record’ of the research process that can be shared, examined,
modified and re-­used multiple times and by different scholars (Marwick, 2017, p. 432). This is crucial as
most commercial software does not enable documentation of the research process.
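
By way of illustration, the following is a minimal sketch of such a script in Python, assuming the pyproj library; the input file, its column names and the UTM zone (EPSG:32633) are illustrative and would need to be adapted to a given project.

import csv
from pyproj import Transformer

# WGS84 geographic coordinates to UTM zone 33N; substitute the EPSG code
# for the zone covering the study area.
transformer = Transformer.from_crs("EPSG:4326", "EPSG:32633", always_xy=True)

with open("sites_latlon.csv", newline="") as src, \
        open("sites_utm.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)  # expects 'site_id', 'lat' and 'lon' columns
    writer = csv.writer(dst)
    writer.writerow(["site_id", "easting", "northing"])
    for row in reader:
        easting, northing = transformer.transform(float(row["lon"]), float(row["lat"]))
        writer.writerow([row["site_id"], f"{easting:.2f}", f"{northing:.2f}"])

Saved as a plain text file, such a script can be rerun, shared, examined and extended (for instance, to plot the transformed points on a map), providing exactly the ‘very high-resolution record’ of the process described above.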
Scripted workflows have been utilized in different fields, including processing of geospatial infor-
mation from different sources for land classification (Leroux, Lemonsu, Bélair, & Mailhot, 2009). In a
scripted workflow, the ‘process becomes public, transparent and reproducible’ (Thompson, Matloff, Fu, &
Shin, 2017). A scripted workflow can contain instructions for several, often sequential tasks within, and
throughout data processing, visualization and analysis and presentation. The key facet in a scripted work-
flow is its explicit description of process and code to enable transformation of data, a situation that can
facilitate insights into decisions that were made during processing, their potential impact on results, and
how best the data and code might be re-­used.

Versioning as data management


Version control can offer a way to manage digital archaeological data. Versioning presents a history of
digital objects and documentation on the creation of, and subsequent changes made on that digital object
(Leeper, 2015). The concept draws heavily from software development practices that enable large teams
of programmers in collaborating on code writing and keeping track of who has made which changes, as
well as when and why these changes were made.
In research contexts, scholars increasingly employ data versioning, whereby a new version of a data-
set is created when the structure, content or condition of an existing dataset is changed (ANDS, 2018).
These changes include reprocessing, corrections and when additional data are appended to existing ones.
A unique digital object identifier is assigned to each version of a dataset, enabling scholars to process and
cite specific versions and compare across different versions if necessary. While effective in documenting
changes in a dataset (e.g. lines in a text file), version control typically does not handle modifications in
metadata, that is, the explanation of why a particular data value was modified.
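
As a simplified illustration of the idea, the Python sketch below registers each new version of a dataset by recording a checksum, a timestamp and a note explaining the change; the file names and the JSON changelog structure are assumptions made for the example rather than an established standard.

import datetime
import hashlib
import json
import pathlib

def register_version(data_file, note, changelog="versions.json"):
    """Append a record of the current state of data_file to the changelog."""
    digest = hashlib.sha256(pathlib.Path(data_file).read_bytes()).hexdigest()
    log_path = pathlib.Path(changelog)
    log = json.loads(log_path.read_text()) if log_path.exists() else []
    log.append({
        "file": data_file,
        "sha256": digest,  # uniquely identifies this version of the data
        "note": note,      # why the dataset changed
        "timestamp": datetime.datetime.now().isoformat(),
    })
    log_path.write_text(json.dumps(log, indent=2))
    return digest

# e.g. register_version("sites_utm.csv", "corrected two mis-typed eastings")

In practice, a dedicated system such as git (hosted, for example, on GitHub) handles this bookkeeping far more robustly; the point of the sketch is simply that every change to a dataset can be identified, dated and explained.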
Archaeologists generally agree that data have ‘versions’, and that these collections should be managed
for future re-­use. Yet at present, there is little agreement on when a collection has changed, warranting a
new version, how best to document these changes and which versions to make publicly available. These
are pressing issues when it comes to government-­sponsored data providers. Nonetheless, archaeologists
are encouraging the use of version control throughout the phases of the archaeological workflow, as I will
discuss in greater detail in a later section.

Open and collaborative research


Open science is a social development that is impacting the way scholars and scientists carry out research
and communicate their findings in the 21st century. Facilitated in part by Web 2.0 technologies, Open
Science aims to promote transparency, openness and reproducibility across scientific disciplines and
change the culture of research publication (Nosek et al., 2015). To that end, Nosek et al. (2015, p. 1424)
envision more rigorous publication policies, and they propose eight standards that open research com-
munication aspires to, including citation standards, data transparency (sharing), analytic methods (code)
transparency, research materials transparency, design and analysis transparency, preregistration of studies,
preregistration of analysis plans and replication. Recognizing that journals vary across disciplines and that
there are barriers to adopting the standards, each standard is measured on three different levels (Nosek
et al., 2015, p. 1425). The levels, increasing in stringency, are meant to facilitate gradual adoption of the
eight standards. Implementation is recognized by ‘badges’.
A growing number of scholars are interested in open research data as a way to practice ‘better science’
(Molloy, 2011; Foster & Deardorff, 2017). They draw attention to barriers to ‘maximum dissemination
of scientific data’ such as inability to access data, restrictions by publishers on data usage, and difficulties
in re-­use due to poor annotation, as well as cultural concerns over losing control over data and the lack
of incentives to make data re-­useable. While informative, these efforts tend to address communication
issues in research, overlooking deeper structural inequalities in academia and in society.
For example, in his call to humanize open science, Eric Kansa (2014, p. 32) draws attention to ‘under-
lying causes’ of dysfunction in research, beyond technical and licensing issues. He argues that broadening
the boundaries of open science to encompass ‘systematic study’ creates intellectual space for social sci-
ence and humanities scholars, enabling them to meaningfully engage with efforts in reforming research.
Kansa (2014, p. 36) rightly observes that archaeology relies on primary research data, and that recovered
material culture is not replaceable or renewable, yet archaeologists are often reluctant to disseminate and
archive research data. He suggests that these challenges reflect neoliberal values and problematic institu-
tional practices (Kansa, 2014, pp. 50–51). Most crucially, Kansa (2014, p. 52) remarks that a ‘high level
of collegiality and trust’ are necessary for truly opening the research process to a wider community, a
situation to which archaeologists can certainly relate. He suggests that open science can succeed when
real efforts are made to ‘dismantle a powerful and entrenched set of neoliberal ideologies and policies’
(Kansa, 2014, p. 54).
In this context, collaborative research, particularly with Indigenous and descendant communities, is an
overarching theme in archaeology of the 21st century. Ownership of the past, including digital archaeological
data, is emerging as a key concern amongst equity-seeking groups in the United States, Canada, Australia
and New Zealand. In this context, Indigenous peoples want to generate knowledge about their ancestors,
and they are increasingly engaging with digital tools and technologies to challenge colonial practices that
prevented them from access to, and control of archaeology. This is particularly pressing in scenarios where
archaeology is practiced within a regulatory framework that privileges government- and/or CRM-led field
collection. Barriers to accessing primary research data persist for many Indigenous peoples and archaeolo-
gists, and these social issues are impacting how archaeological research is carried out (Gupta, Nicholas &
Blair, n.d.). These tensions will continue to influence the way ‘openness’ is practiced in archaeology.

Case studies

Iterative data cleaning


Data cleaning has gained currency in recent years with the growth of data science and studies (Rahm &
Do, 2000; Van den Broeck, Cunningham, Eeckels, & Herbst, 2005; Osborne, 2013). While archaeologists
are familiar with performing checks on digital data, there are, at present, very few journal articles that
describe these procedures, suggesting that archaeologists generally do not report having screened for
extreme values, duplicate records, misspellings, missing values and other input errors. It should come as
no surprise then that archaeologists generally do not document the transformation of data in the compu-
tational pipeline, although this situation is changing (Kansa et al., 2014; Marwick et al., 2017; Strupler &
Wilkinson, 2017; Marwick, 2018; more broadly, see Shawn Graham’s open lab notebook). The common
thread in each of these works is the aim to lay bare ‘point-­and-­click’ procedures, while documenting what
worked and what did not.
A key aspect in data cleaning is its iterative nature, i.e. that the analyst must go through a number of
transformations and cleaning routines that are often non-­linear, and tailored to specific analytic goals
and quality specifications. Interactivity and visualization are important as an analyst works through
cleaning routines, and data cleaning systems typically offer user interfaces that enable an analyst to
write cleaning sequences, preview them on a portion of the data, and then apply these instructions to
whole sets of data. The instructions are saved and can be undone or extracted at any step. The clean-
ing sequences can also be applied to other data and offer a real-­time history of transformations. This
kind of documentation can be readily reviewed, shared, modified and repurposed. Below, I offer an
example through OpenRefine, an open source, standalone, desktop application that supports iteration
with a spreadsheet style interface.
In his study of data quality, privacy and ‘geospatial big data’, McCoy (2017, p. 79) examines the
case of publicly available and ‘professional’ or privately maintained and restricted archaeological site
records. Specifically, he evaluates the frequency and density of reported fortifications across New Zea-
land in three sources, and complements them with LiDAR images for one particular fortification called
Puketona Pa. The author employed spreadsheet software and ArcGIS for his analysis. The professional
database of archaeological site records, developed and maintained by the New Zealand Archaeological
Association, is available only through prior authorization and is not considered here. The two publicly
available sources are a radiocarbon database maintained by the Waikato Radiocarbon Lab with
1,671 records, and location information on fortifications maintained by Land Information New Zea-
land. In a GIS, these data are represented as a point, a single location defined by a set of geographic
coordinates. Thematic information such as the name of an archaeological site, the site identification
number, site type, the radiocarbon date, the material that was sampled, and the source of the sample
are added as ‘attributes’ to the point.
For each source, McCoy describes the methods and analytic techniques he employed, yet there is lim-
ited documentation on his data cleaning and processing. This is somewhat surprising given his remarks
that ‘filtering, classifying and coding temporal information’ was the most time-­intensive part of the
analysis (McCoy, 2017, p. 84). Elsewhere, the author notes that the radiocarbon data were downloaded as
a Google keyhole markup zip (kmz) and then ‘transferred’ to the commercial GIS software, ArcMap 10.3.
However, McCoy (2017, p. 83) notes that information on ‘site type, and material dated did not migrate
smoothly’, a problematic situation because these two fields were central in processing temporal values.
To correct this, the author manually searched the online database for lab identification numbers and ‘re-­
attach[ed]’ the missing information to all 1,671 site records.
I offer an alternative processing sequence in OpenRefine for McCoy’s radiocarbon data that trans-
forms the information for use in a GIS without manual searching and re-attachment of missing data
fields. I note that ArcMap and other GIS software, such as QGIS, have tools available that convert between
Google’s keyhole markup language (kml) and shapefiles (shp). These procedures are ‘point-­and-­click’
within GIS software, and by default, the software does not retain a history of transformations in a project.
In OpenRefine, the cleaning sequence created can be applied to other data in need of similar processing
and most importantly, for the purpose of this study, it serves as documentation of data transformations
that are typical in data-­intensive methods. For example, a recipe can be created for identifying missing
Figure 2.3 Parsing options in OpenRefine for a file in keyhole markup language.
Note that the information of interest is within the placemark tags.

values, extreme or anomalous values, and resolving them. More complex tasks such as linking codes or
shorthand (e.g. LBK for Linearbandkeramik and ‘grv’ for grave) from field journals and code books can
be facilitated through a cleaning sequence. The cleaned data can be exported in Comma Separated Value
(csv) format, which is easily read by GIS software.
The radiocarbon data were directly accessed through the Web link supplied in McCoy (2017, p. 82)
(www.waikato.ac.nz/nzcd/C14kml.kmz). OpenRefine enables parsing of data in extensible markup lan-
guage (xml), which is a standard used in Google’s keyhole markup language (kml) (Google Developers,
2018) and is therefore interoperable (Figure 2.3). The placemark is an object that contains three elements:
a name, a description, and a point that specifies the position of the placemark on the Earth’s sur-
face using a pair of coordinates (longitude and latitude). Additional thematic information is added to the
placemark as ‘description’, as well as styling for the icon and text. Thus, each placemark object contains
information that is of greatest interest.
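To make that structure concrete, a schematic placemark is shown below. The element names follow the kml reference (Google Developers, 2018); the lab number, field labels and coordinate values are invented for illustration.

```xml
<!-- Schematic placemark; the Waikato file nests thematic fields as
     HTML paragraphs inside the description element -->
<Placemark>
  <name>Wk-1234</name>
  <description><![CDATA[
    <p>Site name: Example Pa</p><p>Site type: Fortification</p>
  ]]></description>
  <Point>
    <coordinates>174.08,-35.28,0</coordinates>
  </Point>
</Placemark>
```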
Once parsed, the data are displayed within a spreadsheet-­style interface with rows and columns
where they and the values within them can be cleaned (Figure 2.4). Column names are based on tags
<tag> </tag> within the placemark object, and include information that is not relevant for further analy-
sis. Visual inspection of the data shows empty rows and unnecessary columns that can be removed. More
importantly, thematic information (e.g. site name, site type) is all parsed into one column (Placemark –
description) and retains tags that will cause problems in further analysis. But information within the tags
is needed and must be separated into individual columns for use in a GIS. When performed manually on
over 1,600 records, such an undertaking can easily result in input errors and unintended modifications and

Figure 2.4 A spreadsheet-style interface in OpenRefine that shows information in columns.
Note that radiocarbon and site information is within tags and will require cleaning.

deletions. With OpenRefine, it is possible to write a cleaning sequence that can be previewed on part of the
data, and then applied to all records. In this case, the cleaning sequence is shown in Box 2.1:

1) Remove empty rows
2) Remove empty columns
3) Remove <p> tags and replace </p> with a ‘;’ #this replaces closing tags with a semicolon
4) Replace <br> with a space
5) Split columns based on separator ‘;’ (semicolon) #this creates individual columns from the description column
6) Text transform based on split at ‘:’ (colon) #this retains values after the colon, applied to all columns
7) Column addition based on text length #this creates a new column called Easting based on derived values from the coordinates column, repeated for Northing
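For comparison, the sketch below approximates the same sequence in Python with pandas. It is an illustration rather than the exported OpenRefine recipe: the file and column names are assumptions based on Figures 2.3 and 2.4, and step 7 is approximated by splitting the coordinate string on commas rather than by text length.

```python
import pandas as pd

df = pd.read_csv("c14_placemarks.csv")               # parsed placemark table (name assumed)
df = df.dropna(how="all").dropna(axis=1, how="all")  # steps 1-2: drop empty rows/columns

desc = (df["Placemark - description"]                # column name assumed
        .str.replace("<p>", "", regex=False)         # step 3: strip opening tags...
        .str.replace("</p>", ";", regex=False)       # ...and close each field with ';'
        .str.replace("<br>", " ", regex=False))      # step 4

fields = desc.str.split(";", expand=True)            # step 5: one column per field
fields = fields.apply(lambda c: c.str.split(":").str[-1].str.strip())  # step 6

coords = df["Placemark - Point - coordinates"].str.split(",", expand=True)
fields["Easting"], fields["Northing"] = coords[0], coords[1]  # step 7 (kml longitude, latitude)
fields.to_csv("c14_clean.csv", index=False)          # csv ready for GIS import
```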

Once the sequence is implemented, the data are manipulated accordingly. The figure below shows the
resulting data (Figure 2.5), along with the history of the cleaning sequence (Figure 2.6). Note that the
cleaning sequence is available both as a description and as exportable code. The code can be extracted and applied to other
data that need similar processing. The ‘clean’ data can be exported as a cross-­platform spreadsheet format
(csv) that can be read routinely by GIS software.

Figure 2.5 The cleaned version of the file, with coordinates ready for mapping.

Figure 2.6 The cleaning sequence or ‘recipe’ for converting kml into comma separated value (csv) format. The
code can be exported, modified and re-­used.

This brief case study did not replicate McCoy’s manual processing and re-attachment of values and
thus cannot offer any specific measure of the duration of that task, nor compare it with processing time in
OpenRefine. Automating transformations can reduce the potential for mistaken entry or deletion within
a spreadsheet. Moreover, in using a platform that enables writing of a cleaning sequence, we gain clear
documentation of data transformations and facilitate potential re-­use of these procedures. This resulting
routine can be reviewed, modified and repurposed for processing other geospatial data.

Processing field data


As suggested above, archaeologists routinely use a vast range of survey tools to produce local maps and
to document the spatial and stratigraphic relationships between archaeological features and artefacts. A
total station or electronic theodolite takes high-­precision distance and angle measurements that enable
3-­dimensional recording. The measurements are based on a ‘local’ or arbitrary coordinate system where
the origin (0,0) is the point location of the total station. Once set up in the field, archaeologists can col-
lect information relatively quickly and can encode each point with a description (thematic information).
Accuracy of measurements is highly dependent on levelling the instrument, and recent models contain
additional software to perform levelling and adjustment calculations. The instrument typically comes
with proprietary software that enables recording, management and calculation of distance and angle mea-
surements. Until recently, measurements recorded on a total station could not be automatically tied to
geographic space, i.e. to a location on the Earth’s surface and as a result, control points were required to
enable additional processing that would tie measurements to real geographic space. Real-Time Kinematic
systems now combine Global Navigation Satellite System (GNSS) positioning with total station surveying
and a suite of sophisticated point-and-click software to calculate position data derived from satellites.
These processing packages are typically available only to licensed users and are not open to review.

Figure 2.7 An overview of the location and estimated size of the study area in Saint-­Pierre, France.

I present the case of a survey on the island of Saint-­Pierre, France, where initial archaeological field-
work was carried out using a total station and a handheld Global Positioning System (GPS) unit. I
document efforts to transform data collected in a local coordinate system to a global system in a situa-
tion where only two control points are available. Transformation of locational information into a global
coordinate system can facilitate the integration of field data with other sources, and can enable spatial
analysis of archaeological data.
The study area is located on the eastern coast of the island of Saint-Pierre, France (Figure 2.7). A field
survey with a total station and handheld GPS unit was carried out as part of an archaeological project at
Memorial University of Newfoundland, Canada, to identify historical (18th century onwards) settle-
ment on this part of the island. The initial survey team consisted of three archaeologists. Field collection
in the study area (measuring approximately 100 × 200 m), was organized into two surveys; one focused
on archaeological features visible on the surface, and the second focused on recording topography at
regular intervals, with the intention to bring these data into a GIS and examine them with historical
maps and other documents. For example, the archaeological features can be made into polygons (where
appropriate) with thematic information, enabling measurement of size and shape of surface features. This
information can be used to assess survey strategy, and offers a historical document prior to site excava-
tion. Therefore, a geographically referenced model of the topography and archaeological features was
highly desirable.
The first survey on archaeological features (features) resulted in 178 points, and the second survey
on topography (landscape) consisted of 343 points (Figure 2.8). All measurements were made in

Figure 2.8 A map showing points from two surveys that were collected on a total station. Location of the
total station or origin is represented as a star, survey on archaeological features on surface is marked in green,
and the survey of topography is in brown. A colour version of this figure can be found in the plates section.
Note that the feature survey data is rotated.

metres. Both surveys had the same origin and back sight for registration. Coordinates for the origin
and back sight were recorded on the handheld GPS unit with an error of ±5 metres. During initial
processing of the data in a local coordinate system, we immediately identified a significant problem
with the first survey. The survey points were rotated to some degree and had to be corrected prior to
being transformed to a global coordinate system. The team recognized that a mistake in registering
the total station set-­up was likely the source of this error, yet the second survey did not share this
dislocation. Because the team had used an older-model total station that came with limited support for
processing survey measurements, the project directors decided it was necessary to develop an equation
to adjust the archaeological features based on known locations in geographic space, i.e. the origin
and back sight.
The situation, however, was not ideal as there were only two control points (origin and back sight)
available and most transformations require a minimum of three control points. For example, the
CHaMP Topo Processing tool developed by Wheaton, Garrard, Whitehead, and Volk (2012) at Utah
State University for use in ArcMap is available for local to global coordinate transformations. However,
this tool was not used because of its three-control-point prerequisite. To transform the survey data,
Maria Yulmetova (2018), a student at Memorial University of Newfoundland, developed a Python
script that calculated the rotation factor to transform local coordinates into Universal Transverse

Figure 2.9 The survey points overlaid on a scanned map that is geo-­rectified to WGS-­UTM 21. A Python
script was developed to enable rotation and transformation of points in a local coordinate system to a global
coordinate system (UTM) using two known coordinate pairs. A colour version of this figure can be found in
the plates section.

Mercator (northing and easting) using two control points. In practice, the script generated modified
UTM coordinates that could be aligned with two different sources: a scanned topographic map from
the National Institute of Geographical and Forestry Information (IGN-F), France (Figure 2.9), and
imagery from Google Earth. Validation was based on visual inspection of the overlap between
survey measurements and features visible on the topographic map. With survey data corrected, it was
possible to more closely examine archaeological features and their estimated sizes and shapes alongside
historical documents.
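Although Yulmetova’s (2018) script is linked in the references, a hedged sketch of the underlying idea is given below: a two-point similarity (Helmert) transformation derives rotation, scale and translation from the origin and back sight, then applies them to every surveyed point. The function name and the demonstration coordinates are illustrative assumptions, not the published code.

```python
# A minimal sketch, assuming a two-point similarity (Helmert) transformation;
# Yulmetova's (2018) published script may differ in detail.
import math

def two_point_transform(local_pts, ctrl_local, ctrl_utm):
    """ctrl_local and ctrl_utm each hold the origin and back sight as (x, y)."""
    (lx1, ly1), (lx2, ly2) = ctrl_local
    (ux1, uy1), (ux2, uy2) = ctrl_utm
    # Rotation: difference between the control baseline's bearing in each system.
    theta = math.atan2(uy2 - uy1, ux2 - ux1) - math.atan2(ly2 - ly1, lx2 - lx1)
    # Scale: ratio of baseline lengths (near 1 when both systems are in metres).
    scale = math.hypot(ux2 - ux1, uy2 - uy1) / math.hypot(lx2 - lx1, ly2 - ly1)
    transformed = []
    for x, y in local_pts:
        dx, dy = x - lx1, y - ly1
        easting = ux1 + scale * (dx * math.cos(theta) - dy * math.sin(theta))
        northing = uy1 + scale * (dx * math.sin(theta) + dy * math.cos(theta))
        transformed.append((easting, northing))
    return transformed

# Invented coordinates for illustration only (UTM zone 21N order of magnitude):
ctrl_local = [(0.0, 0.0), (100.0, 0.0)]          # total station origin, back sight
ctrl_utm = [(565400.0, 5181200.0), (565470.7, 5181270.7)]  # GPS fixes, +/- 5 m
print(two_point_transform([(50.0, 25.0)], ctrl_local, ctrl_utm))
```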
The script reflects a step towards creating a tool that archaeologists can use for transformations of
survey data that have a limited number of control points. Because most tools for processing survey data
are proprietary, they are not available for scholarly review or modification, as was needed in this scenario.
When the underlying code is available, it is possible to alter and customize default criteria and this in
turn, can enable more appropriate decision-­making and more rigorous research practice in archaeology.
More fundamentally, scripted workflows offer archaeologists the chance to engage more deeply with the
range of tools and technologies they employ throughout the archaeological workflow, and open cross-­
disciplinary collaborations with geographers, cartographers and computer scientists in dismantling black
boxes. By sharing their scripted workflows, archaeologists can encourage re-­use, modification and refine-
ment of ‘recipes’ and data processing tools and technologies. Furthermore, greater attention to examining
and modifying code can create intellectual space for training and empowering undergraduate and gradu-
ate students for the archaeology of the 21st century (Marwick, 2017).

Data management with version control


In this final section, I discuss data management and version control through platforms such as GitHub.
Strupler and Wilkinson (2017) offer an implementation of a ‘distributed version-­control data manage-
ment platform’ for field survey using Git. Git is a dedicated version control system for tracking changes
in digital files, and facilitates coordination of tasks amongst multiple collaborators. Version control sys-
tems are common in programming contexts where they are used to trace errors and track any updates in
code. In a version control system, the repository is where files and their histories live. The person who
creates the repository is its owner and is responsible for integrating merges into that repository and
resolving any conflicts. Each file in a repository is authored, and because each change (however minute) is
logged, it becomes the responsibility of a repository owner to provide a description of that change. For
example, after correcting a series of misspellings in a spreadsheet called ‘artifacts’, the repository owner
is ready to close or ‘commit’ those changes. The owner will ideally describe these changes as ‘corrected
misspellings’ in the record of this particular commit. When the history of ‘artifacts.csv’ is examined, the
owner can review each commit through time, select one by name, and ‘revert’ to a previous version
of ‘artifacts’.
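By way of illustration, the cycle just described maps onto a handful of standard Git commands; the file name and commit message follow the hypothetical ‘artifacts’ example above.

```
# stage and commit the corrected spreadsheet, with a description of the change
git add artifacts.csv
git commit -m "corrected misspellings"

# review each commit to the file through time
git log --oneline -- artifacts.csv

# restore the file as it stood at an earlier commit (identifier from the log)
git checkout <commit-id> -- artifacts.csv
```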
Versioning functionality proves highly effective in projects that have multiple authors, each of whom
submits changes to a single file. In that case, authors will clone or ‘fork’ the original repository, where
they make changes on their own copy of ‘artifacts.csv’ and then request to merge their versions into
the original repository. The onus falls on the owner of the original repository to review the changes before
accepting (or rejecting) a merge request. For instance, this form of data management can be most helpful
for archaeologists who manage site inventory records that are updated from time to time, but require a
history of site discovery (Heilen et al., 2008).
In Strupler and Wilkinson’s survey, the data were ‘born-­digital’ and the authors employed the version-
ing system, Git, to manage their field collection records. They remark that these data ‘must have history’
(2017, p. 283) as it is a necessary part of ‘improving the quality’ of the study and overall results. This is key
as the authors argue that error correction and other modifications are better tracked within processing,
and can be linked to individual contributors. The functionality further enables comparison between any
two or three file versions, which can highlight the source of duplicate entries, for example, or conflicts in
particular values. Reconciling these issues at source potentially minimizes the likelihood
of undetected errors propagating through a computational pipeline.
The authors organized their field collection into sub-­projects in a main repository, such as ‘survey’,
‘survey-design’, ‘survey-data’, ‘gis-static’, ‘gis-tools’, ‘photo-archive’ and ‘team’ (2017, p. 290). The sub-
project ‘survey-data’, for example, consisted of two sources of data: direct input from a field walker’s GPS
unit, and digital forms in which a field walker reported points of interest observed during a walk. The
advantage of this data management is the relative ease with which digital data can be processed off-­site,
while minimizing accidental loss, corruption or deletion of those data. Harnessing the history of changes
in any file or sub-project within a repository can enable authors to more easily detect errors in data, correct
them and track subsequent work. That said, most versioning systems (e.g. Git, GitHub) have a learning
curve and would require each team member to have some familiarity with the platform and manage-
ment practices. More fundamentally, for use in the field and off-­site, a server and institutional support
(i.e. financial support and expertise) are often prerequisites (Strupler & Wilkinson, 2017, p. 301). These
are especially necessary to enable long-­term use of collected data, and their re-­use. Strupler & Wilkinson
(2017) do not discuss the long-­term use of their survey data, or how these data might be re-­used by a
scholar who was not part of the original project, yet their study offers an example of distributed, col-
laborative and version controlled management of data in archaeological field projects.

Conclusion
Preparation of archaeological data for further analysis, curation and re-­use is tied to data quality, which in
turn, is integral to research design and an archaeologist’s intended use of archaeological data. Geospatial
technologies such as GIS are routinely used in archaeology to manage, store and analyse large amounts of
digital archaeological data. However, these spatial databases are known to have poor error management,
a situation that can result in error propagation that impacts on subsequent analysis and the final result.
The widespread use of GIS in archaeology can therefore constrain a broader assessment of archaeological
methods and the appropriateness of data in terms of interpretation and re-­use. Furthermore, processing
of data and analysis within computational pipelines is rarely documented and shared, a situation that
limits what is known about procedures and transformations and the overall quality of archaeological data.
Without clear documentation of how data were processed, archaeologists impose black boxes within the
archaeological workflow that prevent examination of how data were transformed from acquisition to
their final presentation and publication. This situation is especially problematic when point-­and-­click
software is uncritically utilized. Archaeologists must become ready to dismantle black boxes at a moment
when greater amounts of ‘born digital’ archaeological data are being generated.
Recent interests in the preparation of archaeological data and the quality of those data are influenced
by the growing use of digital and geospatial tools and technologies, and the rapid growth of communica-
tion tools that facilitate exchange and sharing of data between scholars, institutions and non-­specialists.
Archaeologists are accumulating large amounts of data through real-time, paperless digital documentation
in the field, and these efforts are thought to minimize redundancy and human-introduced
errors in the recording of archaeological sites and archaeological data. These efforts can also potentially
shorten the time interval between data acquisition, processing, analysis and presentation and publication.
A growing awareness of a digital data-­r ich environment is encouraging archaeologists to think in
terms of data-­intensive methods and big data whilst highlighting that greater efforts are needed to docu-
ment and report how decisions were made on cleaning, analysing and publishing data. This situation
presents challenges and opportunities for archaeologists. Calls for Open Science in archaeology reflect
these tensions, and offer a way forward in terms of promoting the generation of scripted workflows, ver-
sion control for data management and collaborative research. Recognition that the interests and needs of
social groups differ in terms of ownership of the past is attuning archaeologists to the role of institutional
practices in data quality issues.
Greater and more stringent control over metadata is enabling archaeologists to document their data
creation methods, sampling techniques and contextual information that facilitates the re-­use of digital
archaeological data. Metadata typically include authorship information, basic project and site descrip-
tions, keywords, chronological ranges and geographical coverage (e.g. bounding box coordinates). The
data publisher Open Context, for example, has shown leadership in preparing digital data for re-­use,
including data cleaning, such as performing basic checks on received data to correct data entry errors and
inconsistencies in classification fields, as well as more involved transformations to translate code books
and reconcile them with tabular information (Kansa et al., 2014, p. 60). We do not yet have sufficient
information on how metadata are being used beyond searching, browsing and filtering for specific records.
Nonetheless, recent developments show that archaeologists are aware of data quality issues and are actively
taking steps to communicate the level of confidence they have in their analysis and interpretation of
archaeological data.
The apparent democratization of archaeological site information has renewed concerns over privacy
and the security of sensitive locational information. Publishing archaeological information presents sig-
nificant challenges and opportunities. Conventional wisdom is that archaeological data collected in the
field contain sensitive locational information, and that sharing locations of archaeological and histori-
cal sites can facilitate, if not result in, the destruction of those sites through looting. Looting and illegal
trafficking of archaeological artefacts and human bones are issues observed in many places (Brodie,
Doole, & Renfrew, 2001; Huffer & Graham, 2017). These concerns are often heightened in national
contexts where tensions over ownership of the past exist between archaeologists and local communities
and Indigenous peoples. Yet recent developments in geovisual analytics demonstrate that scholars can
meaningfully analyse data even when they contain sensitive location information (Andrienko et al.,
2007). Archaeologists are now putting greater efforts into examining how to share sensitive archaeologi-
cal information, and making explicit scenarios in which such efforts are inappropriate. These efforts are
reflected in conference sessions at the 2018 Society for American Archaeology meetings, such as the
‘Futures and Challenges in Government Digital Archaeology’ symposium organized by Jolene Smith,
and a forum, ‘Keeping Our Secrets: Sharing and Protecting Sensitive Resource Information in the Era of
Open Data’, that was chaired by David Gadsby and Anne Vawser. The ethos of ‘openness’ is encouraging
archaeologists to better understand possibilities in and potential implications of publishing archaeological
data on the Web.
The re-use of digital archaeological data requires scholarly effort in cleaning and better documenta-
tion of these procedures and transformations. This scholarship is being promoted to facilitate deeper
engagement with archaeological methods, which in turn, can open new forms of research in archaeol-
ogy. Archaeologists are increasingly extracting geographical information from historical documents and
repurposing these data for spatial analysis (Murrieta-­Flores & Gregory, 2015). Employing sophisticated
techniques such as Natural Language Processing, archaeologists draw out place-­names in historical texts
and incorporate them into GIS software. Tools such as geoparsers, which automate the annotation of texts and
geo-reference place-names (e.g. by creating coordinate pairs), are now being developed for specific corpora.
Platforms such as ORBIS (2018), a geospatial network model of the Roman World developed at Stanford
University and the Pelagios Commons (2018), an online community that enables linked open data on
historical places, are highlighting the range and scope of interdisciplinary scholarship. These efforts often
emphasize collaborative code development and code sharing.
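As a minimal sketch of such a pipeline (an illustration, not a tool from any of the projects above), a pretrained named-entity recognizer can pull out candidate place-names, which a gazetteer service then geocodes. The sentence, model and user-agent string are invented, and real historical corpora would need disambiguation well beyond this.

```python
# A hedged geoparsing sketch; assumes spacy and geopy are installed, plus the
# en_core_web_sm model. Pretrained NER may miss or mislabel historical names.
import spacy
from geopy.geocoders import Nominatim

nlp = spacy.load("en_core_web_sm")
geocoder = Nominatim(user_agent="archaeo-geoparsing-demo")  # illustrative agent

text = "The survey recorded earthworks on the road between Avebury and Marlborough."
place_names = {ent.text for ent in nlp(text).ents if ent.label_ == "GPE"}

for name in place_names:
    match = geocoder.geocode(name)          # gazetteer lookup; may return None
    if match is not None:
        print(name, (match.longitude, match.latitude))  # coordinate pair for GIS
```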
Growing numbers of archaeologists are employing programming languages such as R and Python in
documenting their research processes. Scripted workflows and code sharing are facilitated by Web-based
platforms such as GitHub and Jupyter notebooks. For example, the Open Digital Archaeology Textbook
and Environment (Graham et al., 2018), an open-­access digital textbook makes extensive use of Jupyter
notebooks to share code and data for teaching purposes (https://o-­date.github.io/support/notebooks-­
toc/). Most crucially, the digital environment offers scholars and learners a platform to read and experi-
ment with code writing. Notebooks of particular interest include one on spatial analysis developed by
Rachel Opitz (https://mybinder.org/v2/gh/ropitz/spatialarchaeology/master), as well as one on process-
ing public data such as Light Detection and Ranging (LiDAR) that are published by local and national
institutions. These notebooks greatly extend the potential and possibilities for scripted workflows in
spatial analysis, within and without traditional GIS software.
Greater attention is now given to archaeological site records as historical documents, and the his-
tory of site discovery as a way to assess data quality. In this context, archaeologists are making greater
effort to employ version control systems that log changes in files and potentially reduce the likelihood
of errors going undetected through the archaeological workflow. Because all files in versioning systems
are authored, it is possible to organize and manage multi-­authored projects on these platforms. As such,
documentation of changes to a digital object offers a history of that object, and a way to track error and
its propagation through the archaeological workflow. Implementing good documentation practices into
the archaeological workflow can enable better data quality. Yet version control systems present barriers
in terms of implementation; expertise, institutional support and resources are necessary prerequisites.
Nonetheless, version control systems offer great potential for archaeologists who manage site inventory
information that changes over time, and where archaeological data are managed by experts who do not have access
to the original data collectors and their contextual knowledge. These challenges underscore the need for
better data management techniques in archaeology more broadly.
With more stringent control over metadata, there is enormous scope for data-­intensive methods in
archaeology. Federal funding agencies are placing greater emphasis on data management plans for funded
projects and these developments are creating an environment in which archaeological data are being more
closely scrutinised for sharing on Web-­based platforms. As a result, greater amounts of better documented
data are available for re-­use in archaeology, which in turn, can facilitate a better understanding of the
human past. More fundamentally, these efforts are creating opportunities for new forms of research in
archaeology that can promote collaboration with anthropologists, historians, cognitive scientists, geog-
raphers and computer scientists, which in turn, can have broader implications in the social sciences and
humanities.

References
Allison, P. (2008). Dealing with legacy data: An introduction. Internet Archaeology, 24. http://dx.doi.org/10.11141/
ia.24.8.
Andrienko, G., Andrienko, N., Jankowski, P., Kraak, M-­J., Keim, D., MacEachren, A. M., & Wrobel, S. (2007).
Geovisual analytics for spatial decision support: Setting the research agenda. International Journal of Geographical
Information Science, 21(8), 839–857.
Atici, L., Kansa, S. W., Lev-­Tov, J., & Kansa, E. C. (2013). Other people’s data: A demonstration of the imperative of
publishing primary data. Journal of Archaeological Method and Theory, 19, 1–19.
Austin, A. (2014). Mobilizing archaeologists: Increasing the quantity and quality of data collected in the field with
mobile technology. Advances in Archaeological Practice, 2(1), 13–23.
Australian National Data Service (ANDS). (2018). Data versioning. ANDS. Retrieved October 2018, from www.
ands.org.au/working-­with-­data/data-­management/data-­versioning
Averett, E. W., Counts, D. B., & Gordon, J. (2016). Introduction. In D. B. Counts, E. W. Averett, & J. Gordon
(Eds.), Mobilizing the past for a digital future: The potential of digital archaeology. Retrieved from http://dc.uwm.edu/
arthist_mobilizingthepast/
Bampton, M., & Mosher, R. (2001). A GIS driven regional database of archaeological resources for research and CRM
in Casco Bay, Maine. BAR International Series, 931, 139–142.
Banning, E. B., Hawkins, A. L., Stewart, S. T., Hitchings, P., & Edwards, S. (2017). Quality assurance in archaeological
survey. Journal of Archaeological Method and Theory, 24(2), 466–488. https://doi.org/10.1007/s10816-­016-­9274-­2
Bevan, A., Crema, E., Li, X., & Palmisano, A. (2013). Intensities, interactions, and uncertainties: Some new approaches
to archaeological distributions. In A. Bevan & M. W. Lake (Eds.), Computational approaches to archaeological spaces
(pp. 27–52). Walnut Creek, CA: Left Coast Press.
Bevan, A., & Lake, M. W. (2013). Computational approaches to archaeological spaces. Walnut Creek, CA: Left Coast Press.
Brodie, N., Doole, J., & Renfrew, C. (Eds.). (2001). Trade in illicit antiquities: The destruction of the world’s archaeological
heritage. Cambridge: McDonald Institute for Archaeological Research.
Burg, M. B, Peeters, H., & Lovis, W. A. (Eds.). (2016). Uncertainty and sensitivity analysis in archaeological computational
modeling. Switzerland: Springer.
Chrisman, N. (2006). Development in the treatment of spatial data quality. In R. Devillers & R. Jeansoulin (Eds.),
Fundamentals of spatial data quality (pp. 22–30). Newport Beach, CA: ISTE.
Cooper, A., & Green, C. (2016). Embracing the complexities of “Big data” in archaeology: The case of the English
landscape and identities project. Journal of Archaeological Method and Theory, 23(1), 271–304. doi:10.1007/
s10816-­015-­9240-­4
Costa, S., Beck, A., Bevan, A. H., & Ogden, J. (2013). Defining and advocating open data in archaeology. In G. Earl,
T. Sly, A. Chrysanthi, P. Murrieta-­Flores, C. Papadopoulos, I. Romanowska, & D. Wheatley (Eds.), Archaeology
in the Digital Era: Papers from the 40th annual conference of computer applications and quantitative methods in archaeology,
Southampton, 26–29 March, 2012 (pp. 449–456). Amsterdam: University Press.
Crema, E. (2012). Modelling temporal uncertainty in archaeological analysis. Journal of Archaeological Method and
Theory, 19, 440–461.
Devillers, R., & Jeansoulin, R. (2006). Spatial data quality: Concepts. In R. Devillers & R. Jeansoulin (Eds.), Funda-
mentals of spatial data quality (pp. 31–42). Newport Beach, CA: ISTE.
Dibble, H. L., & McPherron, S. P. (1988). On the computerization of archaeological projects. Journal of Field Archaeol-
ogy, 15(4), 431–440. https://doi.org/10.2307/530045.
Dunnell, R. C., Teltser, P., & Vercruysse, R. (1986). Efficient error reduction in large data sets. Advances in Computer
Archaeology, 3, 22–39.
Evans, T. N. L. (2013). Holes in the archaeological record? A comparison of national event databases for the historic
environment in England. The Historic Environment: Policy & Practice, 4(1), 19–34. https://doi.org/10.1179/1756
750513Z.00000000023.
Foster, E. D., & Deardorff, A. (2017). Open Science Framework (OSF). Journal of the Medical Library Association:
JMLA, 105(2), 203–206. doi:10.5195/jmla.2017.88.
Gobalet, K. W. (2001). A critique of faunal analysis: Inconsistency among experts in blind tests. Journal of Archaeologi-
cal Science, 28(4), 377–386.
Google Developers. (2018). What is KML? Retrieved March 2018, from https://developers.google.com/kml/
Graham, S. (n.d.). Open lab notebook. Retrieved March 20, 2018, from https://electricarchaeology.ca/
Graham, S., Gupta, N., Smith, J., Angourakis, A., Carter, M., & Compton, B. (2018). The open digital archaeology
textbook environment. Retrieved from https://o-­date.github.io/draft/book/
Green, C. (2011). It’s about time: Temporality and intra-­site GIS. In E. Jerem, F. Redő, & V. Szeverényi (Eds.), On
the road to reconstructing the past: Computer applications and quantitative methods in archaeology (CAA): Proceedings of the
36th international conference, Budapest, April 2–6, 2008 (pp. 206–211). Budapest: Archaeolingua.
Gupta, N., & Devillers, R. (2017). Geographic visualization in archaeology. Journal of Archaeological Method and Theory,
24(3), 852–885.
Gupta, N., Nicholas, R., & Blair, S. (n.d.). Post-colonial and indigenous perspectives in digital archaeology. In E.
Watrall & L. Goldstein (Eds.), Digital heritage and archaeology in practice. University Press of Florida. Retrieved from
http://dhainpractice.anthropology.msu.edu/
Heilen, M. P., Nagle, C. L., & Altschul, J. H. (2008). An assessment of archaeological data quality: A report submitted in
partial fulfillment of legacy resource management program project to develop analytical tools for characterizing, visualizing, and
evaluating archaeological data quality systematically for communities of practice within the department of defense. Department
of Defense Legacy Resource Management Program, Technical Report 08–65, Statistical Research Inc., Tucson, AZ.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. New York: Cambridge University Press.
Huffer, D., & Graham, S. (2017). The insta-­dead: The rhetoric of the human remains trade on Instagram. Internet
Archaeology, 45. https://doi.org/10.11141/ia.45.5
Huggett, J. (2015). Digital haystacks: Open data and the transformation of archaeological knowledge, In A. T. Wil-
son & B. Edwards (Eds.), Open source archaeology: Ethics and practice (pp. 6–29). Walter de Gruyter GmbH & Co KG.
Hunter, G. J., & Beard, K. (1992). Understanding error in spatial databases. Australian Surveyor, 37(2), 108–119.
https://doi.org/10.1080/00050326.1992.10438784
Kansa, E. C. (2011). Introduction: New directions for the digital past. In E. C. Kansa, S. W. Kansa, & E. Watrall
(Eds.), Archaeology 2.0: New approaches to communication and collaboration (pp. 1–25). Los Angeles, CA: Cotsen
Institute of Archaeology Press.
Kansa, E. C. (2014). The need to humanize open science. In S. Moore (Ed.), Issues in open research data (pp. 31–58).
Ubiquity Press. doi:10.5334/ban.c.
Kansa, E. C., Kansa, S. W., & Arbuckle, B. (2014). Publishing and pushing: Mixing models for communicating
research data in archaeology. International Journal of Digital Curation, 9(1), 57–70. https://doi.org/10.2218/ijdc.
v9i1.301
Kintigh, K. (2006). The promise and challenge of archaeological data integration. American Antiquity, 71(3), 567–578.
Kintigh, K., Altschul, J. H., Kinzig, A. P., Limp, W. F., Michener, W. K., Sabloff, J. A., . . . Lynch, C. A. (2015).
Cultural dynamics, deep time, and data: Planning cyberinfrastructure investments for archaeology. Advances in
Archaeological Practice, 3(1), 1–15.
Kohl, P. L., & Fawcett, C. (Eds.). (1995). Nationalism, politics and the practice of archaeology. Cambridge: Cambridge
University Press.
Kohl, P. L., Kozelsky, M., & Ben-­Yehuda, N. (Eds.). (2007). Selective remembrances: Archaeology in the construction, com-
memoration and consecration of national pasts. Chicago: The University of Chicago Press.
Kolar, J., Macek, M., Tkáč, P., & Szabó, P. (2015). Spatio-­temporal modelling as a way to reconstruct patterns of past
human activities. Archaeometry, 58(3), 513–528. doi:10.1111/arcm.12182
Leeper, T. J. (2015). Collecting thoughts about data versioning: Contribute to Leeper/data-­versioning development by creating
an account on GitHub. Retrieved October 2018, from https://github.com/leeper/data-­versioning
Leroux, A., Lemonsu, A., Bélair, S., & Mailhot, J. (2009). Automated urban land use and land cover classification for
mesoscale atmospheric modeling over Canadian cities. Geomatica, 63(1), 13–24.
Levy, T. E., & Smith, N. G. (2007). On-­site GIS digital archaeology: GIS-­based excavation recording in southern
Jordan. In T. E. Levy (Ed.), Crossing Jordan: North American contributions to the archaeology of Jordan (pp. 47–58).
Oakville, CT: Equinox Publishing.
Llobera, M. (2007). Reconstructing visual landscapes. World Archaeology, 39(1), 51–69. http://doi.org/10.1080/
00438240601136496
Marwick, B. (2017). Computational reproducibility in archaeological research: Basic principles and a case study of
their implementation. Journal of Archaeological Method and Theory, 24(2), 424–450.
Marwick, B. (2018). Using R and related tools for reproducible research in archaeology. In J. Kitzes, D. Turek, &
F. Deniz (Eds.), The practice of reproducible research: Case studies and lessons from the data-­intensive sciences. Oakland, CA:
University of California Press. Retrieved from www.practicereproducibleresearch.org/case-­studies/benmarwick.
html
Marwick, B., d’Alpoim Guedes, J., Barton, C. M., Bates, L. A., Baxter, M., Bevan, A., . . . Wren, C. D. (2017). Open
science in archaeology. The SAA Archaeological Record, 17(4), 8–14.
McCoy, M. D. (2017). Geospatial big data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94.
Meskell, L. (2005). Archaeology under fire: Nationalism, politics and heritage in the eastern Mediterranean and Middle East.
London: Routledge.
Molloy, J. C. (2011). The Open Knowledge Foundation: Open data means better science. PLoS Biology, 9(12),
e1001195. doi:10.1371/journal.pbio.1001195
Murrieta-­Flores, P., & Gregory, I. (2015). Further frontiers in GIS: Extending spatial analysis to textual sources in
archaeology. Open Archaeology, 1(1), 166–175.
National Science Foundation. (2017). Dissemination and sharing of research results. Retrieved February 2018, from www.
nsf.gov/bfa/dias/policy/dmp.jsp
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., & Buck, S. (2015). Promoting an
open research culture. Science, 348(6242), 1422–1425. doi:10.1126/science.aab2374
ORBIS: The Stanford geospatial network model of the Roman world. (2015). Stanford University Libraries. Retrieved
October 2018, from http://orbis.stanford.edu/
Osborne, J. W. (2013). Best practices in data cleaning: A complete guide to everything you need to do before and after collecting
your data. Retrieved from http://srmo.sagepub.com/view/best-­practices-­in-­data-­cleaning/SAGE.xml
Pelagios Commons. (2018). Linking the places of our past. Retrieved October 2018, from http://commons.pelagios.
org/
Plewe, B. (2002). The nature of uncertainty in historical geographic information. Transactions in GIS, 6(4), 431–456.
Porter, S. T., Roussel, M., & Soressi, M. (2016). A simple photogrammetry rig for the reliable creation of 3D artifact
models in the field: Lithic examples from the Early Upper Paleolithic sequence of Les Cottés (France). Advances
in Archaeological Practice, 4(1), 71–86.
Rabinowitz, A. (2014). It’s about time: Historical periodization and linked ancient world data. ISAW Papers, 7(22).
Retrieved March 2018, from http://dlib.nyu.edu/awdl/isaw/isaw-­papers/7/rabinowitz/
Rahm, E., & Do, H. H. (2000). Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin,
23(4), 3–13.
Roebroeks, W., Gaudzinski-­Windheuser, S., Baales, M., & Kahlke, R.-­D. (2017). Uneven data quality and the earliest
occupation of Europe: The case of Untermassfeld (Germany). Journal of Paleolithic Archaeology, 1(1), 5–31. https://
doi.org/10.1007/s41982-­017-­0003-­5
Roosevelt, C. H., Cobb, P., Moss, E., Olson, B. R., & Ünlüsoy, S. (2015). Excavation is destruction digitization:
Advances in archaeological practice. Journal of Field Archaeology, 40(3), 325–346. https://doi.org/10.1179/2042
458215Y.0000000004
Silberman, N. A. (1989). Between past and present: Archaeology, ideology, and nationalism in the modern Middle East. New
York: Holt. Retrieved from http://hdl.handle.net/2027/heb.02303.0001.001
Sitara, M., & Vouligea, E. (2014). Open access to archeological data and the Greek law. In A. Sideridis, Z. Kardasi-
adou, C. Yialouris, & V. Zorkadis (Eds.), E-­democracy, security, privacy and trust in a digital world. e-­Democracy 2013.
Communications in Computer and Information Science, 441. Cham: Springer.
Snow, D. R., Gahegan, M., Giles, C. L., Hirth, K. G., Milner, G. R., Mitra, P., & Wang, J. Z. (2006). Cybertools and
archaeology. Science, 311(5763), 958–959.
Strupler, N., & Wilkinson, T. C. (2017). Reproducibility in the field: Transparency, version control and collaboration
on the Project Panormos survey. Open Archaeology, 3(1). https://doi.org/10.1515/opar-2017-0019
Thompson, P. A., Matloff, N., Fu, A., & Shin, A. (2017, August). Having your cake and eating it too: Scripted
workflows for image manipulation. arXiv:1709.07406 [eess]. Retrieved from http://arxiv.org/abs/1709.07406
Tri-­Agency Statement of Principles on Digital Data Management. (2016). Retrieved February 2018, from www.
science.gc.ca/eic/site/063.nsf/eng/h_83F7624E.html?OpenDocument
Trigger, B. (2006). A history of archaeological thought (2nd ed.). New York: Cambridge University Press.
Van den Broeck, J., Argeseanu Cunningham, S., Eeckels, R., & Herbst, K. (2005). Data cleaning: Detecting, diagnos-
ing, and editing data abnormalities. PLoS Medicine, 2(10), e267. https://doi.org/10.1371/journal.pmed.0020267
Van der Linden, M., & Webley, L. (2012). Introduction: Development-­led archaeology in northwest Europe: Frame-
works, practices and outcomes. In L. Webley, M. Van der Linden, C. Haselgrove, & R. Bradley (Eds.), Development-­
led archaeology in Northwest Europe proceedings of a round table at the University of Leicester 19th–21st November 2009
(pp. 1–8). Oxford: Oxbow.
Vincent, M. L., Kuester, F., & Levy, T. E. (2014). OpenDig: Contextualizing the past from the field to the web. Medi-
terranean Archaeology and Archaeometry, 14(4), 109–116.
Wells, J. (2011). Four states of Mississippian data: Best practices at work integrating information from four SHPO
databases in a GIS-­structured archaeological Atlas. Society for American archaeology e-­symposium. Retrieved
from http://visiblepast.net/see/americas/four-­states-­of-­mississippian-­data-­best-­practices-­at-­work-­integrating-­
information-­from-­four-­shpo-­databases-­in-­a-­g is-­structured-­archaeological-­atlas/
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor & Francis.
Wheaton, J. M., Garrard, C., Whitehead, K., & Volk, C. J. (2012). A simple, interactive GIS tool for transforming
assumed total station surveys to real world coordinates: The CHaMP transformation tool. Computers & Geosci-
ences, 42, 28–36.
Willems, W. J. H., & Brandt, R. (2004). Dutch archaeology quality standard. Den Haag: Rijksinspectie voor de
Archeologie.
Wilshusen, R. H., Heilen, M., Catts, W., de Dufour, K., & Jones, B. (2016). Archaeological survey data qual-
ity, durability, and use in the United States. Advances in Archaeological Practice, 4(2), 106–117. https://doi.
org/10.7183/2326-­3768.4.2.106
Wylie, A. (2002). Thinking from things: Essays in the philosophy of archaeology. Berkeley, CA: University of California
Press.
Yulmetova, M. (2018). Python script: Transformation of local coordinates to global coordinates. Retrieved March 2018, from
https://github.com/MariaYulmetova88/Transferring-­local-­coordinates-­to-­UTM-­using-­the-­GPS-­coordinates
Zoghlami, A., de Runz, C., Akdag, H., & Pargny, D. (2012). Through a fuzzy spatiotemporal information system for
handling excavation data. In J. Gensel, D. Josselin, & D. Vandenbroucke (Eds.), Bridging the geographic informa-
tion sciences: International AGILE’2012 Conference, Avignon (France), April 24–27, 2012 (pp. 179–196). New York:
Springer.
3
Spatial sampling
Edward B. Banning

Introduction

The purpose of sampling and a very brief history


Although many archaeologists had already recognized that their work involved some kind of sampling,
it was not until the 1960s that they began to discuss this explicitly. Vescelius (1960) emphasized the role
of sampling in making estimates of artifact proportions in a population on the basis of a sample. Popula-
tions (sometimes called “universes”) are sets that contain the totality of entities (sites, artifacts, spaces)
that interest us, while samples are subsets of those populations from which we hope to make useful and
accurate estimates of the populations’ characteristics (called “parameters”), such as average house size or
the ratio of burned to unburned bone, or that we want to compare statistically with some other popula-
tion or populations.
Soon after Vescelius, Binford’s (1964) discussion of research design electrified North American archae-
ologists and quickly made sampling central to their concerns. This had the advantage of persuading many
archaeologists to think more carefully about the ways they framed and investigated research questions
but, paradoxically, also fossilized some aspects of archaeological practice and led to some misunderstand-
ing of sampling generally and spatial sampling in particular. Binford began a trend toward enshrining
formal sample design as an inevitable aspect of archaeological research, starting immediately with Root-
enberg’s (1964) sampling programme and leading to categorical statements in the 1970s to the effect that
“regional archaeology cannot be undertaken in the absence of formalized probability sampling methods”
(Judge, Ebert, & Hitchcock, 1975, p. 118).
This historical trajectory had several unfortunate outcomes. One was the misconception that sampling
was merely a regrettable necessity because it was impractical or perhaps unethical to examine entire popu-
lations. Another was that to be “scientific,” archaeologists must use geometrically rigid sampling frames
and arbitrary sampling fractions, a “cookie-­cutter” approach that many archaeologists found attractive by
1980. Another was the loss of research programs that were either narrowly targeted in order to solve very
particular problems, or broad and extensive. Another was the belief that by experimenting on artificial
spatial populations, often with computer simulations, we could determine the “best” way to sample just
about anything. The most serious outcome was inflexibility and an inability to appreciate the role of prior
information in sample design (Hole, 1980); in fact, many archaeologists either summarized then ignored
useful information at their disposal or explicitly excluded it from their research plans.
Eventually, a backlash against simplistic and poorly conceived sampling designs led to the misplaced
rejection of sampling altogether, based on the idea that “such ‘samples’ often fail to capture the true vari-
ability present in the archaeological record” (Tartaron, 2003, p. 23). The sudden popularity of “full coverage
surveys” (Fish & Kowalewski, 1990) yielded sets of data that addressed some kinds of research questions more
effectively than a typical sample could but, in many cases, were actually still samples (Cowgill, 1990, p. 254),
albeit disguised as whole populations, at scales usually much smaller than those of older extensive surveys.
Because of the impression that the result was a whole population, often these projects paid no explicit atten-
tion to survey intensity, sample size, sampling error, or bias in the estimates that we might base on them (cf.
Cowgill, 1990; Plog, 1990; Tartaron, 2003). In part, this rejection of sampling resulted from the tendency to
confuse sampling with either searching (as for rare sites) or detection of spatial patterning (e.g., settlement
networks). Far too many introductory archaeology texts contributed to this confusion by describing sam-
pling as a method for “finding sites.” Conventional sampling is indeed a poor method for finding rare things
(Flannery, 1976, pp. 134–135; Redman, 1987, p. 251) or identifying extensive spatial structure (Banning,
2002, pp. 155–156). Archaeological sampling in a spatial context has almost disappeared from the scholarly
literature of the last three decades, despite the publication of Orton’s (2000) important book, mention of
sample design in standard references (e.g., Banning, 2002; Collins & Molyneaux, 2003; Drennan, 2010;
White & King, 2007), and the continued use of sampling in contract archaeology.
In reality, sampling is a tool that is useful in some situations, unhelpful in others, and that always
requires tailoring to the purpose at hand. The fact that poorly designed samples lead to incorrect con-
clusions or do not accomplish project goals (Redman, 1987, pp. 250–251) should not be an indictment
of sampling. We should also keep in mind that nearly a century of statistical research has explored the
nature of samples, their efficiency, statistics and sampling errors, making it unnecessary for archaeologists
to “reinvent the wheel.”

Method

The population and sampling frame


Simply put, a population is the set of all the elements whose characteristics we wish to summarize,
compare, or understand or, in some cases, the hypothetical (and in some sense infinite) set of all possible
observations we might make, while a sample is the subset of elements we actually examine or observa-
tions we actually make. However, it is not always obvious what the population is. Often archaeologists
think they have sampled a population of sites or artifacts when they have actually sampled areas on a map
or volumes of sediment in a site or feature. Sometimes we also need to distinguish between a “target”
population, the one in which we are really interested, and the “sampled” population, the one we are really
sampling, because various processes such as landscape erosion, differential preservation, and inaccessibility
make some parts of the target population unavailable. Zooarchaeologists and paleoethnobotanists make
even further distinctions, ranging from the “life assemblage” to the “fossil” and “sampled assemblages,”
that have broader application (Orton, 2000, pp. 41–42; Ringrose, 1993). One of the things that Binford
(1964, p. 428) and those who followed strongly encouraged was careful thought about what, exactly, our
populations and the elements that constitute them are.
The sampling frame is a way of structuring or ordering the population to facilitate our selection of a
subset from it. The sampling frame can be as simple as a list of all members of the population, such as a list
of artifacts on a museum’s shelves, from which we could select some artifacts for analysis. As archaeologists
have long noticed, this creates a paradox for most field archaeology in that it is usually impossible for us
to specify, in advance, what sites or artifacts exist in the population. Consequently, archaeologists have
favoured geometrical sampling frames, typically a rectangular grid arbitrarily imposed on a site or a region
from which they could select some rectangles for examination and ignore others. Early on, Binford (1964,
p. 428) claimed that “The units of the frame should be approximately equal in size,” and it was widely
assumed that using a regular grid ensured that this would be the case. This type of sampling frame became
de rigueur in North American archaeology, and increasingly archaeology elsewhere, in the 1970s and 1980s.
Although archaeologists have favoured rectangular sampling grids, triangular ones are actually more
efficient. This is not only true in terms of the likelihood of intersecting sites or features, the characteristic
that archaeologists have most emphasized (Banning, 2002, pp. 97–102; Kintigh, 1988; Krakker, Shott, &
Welch, 1983; Verhagen, 2013), but also for minimizing errors in predictions based on the sample. Thanks
to geometry, a triangular grid with the same density of points is more likely than a square grid to “hit”
theoretically circular, or even oblong, sites or features that are smaller than the grid interval. In addition, the
triangular grid, by minimizing the farthest distance from sampled to non-­sampled points, minimizes the
worst predictions of population characteristics (Thompson, 2012, pp. 302–303), such as artifact densities.
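The farthest-distance point can be illustrated with a rough Monte Carlo sketch, under stated assumptions (equal point densities, a square study area, edge effects avoided by sampling an inner window); it is an illustration, not a formal proof, and all parameter values are invented. For a square-grid interval s, the worst gap approaches s/√2 ≈ 0.71s, against roughly 0.62s for the triangular grid.

```python
# Hedged simulation: worst distance from a random location to the nearest
# sampled point, for square vs. triangular grids of equal point density.
import math
import random

random.seed(0)
S, EXTENT = 10.0, 200.0                      # square-grid interval; area side

def square_grid(s, extent):
    n = int(extent / s) + 1
    return [(i * s, j * s) for i in range(n) for j in range(n)]

def triangular_grid(s, extent):
    a = s * math.sqrt(2 / math.sqrt(3))      # point spacing matching the density
    dy = a * math.sqrt(3) / 2                # row spacing
    points, j = [], 0
    while j * dy <= extent:
        x = a / 2 if j % 2 else 0.0          # offset alternate rows
        while x <= extent:
            points.append((x, j * dy))
            x += a
        j += 1
    return points

def worst_gap(points, extent, trials=20000):
    worst = 0.0
    for _ in range(trials):
        cx = random.uniform(extent * 0.25, extent * 0.75)   # inner window only
        cy = random.uniform(extent * 0.25, extent * 0.75)
        worst = max(worst, min(math.hypot(px - cx, py - cy) for px, py in points))
    return worst

print("square     :", round(worst_gap(square_grid(S, EXTENT), EXTENT), 2))      # ~ s/sqrt(2)
print("triangular :", round(worst_gap(triangular_grid(S, EXTENT), EXTENT), 2))  # ~ 0.62 * s
```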
However, there is nothing sacred about square, triangular, hexagonal, or any other kind of geometric
sampling frames (Wobst, 1983). In fact, given that variability in both cultural and non-­cultural spatial
information is unlikely to correspond even remotely with the borders of such units, and that some parts
of these units might even be inaccessible to observation (Binford, 1964, p. 428; Hole, 1980), they are argu-
ably a rather poor choice. A unit that is nominally 500 m × 500 m, but a third of which consists of a lake,
a steep and eroded slope, or a shopping mall, is clearly not comparable to one that has none of these. More
“natural” or “non-­arbitrary” sampling frames, such as ones based on geological landscape elements at the
regional scale (e.g., Banning, 1996, 2002; Collins & Molyneaux, 2003, p. 21; Orton, 2000, pp. 3, 86; Sch-
langer, 1992; Stafford, 1995), or city blocks in an urban setting (e.g., Wallace-­Hadrill, 1990, pp. 153–156)
can be more useful and also much more relevant to the variables in which we are most interested.
In addition, while spatial sampling always involves two dimensions, it is important to recognize that
it can often have a third: the thickness, depth or vertical component of deposits within a feature, site, or
landscape (Orton, 2000, pp. 167–168). As with two-­dimensional sampling frames, this third dimension
can be geometrical and arbitrary, as with “arbitrary spits,” but it is usually better to use “natural” strati-
graphic boundaries whenever possible.
One of the pitfalls of spatial sampling frames is that it is easy to forget that they formally describe a
population that consists of spatial units, which are rarely congruent with sites, buildings or features, and
never with artifacts. It is important to remember that when we sample with spatial units as a way to get
at a population that consists of smaller things – like sites or artifacts – that these units contain, then we
are doing cluster sampling (see Cluster Sampling).

Sample size and sampling fraction


Sample size is the absolute number of elements from the population that we subject to examination while
sampling fraction is the proportion of the population in the resulting sample. Conventionally, we denote
the number of elements in the population as N and the number in the sample as n, while the sampling
fraction is n/N.
Archaeologists, having noticed that sample size has a large impact on the calculation of the Stan-
dard Error of estimates, often emphasize sample size at the expense of sampling fraction. However, the
Standard Error is not the only consideration, and shrinking the size of spatial sample elements just to
increase sample size artificially has unintended and undesirable consequences (Hole, 1980, p. 226), such
as a tendency to result in many elements with “zero” observations, or by increasing the tendency for sites
or features to fall into more than one spatial element (an “edge effect”). In reality, we need to balance
the spatial size of sample elements with other considerations, such as whether they will have sufficiently
uniform character within their boundaries, whether they are likely to have non-­zero observations (for
cluster samples), whether accessibility or other practical factors will become a problem, how uniform or
“patchy” the sampled region is, and how much travel time or set-­up time the sample size would entail.
Fixed sample sizes are ones that involve deciding, in advance, how large the sample will be, typically
on the basis of the cost of collecting or analyzing the sample. When we can calculate these costs, we can
simply budget for a particular sample size. For example, in a regional survey, we might estimate that our
survey team can walk a total of 60 km each day; if we have budgeted 30 field days to complete a survey,
then we could survey the number of sample elements that 1800 km would cover at whatever intensity or
coverage (Banning, Hawkins, & Stewart, 2011) we would like to accomplish, although it is a good idea
also to account for set-­up and travel time between units. Alternatively, we might try to balance these costs
with the risk that the resulting sample will not meet particular objectives, such as being able to estimate
some parameter of interest or compare populations within a certain tolerance of precision, statistical
power, and confidence (see examples from McManamon, 1981 and Lee, 2012, below).

Simple random sampling


Simple random sampling (SRS) is a type of probability sampling that involves selecting a subset of the
population in such a way that each and every member of the population has an equal opportunity of
being included in the sample on each and every selection (“with replacement”) or just once (“without
replacement”). Practically, this involves using a random-­number table or software that generates random
numbers to select the spaces we will examine (Figure 3.1(a)). Archaeologists have tended to favour this
design, often under the misconception that it is the only one that will provide unbiased estimators of
population parameters. In most disciplines, researchers resort to SRS only when they have no information
at all on which to base a better design, a thankfully rare scenario in archaeology.
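To make the selection step concrete, here is a minimal sketch in Python (not part of the original text), assuming a hypothetical sampling frame of 500 m × 500 m grid squares indexed by column and row; the standard-library random.sample() draws without replacement.

```python
import random

# Hypothetical sampling frame: a 20 x 20 grid of 500 m x 500 m squares,
# indexed by (column, row)
frame = [(col, row) for col in range(20) for row in range(20)]

random.seed(42)  # fix the seed so the selection can be reproduced
srs = random.sample(frame, k=20)  # SRS without replacement: every square has
                                  # an equal chance of entering the sample
print(sorted(srs))
```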

Systematic sampling
Unlike SRS, this design involves making only the first selection randomly; every subsequent selection is
strictly determined by a "spacing" rule, such as taking every fifth element in a population ordered by the
sampling frame. In a spatial context, this usually means sampling by parallel, equally spaced transects, or
at the intersections of a regular grid, whether rectangular or not (Figure 3.1(b)). It is thus most common
in archaeological survey by fieldwalking, ground-penetrating radar, or magnetometry (transects) and
shovel-testing, augering, soil resistivity survey or some other form of subsurface testing (grid), but it also
occurs in many instances of sampling surface artifacts within sites (e.g., Whitelaw, Bredaki, & Vasilakis,
2006). Systematic sampling ensures that observations are spatially spread out, thus minimizing the effects of
spatial autocorrelation (the tendency for observations close in space to have closely similar values; see
Hacıgüzeller, this volume; see also Lloyd and Atkinson, this volume), but it can be sensitive to periodic
patterns in the population itself. For example, it would be risky to use a systematic design to sample an
ancient city that was itself laid out on a regular grid.

Figure 3.1 Examples of random and systematic spatial samples using points, rectangles, and transects as the
sample elements: (a) random point sample, (b) systematic transect sample (walking north), and (c) systematic,
stratified, unaligned sample of small squares.
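By way of illustration, a sketch of the spatial version (the rectangular region and north-south transects below are hypothetical): only the offset of the first transect is random, and every later position follows mechanically from the spacing rule.

```python
import random

def transect_positions(region_width, spacing):
    """X-coordinates of parallel, equally spaced transects: a single random
    offset, after which every position is fixed by the spacing rule."""
    offset = random.uniform(0, spacing)
    positions = []
    x = offset
    while x < region_width:
        positions.append(x)
        x += spacing
    return positions

# e.g. north-south transects at 50 m intervals across a 1000 m wide region
print(transect_positions(region_width=1000, spacing=50))
```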

Stratified sampling
Stratified sampling takes prior information into account so as to ensure that important aspects of the
population’s variability are reflected in the sample. This involves subdividing the population into sub-
populations (or “strata”) that are different from one another in some meaningful way. Strata that are
highly arbitrary are useless or worse. From the statistical perspective, strata should differ significantly in
the parameters of interest; consequently, it is important to compare strata after sampling is complete to
ensure that the sample design was successful in differentiating these subpopulations. Some of the more
obvious and meaningful grounds on which to base stratification include differential probability of site
discovery (regionally) or feature preservation (in sites), or likely land use (regionally) or activity area (sites).
Stratifying by soil type or altitude only makes sense if we have an explicit theory as to why sites on red
soils or in highlands might differ in their character or distribution from ones on grey soils or in lowlands,
for example, while stratifying a site by cardinal directions (northwest quarter, etc.) would likely only make
sense by accident. Many archaeological applications of stratified sampling in the 1970s and 1980s failed to
justify their basis for stratification, and their stratification may have done little to improve inferences based on the samples
(Wobst, 1983, pp. 59–60). It is better to stratify within sites, for example, when we have information that
would lead us to believe that certain areas within them were predominantly industrial, cultic or adminis-
trative, while others were predominantly residential. As noted below, a version of stratified sampling can
also be useful in the context of testing specific spatial hypotheses.
Commonly, stratified sampling is proportional; that is, the sampling fraction in each of the strata is the
same. This can lead to problems when the strata themselves differ markedly in size because the smaller strata
might have too small sample sizes, while sampling the larger ones might be unmanageable or wasteful. In
such cases, and sometimes for other reasons, samplers may employ disproportional stratified sampling, in
which the sampling fraction varies from one stratum to another. Disproportional stratified sampling requires
weighting factors for the various strata (in effect, the inverse of each stratum's selection probability) to compensate for the fact
that the probability of selection differs from stratum to stratum, and statistics calculated to estimate popula-
tion parameters must take these weights into account (Thompson, 2012, pp. 141–146). Ideally, dispropor-
tional stratified samples provide more precise estimates by having larger sample size and less sample variance
within the smaller strata than would be the case for a proportional design. However, many archaeological
samples have accidental imbalances in stratum size that are not optimal and may lead to larger variance.
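The allocation and estimation arithmetic can be sketched in a few lines; the strata and their sizes below are hypothetical. A proportional design applies a single sampling fraction throughout, and the stratified estimate of a population mean weights each stratum's sample mean by that stratum's share of the population.

```python
# Hypothetical strata: number of sampling units (N_h) in each
stratum_sizes = {"floodplain": 400, "terrace": 250, "upland": 150}

def proportional_allocation(sizes, fraction):
    """Proportional design: the same sampling fraction n_h / N_h in every stratum."""
    return {h: round(N_h * fraction) for h, N_h in sizes.items()}

def stratified_mean(sizes, stratum_means):
    """Weight each stratum's sample mean by its share of the population."""
    N = sum(sizes.values())
    return sum(sizes[h] * stratum_means[h] for h in sizes) / N

print(proportional_allocation(stratum_sizes, 0.05))
print(stratified_mean(stratum_sizes,
                      {"floodplain": 2.1, "terrace": 0.8, "upland": 0.3}))
```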

Other probability sampling designs


Archaeologists have experimented with other variations on the random sampling model. One that was
popular for a time is systematic stratified unaligned sampling. This is a spatial sampling design that involves
arbitrarily stratifying space into rectangular strata, then selecting random coordinates in such a way as
to ensure that each of these strata will have the same number of sampled elements, but that none of the
elements will “line up” (Figure 3.1c). This strategy ensures a somewhat even sampling of the spatial popu-
lation without as artificial an arrangement as a purely systematic strategy would have, while also avoiding
problems that might result from spatially periodic patterns in the population, but it retains the disadvantages
of an artificially geometric sampling frame and omits the most important advantages of stratified sampling.
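One simple way to implement the idea (a sketch; the grid of rectangular strata is hypothetical) is to draw a single random coordinate pair inside each stratum, which gives every stratum the same number of sampled elements while preventing the points from lining up.

```python
import random

def unaligned_sample(n_cols, n_rows, cell_size):
    """One random point per rectangular stratum (cell): reasonably even
    coverage of the region, with no alignment along rows or columns."""
    points = []
    for i in range(n_cols):
        for j in range(n_rows):
            points.append((i * cell_size + random.uniform(0, cell_size),
                           j * cell_size + random.uniform(0, cell_size)))
    return points

sample = unaligned_sample(n_cols=5, n_rows=4, cell_size=500)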
Another somewhat common sampling design is Probability Proportional to Size (PPS) sampling. In
this design, the population consists of spatial entities that vary in size (e.g., plots on a landscape, build-
ings in a site, or rock grains in a petrographic slide). Rather than randomly selecting these items directly,
sampling proceeds by imposing a set of randomly or systematically arranged points or line segments on
the area that encloses the population and then selecting every element that is intersected by a sampling
location (Figure 3.2; Orton, 2000, p. 186). Larger elements have a higher probability of intersection than
small ones (thus the PPS designation), which could result in biased estimates if we are interested in esti-
mating such parameters as average size unless we compensate for this effect. If, instead, we are interested
in estimating something like the total population of human communities that occupied a region, or the
ratio of one artifact type to another, there is no bias as long as we can reasonably expect that the density
of human occupation or the artifact ratio is much the same in large and small spaces. PPS sampling, typically
with a grid of points, has been used to identify places that will be subsampled with archaeological survey
(e.g., Kuna, 1998).

Figure 3.2 Example of a random Probability Proportional to Size (PPS) sample of agricultural fields used as
sampling elements. Any field that contains one or more of the random points is included in the sample
(hatched). Note how larger fields are over-represented, but this may have practical advantages in fieldwork
in terms of survey costs.
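A sketch of the point-intersection procedure, assuming the third-party shapely package and a handful of hypothetical rectangular fields: any field containing at least one random point enters the sample, so larger fields are selected with proportionally higher probability.

```python
import random
from shapely.geometry import Point, box

# Hypothetical fields of very different sizes within a 1000 m x 1000 m region
fields = {"A": box(0, 0, 700, 600), "B": box(700, 0, 1000, 600),
          "C": box(0, 600, 300, 1000), "D": box(300, 600, 1000, 1000)}

random.seed(7)
points = [Point(random.uniform(0, 1000), random.uniform(0, 1000))
          for _ in range(12)]

# PPS selection: a field is "hit" in proportion to its area
sample = [name for name, field in fields.items()
          if any(field.contains(p) for p in points)]
print(sample)
```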

Cluster sampling
As mentioned above, whenever we use a sampling frame of spatial areas, whether geometrical or not, as a
means to select a subset of a population of some other kind of entity, such as sites, mounds, pits, features,
artifacts, seeds, or bone fragments, we are cluster sampling. In other words, the sampling elements are not
identical to the elements in the population of interest. For cluster samples, N and n are numbers of larger
sample elements or “clusters” in the population and sample respectively, while M is the number of things
in the population and m the number in our sample (Mueller, 1975; Orton, 2000, pp. 212–213; Read, 1975,
pp. 54–58; Thompson, 2012, pp. 157–166).
Calculating estimates of proportion, mean or density in such samples is not difficult. We can conceive of
the sample elements as n “clusters” and base the estimates on the total number of observations (m) across all
sampled clusters. So, the density of flakes from an excavation using a 1 m grid as a sampling frame would just
be the total number of flakes found in all the sampled 1 m² units divided by the number of sampled units.
Calculating variance and standard error is somewhat more complicated, however, as we need to examine
how much each individual cluster deviates from the overall mean, proportion or density. The calculations
that most basic statistical and spreadsheet software provides unfortunately do not account for cluster sampling
and lead to biased estimates of dispersion or error. However, we can still calculate them properly by tracking
these deviations in a spreadsheet and applying appropriate formulae (Banning, 2000, p. 83; Drennan, 2010,
pp. 243–248; Orton, 2000, pp. 212–213). One other complication is the presence of “edge effects” when, for
example, a site falls partly within, but partly outside, an element of our spatial sample. Do we count it? Count-
ing whole sites in such cases leads to bias (i.e. the overestimation of site number and settled area). Among
the potential solutions is to count it as a fraction of a site, or only if its centre lies within the sample element.
Failure to deal with edge effects properly can lead to biased estimates from cluster samples.
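For a hypothetical set of counts, the pooled density and a between-cluster standard error might be sketched as follows; note that the error term comes from how individual clusters deviate from the overall density, not from treating the m artifacts as independent observations (the finite population correction is omitted for brevity).

```python
from math import sqrt
from statistics import stdev

# Hypothetical flake counts from n sampled 1 m^2 units (the "clusters")
counts = [0, 3, 1, 0, 7, 2, 0, 1, 4, 0]

n = len(counts)
density = sum(counts) / n     # flakes per m^2, pooled over sampled units
se = stdev(counts) / sqrt(n)  # SE from between-cluster deviations
print(f"density = {density:.2f} per m^2, SE = {se:.2f}")
```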

Sequential and adaptive sampling


As alternatives to fixed-­sample-­size designs, adaptive and sequential sampling allow us to increase the
sample size until we satisfy some criterion. Archaeologists do not formally use sequential sampling as
much as they probably should, but a version of it called “sampling to redundancy” has seen some use in
zooarchaeology, paleoethnobotany, and artifact analysis (Leonard, 1987; Lyman, 1995). Sequential sam-
pling uses a “stopping rule,” which can be as simple as “stop sampling when the standard error is below
x” or “stop sampling when five sequential sample increments show no increase in diversity.” It is less
easy to use in a field context, since field seasons are rarely open-­ended, but could include repeating the
survey of a particular space until the posterior probability that it contains the material of interest, given
the amount of survey to date, falls below a particular threshold or below the probability of other spaces
(Hitchings, Abu Jayyab, Bikoulis, & Banning, 2013).
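A minimal sketch of a stopping rule of the first kind; draw() is a hypothetical stand-in for whatever process yields the next observation, and the simulated values at the end merely demonstrate the loop.

```python
import random
from math import sqrt
from statistics import stdev

def sample_until(draw, se_threshold, minimum=5, maximum=1000):
    """Sequential sampling with a simple stopping rule: keep adding
    observations until the standard error of the mean drops below
    se_threshold (or until a hard maximum is reached)."""
    values = []
    while len(values) < maximum:
        values.append(draw())
        if (len(values) >= minimum
                and stdev(values) / sqrt(len(values)) < se_threshold):
            break
    return values

# Hypothetical use, with simulated observations standing in for real counts
observations = sample_until(lambda: random.gauss(10, 3), se_threshold=0.5)
print(len(observations))
```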
Adaptive cluster sampling is a little different. In the spatial case, this design requires us to add a certain
number of sampling units in the “neighbourhood” of a unit that has had a “positive” observation, typi-
cally meaning that it has at least one artifact (Orton, 2000, pp. 34–38; Thompson & Seber, 1996). The
result of this process is a set of “networks” that each consist of one or more interconnected or adjacent
sampling units surrounded by “empty” sampling units. This is a very promising approach but, as with
cluster sampling, we need to be careful to use the correct statistics for this kind of sample; calculating the
mean and standard deviation for an adaptive sample requires us to account for the total number of sam-
pling elements in the population, the number of elements in the original sample, the number of networks,
and the number of elements in each network in order to avoid bias (Orton, 2000, p. 214; Thompson &
Seber, 1996, p. 96). Some North American jurisdictions now recommend or require professional archae-
ologists to use some version of adaptive sampling for survey by shovel-­testing (e.g., Ministry of Tourism,
Culture and Sport [MTCS], 2011, p. 33), even though these cases are usually intended for discovery, not
estimation, and do not always conform to the statistical form of adaptive sampling. Consequently, they
may not provide unbiased estimates of population parameters.
One of the misconceptions that many archaeologists hold is that the formal or statistical sampling
methods discussed above are appropriate for the discovery or detection of sites and features in space. As
already noted, their actual purpose is to make inferences about populations, or to compare population
characteristics, not to ensure our detection of any particular observation (Shott, 1985; Wobst, 1983). In
fact, it should be quite obvious that a method whose premise is that we can make inferences on the basis
of a small subset of potential observations would, by definition, omit the majority of the population. The
only type of sampling that can maximize our chances of finding something in particular, let alone finding
almost everything, is purposive selection.

Non-­probability sampling
Making inferences about populations – estimating their parameters, setting confidence intervals, or test-
ing statistical hypotheses about them – is relatively straightforward when we use any kind of probability
sampling, at least in the sense of avoiding uncontrollable bias. However, many archaeological samples were
not collected according to any formal probability sampling plan and there are even cases where it is more
effective or more reasonable to use a purposive (or authoritative) sample.
The simplest kind of non-­probability sample is the convenience sample. This type of sampling involves
just taking whatever observations come most readily to hand, and results in a sample that has a high risk
of providing biased estimators. For example, spatial convenience samples are often clustered in space (e.g.,
samples taken along roads where accessibility is high) so that they are susceptible to spatial autocorrelation
and may not be representative of a much larger population space. From a Bayesian perspective, convenience
samples may still be useful as long as there is no a priori reason to think that one group of potential observa-
tions will differ from another group in the population with respect to the variables of interest (a concept
called exchangeability, Buck, Cavanagh, & Litton, 1996, pp. 73–74). In the case of sampling by intervals along
a pre-­existing, straight road or pipeline corridor, for example, the resulting sample might be representative
of a population with respect to characterizing clay sources in the surrounding region as long as the road or
pipeline crosscuts the population space in a way that is not expected to favour particular soil types. However,
sampling along a road that follows the route of an ancient road or pathway would be a poor way to charac-
terize the proportions of pottery of different periods in the region, since sites and activities associated with
one period might cluster along the road while those of other periods have quite different spatial patterns.

Purposive selection and optimal searching


Rather more formal than a simple convenience sample is the purposive sample. Archaeologists have
tended to hold rather negative views of purposive selection despite the fact that it has historically been
extremely important in archaeology and is very common today. In part, this negative stereotype is due
to misunderstandings as to what it includes, how we can apply it, and for what objective, compounded
by a tendency to confuse it with convenience sampling. While it is true that convenience sampling is
likely to lead to biased estimates of population parameters, careful purposive designs can be much more
efficient than random ones when the goal is either to test specific hypotheses or to discover some kind of
“target,” such as a site or a feature, especially when the target is rare. They can also perform better than
probabilistic designs when prior information convincingly shows that sampling in some spaces would be
ineffective or wasteful.
Unlike convenience sampling, careful purposive selection involves many of the same steps as prob-
ability sampling, including creating a sampling frame and rules of selection. However, there are also
differences. Rather than obtaining a random or representative sample, purposive selection often has the
aim of targeting rare observations that a random sample would likely miss. In such cases, it approaches
a disproportional sampling of a stratified population so that spaces considered most likely to contain the
rare sites or artifacts receive greater density of examination. Such an approach is implicit in many state-­
mandated survey guidelines that allow much lower survey intensity, or even no survey at all, in areas of
“low potential” (e.g., MTCS, 2011, p. 28). In other instances, geomorphological indications of the places
most likely to contain sites of a particular age can be the main criteria for allocation of survey (Hitchings
et al., 2013). Among the best-­known applications of purposive survey are searches for shipwrecks; the
most efficient of these are informed by information on currents, reefs, ports, and the historically docu-
mented routes that ships favoured. However, it is notable that, in many of these cases, the goal is not to
make inferences about populations but merely to discover rare observations. As outlined below, purposive
sampling can also be important in the context of very specific hypotheses.
One of the hallmarks of purposive survey is its use of prior information. To return to the example of
sampling along a road, in the context of searching for a particular kind of site, say Roman forts and way-­
stations that can be expected to cluster along ancient roads, purposive survey along the known routes of
such roads would be an excellent way to improve the chances of finding most of the relevant sites.
In excavation samples, purposive selection can be the sensible choice for a second stage of sampling
after an initial probabilistic sample (Redman, 1973). In a site with rectilinear architecture, for example, a
systematic sample might reveal segments of walls that provide no good information on the sizes of build-
ings. Following up with purposive sample units guided by the directions of these walls can lead to the
discovery of building corners, thus allowing us to reconstruct the sizes of whole buildings at relatively low
cost. However, we should keep in mind that these are likely to be, in some sense, PPS samples.
The most sophisticated versions of purposive selection involve optimal searching (Koopman, 1980;
Stone, 1975). Optimal searches are designed to locate particular kinds of observations with the lowest pos-
sible search cost, making them well-­suited to the identification of rare kinds of sites or features, and they
make explicit use of prior information to guide the search. Searching of this kind can be a very useful
supplement to the more typical kinds of sampling – which provide information on the most common
or most representative phenomena – by providing information on rare phenomena that a conventional
sample would likely omit. However, it is important to keep in mind that observations obtained through
optimal searching should not be combined with those from probabilistic sampling to estimate population
characteristics without adjustments to account for the non-­probabilistic way in which they were acquired.
For example, a more conventional sample might be sufficient to establish an upper limit (“detection limit”)
on the number or density of a rare kind of site, while the purposive sample can provide information on the
characteristics of those rare sites or document their presence when the conventional sample omits them.
Another important use of purposive selection is in the context of hypothesis testing. Some kinds of
archaeological hypotheses make specific predictions about where certain kinds of archaeological remains
should occur, and where they should not. A random sample would be an extremely ineffective way to
test such predictions, and it is obvious that it is better to target spaces, whether in regions or within sites,
where the hypothesis predicts such remains should or should not be present. This involves classifying the
sampling frame into three categories – spaces where we predict there should be certain kinds of observa-
tions, spaces where they should not occur, and all the rest – and then disproportionately targeting the "posi-
tive” or “negative” spaces, or both. In cases where we introduce probability sampling within these three
categories of space, we can more formally describe this approach as disproportional stratified sampling.
When the “positive” spaces are few, however, we would probably want to investigate all of them.
One tool that has greatly enhanced the potential and role of purposive selection is predictive model-
ling, typically within a Geographical Information System (GIS) (see Verhagen & Whitley, this volume).
Although the most obvious use of GIS in a sampling context is for purposive selection, it can also be
extremely useful to create, and later test, the strata in a stratified sample design.

GIS and sampling design


Because archaeologists use GIS so widely in their analysis of spatial problems, it makes sense to integrate
sampling into a GIS environment. One of the advantages of this is that a GIS model makes it easier for us
to use sampling frames that are a better fit to the reality on the ground. Archaeologists can use aerial or
satellite imagery in conjunction with a GIS to identify the boundaries of landscape elements or subdivi-
sions of sites that might be meaningful in terms of the probability of their containing particular kinds of
archaeological remains or the preservation of those remains, and use these as sample elements. Since such
units are likely to differ considerably in size and shape, the fact that the GIS makes it easy to calculate their
area and account for it in such measures as artifact density is also very helpful.
A GIS can also be a great asset in defining relevant strata for a stratified sample. The GIS can identify
spaces that share a large number of relevant characteristics (slope, aspect, distance to streams or conflu-
ences, nearness to least-­cost paths between major settlements, etc.), and define these as one stratum
alongside other strata that lack many or all of these characteristics. Scripts in a GIS can also be used to
select random locations, either proportionally among strata or with a defined sample size in each stratum.
In some cases, least-­cost paths determined in a GIS can serve to target purposive survey in much the
same way as Roman roads. In cases where we might expect subsidiary sites in a settlement system to occur
along routes between large settlements, for example, a high density of sampling in corridors that follow
the least-­cost paths between major settlements would be very effective.

Multistage sample designs


Rarely will a single sampling design satisfy all our requirements, so it is common to combine some of the
designs described above in an iterative way. This often involves starting out with a design that has a high
degree of randomness but reducing the role of randomness as information about the population of inter-
est accumulates (Redman, 1987, p. 259). However, it is important to keep in mind that making statistical
tests or parameter estimates should be based on the stage or stages that we can expect to be representative
of the target population, so as to avoid biased estimates or flawed conclusions.

Evaluating sampling results and sampling errors


Once you have selected your sample, collected data, and analyzed it with respect to your research ques-
tions, your sampling task is not over. It is essential to examine the success of the sampling design itself to
ensure that it was doing its job.
One thing to check is whether the sample size was adequate. Even if we attempted to determine an
appropriate sample size with one of the methods illustrated below, the attempt is not always successful.
For example, it is only prudent to check the Standard Error of the estimates based on the sample to ensure
that they are within our tolerances for whatever confidence level we have chosen.
It is also critical to check the degree to which the strata in a stratified sampling design have captured
relevant variability. If there are no statistically significant differences between the characteristics of the
strata, so that variances within strata are no smaller than the variance in the whole population, then the strati-
fication was a failure. Among the ways to make this assessment would be to compare the stratum-­level
estimates of a relevant ratio-­scale parameter, such as site size or artifact density, using a statistical method
like analysis of variance (ANOVA).
However, in the case of archaeological sampling, we also need to look out for other kinds of errors.
In addition to the sampling error that is intrinsic to sampling, and well understood by statisticians, a
variety of non-­statistical errors can have impacts on the results. Among the most common of these
is error that results from inaccessibility of some observations. For example, the sample design for a
regional survey may select some spatial units in which it was impossible to detect sites or artifacts
because of safety concerns, because a landowner, military authority or other government agency
denied access, or because the material we needed to detect was inaccessible to our detection methods,
as is usually the case whenever it is deeply buried or hidden by modern development. Similarly, a
sample of spaces within a site could easily be missing data either because the relevant materials were
destroyed or poorly preserved relative to other parts of the site, or because excavations in parts of the
site did not go deeply enough to encounter materials that are contemporary with those encountered
in the rest of the site. This effect, which is analogous to “non-­response” in questionnaire-­type sur-
veys, has an impact on the real sample size for some kinds of remains and, in some cases, could lead
to distorted (biased) inferences of site or artifact characteristics or their distributions in space. One
way to assess this is the response rate, which is simply the number of sampling units for which we
have complete data divided by the total number of sampling units that were nominally in our sample
(cf. Fowler, 1984).

Detection effects
A simplifying assumption is that any potential observation within each sample element will always be
observed without error. However, in archaeological spatial samples, potential observations are often
overlooked because of their lower detectability (Casteel, 1972; Gordon, 1993; Krakker et al., 1983).
In excavations, for example, we routinely fail to detect some of the lithics, sherds, seeds or bone frag-
ments because they are very small or difficult to distinguish from natural stones. Consequently, if we
do not account for this effect, our values of m in a cluster sample will be too low, leading to biased
estimates. Archaeologists use methods such as screening sediments in their attempt to improve detect-
ability, but these are never perfect. In some instances, it may be reasonable to assume that detectability
for a particular kind of artifact or feature is constant over our sample elements and we can account for
it with reasonable estimates of detectability (Thompson, 2012, pp. 215–219; Orton, 2000, pp. 26–27,
39; Banning et al., 2011). The more serious case is when there is differential detectability across our
population, as when we are attempting to survey artifacts of particular colour and texture against a
background of highly variable environmental characteristics (Banning, 2002, pp. 48–49), or when
there are significant inter-­observer differences (Hawkins, Stewart, & Banning, 2003). To obtain
unbiased estimates of population characteristics, we then need to take these differences into account
(Thompson, 2012, pp. 224–225).
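Under the constant-detectability assumption, the correction is a simple division, as in this hypothetical sketch; the harder case of differential detectability would require element-specific estimates rather than a single factor.

```python
def adjusted_count(observed, detectability):
    """Correct an observed count m for imperfect detection, assuming a
    constant detection probability (0 < detectability <= 1) across all
    sample elements."""
    if not 0 < detectability <= 1:
        raise ValueError("detectability must be in (0, 1]")
    return observed / detectability

# e.g. 42 sherds recovered with an estimated 70% chance of detecting any one
print(adjusted_count(42, 0.7))  # -> 60.0 sherds estimated
```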
Case studies

Stratified random sampling


A survey of a diverse and somewhat rugged landscape in southern Jordan, on the east side of the Jordan
Rift Valley, illustrates some of the advantages and potential pitfalls of stratified random sampling (MacDon-
ald, Herr, Quaintance, Clark, & Macdonald, 2012). The survey team stratified the large survey region into
three strata (Figure 3.3). Stratum 1 on the western side ranges from 1100 to 1500 m ASL and includes some
very rugged terrain, as severe erosion of the scarp along the Jordan Valley has dissected the landscape into
a large number of deep canyons and gullies, and mean annual rainfall today is about 50–150 mm. Stratum
2 to its east is a mountain range with elevations from 1500 to slightly more than 1700 m ASL, and mean
annual rainfall of about 100–300 mm. The steppic region of Stratum 3 is on the eastern plateau, sloping
down to the east from 1500 to 1200 m ASL, with mean annual rainfall also declining from about 120 mm
at its western margins to only about 50 mm where it grades into desert to its east. The sampling frame was
a square grid of 500 m × 500 m sampling elements imposed on the survey region and the team randomly
selected about 5% of these in each stratum: 27 in Stratum 1, 25 in Stratum 2, and 30 in Stratum 3. Then
survey teams walked 8 or 10 transects at approximately 50 m intervals across each 500 m unit to search for
“sites” and “diagnostic artifacts” (MacDonald et al., 2012, pp. 3–7).
One of the advantages of this approach is that it forces archaeologists to survey – or at least attempt
to survey – portions of the survey region that they might be predisposed to ignore, such as the deeply

dissected Stratum 1. In principle, it also provides a framework for making estimates of such things as the
number, density, and average size of sites, or the proportions of different artifact classes, in each stratum.

Figure 3.3 Map of the survey region of the Ayl to Ras an-Naqab Survey in southern Jordan, with three strata
and 500 m × 500 m sample elements (after MacDonald et al., 2012, p. 6).
However, it also illustrates quite well some typical problems. First, given the arbitrarily square sam-
pling elements and the very jagged borders between strata, it is not surprising that many of the sampling
elements either straddle the boundary between two strata, making it unclear to which they belong, or
lie partly outside all three strata (edge effects). Second, in six of the squares in Stratum 1, severe ero-
sion over some or all of their extent made it impossible or unsafe to survey them as thoroughly as in
other areas (MacDonald et al., 2012, p. 9). Third, even had survey been possible, many archaeological
traces will not have survived in these areas because erosion has carried them all away. Consequently,
coverage of some parts of the survey region, especially in Stratum 1, is actually less than the nominal
sampling fraction of 5%, providing the potential for biased estimates of population parameters. Fourth,
the fact that the transects within sampling elements were so widely spaced means that these elements
were themselves subsampled, so that coverage, especially for artifacts and sites less than 50 m in size,
would be much less than 5% and estimates of some parameters, such as site size, would be biased unless
we correct for these effects. Finally, travel costs between such widely scattered survey units over a large
territory are large. These are not criticisms but rather the realistic consequences of surveys that employ
low-­intensity subsampling and arbitrary, geometric sampling frames over highly diverse terrain. Strate-
gies to compensate for these challenges can include non-­geometric sample elements (see next section)
and adding random units whenever units initially selected prove inaccessible or too eroded to preserve
archaeologically interesting deposits.

Defining meaningful sampling units


Wallace-­Hadrill (1990) attempts to deal with the unpleasant realities of drawing a useful and reasonably
representative sample of houses at Pompeii, so much of which is ruined, poorly excavated, undocumented
and overgrown, while the best-­preserved houses are likely among the largest and most luxurious. Taking
advantage of the fact that Pompeii’s overall street plan is regular and quite evident, he made a selection of
insulae, or city blocks, in three areas of the site that were relatively well preserved and well documented,
including a total of 234 houses that provided a diverse cross-­section of Pompeiian society. While this
constitutes a purposive cluster sample, the lack of randomness in selection is justifiable on the grounds
that including ruined, overgrown and undocumented insulae would have made the sample effectively
useless. The sampling frame was the known and plausibly reconstructed set of insulae of the ancient city,
and selection was on the basis of good preservation and sound documentation. The resulting sample is
much more useful and reliable than a random one would have been as long as we consider carefully the
possible biases that could result: spatial autocorrelation (as the resulting sample tends to be clustered in
space), or tendency for different sectors of the site to vary functionally or socio-­economically, could both
result in biased inferences.
On a more regional scale, some surveys have eschewed the arbitrary grid in favour of more convenient
or more appropriate spatial sampling frames. In some of these surveys, the modern landscape provides no
obvious clues to units that would have been culturally meaningful in the distant past, but does provide
many landmarks that at least make it easier and faster for field crews to locate and survey those units than
would be the case if they were required to position themselves, even with GPS, at geometrically arbitrary
survey locations. One option is to use modern agricultural fields as a sample frame; this provides clearly
defined boundaries that sometimes may even have considerable antiquity.
In a Bohemian project (Kuna, 1998), fields were the primary sample elements, but were selected by
imposing randomly located coordinates on a map. This constitutes PPS sampling, as large fields are more
likely to be “hit” by at least one point than are small fields. Survey teams then subsampled the selected
fields by transects 10 m in width, subdivided into 10 m segments, and at varying intervals. Because vis-
ibility within fields is usually less variable than between fields, fields are often much better as sample
elements than arbitrary geometric units in that they make it easier to control for variability in detection
probabilities.
Taking this approach further, some archaeologists have used “landscape elements” defined on geoar-
chaeological grounds as sampling units. Where the archaeology of interest has considerable time depth, it
is possible or even likely that the landscape elements currently visible on the modern surface differ mark-
edly in age, some being remnants of ancient land surfaces that have mostly disappeared through erosion
or been deeply buried by alluviation or colluviation. In such circumstances, it makes sense to define a
sampling frame that consists of all landscape elements that are likely to be remnants of the ancient land-
scape, and either ignore younger ones or treat them as different strata in a stratified sample. In the deeply
incised drainage system of Wadi Quseiba, northern Jordan, Hitchings et al. (2013) recognized that most
of the valley floor that existed in Neolithic times had eroded away, leaving only fragments in the form of
“terraces” stranded some way up the sides of the valley walls. The set of these terraces, along with some
flat or gently sloping plateaus on the valley margins, became the sampling frame for a purposive sample
that employed predictive modelling in a GIS and Bayesian allocation methods to optimize the survey’s
chances of finding Neolithic sites (Figure 3.4).

Figure 3.4 Map of a portion of the Wadi Quseiba survey region in northern Jordan, showing the ephemeral
stream channels (dashed) and the population of landscape elements or “polygons” (hatched) that constituted
the sampling frame for Stratum 2 of this survey (after Hitchings et al., 2013).

Determining appropriate sample size


McManamon’s (1981) survey in Cape Cod provides one of the first explicit attempts to deal with the issue
of ensuring that sample size is adequate for the survey’s objectives, which were to characterize the locations,
frequency and characteristics (chronology, activities and structure) of sites. After excluding inaccessible
portions of Cape Cod from the population, he stratified it so that Stratum I included land within 200 m
of streams and shores, to reflect prior information on preferred site location, further subdivided into three
substrata, and Stratum II was the remainder. The sample elements were 100 m × 200 m rectangles and, in
Stratum I, they were distributed at random points along the shores and streams and perpendicular to them,
so that their long axes extended to the edge of the stratum. In Stratum II, the rectangles were randomly
located with their long axes oriented North-­South. After conducting a 1% pilot sample, McManamon
estimated the sample size required in each stratum or substratum to achieve a relative error (Standard Error
divided by the mean) on site frequencies of 10% at a confidence level of 80%, using the formula:

n = (st)² / (rx̄)²

where n is the number of sample elements to be surveyed in a stratum or substratum, s is the standard
deviation from the 1% pilot sample, used as an estimate of the population standard deviation (σ) for that
stratum, t is the Z-­score for 80% confidence (1.28), r is the relative error (0.1), and x̄ is the sample mean
from the pilot sample, used as a rough estimate of the population mean (μ). This allowed him to estimate
the required number of sample elements as 38 in Stratum IA, 22 in Stratum IB, and 151 in Stratum II.
With the data at hand, it was not possible to establish the required sample size for Stratum IC.
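The calculation is straightforward to script; in this sketch the pilot values are hypothetical, while t and r follow McManamon's choices.

```python
from math import ceil

def required_sample_size(s, xbar, t=1.28, r=0.1):
    """n = (st)^2 / (r * xbar)^2, where s and xbar are the pilot sample's
    standard deviation and mean, t the Z-score for the chosen confidence
    level, and r the tolerated relative error."""
    return ceil((s * t) ** 2 / (r * xbar) ** 2)

# Hypothetical pilot estimates for one stratum
print(required_sample_size(s=1.2, xbar=2.0))  # -> 59 sample elements
```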
At the intra-­site scale, Lee (2012) gridded the floors of large semi-­subterranean houses at several
Korean Mumun sites in the Nam River Valley on a 1 m grid, and collected 10 litres of sediment from
each grid square and pit feature for paleoethnobotanical analysis. Taking the total of these at each site as
the population, and given the substantial labour costs of counting seeds in such volumes after flotation, she
used the same method as McManamon to estimate the sample size (number of grid squares and features)
needed to achieve a relative error on seed densities of 20% at a 90% confidence level (t = 1.83 and
r = 0.2). This provided a basis for estimating the proportions of different taxa among the seeds of whole
sites, but sacrificed the spatial information that could have resulted from sub-­sampling the 10-­litre vol-
umes of all of the grid squares and features. However, seed density was too low to achieve good estimates
of population characteristics from the latter strategy.
In a similarly time-­consuming study of micro-­remains in Late Neolithic house floors in Jordan,
Ullah, Duffy, and Banning (2015) opted for systematic sampling of the floors to reflect spatial pattern
combined with sequential sampling of the micro-­remains in each sample element. Having collected
sediment samples from across a square grid imposed on each floor, they controlled for inter-­observer
error in counting the remains in each square by having a large group of students each count a small
sample, with replacement, of the screened sediment from all volumes taken from gridded house floors.
To balance the problems of sparse data and the time costs of counting, they experimented with various
volumes for the subsample elements until settling on 3 ml and then used sequential sampling to have stu-
dents count remains in these small volumes until the standard error on the density of the most important
micro-­remains levelled off (Figure 3.5). Having each grid context counted once by every student largely
controlled for varying student abilities at identifying particular classes of material, and makes it possible
to identify spatial variations across the house floors that provide hints of activity areas and site-­formation
processes, in addition to estimates of the proportions of micro-­remains in larger spatial units, such as
houses or activity areas.
Figure 3.5 Decline in the Relative Standard Error (RSE) of micro-­refuse counts with increasing sample size in
the use of sequential sampling in Wadi Ziqlab. Sampling stopped after the three-­point slope was less than 0.03
for three consecutive measures of RSE (after Ullah et al., 2015, p. 1254).

Testing specific hypotheses


The Reese River Survey provides an early example of using survey data to test a specific hypothesis:
whether Steward’s (1938) account of Shoshonean settlement and economy fit the archaeological distribu-
tions in Reese River Valley.
In an early stage of this research, Thomas (1973) drew a 10% stratified random sample of the survey
region, with the main strata consisting of three “biotic communities” (sometimes subdivided), and sample
elements consisting of 500 m × 500 m squares. This resulted in a sample of 140 squares. Intensive survey
yielded some 3500 artifacts. Some, but not all, of the predictions of the adaptation of the Steward model
were confirmed, such as the tendency of settlements and stone circles to occur in substratum A1 and
stratum B.
On the basis of clues about settlement locations in Steward’s work, a subsequent paper (Williams,
Thomas, & Bettinger, 1973) defined seven more detailed criteria to characterize them, rather than just
the biotic strata: slope no greater than 5% (later amended to 10%), elevation no more than 250 m above
the valley floor, and location on a ridge or saddle, within the modern piñon-­juniper zone, no more than
1000 m from the piñon-­juniper ecotone, no more than 1000 m from a semi-­permanent water source
(later amended to 1200 m), and at least 100 m away from that water source.
Employing Beckner’s (1959) definition of a polythetic set, Williams et al. (1973, p. 227) predicted that
settlement sites should tend to occur at locations that satisfied five or six of these seven criteria, if Steward’s
model is appropriate. They then surveyed 500 m × 500 m random elements that had not been included in
the previous survey, and identified 65 sites. They proceeded to tabulate how many of these sites met each
of the criteria, and found strong statistical relationships that tended to confirm the hypothesis, with 97%
of the sites occupying predicted locations. Independent of this, they also examined the set of predicted
locations, which they had identified by poring over stereographic pairs of aerial photographs; 63 of the
74 predicted locations had sites, but only two sites occurred elsewhere.
In a later paper, Thomas (1975) returns to the earlier stratified random spatial sample of 500 m ×
500 m rectangular units, although now treating the strata as different populations to be compared, each
with a simple random sample. There is also a return to examining the densities of projectile points and
waste flakes in the sample elements in the strata/populations, eschewing a site-­based approach. Statistical
analysis with respect to the hypotheses had equivocal results.
The polythetic stage of this research was a particularly good example of how sampling a region
could help archaeologists test a specific hypothesis about past human use of a landscape. However,
the Reese River project surveyed a large number of spaces that were likely irrelevant to the hypothesis
test, costing time that could have been invested in a larger sample size of spaces that had either high
or low probability of containing sites, or having high or low densities of projectile points and waste
flakes, if the Steward hypothesis was correct. In addition, the use of rectangular sample elements
almost certainly led to a lack of uniformity within each unit in the degree to which they satisfied
the polythetic criteria.
Williams et al. (1973) had already defined a more “natural” sample element than either sites or arbi-
trary squares. They note that it would be easy to identify spaces that qualify as sites, defining the spaces
by the edges of the artifact scatter, but it was impossible to do this for “non-­sites.” Actually, it would be
equally impossible to identify the site areas until after the survey was complete. Consequently, the most
efficient approach would have been to classify the entire map of the survey area by the polythetic criteria
to identify contiguous areas of space that satisfied at least five of the seven criteria, as well as spaces that
satisfied, say, less than three of those criteria. This would have been somewhat difficult to do in the early
1970s (although they did accomplish this, at least roughly, for the “positive” spaces on the stereographic
pairs), but fairly straightforward today using a GIS. It then would have been possible to restrict survey to a
sample of the spaces where the hypothesis predicts there should be finds, and a sample of spaces where it
predicts there should not be any (or the artifact densities should be much lower). If the polythetic criteria
are good predictors of site location, there should be a significant difference between the two sets of spaces
in their densities of sites or artifacts, with most of the sites and the highest densities in the “positive” set; in
fact, the “negative” set should have hardly any sites or artifacts at all. Spaces between these two extremes
would contribute little or nothing to the hypothesis test (although they might be important for other
reasons), so avoiding their survey, or sampling only a small proportion of them, would reduce the survey
cost or allow a larger sample size of more useful spaces.
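In a modern GIS this classification step reduces to simple map algebra. The sketch below uses numpy with randomly generated stand-ins for the seven criterion layers; every value in it is hypothetical and serves only to show the logic.

```python
import numpy as np

rng = np.random.default_rng(0)
# Seven boolean rasters (one per polythetic criterion) over a 100 x 100 grid;
# random stand-ins here, with True marking cells that satisfy a criterion
criteria = rng.random((7, 100, 100)) < 0.5

score = criteria.sum(axis=0)           # criteria satisfied in each cell
positive = score >= 5                  # where the hypothesis predicts sites
negative = score <= 2                  # where it predicts none
print(positive.sum(), negative.sum())  # cells available for each sample
```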

Conclusion
Although some flaws and misunderstandings in the archaeological applications of sampling theory caused
early enthusiasm to give way to a near-general scepticism, careful application of sampling theory continues
to allow archaeologists to draw some kinds of conclusions about populations, and about how those populations
compare with others, with a high and well-specified degree of confidence, but only if they are realistic
about how they define such populations, and the research questions about them, in the first place.
In the context of that scepticism, most archaeology, including archaeology in an interpretive or post-­
modernist vein, has continued to employ inferential statistics either formally or informally, but sometimes
lacks the careful attention to sample design that is necessary to draw sound conclusions from the data.
In a spatial context, most of our samples of artifacts are purposive or convenience cluster samples, often
from contiguous or highly clustered excavation areas within sites, that may not be representative of the
population of interest. While we will always need to make the most of the samples at our disposal, we
should at least be mindful of the potential biases in estimates we base on such samples.
Moving forward, defining the sampled population accurately will involve carefully considering the
site-­formation processes that have altered the target population, and the factors, including detectability
and accessibility, that may impede our ability to draw meaningful samples from that population. A return
to broader uses of formal sampling should include greater attention to cluster sampling, more careful
thought to the appropriate size and shape of sample elements, greater use of samples designed to test
specific hypotheses, and a bigger role for carefully conceived stratified samples.
As Hole (1980, p. 226) aptly points out, there is no magic formula to decide what sampling strategies
are best. “The optimal strategy always depends on what is meant by the word optimal, the nature of the
particular archaeological application, and the data.”

References
Banning, E. B. (1996). Highlands and lowlands: Problems and survey frameworks for rural archaeology in the Near
East. Bulletin of the American Schools of Oriental Research, 301, 25–45.
Banning, E. B. (2000). The archaeologist’s laboratory. New York, NY: Kluwer Academic and Plenum Publishing.
Banning, E. B. (2002). Archaeological survey. New York, NY: Kluwer Academic and Plenum Publishing.
Banning, E. B., Hawkins, A., & Stewart, S. T. (2011). Sweep widths and the detection of artifacts in archaeological
survey. Journal of Archaeological Science, 38(12), 3447–3458.
Beckner, M. (1959). The biological way of thought. New York, NY: Columbia.
Binford, L. R. (1964). A consideration of archaeological research design. American Antiquity, 29(4), 425–441.
Buck, C. E., Cavanagh, W. G., & Litton, C. (1996). Bayesian approach to interpreting archaeological data. New York, NY:
John Wiley & Sons.
Casteel, R. W. (1972). Some biases in recovery of archaeological faunal remains. Proceedings of the Prehistoric Society,
38, 382–388.
Collins, J. M., & Molyneaux, B. L. (2003). Archaeological survey. Walnut Creek, CA: Altamira Press.
Cowgill, G. L. (1990). Toward refining concepts of full-­coverage survey. In S. K. Fish & S. A. Kowalewski (Eds.),
The archaeology of regions: The case for full-­coverage survey (pp. 249–259). Washington: Smithsonian Institution Press.
Drennan, R. D. (2010). Statistics for archaeologists: A commonsense approach, 2nd ed. New York, NY: Springer.
Fish, S. K., & Kowalewski, S. A. (Eds.). (1990). The archaeology of regions: The case for full-­coverage survey. Washington:
Smithsonian Institution Press.
Flannery, K. V. (1976). Sampling on the regional level. In K. V. Flannery (Ed.), The early Mesoamerican village (pp. 131–
136). New York, NY: Academic Press.
Fowler, F. J. (1984). Survey research methods. Thousand Oaks, CA: Sage.
Gordon, E. A. (1993). Screen size and differential faunal recovery: A Hawaiian example. Journal of Field Archaeology,
20(4), 453–460.
Hawkins, A. L., Stewart, S. T., & Banning, E. B. (2003). Interobserver bias in enumerated data from archaeological
survey. Journal of Archaeological Science, 30(11), 1503–1512.
Hitchings, P., Abu Jayyab, K., Bikoulis, P., & Banning, E. B. (2013). A Bayesian approach to archaeological survey in
north-­west Jordan. Antiquity, 87(336), project gallery.
Hole, B. L. (1980). Sampling in archaeology: A critique. Annual Review of Anthropology, 9, 217–234.
Judge, W. J., Ebert, J. I., & Hitchcock, R. K. (1975). Sampling in regional archaeological survey. In J. W. Mueller
(Ed.), Sampling in archaeology (pp. 82–123). Tucson, AZ: University of Arizona Press.
Kintigh, K. W. (1988). The effectiveness of sub-­surface testing: A simulation approach. American Antiquity, 53, 686–707.
Koopman, B. O. (1980). Search and screening: General principles with historical applications. New York: Pergamon Press.
Krakker, J. J., Shott, M. J., & Welch, P. D. (1983). Design and evaluation of shovel-­test sampling in regional archaeo-
logical survey. Journal of Field Archaeology, 10, 469–480.
Kuna, M. (1998). Method of surface artefact survey. In E. Neustupný (Ed.), Space in prehistoric Bohemia (pp. 77–83).
Praha: Institute of Archaeology and Czech Academy of Sciences.
Lee, G.-­H. (2012). Taphonomy and sample size estimation in paleoethnobotany. Journal of Archaeological Science, 39(3),
648–655.
Leonard, R. D. (1987). Incremental sampling in artifact analysis. Journal of Field Archaeology, 14, 498–500.
Lyman, R. L. (1995). Determining when rare (zoo-­)archaeological phenomena are truly absent. Journal of Archaeologi-
cal Method and Theory, 2(4), 369–424.
MacDonald, B., Herr, L. G., Quaintance, D. S., Clark, G. A., & Macdonald, M. C. A. (2012). The Ayl to Ras an-­Naqab
archaeological survey, southern Jordan 2005–2007. Boston, MA: American Schools of Oriental Research.
McManamon, F. P. (1981). Probability sampling and archaeological survey in the Northeast: An estimation approach.
In D. R. Snow (Ed.), Foundations of northeast archaeology (pp. 195–227). New York, NY: Academic Press.
Ministry of Tourism, Culture, and Sport [MTCS]. (2011). Standards and guidelines for consultant archaeologists. Toronto,
ON: Ministry of Tourism, Culture, and Sport.
Mueller, J. W. (1975). Archaeological research as cluster sampling. In J. W. Mueller (Ed.), Sampling in archaeology
(pp. 33–41). Tucson, AZ: University of Arizona Press.
Orton, C. (2000). Sampling in archaeology. Cambridge: Cambridge University Press.
Plog, F. (1990). Some thoughts on full-­coverage surveys. In S. K. Fish & S. A. Kowalewski (Eds.), The archaeology of
regions: The case for full-­coverage survey (pp. 243–248). Washington: Smithsonian Institution Press.
Read, D. W. (1975). Regional sampling. In J. W. Mueller (Ed.), Sampling in archaeology (pp. 45–60). Tucson, AZ:
University of Arizona Press.
Redman, C. L. (1973). Multistage fieldwork and analytical techniques. American Antiquity, 34, 265–277.
Redman, C. L. (1987). Surface collection, sampling, and research design: A retrospective. American Antiquity, 52,
249–265.
Ringrose, T. J. (1993). Bone counts and statistics: A critique. Journal of Archaeological Science, 20, 121–157.
Rootenberg, S. (1964). Archaeological field sampling. American Antiquity, 30(2), 181–188.
Schlanger, S. H. (1992). Recognizing persistent places in Anasazi settlement systems. In J. Rossignol & L. Wandsnider
(Eds.), Space, time, and archaeological landscapes (pp. 91–112). New York, NY: Plenum Press.
Shott, M. J. (1985). Shovel-test sampling as a site discovery technique: A case study from Michigan. Journal of Field
Archaeology, 12(4), 457–468.
Stafford, C. R. (1995). Geoarchaeological perspectives on paleolandscapes and regional subsurface archaeology. Journal
of Archaeological Method and Theory, 2(1), 69–104.
Steward, J. H. (1938). Basin-­plateau aboriginal sociopolitical groups. Bureau of American Ethnology Bulletin, 120.
Washington, DC: Smithsonian Institution.
Stone, L. D. (1975). Theory of optimal search. New York: Academic Press.
Tartaron, T. F. (2003). The archaeological survey: Sampling strategies and field methods. In J. Wiseman & K. Zachos
(Eds.), Landscape archaeology in southern Epirus, Greece (pp. 23–45). Hesperia Supplements, 32. Princeton: American
School of Classical Studies at Athens.
Thomas, D. H. (1973). An empirical test of Steward’s model of Great Basin settlement patterns. American Antiquity,
38, 155–176.
Thomas, D. H. (1975). Nonsite sampling in archaeology: Up the creek without a site? In J. W. Mueller (Ed.), Sampling
in archaeology (pp. 61–81). Tucson, AZ: University of Arizona Press.
Thompson, S. K. (2012). Sampling (3rd ed.). Hoboken, NJ: John Wiley & Sons.
Thompson, S. K., & Seber, G. A. F. (1996). Adaptive sampling. New York: John Wiley & Sons.
Ullah, I., Duffy, P., & Banning, E. B. (2015). Modernizing spatial micro-­refuse analysis: New methods for collecting,
analyzing, and interpreting the spatial patterning of micro-­refuse from house-­floor contexts. Journal of Archaeologi-
cal Method and Theory, 22(4), 1238–1262.
Verhagen, P. (2013). Site discovery and evaluation through minimal interventions: Core sampling, test pits and trial
trenches. In C. Corsi, B. Slapšak, & F. Vermeulen (Eds.), Good practice in archaeological diagnostics (pp. 209–225).
New York, NY: Springer.
Vescelius, G. S. (1960). Archaeological sampling: A problem in statistical inference. In G. E. Dole & R. L. Carneiro
(Eds.), Essays in the science of culture, in honor of Leslie A. White (pp. 457–470). New York, NY: Crowell.
Wallace-­Hadrill, A. (1990). The social spread of Roman luxury: Sampling Pompeii and Herculaneum. Papers of the
British School at Rome, 58, 145–192.
White, G. G., & King, T. F. (2007). The archaeological survey manual. Walnut Creek, CA: Left Coast Press.
Whitelaw, T., Bredaki, M., & Vasilakis, A. (2006). The Knossos urban landscape project. Archaeological Interpretation,
10, 28–31.
Williams, L., Thomas, D. H., & Bettinger, R. (1973). Notions to numbers: Great Basin settlements as polythetic sets. In
C. L. Redman (Ed.), Research and theory in current archaeology (pp. 215–237). New York, NY: John Wiley and Sons.
Wobst, M. (1983). We can’t see the forest for the trees: Sampling and the shapes of archaeological distributions. In
J. A. Moore & A. S. Keene (Eds.), Archaeological hammers and theories (pp. 37–85). New York, NY: Academic Press.
4
Spatial point patterns and processes
Andrew Bevan

Introduction
Point pattern analysis typically refers to a suite of statistical methods that address the potentially complex
spatial relationships that might exist among real-­world phenomena (e.g. a spatial distribution of artefacts
across a house floor or of human settlements across a landscape), by simplifying these phenomena as 2D
points (occasionally extended to the 3D case). Sometimes the points can have categorical or numerical
labels (e.g. chronological phases or size/weight/area estimates), but often they do not, and the main ques-
tion of interest is what we might learn about the pure spatial structure of the point distribution itself. In
archaeology, such techniques have a long pedigree stretching from the earliest informal interpretations of artefact or site distribution maps (e.g. Crawford, 1912), to the rise of more quantitative archaeological methods from the later 1960s and 1970s (Clarke, 1968; Hodder & Hassall, 1971; Hodder & Orton,
1976; Clarke, 1977; Hodder, 1977), to the extra spatial support provided by Geographical Information
Systems (GIS) from the early 1990s onwards (Allen, Green, & Zubrow, 1990; Ladefoged & Pearson, 2000;
Wheatley & Gillings, 2002 pp. 114–120; Conolly & Lake, 2006 pp. 112–186), to the greater flexibility
offered by simulation-­based techniques today (e.g. Crema, Bevan, & Lake, 2010; Nakoinz & Knitter,
2016). Archaeology has been an enthusiastic borrower of methods from neighbouring subject areas such
as ecology, statistical science or quantitative geography, but has also been forced to address some distinc-
tive challenges raised by its own uncertain, time-­compressed and patchy evidence. This chapter reviews
some of the key concepts associated with point pattern analysis and point process modelling, whilst also
making some suggestions about which techniques are usually more effective in addressing archaeological
problems (for more detailed treatment beyond archaeology, please see Illian, Penttinen, Stoyan, & Stoyan,
2008; Gelfand, Diggle, Fuentes, & Guttorp, 2010; Diggle, 2013; Baddeley, Rubak, & Turner, 2015).1

Method

Spatial intensity
A simple, traditional way to progress from a first ‘eye-­balling’ of a spatial point distribution to a
more formal characterisation of it is to measure the density of the points in the distribution within a
carefully-­defined study area. Informally, archaeologists often get away without defining the exact spatial
region within which they are making their observations, but a crucial first step in any more formal treat-
ment is to define this zone explicitly (e.g. as a square, rectangle, irregular polygon, etc.). The stricter tech-
nical term for this measurement of density within a crisp analytical window is the first-­order spatial intensity
of the point distribution. The simplest summary is a single number (often the Greek letter λ is used to refer to this) expressing the average intensity (aka expected value) of points per unit area. Figure 4.1(a) shows an example of a random distribution of 100 points in a 10 × 10 unit study area (for the sake of argument let us assume these units are in metres), so λ = 1 point per square metre. More localised impressions of spatial intensity are possible
if, for example, we divide this study area up into 25 grid squares or quadrats each 2 × 2m. A basic first
question to ask is how many points might we expect, by chance and all other factors being equal, to fall
into each of these quadrats (given each quadrat has an equal chance of receiving points)? On first reflection, you might be forgiven for assuming that most quadrats would get about 100/25 = 4 points each. However, if the distribution is random, the actual observed count of points per quadrat should
not uniformly be 4, but rather should follow a theoretical Poisson distribution, with a lot of quadrats
exhibiting fewer points than 4 and a few exhibiting considerably more (Figure 4.1(b–c)). This theoreti-
cal premise is central to point pattern analysis: in a wholly random distribution of points, the observed
intensity of points per unit area should conform to a Poisson distribution (this is sometimes referred to as
complete spatial randomness or CSR, and the random behaviour that created the pattern is often referred to
as a Poisson point process). A more formal quadrat test of Figure 4.1(a), for example, confirms that it does not depart significantly from what we would expect if the pattern were generated by a random Poisson process (p [probability value] = 0.44).
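To make this concrete, the quadrat test just described takes only a few lines in the R spatstat package recommended at the end of this chapter (a minimal sketch: the simulated pattern, and hence the exact p-value, will differ from that in Figure 4.1(a)):

    library(spatstat)

    # Simulate 100 points under complete spatial randomness (CSR) in a
    # 10 x 10 m window, so lambda = 100 / (10 * 10) = 1 point per sq. metre
    win <- owin(c(0, 10), c(0, 10))
    pts <- runifpoint(100, win = win)

    # Count points in a 5 x 5 grid of 2 x 2 m quadrats and test the observed
    # counts against the Poisson expectation under CSR
    quadratcount(pts, nx = 5, ny = 5)
    quadrat.test(pts, nx = 5, ny = 5)  # a large p-value is consistent with CSR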
A further, related theoretical starting assumption is that any given point pattern will behave in a
homogeneous way across a given study area. However, many real world point distributions depart from
this starting assumption and are inhomogeneous, exhibiting systematically fewer or greater numbers of
points in certain parts of the study area than in others. An archaeological example might be a prehistoric
house-­floor where more artefacts are found on the northeastern side of the floor than elsewhere. In such
observed cases of an inhomogeneous pattern, we typically assume that an external trend (e.g. a past cul-
tural behaviour that favoured artefact use in the northeastern part of a house or a preservation bias leading
to better survival of archaeological finds in this part of the house) is influencing the changing intensity
of points across the study area. As an example, Figure 4.1(d) shows a new pattern of 100 points where
there is now a clear first-order trend towards more points in the upper-right part. A quadrat count in
this second case produces a frequency distribution which does not look Poisson in shape and a quadrat
test confirms this (Figures 4.1(e–f), p=0.001). In many cases, departure from this null model of a Poisson-­
distributed, random, homogeneous point pattern (aka CSR) is so obvious that we probably do not need to do
such a test. Indeed practical experience suggests that very few real-­world archaeological examples are ever
wholly random (there is almost always some sort of first-­order trend leading to an uneven distribution).
A popular and useful related method for summarising the first-­order intensity of a point pattern is a
kernel density surface (aka kernel density estimation, or KDE). The principle behind this approach is similar
to the quadrat count map in Figures 4.1(b) and 4.1(e), but instead of the grid squares, a small window or
‘kernel’ is moved systematically across each part of the study area. At each location as the kernel moves, the
number of points inside the kernel is counted and this total is then mapped at the temporary location
of the kernel centre, before the latter moves to a new location. The result, once the kernel has been moved
across the entire study area and a count at each specified location has been calculated, is often a raster (pixel-­
based) map where each pixel expresses the intensity of the point pattern in that local neighbourhood. The
chosen kernel shape need not be square: GIS packages, for example, often provide kernel density estimates in
a crisply defined circular region (i.e. of a fixed radius), sometimes with a distance-­weighting so that points
falling near the centre of the circle contribute more to the density estimate than those falling on the edge. In contrast, an even more useful choice for a range of statistical applications is a continuous two-dimensional Gaussian kernel, which takes into account all points in each calculation (i.e. the kernel does not have an abrupt outer boundary) and weights each point with a distance decay from the centre of the kernel that is shaped like a normal (bell-shaped) distribution. This offers a statistically stricter estimate that maintains the correct total intensity of the point pattern across the study area. A nice analogy for this stricter approach is offered by Baddeley et al. (2015, p. 168), who suggest thinking of each point in the pattern as a square of chocolate. A hairdryer (the continuous, distance-decaying kernel) is passed over the study area systematically to melt each bit of chocolate a little. The result is that all of the chocolate bits turn into a smoother, slightly melted surface, but one that still retains peaks of chocolate where there were more bits in the first place, and one that conserves the original total mass of chocolate.

Figure 4.1 Examples of the first-order spatial intensity of a point pattern and its summaries: (a) a random point distribution (n = 100, the study area is notionally 10 × 10 map units in size), (b) a quadrat count of the same, (c) the histogram of observed quadrat counts and the expected Poisson distribution if the pattern is random, (d) an inhomogeneous point distribution where the intensity of points is higher in the top-right corner, (e) a quadrat count of the same, (f) the histogram of observed quadrat counts and the expected Poisson distribution if the pattern is random, (g) kernel density estimate of the inhomogeneous pattern in (d) using a Gaussian kernel with a standard deviation of 0.5 map units, (h) the same, but with a kernel standard deviation of 1 map unit, and (i) the same, but with a kernel standard deviation of 2 map units.
An important further choice in kernel density estimation is the size or ‘bandwidth’ of the kernel. In
Figures 4.1(g–i), kernel density surfaces are shown for three different Gaussian bandwidths corresponding
to 0.5, 1 and 2 map units, revealing finer- and coarser-scale information about the first-order patterning.
Automatic methods for selecting a statistically appropriate bandwidth do exist (Baddeley et al., 2015,
pp. 171–172) and are initially attractive in taking this otherwise often arbitrary decision out of the hands
of the user, but it is worth noting that such bandwidth optimisation routines do not always agree with
each other or produce the most visually convincing results (in the current example, for instance, three different bandwidth selectors would suggest strikingly different optimal Gaussian bandwidths: σ = 0.5,
1.3 and 3.9), so the best strategy is often to explore more than one bandwidth and then justify your choice
in some manner (either by noting that it arises from consistent results amongst automatic methods or
because it reflects a meaningful behavioural scale for the pattern and process under study).
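Continuing the sketch begun above, the bandwidth comparison in Figures 4.1(g–i) and the disagreement between automatic selectors can be reproduced in spatstat along the following lines (the three selectors shown are simply ones available in the package, not necessarily those used for this chapter's figures):

    # Gaussian kernel density surfaces at three bandwidths (sigma, in map units)
    plot(density(pts, sigma = 0.5))
    plot(density(pts, sigma = 1))
    plot(density(pts, sigma = 2))

    # Automatic bandwidth selectors, which will not necessarily agree
    bw.diggle(pts)
    bw.ppl(pts)
    bw.scott(pts)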
Beyond choices of kernel size and shape, it might be asked why one would bother with a kernel density surface at all when one could just eye-ball the spatial pattern of dots on the map itself (or use some other statistical summary). Indeed, sometimes kernel density maps are published as ‘analysis’ when they do not really seem to be adding to the authors’ argument. Nevertheless, there are several reasons why kernel density surfaces are often an important starting point: (a) when different bandwidths are explored, they formalise our comparison of finer- versus coarser-scale first-order intensity, (b) they can summarise situations where there are too many points in the study area to eye-ball easily, where there are lots of nearly superimposed points or where points need to be treated unequally with weights (although this should not be confused with situations where the points are sampling locations and interpolation is far more appropriate, see below), and (c) when a strict kernel density estimator is used, the non-parametric summary of inhomogeneous spatial intensity that it provides can be repurposed for more complex statistical treatments, for instance as part of inhomogeneous point process models or relative risk surfaces (see below).

Spatial interactions
So far we have considered the so-­called first-­order properties of a point pattern, those to do with the general
spatial intensity (density) of the points, and especially whether this intensity is homogeneous or not across
the study area. In cases where the pattern is not homogeneous, we might anticipate an external factor at
work: the vast literature of archaeological site location modelling (sometimes just called ‘predictive mod-
elling’, see Kvamme, this volume; Verhagen & Whitley, this volume), for example, is based firmly on the
assumption that a spatial pattern of past human settlements across a landscape is often non-­random and that
site locations can be predicted by external variables such as ‘steepness of terrain’, ‘soil fertility’ or ‘distance
to the nearest river’. However, a crucial second aspect of point patterns is the fact that the existence of one
point at a certain location in the study area may well increase or decrease the chance that another point
occurs nearby. This is a second-­order property of the point pattern, associated with the spatial interactions
that might exist between two or more points (it might also be called the pattern’s covariance structure).
Where non-­random second-­order patterning is observed, the assumption is usually that some process of
‘attraction’ encourages more clustering (aka clumping) of points than we would expect by chance, or that some process of ‘inhibition’ is at work that discourages the points from getting too close to one another and leads
to a ‘regular’ (aka ‘uniform’ or ‘dispersed’) spatial pattern. Figures 4.2(a–c) show three simulated examples
of point patterns with, respectively, no second-order effects, regular spacing and strong clustering of points.
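Patterns of these three kinds are straightforward to simulate, for example in spatstat (the parameters below are illustrative guesses rather than those used to generate Figure 4.2):

    # (a) no interaction: a homogeneous Poisson process (CSR)
    p_random <- rpoispp(lambda = 1, win = win)

    # (b) inhibition: a hard-core process forbidding pairs closer than 0.5 m
    p_regular <- rHardcore(beta = 2, R = 0.5, W = win)

    # (c) attraction: a Matern cluster process of 'offspring' around 'parents'
    p_cluster <- rMatClust(kappa = 0.05, scale = 0.7, mu = 20, win = win)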

Figure 4.2 Three hypothetical distributions (a–c) and how they manifest as K functions (d–f) and pair correla-
tion functions (g–i), the x-­axes in (d–i) are in metres (see the scalebar in Figure 4.2(c)) and refer to the radius
of the circles around each point within which the respective K or Pair Correlation Function (PCF) statistic is
calculated; the critical envelope encompasses 95% of 999 simulations.

Table 4.1 Nearest neighbour (NN) test for the three hypothetical distributions in Figure 4.2(a–c) (with an edge correction applied, as proposed by Donnelly (1978)).

Pattern     Mean NN   Exp. Mean   R (aka NNI)   p-value
Random      0.5481    0.5222      1.0497        0.392
Regular     0.7041    0.5222      1.3483        0.002
Clustered   0.2008    0.5222      0.3845        0.002

There is also a long tradition of archaeologists addressing second-­order patterns. For example, case studies
from the 1970s onwards sought to formalise our assessment of whether settlements were more evenly spaced
in the landscape than we might expect by chance, or whether certain artefact categories were clustered
together2 on a house floor (e.g. Hodder & Orton, 1976 pp. 38–51). A popular, but in truth highly problem-
atic, exploratory statistic and significance test for this in archaeology from the 1970s onwards was to calculate
the distance between each point and its nearest neighbour and compare the mean nearest neighbour distance
to what we might expect by chance given the sample size of points and size of the study area (borrowed from
ecology, Clark and Evans (1954), and known as a nearest neighbour test or Clark and Evans test). Table 4.1
summarises the results we would get for this test for the patterns in Figure 4.2(a–c). An R-value or nearest neighbour index of around 1 suggests a random pattern, whereas R > 1 suggests a more regular pattern and R < 1 a more clustered one, with the accompanying p-value (sometimes a z-score is quoted instead) indicating how significant this departure might be from our null hypothesis that the spacing of the points arose by
chance. For the three examples in Figures 4.2(a–c), the nearest neighbour test correctly identifies a random
pattern, significant regularity and significant clustering respectively.
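The equivalent calculation in spatstat is brief (a sketch only, continuing the running example; the values will of course differ from Table 4.1 because the simulated patterns differ):

    # Mean nearest neighbour distance, then the Clark and Evans test with
    # the Donnelly edge correction used for Table 4.1
    mean(nndist(pts))
    clarkevans.test(pts, correction = "Donnelly")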
A first problem with the calculation of nearest neighbour distances, however (and for many other
methods discussed below) is the uncertainty introduced by the edges of the defined study area, in those
cases where we know the observed point pattern extends beyond the study area but is undocumented. In
such situations, we may over-­estimate the nearest neighbour distance for a point falling near the edge of
the study area, because its real closest neighbour lies just outside but is not known. Fortunately, such edge
effects can be corrected in a variety of ways (as are those in Table 4.1), even if no single optimal solution
exists for all applications. A second, more severe limitation of the above nearest neighbour test is the fact
that it assumes second-­order interactions only ever exist at one scale (the smallest spatial scale, operating
between nearest neighbours), whereas a host of real-­world examples and simulated datasets demonstrate
that points might exhibit meaningful clustering or regularity at multiple scales (or indeed appear random
at one of these different scales, but not at others). A commonplace human example in many parts of the
past and present world is a landscape of villages, where houses might be spaced a small, regular distance
apart (e.g. to allow for gardens and private family space), but cluster together at a medium scale into
hamlets and villages (e.g. to enable various forms of social cooperation), only for those villages perhaps
to be spaced out from one another at the coarse scale (e.g. to share out farmland and other resources).
A variety of more complex methods have been developed since the late 1970s (especially Ripley, 1977) that respond to this challenge and are better designed to characterise random, regular and clustered behaviour over different scales (with these methods only appearing in archaeology over the last 10–15 years, e.g. Orton, 2004; Bevan & Conolly, 2006; Crema et al., 2010; Vanzetti, Vidale, Gallinaro, Frayer, &
Bondioli, 2010; Nakoinz & Knitter, 2016). Perhaps the most well-­known of these multi-­scale methods
is the K-function, which is constructed by considering each point in the pattern in turn and, for each
point, measuring the intensity of other points that fall in ever-­expanding circular regions around it. The
K function is then a summary of this procedure: the mean point intensity for all circular neighbourhoods
of a particular radius. For simpler cases, it is possible to anticipate what the theoretical shape of the K
function would be if the point pattern were wholly random in nature (e.g. if it were Poisson distributed),
but as we shall see several times below, a more flexible approach to assessing this question is to simulate
many random sets of points (a Monte Carlo simulation, for the general approach, see Robert & Casella,
2004) and produce a ‘critical envelope’ that encompasses all (or a chosen percentage) of the simulated K
functions. If the observed K function falls above or below this envelope at a particular circle radius, then
it can be treated as departure from what we might expect by chance and therefore possibly worthy of
further attention.3 Figures 4.2(d–f) provide examples of the K-­functions produced by the three simulated
point patterns in Figures 4.2(a–c). The results confirm that the first pattern is random at all scales, that
the second is regularly-­spaced at small scales up to about 0.5m (a correct identification of the minimum
spacing imposed by the simulation) and that the third pattern is strongly clustered.
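In practice, a K function and its critical envelope might be computed as follows (a sketch, continuing the running example; with nsim = 999 and nrank = 25 the envelope is a pointwise ~95% band, subject to the caveats in note 3):

    # Observed K function against 999 simulations of CSR: excursions above
    # the envelope suggest clustering at that radius, below it regularity
    env_K <- envelope(pts, Kest, nsim = 999, nrank = 25)
    plot(env_K)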
In fact, there are many alternatives to K-­functions for measuring second-­order patterning and each
one has certain strengths and weaknesses. As an example, it is worth making a direct comparison with the
results of calculating a pair correlation function (PCF) for the same three distributions (Figures 4.2(g–i)).
A PCF is calculated in a very similar way to the K-­function, but with the important difference that its
circular radii are non-­cumulative (i.e. each one is a doughnut-­shaped ring or annulus that excludes the
previous circle [with a slight smoother then applied], whereas the K function uses cumulative circles). The
results agree closely with those provided by the K-­function, except in the clustered case where the PCF
offers better information about the scale of clustering by suggesting that the pattern becomes random
again at circle radii >1.5 or 2. This correctly identifies the approximate parameter used to create this
simulated pattern in the first place (and the fact that the K function does not identify it reflects its use
of a cumulative approach that is sometimes swamped by the presence of strong smaller-­scale clustering).
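The equivalent calculation for the PCF simply swaps the summary function passed to the envelope routine (again a sketch):

    # Pair correlation function with a Monte Carlo envelope: its
    # non-cumulative rings make the upper scale of any clustering easier
    # to read off than with the cumulative K function
    env_g <- envelope(pts, pcf, nsim = 999, nrank = 25)
    plot(env_g)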
The above examples so far have only addressed patterns with a single scale of second-­order effect, but
in contrast Figure 4.3 provides an example with multiple scales of second-order effect. More specifically, for this example, a point simulation is constrained to produce spatial patterning with both a minimum spacing of 25cm between points (i.e. small-scale, strong regularity) and then moderate clustering at medium scales up to 2m. It is reassuring that the resulting PCF calculated for this example correctly identifies both these patterns.

Figure 4.3 Multi-scale second-order effects: (a) a simulated process with small-scale regularity and medium-scale clustering, (b) the pair correlation function of (a) (the x-axis is in metres and refers to the radius of the circles around each point within which the pair correlation statistic [y-axis] is calculated; the critical envelope encompasses 95% of 999 simulations).

Marked point patterns and relative risk


As noted above, many if not most point pattern analyses operate with a ‘pure’ spatial configuration of
points that we assume are effectively all the same (e.g. a study of the spacing of hillforts, where we might
choose to ignore any differences of hillfort size, type or date). That said, there are certainly instances where
subtler attributes are of major interest, or where we may wish to compare and contrast two different types
of point distribution. In such cases, we can make use of an increasingly powerful range of approaches that
address ‘marked’ point patterns where the marks (aka ‘attributes’ or ‘labels’, or in some instances ‘weights’)
involved can be categorical (e.g. “Iron Age” versus “Roman” artefact distributions) or numerical (e.g.
hillfort surface areas measured in hectares). With the added complexity of modelling that this introduces,
often it is best to keep things simple and consider (a) the case of the distribution without attention to
marks first (e.g. all hillforts regardless of size differences), then (b) univariate examples on their own (e.g.
a point pattern of just ‘large’ hillforts), and thereafter (c) simple pairs of comparisons (e.g. the spatial relationship between ‘small’ and ‘large’ hillforts), and only then, if deemed desirable, (d) a continuous marked pattern (e.g. the spatial distribution of the raw size estimates of all hillforts).
One fairly common technique is a bivariate K function (or other related bivariate functions), where
one set of points is labelled ‘Type A’ (e.g. Roman villas) and the other ‘Type B’ (e.g. Iron Age settle-
ments) and then the mean intensity of Type B points is measured within each of a series of concentric
circular neighbourhoods around points of Type A. We can again use Monte Carlo simulation to produce
a critical envelope of expected distances between two types of points, but with an illuminating twist in
terms of how we might control the simulation. For example, if we have an observed spatial distribution of 50 points, within which there are 20 Type A points and 30 Type B points, we might (a) scatter 20 Type A points and 30 Type B points at random in the study area and check how the resulting simulated K function compares to the observed K function (and do this many times to build up a critical envelope). Or alternatively (b) we can start with the fixed locations of the observed points and simply shuffle which 20 are labelled Type A and which 30 are labelled Type B. Which of these two alternative simulation strate-
gies you should choose depends on what kind of inference you wish to draw about your data. The latter
approach that permutes the marks and keeps the locations fixed is often a more robust option in cases
when the point distribution is inhomogeneous and/or where there might be some missing evidence.
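The second, mark-permuting strategy might be sketched in spatstat as follows (illustrated here with arbitrary random labels, since no real marked dataset is to hand):

    # Attach hypothetical categorical marks ("A" or "B") to the points
    marks(pts) <- factor(sample(c("A", "B"), npoints(pts), replace = TRUE))

    # Cross-type K function, with an envelope built by randomly re-labelling
    # the observed points while keeping their locations fixed
    env_AB <- envelope(pts, Kcross, i = "A", j = "B",
                       simulate = expression(rlabel(pts)),
                       nsim = 999, nrank = 25)
    plot(env_AB)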
In any case, when we are indeed dealing with a point pattern of discrete events, there are plenty of
instances where we might wish to compare two types of point with each other or to compare a subset
of points of a certain type to a parent distribution of all types. In such cases, the notion of a ‘rela-
tive risk’ surface (or map) becomes very useful (Kelsall & Diggle, 1995; for archaeological examples,
Bevan, 2012; Smith et al., 2015, which also demonstrates that significance contours can be calculated
for such maps). There are various slightly different versions of this calculation, of which we will focus
on a simple one that expresses the spatially-­varying probability of a point distribution of Type A for
example, given the total point pattern (e.g. of Types A–E). First, construct a kernel density surface
of Type A points with a specific bandwidth and then construct a second kernel density surface with
the same bandwidth for points of all types. Then divide the raster map of the first surface by the second; the result is a final raster map of values between 0 and 1, which expresses which local parts of the study area exhibit an unusually high, unusually low or about average proportion of Type A points.
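A sketch of this division of surfaces, and of spatstat's built-in equivalent relrisk(), follows (the bandwidth of 1 map unit is an arbitrary illustrative choice, and the marked pattern from the previous sketch is reused):

    # Kernel density of the Type A subset divided by the density of all
    # points, using the same bandwidth for numerator and denominator
    dens_A   <- density(subset(pts, marks == "A"), sigma = 1)
    dens_all <- density(unmark(pts), sigma = 1)
    plot(dens_A / dens_all)  # pixel values between 0 and 1

    # relrisk() implements the same case-control logic directly
    plot(relrisk(pts, sigma = 1))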
Figures 4.4(a–b) show a hypothetical example of such an approach, demonstrating that the deliberately manufactured first-order trend of higher point intensity in the top-right of the study area in Figure 4.1(d) is correctly recovered via the probability surface (i.e. by a kernel density of the Figure 4.1(d) points, divided by a kernel density of both the Figure 4.1(a) and 4.1(d) points). Figures 4.4(c–d) offer a real-world example showing how the same method can identify an interesting difference in gold versus silver coins for an assumed late Iron Age tribal area in western England (associated with ‘Dobunni’-style coinage), where the gold coins appear preferentially deposited at the margins of a possible tribal territory (perhaps related to border disputes and/or mercenary payments) in contrast to the silver coins, which appear to have circulated more evenly within the tribal core region. These insights are possible despite likely spatial biases in the modern recovery of Iron Age coins (Bevan, 2012, pp. 500–504, with further references). A useful extra step to avoid misleading results is to truncate the probability surface along its edges at an arbitrary minimum findspot density (i.e. do not calculate the probability where the denominator is too low, such as the hatched area in Figure 4.4(d)).

Figure 4.4 Mapping the spatial probability of a point subset: (a) a hypothetical example of 200 points that combines the random and inhomogeneous point patterns from Figures 4.1(a, d), (b) the local spatial probability (out of 1) of finding a point from the inhomogeneous surface in Figure 4.1(d), with a much higher probability to the top-right, (c) UK Portable Antiquities Scheme data showing Iron Age gold and silver coins of ‘Dobunni’ style, and (d) the local spatial probability of finding gold coins, with an area of much higher probability on the western borders (the hatched area is an arbitrarily excluded zone where the overall number of coins of any material is too low to allow a meaningful probability estimate).
As a final note in this section, it is worth addressing a common point of confusion with regard
to point patterns that have marks or attributes: the difference between a discrete point-­event and a
point-­based sample. If we have objects such as settlements or artefacts that are discrete in space and
time and can each be simply represented by a point with an accompanying mark representing some-
thing like size, then the point pattern and marked point pattern methods discussed in this chapter are
entirely appropriate. If, however, we have points that merely represent sampling locations from a wider
continuous field of possible measurements (that could in principle have been made anywhere in the
study area), then alternative methods are needed. For example, you might measure soil chemistry or
layer-­depth information at multiple borehole sites across a valley or estimate surface pottery density in
a set of circular samples across a ploughed field. In these two examples, the point samples are part of a
wider continuous pattern and a mixture of simple interpolation techniques, regression and geostatistical methods such as kriging is far more appropriate (Conolly, this volume; Hacıgüzeller, this volume;
Lloyd & Atkinson, this volume).

Conditional simulation and model-­fitting


So far, the above discussion has focused on empirical statistical descriptions of a point pattern, such as the
mapping of a kernel density surface (to summarise first-­order intensity) or the plotting of a K-­function
(to express second-­order properties of clustering or regularity among points). We have also discussed
simpler examples of Monte Carlo simulation used to produce ‘critical envelopes’ that help to assess how
meaningful the observed departures of a point pattern are from what might be expected by chance.
Beyond these descriptions, there are also more complex approaches to point patterns that typically fall
under the term point process models and in particular allow complex fitting of theoretical first and second
order models to observed (e.g. archaeological) data. Such approaches exploit the power of both (a) model
selection methods (i.e. those that seek to optimise the goodness-­of-­fit between an observed point pattern
and various first order predictor variables or second-­order interaction parameters) and (b) conditional
simulation where we can produce realisations of hypothetical point patterns that are partly random, partly
constrained (e.g. constrained by the presence of an inhomogeneous trend or a minimum point-­spacing),
and check how well they correspond to what we actually observe (the most important workhorse func-
tions are known as Markov Chain Monte Carlo simulations). There are at least two roles that we might
envisage for such models in archaeology: (a) to narrow down the possible cause-­effect relationships for
an archaeological distribution (e.g. of finds, settlements etc.) by jointly fitting the observed distribution
to both first-order and second-order characteristics (e.g. Eve & Crema, 2014), and (b) to simulate mul-
tiple, plausible ‘full’ sets of finds or settlements for a given episode in the past, in situations where our
archaeological knowledge is patchy and we therefore wish to perform sensitivity analysis on the results
(e.g. Bevan & Wilson, 2013). Two related archaeological examples will be considered in the case study
section below.

Uncertain and incomplete data


In many applications of point pattern and point process techniques outside of archaeology, we know
about the observable phenomenon exhaustively across the study area. For example, for the spatial pattern
of trees growing in a forest, we might be able to map every single tree accurately. In contrast, archaeologi-
cal observations are typically far more partial, imperfect spatial records of past activity, due both to variable
preservation and to uneven modern-­day investigation or recognition. A distribution of metal finds across
a region, for instance, might be biased by where modern construction projects, metal detecting activity
and/or archaeological interventions have taken place, as well as by how well certain soils preserve metal. A
distribution of archaeological artefacts across a ploughed field might be spatially biased by the depth and
direction of modern ploughing or amongst other things by the variable skill with which field surveyors
might recognise the artefacts in question and pick them up off the field surface. A distribution of past
human settlements from a particular chronological period can be biased by where people have looked
most carefully for evidence of that period, how well sites of that period survive and can be dated, etc. All
these biases intervene before we can start thinking about the observed point pattern as a realisation of a
purely ancient process or behaviour.
How can valid inferences about first-­ and second-­order patterns be made in the presence of such
bedevilling problems? Well, first, we should manage our expectations and realise that certain kinds of evi-
dence may not be good enough to support a complex statistical model, so some initial quality assessment
is always necessary. Second, we should realise that even with a better resolved dataset, after performing
an analysis we may ultimately be left with several competing models for how the observed archaeologi-
cal point pattern has arisen (a situation which is sometimes called one of ‘equifinality’), but at least we
will probably have a reduced number of possible models to think about. Third, there is always a case for
building up from simple to more complex approaches: very simple methods such as referring in a publi-
cation to the typical (mean or median) nearest neighbour distance among a set of archaeological features
(or better yet a plot of the whole nearest neighbour distribution) will always remain useful, both because
the approach is easy to communicate to a wider audience and because the results are relatively robust to
missing data. For example, a useful numerical description in a publication might be: “the typical spacing
between my observed archaeological settlements appears to be 5km” even in situations where it is hard
to take this observation any further. Beyond such simple statements, relative risk methods are repeatedly
useful because they can explore variations in the first-order intensity of particular kinds of point, without
assuming such points arose on an ‘even playing field’ (see above). They thereby offer a useful case-­control
approach similar and complementary to the use of case-­control in site predictive models (Verhagen &
Whitley, this volume). In contrast, K functions and related second-­order measures (L functions, pair cor-
relation functions, etc.) are often uninformative or inappropriate for initial data exploration, because they
assume a first-order homogeneity (an ‘even playing field’) that we know from the outset is usually not
there in the archaeological record. They can however often still be very useful as a later stage of analysis,
once certain spatial inhomogeneities have been modelled.

Case study
There are as yet very few published case studies that fit point process models to archaeological data (for a
preliminary example using Iron Age I sites in the West Bank, see Bevan et al., 2013), so it is worth exploring an additional one here. An interesting example is provided by two complementary perspectives
on recent settlement on the Greek island of Antikythera (Bevan & Conolly, 2013). On the one hand,
standing building evidence (e.g. old houses and other shelters) is what people traditionally think of as
defining the location, size and shape of historical period villages, but on the other hand, it is often through
surface artefact scatters that survey archaeologists seek to infer the same parameters for past episodes of
settlement. An important, under-­appreciated question, then, is the degree to which the former and the
latter exhibit the same patterning, in the rare cases where both kinds of evidence are available. An inten-
sive survey across Antikythera’s entire extent (20.5 sq. km) revealed 407 standing buildings (excluding
special purpose installations such as windmills) dating approximately to the 18th to earlier 20th century
AD (hereafter called the ‘Recent’ period, see also Bevan et al., 2004) and 1,644 diagnostic pottery sherds
of the same period (Figures 4.5(a–b)).
Both kinds of evidence are clearly clustered in only certain parts of the island and typically they seem
to coincide, as we might hope and expect. Pottery is likely to be discarded within people’s houses and next
to them in middens, for example, but it might also find its way further out into agricultural fields via pro-
cesses such as manuring. Previous regression modelling identified significant correlations between Recent
period sites (as identified by the pottery) and landscape variables such as access to lower lying, flat (and
better arable) land in the island’s softer geologies and proximity to freshwater springs (Bevan & Conolly,
2013, pp. 106–109). Here only two key variables are retained – the amount of local flat land within a 500m radius neighbourhood and distance to the two main freshwater springs (hereafter “flat land” and “spring distance”) – not least because they seem to capture most of the first-order variation we observe (Figures 4.5(c–d)). For completeness, given the methodological focus of this chapter, Figures 4.6(a–b) show PCFs for the two observed datasets, with a critical envelope drawn from entirely random points. As expected and obvious visually, both patterns are highly clustered at all spatial scales, such that the two plots are largely unnecessary. Inset maps in both figures, however, give an idea of what the random simulations that make up the critical envelope look like (completely spatially random).
More informative is the result when a first-order regression model is built by correlating the intensity
of observed points with the two covariates, flat land and spring distance. Both covariates are significant
and are retained in the resulting multivariate logistic models for both the buildings and the pottery. The
critical 95% envelopes of the PCFs can now be recalculated for simulations conditioned on these first-­
order covariates (Figures 4.6(c–d)), with the result that the observed PCFs are now much closer to the
simulated ones. Even so, there is a suggestion of additional clustering of buildings up to about a 200m
interaction distance, and at even larger distances for the pottery. A third and final modelling stage there-
fore fits an additional clustering component, above and beyond the first-order trend modelled by the covariates.4 When the critical envelopes are now recalculated, the observed PCFs for both the buildings
and the pottery now fall almost entirely inside (Figures 4.6(e–f)), especially in the case of the buildings.
More informally, example simulations of the two models produce results that look close in character to
the observed villages and the observed pottery distributions, with finds in plausible locations and with
plausible levels of clumping (map insets in Figures 4.6(e–f)). The fit of the first-order covariates can now also be adjusted in light of the additional fitted clustering component, and the final results suggest flat
land and spring distances are similarly strongly correlated with the buildings, but that for the pottery
flat land was the more important of the two (and spring distances retained in the model but of marginal
significance) (Table 4.2).
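The three-stage sequence just described can be sketched in spatstat as follows (the object names buildings, flatland and springdist are hypothetical stand-ins for the Antikythera point pattern and its two covariate surfaces; see also note 4):

    # Stage 1: null model of complete spatial randomness
    fit0 <- ppm(buildings ~ 1)

    # Stage 2: inhomogeneous Poisson model using the two first-order covariates
    fit1 <- ppm(buildings ~ flatland + springdist)

    # Stage 3: the same first-order trend plus an additional log-Gaussian
    # Cox clustering component (cf. note 4)
    fit2 <- kppm(buildings ~ flatland + springdist, clusters = "LGCP")

    # Recompute the PCF envelope from conditional simulations of a fitted model
    plot(envelope(fit2, pcf, nsim = 999, nrank = 25))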
What can be learnt from such an exercise? It is reassuring that both pottery and buildings exhibit simi-
lar patterns with respect to the wider environment and also show signs of additional clustering. Whilst
we might further expect regularity in the spacing between villages in a landscape (i.e. not just short-­range
clustering but also medium-­range ‘inhibition’ due to competition over resources), in this particular case
the clustering alone is enough to account for the pattern. Perhaps more importantly, the pottery displays
stronger clustering than the standing buildings over short distances (up to 75m), but then also weaker
extended clustering at larger distances than the standing buildings. This suggests we should be a little
cautious about our interpretation of the extent of pottery scatters as proxies for settlement footprints, because (to offer an interpretation) highly localised, dense pottery dumps may under-represent village spaces whilst wider manuring and other pottery dispersal in the wider landscape may over-represent them (with a more accurate inference about what constitutes a ‘village’ probably lying somewhere in-between).

Figure 4.5 Archaeological survey evidence from the Greek island of Antikythera: (a) individual houses and field huts of approximately 19th to early 20th century AD date, (b) surface pottery of approximately 19th to early 20th century AD date collected during fieldwalking of the whole island, (c) access to flat land (a count of how many flatland cells are within a radius of 500m), and (d) distance to the nearest large freshwater spring (square root-transformed).

Figure 4.6 PCFs for three stages of model fitting, calculated in the same way for both buildings and surface pottery (the x-axis is in metres and refers to the radius of the circles around each point within which the pair correlation statistic [y-axis] is calculated; the critical envelope encompasses 95% of 999 simulations): (a–b) observed PCF (black line) and 95% critical envelope (grey shaded area) constructed from random simulations of a homogeneous Poisson process (i.e. a null model of complete spatial randomness); (c–d) the same as (a–b), but with envelopes now constructed from conditional random simulations from a fitted first-order model using the two covariates in Figures 4.5(c–d); and (e–f) the same as (c–d), but with envelopes now constructed from simulations conditioned on both the two first-order covariates and an additional clustering model.

Table 4.2 Summary results of final models (after adjustment for the correlation of the clustering component).

Dataset     Covariate         Coefficient   Std Error   Z-value
Buildings   (Intercept)       –4.38881      1.83189     –2.39578
            Flat Land          2.35623      0.58034      4.06012
            Spring Distance   –8.98929      1.98107     –4.53760
            Log Gaussian Cox Process, spherical covariance model: variance at distance zero = 1.85, scale = 296.7
Pottery     (Intercept)       –6.47897      3.31123     –1.95667
            Flat Land          3.07563      0.97361      3.15899
            Spring Distance   –5.31404      3.52597     –1.50712
            Log Gaussian Cox Process, exponential covariance model: variance at distance zero = 3.50, scale = 72.3

Conclusion
The above discussion is a necessarily rapid survey of the theoretical underpinnings, methodological chal-
lenges, and analytical opportunities associated with point pattern analysis in archaeology. While impor-
tant developments over the last few years have enhanced the applicability of these methods, a range of
challenges still remain. These methodological struggles are rarely new or trivial, but the way they arise with respect to the deceptively simple point pattern serves nicely to elucidate wider challenges of theory, method and communication in archaeology overall.

Acknowledgements
I have benefitted for many years from discussion with a range of spatially and statistically inclined colleagues
and students at University College London (although any remaining problems with it are of course my
own). Thanks especially to Mark Lake and James Conolly with whom I have co-­taught relevant courses
and/or co-­published on several occasions, as well as to Enrico Crema, Alessio Palmisano and Eva Jobbova
for further in-­class assistance and insight. Thanks also to Denitsa Nenova for reading a draft of this chapter,
Daniel Pett for providing the PAS dataset and Joanita Vroom for her work on pottery from Antikythera’s
more recent past (for both the pottery and buildings datasets, see DOI: 10.5284/1024569). Many thanks to
Mark Gillings, Piraye Hacıgüzeller and Gary Lock for some fine editorial suggestions as well. Although soft-
ware solutions do come and go, it is worth noting here that the best place to conduct point pattern analysis
and point process modelling is currently the R statistical environment (R Development Core Team, 2011),
especially with the spatstat package (Baddeley & Turner, 2005; Baddeley et al., 2015).

Notes
1 In order to address as wide an audience as possible, this chapter tries to limit the amount of new statistical jargon
that it introduces, but inevitably some technical vocabulary is necessary. Important concepts are italicised the first
time they appear, while other jargon terms are sometimes mentioned more cursorily in single quotation marks.
The use of formalised statistical notation for concepts or particular analytical functions has been avoided: this
strategy of course has both strengths and weaknesses, but readers who prefer these formalisms are referred to the
four textbooks cited alongside this note.
2 While this chapter deals with how clustered points might be modelled, it does not deal with the related topic of
how one assigns points in a distribution to a particular clustered group. The latter is the spatial version of a wider
cluster-­definition problem for which there are many spatial and aspatial applications (e.g. Maddison, this volume).
3 It is worth emphasising (a) that this Monte Carlo envelope does not provide a ‘confidence interval’ for the true
value of the K function, nor (b) does the default pointwise approach used in most software provide a ‘global sig-
nificance test’ for interaction distances, because of the statistical pitfalls of testing multiple hypotheses at once (i.e.
multiple circle radii: Baddeley et al., 2015, pp. 233–236, for further details).
4 This chapter largely avoids introducing the considerable technical terminology used for speaking about point
interaction models and covariance, but it is worth noting that for this case study, a Log Gaussian Cox Process
(LGCP) was fitted with a spherical covariance model for the buildings, and another with an exponential model for the
differently-­shaped clustering of the pottery (in both cases also specifying an inhomogeneous trend via the two
mapped covariates). LGCPs are well-­suited to situations where the causes of the additional clustering might
involve missing first-­order variables and/or a mixed set of second-­order interactions. It is also possible to fit
cluster processes that are more explicit about the kind of point interactions involved (e.g. parent-­offspring
models of Neyman-­Scott type, see the kppm() function in the R spatstat package and Baddeley et al., 2015,
pp. 473–479, for further details).

References
Allen, K. M. S., Green, S. W., & Zubrow, E. B. W. (Eds.). (1990). Interpreting space: GIS and archaeology. London, UK:
Taylor and Francis.
Baddeley, A. J., Rubak, E., & Turner, R. (2015). Spatial point patterns: Methodology and applications with R. Boca Raton,
US: Chapman and Hall and CRC.
Baddeley, A. J., & Turner, R. (2005). Spatstat: An R package for analyzing spatial point patterns. Journal of Statistical
Software, 12(6), 1–41.
Bevan, A. (2012). Spatial methods for analysing large-­scale artefact inventories. Antiquity, 86, 492–506.
Bevan, A., & Conolly, J. (2006). Multi-­scalar approaches to settlement pattern analysis. In G. Lock, & B. Molyneaux
(Eds.), Confronting scale in archaeology: Issues of theory and practice (pp. 217–234). New York, US: Springer.
Bevan, A., & Conolly, J. (2013). Mediterranean Islands, fragile communities and persistent landscapes: Antikythera in long-­term
perspective. Cambridge: Cambridge University Press.
Bevan, A., & Wilson, A. (2013). Models of settlement hierarchy based on partial evidence. Journal of Archaeological
Science, 40(5), 2415–2427.
Bevan, A., Crema, E. R., Li, X., & Palmisano, A. (2013). Intensities, interactions and uncertainties: Some new
approaches to archaeological distributions. In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological
spaces (pp. 27–51). Walnut Creek, US: Left Coast Press.
Bevan, A., Frederick, C., & Krahtopoulou, N. (2004). A digital Mediterranean countryside: GIS approaches to the
spatial structure of the post-­Medieval landscape on Kythera (Greece). Archaeologia e Calcolatori, 14, 217–236.
Clark, P. J., & Evans, F. C. (1954). Distance to nearest neighbour as a measure of spatial relationships in populations.
Ecology, 35, 445–453.
Clarke, D. L. (1968). Analytical archaeology. London, UK: Methuen.
Clarke, D. L. (Ed.). (1977). Spatial archaeology. Boston: Academic Press.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Crawford, O. G. S. (1912). The distribution of early Bronze Age settlements in Britain. The Geographical Journal,
40(2), 184–197.
Crema, E., Bevan, A., & Lake, M. (2010). A probabilistic framework for assessing spatio-­temporal point patterns in
the archaeological record. Journal of Archaeological Science, 37(5), 1118–1130.
Diggle, P. (2013). Statistical analysis of spatial and spatio-­temporal point patterns. Boca Raton, US: CRC and Taylor and
Francis.
Donnelly, K. (1978). Simulations to determine the variance and edge-­effect of total nearest neighbour distance. In
I. Hodder (Ed.), Simulation studies in archaeology (pp. 91–95). New York, US: Cambridge University Press.
Eve, S., & Crema, E. (2014). A house with a view? Multi-­model inference, visibility fields, and point process analysis
of a Bronze Age settlement on Leskernick Hill (Cornwall, UK). Journal of Archaeological Science, 43, 267–277.
Gelfand, A., Diggle, P., Fuentes, M., & Guttorp, P. (2010). Handbook of spatial statistics. London, UK: CRC and Taylor
and Francis.
Hodder, I. (1977). Spatial studies in archaeology. Progress in Human Geography, 1, 33–64.
Hodder, I., & Hassall, M. (1971). The non-random spacing of Romano-British walled towns. Man, 6, 391–407.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge, UK: Cambridge University Press.
Illian, J., Penttinen, A., Stoyan, H., & Stoyan, D. (2008). Statistical analysis and modelling of spatial point patterns. New
York, US: Wiley-­Interscience.
Kelsall, J., & Diggle, P. (1995). Non-­parametric estimation of spatial variation in relative risk. Statistics in Medicine,
14, 2335–2343.
Ladefoged, T., & Pearson, R. (2000). Fortified castles on Okinawa Island during the Gusuku Period, AD 1200–1600.
Antiquity, 74, 404–412.
Nakoinz, O., & Knitter, D. (2016). Modelling human behaviour in landscapes: Basic concepts and modelling elements. New
York: Springer.
Orton, C. (2004). Point pattern analysis revisited. Archaeologia e Calcolatori, 15, 299–315.
R Development Core Team. (2011). R: A language and environment for statistical computing. Vienna: R Foundation for
Statistical Computing. Retrieved from www.R-­project.org/
Ripley, B. (1977). Modelling spatial patterns. Journal of the Royal Statistical Society B, 39(2), 172–212.
Robert, C., & Casella, G. (2004). Monte Carlo statistical methods (2nd ed.). New York, US: Springer.
Smith, B., Davies, T., & Higham, C. (2015). Spatial and social variables in the Bronze Age Phase 4 cemetery of Ban Non Wat, Northeast Thailand. Journal of Archaeological Science: Reports, 4, 362–370.
Vanzetti, A., Vidale, M., Gallinaro, M., Frayer, D. W., & Bondioli, L. (2010). The iceman as a burial. Antiquity, 84,
681–692.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS.
London, UK: Taylor and Francis.
5
Percolation analysis
M. Simon Maddison

Introduction
Percolation analysis is a technique for identifying and demarcating clusters within a set of spatially arranged points. Percolation theory was originally developed in the 1940s as a means of describing gelation processes in materials, which occur as small branching molecules chemically bond to become progressively larger macro-molecules through the formation of more and more bonds (Stauffer & Aharony, 1991); the theory is attributed to the work of Flory (1941) and Stockmayer (1943). An everyday example is the change that happens to an egg when it is boiled.
There are two key aspects of this early work. First, it is based on a cellular lattice model, whereby each
cell is either occupied or not, and may or may not have neighbours. The second aspect is that the process
is applied on a sequential basis. Given a configuration of occupied cells, the process is applied step by
step, so that on the first step any given cell forms a cluster with its nearest neighbours, on the second step
the process is reapplied to the newly clustered neighbours and so on. Early research interest was in this
development within the lattice and the conditions that would allow it to progress (Stauffer & Aharony,
1991). For low occupation density within the cellular lattice, clusters will be limited and will not spread
far. However, at a critical density the cluster can grow across the lattice indefinitely, as in a boiled egg. It
was on the conditions that would allow this to occur that this early work was focused.
It was later that Broadbent and Hammersley (1957) dealt with the lattice process mathematically (see below) and gave the theory and method its name. Frisch and Hammersley (1963) describe the mecha-
nism as fluid spreading through a medium and draw a clear distinction between percolation and diffusion;
the behaviour is determined respectively either by the nature of the medium or of the fluid. These two
approaches offer significantly different mathematical challenges, but it is the medium that is of interest in
this case, hence the technique and term used is percolation.
Other applications perhaps better explain the percolation name, as the same theoretical model can be used to describe, for example, the percolation of water through porous stone, hydrogen through solids, or
natural gas through porous rocks. The conditions sought are those that allow this to happen throughout
the material. A conceptually accessible application is the propagation of fire through a forest. Fire will
spread from tree to tree within a cluster when they are close enough, but at a critical density the fire can
spread indefinitely. The temporal aspect of the theory is important as it can be used to model the length
of time before the fire burns itself out, based on the number of steps it takes to propagate (Stauffer &
Aharony, 1991).
In summary, percolation theory is a way of mathematically describing clusters of spatially arranged points and analysing related behaviour. A cluster is based on a defined distance threshold, so that for any given point all neighbouring points falling within this threshold are part of the cluster. The test is then re-applied for each of these neighbours, and any further points meeting this criterion are also deemed to be part of the cluster (see below). The test is based on the distance between points as defined by
the number of cells between them (Stauffer & Aharony, 1991) and it is important to note, as can be seen
from the examples quoted, that this can be applied at any scale, from the molecular to the geographical
and beyond.

Method
Extending this widely diverse range of applications, percolation theory has more recently been used in geography to identify metropolitan areas based on population density. The City Clustering Algorithm (CCA) was developed out of percolation theory by Rozenfeld et al. (2008) using British population data recorded for geographical cells. Described as ‘discrete CCA’, this is
based on a lattice model as described above and illustrated in Figure 5.1; the cellular structure and cluster
development are shown, illustrating the step by step approach of the cluster being identified. The top

Figure 5.1 Discrete City Clustering Algorithm applied to population density in the UK (Rozenfeld et al.,
2008, Figure 1). This shows the step by step approach of the cluster being identified on a given lattice. Top
left shows a populated lattice and top right a cell is chosen as the starting point, and its immediate neighbours
are then incorporated (bottom left). In the final bottom right quadrant the process has been reapplied to those
neighbours as well.
The top left quadrant shows a populated lattice. In the top right quadrant a populated cell is arbitrarily chosen as
the starting point, and its immediate neighbours are then incorporated (bottom left). In the final bottom
right quadrant the process has been reapplied to those neighbours as well. This approach using population
density has been carried forward by Arcaute et al. (2015).
This technique was further developed by Rozenfeld et al. (2011) to apply to US population data,
which was not available on a cellular basis as it is in Britain. In this development, the City Cluster-
ing Algorithm was modified to operate within a continuous (two-­dimensional) space and use the
Euclidean distance between points, as opposed to distance within a cellular lattice for the ‘discrete
CCA’. The technique is described as ‘continuum CCA’, shown in Figure 5.2, where an arbitrary
point is selected as a start. Any point falling within a defined threshold distance ‘l’ becomes part of
the cluster, and the process is then re-­applied to each of these points in turn, until the cluster grows
no further.
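To make the mechanics concrete, the following is a minimal sketch in R of how a single continuum CCA cluster might be grown from a seed point; the function and variable names are illustrative assumptions, not taken from the published implementations.

```r
# Minimal sketch of continuum CCA cluster growth (illustrative names, not
# the published implementation). xy is a two-column coordinate matrix,
# seed a row index, and l the threshold distance.
grow_cluster <- function(xy, seed, l) {
  in_cluster <- rep(FALSE, nrow(xy))
  in_cluster[seed] <- TRUE
  frontier <- seed
  while (length(frontier) > 0) {
    current <- frontier[1]
    frontier <- frontier[-1]
    # Euclidean distance from the current point to every other point
    d <- sqrt((xy[, 1] - xy[current, 1])^2 + (xy[, 2] - xy[current, 2])^2)
    newcomers <- which(d <= l & !in_cluster)
    in_cluster[newcomers] <- TRUE
    frontier <- c(frontier, newcomers)  # re-apply the test to each newcomer
  }
  which(in_cluster)  # row indices of the cluster members
}
```

Growing clusters in this way from every as-yet-unassigned point in turn partitions the whole dataset for a given threshold l.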
Arcaute et al. (2016) have more recently adopted this technique for defining urban areas, using
the density of street interconnections rather than population. Interestingly from the archaeological
point of view, they note that this approach reveals patterns that have evolved over millennia through
the influence of culture, politics, administration and trade, expressed as the modern patterns of streets
and roads. They have also developed analytical techniques for identifying transition points in cluster
growth as the distance threshold is progressively increased (see below).

Figure 5.2 Continuum City Clustering Algorithm (CCA) (Rozenfeld et al., 2011, Figure 2). With Con-
tinuum CCA, the technique is applied in a continuous space (as opposed to a lattice) and neighbours are
defined as falling within a given radius ‘l’. The technique is applied sequentially, starting with an arbitrarily
selected point (top left quadrant), and is then applied repeatedly to the newly included neighbours until the
cluster grows no more.
Before we go on to consider the application of percolation techniques in archaeology, it is important to note that the focus of these studies has not been purely academic; they have the potential for directly influencing regional development policy, for example.
The first application of the ‘percolation method’ to the spatial analysis of archaeological data came
with the work of De Guio and Secco (1988), who were studying landscapes of power in Mesopotamia.
Importantly and fundamentally they recognised it as a technique of pattern recognition to identify
‘natural groupings’, which might be compared with other hypotheses and sources of information. They
however used a model that incorporated several weighting parameters for each site pair in addition to the
simple Euclidean distance between them; these were mainly based on estimates of site population size, and
included 'weighted density', 'demographic energy' and 'dominance'. Unlike Euclidean distance, this introduces
significant dependencies on estimates and a degree of subjectivity. Possibly due to this complexity,
their work does not appear to have been taken up or more widely used.
More successful have been approaches that built upon the CCA studies discussed earlier. For example
Arcaute, Brookes, Brown, Lake, and Reynolds (forthcoming) have collaborated in applying the same
technique to sites extracted from the Domesday Book (e.g. The Domesday Book Online). They have used
Domesday vills and the administrative territories of hundreds, wapentakes and shires as recorded in 1086
AD to peel back the palimpsest of regions, territories and administrative boundaries in order to reveal
Domesday administrative organisation and relate it to modern Britain, through studies based on road
intersections. This builds on earlier work on the Anglo-­Saxon state by Brookes and Reynolds (2011) and
the Landscapes of Governance Project.1 Likewise, Brown (2015) has developed a GRASS GIS routine for
percolation, again using the same core technique, and applied it to the database of rural settlements in
England created by Roberts and Wrathmell (2000). The approaches developed in the Domesday work
have been influential, directly inspiring the hillfort analysis that forms a case study in this chapter (Mad-
dison, 2016, 2017).
For the study of hillforts in Britain and Ireland, which forms one of the case studies described below,
a suite of programs has been developed to perform percolation analysis (Maddison, 2016) in the R sta-
tistical programming language.2 This is based on core code provided by Elsa Arcaute originally written
for the Domesday study. It uses an algorithm very similar to the continuum City Clustering Algorithm
developed by Rozenfeld et al. (2011) described above. In this approach clusters are identified by creating
a ‘graph’ of nodes based on distances within a defined radius. The process can be repeated for a range
of different percolation radii (threshold distances), given as parameters. It should be noted that whilst
this technique works satisfactorily for datasets of a few thousand points, as in the case of the study of
hillforts in Britain, it is not able to cope with the many tens of thousands of points that make up datasets
such as the UK street intersection analysis. To deal with these volumes of data bespoke solutions have
been developed using the C programming language (Arcaute et al., 2016). Brown (2015) implemented
a similar approach through a GRASS GIS function (written in C) with some variance to optimizing the
data handling, described below. In all these bespoke solutions, data are processed in the form of x and y
coordinates for each point (e.g. the location of a hillfort or vill site), and typically read in from a .csv file,
including a unique identifier for each.
Examples of cluster plots (for hillforts) are shown in Figure 5.7. The colour indicates the size
ranking of the cluster, with red being the largest cluster, blue the next and so forth. Only the largest 15 clusters are coloured; lesser-ranked clusters are shown in grey, so that all sites are still plotted. This
provides a qualitative indication of clusters, which will be discussed further below, but it is invalu-
able to also have some way of identifying the significant clustering transitions as the percolation radius
increases. After exploring various different approaches Arcaute et al. (2016) developed the percolation
transition graph building upon earlier experiments in their city (Arcaute et al., 2015) and Domesday
studies (Arcaute, Ferguson, Brookes, & Reynolds, 2014). An example of this is shown in Figure 5.3,
Figure 5.3 Percolation transition plot – max. cluster size vs. percolation radius.
Source: Taken from the hillfort case study, the vertical axis shows the normalized maximum cluster size plotted for each percolation
radius. In this example it shows a ‘super-­cluster’ forming at a percolation radius of 35km as well as other larger transitions such as
those at 12km. The chart enables transitions of potential interest to be identified and further investigated.

from the hillfort case study. The vertical axis shows the normalized maximum cluster size plotted for
each percolation radius. In this example it shows a ‘super-­cluster’ forming at a percolation radius of
35km. This enables transitions of potential interest to be identified and further investigated. It is worth
noting that transitions will not necessarily occur for the same values in different regions. This may
depend, for example, on topography and site density.
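As a hedged sketch of how such a transition plot can be produced, the following R fragment computes the normalised maximum cluster size over a range of radii, using the DBSCAN shortcut discussed in the next paragraph; the object name xy (a two-column matrix of site coordinates) and the range of radii are assumptions for illustration.

```r
# Sketch of a percolation transition plot: normalised maximum cluster
# size against percolation radius (xy is an assumed coordinate matrix).
library(dbscan)
radii <- seq(1000, 40000, by = 1000)  # 1 km steps, in metres (illustrative)
max_share <- sapply(radii, function(r) {
  cl <- dbscan(xy, eps = r, minPts = 2)$cluster
  sizes <- table(cl[cl > 0])          # cluster id 0 marks isolated points
  if (length(sizes) == 0) 1 / nrow(xy) else max(sizes) / nrow(xy)
})
plot(radii / 1000, max_share, type = "s",
     xlab = "percolation radius (km)",
     ylab = "normalised maximum cluster size")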
This method of identifying clusters within a set of spatial points, based on Euclidean distance, is similar to Density-Based Spatial Clustering of Applications with Noise (DBSCAN) (Sander, Ester,
Kriegel, & Xu, 1998; Schubert, Sander, Ester, Kriegel, & Xu, 2017). DBSCAN provides two levels of
cluster membership, and defines clusters based on a minimum number of points that must be within
the given radius. Points that satisfy this condition are ‘core’, and those that are not but are reachable via
core points are ‘density-­reachable’ or ‘non-­core’. There is also a category of ‘outliers’ or ‘noise’ which are
neither. Depending on the given value of the minimum number of points necessary to make a cluster
(k) this reduces the effect of single points or thin chains of points linking clusters. Percolation analysis is
essentially a reduced case of this with the minimum number of points per cluster k being 2, and therefore
having only a single category of cluster membership, that is ‘core’.
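In practice this equivalence means that percolation clusters for a given radius can be reproduced with an off-the-shelf DBSCAN implementation; the sketch below uses the R dbscan package, with xy (a coordinate matrix) and r (a percolation radius) assumed as before.

```r
# Percolation clusters via DBSCAN with minPts = 2 (a sketch; xy is a
# two-column coordinate matrix and r a percolation radius).
library(dbscan)
res <- dbscan(xy, eps = r, minPts = 2)
# res$cluster holds one cluster id per point; with minPts = 2 every point
# with at least one neighbour within eps is 'core', so the clusters are
# exactly the connected components of the eps-graph. Id 0 marks isolated
# points (noise in DBSCAN terms).
table(res$cluster)
```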
Note that there is no sense in which percolation analysis seeks to explain the existence of clusters,
rather it is a descriptive process that inductively highlights underlying patterns. The question then is
whether these clusters are in some sense a relict of former socio-­political entities, and other evidence can
then be sought to establish if this is the case.
It is also important to remember that the percolation analysis has evolved from statistical techniques
in materials science where it is applied to sets of spatially distributed points, which are notionally identi-
cal and where there is no interest in distinguishing particular individuals. The same could be said for
road intersections when applied in geography. However, when used in archaeology there is poten-
tially very great interest in the individual points, reflecting as they do distinct archaeological entities; for
example when clusters merge as the radius threshold increases, specific sites may play a key role, acting as
a link to join them together and the ready identification of such sites is therefore of great value. This is
currently a limitation of the statistical functions described above, as has been recognised both by Brown
(2015) and Maddison (2017), and is a highly relevant topic for development for specifically archaeologi-
cal applications.

Case studies
Three case studies are presented below, which provide examples of the application of percolation analysis
to archaeological and historical databases; they illustrate the value and the further potential of the tech-
nique as well as providing some clear pointers as to where it might be developed in the future.

Domesday vills
Arcaute et al. (forthcoming) applied the continuum city clustering algorithm to Domesday vill sites, in
order to identify patterns of settlement in England at Domesday. The hypothesis driving the analysis was
that there was a hierarchy of territories, defined by the degree of interconnectivity between vill sites, for
which percolation is an ideal analytical tool. The dataset comprises 13,448 sites and were analysed using a
suite of programs written in R (see following). The cluster transitions are neatly shown in Figure 5.4, with
thumbnails of the clusters plotted over an outline of England and Wales. Clusters have been computed for
percolation radii incremented in steps of 0.1km, reflecting the geographical scale of the data. The step size was determined empirically; using a larger step size of, for example, 1km would mean missing some
of the potentially interesting intermediate clusters. Using a smaller step size would result in additional
computation and plots of little value. By 8.6km, all sites fall within a single cluster.
The plots for radii of 3km and 2.9km overlaid on an outline of Domesday counties show clusters that
conform well to these boundaries (Figure 5.5). At 3.2km clusters also correspond well with the known
boundaries of older kingdoms derived by traditional historical methods, including those of Mercia, the
East Angles and Kent, as well as some other counties (Figure 5.6(A)). Subdivisions of the Iron Age and
Roman Province of Dumnonia are also identifiable.
Figure 5.4 Percolation cluster transitions for Domesday settlement. Evolution of the largest cluster in the per-
colation process of Domesday settlement, overlaid on the transition plot (as in Figure 5.3). Maps of the clusters
at the distance threshold for each transition are depicted. Each vector point colour represents membership,
when two or more nodes are close enough to be part of the same cluster. A colour version of this figure can
be found in the plates section.

The visual power of the cluster plots, overlaid on the Domesday county boundaries, is striking, particularly as it corresponds so well with evidence from
historical sources of different periods.

Rural settlement in England


Brown (2015) has developed a percolation function for GRASS GIS and applied it to mid-­19th century
rural settlements in England, as identified by Roberts and Wrathmell (2000), in an analysis that also offers
a useful comparison to the results of Domesday vills study discussed above. The database consists of
10,513 settlement nucleation points created from Roberts and Wrathmell’s work by Lowerre (2012) and
is available from English Heritage (Lowerre, 2018).
Roberts and Wrathmell sought patterns of provinces, sub-­provinces and regions, based on settlement
density and topographical features in England as recorded in the mid-­19th century. One of their main
results was to identify a ‘central province’ of settlement and land use, as had earlier been proposed by
Rackham (1976). The method used was predominantly qualitative and based on many different sources
of information, including land use, topographical features and woodland cover as well as settlement
nucleation.
Figure 5.5 Domesday vill clusters at 3km and 2.9km overlaid on English coastline and Domesday counties
(generated from datasets provided by Stuart Brookes).

Figure 5.6 Domesday vill and 19th-­century settlement clusters. (a) Domesday vill clusters at 3.2km overlaid
on coastline and Domesday counties, generated from Domesday vill datasets provided by Stuart Brookes; (b &
c) Roberts and Wrathmell’s 19th Century Settlement Nucleation dataset at 3km (Brown, 2015, p. 37) and at
3.5km overlaid by Roberts and Wrathmell’s central province (Brown, 2015, p. 57).
Brown applied his percolation function to this database, using a radius step size of 500m. As above for
the Domesday example, this was chosen empirically to strike a balance between unnecessary computation
and output charts, and sufficient granularity to observe the development of clusters. By 13km all points
are included within a single cluster. Regions start to appear at 2.5km and agglomerate to three much
bigger clusters at 3km (Figure 5.6(B)), with a ‘central province’ convincingly appearing at a radius of
3.5km (Figure 5.6(C)). As the radius increases, Cornwall and part of Devon as well as Cumbria remain
independent at 5km as other areas are absorbed. There are some differences to Roberts and Wrathmell’s
central province, but this highlights areas for focused investigation in order to identify possible explana-
tions. There is much more than can be covered here, but the use of percolation analysis for this dataset
clearly provides great value not only in corroborating some of the core conclusions of Roberts and
Wrathmell’s work, but also highlighting nuances as well as deeper and older patterns worthy of further
detailed investigation.

Hillforts in Britain and Ireland


The Atlas of Hillforts in Britain and Ireland project (Lock & Ralston, 2017, 2019, forthcoming) provided an
excellent opportunity to undertake a very broad analysis of the distribution of hillforts using percolation
methods, based as it was upon a comprehensive and authoritative database with verified coordinates for
each site.
There have been various attempts at studying hillforts in terms of their spatial distribution, an early
example being in Cyril Fox’s ‘Personality of Britain’ (Fox, 1938, p. 30) which provided maps for quali-
tative analysis, covering a very wide range of evidence and environmental factors. Many others have
approached the subject since, with Newcomb (1970) probably the first to use a rigorous analytical tech-
nique, nearest neighbour analysis, manually applied to the hillforts of Penwith in Cornwall.
In summary, early work on the spatial distribution of hillforts was implemented through often beau-
tifully drawn maps, but perhaps of necessity was qualitative and intuitive in interpretation. It was from
the late 1960s that computational techniques started being applied, most notably by Hogg (1972), and
what quickly became apparent was the importance of drawing on detailed data and the wider context
and understanding of the sites analysed in order to complement quantitative analysis. As is well known,
computation itself offers no magic answer to providing a comprehensive picture of the past. It is also clear
that the strength of such approaches is that they are generally repeatable, accessible for review, revision,
and further development.
It was decided to use percolation analysis to identify any ‘natural’ or intrinsic groupings of hillforts that
might exist, and R code was developed for this purpose (see below). Once identified, these groupings
could then be compared with topography and geographical regions, as well as other historical data sets.
The dataset size is smaller than the other two examples discussed, with 2985 confirmed sites in Britain
and 347 in all of Ireland. The sites are also more difficult to categorise, having no historical records, and many are likely to have been lost or destroyed forever. To give an indication of the scale of this issue,
c.650 candidate sites in Britain were excluded as they were not conclusively thought to be hillforts. Also,
it is important to recognise that the hillforts do not represent a ‘snapshot in time’; sites were created in
different periods and may have been altered extensively over many centuries or fallen out of use. Another
difference is that the overall distribution covers the entirety of Britain and Ireland, including their islands
and the Isle of Man, so is much more widely dispersed as well as being highly non-­uniform. For concise-
ness, the discussion in this case study is focused on Britain.
Percolation was run in radius increments of 1km, although the concentration of sites in southeast Scot-
land warranted the use of smaller step sizes of 0.1km for radii between 2 and 4km.
Figure 5.7 Hillfort clusters in Britain, at (a) 34km, (b) 12km and (c) 9km percolation radius.

As described earlier, these values were established empirically as a balance between distinctiveness and lack of differentiation
of the generated cluster plots. By 35km all sites form a single large cluster, except for the Hebrides,
Shetlands, Isle of Man and Scillies. Apart from southeast Scotland, the most interesting transitions occur
in the 6–13km range (see Figure 5.3). Above this there are few transitions and the clusters are very large.
Below this range the clusters fragment excessively, although they may be of value for very local studies.
Plots for 34km, 12km and 9km radii are shown in Figure 5.7.
At 34km (Figure 5.7(A)) the predominantly Scottish cluster includes sites in England as far south as a
line roughly between Morecambe and Flamborough Head, with more southern sites forming the largest
cluster. Looking at England as the radius values reduce, sites in the Pennines and the east progressively
break out of the bigger cluster, and at 14km, the southwest peninsula forms its own cluster in Cornwall
and part of Devon, with other clusters appearing in the southeast.
The plot for 12km (Figure 5.7(B)) shows for example Cornwall and Devon/part of Somerset as
individual clusters, and a cluster along the Chilterns. The plot for 9km (Figure 5.7 (C)) shows northwest
Wales, the Clwydian Range, southwest Wales, the Gower, central Wales and the Marches and two clusters
on the northwest and the southeast of the Severn Valley, the latter being the Cotswolds. Some of these
clusters have been the subject of more detailed analysis (Maddison, 2016, 2017) and two are discussed
below.
To illustrate, two clusters in Britain are selected for discussion, namely central Wales and the Marches,
and the Cotswolds and lower Severn Valley. These have been identified from the cluster plots in Fig-
ure 5.8 and Figure 5.9. They have been plotted using a GIS on a topographical map, and the site size in
hectares has been used to scale the symbols. Specific sites and key rivers are also indicated.
Figure 5.8 Central Wales cluster at 9km with sites plotted according to area, and the rivers Wye and Severn.
Figure 5.9 Cotswold cluster at 10km radius with sites plotted according to area, and the rivers Wye, Severn and Thames.

The Central Wales and the Marches cluster is situated around a high hilly region, with the upper
stretches of the Severn being a key feature. It incorporates sites that lead around the south to the upper
Wye, as well as the large site at Titterstone Clee over the river Teme to the east.
There are two dominant sites on the upper reaches of the Severn River: Llanymynech Hill and Y Bre-
iddin. Llanymynech is one of the largest sites in Britain at 57ha, and is located where the rivers Vyrnwy,
Tanat and Cain reach the Severn Plain. It was very important as a source of copper, zinc and lead through
the Iron Age, and there is evidence of metal working dating earlier than this. Y Breiddin at 28ha is located
on the southern side of the Severn, shortly below where it is joined by the river Vyrnwy. These larger
sites suggest roles as dominant control points or entrepôts for goods moving from the hilly hinterland to the plains and lowlands beyond, down the River Severn, as Brown (2008, pp. 196–204) has argued in detail, building on the ideas of Sherratt (1996).
Figure 5.9 shows the Cotswold cluster at 10km radius located quite distinctly in the topography of
the hills, with two large and ten other sites on the north-­west edge of the escarpment overlooking the
Severn Valley, Gloucester and Cheltenham. The upper reaches of the Thames are also shown to the south,
with Trewsbury Hillfort next to its source, as well as the Fosse Way, a railway and the Thames and Severn
Canal. One other site close to the Severn is Towbury Hill Camp, next to the M50 Severn bridge 4km
north of Tewkesbury. The location of the larger sites reflects not only the longevity of their importance,
through incorporation of much older monuments, for example from the Neolithic and Bronze Ages at
Nottingham Hill Camp (shown), but also their possible role in trade, being positioned on key waterways
and routes which have continued in importance up to modern times (e.g. Roman road, canal and railway,
M50 river crossing).
These brief analyses strongly hint at groups with distinct regional identities, tied to topographical
regions, with the potential for specific sites to be important either in terms of landscape dominance or as
key linking sites through important transshipment routes. The Cotswold group is also notable for fitting
in well within the old Gloucestershire county boundary. Studies of other clusters (Maddison, 2016) yield comparable results. Together these suggest that further detailed comparative work using additional data attributes from the Atlas database (e.g. architectural features) and different data sources (e.g. finds records) might build a strong case for identifying clearly defined regions for the period.

Implementation details
The following notes describe some key implementation details for the case studies described above.
The Domesday vills study used R code developed by Elsa Arcaute. This was then further developed
by the author for the hillfort study. The studies on street intersections in Britain used bespoke programs
written in C to handle the large datasets of many tens of thousands of points, which the R code could not.
The x and y coordinates for each point, along with a unique identifier, are read in from a .csv file. Dis-
tances between points are computed and stored in a sparse matrix by systematically working through each
point to every other point. An upper distance limit is set as a parameter, and values above this are not stored;
this keeps the file size from becoming unnecessarily large. Initial runs with the hillfort dataset for instance
showed that by a threshold of 40km almost all such sites in Britain fall within a single cluster; there was
therefore no point in storing the distance for points further apart, such as between those in Cornwall and
Scotland, for example. This improves speed of processing and keeps data files manageable in size.
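A minimal R sketch of this pre-computation step might look as follows; the file name, column names and the 40km cut-off are assumptions for illustration.

```r
# Sketch of the distance pre-computation (file and column names assumed).
pts <- read.csv("sites.csv")               # columns: id, x, y
d <- as.matrix(dist(pts[, c("x", "y")]))   # all pairwise Euclidean distances
max_radius <- 40000                        # upper limit in metres; pairs
                                           # further apart are never needed
keep <- which(d <= max_radius & upper.tri(d), arr.ind = TRUE)
pairs <- data.frame(i = keep[, 1], j = keep[, 2], dist = d[keep])
```

For a few thousand points the full matrix fits comfortably in memory; the sparse storage described above matters more as datasets grow.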
For a given percolation radius, the matrix is reduced to those point pairs where the inter-­point dis-
tances are less than or equal to the radius. A graph generation process is then applied to this sub-­matrix,
and the clusters computed (a cluster comprising at least two points); each point is assigned a cluster
identifier for that particular percolation radius. The R functions used are graph.edgelist() to create the
graph from the pair list matrix, and clusters() to generate the clusters from the generated graph (R igraph
manual pages – Connected components of a graph, 2015). This process is repeated for each required value of
the percolation radius.
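Continuing the sketch above, the per-radius clustering step might be expressed with the igraph functions named in the text; the wrapper function is an illustrative assumption, not the code used in the case studies.

```r
# Cluster computation for one percolation radius (continues the 'pairs'
# sketch above; uses the igraph functions named in the text).
library(igraph)
percolate <- function(pairs, n_points, radius) {
  e <- as.matrix(pairs[pairs$dist <= radius, c("i", "j")])
  g <- graph.edgelist(e, directed = FALSE)
  # represent points with no qualifying neighbours as isolated vertices;
  # these single-point components are not clusters in the sense used here
  g <- add.vertices(g, n_points - vcount(g))
  clusters(g)$membership  # one cluster id per point, for this radius
}
membership <- percolate(pairs, nrow(pts), 9000)  # e.g. a 9km radius
```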
For mapping, the clusters are ranked by size, and a colour sequence assigned, so that clusters can be
displayed with a colour coding according to rank (see Figure 5.7 for example).
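The ranking and colour assignment can be as simple as the following sketch, which continues from the membership vector above; the colour choices are illustrative.

```r
# Rank clusters by size and colour the largest 15, as in Figure 5.7
# (continues the sketches above; colour choices are illustrative).
sizes <- sort(table(membership), decreasing = TRUE)
rank <- match(as.character(membership), names(sizes))
cols <- ifelse(rank <= 15, rainbow(15)[rank], "grey")
plot(pts$x, pts$y, col = cols, pch = 20, asp = 1)
```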
Further R code has been developed to plot the clusters in map form with overlaid boundary outlines,
provided as shape files, and accommodating the appropriate geographical reference frame. In the hillfort
case study earlier, these frames are different for Ireland and the British mainland.
Comparison with the R implementation of DBSCAN for hillforts has shown that the results are
indeed the same for k=2, with visual inspection of the clusters and the cluster transition plots show-
ing them to be identical. The recent R implementation of dbscan was used (Hahsler, Matthew, Arya, &
Mount, 2017) and integrated into the percolation analysis package described above, so the same source
data, parameters, mapping and analysis programs could be used. According to the R documentation
(dbscan), this implementation is significantly faster than earlier versions published in the fpc package
(Hennig, 2015) and works with larger datasets.
There is also a GRASS GIS function for clustering computations, v.cluster, which includes a DBSCAN
method (Metz, 2015), but, to date, this has not been compared with the R implementation described and
may not handle large datasets.
Brown (2015, pp. 22–24) took a different approach in that he implemented a GRASS function, in C.
He similarly starts with the computation of a distance matrix for every point pair. However, he then con-
verts the matrix to an edge list, and sorts the list on the basis of edge length. Clusters are then identified by
a membership algorithm which assigns cluster membership to points whose associated edge distances are
less than the defined percolation threshold. A key benefit of this approach is that the cluster identity (an
integer) is consistent as it grows through different radii (until it is either the largest cluster or is absorbed
into a larger cluster). This is unlike the R solutions above where the cluster identity is arbitrarily assigned
for each computation. This means it is much easier to inspect and interpret clusters over a range of radii
and makes it more accessible for archaeological applications.
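One way to reproduce this behaviour in R is a union-find pass over the distance-sorted edge list; the sketch below is an interpretation of the approach Brown describes, not his published C code.

```r
# Sketch of an edge-sorted membership pass with persistent cluster ids
# (an interpretation of Brown's approach, not his published code).
persistent_clusters <- function(pairs, n_points, radius) {
  parent <- seq_len(n_points)
  size <- rep(1L, n_points)
  find <- function(i) { while (parent[i] != i) i <- parent[i]; i }
  e <- pairs[pairs$dist <= radius, ]
  e <- e[order(e$dist), ]  # process shortest edges first
  for (k in seq_len(nrow(e))) {
    a <- find(e$i[k]); b <- find(e$j[k])
    if (a != b) {
      # the larger cluster keeps its label, so identities persist as
      # smaller clusters are absorbed when the radius grows
      if (size[a] < size[b]) { tmp <- a; a <- b; b <- tmp }
      parent[b] <- a
      size[a] <- size[a] + size[b]
    }
  }
  vapply(seq_len(n_points), find, integer(1))
}
```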

Conclusion
Percolation analysis is a technique for identifying clusters within a set of spatial points. The examples
described above have used simple Euclidean distance, avoiding unnecessary complexity. In this way it is
a simple application of Tobler’s Law (Tobler, 1970) in that relatedness increases with proximity. The case
studies suggest that percolation analysis has great potential within archaeology for exploring and investi-
gating spatial data for evidence of past socio-­political entities and distinct regions with their own identity.
In the first two studies historical datasets presenting tight snapshots of particular points in time have
been used for comparison with other evidence to explore hierarchies of territory. For the Domesday vills
this has provided corroboration of some county/administrative boundaries, as well as highlighting the
relicts of older kingdoms that no longer existed. The enduring character of these boundaries suggests that the
factors that created them in the first place continued to be influential in later periods and in some cases,
up to modern times. Where there are differences, as in the central province of 19th-­century settlements,
then they provide focus for more detailed investigation to better understand those specific areas. The fact
that there is good historical evidence from other sources for comparison, and that percolation analysis
provides not only corroboration but leads to other channels of investigation, gives a robust validation for
the application of the technique.
The hillfort study is different in a number of ways. There is no historic evidence for comparison, and
the dataset embodies sites that were created, developed and abandoned over different periods and over
very many centuries. In this case, percolation analysis has been applied in a totally exploratory way, to
identify groups of sites for investigation based on their relative proximity, rather than something poten-
tially much more arbitrary, as for example a modern county boundary. The results are visually compelling
and the comparison of clusters with topography, combined with symbology scaled by the enclosed area
of the sites, gives strong support to an argument for regionality, which warrants pursuing with other evi-
dence such as coins, pottery, building styles and so forth (see Cunliffe, 1991). It also provides the potential
for detailed comparisons with other datasets such as the Portable Antiquities Scheme,3 finds, place names,
population genetics (Leslie et al., 2015) and indeed the results of the other case studies.
A major weakness of course is that no account has been taken of site dating. The data and the clusters
generated reflect the final state of hillfort construction, and do not necessarily represent the situation that
may have prevailed in earlier times. However, the approach lends itself readily to more refined analy-
ses with such dating information as exists, and of course can be run repeatedly over time as more data
becomes available.
Percolation analysis is a new technique for archaeology and there are clear opportunities for further
development. As noted earlier it comes from a statistical background where points are identical, but this
is not the case in archaeology. The ability to readily identify individual points out of the analysis, such
as the peripheral ‘linking’ sites that bring clusters together as the radius is extended, would be of great
value. Other potentially important features are establishing a metric for cluster ‘robustness’ as the radius is
changed, and the need to find a way of labelling clusters as they grow. This would aid comparisons with
other datasets, and other groupings over long periods of time. As the available statistical methods are not
designed to provide this, development and adaptation of tools specifically for archaeology is an obvious
next step, as has been argued by Brown (2015). To this end, DBSCAN with its different classes of cluster
membership may provide a useful starting point.
All the examples here are based on simple Euclidean distance between points, and this has the advan-
tage of being easy to justify. However, as De Guio and Secco (1988) attempted, other parameters could
also be used, including weighted distances as a way of exploring cluster relationship to human movement
(Brown, 2015). Even for the transitions observed based on Euclidean distance (e.g. in Figure 5.4), some
comparisons with ‘day’s walk’ distances for different terrains might help explain variations in cluster
transition distances in different regions (see Herzog, this volume).
Percolation analysis provides a powerful way of visualizing clusters within an archaeological spatial
dataset and deserves to become a core tool for spatial analysis in archaeology. It is not a magic tool to
elicit the past, but it can generate useful hypotheses and the starting point for more detailed work, which
can then focus on relevant details and incorporate other sources of data, providing guidance, support and
corroboration. It hints at possible prehistoric groupings and cultural/socio-­political entities, but as always
detailed investigation on a case by case basis is required.

Notes
1 www.ucl.ac.uk/archaeology/research/projects/assembly, Reynolds, A., Yorke, B., Carroll, J., Baker, J., & Brookes,
S. Landscapes of governance project. Retrieved from http://www.ucl.ac.uk/archaeology/research/projects/assembly.
Accessed 2016.
2 www.r-­project.org, The R project for statistical computing. Retrieved from https://www.r-­project.org. Accessed
2015, 2016.
3 https://finds.org.uk/, The portable antiquities scheme. Retrieved from https://finds.org.uk/. Accessed 2016.

References
Arcaute, E., Brookes, S., Brown, T., Lake, M., & Reynolds, A. (forthcoming). Case studies in percolation analysis:
The distribution of English settlement in the 11th and 19th centuries compared. Journal of Archaeological Science.
Arcaute, E., Ferguson, P., Brookes, S., & Reynolds, A. (2014). Natural regional divisions of places in Domesday Book. Paper
presented at the The Connected Past, Imperial College London.
Arcaute, E., Hatna, E., Ferguson, P., Youn, H., Johansson, A., & Batty, M. (2015). Constructing cities, deconstructing
scaling laws. Journal of the Royal Society Interface, 12(102), Article 20140745.
Arcaute, E., Molinero, C., Hatna, E., Murcio, R., Vargas-­Ruiz, C., Masucci, A. P., & Batty, M. (2016). Cities and regions
in Britain through hierarchical percolation. Royal Society Open Science, 3(150691). http://dx.doi.org/10.1098/
rsos.150691
Broadbent, S. R., & Hammersley, J. M. (1957). Percolation processes. Mathematical Proceedings of the Cambridge Philo-
sophical Society, 53(3), 629–641. doi:10.1017/S0305004100032680
Brookes, S., & Reynolds, A. (2011). The origins of political order and the Anglo-­Saxon state. Archaeology International,
13/14, 84–93. http://dx.doi.org/10.5334/ai.1302
Brown, I. (2008). “Beacons” in the landscape: The hillforts of England and Wales. Bollington: Windgather.
Brown, T. (2015). The potential for percolation analysis within archaeology: Constructing and implementing an accessible percola-
tion method (Unpublished MSc thesis). University College London.
Cunliffe, B. (Ed.). (1991). Iron Age communities in Britain (3rd ed.). London and New York: Routledge.
dbscan. Retrieved from www.rdocumentation.org/packages/dbscan/versions/1.1-­1/topics/dbscan
De Guio, A., & Secco, G. (1988). Archaeological applications of the percolation method. Computer and quantitative methods
in archaeology, University of Birmingham.
Domesday Book Online. Retrieved from www.domesdaybook.co.uk


Flory, P. J. (1941). Thermodynamics of high polymer solutions. The Journal of Chemical Physics, 9(8), 660–661.
doi:10.1063/1.1750971
Fox, C. S. (1938). The personality of Britain: Its influence on inhabitant and invader in prehistoric and early
historic times. Cardiff, Wales: National Museum of Wales.
Frisch, H. L., & Hammersley, J. M. (1963). Percolation processes and related topics. Journal of the Society for Industrial
and Applied Mathematics, 11(4), 894–918. Retrieved from www.jstor.org/stable/2946482
Hahsler, M., Matthew, P., Arya, S., & Mount, D. (2017). Package “dbscan”. Retrieved from https://cran.r-­project.org/
web/packages/dbscan/dbscan.pdf
Hennig, C. (2015). Package “fpc”. Retrieved from https://cran.r-­project.org/web/packages/fpc/fpc.pdf
Hogg, A. H. A. (1972). The size-­distribution of hill-­forts in Wales and the Marches. In F. Lynch & C. Burgess (Eds.),
Prehistoric man in Wales and the West: Essays in honour of Lily F. Chitty (pp. 293–306). Bath: Adams and Dart.
Leslie, S., Winney, B., Hellenthal, G., Davison, D., Boumertit, A., Day, T., . . . Bodmer, W. (2015). The fine-­scale
genetic structure of the British population. Nature, 519(7543), 309–314. http://dx.doi.org/10.1038/nature14230
Lock, G., & Ralston, I. (2017). The Atlas of hillforts of Britain and Ireland. Retrieved from https://hillforts.arch.ox.ac.uk/
Lock, G., & Ralston, I. (Eds.). (2019). Hillforts: Britain, Ireland and the nearer continent. Papers from the Atlas of hillforts
of Britain and Ireland conference, June 2017. Oxford: Archaeopress.
Lock, G., & Ralston, I. (forthcoming). The Atlas of hillforts of Britain and Ireland. Edinburgh: Edinburgh University
Press.
Lowerre, A. G. (2012). Recycling Roberts and Wrathmell: Building and analysing the Atlas of rural settlement in England.
Paper presented at the computer applications and quantitative methods in archaeology, University of Southampton.
Lowerre, A. G. (2018). Atlas of rural settlement in England GIS. Retrieved from https://historicengland.org.uk/
research/current/heritage-­science/atlas-­of-­rural-­settlement-­in-­england/
Maddison, S. (2016). The spatial distribution of Iron Age hillforts in the British Isles (Unpublished MSc dissertation).
University College London.
Maddison, S. (2017). Using Atlas data: The distribution of hillforts in Britain and Ireland. Paper presented at the The atlas
of hillforts of Britain and Ireland: Results, implications and wider contexts, University of Edinburgh.
Metz, M. (2015). v.cluster – Performs cluster identification. Retrieved from https://grass.osgeo.org/grass72/manuals/v.
cluster.html. Accessed January 2018.
Newcomb, R. E. (1970). The spatial distribution of hill forts in West Penwith. Cornwall Archaeology, 9, 47–52.
Rackham, O. (1976). Trees and woodland in the British landscape. London: Dent.
R igraph manual pages: Connected components of a graph. (2015). Retrieved from http://igraph.org/r/doc/components.
html
Roberts, B. K., & Wrathmell, S. (2000). An Atlas of rural settlement in England. London: English Heritage.
Rozenfeld, H. D., Rybski, D., Andrade, J. S., Batty, M., Stanley, H. E., & Makse, H. A. (2008). Laws of population
growth. Proceedings of the National Academy of Sciences of the United States of America, 105(48), 18702. doi:10.1073/
pnas.0807435105
Rozenfeld, H. D., Rybski, D., Gabaix, X., & Makse, H. A. (2011). The area and population of cities: New insights
from a different perspective on cities. American Economic Review, 101(5), 2205–2225.
Sander, J., Ester, M., Kriegel, H.-­P., & Xu, X. (1998). Density-­based clustering in spatial databases: The algorithm
GDBSCAN and its applications. Data Mining and Knowledge Discovery,2(2),169–194. doi:10.1023/A:1009745219419
Schubert, E., Sander, J., Ester, M., Kriegel, H., & Xu, X. (2017). DBSCAN revisited, revisited: Why and how you
should (still) use DBSCAN. ACM Transactions on Database Systems (TODS), 42(3), 1–21. doi:10.1145/3068335
Sherratt, A. (1996). Why Wessex? The Avon route and river transport in later British prehistory. Oxford Journal of
Archaeology, 15(2), 211–234. doi:10.1111/j.1468-­0092.1996.tb00083.x
Stauffer, D., & Aharony, A. (1991). Introduction to percolation theory (2nd ed.). Boca Raton, USA: CRC Press.
Stockmayer, W. H. (1943). Theory of molecular size distribution and gel formation in branched-­chain polymers. The
Journal of Chemical Physics, 11(2), 45–55. doi:10.1063/1.1723803
Tobler, W. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(1),
234–240.
6 Geostatistics and spatial structure in archaeology
Christopher D. Lloyd and Peter M. Atkinson

Introduction
Geostatistics offers a framework for characterising the spatial structure of archaeological variables and
for predicting their values at locations where no sample observations are available. It can also guide sam-
pling strategies by relating spatial variation to sample spacing. In this chapter, we review the principles
of geostatistics at a basic and accessible level and we consider, via case studies, how geostatistics can ben-
efit archaeological applications. We do this with the overall intention of encouraging continued wider
adoption of geostatistics by archaeologists. The chapter begins by introducing some key concepts which
underlie geostatistics. Spatially referenced variables such as measurements of heights on earthworks,
resistivity surveys, or artefacts, have spatial structure and measuring this structure may offer considerable
insights. A key application of geostatistical tools is spatial prediction in cases where there are sparse
samples (e.g. soil geochemistry or elevation) and a complete set of gridded values is desired for the whole
study area (see also Banning this volume; Conolly this volume). In such cases, kriging prediction can
be used to derive values with the predictions informed by the variogram – a measure of the scale and
magnitude of spatial variation. Kriging and related approaches are introduced before providing some real
case studies to bring these principles and techniques to life in an archaeological context.
Geostatistical methods have been used to analyse a wide variety of variables in archaeological studies.
Introductions to the topic are provided by Robinson and Zubrow (1999), Wheatley and Gillings (2002),
Lloyd and Atkinson (2004), and Conolly and Lake (2006). Lancelotti, Negre Pérez, Alcaina-­Mateos,
and Carrer (2017) offer a brief review of kriging and allied methods. Applications of geostatistics in
archaeology are clearly diverse, focusing on a variety of variables: digital elevation models (DEMs)
(Bentley & Schneider, 2000; Hageman & Bennett, 2003; Hesse, 2010; this chapter); soils (Lloyd &
Atkinson, 2004; Entwistle, McCaffrey, & Dodgshon, 2007; Wells, 2010; Burnett et al., 2012); tephra
thickness (Athanassas, Modis, Alçiçek, & Theodorakopoulou, 2018); multiple variables including pollen,
non-­pollen palynomorphs, macrofossils, and loss on ignition (Revelles et al., 2017); human remains (14C)
(Bocquet-­Appel & Demars, 2000); electrical resistivity (Webster & Burgess, 1980); settlement terminal
dates (Neiman, 1997); site density (Zubrow & Harbaugh, 1978); coins (this chapter); pottery (Bentley &
Schneider, 2000; Lloyd & Atkinson, 2004; Bevan & Conolly, 2009) and lithics (Ebert, 2002; Barrientos,
Catella, & Oliva, 2015).
An early application was concerned with archaeological site location prediction: Zubrow and Har-
baugh (1978) made use of kriging in an application focused on reducing the effort expended in locat-
ing archaeological sites in Cañada del Alfaro in Guanajuato, Mexico, and the Hay Hollow Valley in
east-­central Arizona, USA. The study aimed to predict, from a sample of the sites identified through
fieldwork, the expected number of sites in each cell of a regular grid. The authors found that increasing
the initial sample from 12.5% of the surveyed area to 50% made relatively little difference to the number
of sites found in cells predicted by kriging. In other words, kriging enabled the location of almost as
many of the total sites from 12.5% of the total sample as it did from 50% of the total sample. The study
demonstrated that the density of sites was spatially dependent.
Applications of geostatistics in archaeology are concerned with the spatial distribution of environ-
mental characteristics (elevations, soils, tephra, etc.) or structures or other objects (e.g. pottery, coins
or lithics) produced by humans. While geostatistics has its roots in mining and most applications have
focused on features of the physical environment, using a random function model to represent uncer-
tainty in, for example, settlements or artefacts is logical given that we generally do not understand the
complex sets of interacting processes which produce the spatial distributions we observe. Athanassas
et al. (2018) used kriging to map Minoan tephra; the authors found that there was no spatial structure
for seaborne tephra while airborne tephra exhibited strong structure. An example of the mapping of
artefact distributions using kriging is provided by Barrientos et al. (2015) who mapped lithics from
the Late Holocene (c. 3000–200 14C years BP) in part of east-­central Argentina; the authors considered
the production of spatially continuous models to be helpful in understanding the spatial distribution
of lithics, although they comment on the need for richer sources of information to help construct
explanatory models of lithic landscapes. Bocquet-­Appel and Demars (2000) used geostatistics to
analyse the spatial structure of 14C dates associated with archaeological levels or directly from human
remains. The authors used kriging to chart the expansion of modern humans and the spatial contrac-
tion of the Neanderthals.

Method

Scale and spatial structure


Scale is central to the analysis of any spatial data. If data are provided as observations over areas (e.g.
grid cells in a remotely-­sensed image, or counts of people or artefacts over areas such as townlands) –
termed areal data – then results from analyses of these data are a function of the size and shape of
the zones. This connects to the Modifiable Areal Unit Problem (MAUP; Openshaw, 1984) whereby
model predictions are subject to scale effects (results are conditional on the sizes of zones) and zona-
tion effects (results are conditional on the shapes of zones). Analysis of, for example, remotely sensed
imagery is subject to the MAUP. In most archaeological analyses the observed data are effectively
point data such as the location of an artefact find or a Global Positioning System (GPS)-­derived
height measurement. However, scale effects are encountered when handling such data (e.g. predict-
ing values at locations where there are no samples). Here, the main focus is on such point data (for
point pattern analysis, see Bevan this volume).
Most geographical properties are spatially dependent – proximate values tend to be more similar
than values separated by larger distances. An obvious example of this is elevation – a location with
a small elevation value is likely to be neighboured by other locations with small elevations. While
exceptions such as cliffs are possible, this tendency is generally consistent. The degree of spatial
dependence – how far and over what spatial scales values tend to be similar – can be measured and
provides a useful summary of the spatial distribution of data values. This tendency has been referred
to as the ‘the first law of geography’ (Tobler, 1970). A related concept is spatial autocorrelation; cor-
relation refers to different variables (e.g. how strong is the correlation between distance from a river
and artefact density?), whereas spatial autocorrelation refers to the correlation of a variable with itself
(see also Hacıgüzeller this volume). Measures such as the Moran’s I spatial autocorrelation coefficient
(see Lloyd, 2010 for an introduction) represent the magnitude of the correlation between data values
and values of the same variable at neighbouring locations. Such measures can be computed over a range
of different neighbourhoods and this provides a representation of spatial structure. Returning to the
example of elevation, a mountainous area has a very different spatial structure to a river flood plain. In
the former case, the spatial variation is short range (or high frequency), as elevation values may differ
even over very short distances. In the latter case, the spatial variation is long range (or low frequency),
as elevation values tend to be quite similar over even large distances. Similarly, spatial properties such
as find locations of pottery sherds or measurements of soil phosphate have a distinct spatial structure.
This chapter introduces tools for measuring spatial structure and it suggests ways in which the derived
information may be useful.
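As an illustration of how such measures are computed in practice, the sketch below uses the R spdep package to calculate Moran's I for a variable over a chosen distance-band neighbourhood; the object names and the 5km neighbourhood are assumptions.

```r
# Moran's I over a distance-band neighbourhood (a sketch using 'spdep';
# 'dat' with columns x, y and z is an assumed data frame).
library(spdep)
coords <- cbind(dat$x, dat$y)
nb <- dnearneigh(coords, d1 = 0, d2 = 5000)   # neighbours within 5 km
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)
moran.test(dat$z, lw, zero.policy = TRUE)
# repeating this for a sequence of d2 values traces how autocorrelation
# changes with neighbourhood size, as described above
```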

Introducing geostatistics
The basis of geostatistics is the theory of regionalised variables. In geostatistical analysis, spatial varia-
tion (at a location, x) is divided into two distinct parts: a deterministic component (m(x)) (representing
‘gradual’ change over the study area) and a stochastic (or ‘random’) component (R(x)):

Z(x) = m(x) + R(x)    (6.1)

This is termed a random function (RF) model. The random part reflects our uncertainty about spatial
variables – what seems random to the observer is a function of a multiplicity of factors that may be
impossible to model directly (but this does not mean that we really think variation is random; Isaaks &
Srivastava, 1989). In geostatistics, a spatially-­referenced variable, z(x), is treated as an outcome of a RF,
Z(x). That is, we consider an observation to have been generated by the RF model and this gives us a
framework to work with these data. A realisation of a RF is called a regionalised variable (ReV; a spatial
observation set). The theory of regionalised variables (RVT) (Matheron, 1971) is the fundamental frame-
work on which geostatistics is based.
Just as we estimate parameters, for example, the mean and variance of a distribution, we estimate
parameters of the RF model using the data. These parameters, like the mean and variance, summarise the
variable. The mean and variance describe the Gaussian (or normal) distribution and so are only useful
if the empirical distribution is fitted well by a Gaussian distribution. Similarly, the parameters of the RF
model are only meaningful in certain conditions. Where the properties of the variable of interest are
the same, or at least similar in some sense, across the region of interest we can employ what is termed a
spatially stationary model. In other words, we can use the same model parameters at all locations. If the
properties of the variable are clearly spatially variable, then a standard stationary RF model may not be
appropriate (see Crema, this volume). There are different degrees of stationarity, but for present purposes
we will only consider one, termed intrinsic stationarity. There are two requirements of intrinsic stationar-
ity. Firstly, the mean is assumed to be constant across the region of interest. In other words, the expected
value of the variable does not depend on the location, x:

E{Z(x)} = µ for all x    (6.2)


Secondly, the expected squared difference between paired RFs (i.e. the observations) (summarised by the
variogram, γ(h)) should depend only on the separation distance and direction (the lag h) between the RFs
and not on the location of the RFs:
γ(h) = ½ E[{Z(x) − Z(x + h)}²] for all h    (6.3)
where x + h indicates a distance (and direction) h from location x.
In terms of the data, the expected semivariance should be the same for all observations separated
by a particular lag irrespective of where the paired observations are located. In practical terms, the
geostatistical approach can be applied irrespective of these conditions, but the results will clearly
be sub-­optimal if the data depart markedly from them. In some cases the mean is allowed to vary
from place to place, but be constant within a moving window. This is termed quasi-­stationarity
(Webster & Oliver, 2007).

The variogram
Analysis of the degree to which values differ according to how far apart they are can be conducted by
computing the variogram (or semivariogram). With reference to the variogram, the term lag is used to
describe the distance and direction by which observations are separated. For example, two observations
may be 5 km apart and one may be directly north of the other. In simple terms, the variogram is estimated
by calculating the squared differences between all the available paired observations and obtaining half the
average for all observations separated by that lag (or within a lag tolerance, e.g. 5 km +/-­2.5 km, where
the observations are not on a regular grid). The term semivariance refers to half the squared difference
between data values. Figure 6.1 gives a simple example of a transect along which observations have been
made at regular intervals to estimate the variogram. Lags h of 1 and 2 are indicated. So in this case, half
the average squared difference between observations separated by a lag of 1 is calculated and the process
is repeated for a lag of 2 and so on. In many cases the distance between observations will not be regular,
so ranges of distances are grouped. The selection of the bin size (e.g. 0–5 km, >5–10 km, >10–15 km, . . .
or 0–10 km, >10–20 km, . . .) is important. Smaller bin sizes will result in more noisy variograms while
a bin size which is too large will smooth out too much spatial structure and it will not be possible to
capture the spatial variation of interest. In other words, the plotted values in a variogram with too small a
bin size will appear to be widely scattered, while the values in a variogram with a larger bin size will tend
to be more similar to neighbouring values on the plot. Finding an appropriate bin size (usually through
trial and error) is important in characterising spatial structure and in guiding the selection and fitting of
a model, as detailed below.
The variogram can be estimated for different directions to enable the identification of directional vari-
ation (termed anisotropy). That is, rather than consider all observations 5 km from a given observation,

Figure 6.1 Transect with paired points selected for lags of 1 and 2 units.
Figure 6.2 Selection of paired observations for directional variograms.

we may consider only observations that are directly north or south (for example) of the observation of
interest within a particular angular tolerance (e.g. north or south +/-­45 degrees). In variogram estimation
(see equation (6.4)), observations z(xi) and z(xj) may be included in calculations if location xj is north
of location xi (indicated by 0º clockwise from north) within an angular tolerance of 22.5º. Using an angular tolerance of 45º (i.e. 22.5º either side of the directional lines), one strategy would be to compute directional variograms for 0º, 45º, 90º and 135º, thus giving complete coverage with no overlap
between the directions (Webster & Oliver, 2007). The selection of paired data is illustrated graphically in
Figure 6.2 where the specified direction is 45º clockwise from north and the angular tolerance is 22.5º
(i.e. 22.5º either side of the 45º directional line). In this case, z(xi) would be paired with z(xj) since it is
within the specified tolerance.
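Directional variograms of this kind can be estimated with the R gstat package, as in the following hedged sketch; a data frame dat with columns x, y and z is an assumption.

```r
# Directional experimental variograms for 0, 45, 90 and 135 degrees with
# a 22.5 degree tolerance either side (a sketch using 'gstat' and 'sp').
library(gstat)
library(sp)
coordinates(dat) <- ~x + y                    # promote dat to a spatial object
v_dir <- variogram(z ~ 1, dat,
                   alpha = c(0, 45, 90, 135), # directions clockwise from north
                   tol.hor = 22.5)            # angular tolerance either side
plot(v_dir)
```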
In summary, the variogram characterises the degree of difference in values as a function of the distance
by which they are separated. The experimental variogram, γ̂(h), relates semivariances to distances (and directions). It thus places the lag (distance for a given direction) on the x axis and semivariance on
the y axis. If a property is spatially autocorrelated, we would expect the semivariance to increase as the
distance between observations increases.
As an example, if we take a distance range of 1000 to 2000 m and there are 346 pairs of observations
separated by a distance within that band then p(h), the number of paired observations, is 346. Note that for
each pair, the semivariance is calculated twice – once with respect to the first location and once with respect
to the second. We then calculate the squared difference between each of these paired values. The first value
in each pair is given by z(xi) and the value separated from it by the specified lag h (in this example the dis-
tance is 1000 to 2000 m and we are concerned with all directions), is given by z(xi + h). So, their squared difference is given by {z(xi) − z(xi + h)}². The summed values are then divided by two – hence the term
semivariance. Putting this together, the experimental variogram for lag h is computed with:
1 p( h )

2
γˆ(h ) = {z( xi ) − z( xi + h )}  (6.4)
2 p(h ) i =1

As an example, if the lag is 5 km +/-­2.5 km (i.e. 2.5 km to 7.5 km) and two values are separated by
6.2 km, then these paired observations qualify and we compute the squared difference. If the two values
are 26.2 and 43.3 units then their squared difference is:

$$\left\{ z(x_i) - z(x_i + h) \right\}^2 = \{26.2 - 43.3\}^2 = \{-17.1\}^2 = 292.41 \text{ units}^2$$
In the same way, we compute the squared difference for all other pairs separated by 2.5 km to 7.5 km and
at each stage add the computed value to the previous values computed for that lag. Once this is done, we
multiply the summed values by 1/(2p(h)).
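The whole calculation for a single lag band can be expressed compactly in base R. The following sketch implements equation (6.4) directly, assuming coords is an n by 2 matrix of point locations (in metres) and z a vector of the values measured at them:

```r
# Experimental semivariance for one lag band, following equation (6.4).
semivariance_band <- function(coords, z, lag_min, lag_max) {
  d <- as.matrix(dist(coords))                 # all pairwise separation distances
  band <- d >= lag_min & d <= lag_max          # pairs within the lag band
  band[lower.tri(band, diag = TRUE)] <- FALSE  # consider each pair once
  pairs <- which(band, arr.ind = TRUE)
  p_h <- nrow(pairs)                           # p(h), the number of pairs
  sq_diff <- (z[pairs[, 1]] - z[pairs[, 2]])^2 # {z(xi) - z(xi + h)}^2
  sum(sq_diff) / (2 * p_h)                     # divide by 2p(h)
}

# For the 2.5 km to 7.5 km band of the example above:
# semivariance_band(coords, z, 2500, 7500)
```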
Figure 6.3 gives an example of an experimental variogram; the fitted model is described later. The
lags are 0–5,000 m, 5,000–10,000 m and so on in bins of 5,000 m up to 70,000 m. In this case, values are compared irrespective of the direction by which they are aligned; that is, whether pairs lie (approximately or exactly) north-south or east-west of one another is irrelevant. A variogram computed from data in all directions is termed omnidirectional.
In Figure 6.3, the semivariance values (that is, the average squared difference between observations
separated by each distance band) tend to be smaller for small lags and they generally increase with an
increase in lag size until perhaps 25,000 m where the values tend to level out (this is demonstrated further
down). This indicates that values are spatially dependent up to approximately this distance. At distances
larger than this, there is no spatial correlation. The variogram provides a useful means of summarising

Figure 6.3 Omnidirectional experimental variogram with fitted model.



how values change with separation distance. Using topography as an example, data representing a
‘smooth’ surface like a flood plain will have a very different variogram to data representing a ‘rough’
surface like a mountain range.
A mathematical model may be fitted to the experimental variogram and the coefficients of this model
can be used for spatial prediction using kriging or for conditional simulation (defined further down). A
model can be fitted using some fitting procedure such as ordinary least squares or weighted least squares.
A model is usually selected from one of a set of ‘authorised’ models (see Webster & Oliver, 2007). There
are two principal classes of variogram model. Transitive (bounded) models have a sill (finite variance); that
is, the variogram model levels out as it reaches a particular lag. Unbounded models do not reach an upper
bound. Figure 6.4 shows the components of a bounded variogram model. These will be defined and
then practical examples given. The nugget effect c0 represents unresolved variation (a mixture of spatial
variation at a finer scale than the sample spacing and measurement error). The structured component c
represents the spatially correlated variation. The sill (or sill variance), c0 + c, is the a priori variance. The
range, a, represents the scale (or frequency) of spatial variation. For example, if a region is mountainous
and elevation varies markedly over quite small distances then the elevation can be said to have a high
frequency of spatial variation (a short range a) while if the elevation is quite similar over much of the
area (e.g. it is a river flood plain) and varies markedly only at the extremes of the site (that is, at large
separation distances) then the elevation can be said to have a low frequency of spatial variation (a long
range). The structured component captures the magnitude of variation, while the range represents the
spatial scale of variation.
As noted above, there are many different models that can be fitted to variograms. The variogram in
Figure 6.3 was fitted with a nugget effect and a spherical model component, as defined in Equation 6.6.
The nugget effect (nugget variance) is given as:
$$\gamma(h) = \begin{cases} 0 & \text{if } h = 0 \\ c_{0} & \text{if } h > 0 \end{cases} \qquad (6.5)$$
In words, the modelled semivariance has a value of zero for a lag of zero, but is equal to c0 for all positive
lags. In Figure 6.4, the nugget effect is indicated on the y axis of the graph.

Figure 6.4 Bounded variogram model.



The spherical model, a bounded model (that is, it reaches a sill), is defined as:

$$\gamma(h) = \begin{cases} c \cdot \left[ 1.5\dfrac{h}{a} - 0.5\left(\dfrac{h}{a}\right)^{3} \right] & \text{if } h \le a \\[4pt] c & \text{if } h > a \end{cases} \qquad (6.6)$$

where c is, as noted above, called the structured component. In words, the modelled semivariance is com-
puted using the top line for all lag values up to and including the range. For lag values larger than the
range the modelled semivariance is equal to c. Authorised models may be used in combination where
a single model is insufficient to represent well the form of the variogram. For example, if the spatial
structure is complex and does not simply increase and level out (as in the example in Figure 6.3) then
models may be combined to take this complexity into account (e.g. a model could comprise a nugget
effect and two spherical components such that there would be two breaks of slope, rather than just one).
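As a sketch of how such a model might be fitted in practice with gstat (fit.variogram uses weighted least squares by default; the starting values for the sill, range and nugget here are hypothetical and would normally be read off the experimental variogram plot):

```r
library(gstat)

v <- variogram(z ~ 1, pts, width = 5000)  # experimental variogram
# A nugget effect plus one spherical component:
m <- fit.variogram(v, vgm(psill = 15, "Sph", range = 25000, nugget = 5))
plot(v, model = m)                        # inspect the fit

# Combining authorised models: a nugget and two spherical components,
# giving two breaks of slope as described above.
m2 <- vgm(10, "Sph", 40000, add.to = vgm(5, "Sph", 10000, nugget = 5))
```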
The model fitted to the variogram can be used to determine the weights assigned to observations using
a geostatistical prediction procedure (or family of procedures) called kriging. In kriging, the variogram
model is used to obtain values of the semivariance for the lags by which observations are separated and
for the lags that separate the prediction location from the observation.

Kriging
There are many varieties of kriging and a summary of the procedure is also presented by Conolly in this
volume. Its simplest form is called simple kriging (SK). To use SK it is necessary to know the mean of the
property of interest and this must be constant across the region of interest. In practice this is rarely the case.
The most widely used variant of kriging, ordinary kriging (OK), allows the mean to vary and the mean is
estimated for each prediction neighbourhood. The OK predictions are weighted averages of the n available
data (that is, the predictions are based on the n nearest neighbours of the prediction location). The OK
prediction, zˆ( x0 ) , is defined as:
n
zˆ( x0 ) = ∑ λi z( xi )  (6.7)
i =1

with the constraint that the weights, li, sum to 1 (this is to ensure an unbiased prediction):
n

∑λ
i =1
i = 1 (6.8)

In words, the objective of the kriging system is to find appropriate weights by which the available
observations will be multiplied before summing them to obtain the predicted value. These weights are
determined using the coefficients of a model fitted to the variogram (or another function such as the
covariance function).
The weights are obtained by solving (that is, finding the values of unknown coefficients in) the OK
system:

$$\begin{cases} \displaystyle\sum_{j=1}^{n} \lambda_{j}\, \gamma(x_{i} - x_{j}) + \psi = \gamma(x_{i} - x_{0}), & i = 1, \ldots, n \\[6pt] \displaystyle\sum_{j=1}^{n} \lambda_{j} = 1 \end{cases} \qquad (6.9)$$

where ψ is what is called a Lagrange multiplier. This equation may seem complicated at first sight. In words, it says that the sum of the weights multiplied by the modelled semivariance for the lag separating locations xi and xj, plus the Lagrange multiplier, equals the semivariance between location xi and the prediction location x0, with the constraint that the weights must sum to one. The way we find the weights
and the Lagrange multiplier is outlined below.
Computing the weights and a value of the Lagrange multiplier, ψ, allows us to obtain the prediction variance of OK, a by-product of OK, which can be given as:

$$\hat{\sigma}_{OK}^{2} = \sum_{i=1}^{n} \lambda_i \gamma(x_i - x_0) + \psi \qquad (6.10)$$

The kriging variance is a measure of confidence in predictions and is a function of the form of the var-
iogram, the sample configuration and the sample support (the area over which an observation is made,
which may be approximated as a point or may be an area) (Journel & Huijbregts, 1978). If the variogram
model range is short, then the kriging variance will increase markedly with distance from the nearest
sample(s). There are two varieties of OK: punctual OK and block OK. With punctual OK the predic-
tions cover the same area (the support, V) as the observations. In block OK, the predictions are made to
a larger support than the observations (e.g. prediction from points to areas of 2 m by 2 m). The system
presented here is for the more commonly used form, punctual OK.
Returning to Equation 6.9, using matrix notation, the OK system can be written as:

$$K\lambda = k \qquad (6.11)$$

where K is the n+1 by n+1 (with n nearest neighbours used for prediction) matrix of semivariances
between each of the observations:
$$K = \begin{bmatrix} \gamma(x_1 - x_1) & \cdots & \gamma(x_1 - x_n) & 1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n - x_1) & \cdots & \gamma(x_n - x_n) & 1 \\ 1 & \cdots & 1 & 0 \end{bmatrix}$$

λ are the OK weights and k are the semivariances for the observations to the prediction location (with one placed in the bottom position):

$$\lambda = \begin{bmatrix} \lambda_1 \\ \vdots \\ \lambda_n \\ \psi \end{bmatrix} \qquad k = \begin{bmatrix} \gamma(x_1 - x_0) \\ \vdots \\ \gamma(x_n - x_0) \\ 1 \end{bmatrix}$$

To obtain the OK weights, the inverse of the data semivariance matrix is multiplied by the vector of data
to prediction semivariances:
$$\lambda = K^{-1} k \qquad (6.12)$$

The OK variance is then obtained with:

$$\sigma_{OK}^{2} = k^{T} \lambda \qquad (6.13)$$

As an example using four data locations, the OK system is given as:

 0 376.905 359.589 379.853 1  λ1   268.116


     
376.905 0 307.108 394.601 1  λ2  311.250
     
359.589 307.108 0 448.401 1 ×  λ3  = 311.983

379.853 394.601 448.401 0 1  λ4  367.662

 1 1 1 1 0 ψ   1 


Note that the semivariance between a given location and itself is set to zero.
Solving the OK system, the weights are as follows: λ1 = 0.368, λ2 = 0.227, λ3 = 0.234, λ4 = 0.171 and ψ = 33.332.

The predicted value is then given by: (0.368 × 68) + (0.227 × 29) + (0.234 × 48) + (0.171 × 53) = 51.889.

The kriging variance is given by: (0.368 × 268.116) + (0.227 × 311.250) + (0.234 × 311.983) + (0.171 × 367.662) + (33.332 × 1) = 338.537.
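Because the OK system is a set of linear equations, the worked example can be checked in a few lines of base R; solving with solve() should reproduce, up to rounding, the weights, prediction and kriging variance reported above (68, 29, 48 and 53 are the four observed values used in the prediction):

```r
# The 5 x 5 matrix K and vector k from the worked example above.
K <- matrix(c(  0.000, 376.905, 359.589, 379.853, 1,
              376.905,   0.000, 307.108, 394.601, 1,
              359.589, 307.108,   0.000, 448.401, 1,
              379.853, 394.601, 448.401,   0.000, 1,
                1,       1,       1,       1,     0),
            nrow = 5, byrow = TRUE)
k <- c(268.116, 311.250, 311.983, 367.662, 1)

lambda <- solve(K, k)              # weights lambda_1..lambda_4, then psi
z_obs  <- c(68, 29, 48, 53)        # observed values at the four locations
pred   <- sum(lambda[1:4] * z_obs) # OK prediction (about 51.9)
ok_var <- sum(k * lambda)          # kriging variance, equation (6.13)
```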
The kriging variance is a useful by-­product which, as detailed earlier, provides a guide to the uncer-
tainty in the predicted values. Where values of the kriging variance are large, this suggests a higher level
of uncertainty; values will be larger as distance from the nearest samples increases and for short range
variation (as defined previously), as this indicates greater spatial variation.
With SK, the mean is assumed to be constant across the study area; this is very unlikely in reality as most
real-­world properties (including, for example, elevation or artefacts) have spatially-­variable means. OK
allows for variation in the mean. In some cases there is a strong spatial trend (e.g. large values in the east
and small values in the west). In such cases, an alternative to OK, kriging with a trend model (KT) may
be advisable and may make more accurate predictions (cf. Lloyd, 2014). There are several other forms of
kriging; cokriging, for example, allows the integration of information about secondary variables. In cases
where we have a secondary variable (or variables) which is cross-­correlated with the primary variable
both (or all) variables may be used simultaneously to make predictions using cokriging. With cokriging,
the variograms (which can be termed autovariograms) of both (or all) variables and the cross-­variogram
(describing the spatial dependence between the two variables) must be estimated and models fitted to
all of these. Cokriging is based on what is called the linear model of coregionalization (see Atkinson,
Webster, & Curran, 1992), which provides a means to model the autovariograms and cross-­variograms,
so as to ensure that the variances of any combination of the variables are positive. For cokriging to be
beneficial, the secondary variable should be cheaper to obtain or more readily available than the primary
variable (i.e. the variable which will be mapped) such as in the case of precipitation maps produced using
information on elevation, with which precipitation is positively correlated. An archaeological example is
given by Conolly and Lake (2006) who suggest the case of lithic artefacts and slope values. If the variables
are strongly related linearly then cokriging may provide more accurate predictions than OK.
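A sketch of a cokriging workflow in gstat follows; the object names are hypothetical, with a sparsely sampled primary variable z and a denser, cross-correlated secondary variable w. The fit.lmc() function fits a linear model of coregionalization to the autovariograms and cross-variogram together:

```r
library(gstat)

g <- gstat(NULL, id = "z", formula = z ~ 1, data = pts_primary)
g <- gstat(g,    id = "w", formula = w ~ 1, data = pts_secondary)
v <- variogram(g)                            # auto- and cross-variograms
g <- fit.lmc(v, g, vgm(1, "Sph", 25000, 1))  # linear model of coregionalization
ck <- predict(g, newdata = grid)             # cokriging predictions
```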

Conditional simulation
Kriging predictions are weighted moving averages of the available sample data. Kriging is, therefore,
a smoothing interpolator. Conditional simulation (also called stochastic imaging) is not subject to the
smoothing associated with kriging (conceptually, the variation lost by kriging due to smoothing is added
back) as predictions are drawn from equally probable joint realisations of the random variables (RVs)
which make up an RF model (Deutsch & Journel, 1998). In other words, simulated values are not the

expected values (i.e. the mean) but are values drawn randomly from the conditional cumulative distribu-
tion function (ccdf): a function of the available observations and the modelled spatial variation (Dungan,
1999). The simulation is considered ‘conditional’ if the simulated values ‘honour’ the observations at their locations – that is, at data locations the simulated values match the observed values (Deutsch & Journel, 1998). Simulated realisations represent a possible reality whereas kriging does not. Simulation allows the
generation of many different possible realisations that may be used as a guide to spatial uncertainty in the
construction of a map (Journel, 1996), that is, encapsulating the uncertainty in spatial prediction.
Probably the most widely used form of conditional simulation is sequential Gaussian simulation (SGS).
With sequential simulation, simulated values are conditional on the original data and previously simulated
values (Deutsch & Journel, 1998). With SGS, all locations at which simulated values are required are
visited in random order and the neighbouring values are used to derive the simulated value. By using
different random number seeds the order of visiting locations is varied and, therefore, multiple realisations
can be obtained. In other words, since the simulated values are added to the dataset, the values available
for use in simulation are partly dependent on the locations at which simulations have already been made
and, because of this, the values simulated at any one location vary as the available data vary. Using SGS,
multiple alternative realisations can be generated and the distribution of values simulated at each location
can be used to assess spatial uncertainty.
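In gstat, SGS is invoked through the nsim argument to krige(); the sketch below assumes a fitted variogram model m, (approximately) Gaussian data in pts, and a prediction grid:

```r
library(gstat)

sims <- krige(z ~ 1, pts, newdata = grid, model = m,
              nsim = 100,  # 100 equally probable conditional realisations
              nmax = 30)   # neighbours used at each simulated location

# The spread of simulated values at each location is a guide to spatial
# uncertainty, e.g. the per-location standard deviation across realisations:
sim_sd <- apply(sims@data, 1, sd)
```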

Case studies
Two case studies are used here to illustrate the application of some key geostatistical tools. These focus on
(1) an earthwork in Northern Ireland and (2) Roman coins in southern Britain. All of the analyses make
use of the R statistical language (R Core Team, 2018) and the gstat package1 in particular (see Bivand,
Pebesma, & Gómez-­Rubio, 2013).

Ballyhenry rath, County Antrim, Northern Ireland


Geostatistical methods have been used widely to generate topographic surface models. In this case study,
such methods are used to generate models of earthworks using differential GPS data points as a basis.
The focus of the study is an early Christian rath (circular earthwork) in Ballyhenry, County Antrim,
Northern Ireland (see Lynn, 1984). The data points are shown in Figure 6.5; the variogram estimated
from the height data is shown in Figure 6.6. The shape of the variogram is concave (it curves upwards)
at smaller lags – this is typical of smoothly varying properties such as elevation. The semivariances do not
‘level out’ at larger lags (it is not bounded, as discussed above in the introduction to the variogram) – this
suggests that there is a long-­range trend in elevation values. The directional variogram (for 0, 45, 90, and
135° clockwise from north; Figure 6.7) has a similar form to the omnidirectional variogram, although
the maximum semivariances differ, supporting the idea that there is a trend in one direction; for example,
a gradual, consistent slope from east to west. It is not desirable that a trend dominates the variogram as
we are concerned with finer scale variation – this is the concern if a variogram model is being used to
make predictions using local data subsets with kriging. For this reason, a detrending approach was used –
this entails fitting a trend model (in this case a flat plane) to the data and then estimating the variogram
from the residuals. If the trend is a simple linear change in elevation then the resulting variogram will
be bounded. Figure 6.8 shows the detrended variogram. It is clear that the semivariances now level
out for at least some directions. The detrended omnidirectional variogram is given in Figure 6.9. The
model fitted is a Bessel model (see Pebesma, 2000) with a sill of 0.257 and a range of 10.379 m. This
is a bounded model which accounts for smooth variation; unlike the spherical model, for example, it is
Figure 6.5 Location of GPS measurements at Ballyhenry rath.
Geostatistics and spatial structure 105

Figure 6.6 Experimental variogram of GPS measured heights, Ballyhenry rath.

concave upwards at the origin. These model coefficients were used for kriging with a trend model (KT).
Variograms provide a useful way of summarising spatial variation and these could be used, for example,
to characterise a set of earthworks and to provide an index of surface roughness in each case.
The KT predictions are shown in Figure 6.10; the shape of the rath, with part of the western side
damaged, is apparent. Given the finely spaced source data, and the smoothly varying elevations, use of
a simpler interpolation method such as inverse distance weighting (IDW) may produce similar results
(see Conolly, this volume) and the added-­value of kriging tends to increase as sample spacing increases
and spatial variation increases. Nonetheless, kriging offers the optimal prediction amongst linear predic-
tors as, unlike IDW, it makes use of information on underlying spatial variation through the use of the
variogram. In addition, a by-­product of kriging is provided – the kriging variance (Figure 6.11). This
is a function of the sampling configuration and the form of the variogram and it constitutes a guide to
uncertainty in predictions.
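As a sketch of this workflow in gstat (assuming the GPS heights are held in a SpatialPointsDataFrame gps whose coordinates are named x and y, and that grid holds the prediction locations), supplying the coordinates in the formula both detrends the variogram and produces KT rather than OK predictions:

```r
library(gstat)

v_resid <- variogram(height ~ x + y, gps)  # variogram of residuals from a plane
# Bessel model with the coefficients reported above:
m_bes <- fit.variogram(v_resid, vgm(psill = 0.257, "Bes", range = 10.379))
kt <- krige(height ~ x + y, gps, newdata = grid, model = m_bes)

spplot(kt["var1.pred"])  # KT predictions (cf. Figure 6.10)
spplot(kt["var1.var"])   # kriging variances (cf. Figure 6.11)
```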

Figure 6.7 Experimental directional variogram of GPS measured heights, Ballyhenry rath.

Conditional simulation provides a way to construct multiple equally-­probable realisations (in this case,
multiple sets of different elevation values). These possible realities are arguably a more appropriate rep-
resentation of real-­world properties such as elevation than are the over-­smoothed grids derived through
interpolation; Figure 6.12 shows ‘2.5D’ representations of the rath derived using KT (6.12(a)) and a single realisation derived using conditional simulation (6.12(b)). Note that the simulated map exhibits
greater variation than the KT map and is strictly half as precise as the KT map. However, whereas the
KT map can never exist in reality (because it is over-­smooth) the conditionally simulated map might (it
is one of multiple possible realities).

Coins of Allectus
In AD 286/7 a breakaway empire was formed in Britain and northern Gaul by the usurper Carausius.
After his death in 293 he was replaced by Allectus (died 296), finance minister to Carausius. Carausius
and later Allectus struck billon (debased silver) coins marked with an ‘L’, denoting Londinium (London)

Figure 6.8 Experimental detrended directional variogram of GPS measured heights, Ballyhenry rath.

and a ‘C’, which has been variously attributed to Camulodunum (Colchester), Corinium (Cirencester),
Glevum (Gloucester; on the grounds that C and G are indistinguishable on the coins), or a travelling
mint. Here, the percentage of coins within sites (defined later) which are from the C mint is analysed
using a geostatistical approach. It is worth noting that Portable Antiquities Scheme data are point events
and they could be treated instead as a point pattern; kernel estimation could then be used to produce an
intensity grid (see Bevan this volume). However, the data used here may be considered a ‘random’ sample
(i.e. realisations) of a much larger population (i.e. constituting a RF) of coins and we are not as interested
in the density of coins as in the expectation of the proportional attribution to L or C assuming that coins
may be found anywhere. Geostatistical methods were, therefore, considered appropriate.
Lloyd (1998) noted a western focus for C mint coins, suggesting that Glevum may be the most likely
attribution. In contrast, Walton (2011) observed no obvious trends in the products of the two mints for
either Carausius or Allectus. Here, as in Walton (2011), coins recorded as a part of the Portable Antiquities
Scheme2 are the subject of analysis. Point locations of coin finds were aggregated into grid cells of 5 km
by 5 km and only those cells containing at least two coins marked L or C were retained. The percentage

Figure 6.9 Experimental detrended variogram of GPS measured heights, Ballyhenry rath, with fitted model
(Bessel model with a sill of 0.257 and a range of 10.379 m).

of coins from the C mint are shown in Figure 6.13. There are clear localised concentrations of coins
from London (corresponding to small percentages of C mint coins) or the C mint, although there is no
obvious region-­wide trend. The focus here is on determining if there is any spatial structure or if their
distribution appears unstructured. Analysis of raw percentages using statistical methods is not appropri-
ate and the percentages were log-­ratio transformed (see Aitchison, 1986) prior to analysis: C mint log
ratio = ln((C mint % + 0.01)/(100.0 – C mint % + 0.01)) (with 0.01 added to prevent logging zeros). In
the final kriged output the log-­ratios were back-­transformed to percentages with: exp(C mint log ratio)/
(1+exp(C mint log ratio)) * 100.0.
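In R, the transform and back-transform are one line each; c_pct here is a hypothetical vector of C mint percentages in the grid cells:

```r
# Log-ratio transform before variogram estimation and kriging
# (0.01 is added to prevent logging zeros):
c_lr <- log((c_pct + 0.01) / (100.0 - c_pct + 0.01))

# Back-transform of the kriged log-ratios to percentages:
c_back <- exp(c_lr) / (1 + exp(c_lr)) * 100.0
```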
Figure 6.14 shows the directional variogram for C mint percentages. This suggests little spatial struc-
ture in most directions – for 0, 45 and 135° the models fitted to the semivariances would be close to flat;
this indicates no spatial structure. However, for 90° (east-­west direction) there is clear spatial structure
with semivariances increasing systematically from the smallest lag to 50 km and then levelling out.

Figure 6.10 Elevation estimates (in metres), derived using kriging with a trend model.

Figure 6.11 Kriging variances.

Figure 6.12 ‘2.5D’ representation of (a) kriged elevations and (b) conditionally simulated values (viewed from the southwest). A colour version of this figure can be found in the plates section.

Figure 6.13 Radiate of Allectus: C mint percentages in 5 km grid cells. A colour version of this figure can be found in the plates section.

Figure 6.14 Directional variogram of C mint percentages.

Figure 6.15 shows the variogram for 90° with a fitted model (nugget effect of 15.4, and a spherical model component with a structured component of 14.2 and a range of 32,549 m; the units are percentages).
This suggests that there is structure in the east-­west direction, corresponding to bands of small/large C
mint percentages with a range of approximately 32 km. Figure 6.16 shows a map of C mint percentages
derived using kriging; this indicates that there are several localised concentrations of C mint coins – an
area to the east of the River Severn, an area around Essex, parts of the English midlands, and areas around
Leicestershire and Lincolnshire. The values in the far southwest are discounted as they fall at the edges
of the study area. A problem underlying analyses based on these data is that they are often single finds
rather than assemblages from sites and here all grid cells with more than two coins are used. Extending
the analysis to include sets of finds from archaeological excavations (as in Lloyd, 1998) would be benefi-
cial. But the provisional findings do suggest that there may be spatial structure in mint products and that

Figure 6.15 Directional variogram of C mint percentages: 90º clockwise from north (east-­west); with fitted
model.

circulation of coins has not removed all evidence of such structure. However, based on this analysis, there
is clearly no strong evidence for any of the possible candidates for the C mint.

Conclusion
The chapter has introduced some key concepts and standard basic methods in geostatistics. The field is
dynamic and methodological innovation continues. In most analyses, the variogram is assumed to be con-
stant across the study area; however, in many cases the underlying spatial structure is not constant and an
array of methods for estimation of local variograms in such cases have been developed (see Lloyd, 2014 for
a review). Multiple point geostatistics (Mariethoz & Caers, 2015) offers a powerful means to incorporate
information on physical reality in stochastic modelling. Other innovations include the use of non-­linear
distance measures; a simple example is the use of cost surfaces to model travel time between places rather
than using straight line (Euclidean) distances. Negre, Muñoz, and Lancelotti (2016) provide an archaeo-
logical example whereby the walls of a house were used as barriers to the distribution of calcium residues.
The scope and number of applications of geostatistics in archaeology have grown rapidly in the last decade as the availability of proprietary and free open source software environments (e.g. the R statistical environment) to implement the methods has increased. A growth in examples has also facilitated new
analyses. Geostatistics has considerable potential in archaeological applications ranging from site prospec-
tion, through to analysis of artefact distributions, soil properties, and construction of earthwork digital
models. In detailing some key principles and outlining some ways in which geostatistical methods have
added to the study of archaeological variables it is hoped that this chapter will encourage further analyses,
especially using new and exciting datasets such as those provided via the Portable Antiquities Scheme in England and Wales.

Figure 6.16 Kriged map of C mint percentages. A colour version of this figure can be found in the plates section.

Acknowledgements
Conor Graham of Queen’s University Belfast is thanked for allowing the use of the Ballyhenry rath data.
The staff of the Portable Antiquities Scheme (https://finds.org.uk/) are thanked for provision of the data
on coins of Allectus.

Notes
1 The gstat package for R was authored by Edzer Pebesma and Benedikt Graeler (https://cran.r-­project.org/web/
packages/gstat/gstat.pdf).
2 The Portable Antiquities Scheme (PAS) is a joint initiative between the British Museum and Amgueddfa Cymru –
National Museum Wales that encourages the general public in England and Wales to record any archaeological
objects they find (https://finds.org.uk/).

References
Aitchison, J. (1986). The statistical analysis of compositional data. London: Chapman and Hall.
Athanassas, C. D., Modis, K., Alçiçek, M. C., & Theodorakopoulou, K. (2018). Contouring the cataclysm: A geo-
graphical analysis of the effects of the Minoan eruption of the Santorini volcano. Environmental Archaeology, 23,
160–176.
Atkinson, P. M., Webster, R., & Curran, P. J. (1992). Cokriging with ground-­based radiometry. Remote Sensing of
Environment, 41, 45–60.
Barrientos, G., Catella, L., & Oliva, F. (2015). The spatial structure of lithic landscapes: The Late Holocene record of
East-­Central Argentina as a case study. Journal of Archaeological Method and Theory, 22, 1151–1192.
Bentley, J., & Schneider, T. J. (2000). Statistics and archaeology in Israel. Computational Statistics and Data Analysis,
32, 465–483.
Bevan, A., & Conolly, J. (2009). Modelling spatial heterogeneity and nonstationarity in artifact-­r ich landscapes.
Journal of Archaeological Science, 36, 956–964.
Bivand, R. S., Pebesma, E., & Gómez-­Rubio, V. (2013). Applied spatial data analysis with R (2nd ed.). UseR! Series.
New York: Springer.
Bocquet-­Appel, J. P., & Demars, P. Y. (2000). Neanderthal contraction and modern human colonization of Europe.
Antiquity, 74, 544–552.
Burnett, R. L., Terry, R. E., Alvarez, M., Balzotti, C., Murtha, T., Webster, D., & Silverstein, J. (2012). The ancient
agricultural landscape of the satellite settlement of Ramonal near Tikal, Guatemala. Quaternary International, 265,
101–115.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Deutsch, C. V., & Journel, A. G. (1998). GSLIB: Geostatistical software library and user’s guide (2nd ed.). New York:
Oxford University Press.
Dungan, J. L. (1999). Conditional simulation. In A. Stein, F. van der Meer, & B. Gorte (Eds.), Spatial statistics for remote
sensing (pp. 135–152). Dordrecht: Kluwer Academic Publishers.
Ebert, D. (2002). The potential of geostatistics in the analysis of fieldwalking data. In D. Wheatley, G. Earl, &
S. Poppy (Eds.), Contemporary themes in archaeological computing (pp. 82–89). University of Southampton Depart-
ment of Archaeology Monograph, 3. Oxford: Oxbow Books.
Entwistle, J., McCaffrey, K., & Dodgshon, R. (2007). Geostatistical and multi-­elemental analysis of soils to interpret
land-­use history in the Hebrides, Scotland. Geoarchaeology, 22, 391–415.

Hageman, J. B., & Bennett, D. A. (2003). Construction of digital elevation models for archaeological applications.
In K. L. Wescott & R. J. Brandon (Eds.), Practical applications of GIS for archaeologists: A predictive modelling toolkit
(pp. 113–127). Boca Raton: CRC Press.
Hesse, R. (2010). LiDAR-­derived local relief models: A new tool for archaeological prospection. Archaeological Prospec-
tion, 17, 67–72.
Isaaks, E. H., & Srivastava, R. M. (1989). An introduction to applied geostatistics. New York: Oxford University Press.
Journel, A. G. (1996). Modelling uncertainty and spatial dependence: Stochastic imaging. International Journal of
Geographical Information Systems, 10, 517–522.
Journel, A. G., & Huijbregts, C. J. (1978). Mining geostatistics. London: Academic Press.
Lancelotti, C., Negre Pérez, J., Alcaina-­Mateos, J., & Carrer, F. (2017). Intra-­site spatial analysis in ethnoarchaeology.
Environmental Archaeology, 22, 354–364.
Lloyd, C. D. (1998). The C mint of Carausius and Allectus. British Numismatic Journal, 68, 1–10.
Lloyd, C. D. (2010). Spatial data analysis: An introduction for GIS Users. Oxford: Oxford University Press.
Lloyd, C. D. (2014). Exploring spatial scale in geography. Chichester: Wiley-­Blackwell.
Lloyd, C. D., & Atkinson, P. M. (2004). Archaeology and geostatistics. Journal of Archaeological Science, 31, 151–165.
Lynn, C. J. (1984). Two raths at Ballyhenry, County Antrim early Christian period, each overlying prehistoric mate-
rial. Ulster Journal of Archaeology, Series 3, 46, 67–91.
Mariethoz, G., & Caers, J. (2015). Multiple-­point geostatistics: Stochastic modeling with training images. Chichester: Wiley.
Matheron, G. (1971). The theory of regionalized variables and its applications. Les Cahiers du Centre de Morphologie
Mathématique de Fontainebleau, 5. Fontainebleau: École Nationale Supérieure des Mines.
Negre, J., Muñoz, F., & Lancelotti, C. (2016). Geostatistical modelling of chemical residues on archaeological floors
in the presence of barriers. Journal of Archaeological Science, 70, 91–101.
Neiman, F. D. (1997). Conspicuous consumption as wasteful advertising: A Darwinian perspective on spatial patterns
in Classic Maya terminal monument dates. In M. C. Barton & G. A. Clark (Eds.), Rediscovering Darwin: Evolutionary theory and archaeological explanation (pp. 267–290). Archaeological Papers of the American Anthropological
Association, 7. Washington, DC: American Anthropological Association.
Openshaw, S. (1984). The modifiable areal unit problem. Concepts and Techniques in Modern Geography, 38. Norwich: Geo Books. Retrieved from http://qmrg.org.uk/files/2008/11/38-maup-openshaw.pdf
Pebesma, E. J. (2000). Gstat manual. Utrecht: Utrecht University.
R Core Team. (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical
Computing. Retrieved from www.R-­project.org/
Revelles, J., Burjachs, F., Morera, N., Barceló, J. A., Berrocal, A., López-­Bultó, O., . . . Terradas, X. (2017). Use of
space and site formation processes in a Neolithic lakeside settlement: Pollen and non-­pollen palynomorphs spatial
analysis in La Draga (Banyoles, NE Iberia). Journal of Archaeological Science, 81, 101–115.
Robinson, J. M., & Zubrow, E. (1999). Between spaces: Interpolation in archaeology. In M. Gillings, D. Mattingly, &
J. van Dalen (Eds.), The archaeology of mediterranean landscapes (pp. 65–83). Oxford: Oxbow Books.
Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46,
234–240.
Walton, P. J. (2011). Rethinking Roman Britain: An applied numismatic analysis of the Roman coin data recorded by the
portable antiquities scheme (Unpublished PhD thesis). University College London.
Webster, R., & Burgess, T. M. (1980). Optimal interpolation and isarithmic mapping of soil properties III: Changing
drift and universal kriging. Journal of Soil Science, 31, 505–524.
Webster, R., & Oliver, M. A. (2007). Geostatistics for environmental scientists (2nd ed.). Chichester: John Wiley and Sons.
Wells, E. C. (2010). Sampling design and inferential bias in archaeological soil chemistry. Journal of Archaeological
Method and Theory, 17, 209–230.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor and Francis.
Zubrow, E. B. W., & Harbaugh, J. W. (1978). Archaeological prospecting: Kriging and simulation. In I. Hodder (Ed.),
Simulation studies in archaeology (pp. 109–122). Cambridge: Cambridge University Press.
7
Spatial interpolation
James Conolly

Introduction
To estimate the value of a phenomenon at an unsampled location from samples of surrounding data requires
interpolation. This contrasts with extrapolation which is a process of estimating values beyond the extent of a
sample. Both approaches require a model to provide the estimate, which is based on a prediction function.
This might be a heuristic model – visually estimating or just educated guessing – but if the goal of predic-
tion is to estimate values at multiple unsampled locations in space, then a statistical approach provides a more
robust solution. Often, surface interpolation is used to generate a continuous raster-­based surface model
from a scatter of discrete point samples. The utility of spatial interpolation means that it is a commonly
applied tool used across a wide range of disciplines with interests in visualizing and predicting spatial patterns
and processes. In archaeology, it provides opportunities for the visualization and prediction of a wide variety
of geographically variable phenomena, such as artifact intensities (densities), topographic features, or even
more complex models of processes such as the space-­time dynamics of cultural change. Conversely, spatial
extrapolation is more prone to error and has more limited value, but might be useful in cases where a clear
trend such as declining artifact densities needs to be estimated beyond a survey zone. For more informa-
tion on extrapolation, see Miller, Turner, Smithwick, Dent, and Stanley (2004) and Peters, Herrick, Urban,
Gardner, and Breshears (2004) for a discussion of issues and applications.
The basics of spatial interpolation are relatively easy to define, but there are several distinct types of
methods – Li and Heap (2008) review over forty – and they vary in their assumptions (or lack of assump-
tions) about the source data and degree of statistical complexity. Choosing an appropriate interpolation
method can be difficult, as archaeologists deal with a variety of different data (biotic, physical, and
cultural), data are unlikely to have been optimally sampled, and samples may contain sources of noise or
error. To add to this, interpolation outcomes can also be considerably different depending on the method
chosen, so care is therefore required to ensure that the most appropriate method is selected that is sensitive
to the underlying data as well as the goals of the analysis.
The purpose of this chapter is to describe the basic concepts of spatial interpolation, to review and to
provide guidance on the use of three common interpolation methods, and to offer some examples and
discussion of how spatial interpolation provides opportunities for data visualization and prediction that
can build insight into the behaviours which generate patterns in the archaeological record.

First, to illustrate the utility of spatial interpolation, consider the following three scenarios, which
capture a representative variety of the types of applications that can benefit from spatial interpolation.

Scenario One You are a conservation or commercial archaeologist, and you need to define the
spatial characteristics (i.e. the varying intensity) of a large artifact scatter identified by a sample
of test units. In this scenario, the goal is primarily data visualization, such that the spatial proper-
ties of the cultural materials can be effectively portrayed and communicated to planners for the
purposes of avoidance. Spatial interpolation provides a solution for illustrating where the highest
artifact intensities are located, and for estimating the edges of the scatter. If removal of the site is
required, spatial interpolation can also offer a guide to the placement and number of excavation
units, and even estimates of artifact recovery rates, to aid in the calculation of costs. A basic but
robust method such as an inverse distance weighted (IDW) algorithm is a good first approach.
Scenario Two You are a graduate student and your research project involves modelling an inun-
dated (underwater) cultural landscape. The goal is to obtain a more precise understanding of the
surface of the lake bed to assist in the identification of submerged shorelines. The data consists
of a combined set of terrestrial topographic and underwater bathymetric measurements that
together provide an opportunity to model a palaeolandscape. The interpolation methods require
consideration of the possibility of localized features such as submerged shorelines but also to the
possibility of errors in data capture. An appropriate resolution sensitive to the required analytic
scale and the intensity of data observations must also be considered. A spline interpolation with a
tension set to respond to local features is a reasonable starting point.
Scenario Three You are researching the spatial characteristics of a sample of radiocarbon dates
obtained from a cultural phenomenon, for example, the spread of a cultivar such as maize
throughout northeastern North America. The goals of the analysis are to define and visualize
the geospatial trend and to identify regions where radiocarbon dates deviate from a global trend,
perhaps exhibiting an early or late adoption pattern. In this scenario the data comprise a set of
georeferenced radiocarbon dates across a large geographic region (1000s of square kilometres).
The spatial interpolation algorithm needs to be sensitive to potential local effects of varying scale,
as well as a likely directional north-­south trend. For these types of highly complex problems, an
appropriate choice of spatial interpolation is kriging, in which the parameters of the interpola-
tion are derived empirically from the data. Geospatial interpolation improves the outcome as it
provides analysts with the ability to identify where the modelled surface is more or less accurate
due to data sampling issues.

Although these three examples are typical uses of spatial interpolation, they do not cover the full range
of scenarios in which spatial interpolation provides a solution to an archaeological problem. Other
applications of archaeological interest, such as the use of spatial interpolation to model a continuous
distribution of soil chemistry values taken from point samples across a historic living surface, have similar
requirements. At its most basic, the process begins with a set of discrete point observations recording the
changing values of a phenomenon distributed across geographic space. This is followed by the selection
and implementation of an appropriate interpolation method, which requires the selection of a range of
parameters, as well as the grid resolution of the output model. The first issue is the selection of an algo-
rithm; options are presented in the following section, including some guidance on which of the myriad
options is likely to provide the best balance between the data requirements, computational and statistical
simplicity, and the archaeological problem to be solved.

Method
In this section I explain the methods behind different forms of spatial interpolation, beginning with the
relatively straightforward, followed by the more complex. My goal here is to provide sufficient knowledge
such that the underlying concepts of different forms of spatial interpolation are sufficiently understood so
that practitioners can make an appropriate decision as to which method is the most suitable for a specific
application. Although some forms of spatial interpolation are highly complex and require some under-
standing of more advanced statistical concepts, other approaches are relatively easy yet are also extremely
versatile and powerful tools for surface modelling and data visualization.
Fundamentally, interpolation is a predictive modelling tool used to estimate the value of a quantitative
phenomenon (such as an artifact count) between measurements. Interpolation differs from methods such
as kernel density estimation (KDE – see Bevan, this volume) because interpolation is concerned with
prediction, rather than characterization. Whereas KDE can take a set of point-­based frequency data, such
as artifact counts, and convert observations to densities, KDE does not predict the densities between point
locations – it only characterizes them. Conversely, interpolation predicts the values between observations.
Interpolation is thus by its nature a more complex method than density estimations, but I have written
this chapter for archaeologists without a specialized background in spatial statistics, so I’ve avoided the use
of mathematical formulae and have focused instead on written descriptions of the methods to present the
core concepts. It is nevertheless worth defining the fundamental concept of spatial interpolation, to show
how it is a weighted average of sampled data. This is illustrated by the equation (Li & Heap, 2008, p. 4):
$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i z(x_i) \qquad (7.1)$$

which simply says that to obtain an interpolated estimate ẑ at location x0, one must take the sum of the n weighted values from known locations z(xi), the weighting of each location being given by a procedure defined by λi. Where interpolation methods vary is in how many samples are needed to estimate
accurately the value at the point of interpolation, and how these samples are to be weighted. At the most
basic level, different methods frame how many of the surrounding known values should be considered,
and to what extent should nearby points be given greater weight than those far away.
With reference to the three scenarios presented in the previous section, I next consider three classes of
interpolation: (1) distance-­weighting; (2) thin plate splines; and (3) kriging (also see Lloyd & Atkinson,
this volume). There are many more approaches to spatial interpolation beyond these three, a few of which
I mention at the end of this section; but most archaeological problems in which spatial interpolation offers
a solution can be solved by the judicious application of one of these approaches.

Distance-­weighting interpolation
Linear interpolation is illustrated in Figure 7.1(a). Using data presented in the figure as an example, it
can be observed that at any location on the imaginary line joining the two points in geographic space,
an interpolated value can be reasonably estimated as a function of the linear distance (d) between the
known values. Thus, the value at the mid-­point of the line will be the mean of the two known observa-
tions; but as the point of interpolation moves closer to one of the two points the interpolated value is
linearly weighted towards the closest point. On this basis, point D in Figure 7.1(a) is therefore reasonably
estimated as being equal to 15.
Figure 7.1(b) shows a scenario in which the values at three points are known. This enables an
interpolation to be made at any point within the polygon formed by the three points (A, B and C).


Figure 7.1 Simple interpolation examples: (a) with two known point values, using linear interpolation esti-
mates D = 15; (b) adding a third sample location and using inverse distance weighted squared interpolation
estimates D = 12.5.

This is in fact the basis of a form of interpolation that is called a triangulated irregular network (TIN), in
which three neighbouring points form the vertices of a triangular surface that is empty of other points.
TIN-­based approaches were common in the formative days of GIS-­based spatial analysis, as they were
computationally easy and visually effective; however, they are one of the least accurate interpolation
methods and have been superseded by alternative ways of surface modelling. To interpolate the value,
rather than two points, we form an estimate from three or more points. As with the first example,
we consider the distances between the interpolation point and other known points and provide more
weight to the closer points. This assumes that points closer to the unsampled location are more similar
than those lying further away – in other words, that there is some positive spatial autocorrelation in
the dataset. A basic way of weighting points is to use the inverse (reciprocal) of the distance, or more
often the inverse of the distance raised to a power of two (i.e. 1/d2), to reduce the weight of more
distant points. Thus, the weight of each of the known point values to the interpolation value decreases
with distance to the location of interpolation, giving rise to the method known as an inverse distance
weighted (IDW) interpolation. The formula defines the method for estimating the value Z at point p,
based on the values of the other i points and their distances.


$$Z_p = \frac{\sum_{i=1}^{n} (Z_i / d_i^2)}{\sum_{i=1}^{n} (1 / d_i^2)} \qquad (7.2)$$

With reference to Figure 7.1(b), equation 7.2 is applied to give the interpolated value at point D:

$$D = \frac{\dfrac{12}{3^2} + \dfrac{21}{6^2} + \dfrac{8}{5^2}}{\dfrac{1}{3^2} + \dfrac{1}{6^2} + \dfrac{1}{5^2}} = 12.5$$

As the number of available sample points with known values increases beyond three, the accuracy of the prediction may be improved by using more points in the calculation. This is defined as
the neighbourhood search area, which can be limited either by a distance or a defined number of closest
points. Because most GIS programs with interpolation tools will by default use the nearest 12 neigh-
bours and apply a power of 2 to the distance weighting, this gives rise to its common shorthand name
of IDW-­12.
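As a minimal sketch, IDW-12 can be run with the gstat package for R (one of several implementations); pts is a hypothetical SpatialPointsDataFrame of artifact counts n, and grid the cells at which estimates are required:

```r
library(gstat)

idw12 <- idw(n ~ 1, pts, newdata = grid,
             idp = 2,    # the power p applied to the distances
             nmax = 12)  # the 12 nearest neighbours: hence 'IDW-12'
spplot(idw12["var1.pred"])
```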

As the neighbourhood search size or number of neighbouring points increases, this will influence the
prediction, but it is not always clear what search size to choose and how to weight more distant points. In
fact, the choice of neighbourhood size (n) and the power weighing (p) is arbitrary and depends both on
the goals of the interpolation and the characteristics of the sample points. For example, increasing p to val-
ues above 2, to 3 or even 4, will increase ‘bumpiness’ as it pays more attention to local values. Conversely,
the smoothness of the surface can be controlled by using a greater number of neighbours. Selecting
n = 12 or more may work well with uneven or noisy samples, in which the goal is to find a more general
trend. IDW methods thus usually require some degree of experimentation to dial-­in the balance between
a surface that is locally sensitive and one that illustrates the regional pattern.
It is worth considering an alternative interpolation method in cases where the underlying data leads to
problems with IDW results. This is typically manifest by the surface showing multiple peaks and pits around
the original data – and adjustment of the weighting parameters does not improve the result. The two most
common alternatives are thin plate splines (TPS) and interpolation with geostatistics, called kriging.

Thin plate spline interpolation


A spline is a continuous curve that connects a set of points. The term ‘thin plate spline’ (TPS) describes
a method conceptually analogous to a stiff plate of metal being warped to lie across a sample of points
in three-­dimensional space. Each point’s attribute is treated as a z-­value (elevation), and the resulting
surface should ideally pass through the data points with the least amount of curvature – i.e. it should
be as smooth as possible. Because the plate intersects each point and curves smoothly between points of
different values, it is a good approach for smoothly undulating surfaces such as land surface elevations
or climate measurements, such as rainfall, in which localized anomalies are minor (Hutchinson, 2007;
Hancock & Hutchinson, 2006).
As with IDW, splines have a set of parameters that can be manipulated, including the size of the
neighbourhood in which the spline obtains predictions. Like IDW, choosing parameters is typically by
experimentation, with 12 neighbours often set as the initial default. There are two basic approaches with
their own parameters that control the relationship of the model in regard to the original points:
Regularized splines allow for curves that extend beyond the range of z-­values, and this allows for
smoother gradients in the output surface. This can be controlled by a parameter in the function called
the regularization weight: higher weights reduce the stiffness of the spline and result in smoother surfaces
that can extend beyond the z-­values of the known samples (Figure 7.2(a)). The correct regularization
parameters can be obtained by experimentation, although some implementations of spline interpola-
tion include a built-­in procedure for estimating the correct weighting through cross-­validation. This is
a method of error estimation created by generating a series of surface models using different weighting
parameters in which a sample of known observations are withheld from each model. Each of the surface
models can then be compared to the known data values to obtain an estimate of the error in each model,
and the model with the lowest error then defines the optimum weighting value.
Tension splines are constrained to the original data values and prevent the surface curving beyond the
z-­range of the point sample (Figure 7.2(b)). If the data collection strategy has defined the range of values
(e.g. within an elevation survey, whether the highest and lowest elevations were captured), then a ten-
sioned spline is likely to give more accurate results at the expense of smoothness. A weighting parameter
can also be included for tensioned splines, providing some control over the amount of smoothing in the
final surface – although with tension splines, increasing the weight will decrease the smoothness.
As with IDW-­based interpolation, there is a lot of subjectivity and arbitrariness in the selection of
parameters that control the output. Some experimentation is thus necessary to obtain a model that meets
the needs of the analysis.

Figure 7.2 Spline as a concept: (a) regularized with high weighting, allowing the interpolation estimates to exceed the z-values to maintain smoothness at points marked by arrows; (b) a tension spline, which adheres to the original data values at the expense of smoothness.

Overall, however, thin plate spline approaches are usually better than IDW
methods for data in which smoothness is valued in the final product, such as in elevation models. Be
aware though that smoothness may be misleading if the underlying data is in fact rough and the goal of
the analysis is to illustrate or model this characteristic. In instances where roughness could be attributed
to measurement noise (e.g. in bathymetric or topographic survey in which vegetation may be impacting
true ground height), or in which more generalized trends are desired (at the expense of local accuracy),
then splines do offer a robust solution.
One inherent problem with both IDW and spline interpolation methods is that the neighbourhood
size and weighting parameters are typically arbitrarily assigned and visually evaluated. Although it is
certainly possible to estimate parameters less subjectively by using cross-­validation, this is rarely imple-
mented in the GIS environments in which interpolation is typically performed, and thus it is not routinely
applied. This is potentially problematic, as a visually-­satisfying surface may be erroneous, and without
some form of evaluation of its accuracy it may be uncritically adopted and used as the basis for further
interpretation, compounding the error.
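One R implementation that addresses this concern is Tps() in the fields package, which selects the smoothing weight by generalized cross-validation rather than leaving it to visual judgement; a minimal sketch, assuming coords is an n by 2 matrix of sample locations and z the measured values:

```r
library(fields)

fit <- Tps(coords, z)        # thin plate spline; smoothing chosen by GCV
surf <- predictSurface(fit)  # evaluate the fitted spline on a regular grid
image.plot(surf)             # quick visual check of the surface
```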

Kriging
Kriging (pronounced with a hard ‘g’, after the South African Daniel Krige) is an interpolation method
in which the parameters of the interpolation are estimated empirically using geostatistics. The integra-
tion of geostatistics means that kriging is a more complicated form of interpolation, but it provides some
advantages over IDW and spline approaches. The primary advantage is that by integrating geostatistics
to estimate weighting parameters, the interpolation is sensitive to the characteristics of the samples, and
generally produces a more accurate surface model. Some forms of kriging can also generate an error
surface so that the interpolation’s accuracy can be evaluated across the sampling window. Finally, because
kriging requires the analyst to examine the spatial characteristics of the data set before interpolation, it
provides opportunities to examine underlying spatial patterns in sample data, which can lead to a better
overall decision about the type of kriging to apply that will in turn improve the outcome.
Kriging works by first measuring the spatial autocorrelation of sample points. Spatial autocorrelation
is a measure of the relationship between distance and similarity: positive spatial autocorrelation describes
a situation in which the value difference between pairs of points is correlated with distance, such that the
closer two points are in space, the more similar they are likely to be (also see Hacıgüzeller, this volume).

Figure 7.3 A variogram showing increasing variance between samples of values drawn from increasing dis-
tances apart. After a distance of 60 m there is no increase in variance.

This relationship between distance and similarity is expressed on a graph called a variogram, which plots
distances between pairs of points on the x axis (referred to as the ‘lag’), and a statistic called the variance,
denoted by γ on the y axis. The variance is a measure of the variability in differences between all pairs of
point values within a defined lag (such that as lag increases, the variance normally increases too, up to the
level of the variation in the entire sample). The reason that a graph of lag against variance is useful is that
for each unsampled location surrounded by points at varying distance, the variogram provides informa-
tion about the distance-­based weighting needed to estimate the value at unsampled locations. This means
that a sample-­specific distance weighting can be derived that is sensitive to the original data and removes
some of the subjectivity inherent in IDW or TPS interpolation methods. For example, a variogram like
the one shown in Figure 7.3 shows that the variance in differences between observed pairs of values
increases rapidly up to about 15 m, then increases slowly to about 40 m, and reaches a maximum that
does not increase any further at distances of 60 m and beyond. This means an appropriate weighting
factor will weight values within 15 m more heavily than those further away, as these have the lowest
variances and thus the greatest predictive power; points more than 60 m away have little predictable
influence on estimates of local values.
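For readers who wish to construct a variogram outside a dedicated geostatistics package, the following is a minimal sketch of the empirical (isotropic) calculation, again assuming NumPy and the hypothetical xy and z arrays used in the earlier sketch; the lag-bin settings are illustrative.

```python
import numpy as np

def empirical_variogram(xy, z, lag_width=10.0, max_lag=100.0):
    """Mean semivariance of point pairs grouped into distance (lag) bins."""
    i, j = np.triu_indices(len(z), k=1)            # all unique point pairs
    d = np.linalg.norm(xy[i] - xy[j], axis=1)      # pair separations
    semivar = 0.5 * (z[i] - z[j]) ** 2             # per-pair semivariance
    edges = np.arange(0.0, max_lag + lag_width, lag_width)
    lags, gammas = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (d >= lo) & (d < hi)
        if in_bin.any():
            lags.append((lo + hi) / 2.0)           # lag-bin midpoint
            gammas.append(semivar[in_bin].mean())  # gamma for this lag
    return np.array(lags), np.array(gammas)
```

Plotting the returned γ values against lag reproduces a graph of the kind shown in Figure 7.3; the lag at which the curve levels off approximates the maximum distance at which samples retain predictive power.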
As well as the changing influence of distance, some forms of kriging can include the directionality (or
geographic orientation) between pairs of points into the weighting calculation. The term anisotropic refers
to situations when direction independently influences the rate of change in a sample (e.g. Figure 7.4).
This occurs commonly in topographic surfaces but also in other situations when there is a directional-
ity to the source of the response variable, such as might be expected in the amount of material reaching
settlements at progressive distances if primarily distributed along geographically oriented distribution
routes. To detect the presence of anisotropy requires the creation of a two-­dimensional variogram sur-
face, in which the degree of change by direction can be measured, which will typically manifest itself
as an ellipse on the variogram surface. If there is anisotropy, then a method called anisotropic kriging will
produce more accurate surfaces. This requires calculating separate variograms at major and minor axes
of the ellipse, and these angles are then provided to the kriging solution to provide a weighting function
that considers the directionality of the surface.

Figure 7.4 Anisotropy in a hypothetical sample of semi-regularly spaced test units. The isolines depict sherd counts in 5-sherd intervals, illustrating how the rate of change is greater on the north-south axis than on the east-west axis.
These are the foundational concepts for all kriging methods, but they can be implemented differently
depending on starting assumptions and the use of additional parameters. Although there are many forms
of kriging, the three basic forms are simple, ordinary, and universal. Note that all three of these methods
can be implemented to estimate values at unsampled point locations (which is typically the default), or in
large polygon units, which is referred to as block kriging. Simple and ordinary kriging both assume that
the variance in point samples within distance ranges is stationary across the sample (i.e. the observed vari-
ation between point samples is not higher in one part of the distribution than another; also see Crema this
volume). Unlike simple kriging, ordinary kriging does not assume a constant mean value within a sample
of points – it allows for the likelihood that values may have a trend and be higher in one part of the sam-
pling window. Of the two, ordinary kriging is usually the better choice as it makes fewer assumptions.
Finally, universal kriging establishes, in a separate process, an equation that describes the first order trend
of the observed data within the neighbourhood search window. The kriging function then is a model
of the residuals from the trend function. The advantage of universal kriging is that first-­order patterns
are managed by a separate model, allowing the kriging function to focus on the variability around the
global trend. In addition to these three there are further forms of kriging, such as co-kriging, which integrates a secondary (correlated) variable into the weighting function to improve estimates; these are described well in overviews of geostatistics (see, for example, Haining, Kerry, & Oliver, 2010; Webster & Oliver, 2007).
In general, kriging methods work well when there is spatial structure to the underlying data – i.e. some
trend can be observed or is expected in the mean or variation in observations in one or more directions
across the sampling window. In addition, as error surfaces can be created in some forms of kriging, a
measure of certainty can also be provided. This may be more useful in some situations than others, but
if interpolated surfaces are being used to inform decisions about where the highest concentrations in a
distribution lie, or where the edge of an artifact scatter is likely to be, then having some understanding of the
probable error in the modelled surface is likely to be valuable.
Finally, as these descriptions of kriging methods show, interpolation with geostatistics involves some
additional calculations and interpretations to generate the weighting functions, and most GIS packages
have limited capacity for full geostatistical analysis. Dedicated spatial statistics software or geostatisti-
cal plugins provide more customizable solutions, including permutation analysis for
evaluating the stability of the models. Lloyd and Atkinson provide a detailed explanation of kriging in
this volume; for further extended discussions see Sen (2016).

Sample intensity, measurement scale and edge effects


While the choice of interpolation method and selection of parameters will impact the final surface
model, equally significant are the characteristics of the starting point sample. In fact, comparison of model
output using identical samples has shown that differences in interpolation method had smaller impact
on surface models than differences arising from changing the sample size and the spatial complexity of
the phenomenon being modelled (Aguilar, Aguilar, & Carvajal, 2005). It should also be appreciated that
some data sets may be too sparsely or irregularly sampled to provide a meaningful output (see Banning,
this volume).
Increasing the resolution of the interpolated surface (i.e. by reducing the grain, or raster pixel size in
GIS terms) increases the amount of information produced by the analysis (Lam, 2004), but also increases
the storage demands and processing time. While more information is generally better (within the limits of
available computational time and space), the chosen resolution must also reflect the sample spacing of the
observed data points. This is because the ability of the interpolation algorithm to predict an output value
correctly decreases when the measurement scale greatly exceeds the spacing of the observation points.
Most GIS platforms default to output resolutions that are not based on empirical evaluation of the sample
spacing, or instead leave it to the user to subjectively decide. Hengl (2006) provides some useful guidance
and suggests the following equation for deriving an appropriate grid resolution:

$p = w \cdot \sqrt{A/N}$  (7.3)

where p is the grid resolution (i.e. pixel edge length), derived from the square root of the sample area (A)
divided by the number of observation points (N), multiplied by a weighting factor w. Hengl suggests that a weighting of w =
0.0791 to w = 0.25 is appropriate for random samples, but if the observation sample is regularly spaced
(e.g. as might be the case following a defined sampling interval), then a more appropriate weight is w =
0.5. It is better to err on the side of less precision (i.e. larger pixels) to maintain accuracy, although (as
shown later) an evaluation of the impact of larger or smaller resolutions is often worthwhile.
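As a worked example of Equation 7.3, the following snippet uses the sample described in the model comparison later in this chapter (202 observations over approximately 9100 m², with w = 0.4 for semi-regular spacing):

```python
import math

A, N, w = 9100.0, 202, 0.4    # sample area (m^2), observation count, weighting
p = w * math.sqrt(A / N)      # Equation 7.3: pixel edge length in metres
print(round(p, 1))            # -> 2.7
```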
Edge effects are a major concern in interpolation. Many processes we wish to model are continuous
beyond our sampling window, and thus suffer from a lack of sampled information beyond the window.
As the accuracy of a prediction is partially dependent on being surrounded by locations where the true
values are known, one solution is to reduce the extent of the area to be interpolated so that it is sur-
rounded by known points. This is formally known as a border-­area edge correction (Yamada, 2009).
How much to step in though is dependent on the intensity of the point sample and thus is related to
average n-­th nearest neighbour distance, where n is the number of neighbours used in the interpola-
tion. For example, in Figure 7.5, to avoid edge effects in a routine interpolation based on eight nearest
neighbours, an estimate of the step-­size can be calculated by deriving the average distance from each
point to its eighth nearest neighbour (nn). In this case, the average nn distance is 11 m, and this provides
the width of the internal buffer around the sample distribution. The obvious disadvantage to this approach
is potentially considerable data loss at the edges, but the error rates in this zone are high because of edge
effects, so keeping it risks false confidence in the model. There are other methods, but all solutions require
compromises between predictive accuracy and data loss. Yamada (2009) provides several examples of ways
to manage these concerns.
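A minimal sketch of the nearest-neighbour calculation behind this border-area correction, assuming NumPy and a hypothetical xy coordinate array, is given below; for large samples a spatial index (e.g. a k-d tree) would be preferable to the brute-force distance matrix used here.

```python
import numpy as np

def buffer_width(xy, n_neighbours=8):
    """Mean distance from each point to its n-th nearest neighbour."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    d.sort(axis=1)                    # column 0 is each point's self-distance
    return d[:, n_neighbours].mean()  # column n is the n-th nearest neighbour
```

The interpolated extent is then stepped in from the edge of the sampling window by this width.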

Model comparisons
There can be considerable variation in the output between different interpolation methods, as well as
within methods when parameters are adjusted. For example, consider the following typical scenario
which consists of a scatter of 202 small (30 × 30 cm) test units across an area approximately 70 m wide by
130 m long (Figure 7.5(a)). The data was collected in advance of an excavation project on a grassy field
interspersed with buildings (the most substantial of which is marked as such), and each test unit records
the count of pottery artifacts observed at that location (for general context, please see Conolly et al.,
2014). There is a considerable amount of local variation (noise) in the data, but there is a global trend of
higher values in the southwest declining to the northeast. The goal of interpolation is to visualize this
trend as a continuous process in order to estimate the scale of the sub-­area(s) with substantially higher
artifact counts.
Three interpolation methods were used to construct the surfaces: inverse distance weighting (IDW),
splines, and kriging. In each, different parameters were selected to evaluate the impact these had on the
predictive accuracy of the modelled surface. Model resolution was also considered. There are 202 obser-
vations in the study area of 9100 m2, and from equation 7.3 with a weighting factor of 0.4 to reflect the
semi-­regular spacing, an appropriate pixel dimension is 2.7 m. To evaluate the impact of resolution on
accuracy, two models were run using each set of parameters for method: one with a pixel dimension of
1 m, and one at 3 m. Table 7.1 summarizes the methods and parameters used.

Figure 7.5 Archaeological point sample. (a) location of samples and artifact counts; (b) sample with border-­
area edge correction. The random test samples are designated by an ‘x’; the building location by the rectangle.

Table 7.1 Interpolation methods and parameters used in the analysis.

Method Parameters Anisotropic Grain

IDW Neighbours = 4, 8, 12 No 1m, 3m
Regularized spline Neighbours = 4, 8, 12 No 1m, 3m
Tensioned spline Neighbours = 4, 8, 12 No 1m, 3m
Simple kriging Spherical model No 1m, 3m
Simple kriging Spherical model Yes 1m, 3m
Ordinary kriging Spherical model No 1m, 3m
Ordinary kriging Spherical model Yes 1m, 3m
Universal kriging Polynomial trend, Spherical model No 1m, 3m
Universal kriging Polynomial trend, Spherical model Yes 1m, 3m

Table 7.2 Ranked RMS results by method and resolution.

Rank Method 1-­m 3-­m

15 Regularized spline (4) 11.2 10.7
14 Regularized spline (8) 10.7 10.6
13 Regularized spline (12) 9.5 9.4
12 Tension spline (4) 9.3 8.9
11 Tension spline (8) 8.7 8.8
10 Tension spline (12) 8.6 8.3
9 IDW (4) 7.3 7.5
8 Universal kriging (anisotropy) 7.1 7.2
7 IDW (8) 7.0 6.8
6 Ordinary kriging (anisotropy) 6.8 6.7
5 Simple kriging 6.8 6.8
4 IDW (12) 6.7 6.4
3 Universal kriging 6.6 7.2
2 Simple kriging (anisotropy) 6.5 6.6
1 Ordinary kriging 6.5 6.5

To evaluate the relative predictive error in each model, a random evaluation sample of thirty test units
was selected and removed from the analysis. The models were constructed on the remaining sample of
172 observations. A root-­mean-­square (RMS) error was calculated for the difference between the evalu-
ation sample and the predictive surface for each model. (Visual comparison of the models and the RMS
results are presented in Figure 7.6 and Table 7.2).
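A sketch of this hold-out procedure might look as follows, here using SciPy's thin-plate-spline interpolator as a stand-in for the full set of methods in Table 7.1, and entirely synthetic data standing in for the field sample:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(202, 2))          # synthetic sample locations
z = 0.1 * xy[:, 0] + rng.normal(0, 1, size=202)  # synthetic counts with a trend

test = rng.choice(len(z), size=30, replace=False)   # withheld evaluation sample
train = np.setdiff1d(np.arange(len(z)), test)       # modelling sample

spline = RBFInterpolator(xy[train], z[train], kernel='thin_plate_spline')
rms = np.sqrt(np.mean((spline(xy[test]) - z[test]) ** 2))  # hold-out RMS error
```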
First, as expected, there is a slight increase in accuracy afforded by the 3-m resolution over the 1-m resolution, but the difference is not significant and the latter has been selected for the model outputs. Second, the RMS results
establish that kriging is consistently able to produce more accurate predictions than other interpolation
methods, even if in this instance the results are only marginally more accurate than IDW-12. In this set of
data, the spline models performed relatively poorly, as they are better suited to phenomena with smoother
transitions and higher spatial autocorrelation, such as elevation surfaces. Clearly, a more locally sensitive
interpolation method like IDW provides some advantages over splines. In fact, in this implementation,
IDW-­12 is roughly equivalent to ordinary kriging, suggesting that it is a reasonable model to use if sim-
plicity in implementation is worth more than the additional insight and potential increase in accuracy
geostatistical modelling provides.

Figure 7.6 Visual differences in the surfaces of nine interpolation methods at 1 m resolution. RMS errors for each model are provided in Table 7.2. A colour version of this figure can be found in the plates section.

Case studies
Two case studies are considered, at two different scales of analysis, that illustrate how interpolation meth-
ods can convert samples of discrete observations into critical insights into the spatial patterns of human
behaviour. The first case study concerns the reconstruction of use-­of-­space in Late Neolithic (LN) and
Copper Age (CA) sites in Hungary by Salisbury (2013). The second examines the use of interpolation
to provide insight into the geographic patterns at continental scale generated by the spread of Neolithic
agricultural practices from Southwest Asia into and across Europe by Fort (2015).
In the first example, the goal of the analysis is to visualize the spatial variation in soil chemistry from
habitation sites in order to reveal patterns in use of space. This is a popular form of spatial analysis that
depends on interpolation and has been approached in different ways in a variety of contexts (see, for
example, Rondelli et al., 2014; Mikołajczyk & Milek, 2016; Negre, Muñoz, & Lancelotti, 2016). To
illustrate the potentials and a few pitfalls in the application of interpolation methods work by Salisbury
(2013) is used. The specifics involve a sample of six LN and CA habitation locales in eastern Hungary
and the data consists of element abundance (ppm) in soil samples taken systematically (at a 10 m or 5 m
grid interval) across each site. The multivariate data was reduced using principal components analysis
(PCA) to identify correlated variation in groups of elements that are assumed to reflect different anthro-
pogenic processes (e.g. cooking, food discard, metal working). The first five of the PCA component axes
(cumulatively representing over 80% of the variation) were then used as the variables to be interpolated.
Each component was examined separately by assigning each of the original samples a value based on the
sample’s position on the component’s axis.
Salisbury (2013) used ordinary kriging to produce the interpolated maps for visualizing the spatial
patterns. As described in the previous section, this is an excellent approach as it measures local variation
in the mean observed values across different portions of the data to generate a weighting function for
the interpolation. The maps generated from this analysis illustrate clear differences between the differ-
ent PCA components, which the author uses to interpret patterns in the use of space and their different
chemical signatures. However, note that the output scale appears to have been defined at too fine a grain
given the distances between sample points, and this has led to some instability in at least one
of the outputs (Figure 7.7).
Unfortunately, the author did not provide any information about the specifics of the scale used, nor
about the variogram model, leaving it to readers to trust that kriging methods were correctly and appro-
priately derived. More fundamentally, as the raw data is not provided in this paper, there is no opportunity
for interested readers to replicate the analysis or build on this work using different methods. I highlight
this not to criticize the analysis or interpretation, only to illustrate that without sufficient supplementary
data the interpolation methods and output can only be taken on trust. Nevertheless, the interpolations
do provide information about patterns in different chemical signatures in the soil, and these patterns can
be evaluated and substantiated using other archaeological data. Without the methods provided by spatial
interpolation, these insights would be difficult to obtain.

Figure 7.7 Interpolation example modified from Salisbury (2013, Figure 5). Note that the high resolution
(small pixel size) has exceeded the limits of the original data. The interpolation is thus unstable and noisy where
there is higher local variance, for example in the area north of the ‘trample zone’ (arrows).

The next illustrative case study concerns a much smaller scale and correspondingly larger region of
analysis, which is the spread of agriculture across Europe. There are many papers on this topic which use
some form of interpolation to calculate patterns of movement (e.g. Ammerman & Cavalli-­Sforza, 1984;
Gkiasta, Shennan, & Steele, 2003; Bocquet-­Appel, Naji, Vander Linden, & Kozlowski, 2009). A recent
illustrative example by Fort (2015) uses interpolation to visualize and derive estimates for the absolute
speed of demic (i.e. movement of people) versus cultural diffusion related to this economic transforma-
tion. Fort’s work is based on a point sample of nearly 1000 radiocarbon dates scattered across Europe that
record the dates and locations of early farming communities. As there has been a long-­standing debate
over the relative importance of demic versus cultural processes in this transition, Fort’s stated goal was to
use the temporal patterns to distinguish between the two processes.
Fort first derives mathematical models for demic, cultural, and demic-­cultural diffusion rates and
shows that demic-­cultural diffusion will spread over space faster than just demic or cultural diffusion
alone. Second, Fort interpolates the radiocarbon point data to generate a surface model to visualize the
temporal patterns related to the appearance of agriculture. Because there is a clear southeast-­northwest
spatial trend he uses universal kriging to allow for the first order trend surfaces to be incorporated into
the weighting algorithm, although he does note that experimentation with different methods produced
similar results. Like the Salisbury paper discussed earlier, there is no information on the model and vario-
gram, but the raw data is made available in Isern, Fort, and Vander Linden (2012) for further analysis and
verification by interested readers. Following the interpolation, Fort uses the predictive surface to derive a
local ‘directional’ surface that visualizes the speed and geographic direction of the transmission.

Figure 7.8 Interpolation example modified from Fort (2015, Figure 1). An interpolated surface model of radiocarbon dates from Neolithic sites (black dots) depicting the space-time process of the spread of agriculture across Europe. Note the areas indicated by arrows in the southwest and northeast of the model showing where data sparseness causes instability in estimates. A colour version of this figure can be found in the plates section.
Note in the model (Figure 7.8) how the sparseness of the data samples in some locations (e.g. in Spain)
causes some instability in the predictions that creates artificial-looking temporal boundaries. This is an
unavoidable problem with sample data that are unevenly distributed, but contributing to the difficulties in
this case could be a grain that is too fine for the density of points. Creating sub-­interpolations at different
resolutions to reflect the heterogeneity of the sample, and then combining them into a single map, is a potential
solution. As a further possibility, because calibrated radiocarbon dates are non-­normal probability distri-
butions, it would be useful to consider applying a Monte Carlo approach to generate multiple predictive
surfaces based on radiocarbon values randomly selected from under each site’s pooled probability curve.
This would allow for estimation of uncertainty in the surface prediction reflecting the challenge of using
radiocarbon data for time-­space models. The details of this are beyond the scope of this chapter but it serves
to illustrate the potential additional uses of interpolation for the visualization of space-­time dynamics.

Conclusion
I have described the core concepts underlying all forms of spatial interpolation, along with the primary
methods that archaeologists have used in the past and continue to use when they need to build and
interpret continuous surfaces from point observations. As so much of archaeological data is collected as
point observations – or can easily be converted to point observations – this means that interpolation has
a very wide range of potential applications.
From the case studies and scenarios described, it should be clear that interpolation methods are
also very flexible and offer multiple ways to tailor an analysis to fit the character of the data and the
inferred spatial process that is being modelled. However, inexperienced users are strongly advised not
to accept the defaults in many GIS software platforms, as these rarely provide optimum solu-
tions. Instead, first consider questions such as whether there is a spatial trend in the data, whether there
is a potential for anisotropy, whether the surface process is likely to be smooth or rough, whether there
are boundaries and how these are to be managed, how the edge effect is going to be managed, and what
the appropriate output resolution should be. Careful consideration of these questions will certainly
lead to a more successful application of interpolation and is likely to produce a more accurate surface.
Experimenting with several approaches and parameters, and evaluating them against a test sample of
known points withheld from the analysis, may be the best option. For critical applications, formal
evaluation of predictive accuracy is the preferred solution, whether through test and training samples,
cross-validation, or by using a method such as ordinary or universal kriging, which is generally seen
as more robust than IDW or TPS approaches, especially with data having a global trend.
As a final note, interpolation does not have to end with visualization. There are ways of generating additional
insights into spatial processes by using the interpolated surfaces to derive other measurements. The obvious
ones are using maps of elevation to derive slope and aspect maps, but as shown earlier, Fort (2015) explains how
he obtains a direction-of-change map based on his interpolation of radiocarbon dates. If maps are produced
representing artifact density, they can similarly be converted into insightful visualizations that locate major
rates of change using similar methods. Interpolated maps showing different artifact densities (e.g. comparing
pottery to lithics) can also be manipulated with map algebra to illustrate how the two are correlated. These
types of approaches more generally fall under the umbrella of spatial data manipulation and raster or map
algebra but are mentioned here to emphasize that analysis need not end with the construction of a continuous
surface of a spatial process – surfaces can be the building blocks for additional forms of data visualization and
analysis.

References
Aguilar, F. J., Aguilar, M. A., & Carvajal, F. (2005). Effects of terrain morphology, sampling density, and interpolation
methods on grid DEM accuracy. Photogrammetric Engineering and Remote Sensing, 71(7), 805–816.
Ammerman, A. J., & Cavalli-­Sforza, L. L. (1984). The Neolithic transition and the genetics of populations in Europe.
Princeton, NJ: Princeton University Press.
Bocquet-­Appel, J.-­P., Naji, S., Vander Linden, M., & Kozlowski, J. K. (2009). Detection of diffusion and contact
zones of early farming in Europe from the space-­time distribution of 14c dates. Journal of Archaeological Science,
36(3), 807–820.
Conolly, J., Dillane, J., Dougherty, K., Elaschuk, K., Csenkey, K., Wagner, T., & Williams, J. (2014). Early collective
burial practices in a complex wetland setting: An interim report on mortuary patterning, paleodietary analysis,
zooarchaeology, material culture and radiocarbon dates from Jacob Island (BcGo17), Kawartha Lakes, Ontario.
Canadian Journal of Archaeology/Journal Canadien d’Archéologie, 38, 106–133.
Fort, J. (2015). Demic and cultural diffusion propagated the Neolithic transition across different regions of Europe.
Journal of the Royal Society Interface, 12, 20150166.
Gkiasta, M., Shennan, S., & Steele, J. (2003). Neolithic transition in Europe: The radiocarbon record revisited.
Antiquity, 77, 45–62.

Haining, R. P., Kerry, R., & Oliver, M. A. (2010). Geography, spatial data analysis, and geostatistics: An overview.
Geographic Analysis, 42, 7–31.
Hancock, P., & Hutchinson, M. (2006). Spatial interpolation of large climate data sets using bivariate thin plate
smoothing splines. Environmental Modelling and Software, 21(12), 1684–1694.
Hengl, T. (2006). Finding the right pixel size. Computers and Geosciences, 32(9), 1283–1298.
Hutchinson, M. (2007). Interpolating mean rainfall using thin plate smoothing splines. International Journal of Geo-
graphical Information Systems, 4, 385–403.
Isern, N., Fort, J., & Vander Linden, M. (2012). Space competition and time delays in human range expansions.
Application to the Neolithic transition. PLoS One, 7(12), e51106. doi:10.1371/journal.pone.0051.
Lam, N. S.-­N. (2004). Fractals and scale in environmental assessment and monitoring. In E. Sheppard & E. B.
McMaster (Eds.), Scale and geographic inquiry: Nature, society, and method (pp. 23–40). Hoboken, NJ: Wiley.
Li, J., & Heap, A. D. (2008). A review of spatial interpolation methods for environmental scientists. Technical Report Record 2008/23. Geoscience Australia, Department of Resources, Energy and Tourism, Commonwealth of Australia. Retrieved December 5, 2018, from https://data.gov.au/dataset/a-review-of-spatial-interpolation-methods-for-environmental-scientists
Mikołajczyk, L., & Milek, K. (2016). Geostatistical approach to spatial, multi-­elemental dataset from an archaeological
site in Vatnsfjörður, Iceland. Journal of Archaeological Science: Reports, 9, 577–585.
Miller, J. R., Turner, M. G., Smithwick, E. A. H., Dent, C. L., & Stanley, E. H. (2004). Spatial extrapolation: The
science of predicting ecological patterns and processes. Bioscience, 54(4), 310–320.
Negre, J., Muñoz, F., & Lancelotti, C. (2016). Geostatistical modelling of chemical residues on archaeological floors
in the presence of barriers. Journal of Archaeological Science, 70, 91–101.
Peters, D. P. C., Herrick, J. E., Urban, D. L., Gardner, R. H., & Breshears, D. D. (2004). Strategies for ecological
extrapolation. OIKOS, 106(3), 627–636.
Rondelli, B., Lancelotti, C., Madella, M., Pecci, A., Balbo, A., Pérez, J. R., . . . Ajithprasad, P. (2014). Anthropic activ-
ity markers and spatial variability: An ethnoarchaeological experiment in a domestic unit of Northern Gujarat
(India). Journal of Archaeological Science, 41, 482–492.
Salisbury, R. B. (2013). Interpolating geochemical patterning of activity zones at Late Neolithic and Early Copper
Age settlements in eastern Hungary. Journal of Archaeological Science, 40(2), 926–934.
Sen, Z. (2016). Spatial modeling principles in earth sciences (2nd ed.). New York: Springer.
Webster, R., & Oliver, M. A. (2007). Geostatistics for environmental scientists. Chichester, UK: Wiley.
Yamada, I. (2009). Edge effects. In R. Kitchin and N. Thrift (Eds.), International encyclopedia of human geography
(pp. 381–388). Elsevier.
8
Spatial applications of correlation
and linear regression
Piraye Hacıgüzeller

Introduction
Identifying the relationships that exist between variables and exploring the nature of these relationships is
an essential element of any study of archaeological spatial phenomena. Methods of correlation and regres-
sion analysis serve this purpose by modelling how change in one variable (or more variables) is accounted
for in other variable(s). Once the association between the dependent/response and independent/explana-
tory variables is quantified, the next step is to interpret this relationship. This process requires domain
knowledge and a critical approach since the mathematical identification of an association does not neces-
sarily mean there is a meaningful, real-­world association in the observed data set.
There are various methods for linear and non-­linear regression modelling, some of which have
found widespread application on spatially explicit archaeological data sets (i.e. data sets that comprise
observations with geospatial coordinates). One of the major themes in correlation analysis and regres-
sion modelling in archaeology has been the diffusion of populations or “cultures”. For example, Silva
et al. (2015) concentrated on the diffusion of rice cultivation in Asia while Bicho, Cascalheira, and
Gonçalves (2017), which also forms the case study in this chapter, investigated the demic dispersal of
the Anatomically Modern Humans (AMH) across Europe (see also e.g. Cobo, Fort, & Isern, 2019;
Jerardino, Fort, Isern, & Rondelli, 2014; Pinhasi, Fort, & Ammerman, 2005). Another spatial archaeo-
logical theme where one can frequently come across correlation and regression is the association
between “site” or find locations and environmental, cultural and/or administrative variables such as
terrain curvature (Bevan & Conolly, 2004) or modern land use (Bevan, 2012; see also e.g. Carrero-­
Pazos, 2018; Contreras et al., 2018; Winter-­Livneh, Svoray, & Gilead, 2010). In this context binary
logistic regression has also been widely used to model the absence-­presence of archaeological features
and sites (e.g. Bevan & Conolly, 2011; Carrer, 2013; Spencer & Bevan, 2018; cf. Kvamme, this vol-
ume; Verhagen & Whitley, this volume). Among other examples of archaeological spatial phenomena
researched through regression modelling is the association between food collection strategies and
environmental variables (Zhang, Bevan, Fuller, & Fang, 2010), intra-­site artefact densities (Domínguez-­
Rodrigo et al., 2017) and landscape terracing (Fall et al., 2012).
Here I provide an overview of bivariate correlation and of bivariate linear regression with the Ordinary
Least Squares (OLS) method, both commonly used techniques in archaeological spatial analysis. I also
explain that in spatial applications of these methods (as well as their multiple, multivariate and, in the case
of regression analysis, non-linear versions) there are particular issues caused by spatial autocorre-
lation. The chapter includes methods and examples to deal with these issues both for bivariate correlation
and bivariate linear regression with OLS. Whilst these effects have largely been overlooked in archaeo-
logical applications of correlation and regression, spatial autocorrelation is a ubiquitous issue in spatial
phenomena and results in the replication of information and, hence, redundant information being used
in these analyses and models. As discussed in more detail later and in the case study, disregarding spatial
autocorrelation in correlation and regression studies can have significant effects on the results including
increased chances of committing Type I error (i.e. rejecting the null hypothesis even though it is true),
and yielding less precise and biased regression coefficient estimates.

Method

Correlation

Scatter plots and the question of causality


Correlation is a statistical method used to evaluate potential associations between variables. The first use-
ful step in correlation analysis is plotting a scatter plot to visualise the potential relationship between the
variables in question. Each set of observations is represented by a point in the scatter plot and a first visual
assessment of obvious trends is possible at this stage as the plot may already present important clues about
the linearity, direction (e.g. positive when both variables increase or decrease together) and strength of
the trend present in the data set as well as the outliers and clusters involved.
It is important to note here that correlation scatter plots or correlation studies in general do not prove
a cause-­and-­effect relationship between variables. This is simply because an observed correlation between
variables does not mean that change in one variable is a direct result of the change in other(s). It could be
that the observed correlation is there by chance (in which case the changes in one variable have nothing
to do with changes in the other) or that an additional variable is at play. For instance, in the hypothetical
case of a negative association between settlement frequency and average slope (where the first increases
while the second decreases), it may be that in the particular topographical set up of the landscape under
consideration, flat lands are located closer to water resources and that in fact the choice of human settlers
was decided by proximity to water while their apparent preference for flatlands was merely an indirect
result of this. Therefore, it would be wrong to conclude on the basis of the correlation study alone that
high settlement frequency was a result of mild relief since the real variable at play is proximity to water.

Pearson’s correlation coefficient


The correlation of two variables can be quantified through different measures. Among these, Pearson’s
correlation coefficient (often referred to simply as the correlation coefficient), which applies to metric variables
(interval and ratio), is the most frequently used. Other correlation measures, such as Spearman’s rank cor-
relation coefficient (also known as Spearman’s rho) or Kendall’s tau (τ) are useful when dealing with ranked
data, non-­linear relationships, or when the normality condition for a bivariate data set is not met. Bivari-
ate normality is not a condition for calculating Pearson’s correlation coefficient but is required for its
significance testing as described further below.
As usual in statistics, the population parameter, which in this case is the correlation coefficient of the
population (also referred to as the “true correlation coefficient”) is represented by a Greek letter, rho (ρ).
The corresponding sample statistic, that is the correlation coefficient of a sample taken from the popula-
tion, is represented by the Roman letter r. One of the many versions of the equation used to calculate r is
given by Equation 8.1, where n is the sample size, x̄ and ȳ are the average values of the observations for each
variable, and sx and sy are the sample standard deviations for each variable.

$r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$  (8.1)

The resulting correlation coefficient can be any real number between –1 and 1. A perfect linear associa-
tion (where all points can be connected by a straight line) will have a coefficient of –1 or 1, depending
on the direction of the correlation. In reality, the linear relationship between the variables of a data set
will almost always be imperfect meaning that the points will not sit neatly on a straight trend line and
the correlation coefficient will get a value between –1 and 1, where a strong negative association will be
close to –1 and a strong positive association close to 1. Importantly, a correlation coefficient value of zero does
not mean that there is no correlation between the variables in question but rather that there is no linear
correlation. It is, in fact, possible to have a strong non-­linear association between two variables and yet have
zero as a correlation coefficient (for instance in u- or n-shaped relationships).
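A minimal sketch of Equation 8.1, assuming NumPy and hypothetical paired arrays x and y, is as follows:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r as the mean cross-product of standardised values (Equation 8.1)."""
    zx = (x - x.mean()) / x.std(ddof=1)   # standardised x (sample sd)
    zy = (y - y.mean()) / y.std(ddof=1)   # standardised y
    return np.sum(zx * zy) / (len(x) - 1)

# np.corrcoef(x, y)[0, 1] returns the same value.
```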

Significance testing for Pearson’s correlation coefficient and the issue with
spatial autocorrelation
There are no hard-­and-­fast rules to decide whether a certain calculated value of r is “sufficiently high to
make the researcher happy about the level of correlation” (Rogerson, 2010, p. 190). Therefore, the results
of correlation analysis should involve significance testing to judge how reliable r is. Specifically, signifi-
cance testing for Pearson’s correlation coefficient is used to calculate the probability that a sample has a
correlation coefficient, r, which comes from a population with a correlation coefficient, ρ, not different
from zero. If the probability (or p value) of the true correlation coefficient being zero is small in com-
parison to the significance level decided on prior to analysis, denoted by α, then the null hypothesis (that
the population correlation coefficient is equal to zero) is rejected. It can therefore be concluded that in
the parent population of the sample, the two variables in question are also very likely to be correlated. Of
course, here, like in any significance testing, there is always the chance that the null hypothesis is rejected
even though it is true (Type I error) and, as discussed below, with the presence of spatial autocorrelation in
the variable values this chance increases. If, on the other hand, the p value is equal to or larger than α, the
null hypothesis cannot be rejected. This means that there is considerable probability (or at least enough
not to disregard it) that the sample comes from a population where there is no correlation between the
variables in question and the sample might in fact be selected by chance. Here, then, making a Type II
error, where the null hypothesis is failed to be rejected even though it is false, remains a possibility. As a
rule of thumb in statistics, the best way to minimise the risk of making Type I and Type II errors is to
have as large a sample size as possible.
The conditions for the significance testing of r are bivariate normality1 and independence. If these
conditions are met, the t-­statistic for significance testing can be calculated in terms of sample size n and
correlation coefficient r as given in Equation 8.2 and then used in a t-­table to estimate the p value for
comparison to the selected significance level. Importantly, if there is a clear positive or negative correlation
between two variables, a one-­tailed significance test is performed to test for the possibility of either posi-
tive or negative correlation. If the direction of correlation is unclear, however, a two-­tailed test should be
used to test for the possibility of correlation in both directions.

$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}$  (8.2)
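A minimal sketch of this test, assuming SciPy; the commented figures illustrate the sample-size effect discussed below, where the same r of 0.2 is significant for n = 250 but not for n = 20:

```python
from scipy import stats

def r_significance(r, n):
    """t-statistic (Equation 8.2) with one- and two-tailed p values."""
    t = r * (n - 2) ** 0.5 / (1 - r ** 2) ** 0.5
    return t, stats.t.sf(t, df=n - 2), 2 * stats.t.sf(abs(t), df=n - 2)

# r_significance(0.2, 250)  ->  t ~ 3.21, two-tailed p ~ 0.0015
# r_significance(0.2, 20)   ->  t ~ 0.87, two-tailed p ~ 0.40
```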
Coming to the issue of spatial autocorrelation, Gangodagamage, Zhou, and Lin (2017, p. 92) define spatial
autocorrelation as the “natural inclination of a variable to exhibit similar values as a function of distance
between the spatial locations at which it is being measured” (see also Lloyd & Atkinson, this volume). As
mentioned earlier, independence is one of the conditions for significance testing for the correlation coef-
ficient. This condition requires that the data set in question comprises only independent observations of
the two variables. For x values, this means that each value of x is not affected by and does not affect other
values of x. The same goes for each y value. If the variables in the correlation analysis are spatially auto-
correlated however (which is often the case in spatial applications), this condition is not fulfilled. In such
a situation “a certain amount of information is shared and duplicated among neighbouring locations,
and thus, an entire data set possesses a certain amount of redundant information” (Lee, 2017, p. 361). The
result is that the actual sample size is larger than the effective sample size, the latter being the number of real
independent observations (Gangodagamage et al., 2017, p. 93; Griffith & Paelinck, 2011, p. xxii). Such
artificial inflation of the sample size is problematic when judging whether a correlation coefficient value
in a given study is high or low enough (in the cases of positive and negative correlation, respectively) for
a statistically significant association depends on sample size. As Rogerson (2010, pp. 189–190) explains in
detail, the minimum absolute value of r needed to attain significance decreases as sample size increases.
So, if the sample size is relatively large (e.g. n = 250), a seemingly low r value (e.g. r = 0.2) may point to
a statistically significant correlation. Yet when the same r value is calculated for a smaller sample (e.g. n =
20), the researcher may fail to reject the null hypothesis at the same significance level. Therefore, when
the effective sample size is rendered smaller than the actual sample size due to spatial autocorrelation,
the chances will be elevated that the null hypothesis is rejected even though it is true (i.e. a Type I error
is committed) (cf. Lee, 2017). As will be discussed below, spatial autocorrelation causes similar issues in
regression analysis. There are different ways of measuring spatial autocorrelation in a data set, one of the
most common being through Moran’s I statistic (used also in the case study later). The detailed coverage
of its methodology and calculations is beyond the scope of this chapter, but readers can consult Rogerson
(2010, pp. 268–273).
Rogerson (2010, p. 194) explains that when only one variable shows spatial autocorrelation in a bivari-
ate correlation analysis, there is no effect on the significance testing result and so spatial autocorrelation is
not an issue. However, when both variables are spatially autocorrelated, corrective measures are necessary
to mitigate the risk of a Type I error occurring. One of the methods that can be used to address this issue
is modifying the significance testing to take the degree of autocorrelation into account. As Lee (2017,
pp. 364–365) points out, this can be done by replacing actual sample size n with effective sample size n*
calculated via Equation 8.3. Here, R̂x and R̂y are the estimated n × n spatial autocorrelation matrices
for each of the two variables and trace is a matrix operation summing the diagonal elements of a matrix,
which in this case is the product of R̂x and R̂y (see also Haining, 2003, p. 279). The diagonal elements
of this particular product matrix provide the relative degree of bivariate spatial autocorrelation at each
observation location (where 1 corresponds to no spatial autocorrelation and values of more than 1 to
positive spatial autocorrelation) and their sum quantifies the overall degree of spatial autocorrelation. The
effective sample size calculated in this way is then used to form a new t-­statistic for significance testing
(Equation 8.2) which arguably takes spatial autocorrelation at each location into account (see case study
below).
$n^{*} = 1 + n^{2}\left[\mathrm{trace}\left(\hat{R}_x \hat{R}_y\right)\right]^{-1}$  (8.3)
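In practice the autocorrelation matrices R̂x and R̂y must themselves be estimated from the data (for example from fitted correlograms), so the exponential distance-decay model and the range parameters in the following sketch, which assumes NumPy, are assumptions made purely for illustration:

```python
import numpy as np

def effective_sample_size(xy, range_x=20.0, range_y=20.0):
    """Equation 8.3 with assumed exponential spatial correlation matrices."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    Rx = np.exp(-d / range_x)    # assumed correlation model for variable x
    Ry = np.exp(-d / range_y)    # assumed correlation model for variable y
    n = len(xy)
    return 1 + n ** 2 / np.trace(Rx @ Ry)
```

The resulting n* then replaces n in the t-statistic of Equation 8.2.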

Linear regression with ordinary least squares method

The regression line and residuals


Linear regression analysis comprises a range of methods that involve finding the best-­fitting straight
line to a data set in order to model an association between variables. A linear regression model is essen-
tially this very line, that can be used to make predictions about the phenomenon it explains, through
interpolation and extrapolation. What generally sets regression analysis apart from correlation is that in
the former, variables are identified as dependent and independent, and the relationship between these
variables can be statistically quantified. Linear regression analysis is referred to as bivariate or simple if
the association in question involves only one independent variable and one dependent variable. Ordi-
nary Least Squares (OLS) regression is the most popular method used for linear least squares regression
in particular and linear regression more generally. It is based on minimizing the sum of the squares of
the distances between each data point and the value predicted by the regression line, hence its name.
The difference between the actual data value and the corresponding value predicted by the regression
line is referred to as a residual in regression analysis. A negative residual will be calculated in cases where
the value predicted by the model is larger than the actual data value, while a positive residual will result
from the predicted value being smaller than the actual value. When a data point is on the regression line,
the residual will be equal to zero. It is important to note that OLS is not a suitable method in cases of
spatial heterogeneity (see Crema, this volume) because it typically treats all observations equally (that is,
residuals are not weighted when fitting the trend line).
The regression line equation is calculated as shown in Equation 8.4 where ŷ is the dependent variable
value predicted by the regression line (hence the hat) and x is the observed value for the independent
variable. The slope of the regression line is represented by the letter m and the y-­intercept (i.e. where the
line intersects the vertical y axis) is b. Slope m is calculated as given in Equation 8.5 where r is Pearson’s
correlation coefficient, and sy and sx are the sample standard deviations for the dependent and independent
variables respectively, calculated by taking the square root of the sample variance.

$\hat{y} = mx + b$  (8.4)
$m = r\frac{s_y}{s_x}$  (8.5)

The best-­fit regression line determined by the OLS method always passes through the point with values
equal to the sample mean of the two variables, x̅ and y̅. Therefore, the y-­intercept can be calculated using
Equation 8.6 once slope value is known. Given the definition of regression residuals above, the observed
ith value of the y variable, yi, will be equal to the sum of the corresponding predicted value, ŷi and a
residual term (also known as error term) ei, as given in Equation 8.7.

$b = \bar{y} - m\bar{x}$  (8.6)
$y_i = \hat{y}_i + e_i = mx_i + b + e_i$  (8.7)
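A minimal sketch of Equations 8.4 to 8.7, assuming NumPy and hypothetical paired arrays x and y:

```python
import numpy as np

def ols_line(x, y):
    """Slope and intercept from Equations 8.5 and 8.6."""
    r = np.corrcoef(x, y)[0, 1]               # Pearson's r
    m = r * y.std(ddof=1) / x.std(ddof=1)     # Equation 8.5
    b = y.mean() - m * x.mean()               # Equation 8.6
    return m, b

# m, b = ols_line(x, y)
# y_hat = m * x + b        # predicted values (Equation 8.4)
# e = y - y_hat            # residuals (Equation 8.7)
```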

Coefficient of determination (or R-­squared) and root mean square deviation


The coefficient of determination and the root mean square deviation (RMSD) are the two main indices
for measuring how well a linear regression model fits a data set. If we understand regression analysis as a
way to describe variation in the y variable as only partially explained by variation in the x variable, then
the coefficient of determination can be used to quantify what percentage of the y variable is explained
by the x variable in a regression model. The coefficient of determination, therefore, can be calculated
with Equation 8.8, where the numerator is the total variation in y described by x while the denominator
is the variation in y altogether.


$r^{2} = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^{2}}{\sum_{i=1}^{n}(y_i - \bar{y})^{2}}$  (8.8)

The value of the coefficient of determination is equal to the squared value of Pearson’s correlation coef-
ficient r (hence its notation as R² or r², or alternatively R-squared), although discussing why this is the case
mathematically is beyond the scope of this chapter. The significance testing for R2 will involve testing the
null hypothesis that the true coefficient of determination (i.e. the coefficient of the determination of the
population), ρ2, equals zero (i.e. H0: ρ2 = 0). The F-­statistic formed for this test through Equation 8.9 has
1 and n-­2 degrees of freedom for the numerator and denominator respectively and is the square of the
t-­statistic used for the significance testing for Pearson’s correlation coefficient, r (Equation 8.2). Impor-
tantly, the two tests are identical, providing identical p-­values and conclusions when the same significance
level is selected (cf. Rogerson, 2010, pp. 209–210).

$F = \frac{r^{2}(n-2)}{1-r^{2}}$  (8.9)
RMSD, on the other hand, is concerned with the variation in the y variable that is not explained by the
variation in x. It is also known as the standard deviation of the residuals (or Root Mean Square Error
(RMSE)). Given that standard deviation is about spread, this alternative name implies that it measures the
spread around the regression line in the y direction or, in other words, how precisely the regression line
fits the data points in terms of y values. For bivariate regression, it can be calculated with Equation 8.10
where y – ŷ denotes the difference between actual observations for the y variable and the corresponding
values predicted by the regression line. The denominator, n – 2, is the number of degrees of freedom.


$\mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^{2}}{n-2}}$  (8.10)
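Both indices are straightforward to compute once predictions are in hand; a minimal sketch assuming NumPy, with y holding the observations and y_hat the values predicted by the fitted line:

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination (Equation 8.8)."""
    return np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

def rmsd(y, y_hat):
    """Root mean square deviation of the residuals (Equation 8.10)."""
    return np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - 2))
```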

The issue with spatial autocorrelation


The OLS method assumes that parameters m and b describing the regression line are constant across the
study area. As mentioned, one consequence of this assumption is that an OLS estimation of the regression
line would not be suitable for those cases where the spatial phenomenon in question systematically varies
(i.e. is heterogeneous) across the spatial area under study. Constant parameters also mean that any varia-
tion in the relationship between variables across the study area, which may come about as a result of spatial
autocorrelation within the variables, will be confined to the residual term, ei (Srinivasan, 2017, p. 2067). The
residual term in a linear OLS model of such spatially autocorrelated data sets will therefore also be spatially
autocorrelated which means that residuals will exhibit similar values as a function of distance between their
spatial locations. Crucially, this violates assumptions about the residual term in simple regression which are
zero mean, independence, constant variance and normal distribution (Rogerson, 2010, p. 281; Sriniva-
san, 2017, p. 2067). Moreover, much like the case for correlation, redundant information is produced by
spatial autocorrelation and hence the linear model estimated through OLS without adjusting for spatial
autocorrelation will be erroneous. A spatial regression model needs to be created instead.
One should, however, not conflate larger scale spatial structures in a study area with the mainly
neighbourhood-­scale (Kühn & Dormann, 2012, p. 995) spatial autocorrelation issues caused by intrinsic
factors. In a perfect regression model with the right choice of independent variables, spatial dependency
in the dependent variable will be fully explained by the spatial dependency in the independent variable(s)
if there is no additional spatial autocorrelation within the dependent variable caused by intrinsic factors.
In that case, the residuals will not be spatially autocorrelated and the spatial autocorrelation in the regres-
sion analysis will not be an issue (Beale, Lennon, Yearsley, Brewer, & Elston, 2010, p. 248; Bini et al., 2009,
p. 194; Kühn & Dormann, 2012, pp. 995–996). The spatial autocorrelation observed within the dependent
variable due to intrinsic factors, on the other hand, will result in the replication of information and hence
redundancy that cannot be explained by the independent variable regardless of how well the latter is chosen
in the modelling process. In such cases the spatial autocorrelation is observed in model residuals and a spatial
regression model is needed to obtain more accurate regression coefficient estimates (Bini et al., 2009, p. 194;
Kühn & Dormann, 2012, pp. 995–996; see also Bevan, this volume; Bevan & Conolly, 2011, pp. 1306–1307
on first-­order and second-­order effects). That said, Beale et al. (2010, p. 248) also stress that the theoretical
conditions mentioned here for the case of broader scale spatial structures (i.e. spatial autocorrelation in the
dependent variable being simply a function of spatial autocorrelation in the independent variable(s)) are
almost never encountered in practice meaning that the presence of spatial autocorrelation almost always
produces spatially autocorrelated residuals. Hence, spatial regression modelling needs to be used for almost
all phenomena where spatial autocorrelation in regression variables is observed.
In real-­world data sets it is impossible to identify the true effects of spatial autocorrelation on regres-
sion analysis because “one can never know if the results are a true reflection of the input data or an arte-
fact of the analytical method” (Beale et al., 2010, p. 247). Therefore, Beale et al. (2010) use simulations
to compare true values from realistic simulation scenarios with regression model parameter estimates and
test and compare the performance of non-­spatial (OLS) and a range of spatial regression methods (e.g. the
Simultaneous Autoregressive Model, Generalised Least Squares and Bayesian Conditional Autoregressive
Model). Their results show that using OLS regression on data sets that are spatially autocorrelated leads to
results with low precision (i.e. high variation around the true value; Beale et al., 2010, p. 247) and, similar
to the aforementioned effect of spatial autocorrelation on the results of correlation studies, increases the
possibility of Type I errors (Beale et al., 2010).
A major point of debate in spatial regression modelling is that different models can provide substan-
tially different regression coefficients for the same data set. Precisely why this happens is unclear and still
needs to be investigated (Beale et al., 2010; Bini et al., 2009). Yet it is clear that with spatial autocorrelation
in regression residuals, spatial regression methods will provide less biased and more precise model coef-
ficient estimates than the non-­spatial OLS method and reduce the chances of Type I errors.
The effects of spatial autocorrelation can be incorporated in linear regression models in two major
ways: through the error term and as co-­variates (cf. Anselin, 2009; Beale et al., 2010, pp. 250–251). In
this chapter, an accessible approach to error modelling is presented which is, in fact, an example of a Simul-
taneous Autoregressive Model (SAR), often referred to as the Autocorrelated Errors Model (Bailey &
Gatrell, 1995, pp. 282–286; see Rogerson, 2010, pp. 283–284). For this method, a spatial regression model
is specified in the same way as the OLS linear regression method. The difference is that each residual is
modelled as a function of the nearby residuals (Rogerson, 2010, p. 283). The method is applied in the
following case study. Thorough discussions on SAR and other spatial regression models as well as further
references can be found elsewhere (e.g. Anselin, 2009; Chun & Griffith, 2013; Srinivasan, 2017).
The method involves calculating two sets of quantities using Equations 8.11. These equations specify
that a new set of values is defined on the basis of weight, w, which indicates spatial proximity, and a ρ
value. The latter is selected by trying a range of possible positive ρ values and observing how these dif-
ferent values improve the residuals when y* is regressed against x*. As Rogerson (2010, pp. 290–291)
explains, the ρ value associated with the “best” set of residuals is selected and among other methods the
decision can be based on minimizing the RMSD calculated for each regression model as further illustrated
in the case study below. Statistically more sophisticated methods for estimating ρ values do exist and
involve a computationally intensive maximum likelihood procedure (cf. Bailey & Gatrell, 1995, pp. 286–
289). Finally, Beale et al. (2010, p. 253) stress, on the basis of their simulation studies, that the regression
models that assign the effect of spatial autocorrelation to an error term, such as the one presented here,
will retain some spatial autocorrelation in the residuals “but the important difference is that these models
are tolerant of such autocorrelation and should provide [more] precise estimates and correct error rates”.

$$y_i^{*} = y_i - \rho \sum_{j=1}^{n} w_{ij}\, y_j, \qquad x_i^{*} = x_i - \rho \sum_{j=1}^{n} w_{ij}\, x_j \qquad (8.11)$$

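To make the search concrete, the following is a minimal sketch of the ρ grid search, assuming numpy is available; W is a spatial weights matrix holding the w_ij values (its construction is left to the analyst), x and y are the observed variables, and the RMSD is taken here as the root mean square of the residuals.

```python
import numpy as np

def autocorrelated_errors_fit(x, y, W, rhos=np.linspace(0.0, 0.99, 100)):
    """Regress y* on x* (Equation 8.11) over a grid of rho values and return
    the fit whose residuals minimize the RMSD."""
    best = None
    for rho in rhos:
        y_star = y - rho * (W @ y)   # y* = y - rho * sum_j w_ij * y_j
        x_star = x - rho * (W @ x)   # x* = x - rho * sum_j w_ij * x_j
        # OLS slope and intercept for y* against x*
        b = np.cov(x_star, y_star, ddof=1)[0, 1] / np.var(x_star, ddof=1)
        a = y_star.mean() - b * x_star.mean()
        rmsd = np.sqrt(np.mean((y_star - (a + b * x_star)) ** 2))
        if best is None or rmsd < best[0]:
            best = (rmsd, rho, b, a)
    return best  # (minimum RMSD, selected rho, slope, intercept)
```

This is a sketch rather than a definitive implementation: the statistically more sophisticated maximum likelihood estimation of ρ mentioned above would replace the simple grid search.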
Case study
In their article “Early Upper Paleolithic colonization across Europe: Time and mode of the Gravettian
diffusion”, Bicho, Cascalheira, and Gonçalves (2017) aim to model demic dispersal of Anatomically Mod-
ern Humans (AMH) between c. 37,000 and 30,000 years ago across Europe using correlation and linear
regression. The dispersal phenomenon they study leads to the replacement of the previous Aurignacian
tradition and Neanderthal populations in marginal areas. Some parts of Europe at the time, however, were
still devoid of hominins and were occupied for the first time by Gravettian diffusion during the period
studied. The analyses involve the oldest Gravettian calibrated Accelerator Mass Spectrometry (AMS)
dates from 33 sites spread across Europe. Hence, there is a single date corresponding to each site. The
authors explain carefully how they filter the data set they use in the study. They identify three potential
locations as the oldest Gravettian sites, namely Buran Kaya III, Geissenklösterle and Krems-­Hundssteig.
They hypothesise that each of these sites may be the origin of the Gravettian techno-­complex in Europe.
Following Fort, Pujol, and Cavalli-­Sforza (2004), a 150 km radius for Paleolithic waves-­of-­advance
is adopted in the study and the authors compute three sets of 150 km isopleths starting in each of the
three potential origin sites using the least-­cost distance method. They then plot the site locations and
select the oldest site between each pair of isopleths in different cardinal directions (Figure 8.1). Subsequently,
they carry out least-­cost distance calculations from each of the three potential origin sites to each one
of the selected sites (Table 8.1). They also calculate the difference between the mean calibrated date of
each origin site and each one of the remaining sites. In the next step of their analysis, they create a scat-
ter plot for these three data sets where they place the potential dependent variable (i.e. the time interval
between each site and the origin site in years) on the vertical axis and the potential independent variable
(i.e. least-­cost distance between each pair of sites in kms) on the horizontal axis. With this, they intend
to examine whether the time difference between the appearance of the Gravettian at each site and the
possible origin site may have been affected by the least-­cost distance between them. Consequently, they
calculate the correlation coefficient r, a p value for its significance testing and a regression line with an
Figure 8.1 Cost-­distance surface with 150 km isopleths having (a) Buran Kaya III (BK); (b) Geissenklösterle
(GEISSE); (c) Krems-­Hundssteig (KRE-H) as origin sites (Bicho et al., 2017, Figure 1). A colour version of this
figure can be found in the plates section.
Table 8.1 Early Gravettian calibrated Accelerator Mass Spectrometry (AMS) dates of sites included in the study
together with least-­cost path distances from the three earliest sites to the sites included in each correlation and regression.

Site | Code | Mean calibrated age (BP) | Least-Cost Path from Buran Kaya (km) | Least-Cost Path from Geissenklösterle (km) | Least-Cost Path from Krems-Hundssteig (km)
--- | --- | --- | --- | --- | ---
Buran Kaya | BK | 38528 | – | – | –
Geissenklösterle | GEISSE | 37569 | 3701 | – | –
Krems-Hundssteig | KRE-H | 37124 | 3062 | 614 | –
Ranis 4 Ilsenhöhle | RANIS | 35655 | 3327 | – | 506
Dolni Vestonice Ila | DVI | 35550 | 2946 | 751 | 130
Fumane | FUMAN | 35479 | 3200 | 790 | 1111
Henrykow 15 | HENRY | 35477 | 2833 | 784 | 437
Trencianske Bohuslavice-Pod Tureckom | TRENC | 34058 | – | 880 | –
El Castillo | CASTI | 33887 | 5613 | 1994 | 2530
Le Sire | SIRE | 33465 | 4533 | 876 | –
Maisieres Canal, champ de fouille | MAISI | 33261 | 4122 | – | –
Lapa do Picareiro | LP | 33230 | 6543 | 2927 | 3459
Komarowa Cave | KC | 32526 | 2705 | – | –
Vale Boi | VB | 32372 | 6537 | 2922 | 3450
Les Garennes | GAREN | 32324 | 4793 | 1136 | 1668
Solutre-J-10 | SOLUT | 32319 | 4357 | 700 | 1231
Tarte | TARTE | 32308 | 5013 | 1397 | 1930
Arbreda | ARBRE | 32227 | – | 1345 | 1878
Paglicci | PAGLI | 32157 | – | 1472 | –
Palomar | PALOM | 31983 | 5744 | 2129 | 2662
Antonilako Koba | AK | 31348 | 5457 | 1839 | 2374
Mira | MIRA | 31315 | 736 | 2888 | 2559
Grotta Arene Candide | ARENE | 31263 | 3554 | – | –
Poiana Ciresului | POIAN | 31236 | 1577 | 1774 | 1169
Sirgenstein | SIRG | 31184 | – | – | 617
Source: Bicho et al., 2017, Table 1

80 percent confidence interval (Figure 8.2). Using the slope of each of the regression lines they calculate
the speed and spread of the Gravettian techno-­complex.
For Buran Kaya, the sample size is n = 21 and the correlation coefficient is r = 0.358; for Geissen-
klösterle, n = 19 and r = 0.657; and for Krems-­Hundssteig, n = 17 and r = 0.568.2 For one of the three
data sets, where Geissenklösterle is taken as the origin site (Figure 8.2(b)), the details of the correlation
coefficient calculation are as follows: the average least-cost distance of all sites to Geissenklösterle (x̄) is
1432.526 km; the average time difference (ȳ) is 4125.421 years; the sum of the products of the differences
at each location between least-cost distance and x̄, and time interval and ȳ (i.e. Σ(yi – ȳ) × (xi – x̄)),
is 19471293.789 (Table 8.2); the sample standard deviations for the x and y variables are 847.302 km
and 1942.505 years respectively. The product of the two standard deviations and n – 1 = 18 (i.e. the
Figure 8.2 Linear regression models created with the Ordinary Least Squares (OLS) method to determine the
association between the time difference for the appearance of the Gravettian techno-­complex at different sites
and their least-­cost distance to three origin sites. (a) model with Buran Kaya III as origin; (b) model with Geissen­
klösterle as origin; (c) model with Krems-­Hundssteig as origin (related data is presented in Table 8.1; Bicho et al.,
2017, Figure 2).
Table 8.2 Details of calculations for the numerator of Equation 8.1 where Geissenklösterle is taken as the origin.

Site | Code | Least-Cost Distance from Geissenklösterle (km) (x-values) | Time Difference (yrs) (y-values) | (xi – x̄) | (yi – ȳ) | (yi – ȳ)(xi – x̄)
--- | --- | --- | --- | --- | --- | ---
Geissenklösterle | GEISSE | 0 | 0 | −1433 | −4125 | 5909774
Krems-Hundssteig | KRE-H | 614 | 445 | −819 | −3680 | 3012521
Dolni Vestonice Ila | DVI | 751 | 2019 | −682 | −2106 | 1435581
Fumane | FUMAN | 790 | 2090 | −643 | −2035 | 1307812
Henrykow 15 | HENRY | 784 | 2092 | −649 | −2033 | 1318727
Trencianske Bohuslavice-Pod Tureckom | TRENC | 880 | 3511 | −553 | −614 | 339484
El Castillo | CASTI | 1994 | 3682 | 561 | −443 | −248969
Le Sire | SIRE | 876 | 4104 | −557 | −21 | 11921
Lapa do Picareiro | LP | 2927 | 4339 | 1494 | 214 | 319188
Vale Boi | VB | 2922 | 5197 | 1489 | 1072 | 1596089
Les Garennes | GAREN | 1136 | 5245 | −297 | 1120 | −331985
Solutre-J-10 | SOLUT | 700 | 5250 | −733 | 1125 | −823784
Tarte | TARTE | 1397 | 5261 | −36 | 1136 | −40343
Arbreda | ARBRE | 1345 | 5342 | −88 | 1217 | −106483
Paglicci | PAGLI | 1472 | 5412 | 39 | 1287 | 50786
Palomar | PALOM | 2129 | 5586 | 696 | 1461 | 1017255
Antonilako Koba | AK | 1839 | 6221 | 406 | 2096 | 851798
Mira | MIRA | 2888 | 6254 | 1455 | 2129 | 3098091
Poiana Ciresului | POIAN | 1774 | 6333 | 341 | 2208 | 753830

Σ = 19471294

denominator in Equation 8.1), is 29625990.506. Dividing 19471293.789 by 29625990.506 gives us a


correlation coefficient r of 0.657 for Geissenklösterle.
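The arithmetic above is easy to reproduce programmatically. Below is a minimal sketch, assuming numpy is available and that x and y are arrays holding the Table 8.2 least-cost distances (km) and time differences (yrs); it mirrors the sum-of-products numerator and standard-deviation denominator of Equation 8.1.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r via the numerator and denominator of Equation 8.1."""
    n = len(x)
    numerator = np.sum((x - x.mean()) * (y - y.mean()))    # sum of cross-products
    denominator = x.std(ddof=1) * y.std(ddof=1) * (n - 1)  # s_x * s_y * (n - 1)
    return numerator / denominator

# For the Geissenklösterle data this should return approximately 0.657.
```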
We can use Equation 8.2 to calculate the t-statistic as in Equation 8.12. In a Student's t distribution
table, we can see that for n – 2 = 17 degrees of freedom and using α = 0.01, this t-statistic is larger than
the critical values of t in both the one-tailed (t = 2.567) and two-tailed (t = 2.898) tests. This means that if the
null hypothesis was true and the correlation coefficient of the population was equal to zero (H0: ρ = 0),
it would be highly unlikely (less than 1 percent chance) that we would get the sample we did. On the
basis of this result, we can reject the null hypothesis (as did the authors of the article, although with an α
value of 0.05) and infer that there is a statistically significant relationship between the time interval and
least-­cost distance in the model where Geissenklösterle features as the origin site.

$$t = \frac{0.657 \times \sqrt{17}}{\sqrt{1 - 0.657^{2}}} = 3.593 \qquad (8.12)$$
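The same test can be reproduced with scipy, which also retrieves the critical values without a printed t table; a minimal sketch, assuming scipy is available:

```python
import numpy as np
from scipy import stats

r, n = 0.657, 19
t = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)     # Equation 8.2; ~3.593
t_crit_one = stats.t.ppf(1 - 0.01, df=n - 2)     # one-tailed critical value, ~2.567
t_crit_two = stats.t.ppf(1 - 0.005, df=n - 2)    # two-tailed critical value, ~2.898
print(t > t_crit_one, t > t_crit_two)            # True, True: reject H0 at alpha = 0.01
```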
As discussed above, the conditions for significance testing for Pearson’s correlation coefficient are bivariate
normality and independence. In order not to elaborate beyond the scope of this chapter, let us assume
that the normality condition for this data set was checked by the authors and met. The independence
condition, however, will not be fulfilled if both variables happen to be spatially autocorrelated. The
strength and scale of spatial autocorrelation in the two variables is not discussed by the authors. The maps
in Figure 8.3 show the difference between average and actual values for each location and each variable.
They display fine-­scale positive spatial autocorrelation since pairs of locations in close proximity to one
another often both score either above average or below average, contributing positively to the calculation
of Moran’s I. We can define weights to indicate spatial proximity and calculate Moran’s I to quantify
this spatial autocorrelation. Although there are less arbitrary ways to do this, let us simply assign weights
other than zero only to the two nearest neighbours of each site. Specifically, let us employ a function
for inverse-decaying distances, w(d), where the distance of two closest neighbouring sites to the site in
question (measured as great circles and in km) is d. When d ≤ 1000, w(d) = 1 − d/1000, and when d ≥ 1000,
w(d) is equal to zero. This means that, for instance, for the case of Antonilako Koba (AK), only its two
closest neighbours El Castillo (CASTI) and Tarte (TARTE), which are both less than 1000 km away from
AK, are assigned weights other than zero, specifically 0.9 and 0.7 corresponding to a distance of 104 and
301 km, respectively. Accordingly, we calculate Moran’s I for the least-­cost distance, x, variable as 0.672
and for the time, y, variable as 0.633.
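A sketch of this calculation is given below, assuming numpy and that D is a precomputed matrix of great-circle distances (in km) between the sites; the weighting scheme (two nearest neighbours only, with w(d) = 1 − d/1000 for d ≤ 1000 km) follows the admittedly arbitrary choice made above.

```python
import numpy as np

def morans_i(values, D):
    """Moran's I with non-zero weights only for each site's two nearest
    neighbours, weighted by w(d) = 1 - d/1000 (zero beyond 1000 km)."""
    n = len(values)
    W = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(D[i])[1:3]   # two nearest neighbours (index 0 is the site itself)
        for j in nearest:
            W[i, j] = max(0.0, 1.0 - D[i, j] / 1000.0)
    z = values - values.mean()
    return (n / W.sum()) * (z @ W @ z) / (z @ z)

# Applied to the least-cost distance and time variables, this kind of
# calculation yields the values of 0.672 and 0.633 quoted above.
```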

Figure 8.3 (a) Map illustrating the difference between x value (i.e. least-­cost distance in kms) at each location
and average x value in order to give an indication of spatial autocorrelation (after Bicho et al., 2017, Figure 1).
(b) Map illustrating the difference between y value (i.e. time difference in years) at each location and average
y value in order to give an indication of spatial autocorrelation (after Bicho et al., 2017, Figure 1). A colour
version of this figure can be found in the plates section.
In order to get an idea about the effect of spatial autocorrelation on significance testing of the correla-
tion coefficient and keep the discussion relatively brief and simple, let us simply follow the example of Lee
(2017, p. 365; cf. Chessel [1981] on the spatial autocorrelation matrix) and check how a hypothetical positive
bivariate spatial autocorrelation score of 2.0 on average across locations (which would be calculated through
the trace matrix operation explained above) would affect our results here. The effective sample size n* in this
case can be calculated as 10.5 (1 + 19²/38; Equation 8.3), rounded down to 10 to be on the safe side. Hence,
the sample size drops from its actual size of 19 to an effective size of 10 and a new t-­statistic can be calcu-
lated as 2.465 using Equation 8.2. In a Student's t distribution table, we can see that for n* – 2 = 8 degrees of
freedom and α = 0.01, this t-statistic is smaller than the critical values of t for both the one-tailed (t = 2.896) and
two-tailed (t = 3.355) tests. Hence, we fail to reject the null hypothesis this time, meaning that the correla-
tion is no longer significant at this level and illustrating how not taking spatial autocorrelation into account
may lead to inaccurate inferences in correlation studies of spatial phenomena.
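The adjustment is easy to script; the sketch below, assuming numpy, uses the hypothetical average bivariate spatial autocorrelation score of 2.0 from the text, which makes the trace term 2.0 × 19 = 38.

```python
import numpy as np

n, r = 19, 0.657
trace = 2.0 * n                    # hypothetical trace of the spatial autocorrelation matrix
n_star = int(1 + n ** 2 / trace)   # effective sample size: 1 + 361/38 = 10.5, floored to 10
t_adj = r * np.sqrt(n_star - 2) / np.sqrt(1 - r ** 2)   # ~2.465, below the 0.01 critical values
print(n_star, t_adj)
```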
Moving to regression analysis, we calculate the slope parameter for the OLS regression using Equa-
tion 8.5 as 1.507 yrs/km. We then calculate the y-­intercept for the linear model using Equation 8.6 as
1966.939 yrs. This means that at the origin site of Geissenklösterle (where x is equal to 0) the Gravet-
tian techno-­complex arrived approximately 1967 years later than at the origin site. This of course does
not make sense and forms an example of how interpretations of y-­intercept in the regression model or,
in fact, the parameter estimates of the regression models in general may not always be meaningful and
should be approached carefully and critically. Now that we have both parameters for our bivariate linear
model, we can write down the equation for the OLS regression line for Geissenklösterle (which is shown
in Figure 8.2(b)) as:

ŷ = 1.507x + 1966.939

The equation indicates that for every 1000 km increase in the least-­cost distance from the origin site
Geissenklösterle, the appearance of the Gravettian techno-­complex is delayed for approximately 1507
years. The coefficient of determination for this regression model, r², is equal to 0.432, which means that in
this linear regression model approximately 43 percent of the variation in the dependent variable, the time
interval, is explained by variation in the independent variable, least-cost distance. The RMSD is calculated
as 1506.480 yrs. As discussed, the conclusions of the significance testing for r² will be the same as those for r.
Hence, we can say that at a 0.01 significance level, we can reject the null hypothesis and infer that there
is a statistically significant relationship between time interval and least-­cost distance.
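For completeness, the sketch below, assuming numpy and the same x and y arrays as before (Table 8.2), reproduces the OLS quantities used here; the RMSD is again taken as the root mean square of the residuals (conventions with an n − 2 denominator differ only slightly).

```python
import numpy as np

def ols_fit(x, y):
    """Slope (Equation 8.5), intercept (Equation 8.6), r-squared and RMSD."""
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    y_hat = a + b * x
    r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    rmsd = np.sqrt(np.mean((y - y_hat) ** 2))
    return b, a, r2, rmsd

# For the Geissenklösterle data this should give approximately
# b = 1.507 yrs/km, a = 1966.939 yrs, r2 = 0.432 and RMSD = 1506.480 yrs.
```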
Spatial autocorrelation, however, changes things considerably. An examination of Figure 8.4 which
presents regression residuals at each location shows that some of the residuals in close proximity are

Figure 8.4 Map showing residuals at each location in order to give an indication of spatial autocorrelation
(after Bicho et al., 2017, Figure 1). A colour version of this figure can be found in the plates section.
Table 8.3 Details of calculations for the Autocorrelated Errors Model (ρ = 0.36, Equation 8.11).

Code | x* | y* | New predicted value by the regression line (ŷ*) | Residual (εᵢ) | εᵢ²
--- | --- | --- | --- | --- | ---
GEISSE | −331.70 | −622.80 | 177.41 | −800.21 | 640335.55
KRE-H | 117.24 | −1220.32 | 1008.06 | −2228.39 | 4965712.46
DVI | 266.94 | 737.26 | 1285.06 | −547.81 | 300091.49
FUMAN | 679.48 | 2009.90 | 2048.36 | −38.46 | 1479.13
HENRY | 314.27 | 499.36 | 1372.63 | −873.27 | 762602.25
TRENC | 459.84 | 2728.68 | 1641.98 | 1086.71 | 1180933.08
CASTI | 1096.41 | 530.02 | 2819.79 | −2289.77 | 5243049.65
SIRE | 322.03 | 892.44 | 1386.99 | −494.55 | 244578.71
LP | 1903.52 | 2499.15 | 4313.15 | −1814.00 | 3290594.21
VB | 1877.82 | 3299.19 | 4265.60 | −966.41 | 933944.26
GAREN | 531.67 | 2737.28 | 1774.87 | 962.41 | 926226.51
SOLUT | 129.90 | 2598.56 | 1031.50 | 1567.06 | 2455680.46
TARTE | 723.37 | 2400.76 | 2129.56 | 271.20 | 73549.68
ARBRE | 753.45 | 2940.37 | 2185.22 | 755.15 | 570249.60
PAGLI | 1314.03 | 5079.00 | 3222.44 | 1856.56 | 3446802.10
PALOM | 1577.05 | 4159.97 | 3709.09 | 450.88 | 203290.47
AK | 840.90 | 3702.26 | 2347.03 | 1355.23 | 1836654.03
MIRA | 2696.41 | 5570.04 | 5780.19 | −210.16 | 44166.26
POIAN | 1367.06 | 5278.38 | 3320.55 | 1957.83 | 3833092.50

similar in sign and magnitude, especially in the case of the first-degree neighbours. Calculating Moran's
I as equal to 0.500 with the same weighting function used above confirms a positive spatial autocor-
relation effect. In order to build the Autocorrelated Errors Model explained above (Equation 8.11),
we use the same weighting scheme and calculate new quantities of y* and x* for different values of ρ
starting from zero. When ρ is equal to 0.36 (Table 8.3), the RMSD value for the regression of y* versus
x* is minimized and equals 1349.358. This value is smaller than the RMSD value of 1506.480 for the
original linear regression model. The equation for the linear regression model that takes spatial auto-
correlation into account is:

ŷ = 1.850x + 791.147  (8.14)

This new model roughly indicates that instead of a 1507-­year increase in time difference with every
1000 km increase in least-­cost distance to the origin site, as suggested by the original model, an 1850-­
year increase ought to be considered. The authors calculate the speed of advance for the Gravettian
with the equation speed = 1/slope following Jerardino et al. (2014). The new regression model, which
adjusts for spatial autocorrelation, indicates that this dispersion rate drops from 1/1.507 = 0.664 km/yr to
1/1.850 = 0.541 km/yr. Moreover, the coefficient of determination increases from 0.432 to 0.533. The
F-statistic calculated for this new model, using Equation 8.9 and the new r² value, is 6.730, which has
1 and (since there are 19 observations in the sample) 19 – 2 = 17 degrees of freedom for the numera-
tor and denominator, respectively. The F-table shows that at a 0.01 significance level the critical value
is 8.40. So, the null hypothesis that r² is equal to zero can no longer be rejected at a 0.01 significance
level and, hence, the new slope value is no longer statistically significant. It therefore appears that not
taking spatial autocorrelation into account in the regression modelling does cause a Type I error in
this particular case.
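As a final check, both the dispersal speeds and the critical F value quoted above can be reproduced in a couple of lines, assuming scipy is available:

```python
from scipy import stats

speed_ols = 1 / 1.507                          # ~0.664 km/yr (original OLS slope)
speed_sar = 1 / 1.850                          # ~0.541 km/yr (autocorrelated errors slope)
f_crit = stats.f.ppf(1 - 0.01, dfn=1, dfd=17)  # ~8.40; the observed F of 6.730 falls below it
print(speed_ols, speed_sar, f_crit)
```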

Conclusion
The aim of this chapter has been to give short summaries of the bivariate correlation and bivariate linear
regression with Ordinary Least Squares (OLS) methods which are both widely applied to archaeological
spatial phenomena, as well as presenting accessible methods and examples to account for the effects of
spatial autocorrelation on such analyses. As highlighted, these effects mainly manifest themselves in the
results of significance testing, but also lead to less precise linear regression models with a potential bias
in the regression model parameter estimates. Even though it is argued that null hypothesis testing is not
always the best way to deal with spatial data sets (and alternative methods for model selection are sug-
gested which can bypass the Type I error issue (Bini et al., 2009, p. 194; Hawkins, 2012; cf. Burnham &
Anderson, 2002)), it has also been demonstrated that ignoring spatial autocorrelation in regression mod-
elling may even lead to a dramatic inversion of the slope sign of the linear regression model turning an
estimated positive linear relation to a negative one (Kühn, 2007)! So, the issues with coefficient estimates
in the case of ignoring spatial autocorrelation in regression analysis certainly remain.
On the basis of the observation that both in simulated and real data sets different spatial regression
methods produce different regression coefficients, the main research questions concern how much these
coefficient shifts differ and why they occur (Beale et al., 2010; Bini et al., 2009). Therefore, difficulties
remain when choosing the best method to account for spatial autocorrelation during regression (and
correlation) analysis. Yet “it is not good practice to use a statistical method when the data do not meet its
(sic) underlying assumptions” (Bini et al., 2009, p. 202). Aiming to remedy the effects of spatial autocor-
relation on correlation and regression applications remains best practice and a spatially explicit method
will provide a more accurate regression model than a non-­spatially explicit one.
While the effects of spatial autocorrelation in correlation and regression analyses in archaeology are
largely overlooked (see, however, Gil et al., 2016), the results of related studies in other disciplines are
certainly alarming and show that the effects for archaeological models can potentially be dramatic too.
The topic promises to be thoroughly researched and discussed in the social sciences. A recent book titled
Spatial regression models for the social sciences by Chi and Zhu (2020) is, for instance, of great interest to
archaeologists carrying out such analyses. Archaeology, with its own discipline-­specific spatially autocor-
related phenomena, can add valuable data, information and insights to interdisciplinary research on spatial
regression in the future.

Acknowledgements
I would like to thank Frank Carpentier, Mark Gillings and Gary Lock for copyediting the chapter and
for their insightful comments. I am also grateful to Serkan Kemeç and Sumeeta Srinivasan who provided
valuable remarks on the content. A special thank you goes to Ingolf Kühn for a detailed and very helpful
review. Any remaining inaccuracies or mistakes are my own.
Notes
1 Bivariate normality implies that both variables considered in a correlation come from normal distributions and
their joint distribution is also normal-­shaped (a three-­dimensional bell curve). It is important to realise here that
even if each random variable X and Y is normally distributed, they will not necessarily be jointly bivariate normal.
2 The authors choose to include the origin site in the analyses of each case and this significantly increases the cor-
relation coefficient. For the case of Buran Kaya, r without the origin site (i.e. Buran Kaya) in the calculations is
equal to 0.146; for the case of Geissenklösterle, r = 0.571; and for Krems-­Hundssteig, r = 0.467. The inclusion is
not a good choice in terms of the accuracy of the results because, as explained earlier, the authors are questioning
the association between the least-­cost distance between non-­origin and each of the three origin sites and time
difference between the appearance of Gravettian for each pair. It is clear that including the origin sites in these
calculations (i.e. where both least-cost distance and time difference are zero, and hence below average) will only
strengthen the expected positive correlation, and in this case significantly so. It is not clear, however, what kind of
interpretative advantage this inclusion provides to the researchers.

References
Anselin, L. (2009). Spatial regression. In A. S. Fotheringham & P. Rogerson (Eds.), The SAGE handbook of spatial
analysis (pp. 255–275). Los Angeles and London: SAGE Publications.
Bailey, T. C., & Gatrell, A. C. (1995). Interactive spatial data analysis. Harlow: Longman Scientific & Technical.
Beale, C. M., Lennon, J. J., Yearsley, J. M., Brewer, M. J., & Elston, D. A. (2010). Regression analysis of spatial data.
Ecology Letters, 13(2), 246–264. doi:10.1111/j.1461-0248.2009.01422.x. Retrieved from https://onlinelibrary.
wiley.com/doi/abs/10.1111/j.1461-­0248.2009.01422.x
Bevan, A. (2012). Spatial methods for analysing large-­scale artefact inventories. Antiquity, 86(332), 492–506.
doi:10.1017/S0003598X0006289X. Retrieved from www.cambridge.org/core/article/spatial-­methods-­
for-­analysing-­largescale-­artefact-­inventories/F52E018213D69DBC559D1AD2DAF5F8DD
Bevan, A., & Conolly, J. (2004). GIS, archaeological survey, and landscape archaeology on the Island of Kythera,
Greece. Journal of Field Archaeology, 29(1–2), 123–138. doi:10.1179/jfa.2004.29.1-­2.123. Retrieved from https://
doi.org/10.1179/jfa.2004.29.1-­2.123
Bevan, A., & Conolly, J. (2011). Terraced fields and Mediterranean landscape structure: An analytical case study from
Antikythera, Greece. Ecological Modelling, 222(7), 1303–1314. https://doi.org/10.1016/j.ecolmodel.2010.12.016.
Retrieved from www.sciencedirect.com/science/article/pii/S0304380010006824
Bicho, N., Cascalheira, J., & Gonçalves, C. (2017). Early Upper Paleolithic colonization across Europe: Time and
mode of the Gravettian diffusion. PLoS One, 12(5), e0178506. doi:10.1371/journal.pone.0178506. Retrieved
from https://doi.org/10.1371/journal.pone.0178506
Bini, L. M., Diniz-­Filho, J. A. F., Rangel, T. F. L. V. B., Akre, T. S. B., Albaladejo, R. G., Albuquerque, F. S., . . . Hawkins,
B. A. (2009). Coefficient shifts in geographical ecology: An empirical evaluation of spatial and non-­spatial regres-
sion. Ecography, 32(2), 193–204. doi:10.1111/j.1600-­0587.2009.05717.x. Retrieved from https://onlinelibrary.
wiley.com/doi/abs/10.1111/j.1600-­0587.2009.05717.x
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-­theoretic
approach (2nd ed.). New York: Springer.
Carrer, F. (2013). An ethnoarchaeological inductive model for predicting archaeological site location: A case-­study of
pastoral settlement patterns in the Val di Fiemme and Val di Sole (Trentino, Italian Alps). Journal of Anthropological
Archaeology, 32(1), 54–62. https://doi.org/10.1016/j.jaa.2012.10.001. Retrieved from www.sciencedirect.com/
science/article/pii/S0278416512000530
Carrero-­Pazos, M. (2018). Beyond the scale: Building formal approaches for the study of spatial patterns
in Galician moundscapes (NW Iberian Peninsula). Journal of Archaeological Science: Reports, 19, 538–551.
https://doi.org/10.1016/j.jasrep.2018.03.026. Retrieved from www.sciencedirect.com/science/article/pii/
S2352409X17308052
Chessel, D. (1981). The spatial autocorrelation matrix. In P. Poissonet, F. Romane, M. A. Austin, E. van der Maarel, &
W. Schmidt (Eds.), Vegetation dynamics in grasslands, heathlands and mediterranean ligneous formations: Symposium of the
Working Groups for Succession research on permanent plots, and data-­processing in phytosociology of the International Society
for Vegetation Science, held at Montpellier, France, September 1980 (pp. 177–180). Dordrecht: Springer Netherlands.
Chi, G., & Zhu, J. (2020). Spatial regression models for the social sciences. Los Angeles: SAGE.
Chun, Y., & Griffith, D. A. (2013). Spatial statistics & geostatistics: Theory and applications for geographic information sci-
ence & technology. Los Angeles: SAGE.
Cobo, J. M., Fort, J., & Isern, N. (2019). The spread of domesticated rice in eastern and southeastern Asia was mainly
demic. Journal of Archaeological Science, 101, 123–130. https://doi.org/10.1016/j.jas.2018.12.001. Retrieved from
www.sciencedirect.com/science/article/pii/S0305440318303765
Contreras, D. A., Hiriart, E., Bondeau, A., Kirman, A., Guiot, J., Bernard, L., . . . Van Der Leeuw, S. (2018). Regional
paleoclimates and local consequences: Integrating GIS analysis of diachronic settlement patterns and process-­based
agroecosystem modeling of potential agricultural productivity in Provence (France). PLoS One, 13(12), e0207622.
doi:10.1371/journal.pone.0207622. Retrieved from https://doi.org/10.1371/journal.pone.0207622
Domínguez-­Rodrigo, M., Cobo-­Sánchez, L., Uribelarrea, D., Arriaza María, C., Yravedra, J., Gidna, A., . . . Mabulla,
A. (2017). Spatial simulation and modelling of the early Pleistocene site of DS (Bed I, Olduvai Gorge, Tanzania): A
powerful tool for predicting potential archaeological information from unexcavated areas. Boreas, 46(4), 805–815.
doi:10.1111/bor.12252. Retrieved from https://doi.org/10.1111/bor.12252
Fall, P. L., Falconer, S. E., Galletti, C. S., Shirmang, T., Ridder, E., & Klinge, J. (2012). Long-­term agrarian landscapes
in the Troodos foothills, Cyprus. Journal of Archaeological Science, 39(7), 2335–2347. doi:https://doi.org/10.1016/j.
jas.2012.02.010. Retrieved from www.sciencedirect.com/science/article/pii/S030544031200074X
Fort, J., Pujol, T., & Cavalli-­Sforza, L. L. (2004). Palaeolithic populations and waves of advance. Cambridge Archaeologi-
cal Journal, 14(1), 53–61. doi:10.1017/S0959774304000046. Retrieved from www.cambridge.org/core/article/
palaeolithic-­populations-­and-­waves-­of-­advance/B1370E9320ABED5563469999FA41FE9B
Gangodagamage, C., Zhou, X., & Lin, H. (2017). Spatial autocorrelation. In S. Shekhar, H. Xiong, & X. Zhou (Eds.),
Encyclopedia of GIS (2nd ed., pp. 92–99). New York: Springer.
Gil, A. F., Ugan, A., Otaola, C., Neme, G., Giardina, M., & Menéndez, L. (2016). Variation in camelid δ13C and δ15N
values in relation to geography and climate: Holocene patterns and archaeological implications in central western
Argentina. Journal of Archaeological Science, 66, 7–20. doi:https://doi.org/10.1016/j.jas.2015.12.002. Retrieved
from www.sciencedirect.com/science/article/pii/S0305440315003179
Griffith, D. A., & Paelinck, J. H. P. (2011). Non-­standard spatial statistics and spatial econometrics. Heidelberg and Lon-
don: Springer.
Haining, R. P. (2003). Spatial data analysis: Theory and practice. Cambridge, UK: Cambridge University Press.
Hawkins, B. A. (2012). Eight (and a half) deadly sins of spatial analysis. Journal of Biogeography, 39(1), 1–9.
doi:10.1111/j.1365-­2699.2011.02637.x. Retrieved from https://onlinelibrary.wiley.com/doi/abs/10.1111/
j.1365-­2699.2011.02637.x
Jerardino, A., Fort, J., Isern, N., & Rondelli, B. (2014). Cultural diffusion was the main driving mechanism of the
Neolithic transition in Southern Africa. PLoS One, 9(12), e113672. doi:10.1371/journal.pone.0113672. Retrieved
from https://doi.org/10.1371/journal.pone.0113672
Kühn, I. (2007). Incorporating spatial autocorrelation may invert observed patterns. Diversity and Distributions,
13(1), 66–69. doi:10.1111/j.1472-­4642.2006.00293.x. Retrieved from https://onlinelibrary.wiley.com/doi/
abs/10.1111/j.1472-­4642.2006.00293.x
Kühn, I., & Dormann, C. F. (2012). Less than eight (and a half) misconceptions of spatial analysis. Journal of Biogeog-
raphy, 39(5), 995–998. doi:10.1111/j.1365-­2699.2012.02707.x. Retrieved from https://onlinelibrary.wiley.com/
doi/abs/10.1111/j.1365-­2699.2012.02707.x
Lee, S.-­I. (2017). Correlation and spatial autocorrelation. In S. Shekhar, H. Xiong, & X. Zhou (Eds.), Encyclopedia of
GIS (2nd ed., pp. 360–368). New York: Springer.
Pinhasi, R., Fort, J., & Ammerman, A. J. (2005). Tracing the origin and spread of agriculture in Europe. PLoS Biol-
ogy, 3(12), e410.
Rogerson, P. (2010). Statistical methods for geography: A student’s guide (3rd ed.). Los Angeles: Sage.
Silva, F., Stevens, C. J., Weisskopf, A., Castillo, C., Qin, L., Bevan, A., & Fuller, D. Q. (2015). Modelling the geo-
graphical origin of rice cultivation in Asia using the rice archaeological database. PLoS One, 10(9), e0137024.
doi:10.1371/journal.pone.0137024. Retrieved from https://doi.org/10.1371/journal.pone.0137024
Spencer, C., & Bevan, A. (2018). Settlement location models, archaeological survey data and social change in Bronze
Age Crete. Journal of Anthropological Archaeology, 52, 71–86. https://doi.org/10.1016/j.jaa.2018.09.001. Retrieved
from www.sciencedirect.com/science/article/pii/S0278416517301253
Srinivasan, S. (2017). Spatial regression models. In S. Shekhar, H. Xiong, & X. Zhou (Eds.), Encyclopedia of GIS (2nd
ed., pp. 2066–2071). New York: Springer.
Winter-­Livneh, R., Svoray, T., & Gilead, I. (2010). Settlement patterns, social complexity and agricultural strate-
gies during the Chalcolithic period in the Northern Negev, Israel. Journal of Archaeological Science, 37(2),
284–294. https://doi.org/10.1016/j.jas.2009.09.039. Retrieved from www.sciencedirect.com/science/article/
pii/S0305440309003458
Zhang, H., Bevan, A., Fuller, D., & Fang, Y. (2010). Archaeobotanical and GIS-­based approaches to prehistoric agri-
culture in the upper Ying valley, Henan, China. Journal of Archaeological Science, 37(7), 1480–1489. https://doi.
org/10.1016/j.jas.2010.01.008. Retrieved from www.sciencedirect.com/science/article/pii/S0305440310000130
Plate 2.8 A map showing points from two surveys that were collected on a total station. Location of the total
station or origin is represented as a star, survey on archaeological features on surface is marked in green, and
the survey of topography is in brown.

Plate 2.9 The survey points overlaid on a scanned map that is geo-­rectified to WGS-­UTM 21. A Python script
was developed to enable rotation and transformation of points in a local coordinate system to a global coordinate
system (UTM) using two known coordinate pairs.
Plate 5.4 Percolation cluster transitions for Domesday settlement. Evolution of the largest cluster in the perco-
lation process of Domesday settlement, overlaid on the transition plot (as in Figure 5.3). Maps of the clusters at
the distance threshold for each transition are depicted. Each vector point colour represents membership, when
two or more nodes are close enough to be part of the same cluster.

Plate 5.5 Domesday vill clusters at 3km and 2.9km overlaid on English coastline and Domesday counties
(generated from datasets provided by Stuart Brookes).
Plate 5.6 Domesday vill and 19th-­century settlement clusters. (a) Domesday vill clusters at 3.2km overlaid
on coastline and Domesday counties, generated from Domesday vill datasets provided by Stuart Brookes; (b &
c) Roberts and Wrathmell’s 19th Century Settlement Nucleation dataset at 3km (Brown, 2015, p. 37) and at
3.5km overlaid by Roberts and Wrathmell’s central province (Brown, 2015, p. 57).

Plate 5.7 Hillfort clusters in Britain, at (a) 34km, (b) 12km and (c) 9km percolation radius.
Plate 6.12 ‘2.5D’ representation of (a) kriged elevations and (b) conditionally simulated values (viewed from
the south west).
Plate 6.13 Radiate of Allectus: C mint percentages in 5 km grid cells.
Plate 6.16 Kriged map of C mint percentages.
Plate 7.6 Visual differences in the surfaces of nine interpolation methods at 1 m resolution. RMS errors for
each model are provided in Table 7.2.
Plate 7.8 Interpolation example modified from Fort (2015, Figure 1). An interpolated surface model of radiocarbon
dates from Neolithic sites (black dots) depicting the space-­time process of the spread of agriculture across Europe.
Note the areas indicated by arrows in the southwest and northeast of the model showing where data sparseness causes
instability in estimates.

Plate 8.1 Cost-distance surface with 150 km isopleths having (a) Buran Kaya III (BK); (b) Geissenklösterle (GEISSE); (c) Krems-Hundssteig (KRE-H) as origin sites (Bicho et al., 2017, Figure 1).
Plate 8.3 (a) Map illustrating the difference between x value (i.e. least-­cost distance in kms) at each location and average x value in
order to give an indication of spatial autocorrelation (after Bicho et al., 2017, Figure 1) (b) Map illustrating the difference between y
value (i.e. time difference in years) at each location and average y value in order to give an indication of spatial autocorrelation (after
Bicho et al., 2017, Figure 1).

Plate 8.4 Map showing residuals at each location in order to give an indication of spatial autocorrelation (after
Bicho et al., 2017, Figure 1).
Plate 9.1 Screenshot depicting the distribution of radiocarbon dates available from the Canadian Archaeological Radiocarbon Database, version 2.1 (Martindale et al., 2016).

Plate 9.4 Local spatial permutation test of the summed probability distribution of radiocarbon dates (SPDRD)
from Neolithic Europe showing locations with higher (red) or lower (blue) geometric growth rates than the
expectation from the null hypothesis (i.e. spatial homogeneity in growth trajectories) at the transition period
between 6500–6001 and 6000–5501 cal BP. The insets on the right show the observed local geometric growth
rates and the simulation envelope for locations a and b on the map (see Crema et al., 2017, for details).
Plate 10.9 Spatiotemporal trajectories and imperfection of archaeological data in Syrian Arid Margins during
the Bronze Age.

Plate 10.14 Application of the fuzzy sets framework to each sub-­space.

Plate 10.16 Mapping the attractiveness of and possibilities of finding settlements at the sub-spaces using fuzzy set estimates and survey intensity levels.
Plate 10.17 From type-­1 to type-­2 fuzzy sets: introducing the reliability of archaeological sites in fuzzy set
calibration.
Plate 10.18 Type-­2 fuzzy set settlement estimates for Early Bronze Age IV Arid Margins comparing results
for ‘cluster’ sites and ‘outlier’ sites.

Plate 11.3 Focal medians for 5km BASr catchments calculated from the BASr baseline for Ireland (left) and
residuals between expected 87Sr/86Sr ratios based on the BASr catchments and the observed 87Sr/86Sr ratio for
Individual A2 (right). Locations from which Individual A2 could have originated are shown in white. Loca-
tions from which Individual A2 is unlikely to have originated are shown in blue (more depleted) and orange
(more enriched).
Plate 11.5 Probability density surface (Top Left) and maximum likelihood estimations showing the locations where Burial K from Duggleby Howe could have originated from based on the observed 87Sr/86Sr ratio (Top Right), the observed δ18O value (Bottom Left), and both the observed 87Sr/86Sr ratio and δ18O value (Bottom Right). The geographic assignments based on dual isotope tracers are unduly influenced by one of the isotopes (oxygen), raising further questions about the utility of oxygen as a tracer isotope.
Plate 12.3 A northwest Arkansas historic data set from 1892: (a) the 18 × 27 km study region with 589
historic farmsteads and roads plotted over topography with towns outlined, (b) maps of the four principal
components of historic settlement with central values of legend indicating most preferred locations.
Plate 13.1 Southern portion of the coastal Georgia study area: maximum available calories for white-­tailed
deer (Odocoileus virginianus) for the month of September (ca. 500 BP).
9 Non-stationarity and local spatial analysis
Enrico R. Crema

Introduction
One of the core assumptions held by most spatial analyses is that the generative process behind the
observed pattern is stationary. This implies that statistical properties such as the intensity of a point process,
the nature of the relationship between dependent and independent variables, or the patterns of spatial
interaction are independent of their absolute location, and hence homogeneous across space. The assumption
is often adopted implicitly and not exclusively in spatial analyses; for example, when inferring population
trajectories of a particular region (using site counts or density of radiocarbon dates), the pattern observed
in the aggregate time-­series is considered to be, at least to some extent, representative of the region as
a whole. The advantage of holding this view is that information can be reduced into global statistics,
enabling for example the description of complex and multi-­scalar patterns of spatial interaction using
a single, distance-­based function (cf. Bevan this volume, Figure 4.3). Yet in many cases holding such an
assumption might be problematic as many processes do vary in their properties across geographic space.
They are, in other words, spatially heterogeneous and non-­stationary. Under these circumstances choosing
inappropriate methods that assume stationarity might at best hinder the detection of interesting variations
and outliers in the data, and at worst lead to an erroneous understanding of the overall pattern.
The common way to informally approach potential issues derived from non-­stationarity is to simply
select a window of analysis where the generative process can be assumed to be spatially homogeneous.
Intuitively speaking stationarity is negatively correlated with scale as larger study areas are more likely
to incorporate variation in spatial properties, making the use of global statistics less appropriate. The
problem is that the exact scale where the assumption stops being valid can vary depending on the nature
of the process under investigation and the idiosyncrasies of the specific case study. While informal rules
of thumb might be appropriate in some situations, stationarity should not be an a priori assumption, but
rather a hypothesis to be evaluated. This is particularly the case for large scale synthetic research that
harnesses the availability of increasingly larger collections of digital data, ranging from spatial databases
of radiocarbon dates (e.g. Shennan et al., 2013; Chaput et al., 2015) to remotely sensed data (e.g. Menze,
Ur, & Sherratt, 2006; Biagetti et al., 2017).
It is worth noting that stationarity is a property of the model, and not of the observed data per se
(Fortin & Dale, 2005). This is a crucial point, as in practical terms non-­stationarity arises from model
misspecification. Stationarity is an assumption whereby statistical properties such as mean or variance are
considered to be spatially invariant. But these properties do often vary over space as a consequence of
some unidentified variables, and failing to appropriately model these will drastically reduce our capacity
to explain spatial heterogeneity and lead to the incorrect use of global statistics (Fotheringham, Bruns-
don, & Charlton, 2000). A trivial example can explain this issue. Suppose that someone is analysing the
regional distribution of archaeological sites over a rugged landscape characterised by patches of flat areas
that are more suited for human occupation. For the sake of simplicity, we can assume that the only driver
of site density is the terrain morphology, with the intensity of occupation being five times larger in the flat
patches. The density of archaeological sites will not be homogeneous over space, but instead characterised
by several clusters located in these patches. Examining this data and computing a single estimate of site
density (i.e. computing a global statistic) would be inappropriate, and similarly analysing for spatial inter-
action might misleadingly suggest evidence of second-­order interaction (when in fact the sites are not
attracted to each other but only to absolute spatial locations). The problem can be solved by either ana-
lysing the flat patches separately by partitioning the study area or by specifying a variable that explains the
variation in site density (i.e. terrain ruggedness). Ignoring either option will lead to incorrect inferences.
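The effect can be demonstrated with a few lines of simulation; the sketch below, assuming numpy and entirely hypothetical intensity values, draws Poisson site counts for a 'flat' zone that is five times as intense as the rest of a unit study area, so that a single global density estimate describes neither zone.

```python
import numpy as np

rng = np.random.default_rng(42)
area_flat, area_rough = 0.2, 0.8   # a unit study area split into two zones
lam_rough = 50.0                   # hypothetical off-patch intensity (sites per unit area)
lam_flat = 5 * lam_rough           # occupation five times more intense on the flat patches
n_flat = rng.poisson(lam_flat * area_flat)
n_rough = rng.poisson(lam_rough * area_rough)

# Zone-specific densities differ fivefold (in expectation), yet a single
# global estimate averages over both and describes neither zone:
print(n_flat / area_flat, n_rough / area_rough, (n_flat + n_rough) / 1.0)
```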
The substantial growth in the availability of Geographic Information Systems (GIS)-­based spatial data
in recent years has undoubtedly eased the creation of more sophisticated and complex models that can
account for different kinds of spatial dependencies induced by environmental variables. If appropriately
identified and modelled, these advances can limit the risk of model misspecifications. However, this is not
necessarily a trivial task, and the situation is worsened for two reasons. First, spatial differences can also
arise simply as a consequence of heterogeneity in archaeological research design. Different states, regions,
and individuals often employ different sampling strategies resulting in biases that can exhibit strong spatial
structure. Figure 9.1, for example, shows the location of North American archaeological sites included in
the Canadian Archaeological Radiocarbon Database (CARD, v.2.0). The overall variation in the density
of archaeological sites with radiocarbon dates is a combined effect of past population density and differ-
ences in sampling intensity, but the remarkable strength of the latter is not always as self-­evident as in the
case of the state of Wyoming, shown here as a rectangular patch with a disproportionately high sample
density. Despite the known role of these forms of sampling biases, archaeological spatial analyses have
rarely addressed this issue formally (but see Bevan (2012) for an exception; see also Banning, this volume).
Yet examples in fields such as ecology showcase how the challenging task of quantifying and formally
integrating sampling bias is not only possible but can dramatically improve the predictive power of a
model (see Syfert, Smith, and Coomes (2013) and Stolar and Nielsen (2015) for applications in species
distribution modelling).
Second, whilst model misspecification and sampling bias are, at least potentially, tractable problems,
non-­stationarity can also arise because different individuals might genuinely exhibit different relation-
ships across space. Cultural, behavioural, and economic differences can in fact lead to different practices,
attitudes, and preferences towards the very same environmental variable, and at the same time these varia-
tions are likely to exhibit spatial autocorrelation. Global analysis will, by definition, ignore these potential
variations as its core assumption is that individual observations are interchangeable and originating from
the same process. This can be regarded as a particular form of model misspecification (e.g. one could,
at least in theory, specify categorical variables to depict cultural affiliations), albeit one where identify-
ing and quantifying key variables is difficult, if not impossible. From a theoretical standpoint ignoring
potential spatial heterogeneity arising from these factors is an example of environmental determinism (see
Gaffney & van Leusen, 1995; Jones & Hanham, 1995), an approach that ‘denies geography and history’ by
assuming that ‘every time and everywhere is basically the same’ (Jones, 1991, p. 8; cited in Fotheringham
et al., 2000, p. 95).
Figure 9.1 Screenshot depicting the distribution of radiocarbon dates available from the Canadian Archaeo-
logical Radiocarbon Database, version 2.1 (Martindale et al., 2016). A colour version of this figure can be
found in the plates section.

Method
How then can we identify non-­stationarity? How can we discern cases where using global statistics is
still appropriate in contrast to instances where model misspecifications, sampling bias, and un-­modelled
cultural variables can deeply undermine the results of the spatial analysis? Within a typical modelling
framework (e.g. regression analysis), the standard way to tackle this issue is to examine for the presence
of spatial autocorrelation in model residuals. While this is an efficient solution that directly examines the
assumptions of global statistics, the detection of spatial structure in the residual provides only a general
indication of misspecification and does not provide sufficient details on the nature of the spatial variation
per se.
One way to approach this problem is to break down the average properties observed at the global
scale and focus the perspective on to its local scale constituents. Thus rather than yielding a single statistic
describing the entire window of analysis, the objective is to retrieve multiple values, one for each of the
sampled locations. By analysing these statistics or even simply visualising them on a map, regularities and
exceptions can be identified. This provides clues for identifying plausible missing variables or provides
some insights into the nature of a culturally-driven spatial heterogeneity. The growth of geographic infor-
mation systems (GIS) in the mid-1990s has particularly fostered the development of a suite of statistical
techniques, generally referred to as local spatial analysis, that implements this shift from a global to a local
perspective. These include both local versions of pre-­existing global statistics (e.g. Local Ripley’s K,
Local Moran’s I, Geographically Weighted Regression, Spatial Expansion Method, etc.) as well as pur-
posely developed new methods (e.g. the geographical analysis machine, GAM, by Openshaw, Charlton,
Wymer, & Craft, 1987, but also the locally-­adaptive model of archaeological potential, LAMAP, by Carleton,
Conolly, & Iannone, 2012).
While these techniques vary in their details (see below), they generally share two main properties:
(1) statistics are computed for each observed sample location, and hence they can be “mapped”; and
(2) statistics are computed by weighting the contribution of samples based on the distance to each focal
observation, i.e. they are based on local neighbourhoods that can be specified in a variety of ways (e.g.
contiguity in polygon data, a fixed number of ‘nearest’ neighbours, cut-off distance, distance decay func-
tions; see Getis & Aldstadt, 2010, for a review). The subsections below provide a brief summary
of the key concepts pertaining to the most commonly used forms of local spatial analysis, and a review
of their archaeological applications.

Local point pattern analysis


Point pattern analysis (see Bevan, this volume) refers to a body of statistical techniques designed to assess
the spatial distribution of entities that can be described as points located most typically (but not necessar-
ily) on a two-­dimensional plane. The underlying processes behind a given point pattern are determined
by exogenous and/or endogenous factors. The former is often referred to as a first-­order effect (Bailey &
Gatrell, 1995), or induced spatial dependency (Fortin & Dale, 2005) and includes factors that are indepen-
dent from the phenomena of interest such as topography, soil, or distance to key resources. Endogenous
factors are instead referred to as a second-­order effect (Bailey & Gatrell, 1995), or inherent spatial dependency
(Fortin & Dale, 2005). These include factors that are intrinsic to the phenomena of interest, such as the
repulsion between settlements resulting from territoriality, or the aggregation of house-­units driven by
socio-­economic principles. The goal of point pattern analysis is to discern and model these two forms
of dependency.
Archaeological applications of point pattern analysis have a long tradition that goes back to the early
1970s (see references within Hodder & Orton, 1976), and since then the focus has been predominantly on
methods designed to discern between regular, clustered, and random patterns such as the Nearest Neighbour
Index (Clark & Evans, 1954) or the Ripley’s K function (Ripley, 1976). Both of these methods are global
statistics, and while they provide easy to interpret numerical indices and can be used within a hypothesis
testing framework, they generally assume stationarity and do not directly distinguish induced and inherent
spatial dependency. This can be problematic in a variety of ways. First, the standard null hypothesis used
in most techniques is a spatially homogeneous Poisson process where the intensity (i.e. density) is estimated
from the observed data. Techniques such as Ripley’s K function or the Pair Correlation Function (PCF) are
designed to detect spatial interaction (i.e. inherent spatial dependency) as instances of clustering or disper-
sion that are not accounted for by the null model (cf. Figure 9.2(a) vs. Figure 9.2(b); see Bevan, this volume).
Both techniques extrapolate a measure of local density at different spatial scales and compare them against
expectations from this null model. Statistical significance is then obtained by simulating point patterns
under the null hypothesis and generating a simulation envelope: observed statistics above this envelope are
then interpreted as evidence of clustering at the given spatial scale, whilst statistics below the envelope are
regarded as evidence of dispersion (see Bevan, this volume for further details).
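A bare-bones version of this procedure is sketched below, assuming numpy; the K estimator omits edge corrections for brevity, so it is illustrative rather than a production implementation (packages such as spatstat in R provide corrected estimators).

```python
import numpy as np

def ripley_k(points, r_values, area):
    """Naive (edge-uncorrected) Ripley's K for an (n, 2) coordinate array."""
    n = len(points)
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)                # exclude self-pairs
    return np.array([(d < r).sum() * area / (n * n) for r in r_values])

def csr_envelope(n, r_values, xmax, ymax, nsim=99, rng=None):
    """Pointwise min/max K from nsim simulated CSR patterns with n points."""
    if rng is None:
        rng = np.random.default_rng()
    area = xmax * ymax
    sims = np.array([ripley_k(rng.uniform([0, 0], [xmax, ymax], (n, 2)),
                              r_values, area) for _ in range(nsim)])
    return sims.min(axis=0), sims.max(axis=0)

# Observed K values above the upper envelope suggest clustering at that
# scale; values below the lower envelope suggest dispersion.
```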
However, clustering can also be the result of induced spatial dependency that is often expected in
a larger window of analysis (e.g. settlement clustering along rivers, or on flat patches – cf. the earlier
Non-­stationarity and local spatial analysis 159

0.30
a c

0.20

0.20
0.10

0.10
0.00

0.00
0.00 0.10 0.20 0.00 0.10 0.20

0.20

0.20
b d
0.10

0.10
0.00

0.00
0.00 0.10 0.20 0.00 0.10 0.20

0.2

Figure 9.2 Simulated point patterns with associated observed (solid line) and expected (dashed line, under
Complete Spatial Randomness) L function (a variant of Ripley’s K function where the theoretical expectation
of Complete Spatial Randomness (CSR) is a straight line): (a) homogeneous Poisson process; (b) clustered point
process; (c) spatially inhomogeneous Poisson process with different intensities between left and right sides of
the window of analysis (separated by the dashed line); (d) second-­order spatial heterogeneity with a combina-
tion of regular (left) and clustered (right) patterns. The function suggests aggregation (clustering) when the
observed L function is above the expected value and segregation (regular spacing) when below.

example). If the objective is the detection of spatial interaction then using a homogeneous Poisson process
in this case can be regarded as a particular form of misspecification (see Figure 9.2(c)). The issue can be
tackled by using more sophisticated techniques that can replace the null hypothesis with a spatially inho-
mogeneous version of the Poisson model, where the intensity varies as a function of external covariates.

For example, Eve and Crema (2014) investigated the distribution of Bronze Age houses at Leskernick
Hill (Cornwall, UK) by first fitting a point process model using a range of covariates including elevation,
slope, and visibility of landmarks (i.e. modelling induced spatial dependency), and subsequently used a
residual K function to detect clustering that was not accounted for by their fitted model (i.e. inherent
spatial dependency).
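
As a hedged illustration of this two-step logic (not a reproduction of Eve and Crema's actual analysis), spatstat's built-in bei dataset and its accompanying covariate images can be used to fit an inhomogeneous Poisson model and then inspect a residual K function:

library(spatstat)
# Fit an inhomogeneous Poisson point process model whose intensity depends
# on covariates (elevation and gradient images shipped with 'bei'):
fit <- ppm(bei ~ elev + grad, data = bei.extra)
# Residual K function: departures beyond the bounds suggest interaction
# (inherent spatial dependency) not captured by the fitted trend:
plot(Kres(fit))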
This solution is feasible as long as the inhomogeneous Poisson model can be assumed to be stationary.
However, the relationship between the intensity of the point process and the external variables (described
by the parameters of the fitted model) might also vary over space. If this is the case a global fitted model
is no longer a viable option and one should adopt alternative solutions (see Baddeley, 2017) similar to
those used in geographically weighted regression (see below).
Furthermore, even when variation in the externally induced spatial dependency is taken into account,
the nature of spatial interaction (i.e. inherent spatial dependency) might still vary over space (Fig-
ure 9.2(d)). Such second-­order heterogeneity (Pélissier & Goreaud, 2001) cannot be tackled by the most
commonly adopted point-­pattern analysis techniques such as Ripley’s K function or Nearest Neighbour
Index, as the mathematics underpinning the methods described above are based on aggregate statistics
(e.g. the mean density within a specific radius or the average distance to the nearest neighbour) that
effectively ignore variation between observations.
The solution in this case is to measure the same statistic for each observation point and map their
variation over space. The most widely adopted example of this approach is Getis and Franklin’s (1987)
second-­order neighbourhood analysis, which is effectively equivalent to a local version of Ripley’s K function.
A few archaeological examples employ this technique either in its basic form (e.g. Palmisano, 2013) or in
its bivariate version, where the inherent spatial dependency is investigated in terms of relationship attrac-
tion or repulsion between two classes of points (e.g. two different artefact types). For example, Orton
(2004) re-­examined the flint artefact distribution within the Mesolithic site of Barmose I identifying
potential activity areas as an alternative to cluster analyses. Crema and Bianchi (2013), and more recently
Riris (2017), applied the same suite of techniques on survey data, operationalizing the transition from a
site-­centric to artefact-­centric analysis of surface scatters. Both of these studies identified local patterns
of inter-­type artefact aggregation and segregation (with statistical significance obtained from random
permutation tests), and more importantly ‘mapped’ the variation of such relationships over space, iden-
tifying complex patterns within and between clusters that cannot be adequately described by standard
global spatial analysis. Figure 9.3 compares, for example, the output of a global (Figure 9.3(b)) and a local
(Figure 9.3(c)) point pattern analysis aimed at assessing the aggregation/segregation of stone tools made of different raw materials (see Crema & Bianchi, 2013 for further details). The global bivariate L function suggests aggregation between different materials (in this case Gafsa-sourced flint vs flint sourced from elsewhere) up to 350 meters. The local version of the same analysis shows, however, that this aggregation
occurs only in some areas (see filled dots in Figure 9.3(c)).
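
A minimal sketch of this global-versus-local contrast, assuming spatstat's Lcross() and localLcross() functions (see the package documentation) and using the built-in bivariate amacrine pattern in place of survey data:

library(spatstat)
# Global bivariate L function with a Monte-Carlo envelope (cf. Figure 9.3(b)):
env <- envelope(amacrine, Lcross, i = "on", j = "off", nsim = 99)
plot(env)
# Local bivariate L functions, one curve per point of type "on"
# (cf. Figure 9.3(c)); evaluating them at a chosen distance allows the
# aggregation/segregation around each point to be mapped:
ll <- localLcross(amacrine, from = "on", to = "off")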

Local indicators of spatial association (LISA)


One of the most commonly adopted forms of local spatial analysis is a suite of geostatistical techniques
commonly referred to as local indicators of spatial association (LISA; see also Fusco and de Runz, this volume).
These are designed to determine for any given sampled location and its local neighbourhood the extent of
clustering of similar observed values (Anselin, 1995). The primary objective of LISA is thus to decompose
global indices of spatial autocorrelation into their local constituents in order to identify the location of
outliers and local spots of non-­stationarity. Although the various association statistics (i.e. local Gamma,
local Moran, and local Geary) differ slightly from each other, they generally all employ Monte-Carlo
simulations to assess statistical significance and formally define spatial neighbourhoods using a weighting scheme (Getis & Aldstadt, 2010).

Figure 9.3 Lithic distribution analysis from the Sebkha Kelbia survey, Tunisia (after Crema & Bianchi, 2013), showing contrasting results between global and local bivariate L functions of stone tools divided by their raw material (Gafsa flint vs. flint sourced elsewhere): (a) distribution of the analysed stone tools (filled circle: Gafsa-sourced flint; hollow circle: flint sourced from elsewhere); (b) bivariate L function showing significant segregation between the two classes between 20 and 320 meters (MC: Monte-Carlo); (c) local bivariate L function at the 100-meter scale showing evidence of aggregation (black dots indicate locations of Gafsa-sourced flints with a statistically significant proportion of neighbours composed of flint sourced from elsewhere).
While the most conventional use of LISA is to provide a better diagnostic tool for regression analysis by identifying where residuals exhibit strong autocorrelation, the range of archaeological applications demonstrates how this suite of techniques (along with other local versions of geostatistical analyses) can be used in a
range of contexts. For example, Premo (2004) used Moran’s local I (Anselin, 1995) and Getis’s local Gi*
statistics (Getis & Ord, 1992; a related technique designed to identify clusters, distinguishing whether they are composed of low or high values relative to the mean) to explore the spatial distribution of terminal long-count
dates carved on Classic Maya monuments. The objective in this case was to determine whether these
proxies of ‘collapse’ (the terminal dates indicate the most recent year when elites at a particular site raised
monuments) exhibit local variations in their extent of autocorrelation, and identify the presence and the
location of significant clusters of early and late dates. Crema, Bevan, and Lake (2010) also used the local
Gi* statistics as an exploratory analysis to identify areas of low or high chronological uncertainty in Middle
to Late Jomon pit-­dwellings in central Japan. More recently Styring, Maier, Stephan, Schlichtherle, and
Bogaard (2016) used the same analysis on the δ15N value of cereal grains at the Neolithic site of Hornstaad-­
Hörnle IA, Germany, to investigate patterns of inter-­household variation in crop-­husbandry practices.
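
A minimal sketch of a LISA computation with the spdep package; the coordinates and attribute values below are simulated placeholders standing in for, say, terminal dates or δ15N values:

library(spdep)
set.seed(1)
xy <- cbind(runif(100), runif(100))  # hypothetical site coordinates
z  <- rnorm(100)                     # hypothetical attribute values
nb <- knn2nb(knearneigh(xy, k = 8))  # 8-nearest-neighbour graph
lw <- nb2listw(nb, style = "W")      # row-standardised spatial weights
lisa <- localmoran(z, lw)            # local Moran's I per observation
gi <- localG(z, nb2listw(include.self(nb), style = "W"))  # local Gi* variant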

Geographically weighted regression


Regression analyses are among the most widely used statistical techniques in archaeology and often entail
observations that are spatially situated (Hacıgüzeller, this volume). Typical examples include estimates
of the speed of the spread of farming using radiocarbon dates (e.g. Pinhasi, Fort, & Ammerman, 2005),

the fitting of fall-­off curves of proportion data (e.g. artefact type) from potential centres of production
(e.g. Eerkens, Spurling, & Gras, 2008), or the modelling of site presence/absence via logistic regression
(e.g. Carrer, 2013; see Kvamme this volume). Most of these regression models assume that: (a) samples
are independent; and (b) the observed relationships between variables are the same across space, i.e. they
assume stationarity. The latter implies that estimates of the rate of expansion are assumed to be constant
over space, decrease in the proportion of artefact types from the source is assumed to be isotropic (i.e.
there is no directionality in the fall-­off), and that the independent variables are assumed to have the same
role in determining the likelihood of site presence across the study area. As for the other cases, if these
assumptions are not justified, models can potentially be misspecified and estimates biased.
While diagnostics of regression residuals can help identify problematic cases, they do not explicitly model spatial heterogeneity and hence do not provide the means to formally approach the non-stationarity problem (i.e. they cannot indicate how the relationships vary across space). The last two decades
however have seen the development of a wide range of regression techniques designed for the analyses of
spatial data. Problems such as the non-­independence and autocorrelation of sample observations are being
tackled by tailored methods such as spatial auto-­regressive models (see Gil et al., 2016 for an archaeo-
logical application). Geographically Weighted Regression (GWR) (Fotheringham, Brunsdon, & Charlton,
1998; Fotheringham et al., 2002) is one such technique, suited to instances where the relationships between variables are known to be spatially heterogeneous. The method is essentially a 'local' version of
regression analysis where global model parameters are replaced by continuous functions that are depen-
dent on the spatial coordinates of each location. Thus a ‘global’ regression model can be regarded as a
special case of the GWR, whereby the output of these continuous functions do not vary across space.
By allowing model parameters to vary across space this technique takes into account spatial heterogene-
ity (reducing model misspecification), and allowing at the same time the possibility to ‘map’ the spatial
variation of the parameters (and hence the spatial variation in the relationship between dependent and
independent variables). Geographically weighted regression assumes that when estimating the parameters
for a given location i, sites in proximity have a larger impact on the estimate of the model parameters than
those that are further away. This is achieved by weighting the contribution of neighbouring data points
using some distance decay function. Geographically weighted regression shares some similarities with
the spatial expansion method (Jones & Casetti, 1992), an earlier technique that similarly highlighted the
importance of spatially varying relationships. Whilst the spatial expansion method is a relevant precursor of
GWR, it provides less flexibility in defining how parameters vary over space, as it is designed to capture
general directional trends and its form needs to be assumed a priori (Fotheringham et al., 2000).
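
A hedged sketch with the spgwr package, using simulated stand-ins for the variables (e.g. surface sherd density as a function of slope); note that a simple Gaussian GWR such as this does not replicate Bevan and Conolly's geographically weighted zero-inflated Poisson model:

library(spgwr)
set.seed(1)
df <- data.frame(x = runif(200), y = runif(200),
                 slope = runif(200), dens = rnorm(200))  # hypothetical data
bw <- gwr.sel(dens ~ slope, data = df, coords = cbind(df$x, df$y))
mod <- gwr(dens ~ slope, data = df, coords = cbind(df$x, df$y), bandwidth = bw)
head(mod$SDF@data)  # local intercept and slope estimates, one row per location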
Despite its ability to address potential issues of environmental determinism, archaeological applications
of GWR have been comparatively limited. Gkiasta, Russell, Shennan, and Steele (2003) explored local
variations in the rate of the spread of farming in Neolithic Europe, whilst Bevan and Conolly (2009)
examined how covariates such as slope, vegetation, and geology have different relationships to the surface
pot-­sherd density in different parts of the Greek island of Antikythera using a geographically weighted
zero-­inflated Poisson regression. The technique has also been explored in the context of predictive
modelling of site locations (Löwenborg, 2010), as well as for larger synthetic research such as Linearband-
keramik (LBK) faunal remains in western Europe (Manning et al., 2013).

Case study
The core principles shared across the methods described above can be applied to virtually any analysis that
seeks to tackle non-­stationarity. One recent archaeological example is the spatial extension of the summed
probability distribution of radiocarbon dates (SPDRD). The non-spatial version of this technique has recently rekindled strong interest in prehistoric demography, as the increasing availability of large collections of radiocarbon dates is providing a new proxy for inferring past population trajectories within
an absolute chronological framework. While the core assumptions of this “dates as data” (Rick, 1987)
approach are still being discussed, it is undeniable that SPDRD is quickly becoming part of the standard
toolkit in regional studies. In particular, the production of demographic time-­series within an absolute
chronology is opening new possibilities to infer the role of past climatic change (e.g. Kelly, Surovell,
Shuman, & Smith, 2013; Warden et al., 2017) or to explore cross-­regional divergences in demographic
trajectories (e.g. Timpson et al., 2014; Crema, Habu, Kobayashi, & Madella, 2016), potentially at the
global level (Chaput & Gajewski, 2016).
The possibility of incorporating a spatial dimension is particularly noteworthy here as it requires a careful balance between sample size and the spatial extent of the window of analysis. Because the shape of the SPDRD is subject to sampling error, a formal assessment of its shape (i.e. the hypothesised demographic trajectories) will require a sufficient number of radiocarbon dates. While threshold sample sizes have been proposed (e.g. Williams, 2012), the optimal size ultimately depends on the specific null hypothesis being tested (the most common ones being exponential and logistic population growth) and the effect size being sought. Other things being equal, the most straightforward way to increase the sample size is to expand the window of analysis. This, however, means that
stationarity is harder to justify as different regions are likely to experience heterogeneous demographic
histories (cf. sub-­regions in Shennan et al., 2013 and Timpson et al., 2014) as well as different sampling
strategies (see Figure 9.1, Bevan et al., 2017; see also Banning, this volume). The latter in particular
hinders the straightforward application of methods such as Kernel Density Estimates (KDE; see Bevan,
this volume), as the number of radiocarbon dates is determined at least in part by local differences in
sampling intensity. Attempts to overcome this issue have been rare, with the notable exception of Chaput
and Gajewski (2016) who employ relative risk surfaces (see also supplementary materials in Bevan et al.,
2017) by taking the ratio of each KDE map to the overall sampling intensity. While this approach is a valuable correction to the observed pattern, it does not distinguish genuine instances of spatial heterogeneity from variations arising from sampling error.
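
One way to sketch this kind of correction in spatstat, assuming two hypothetical point patterns sharing the same window (dated, the sites with radiocarbon dates, and allsites, all recorded sites):

library(spatstat)
bw <- bw.diggle(allsites)  # a common kernel bandwidth for both surfaces
# Ratio of the KDE of dated sites to the KDE of overall sampling intensity:
rr <- eval.im(density(dated, sigma = bw) / density(allsites, sigma = bw))
plot(rr)  # values > 1: more dates than the overall sampling would predict

spatstat's relrisk() function provides a related ratio estimator for marked point patterns.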
Crema, Bevan, and Shennan (2017) have recently explored this issue by developing a local spatial
analysis designed to identify the presence of spatial heterogeneity in the demographic trajectories hypothesised from the SPDRDs, enabling the formal assessment of non-stationarity. The method involves the following six steps (for the full description see the original paper; a minimal sketch of the permutation logic is given after the list):

1) Compute for each site i a local SPDRD, created by summing all radiocarbon date probabilities but weighting (using an exponential decay function) the contribution of dates from neighbouring sites as a function of their distance from i.
2) Define temporal slices (e.g. 7500–7001 cal BP, 7000–6501 cal BP, etc.) and compute the geometric
growth rate between abutting pairs for each local SPDRD (e.g. between 7500–7001 and 7000–6501
cal BP, between 7000–6501 cal BP and 6500–6001 cal BP, and so on . . .).
3) Randomly permute the spatial coordinates of the radiocarbon dates, so that the entire set of dates associated with a particular location x is given a new location y, and then execute steps 1 and 2 above.
4) Repeat step 3 n times, so that for each transition (e.g. from 7500–7001 to 7000–6501 cal BP) at each
site, there is one observed geometric growth rate (obtained in steps 1–2), and n simulated geometric
growth rates (obtained in step 3). The latter is the expected pattern under the assumption of spatial
stationarity (i.e. the same expected growth rate across space with variation entirely determined by
sampling error). Notice that the envelope of the simulated rates will be narrower in regions with a
higher sampling intensity and wider in areas with a lower sampling intensity.

5) Compare the observed and simulated growth rates for each location and compute the p-­value for
significance testing, equivalent to (r+1)/(n+1) where r is the number of replicates where the simu-
lated growth rate is lower (or higher) than the observed rate.
6) Use the distribution of p-­values to compute false discovery rates (q-­values, Benjamini & Hochberg,
1997) to take into account expected inflation of type I error (i.e. incorrect rejection of a true null
hypothesis) due to multiple hypothesis testing.
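
The full procedure is implemented in the rcarbon package (Bevan & Crema, 2018; see the package documentation for the exact interface). The base-R fragment below is only a sketch of the permutation logic of steps 3-5, assuming a hypothetical helper local_growth(coords, dates) that implements steps 1-2 and returns a sites-by-transitions matrix of local geometric growth rates:

nsim <- 1000
obs <- local_growth(coords, dates)  # steps 1-2 (hypothetical helper)
sim <- array(NA, dim = c(dim(obs), nsim))
for (s in 1:nsim) {
  perm <- sample(length(dates))  # step 3: reassign each per-site set of
  sim[, , s] <- local_growth(coords, dates[perm])  # dates to a new location
}
# Step 5: one-sided p-value for a positive departure at site i, transition j:
pval <- function(i, j) (sum(sim[i, j, ] >= obs[i, j]) + 1) / (nsim + 1)
# Step 6: convert p-values to false discovery rates, e.g. p.adjust(p, "BH")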

Figure 9.4 shows the result of this local analysis applied in the context of Neolithic Europe. The red
dots indicate site locations with a significant (q-­value < 0.05) local positive departure from the expected
growth rate under stationarity in the transition between 6500–6001 cal BP and 6000–5500 cal BP (transi-
tion IV), whilst the blue dots indicate the opposite (lower than expected rate). If all regions experienced
similar population trajectories (as inferred from the density of radiocarbon dates) and local variations in
the SPD were purely the result of sampling error, we would not expect to observe any significant positive
or negative departures. The insets show the result of two particular locations where the observed growth
rate (solid line with filled dots) is, respectively, higher and lower than the expected rates under stationarity (dashed line with hollow dots) and its associated simulation envelope (grey region) obtained from 10,000 permutations. The result indicates statistically significant instances of spatial heterogeneity in growth rates, with southern Britain, southern Ireland, the Baltic regions, and parts of central Germany experiencing higher growth rates while most of continental Europe within the study area shows the opposite pattern.

Figure 9.4 Local spatial permutation test of the summed probability distribution of radiocarbon dates (SPDRD) from Neolithic Europe showing locations with higher (red) or lower (blue) geometric growth rates than the expectation from the null hypothesis (i.e. spatial homogeneity in growth trajectories) at the transition period between 6500–6001 and 6000–5501 cal BP. The insets on the right show the observed local geometric growth rates and the simulation envelope for locations a and b on the map (see Crema et al., 2017, for details). A colour version of this figure can be found in the plates section.

Conclusion
The substantial heterogeneity in the objectives, the types of data, and the scales of analysis makes the application of spatial analysis in archaeology a challenging and diverse task. Techniques are mostly developed
in other fields and come with assumptions that were valid for the particular contexts they were designed
for. Whilst generalised tools are highly desirable, the underpinning assumptions are not easily transfer-
rable across different applications. The problem is exacerbated by the fact that too often we ignore the
assumptions and their implications entirely, leading to a divergence between archaeological theories and
spatial models.
The problem of non-­stationarity is a good example of this; the majority of spatial statistics used in archae-
ology assume spatial homogeneity, yet the theoretical stance and interest of archaeologists is often focused
much more on heterogeneity. Despite the availability of a substantial range of techniques that are designed
to tackle non-­stationarity (or to model spatially heterogeneous processes), archaeological applications are
comparatively rare with global statistics still being the most commonly adopted approach. The growing
amount of high quality data at increasingly larger spatial scales might however change this and promote the
use of local spatial analysis. This will no doubt provide new perspectives on the human past, enabling us to
answer questions that are perhaps in line with a wider range of theoretical approaches. Such a shift in scale
will, however, require the creation of more tailored techniques as well as the retrieval of data that can provide
the basis for exploring the effects of research bias. It is undeniable that with the increasing possibility to
engage with larger spatial scales, we will have to face the impact of heterogeneous research practices. These
will have a greater role in shaping the distributions we observe, hindering our ability to isolate the patterns
we truly seek to study. The adoption of local statistics can help this endeavour but it is worth noting that
these are ultimately exploratory tools and can never replace a global model where key missing variables are
correctly integrated. Detecting spatial heterogeneity tells us only that there is something missing; we might estimate where and to some extent even how, but it will never tell us what the missing variables are. Furthermore, one should also avoid the temptation to rely exclusively on the inductive insights offered by the output of local analyses and to conceive of them as the final stage of a research workflow. This is particularly so because the number of statistical hypotheses is generally equal to the number of observations. As a consequence, there is an increased possibility of incorrectly rejecting the null hypothesis even when it is true (type I error). This is
a known problem and one that cannot be easily solved by standard correction methods, such as Bonferroni,
as tests are not entirely independent from each other and consequently an indiscriminate use of p-­value
adjustment can lead to overly conservative conclusions (i.e. type II errors). This is also a known issue within
the literature of local spatial analysis (e.g. de Castro & Singer, 2006), and while some suggestions have been
proposed there is no consensus towards a single solution. Ultimately, local analyses should not be considered
as substitutes for global statistics but rather as a suite of complementary tools for evaluating assumptions,
providing clues for searching for missing variables, and refining hypotheses.
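
In R, for instance, the contrast between the two corrections can be inspected directly with the base function p.adjust(); the p-values below are simulated placeholders:

set.seed(1)
p <- runif(500)^2  # hypothetical vector of local p-values
sum(p.adjust(p, method = "bonferroni") < 0.05)  # very conservative
sum(p.adjust(p, method = "BH") < 0.05)  # Benjamini-Hochberg false discovery rate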

Acknowledgements
I would like to thank Mark Gillings, Gary Lock, and Piraye Hacıgüzeller for inviting me, providing constructive feedback on the manuscript, and above all for being patient. I am also grateful for endless discussions on these topics with a number of colleagues, in particular Andrew Bevan, Mark Lake, and Alessio Palmisano. Analyses were performed using the spatstat (Baddeley, Rubak, & Turner, 2015) and rcarbon (Bevan & Crema, 2018) packages within the R statistical computing language (R Core Team, 2018).

References
Anselin, L. (1995). Local indicators of spatial association–LISA. Geographical Analysis, 27, 93–115.
Baddeley, A. (2017). Local composite likelihood for spatial point processes. Spatial Statistics, 22, 261–295.
Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial point patterns: Methodology and applications with R. London: Chap-
man and Hall/CRC Press.
Bailey, T. C., & Gatrell, A. C. (1995). Interactive spatial data analysis. Harlow: Prentice Hall.
Benjamini, Y., & Hochberg, Y. (1997). Multiple hypotheses testing with weights. Scandinavian Journal of Statistics, 24, 407–418.
Bevan, A. (2012). Spatial methods for analysing large-scale artefact inventories. Antiquity, 86, 492–506.
Bevan, A., Colledge, S., Fuller, D., Fyfe, R., Shennan, S., & Stevens, C. (2017). Holocene fluctuations in human population demonstrate repeated links to food production and climate. Proceedings of the National Academy of Sciences of the United States of America, 114(49), E10524–E10531.
Bevan, A., & Conolly, J. (2009). Modelling spatial heterogeneity and nonstationarity in artifact-rich landscapes. Journal of Archaeological Science, 36, 956–964.
Bevan, A., & Crema, E. R. (2018). Rcarbon v1.2.0: Methods for calibrating and analysing radiocarbon dates. Retrieved from https://CRAN.R-project.org/package=rcarbon
Biagetti, S., Merlo, S., Adam, E., Lobo, A., Conesa, F. C., Knight, J., . . . Madella, M. (2017). High and medium resolution satellite imagery to evaluate late Holocene human–environment interactions in arid lands: A case study from the Central Sahara. Remote Sensing, 9, 351.
Carrer, F. (2013). An ethnoarchaeological inductive model for predicting archaeological site location: A case-­study of
pastoral settlement patterns in the Val di Fiemme and Val di Sole (Trentino, Italian Alps). Journal of Anthropological
Archaeology, 32, 54–62.
Carleton, W. C., Conolly, J., & Iannone, G. (2012). A locally-­adaptive model of archaeological potential (LAMAP).
Journal of Archaeological Science, 39, 3371–3385.
Chaput, M. A., & Gajewski, K. (2016). Radiocarbon dates as estimates of ancient human population size. Anthropo-
cene, 15, 3–12.
Chaput, M. A., Kriesche, B., Betts, M., Martindale, A., Kulik, R., Schmidt, V., & Gajewski, K. (2015). Spatiotemporal
distribution of Holocene populations in North America. Proceedings of the National Academy of Sciences of the United
States of America, 112(39), 12127–12132.
Clark, P. J., & Evans, F. C. (1954). Distance to nearest neighbour as a measure of spatial relationships in populations.
Ecology, 35, 445–453.
Crema, E. R., Bevan, A., & Lake, M. (2010). A probabilistic framework for assessing spatio-­temporal point patterns
in the archaeological record. Journal of Archaeological Science, 37, 1118–1130.
Crema, E. R., Bevan, A., & Shennan, S. (2017). Spatio-­temporal approaches to archaeological radiocarbon dates.
Journal of Archaeological Science, 87, 1–9.
Crema, E. R., & Bianchi, E. (2013). Looking for patterns in the noise: Non-site spatial analysis at Sebkha Kelbia, Tunisia. In S. Mulazzani (Ed.), Le Capsien de Hergla (Tunisie): Culture, environnement et économie (pp. 385–395). Frankfurt: Africa Magna.
Crema, E. R., Habu, J., Kobayashi, K., & Madella, M. (2016). Summed probability distribution of 14C dates suggests regional divergences in the population dynamics of the Jomon Period in Eastern Japan. PLoS One, 11. doi:10.1371/journal.pone.0154809
de Castro, M. C., & Singer, B. H. (2006). Controlling the False Discovery Rate: A new application to account for
multiple and dependent tests in local statistics of spatial association. Geographical Analysis, 38, 180–208.
Eerkens, J. W., Spurling, A. M., & Gras, M. A. (2008). Measuring prehistoric mobility strategies based on obsidian
geochemical and technological signatures in the Owens Valley, California. Journal of Archaeological Science, 35,
668–680.

Eve, S., & Crema, E. R. (2014). A house with a view? Multi-­model inference, visibility fields, and point process analy-
sis of a Bronze Age settlement on Leskernick Hill (Cornwall, UK). Journal of Archaeological Science, 43, 267–277.
Fortin, M.-­J., & Dale, M. (2005). Spatial analysis: A guide for ecologists. Cambridge: Cambridge University Press.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (1998). Geographically weighted regression: A natural evolution
of the expansion method for spatial data analysis. Environment and Planning A, 30, 1905–1927.
Fotheringham, S. A., Brunsdon, C., & Charlton, M. (2000). Quantitative geography: Perspectives on spatial data analysis.
London: Sage Publications.
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002). Geographically weighted regression: The analysis of spatially
varying relationships. Chichester: John Wiley & Sons.
Gaffney, C. F., & van Leusen, P. M. (1995). Postscript–GIS, environmental determinism and archaeology. In G. Lock &
Z. Stančić (Eds.), Archaeology and geographic information systems (pp. 367–382). London: Taylor and Francis.
Getis, A., & Aldstadt, J. (2010). Constructing the spatial weights matrix using a local statistic. Geographical Analysis,
36, 90–104.
Getis, A., & Franklin, J. (1987). Second-­order neighborhood analysis of mapped point patterns. Ecology, 68, 473–477.
Getis, A., & Ord, J. K. (1992). The analysis of spatial association by use of distance statistics. Geographical Analysis,
24, 189–206.
Gil, A. F., Ugan, A., Otaola, C., Neme, G., Giardina, M., & Menéndez, L. (2016). Variation in camelid δ13C and δ15N
values in relation to geography and climate: Holocene patterns and archaeological implications in central western
Argentina. Journal of Archaeological Science, 66, 7–20.
Gkiasta, M., Russell, T., Shennan, S., & Steele, J. (2003). Neolithic transition in Europe: The radiocarbon record
revisited. Antiquity, 77, 45–62.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge: Cambridge University Press.
Jones, J. P., & Casetti, E. (1992). Applications of the expansion method. London: Routledge.
Jones, J. P., & Hanham, R. Q. (1995). Contingency, realism, and the expansion method. Geographical Analysis, 27,
185–207.
Kelly, R. L., Surovell, T. A., Shuman, B. N., & Smith, G. M. (2013). A continuous climatic impact on Holocene
human population in the Rocky Mountains. Proceedings of the National Academy of Sciences of the United States of
America, 110(2), 443–447.
Löwenborg, D. (2010). Using geographically weighted regression to predict site representativity. In B. Frischer,
J. W. Crawford, & D. Koller (Eds.), Making history interactive, proceedings of CAA conference, 37th annual meeting
(pp. 203–215), Williamsburg, VA and Oxford: Archaeopress.
Manning, K., Stopp, B., Colledge, S., Downey, S., Conolly, J., Dobney, K., & Shennan, S. (2013). Animal exploitation
in the early Neolithic of the Balkans and central Europe. In S. Colledge, J. Conolly, K. Dobney, K. Manning, &
S. Shennan (Eds.), The origins and spread of domestic animals in southwest Asia and Europe (pp. 237–252). Walnut
Creek, CA: Left Coast.
Martindale, A., Morlan, R., Betts, M., Blake, M., Gajewski, K., Chaput, M., . . . Vermeersch, P. (2016). Canadian
Archaeological Radiocarbon Database (CARD 2.1). Retrieved April 1, 2017, from www.canadianarchaeology.ca/
Menze, B. H., Ur, J. A., & Sherratt, A. G. (2006). Detection of ancient settlement mounds. Photogrammetric Engineer-
ing & Remote Sensing, 72, 321–327.
Openshaw, S., Charlton, M., Wymer, C., & Craft, A. (1987). A Mark 1 geographical analysis machine for the auto-
mated analysis of point data sets. International Journal of Geographical Information Systems, 1, 335–358.
Orton, C. (2004). Point pattern analysis revisited. Archeologia e Calcolatori, 15, 299–315.
Palmisano, A. (2013). Zooming patterns among the scales: A statistics technique to detect spatial patterns among
settlements. In G. Earl, T. Sly, A. P. Chrysanthi, P. Murrieta-­Flores, C. Papadopoulos, I. Romanowska, & D. Wheat-
ley (Eds.), CAA 2012: Proceedings of the 40th Annual Conference of Computer Applications and Quantitative Methods
in Archaeology (CAA) (pp. 348–356). Amsterdam: Amsterdam University Press.
Pélissier, R., & Goreaud, F. (2001). A practical approach to the study of spatial structure in simple cases of heteroge-
neous vegetation. Journal of Vegetation Science, 12, 99–108.
Pinhasi, R., Fort, J., & Ammerman, A. J. (2005). Tracing the origin and spread of agriculture in Europe. PLoS Biol-
ogy. doi:10.1371/journal.pbio.0030410
Premo, L. (2004). Local spatial autocorrelation statistics quantify multi-­scale patterns in distributional data: An
example from the Maya Lowlands. Journal of Archaeological Science, 31, 855–866.

R Core Team. (2018). R: A language and environment for statistical computing: R Foundation for Statistical Computing.
Vienna, Austria. Retrieved from: www.R-­project.org/
Rick, J. W. (1987). Dates as data: An examination of the Peruvian preceramic radiocarbon record. American Antiquity, 52, 55–73.
Ripley, B. D. (1976). The second-­order analysis of stationary point processes. Journal of Applied Probability, 13,
255–266.
Riris, P. (2017). Towards an artefact’s-­eye view: Non-­site analysis of discard patterns and lithic technology in Neotrop-
ical settings with a case from Misiones province, Argentina. Journal of Archaeological Science: Reports, 11, 626–638.
Shennan, S., Downey, S. S., Timpson, A., Edinborough, K., Colledge, S., Kerig, T., . . . Thomas, M. G. (2013). Regional
population collapse followed initial agriculture booms in mid-­Holocene Europe. Nature Communications, 4.
doi:10.1038/ncomms3486
Stolar, J., & Nielsen, S. E. (2015). Accounting for spatially biased sampling effort in presence-­only species distribution
modelling. Diversity and Distributions, 21, 595–608.
Styring, A., Maier, U., Stephan, E., Schlichtherle, H., & Bogaard, A. (2016). Cultivation of choice: New insights into
farming practices at Neolithic lakeshore sites. Antiquity, 90, 95–110.
Syfert, M. M., Smith, M. J., & Coomes, D. A. (2013). The effects of sampling bias and model complexity on the
predictive performance of MaxEnt species distribution models. PLoS One. doi:10.1371/journal.pone.0055158
Timpson, A., Colledge, S., Crema, E., Edinborough, K., Kerig, T., Manning, K., . . . Shennan, S. (2014). Reconstruct-
ing regional population fluctuations in the European Neolithic using radiocarbon dates: A new case-­study using
an improved method. Journal of Archaeological Science, 52, 549–557.
Warden, L., Moros, M., Neumann, T., Shennan, S., Timpson, A., Manning, K., . . . Damsté, J. S. S. (2017). Climate
induced human demographic and cultural change in northern Europe during the mid-­Holocene. Scientific Reports,
7, 15251.
Williams, A. N. (2012). The use of summed radiocarbon probability distributions in archaeology: A review of meth-
ods. Journal of Archaeological Science, 39, 578–589.
10
Spatial fuzzy sets
Johanna Fusco and Cyril de Runz

Introduction
Archaeologists have, for many years, sought more and more data in order to accurately reflect and retrace
the spatial and temporal dynamics of past societies. We have now entered the era of ‘big data’ and are
becoming used to dealing with archaeological datasets that have grown “undigested” (Orton, 2010,
p. 4; see also Bevan, 2015; Cooper & Green, 2015; Green this volume). These datasets merge together
huge amounts of heterogeneous and fragmentary information from various sources, datasets and surveys
(Cooper & Green, 2015; McCoy, 2017), which are rarely at the same scale (Gattiglia, 2015) and do not
stem from the same interpretative and methodological frameworks (Cooper & Green, 2015; Roskams &
Whyman, 2007). Take, for example, dating, which is significantly affected by the inconsistency of chrono-
logical systems, methods and interpretations between surveys, even within one single archaeological site
(Kennedy & Hahn, 2017). This heterogeneity generates a high variability in data accuracy and reliability
within datasets, which impacts the quality and the reliability of models and analyses.
On the bright side, if this accumulation of heterogeneous data has done little to improve the accuracy of our statements about past phenomena, it has contributed to highlighting the various aspects of imperfection that exist within archaeological information. While successful attempts at unifying interpretative
schemes and making datasets comparable have been put forward (Kennedy & Hahn, 2017; Roskams &
Whyman, 2007), this awareness has fuelled an added sense of urgency to tackle imperfection and to adopt
more sensitive methodological and theoretical approaches to past spatiotemporal phenomena. These new
approaches prevent archaeologists from rushing into inappropriate interpretation, and from confound-
ing the complexity of past phenomena with the analytical biases and complexities created by our own
failure to both manage dataset size and imperfection and shift towards more flexible methodological and
ontological frameworks (Bevan, 2015; Brouwer Burg, 2017). If swept under the carpet, data imperfec-
tion spreads throughout analyses, results and interpretation. It then grows out of control, and prevents
us from assessing the validity of our conclusions or from directly comparing situations and phenomena.
The extreme opposite approach is to remove from the analysis any data that do not display the required
quality level (see also Gupta, this volume). However, this upward standardisation process often results in a
great loss of information which might potentially impact the statistical representativeness of the sample.
We would argue that a more profitable pathway emerges from thinking within imperfection, making it
intelligible not only at the data level but at every step of analysis and data treatment and even in our own
reasoning schemes.
According to Veregin (1989), Plewe (2002) and Fisher (2005), identifying and classifying sources of
error, and elaborating a deep theoretical basis on data imperfection and its consequences on models and
analyses, is an indispensable step for managing it correctly, whether it concerns the attributes of the con-
sidered object, or its spatial and temporal dimensions. While an extensive literature on spatial and geo-­
historical data imprecision exists (see Longley, Goodchild, Maguire, & Rhind, 2005; Plewe, 2002; Zhang &
Goodchild, 2002 for extensive discussion and review), Peter Fisher’s classification on data imperfection
(Fisher, 2005) is frequently considered to be one of the clearest and most useful. Fisher breaks down the
various dimensions of what we often call ‘imperfection’ or ‘uncertainty’ in a broad sense, into the specific
concepts of vagueness, incompleteness, uncertainty and ambiguity amongst other terms, in order to facilitate
their detection among datasets, and to anticipate their potential consequences on analyses. We propose to
synthesize this classification while focusing on its adaptation to the classical imperfections of archaeologi-
cal data (for further discussion on the following classification see Fisher, 2005; Fisher, Comber, & Wad-
sworth, 2006; Longley, Goodchild, Maguire, & Rhind, 2005; Plewe, 2002; de Runz, Desjardin, Piantoni, &
Herbin 2011 in the specific context of archaeological data).
What we perceive about past human activities is limited to the materials that cross the ages to reach us,
and obviously to the areas investigated and methods used. This “lack of evidence” (Plewe, 2002, p. 11) is
referred to here as incompleteness; it prevents archaeological objects, larger structures or even regions and time periods from being completely described, and prevents us from perceiving their functioning at different scales.
Our knowledge of the spatial, temporal and functional aspects of an archaeological object might be
questioned by the reliability of the source, by measuring or processing errors, or by our own interpreta-
tion and classification of the object. All of these cause uncertainty in archaeological datasets. For example,
the spatial (location) or temporal (period of time) measure of an archaeological object may simply be
wrong, an error may occur in the coding of these attributes, or the object may simply be assigned to the wrong class. The validity of the information and knowledge concerning the object is thus uncertain.
Imprecision occurs when the boundaries of the categories used to classify archaeological objects are
inaccurately defined. This imprecision can be qualified as vagueness when we use subjective knowledge or
inaccurate measuring instruments. Most archaeological categories are inherently vague: relative chronolo-
gies (see Crema, 2012; Desachy, 2012; Kennedy & Hahn, 2017; Niccolucci & Hermon, 2015 for extensive
discussion on time classification and chronological inconsistency); typological classifications (Hermon &
Niccolucci, 2002); or even the use of what Zadeh (1975) calls “linguistic variables”, i.e. “variables whose
values are not numbers but words or sentences in a natural or artificial language” (Zadeh, 1975, p. 3).
These variables are mostly used in predictive modelling, to refer to 'high', 'medium' and 'low' potentialities of finding settlement, or 'preferred', 'indifferent' and 'avoided' areas (Balla, Pavlogeorgatos, Tsiafakis, & Pavlidis,
2013; Jaroslaw & Hildebrandt-­Radke, 2009; Vaughn & Crawford, 2009). Extending that logic brings us
to question several categories that might seem so common, so trivial, we often actually forget they are
categories and take them for granted:

When, exactly, is a house a house; a settlement, a settlement; a city, a city; a podsol, a podsol; an oak
woodland, an oak woodland? The questions always revolve around the threshold value of some
measurable parameter or the opinion of some individual, expert or otherwise.
(Fisher, 2005, p. 7)

Ambiguity intervenes when there is doubt in the definition of an archaeological object or a phenom-
enon, i.e. when they might belong to several categories or scales, or when their description is subject to
opposing interpretations. Ambiguity is thus a specific form of imperfection combining uncertainty and
imprecision. It is typical of urban archaeology, which is characterized by an intense superposition and
frequent reutilization of remains (de Runz et al., 2011).
Even though the ‘classic’ probabilistic framework developed in predictive archaeology and chrono-
logical construction – notably the aoristic approach (Crema, Bevan, & Lake, 2010; Johnson, 2004) and
Bayesian networks (Buck, 2004; Lanos & Philippe, 2015; Litton & Buck, 1995) – has been investigated by many in archaeology (Crema, 2012), this framework is not the best suited to modelling imprecision.
Indeed, as probabilities have a frequency interpretation, they are not appropriate to tackle imprecision.
For instance, a 35-year-old man may be partially considered young – at least under a general definition of young – even though he is older than a 20-year-old man. Thus, if one were to model the concept 'young' with probabilities and assign a probability of 0.75 to the partially young 35-year-old man, this would mean that 75% of 35-year-old people are considered young and the other 25% are not, which makes no sense. By introducing fuzzy logic, Zadeh proposed an alternative to probabilities that quantifies and estimates how much a value of a set characterizes the concept associated with it, via a membership value. With fuzzy logic, giving the 35-year-old man a membership value of 0.75 to the concept 'young' means that he is young but not fully young. Niccolucci and Hermon (2015)
extensively discuss this argument and show its relevance in the field of archaeology. They argue that

fuzziness is about an imprecise concept; probability is about the unknown condition of a precise
concept. (. . .) a probabilistic model assumes that something is either true or false, and we do not
know which is the case, for various reasons: incomplete information about the past, or because the
condition concerns the future, etc. A fuzzy model concerns instead those situations in which ask-
ing if something is true or false is meaningless, because there are various degrees of being true (and
false), even concerning cases which can be thoroughly inspected.
(Niccolucci & Hermon, 2015, p. 69).

Such cases are numerous in archaeology, whether they concern dating, the prediction of archaeological site location, or the classification of archaeological artefacts. Thus, several authors have recently adopted
fuzzy logic, and consider it a more adapted framework to tackle archaeological imperfection (Hatziniko-
laou, 2006; Niccolucci & Hermon, 2004; Niccolucci & Hermon, 2015). Hatzinikolaou, Hatzichristos,
Siolas, and Mantzourani (2003) investigated fuzzy logic potential for predictive archaeology and applied
it on Melos Island, Greece, in order to estimate the degree to which archaeological objects belonged to
a set of functional and cultural categories. Banerjee, Srivastava, Pike, and Petropoulos (2018) also used
it for predictive archaeology to identify possible rock art sites in Central India, while Balla et al. (2012)
exploited it for finding Macedonian tombs in Northern Greece. Taking a similar approach, Hermon and
Niccolucci (2002, 2003) have built fuzzy typologies in order to define archaeological artefacts through
object classes with indefinite boundaries, and have also applied this approach to temporal classification
(Farinetti, Hermon, & Niccolucci, 2004; Niccolucci & Hermon, 2015) while Niccolucci, D’andrea, and
Crescioli (2001) use it to assign a reliability coefficient to imprecise attributes of statistical data deriving
from archaeometry. As the flexibility inherent to fuzzy logic makes it easily combinable with other meth-
ods, Machàlek, Cimler, Olševičová, and Danielisová (2013) associated it with agent-­based modelling to
describe patterns of agricultural use in Iron Age Europe, and Baxter (2009) used it in combination with
cluster analysis. While fuzzy logic is, as we have seen, increasingly used in archaeological data analysis,
Niccolucci and Hermon (2017) have made a compelling case for its upstream use, embedding it from the
documentation phase onwards, in order to include reliability assessments from the very beginning of the
data collection and gathering process.

Method
As argued above, imprecision should be considered in the modelling of information and, as shown by the Sorites paradox,1 probabilistic modelling is not well adapted to tackling imprecision. Zadeh (1965, 1978)
thus introduced fuzzy set theory, which defines the notion of partial and valued membership of a value to
a class. A fuzzy set A (also called a Type-1 fuzzy set) is characterized by a membership function µA taking values in [0, 1]. For each domain value x, a membership degree µA(x) in [0, 1] is defined. Therefore, concepts like 'young' (see Figure 10.1), 'old', etc. can easily be modelled by fuzzy sets. If the membership degree is equal to 1, then the domain value fully belongs to the concept; if the membership degree is equal to 0, then the domain value does not belong to it. Finally, if the membership degree falls somewhere in between, then the domain value partially belongs to it. So, for instance, according to Figure 10.1, a 2-year-old is young, a 50-year-old is not young, and a 30-year-old is neither fully young nor fully not young but somewhere in between.
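
A minimal R sketch of such a membership function; the breakpoints of 20 and 40 years are assumptions standing in for the curve drawn in Figure 10.1:

# Linear ramp: fully 'young' up to age 20, definitely not 'young' from age 40:
mu_young <- function(age, full = 20, none = 40) {
  pmin(1, pmax(0, (none - age) / (none - full)))
}
mu_young(c(2, 30, 50))  # returns 1.0, 0.5, 0.0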
Two critically important concepts in fuzzy set theory are connectivity and the notion of the Alpha-­cut
(α-cut). An α-cut Aα, for all α > 0, is the set of domain values x having a membership value greater than or equal to α (µA(x) ≥ α; Figure 10.2). By convention, A0 is the set of x such that µA(x) > 0 and is also called the support of the fuzzy set A; A1 is the core. If A1 is not empty, then the fuzzy set A is referred to as normalized.
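
Continuing the sketch above, α-cuts, core and support can be extracted by thresholding the membership degrees over a grid of domain values:

alpha_cut <- function(mu, x, alpha) x[mu(x) >= alpha]
ages <- 0:60
range(alpha_cut(mu_young, ages, 1))     # the core A1 (here ages 0-20)
range(alpha_cut(mu_young, ages, 0.5))   # the alpha-cut A0.5 (ages 0-30)
range(alpha_cut(mu_young, ages, 1e-9))  # approximates the support A0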
In a spatial context, fuzzy geometries refer to ‘fuzzy polygons’, ‘fuzzy lines’ and ‘fuzzy points’, cor-
responding to spatial objects whose boundaries cannot be determined accurately (Figure 10.3). In order
to obtain consistent fuzzy sets for these spatial shapes, the second important concept to be considered is
convexity for 1-­dimensional fuzzy sets, and connectivity for fuzzy sets defined on higher dimensions.
By definition, a fuzzy set A is connected if, and only if, for all α in [0; 1] Aα is connected. Therefore,
a fuzzy set A is considered to be connected if, and only if, for all α in [0; 1], each pair (x, y) of domain
values of Aα can be connected by a path included in Aα, meaning that Aα is not composed of separate sets

Figure 10.1 Illustration of a possible fuzzy definition of the concept ‘young’ for humans.
Spatial fuzzy sets 173

Figure 10.2 Interpretation examples of membership values, and illustration of the α-­cut concept in the case
where the fuzzy set is representing a possibility distribution. For instance, the domain value subset [v9, v10] is
the core (the α-­cut A1) of the fuzzy set and means very possible, while [min, max] is the support (the α-­cut
A0) and means almost impossible. Any domain values outside [min, max] are impossible.
Source: after Zoghlami, de Runz, and Akdag (2016), Figure 5

(see Figure 10.4). In order to compute information from fuzzy sets, it is important (but not compulsory)
to deal with connected (or convex) normalized fuzzy sets. The connexity allows the modelling of simple
geographic shape (fuzzy points, fuzzy lines, fuzzy polygons) and the composition of connected fuzzy sets
may allow definition of more complex shapes.
There are several ways to combine fuzzy sets: arithmetic operations are possible (see Zadeh, 1965), as are logical operations. The main approach is to use t-norms (for the AND) and t-conorms (for the OR). The probabilistic t-norm is the product of the two membership values, and the corresponding t-conorm is their sum minus the value of the AND operation on them. Zadeh (1965) proposed using their minimum for the t-norm and their maximum for the t-conorm. For example, let A and B be two fuzzy sets with µA and µB their associated membership functions.

Probabilistic:

t-norm: $\mu_{A\ \mathrm{and}\ B}(x) = \mu_A(x) \cdot \mu_B(x)$

t-conorm: $\mu_{A\ \mathrm{or}\ B}(x) = \mu_A(x) + \mu_B(x) - \mu_{A\ \mathrm{and}\ B}(x) = \mu_A(x) + \mu_B(x) - \mu_A(x)\mu_B(x)$

Zadeh:

t-norm: $\mu_{A\ \mathrm{and}\ B}(x) = \min(\mu_A(x), \mu_B(x))$

t-conorm: $\mu_{A\ \mathrm{or}\ B}(x) = \max(\mu_A(x), \mu_B(x))$
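As a sketch, the four operators can be written directly, here for membership degrees already evaluated at some domain value x:

```python
def t_norm_prob(a, b):       # probabilistic AND: product of the two degrees
    return a * b

def t_conorm_prob(a, b):     # probabilistic OR: sum minus the product
    return a + b - a * b

def t_norm_zadeh(a, b):      # Zadeh AND: minimum of the two degrees
    return min(a, b)

def t_conorm_zadeh(a, b):    # Zadeh OR: maximum of the two degrees
    return max(a, b)

# e.g. muA(x) = 0.6 and muB(x) = 0.4
print(t_norm_prob(0.6, 0.4), t_conorm_prob(0.6, 0.4))    # 0.24 0.76
print(t_norm_zadeh(0.6, 0.4), t_conorm_zadeh(0.6, 0.4))  # 0.4 0.6
```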
Figure 10.3 Illustration of a fuzzy wooded area. Experts determine 3 main areas: area 1, where the concept of
the wooded area is plainly respected (the α-­cut A1); area 2, which encloses area 1, where the concept is partially
respected (the α-­cut A0.5); and area 3, which encloses the two previous areas, representing the limits for which
the concept can at least partially be defined (the α-­cut A0). Considering several α-­cuts (at least 2, A1 and A0),
the membership degree of each domain value can be obtained: it is at least the highest degree of the α-­cuts it
belongs to and can potentially be obtained by spatial interpolation as well.
Source: after Zoghlami, de Runz, and Akdag (2016), Figure 3

However, using this definition, for each value in the domain set we obtain a unique value (its
membership degree) which may thus be considered as precise. According to Mendel (2003), it may be
paradoxical to consider that a precise value (a membership degree) can represent imprecision. As a result,
Type-2 fuzzy sets (Zadeh, 1975) have been introduced as an extension of Type-1 fuzzy sets, in order to
define values of membership that are themselves fuzzy. Thus, at each value of the primary variable (here,
for example, the number of sites), the membership degree is itself a function (e.g. an interval, in the case
of Interval Type-2 fuzzy sets), and not just a point value. In this elaborated approach the membership
function is blurred, and becomes a surface representing the “footprint of uncertainty” (Mendel & Bob
John, 2002).
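An Interval Type-2 fuzzy set can be sketched as a pair of Type-1 membership functions bounding the footprint of uncertainty; the breakpoints below are hypothetical and serve only to illustrate the idea:

```python
import numpy as np

def mu_young_lower(age):   # hypothetical lower bound of the footprint of uncertainty
    return np.clip((40.0 - age) / (40.0 - 15.0), 0.0, 1.0)

def mu_young_upper(age):   # hypothetical upper bound of the footprint of uncertainty
    return np.clip((50.0 - age) / (50.0 - 25.0), 0.0, 1.0)

def it2_membership(age):
    """Interval Type-2 membership: the degree is an interval, not a point value."""
    return (mu_young_lower(age), mu_young_upper(age))

print(it2_membership(30.0))   # -> (0.4, 0.8): the membership degree is itself imprecise
```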
Figure 10.4 Illustration of a connected spatial α-­cut.

Case studies
Here we present two examples, the first based on data modelling and the second on data analysis.
In the first example Type-1 fuzzy set modelling is used, while in the second Type-2 modelling is also
employed. In the second example, the issue of intermediate scales of analysis is also discussed.

FGISSAR: an urban archaeological GIS handling fuzziness in Reims (France)


An excavation site is both an archaeological entity and an aggregation of smaller archaeological entities.
The different databases exploited in this section have been produced by the SIGREM (Geographical
Information System of the city of Reims) project. The first one, BDRues, describes the excavation of Roman
streets in Reims. In this database, the entities are defined using several features, among which are their
geographic location (as a point), their orientation (as degrees) and their period of activity (as an interval).
F-BDRues is the fuzzified version of BDRues, in which objects are characterized by a fuzzy point, a fuzzy
orientation and a fuzzy date/period of time. The other one, the general GISSAR (Geographic Information
System for Spatial Analysis in aRchaeology) database, covers the whole set of information harvested
during excavations in Reims relating to the Roman period. In its fuzzified version, FGISSAR (Fuzzy
Geographic Information System for Spatial Analysis in aRchaeology), an archaeological entity is modelled
through a fuzzy polygon and a fuzzy period for its estimated period of activity.
If the aim of a given study is to query data (archaeological entities) in order to establish their anteriority
with respect to a given date or period of time, determining the chronological order between crisp
periods of time that may overlap, as for instance the periods [-100, 100] and [-150, 150], can be a difficult
issue (Allen, 1983). This is made more complex in a fuzzy context because there is no precise and absolute
definition of anteriority or posteriority between fuzzy periods or dates. The proposition, introduced
by de Runz, Desjardin, Piantoni, and Herbin (2010), is to moderate the decision, i.e. anterior or not, by
assigning it a confidence index. For that purpose, they defined an index which evaluates the anteriority
between two fuzzy periods using the areas of non-overlap (a, b, c and d) of the fuzzy periods F and G, as
presented in Figure 10.5 and defined as follows:

$\mathrm{Ant}(F, G) = \dfrac{b + c}{a + b + c + d}$

Considering the non-­common areas between µF and µG, this index defines the anteriority between
F and G by the ratio of the areas that validate the anteriority hypothesis over the total of non-­common
areas.
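A discretised reading of this index can be sketched as follows; the fuzzy periods are trapezoidal membership functions sampled on a common time axis, and the four non-overlap areas are separated at a user-supplied split point between the two cores, which is one plausible interpretation of the areas a, b, c and d in Figure 10.5:

```python
import numpy as np

def trapezoid_mf(t, a, b, c, d):
    """Trapezoidal fuzzy period: support [a, d], core [b, c]."""
    return np.clip(np.minimum((t - a) / (b - a), (d - t) / (d - c)), 0.0, 1.0)

def anteriority(t, mu_f, mu_g, t_split):
    """Ant(F, G) = (b + c) / (a + b + c + d), from the areas of non-overlap.
    Areas where F exceeds G before t_split, or G exceeds F after it, are
    taken to support the hypothesis 'F anterior to G' (a hedged reading)."""
    dt = t[1] - t[0]                              # uniform sampling step
    f_excess = np.clip(mu_f - mu_g, 0.0, None)    # F above G
    g_excess = np.clip(mu_g - mu_f, 0.0, None)    # G above F
    early = t < t_split
    b_area = f_excess[early].sum() * dt           # supports anteriority
    c_area = g_excess[~early].sum() * dt          # supports anteriority
    a_area = g_excess[early].sum() * dt           # counter-evidence
    d_area = f_excess[~early].sum() * dt          # counter-evidence
    total = a_area + b_area + c_area + d_area
    return (b_area + c_area) / total if total > 0 else 0.5

t = np.linspace(-300.0, 300.0, 1201)
mu_f = trapezoid_mf(t, -150, -100, 50, 100)       # fuzzy period F
mu_g = trapezoid_mf(t, -100, -50, 100, 150)       # fuzzy period G, shifted later
print(anteriority(t, mu_f, mu_g, t_split=0.0))    # 1.0: F everywhere 'leads' G here
```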
A sample output is shown in Figure 10.6 for the Reims (France) database on Roman street excavations
(F-­BDRues).
Building upon this idea of fuzzy data selection, Zoghlami, de Runz, Pargny, Desjardin, and Akdag
(2012) present some examples of spatiotemporal requests to find entities (e.g. Roman sites, streets, walls)
within the FGISSAR database that respect soft constraints. For instance, a given user may want to
select archaeological entities satisfying the two following constraints:

• Their activity falls in a specific period (e.g. the 2nd century) with a user-defined membership degree
of at least 0.4;
• Their shape belongs to a specific site (e.g. “PC 88”) with a membership degree of at least 0.8.

This request corresponds in fuzzy logic to:


ActivityPeriod(x) ~ mid 2nd century ≥ 0.4 AND Shape(x) ~ Shape(PC88) ≥ 0.8

In the proposed system, the Zadeh t-norm is used for the logical operator AND. The principle used to
model the BELONG operator is simple too. First, for each entity x, the algorithm determines the set of
coupled α-cuts from x and PC88 for which the α-cut of x is included in the α-cut of PC88. Then, the
degree of each couple is computed according to the Zadeh t-norm (the minimum of the two degrees).
After that, the final degree of the spatial relation is obtained by taking the maximum of the α-cut couple
degrees. The visualization of this query, which combines both spatial and temporal imperfections, is
illustrated in Figure 10.7.

Figure 10.5 Membership functions of two fuzzy periods/dates (f and g); a, b, c and d are the areas of
non-overlap.

Figure 10.6 Visualization of Roman streets in Reims that were anterior to the period “around 200 AD”
according to the confidence we have in the results.
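The BELONG operator described above lends itself to a compact sketch if each fuzzy polygon is represented by a few of its α-cuts stored as shapely geometries; the levels and buffered-point geometries below are placeholders, not the project’s data:

```python
from shapely.geometry import Point

def belong_degree(alpha_cuts_x, alpha_cuts_site):
    """Fuzzy BELONG: couple the alpha-cuts of x with those of the site, keep
    the couples where x's cut is included in the site's cut, score each couple
    with the Zadeh t-norm (minimum) and return the maximum of the scores.
    alpha_cuts_* : dict {alpha_level: shapely geometry}."""
    degrees = [min(ax, asite)
               for ax, gx in alpha_cuts_x.items()
               for asite, gsite in alpha_cuts_site.items()
               if gsite.contains(gx)]
    return max(degrees, default=0.0)

# Placeholder fuzzy shapes: buffered points standing in for digitised alpha-cuts.
x = {1.0: Point(8, 0).buffer(5), 0.5: Point(8, 0).buffer(10)}
pc88 = {1.0: Point(0, 0).buffer(10), 0.5: Point(0, 0).buffer(20)}
print(belong_degree(x, pc88))   # -> 0.5: only the wider alpha-cuts nest
```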
In de Runz, Desjardin, Piantoni, and Herbin (2014), a more complex fuzzy spatiotemporal query is
performed in order to extract the configuration of Roman streets in Reims during different fuzzy time
periods. We take as an example the 3rd century AD (Figure 10.8). The black lines represent the possible
design of the Roman streets and therefore inform us about the Roman road network in Reims during the
3rd century AD. In order to compute it, an adaptation of the Hough Transform, a pattern recognition
method, has been introduced, which considers separately the fuzzy period, the fuzzy shape and the fuzzy
orientation of each object in the database. For more details, please refer to de Runz et al. (2014).
This case study presents a simple application context for the use of a GIS in which fuzzy queries
allow data to be represented and selected or, in the last case, new layers to be produced. The associated
databases are built to handle archaeological data while taking its fuzziness into account.

Figure 10.7 Visualization of entities which have an activity dated to the “Middle of the 2nd century” and
which belong to site PC 88.
Source: after Zoghlami, de Runz, and Akdag (2016), Figure 29

Figure 10.8 Simulated map of Reims’ streets during the 3rd century AD generated from the fuzzy spatio­temporal
data stored in FGISSAR (Fuzzy Geographic Information System for Spatial Analysis in aRchaeology)
with an adaptation of a pattern recognition method (the Hough Transform). The darker the object, the higher
the possibility of its presence during the 3rd century AD.

“A bundle of past possibilities”: modelling spatiotemporal structures of past settlement with fuzzy
logic. Application to the Syrian Arid Margins during the Bronze Age (3600–1200 BC)

This case study proposes an exploratory method to model the spatiotemporal structures and dynamics
of settlements described by imperfect archaeological data (Fusco, 2015, 2016). It is based on the Bronze
Age Syrian Fertile Crescent’s “Arid Margins”,2 an area considered transitional between the steppe and
the desert that was occupied by people from the Neolithic period through to the present day (Geyer,
2011), and which is described by highly imperfect data (Figure 10.9) for three particular reasons. First,
several zones within this area have been poorly surveyed by archaeologists or not surveyed at all, leading
to data incompleteness (designated as ‘intensity of archaeological survey’ in Figure 10.9). Therefore,
while detailed systematic survey zones characterized by an absence of sites can be considered ‘unoccupied’
during the Bronze Age, the voids in unsurveyed or Google Earth survey zones are ‘ambiguous’. Second,
many sites display functional uncertainty, even in the well-surveyed zones, which means we cannot tell for
sure whether they were habitation sites or not. ‘Reliable sites’ are those whose habitat function is unambiguous
for archaeologists; they are displayed in full colour in Figure 10.9. ‘Unreliable sites’ are those
whose habitat function is more ambiguous; they are displayed using hatching. Third, the dating of
several Early Bronze Age sites is imprecise. Most of them are assigned to the last part of the sub-period
Early Bronze Age IV, from 2500 to 2000 BC (displayed in bright red in Figure 10.9), while some others
could not be dated as precisely. These undetermined sites have been assigned to the whole Early Bronze
Age period, from 3600 to 2000 BC (displayed in pale red in Figure 10.9).
This approach, presented by Fusco (2016), combines the theoretical frameworks of exploratory spatial
data analysis and fuzzy sets in a methodological chain (Figure 10.10) whose purpose is twofold: first, to
detect and model settlement spatiotemporal structures and dynamics in well-surveyed zones in order
to describe and partly explain site patterning, while taking into account the impact of data quality on
the results (steps 1 to 3 in Figure 10.10); and second, to use the information obtained about settlement
spatiotemporal structures in well-surveyed zones to make estimates and assumptions about potential
settlement locations in unsurveyed areas (step 4 in Figure 10.10).
Archaeological sites are not a random set of points; their location and patterning reflect specific land use
and management logics whose deciphering is fundamental to understanding the functioning of past
settlement and to modelling the potential of unsurveyed zones for hosting settlement. The first step of Fusco’s
methodology presented in Figure 10.10 serves to reveal the diversity of spatial patterns, while estimating
the impact of data imperfection on the results. Local spatial autocorrelation measures (Anselin,
1995) have been chosen in order to detect spatial clusters and outliers and to reveal homogeneous or more
atypical sub-areas in the location and evolution of archaeological sites. The Local Indicators of Spatial
Association3 (LISA), based on the Local Moran statistic (Anselin, 1995), have the ability to reveal the
spatial patterns created by local spatial autocorrelation, in the form of ‘spatial clusters’ and ‘spatial
outliers’ (Crema, this volume).
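The LISA in this chapter were computed with GeoDa (see note 3); an equivalent computation can be sketched with the PySAL libraries, here on hypothetical grid-cell centroids and site counts rather than the project’s data:

```python
import numpy as np
import libpysal
from esda.moran import Moran_Local

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(200, 2))    # hypothetical cell centroids
counts = rng.poisson(3, size=200).astype(float)    # hypothetical site counts per cell

w = libpysal.weights.KNN.from_array(coords, k=8)   # 8-nearest-neighbour weights
w.transform = "r"                                  # row-standardised

lisa = Moran_Local(counts, w, permutations=999)
sig = lisa.p_sim < 0.05                            # pseudo-significance at 5%
# Quadrants: 1 = High-High cluster, 2 = Low-High outlier,
#            3 = Low-Low cluster,  4 = High-Low outlier
for q, label in [(1, "HH cluster"), (3, "LL cluster"),
                 (2, "LH outlier"), (4, "HL outlier")]:
    print(label, int(np.sum(sig & (lisa.q == q))))
```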
Figure 10.9 Spatiotemporal trajectories and imperfection of archaeological data in Syrian Arid Margins dur-
ing the Bronze Age. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figure 77

Figure 10.10 The four steps of the proposed methodology.

In order to evaluate the impact of data imperfection on spatial structures, analyses were first carried
out on reliable sites only, i.e. sites which are known to be habitat sites (labelled ‘sites with reliable
functional information’ in Figure 10.11), and then on the whole dataset, including both reliable and unreliable
archaeological sites (labelled ‘all sites’ in Figure 10.11). The impact of temporal accuracy on
spatial structures is also examined: the two upper grids in Figure 10.11 show the LISA results for the
Early Bronze Age sites dated more accurately (‘Early Bronze Age IV’), while the two grids below show
all the sites broadly related to the Early Bronze Age. The four LISA results in Figure 10.11 thus present
four possible local spatiotemporal configurations in the form of ‘spatial clusters’ and ‘spatial outliers’,
following the considered levels of temporal and functional accuracy and reliability (see Fusco (2016) for
further information on the methodology).

Figure 10.11 Detecting local spatial configurations from reliable and unreliable archaeological sites with Local
Indicators of Spatial Association.
Source: after Fusco, 2016, Figures 89–91.
The second step of the methodology involves identifying the location parameters which may have
influenced these spatial patterns. A strong correlation between the location of waterways4 and Early
Bronze Age sites has been shown, and this variable has thus been chosen as an example with which to
model Early Bronze Age settlement potential in the Arid Margins area. Whilst the overall method will be
presented and illustrated here for only a small part of the area, the final results will be presented for the
study area as a whole.
The approach carried out in the third step involves starting from what we know, i.e. archaeological site
locations and shapes, and their relationship to waterways, in order to infer what we want to know, i.e. the
high, medium and low potential for attracting settlement in the Arid Margins (step 4). These levels of
potential are assessed by the proximity between sites and waterways: the smaller the distance between
them, the higher the attractiveness potential. The general postulate stemming from the preliminary
statistical and morphological analyses stated above is that the most ‘attractive’ zones for settlement will be
located close to the waterways, and that this attractiveness will decrease as the distance increases. Several
‘sub-spaces’ have thus been identified in the studied zone, each of them representing a degree of remoteness
from the waterways (Figure 10.12). The objective is to model the relationship between site density and
distance to waterways in order to determine systematically what characterizes a sub-space’s ‘high’,
‘medium’ or ‘low’ potential for attracting settlement. This model is calibrated with the characteristics of
the surveyed zones (step 3) and is then interpolated in the unsurveyed parts of each sub-space (step 4).

Figure 10.12 Delineation of ‘sub-spaces’ from waterways demonstrated on a small area of the Arid Margins
(after Fusco, 2016, Figure 110).
The variation of attractiveness throughout the studied area is modelled through the variation of site
density between the sub-spaces. The attractiveness model is thus computed as follows:

$x' = \dfrac{t' \times 100}{t / n}$

where
x′ is the over- or under-representation of the considered sub-space in terms of sites with respect to the
average number of sites per sub-space in the studied area;
t′ is the number of sites in the considered sub-space;
n is the total number of sub-spaces in the area; and
t is the total number of sites in the area.

In other words, a sub-­space which is over-­represented in terms of sites (i.e. a sub-­space whose site density
is higher than the average number of sites in each subspace for the studied area) is considered as more
attractive than the others, and vice versa. Each sub-­space is thus given a percentage traducing its over-­
under-­or average representation in terms of sites.
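Rewriting the model in code makes the calibration explicit; the site counts below are hypothetical:

```python
import numpy as np

counts = np.array([26, 17, 40, 10, 37], dtype=float)  # hypothetical t' per sub-space
t, n = counts.sum(), len(counts)                      # total sites, number of sub-spaces
x_prime = counts * 100.0 / (t / n)                    # x' = (t' * 100) / (t / n)
print(x_prime)   # 100 means exactly average representation; >100 over-represented
```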
The next task consists of classifying each sub-space into a ‘high’, ‘medium’ or ‘low’ attractiveness category
according to its representation in terms of sites. To do so, the thresholds of each class need to be
fixed, i.e. the site percentage which justifies the passage from one category to another needs to be determined.
However, while a sub-space containing 1% of sites can easily be considered as having a ‘low’ potential for
attracting settlement, distinguishing between ‘low’ and ‘medium’ potential is more ambiguous when the site
percentage is around 80% or 90%, for example. The ability of fuzzy logic to handle “linguistic variables”
(Zadeh, 1975, p. 3) is thus used during the modelling phase.
After testing various possible threshold values, it appeared that ‘fuzzifying’ site density between
65%, 90%, 110% and 135% of the average was the most relevant option (Figure 10.13).
Figure 10.13 Fuzzy sets framework set up to estimate each sub-­space’s potential of attracting settlement
throughout the studied area (after Fusco, 2016, Figure 122).

The framework represented in Figure 10.13 means that a sub-space containing a number of
sites representing 0 to 65%, 90% to 110%, or more than 135% of the average will be considered as having,
respectively, ‘low’, ‘medium’ or ‘high’ potential for attracting settlement. The two ‘vague’ zones fall
between 65% and 90%, and between 110% and 135%, where the definition of ‘high’, ‘medium’ and ‘low’ is
more uncertain. As a consequence, the potential for attracting settlement of a sub-space whose site
density represents 70% of the average will be considered ‘low’ with a membership degree – or, to
put it differently, a ‘possibility degree’ – of 0.8, and ‘medium’ with a membership degree of 0.2 (the
dotted line in Figure 10.13).
Comparing each sub-space’s number of sites to the average, and placing them on the fuzzy sets
framework presented in Figure 10.13, then makes it possible to assign each of them to one or more of the
‘high’, ‘medium’ or ‘low’ categories (Figure 10.14). These results can then be mapped (Figure 10.15).
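The classification itself can be sketched as three piecewise-linear membership functions built on the 65/90/110/135 breakpoints; the 70%-of-average example from the text is reproduced below:

```python
import numpy as np

def mu_low(p):
    """'Low' potential: 1 below 65% of the average, falling to 0 at 90%."""
    return np.clip((90.0 - p) / (90.0 - 65.0), 0.0, 1.0)

def mu_medium(p):
    """'Medium': rises over 65-90%, plateau over 90-110%, falls over 110-135%."""
    return np.clip(np.minimum((p - 65.0) / 25.0, (135.0 - p) / 25.0), 0.0, 1.0)

def mu_high(p):
    """'High' potential: 0 below 110% of the average, rising to 1 at 135%."""
    return np.clip((p - 110.0) / (135.0 - 110.0), 0.0, 1.0)

p = 70.0  # a sub-space holding 70% of the average number of sites
print(mu_low(p), mu_medium(p), mu_high(p))  # -> 0.8 0.2 0.0, as in the text
```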
The numbers in red calibrate the fuzzy sets presented in Figure 10.13 (i.e. as the average number
of sites is 26, 65% of the average equals 17, 90% of the average equals 23, and so on). The numbers in
black correspond to the observed number of sites in each sub-space. Placing these observed site counts in
the calibrated fuzzy sets framework enables us to determine which categories each sub-space belongs to.
We know, however, that the Arid Margins have not been completely surveyed and that survey intensity
plays a role in our perception of Arid Margins settlement. The method and the resulting models,
therefore, have been calibrated with data from the well-surveyed areas in the first three steps of the
methodology. In the fourth step, each sub-space, and its attractiveness or site location potential, has been
categorised depending on how intensely it has been surveyed (Figure 10.16). What is called ‘attractiveness’
thus refers to the ‘known zones’, where an absence of sites is more likely to represent an absence of
Bronze Age settlement than an absence of information. In poorly surveyed or unsurveyed areas, called
‘grey zones’, it makes more sense to talk about a ‘potential of unknown site location’, that is to say, the
possibility of finding settlement or not, rather than about ‘attractiveness’.

Figure 10.14 Application of the fuzzy sets framework to each sub-space. A colour version of this figure can
be found in the plates section.
Source: after Fusco, 2016, Figure 122

Figure 10.15 Mapping each sub-space’s fuzzy sets category as in Figure 10.14.
Source: after Fusco, 2016, Figure 110
Figure 10.16 Mapping the attractiveness of and possibilities of finding settlements at the sub-­spaces using
fuzzy set estimates and survey intensity levels. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figure 126

Type-1 fuzzy sets have laid the basis for a relevant method of modelling uncertainty, vagueness and
imprecision (John & Coupland, 2007). But applying this concept in a highly uncertain context faces a
paradox: if we cannot determine the precise value of a certain quantity, how could we determine its exact
membership grade in a fuzzy set (Karnik & Mendel, 1998; Mendel, 2003)? As discussed earlier, in order
to address this problem, Zadeh (1975) introduced type-2 fuzzy sets, in which membership grades are no
longer crisp values but fuzzy membership functions, i.e. the membership value for each element of the
set is itself a fuzzy number in [0, 1].
This paradox clearly interferes with our approach, as our fuzzy set calibration is entirely based on the
distribution of the data, i.e. habitat sites, in each sub-space of the studied area (Figures 10.13, 10.14). However,
as stated above (Figure 10.9), part of our data displays what we have called ‘functional uncertainty’, i.e.
the impossibility of knowing whether the sites concerned were habitat sites or not. As a consequence, the choice
between considering only reliable data and taking the whole dataset into account considerably changes the number of
sites in each sub-space, the average on which the fuzzy set calibration is based, and, as a result, the potential for
attracting settlement attributed to each sub-space. Because they are unable to encompass all the dimensions
of our data’s uncertainty, type-1 fuzzy sets considerably limit the robustness of the results.
The ‘higher order vagueness’ (Fisher & Arnot, 2006, p. 1) introduced by type-2 fuzzy sets is the additional
uncertainty ‘layer’ we need in order to deal with our data properly. Indeed, we may consider that
the actual number of sites in each sub-space is not a crisp value, but an interval between the minimum
possible number of sites (i.e. the number of reliable habitat sites) and the maximum possible number of
sites (i.e. all the sites found in the considered sub-space, reliable and unreliable).
Figure 10.17 shows that the type-2 fuzzy sets are defined by the boundaries of the type-1 fuzzy sets for
reliable sites (the lower boundary of the type-2 fuzzy set) and unreliable sites (the upper boundary of the
type-2 fuzzy set). The intervals defining the membership of each fuzzy set thus represent our higher and
lower degrees of knowledge.
For example, if we consider only reliable sites, a sub-space containing 29 sites will be considered
‘medium attractive’ with a membership degree of 1 (upper graph of Figure 10.17). However, if we take
the totality of sites, a sub-space containing 29 sites will be considered ‘medium attractive’ and ‘low
attractive’ with membership degrees of 0.8 and 0.2 respectively (middle graph of Figure 10.17). From
a type-2 fuzzy sets perspective (lower graph of Figure 10.17), this sub-space will thus be considered
‘medium attractive’ with a membership degree of [0.8; 1], and ‘low attractive’ with a membership
degree of [0; 0.2]. Note that this example relates to a sub-space that does not itself contain any unreliable
sites: its number of sites remains 29 whether we deal with reliable sites or the totality of sites; its
attractiveness categories and membership degrees change because the composition of its surroundings
(i.e. the number of unreliable sites in other sub-spaces), and thus the resulting fuzzy sets framework,
change.
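Reusing the mu_low, mu_medium and mu_high functions from the sketch above, the interval (type-2) membership degrees can be obtained by evaluating the two calibrated type-1 frameworks and taking their envelope; the two calibration averages below are hypothetical, chosen so that the 29-site example reproduces the degrees quoted in the text:

```python
def memberships(count, average):
    """Evaluate the type-1 framework calibrated on a given average site count."""
    p = count * 100.0 / average
    return {"low": mu_low(p), "medium": mu_medium(p), "high": mu_high(p)}

def it2_memberships(count, avg_reliable, avg_all):
    """Interval type-2 degrees: the envelope of the framework calibrated on
    reliable sites only and of the one calibrated on the totality of sites."""
    m1, m2 = memberships(count, avg_reliable), memberships(count, avg_all)
    return {k: (min(m1[k], m2[k]), max(m1[k], m2[k])) for k in m1}

# A sub-space with 29 sites; hypothetical calibration averages of 29 and 34.1
print(it2_memberships(29, avg_reliable=29.0, avg_all=34.1))
# -> medium ~ [0.8, 1], low ~ [0, 0.2], high [0, 0], as in the worked example
```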
An additional complexity in these type-2 fuzzy set models emerges from the fact that each sub-space
is composed of both reliable and unreliable archaeological sites. Thus, each sub-space’s site number
is an interval between the minimum possible number of sites (reliable sites only) and the maximum
possible number of sites (the totality of sites). In this way, the real number of sites contained in sub-space C
lies somewhere within the interval [10; 22]. We may then picture sub-space C as a slider
between 10 and 22, whose raising or lowering changes its attractiveness category and the associated
membership degree. Depending on their needs, the user may choose to keep these nested intervals (an interval
on the number of sites, and an interval on the type-2 fuzzy membership degrees), or choose one reference
point in the site number interval, in order to represent each sub-space’s site number by a crisp value (mean,
maximum, minimum . . .). In order to facilitate the mapping of attractiveness degrees, and as an example, we
chose to keep the maximum number of sites (i.e. the totality of sites). Sub-space C will then be
represented by its maximum possible number of sites (i.e. 22 sites) in our type-2 fuzzy sets mapping,
which makes it ‘medium attractive’ with a membership degree of [0; 0.85] and ‘low attractive’ with
a membership degree of [0.15; 1].
The defined type-2 fuzzy sets then allow the mapping of the attractiveness levels of the ‘known
zones’ and the settlement potential of the ‘grey zones’, as described in Figures 10.14 and 10.15. In order
to evaluate the impact of local spatial patterns on the projected attractiveness and settlement potential,
the whole process was carried out separately on the ‘clusters’ and the ‘outliers’ detected with the LISA
(Figure 10.11). Figure 10.18 shows these results for the whole extent of the Arid Margins.
In summary, two types of information have been assessed in this case study: the degree of attractiveness
of the well-surveyed zones, which constitutes the descriptive dimension of the maps, shown by the lighter
colours; and the settlement potential of the ‘grey zones’, which constitutes the possibilistic dimension of the
maps, shown by the darker colours.
Mapping all of this information together with the locations of known archaeological sites on the same map is
of great interest as we can see at the same time the data, the results, and the deviations from the model, as
shown by the sites that fall into unattractive zones. These ‘anomalies’ suggest two possibilities, which may
both be true at the same time: either the model has to be recalibrated and refined, or the sites which deviate
from the model have different location logics or spatial patterns than the ones assumed by the model.
Figure 10.17 From type-­1 to type-­2 fuzzy sets: introducing the reliability of archaeological sites in fuzzy set
calibration. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figure 125
Figure 10.18 Type-­2 fuzzy set settlement estimates for Early Bronze Age IV Arid Margins comparing results
for ‘cluster’ sites and ‘outlier’ sites. A colour version of this figure can be found in the plates section.
Source: after Fusco, 2016, Figures 128 and 131

Whatever the case may be, our point has been to highlight the importance of integrating multiple
observations and levels of analysis, represented here by the imperfection of spatiotemporal information
and by spatial patterns. Considering information in a fuzzy dimension offers an alternative method which
prevents us from making restrictive choices in modelling and/or from rejecting all unreliable data.
The latter not only limits the information that is available but, by artificially homogenising heterogeneous
data, also runs the very real risk of giving priority to quantity over quality.

Conclusion
The various dimensions of archaeological data imperfection should prevent us from advancing hypotheses
on past settlement patterns that are too rigid and restrictive. The amount and quality of data, but also
our own theoretical and methodological frameworks, as well as our implicit or explicit choices, obviously
have strong impacts on the results. This does not mean, however, that we have to be resigned to
the assumption that ‘everything is possible everywhere’. Instead, by detecting and revealing the hidden
flaws in our reasoning, defining and assessing different levels of data imperfection, and taking them into
account throughout the research process, we can begin to constructively fold them into our analyses.
Reasoning within uncertainty or with imperfection through fuzzy logic and fuzzy set theory broadens
the horizons of archaeological research as it enables the formal exploration and ordering of a variety of
possibilities without being constrained by broad trends, and allows consideration of outliers which are not
encompassed by a probability framework. We end with a challenge. Is it enough to simply label results as
‘valid’ or ‘false’ in the face of data imperfection, and/or to rush to eliminate data imperfection by any means?
We would argue not. Instead we must examine deliberately and systematically how the various forms of
uncertainty and imperfection arise in our results and hypotheses, and consider how we can treat them
as a ‘bundle of past possibilities’, where the options are controlled by the level and the type of uncertainty
we consciously decide to assume.

Notes
1 A usual formulation of the paradox involves a heap of sand. If we remove a single grain from it, is it still a heap?
What happens when this process is repeated enough times? Is a single residual grain still a heap? If not, when did
it switch from a heap to a non-heap?
2 This study was carried out in the context of the PaleoSyr/PaleoLib project, “Holocene palaeoenvironments and
settlement patterns in Western Syria and Lebanon”, directed by Frank Braemer and Bernard Geyer.
3 The LISA have been calculated with the freeware GeoDa 0.9 (Anselin, 2005).
4 For data availability reasons, the study refers to today’s waterways, not palaeo-environmental estimates.

References
Allen, J. F. (1983). Maintaining knowledge about temporal intervals. Communication of the ACM, 26(11), 832–843.
Anselin, L. (1995). Local indicators of spatial association: LISA. Geographical Analysis, 27, 93–115.
Anselin, L. (2005). Exploring spatial data with GeoDa: A workbook. University of Illinois, Urbana-­Champaign: Spatial
Analysis Laboratory, Department of Geography.
Balla, A., Pavlogeorgatos, G., Tsiafakis, D., & Pavlidis, G. (2013). Locating Macedonian tombs using predictive model-
ling. Journal of Cultural Heritage, 14(5), 403–410.
Banerjee, R., Srivastava, P. K., Pike, A. W. G., & Petropoulos, G. P. (2018). Identification of painted rock-­shelter sites
using GIS integrated with a decision support system and fuzzy logic. ISPRS International Journal of Geo-­Information,
7, 326–346.
Baxter, M. (2009). Archaeological data analysis and fuzzy clustering. Archaeometry, 51, 1035–1054.
Bevan, A. (2015). The data deluge. Antiquity, 89(348), 1473–1484.
Brouwer Burg, M. (2017). It must be right, GIS told me so! Questioning the infallibility of GIS as a methodological
tool. Journal of Archaeological Science, 84, 115–120.
Buck, C. E. (2004). Bayesian chronological data interpretation: where now? In C. E. Buck & A. R. Millard (Eds.),
Tools for constructing chronologies: crossing disciplinary boundaries (pp. 1–24). London: Springer-­Verlag.
Cooper, A., & Green, C. (2015). Embracing the complexities of Big Data in archaeology: The case of the English
landscape and identities project. Journal of Archaeological Method and Theory, 23(1), 271–304.
Crema, E. R. (2012). Modelling temporal uncertainty in archaeological analysis. Journal of Archaeological Method and
Theory, 19, 440–461.
Crema, E. R., Bevan, A., & Lake, M. (2010). A probabilistic framework for assessing spatiotemporal point patterns in
the archaeological record. Journal of Archaeological Science, 37(5), 1118–1130.
de Runz, C., Desjardin, E., Piantoni, F., & Herbin, M. (2014). Reconstruct street network from imprecise excavation
data using fuzzy Hough transforms. Geoinformatica, 18(2), 253–268.
de Runz, C., Desjardin, E., Piantoni, F., & Herbin, M. (2011). Towards handling uncertainty of excavation data into
a GIS. In E. Jerem, F. Redő, & V. Szeverényi (Eds.), On the road to reconstructing the past: Proceedings of the 36th
computer applications and quantitative methods in archaeology (CAA) international conference (pp. 187–191). Budapest:
Archeaeolingua.
de Runz, C., Desjardin, E., Piantoni, F., & Herbin, M. (2010). Anteriority index for managing fuzzy dates in archaeo-
logical GIS. Soft Computing-­A Fusion of Foundations, Methodologies and Applications, 14(4), 339–344.
Desachy, B. (2012). Formaliser le raisonnement chronologique et son incertitude en archéologie de terrain. Cybergeo:
European Journal of Geography, Systèmes, Modélisation, Géostatistiques, document 597.
Farinetti, E., Hermon, S., & Niccolucci, F. (2004). Fuzzy logic application to artefact surface survey data. In F. Nicco-
lucci & S. Hermon (Eds.), Beyond the artifact: Digital interpretation of the past: Proceedings of CAA 2004 (pp. 125–129).
Budapest: Archaeolingua.
Fisher, P. F. (2005). Models of uncertainty in spatial data. In P. A. Longley, M. F. Goodchild, D. J. Maguire, &
D. W. Rhind (Eds.), Geographical information systems: Principles, techniques, management and applications (pp. 191–205).
Hoboken, NJ: Wiley.
Fisher, P. F., & Arnot, C. (2006). Mapping type 2 change in fuzzy land cover. In A. Morris & S. Kokhan (Eds.),
Geographic uncertainty in environmental security: Proceedings of the NATO advanced research workshop on fuzziness and
uncertainty in GIS for environmental security and protection (pp. 167–186). The Netherlands: Springer.
Fisher, P. F., Comber, A., & Wadsworth, R. (2006). Approaches to uncertainty in spatial data. In R. Devillers &
R. Jeansoulin (Eds.), Fundamentals of spatial data quality (pp. 43–59). London: ISTE.
Fusco, J. (2015). Detection of spatio-­morphological structures on the basis of archaeological data with Mathemati-
cal Morphology and Variography: Application to archaeological sites. In A. Traviglia (Ed.), Across space and time,
selected papers from the 41st annual conference of computer applications and quantitative methods in archaeology (CAA)
(pp. 249–260). Amsterdam: Amsterdam University Press.
Fusco, J. (2016). Analyse des dynamiques spatio-­temporelles des systèmes de peuplement dans un contexte d’incertitude: Applica-
tion à l’archéologie spatiale (Unpublished doctoral dissertation). University Nice Sophia Antipolis. Retrieved from
https://tel.archives-­ouvertes.fr/tel-­01341554
Gattiglia, G. (2015). Think big about data: Archaeology and the Big Data challenge. Archäologische Informationen, 38,
113–124.
Geyer, B. (2011). The steppe: Human occupation and potentiality, the example of northern Syria’s Arid Margins.
Syria, 88, 7–22.
Hatzinikolaou, E. G. (2006). Quantitative methods in archaeological prediction: from binary to fuzzy logic. In
M. W. Mehrer & K. L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 437–446). New York:
Taylor & Francis.
Hatzinikolaou, E. G., Hatzichristos, T., Siolas, A., & Mantzourani, E. (2003). Predicting archaeological site locations
using GIS and fuzzy logic. In M. Doerr & A. Sarris (Eds.), The digital heritage in archaeology: Computer applications
and quantitative methods in archaeology (pp. 169–178). Heraklion: Archive of Monuments and Publications, Hellenic
Ministry of Culture.
Hermon, S., & Niccolucci, F. (2002). Estimating subjectivity of typologists and typological classification with fuzzy
logic. Archeologia e Calcolatori, 13, 217–232.
Hermon, S., & Niccolucci, F. (2003). A fuzzy logic approach to typology in archaeological research. In M. Doerr &
A. Sarris (Eds.), The digital heritage in archaeology: Computer applications and quantitative methods in archaeology
(pp. 169–178). Heraklion: Archive of Monuments and Publications, Hellenic Ministry of Culture.
Jaroslaw, J., & Hildebrandt-­Radke, I. (2009). Using multivariate statistics and fuzzy logic system to analyse settlement
preferences in lowland areas of the temperate zone: An example from the Polish Lowlands. Journal of Archaeological
Science, 36(10), 2096–2107.
John, R., & Coupland, S. (2007). Type-­2 fuzzy logic: A historical view. IEEE Computational Intelligence Magazine,
2, 57–62.
Johnson, I. (2004). Aoristic analysis: Seeds of a new approach to mapping archaeological distributions through time.
In M. de Stadt Wien, R. K. Erbe, and S. Wien (Eds.), Enter the past: The E-­way into the four dimensions of cultural
heritage: Proceedings of the 31st computer applications in archaeology (pp. 448–452). Oxford: Archaeopress.
Karnik, N. N., & Mendel, J. M. (1998). Introduction to type-­2 fuzzy logic systems. Proceedings of the 1998 IEEE
FUZZ Conference (pp. 915–920).
Kennedy, W. M., & Hahn, F. (2017). Quantifying chronological inconsistencies of archaeological sites in the Petra
area. eTopoi, 6, 64–106.
Lanos, P., & Philippe, A. (2015). Event model: A robust Bayesian tool for chronological modeling. Retrieved from https://
hal.archives-­ouvertes.fr
Litton, C. D., & Buck, C. E. (1995). The Bayesian approach to the interpretation of archaeological data. Archaeometry,
37(1), 1–24.
Longley, P. A., Goodchild, M. F., Maguire, D. J., & Rhind, D. W. (2005). Geographic information systems and science (2nd
ed.). London: John Wiley & Sons.
Machálek, T., Cimler, R., Olševičová, K., & Danielisová, A. (2013). Fuzzy methods in land use modeling for
archaeology. In H. Vojackova (Ed.), Proceedings of the 31st international conference mathematical methods in economics
(pp. 552–557). Jihlava: College of Polytechnics.
McCoy, M. D. (2017). Geospatial Big Data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94.
Mendel, J. M. (2003). Type-­2 fuzzy sets: Some questions and answers. IEEE Neural Networks Society Newsletter, 1,
10–13.
Mendel, J. M., & Bob John, R. I. (2002). Type-­2 fuzzy sets made simple. IEEE Transactions on fuzzy systems, 10(2),
117–127.
Niccolucci, F., D’Andrea, A., & Crescioli, M. (2001). Archaeological applications of fuzzy databases. In Z. Stančič &
T. Veljanovski (Eds.), Computing archaeology for understanding the past: Proceedings of the 28th computer applications in
archaeology (pp. 107–116). Oxford: Archaeopress.
Niccolucci, F., & Hermon, S. (2004). A fuzzy approach to reliability in archaeological virtual reconstruction. In
F. Niccolucci & S. Hermon (Eds.), Beyond the artifact: Digital interpretation of the past: Proceedings of computer applica-
tions in archaeology (pp. 28–35). Budapest: Archaeolingua.
Niccolucci, F., & Hermon, S. (2015). Time, chronology and classification. In J. A. Barceló & I. Bogdanovic (Eds.),
Mathematics and archaeology (pp. 257–271). New York: Taylor & Francis.
Niccolucci, F., & Hermon, S. (2017). Documenting archaeological science with CIDOC CRM. International Journal
of Digital Libraries, 18, 281.
Orton, C. (2010). Fit for purpose? Archaeological data in the 21st century. Archeologia e Calcolatori, 11, 249–260.
Plewe, B. (2002). The nature of uncertainty in historical geographic information. Transactions in GIS, 6(4), 431–456.
Roskams, S., & Whyman, M. (2007). Categorising the past: Lessons from the archaeological resource assessment for
Yorkshire. Internet Archaeology, 23.
Vaughn, S., & Crawford, T. (2009). A predictive model of archaeological potential: An example from northwestern
Belize. Applied Geography, 29(4), 542–555.
Veregin, H. (1989). A taxonomy of error in spatial databases. Technical Paper, 89.12, Santa Barbara, CA: National
Center for Geographic Information and Analysis.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–355.
Zadeh, L. A. (1975). The concept of a linguistic variable and its application to approximate reasoning. Information
sciences, 8, 199–249.
Zadeh, L. A. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1, 3–28.
Zhang, J. X., & Goodchild, M. F. (2002). Uncertainty in geographical information. New York: Taylor and Francis.
Zoghlami, A., de Runz, C., & Akdag, H. (2016). F-­Perceptory: An approach for handling fuzziness of spatiotemporal
data in geographical databases. International Journal of Spatial, Temporal and Multimedia Information Systems, 1(1),
30–62.
Zoghlami, A., de Runz, C., Pargny, D., Desjardin, E., & Akdag, H. (2012). Through an archaeological urban data model
handling data imperfection. Paper presented at the Computer Applications and Quantitative Methods in Archaeol-
ogy (CAA) Conference, Southampton, UK.
11
Spatial approaches to assignment
John Pouncett

Introduction
Two deceptively simple questions are pivotal to definitions of geographic information and, by exten-
sion, spatial analysis – what? and where? (Goodchild, 2003). This chapter addresses the second of these
questions, focusing on spatial approaches to assignment that can be used to ascertain the likelihood of an
archaeological sample originating from a particular geographic region. These approaches are explored
with reference to evidence for mobility and migration that can be inferred from isotope tracers from
cremated bone and tooth enamel. Questions of geographic origins, however, are equally important
with regard to evidence for trade and exchange that can be inferred from the biological or chemical
signatures of raw materials. Geochemical analysis of the worked flint from the Gravettian sites at Rhens
and Koblenz-Metternich, Germany, for example, has identified the use of flint from western Belgium c.
260km away (Moreau et al., 2016). Conversely, wood species identification and strontium isotope analysis
have suggested that the wooden artefacts analysed from Pitch Lake, Trinidad, were predominantly manufactured
from locally sourced materials (Ostapkowicz et al., 2017). In both instances, the distance that raw
materials were transported is critical to the understanding of trade and exchange. A degree of caution,
however, should be exercised when identifying possible sources of raw materials. In the case of Bronze
Age copper metals, theoretical frameworks that move beyond the concept of provenance have been
developed that recognise that metal chemistry is the product of a longer life-­history of a unit of metal
which may reflect re-­use and recycling as well as the original source of the ore (Bray et al., 2015; Pollard,
2018). These theoretical frameworks highlight the need to understand the complex range of processes
that can contribute to variation in the measurements that are commonly used to determine the possible
geographic origins of archaeological samples. The spatial approaches to assignment described below are
widely used in bioarchaeology but are equally applicable to other areas of archaeology.
Isotope tracers are used to make inferences about mobility and migration, comparing observed isotope
ratios or values for archaeological samples to baseline data to determine whether an individual is local
or non-­local to a geographic region (e.g. Bentley & Knipper, 2005; Montgomery, Budd, & Evans, 2000;
Price, Burton, & Bentley, 2002). For the purposes of determining geographic origins of humans and
animals in archaeology, strontium and oxygen isotopes are the most commonly employed, with multiple
isotope tracers used to narrow down the range of possible locations from where an individual could have
originated (Emery, Prowse, Elford, Schwarcz, & Brickley, 2017; Evans, Chenery, & Fitzpatrick, 2006; Laf-
foon et al., 2017). Traditional approaches to identifying locals and non-­locals based on the identification
of outliers or extreme values using parametric and non-­parametric statistics (Lightfoot & O’Connell,
2016; Wright, 2005) are increasingly being supplemented by spatial approaches based on geographic
assignment using isoscapes and Bayesian statistics (Pellegrini, Pouncett, Jay, Parker-­Pearson, & Richards,
2016; Schulting et al., 2019; Snoeck et al., 2018; Snoeck, Pouncett et al., 2016). The key principles
underpinning these spatial approaches to determining the geographic origins of individuals are outlined
below with reference to case studies from Annaghmare in Northern Ireland and Duggleby Howe on the
Yorkshire Wolds.

Strontium
Strontium is an alkaline earth metal with four naturally occurring isotopes (84Sr, 86Sr, 87Sr and 88Sr) which
are formed during primordial nucleosynthesis. 87Sr is radiogenic and is formed as a result of the decay of
the radioactive alkali metal rubidium (87Rb). The 87Sr/86Sr ratios for bedrock and surface geology are
related to both the mineral composition and the age of the geological formation (Faure, 1986). Strontium
derived from the weathering of a geological formation will have the same 87Sr/86Sr ratio as the parent
geology. Consequently, the 87Sr/86Sr ratios of soils are related to the geological formations from which they
are derived. The 87Sr/86Sr ratio of rainwater is related to the mineral composition of the water vapour
(evaporated sea- or freshwater) and of the aerosolised particles in the atmosphere (e.g. Saharan dust) which act as
condensation nuclei for excess water vapour. In coastal regions, the 87Sr/86Sr ratio of rainfall is equivalent
to that of seawater (Hodell, Mueller, McKenzie, & Mead, 1989). Strontium from soil, groundwater and
rainwater is absorbed into plants and once it enters the food chain becomes incorporated into the tissues
of humans and animals (Capo, Stewart, & Chadwick, 1998).
87Sr/86Sr ratios of tooth enamel (Montgomery, Evans, & Cooper, 2007; Neil, Evans, Montgomery, &
Scarre, 2018) and cremated bone (Snoeck et al., 2018; Snoeck, Pouncett et al., 2016) reflect the food/
drink consumed by an individual and, assuming that the food/drink is locally sourced, can be used to
infer the geographic locations where an individual spent specific times of their lives. The time of life
for which these inferences can be made is dependent on the tissue (bone, dentine or enamel) analysed
which is in turn dependent on the mode of burial. In the case of inhumations, the crystalline structure
of tooth enamel preserves the original in vivo 87Sr/86Sr ratio, while bone is often susceptible to diagenesis
and through the exchange of calcium for strontium equilibrates with the value of the soil in which it is
buried (Budd, Montgomery, Barreiro, & Thomas, 2000; Hoppe, Koch, & Furutani, 2003). 87Sr/86Sr ratios
of tooth enamel represent an average of the food/drink consumed during crown formation and depend-
ing on the tooth sampled can indicate where an individual spent different stages of their childhood. In
the case of cremations, high temperatures cause spalling and loss of tooth enamel, while at the same time
result in the crystallisation of bone making it resistant to diagenesis such that fully calcined bone retains its
original in vivo 87Sr/86Sr ratio (Snoeck et al., 2015). 87Sr/86Sr ratios of cremated bone represent an average
of the food/drink consumed over the decade or so before death and can indicate where an adult spent
the last c. 10 years of their life. The time of life represented for non-­adults will be shorter given that it
reflects the growth stage of the skeleton.

Oxygen
Oxygen is a non-metal with three naturally occurring stable isotopes: a primary isotope (16O) formed
during primordial nucleosynthesis and two secondary isotopes (17O and 18O) formed during the
carbon-nitrogen-oxygen cycle. Spatial and temporal variation in the 18O of rainwater occurs as a result of
Rayleigh fractionation or distillation, i.e. the preferential condensation of water with 18O in air masses,
depleting the 18O relative to the 16O in the vapour phase (Sharp, 2007). The δ18O value of rainwater is
dependent on a wide range of variables including latitude, temperature, elevation, amount of rainfall,
and distance to surface water (Bowen & Wilkinson, 2002). The δ18O value of groundwater is related
to that of the local rainfall but may vary due to evaporation of surface water, fractionation within
aquifers, and recharge from rivers carrying water from higher elevations (Gat, 1971). Drinking water and
water from food are typically derived from a combination of local groundwater and local rainwater.
The δ18O values of food/drink will consequently approximate those of these sources. Oxygen from
food/drink is incorporated into body water and is in turn incorporated into the tissues of humans
and animals.
A linear relationship has been demonstrated between the δ18O values of phosphate from the bioapatite
fraction of teeth and bones (δ18Op) and the mean annual δ18O values of rainwater (δ18Ow) in the
region where an individual lived (Longinelli, 1984; Luz, Kolodny, & Horowitz, 1984). δ18Op values from
tooth enamel represent an average of the food/drink consumed during crown formation and, depending
on the tooth sampled, can be used to infer the regions where an individual spent different stages
of their childhood (Evans et al., 2006). A number of issues have been raised regarding the utility of
oxygen as an isotope tracer (Lightfoot & O’Connell, 2016; Pouncett, 2019). This is because δ18O values
may be affected by a number of factors, including: variation in 18O due to changes in climate conditions
(Daux et al., 2008); fractionation of 18O as a result of the preparation of food/drink (Brettell, Montgomery,
& Evans, 2012); physiological variation between individuals (White, Spence, Longstaffe, &
Law, 2004); and enrichment of 18O due to weaning and/or the consumption of milk (Lin, Rau, Chen,
Chou, & Fu, 2003). Unlike strontium, oxygen cannot be used to infer the geographic origins of individuals
who were cremated, since cremation alters the δ18O values of bone and teeth, which then reflect pyre
characteristics such as temperature and ventilation rather than diet and mobility (Snoeck, Schulting,
Lee-Thorp, Lebon, & Zazzo, 2016).

Local or non-­local?
The geographic origins of an individual can be inferred by comparing observed values of one or more
isotope ratios to the expected values from baseline data for the area of interest – typically biologically
available strontium (BASr) from modern plants or animals in the case of strontium, and modern ground-
water or rainwater in the case of oxygen. If the isotope ratio from an archaeological sample has a value
that is similar to that of the baseline data for a given geographic region, the individual could have been
local to that region, i.e. spent part of their childhood (tooth enamel) or the last decade or so of their
adult life (cremated bone) there. Conversely, if the isotope ratio from an archaeological sample has a value
that differs from the baseline data for a given geographic region, the individual is unlikely to have been
local to that region, i.e. did not spend part of their childhood (tooth enamel) or the last decade or so of
their life (cremated bone) there. The ability to identify whether an individual is local or non-­local to a
geographic location is dependent upon a number of factors, most significantly in this context: analytical
errors associated with the isotope ratios for the archaeological samples; sampling errors associated with
the generation of the isotopic baseline data; equifinality, with the same values of isotope ratios found in
multiple geographic regions; and the scale of analysis/spatial extent of the ‘local’ signal. The uncertainty
introduced as a result of these factors is integral to the methods of geographic assignment described below
and applied to the case studies for this chapter.
Methodology
Two principal methods have been used to determine the geographic origins of humans and animals. The
first method, based on the calculation of residuals between the expected isotope measurement from the
baseline data for the area of interest and the observed isotope measurement for an archaeological sample,
is applied to a case study from Annaghmare in Northern Ireland using a single isotope tracer. The second
method, based on the use of Bayesian statistics to determine the likelihood that an individual came from
a particular location given the observed isotope measurement, is applied to a case study from Duggleby
Howe on the Yorkshire Wolds using two isotope tracers.

Calculation of residuals
The simplest method of determining the geographic origins of humans and animals is to calculate the
residuals between the expected isotope measurement for a location and the observed measurement for an
individual (Pellegrini et al., 2016; Snoeck et al., 2018):

$e_i = \delta_{s,i} - \delta_s$  (11.1)

where:
ei = the residual between the expected and observed isotope measurements.
δs,i = the expected isotope measurement for location i.
δs = the observed isotope measurement for the individual.

The expected isotope measurement for a location can be estimated from the baseline data for the area
of interest. A threshold can be applied to the residuals to identify locations from which an individual
could have originated (cf. Laffoon et al., 2017). Typically, the threshold used to identify the locations
from which an individual could have originated will be based on the sum of the sampling error for the
baseline data and the analytical error for the measured isotope. For example, in the case of the fragments
of cremated bone analysed from Aubrey Hole 7 at Stonehenge, Wiltshire, locations with residuals of less than
±0.0005 (equivalent to the sum of the sampling error for the BASr baseline and the analytical error for
the strontium measurements) were identified as possible geographic origins for the cremated individuals
(Snoeck et al., 2018).
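A raster version of this method reduces to a few lines of NumPy; the baseline grid, measured ratio and threshold below are placeholders rather than the published values:

```python
import numpy as np

# Hypothetical baseline: expected 87Sr/86Sr per raster cell (e.g. a BASr isoscape)
expected = np.array([[0.7090, 0.7095, 0.7102],
                     [0.7088, 0.7106, 0.7110],
                     [0.7092, 0.7098, 0.7104]])

observed = 0.7090               # measured ratio for the individual
threshold = 0.0005              # sampling error + analytical error

residuals = expected - observed            # e_i = delta_s,i - delta_s (Eq. 11.1)
possible_origin = np.abs(residuals) <= threshold
print(possible_origin)          # True where the individual could have originated
```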

Bayesian statistics
Bayesian statistics have been widely employed by biologists and archaeologists for the purposes of deter-
mining the geographic origins of humans and animals (Bowen, Liu, Vander Zanden, Zhao, & Takahashi,
2014; Laffoon et al., 2017; Wunder, Kester, Knopf, & Rye, 2005). The likelihood that an individual came
from a particular location given the observed isotope measurement can be calculated using Bayes’ theo-
rem which can be expressed as:
$P(A_i \mid B) = \dfrac{P(B \mid A_i)\, P(A_i)}{\sum_i P(B \mid A_i)\, P(A_i)}$  (11.2)

where:
P(Ai|B) = the posterior probability distribution of individual A originating from location i, given
observed isotope measurement B.
P(B|Ai) = the sampling probability distribution of observing isotope measurement B, given all loca-
tions i from which individual A could have originated.
P(Ai) = the prior probability distribution of individual A originating from location i, given assump-
tions or knowledge prior to observing isotope measurement B.

If there are no prior assumptions or knowledge of the likely geographic origin of an individual, it
is assumed that all locations are equally possible and a non-­informative prior is used as the prior
probability distribution – typically, this will be a uniform distribution (a,b) with probability density
function:
$f(x) = \dfrac{1}{b - a}$  (11.3)
where:
a = the minimum isotope measurement for all locations i.
b = the maximum isotope measurement for all locations i.

If there are prior assumptions or knowledge of the likely origin of an individual, the probability density
function which best describes the prior assumptions or knowledge should be used as the prior probability
distribution. For example, in a test of geographic assignment of mountain plover chicks (Charadrius mon-
tanus) using isotope tracers in feathers, it was assumed that the chick feathers were exclusively of known
geographic origin and the sample sizes per location were used to estimate the prior probability density
(Wunder et al., 2005).
It is generally assumed that the observed isotope measurement for a sample is an outcome of a random
process and that the sampling probability distribution can consequently be estimated using the probabil-
ity density function for a normal distribution (µ, σ) with location µ and scale σ:

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}$  (11.4)
The parameters of the normal distribution are commonly estimated using the observed isotope value
or ratio for the individual, the total of the sampling error for location i and the analytical error for the
observed isotope measurement:
$\mu = \delta_s$  (11.5)
$\sigma = \sigma_{s,i} + \sigma_\varepsilon$  (11.6)
where:
δs = the observed isotope measurement for the individual.
σs,i = the sampling error for location i estimated from the baseline data.
σε = the analytical error for the observed isotope measurement.

If a uniform distribution is used as a non-­informative prior and a normal distribution is used as the sam-
pling probability function, Eq. (11.2) can be rewritten as:
 
$$P(A_i \mid \delta_s) = \frac{\left[\dfrac{1}{(\sigma_{s,i} + \sigma_\varepsilon)\sqrt{2\pi}}\, e^{-(\delta_{s,i} - \delta_s)^2 / 2(\sigma_{s,i}^2 + \sigma_\varepsilon^2)}\right]\left[\dfrac{1}{\delta_{s,i\,\mathrm{max}} - \delta_{s,i\,\mathrm{min}}}\right]}{\sum_i \left[\dfrac{1}{(\sigma_{s,i} + \sigma_\varepsilon)\sqrt{2\pi}}\, e^{-(\delta_{s,i} - \delta_s)^2 / 2(\sigma_{s,i}^2 + \sigma_\varepsilon^2)}\right]\left[\dfrac{1}{\delta_{s,i\,\mathrm{max}} - \delta_{s,i\,\mathrm{min}}}\right]}$$  (11.7)

where:
δs,i = the expected isotope measurement for location i estimated from the baseline data.

which can be simplified to:


$$P(A_i \mid \delta_s) = \frac{\dfrac{1}{(\sigma_{s,i}+\sigma_\varepsilon)\sqrt{2\pi}}\, e^{-(\delta_{s,i}-\delta_s)^2 / 2(\sigma_{s,i}^2+\sigma_\varepsilon^2)}}{\sum_i \dfrac{1}{(\sigma_{s,i}+\sigma_\varepsilon)\sqrt{2\pi}}\, e^{-(\delta_{s,i}-\delta_s)^2 / 2(\sigma_{s,i}^2+\sigma_\varepsilon^2)}}$$  (11.8)

The posterior probability distributions are commonly rescaled by the largest observed density, with the
resultant probability densities ranging between 0 and 1 (cf. Wunder, 2010). Where multiple isotope trac-
ers are used to determine the geographic origin of an individual, Bayes’ theorem can be applied iteratively
with the posterior probability density from the iteration for one isotope tracer used as the prior prob-
ability density for the iteration for the next isotope tracer.
Maximum likelihood estimation is commonly used to determine the geographic origin of an
individual, with the individual assigned to the location with the highest probability density. Ultimately,
the validity of this assignment is dependent on the robustness of the baseline data used to determine
the geographic origin of an individual. If the baseline does not adequately account for all of the factors
that influence spatial variation in the ratios or values of the isotope tracer within the area of interest (see
earlier), the location to which an individual is assigned may not be valid and the process by which it was
derived will not be robust.
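Bringing Eqs. (11.4)–(11.8), the rescaling step and the maximum likelihood assignment together, a hedged Python sketch of the single-tracer workflow might look as follows (the expected ratios, errors and helper names are illustrative assumptions, not real baseline data):

```python
import numpy as np

def assignment_posterior(obs, expected, sampling_err, analytical_err):
    """Posterior densities under Eq. (11.8): normal sampling distribution
    with sigma = sampling + analytical error (Eq. 11.6), uniform prior."""
    sigma = sampling_err + analytical_err
    dens = np.exp(-(expected - obs) ** 2 /
                  (2 * (sampling_err ** 2 + analytical_err ** 2)))
    dens /= sigma * np.sqrt(2 * np.pi)
    post = dens / dens.sum()
    return post / post.max()  # rescale by the largest observed density

# hypothetical expected 87Sr/86Sr ratios for four candidate locations
expected = np.array([0.7082, 0.7090, 0.7105, 0.7120])
post = assignment_posterior(0.70900, expected, np.full(4, 0.0004), 0.00001)
print(post, "maximum likelihood assignment:", int(post.argmax()))

# with a second tracer, this posterior would serve as the prior for the
# next iteration: post = post * assignment_posterior(obs2, exp2, s2, a2)
```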

Case study 1: Annaghmare, Northern Ireland


The Neolithic court tomb at Annaghmare, known locally as The Black Castle, is located c. 1km to the
north of Crossmaglen in County Armagh, close to the Republic of Ireland border. The tomb, excavated
between 1963 and 1964, is comprised of a trapezoidal cairn with an open forecourt, burial gallery and
two lateral chambers (Waterman & Morton, 1965). It is defined by orthostatic and dry-­stone walls con-
structed from local Silurian rocks. The burial gallery has three chambers (Chambers 1 to 3). Fragments of
cremated bone were present in all three chambers. A child mandible and an adult femur, both of which
were unburnt, were also found in Chamber 2. The mortuary deposits from Annaghmare can be dated
to the second half of the fourth millennium BC (Schulting, Murphy, Jones, & Warren, 2012; Snoeck,
Pouncett et al., 2016), with the unburnt child mandible from Chamber 2 dated to 3485–3105 cal. BC
(UB-­6741: 4556 ± 35 BP) and cremated cranial fragment A2 dated to 3370–3116 cal. BC (OxA-­32110:
4572 ± 28 BP; OxA-­30188: 4532 ± 36 BP). These dates agree with a date that had previously been
obtained for a charcoal sample sealed by the primary blocking of the forecourt, which was dated to
3330–2900 cal. BC (UB-­241: 4935 ± 55 BP) (Smith, Pilcher, & Pearson, 1971).
87Sr/86Sr ratios have been obtained for two of the fragments of cremated bone (A1: 0.71055 ±
0.00001; A2: 0.70900 ± 0.00001) from Annaghmare (Snoeck, Pouncett et al., 2016). The magnitude
of the difference between the 87Sr/86Sr ratios (0.00155) is larger than the variation between duplicate
samples from the same individual (maximum 0.00016) from Bronze Age urns in Northern Ireland
(Snoeck, Pouncett et al., 2016), suggesting that the two fragments of cremated bone come from dif-
ferent individuals. Comparison of the observed isotope ratios for the two individuals to baseline data
from modern plants suggests that while Individual A1 could have been spent the last decade or so of
their life in the immediate vicinity of the site, Individual A2 could not. If Individual A2 did not spend
the last decade or so of their life in the immediate vicinity of the site, this raises the obvious question –
where did they come from?

A BASr baseline suitable for geographic assignment at a national or regional scale has recently been
produced for Ireland (Snoeck et al., 2019). This baseline is based on published 87Sr/86Sr measurements
for modern plants (Ryan, Snoeck, Crowley, & Babechuk, 2018; Snoeck, Pouncett et al., 2016; Snoeck
et al., 2019; Wilson & Standish, 2016) and Geological Survey of Ireland (GSI) Bedrock Geology 500k
Series data (https://data.gov.ie/dataset/gsi-­bedrock-­geology-­500k-­series). Strontium isotope ratios for
the plant samples were aggregated in order of preference by outcrop (single part polygons for individual
outcrops of bedrock), formation (multi-­part polygons for each geological formation) and type/age
(multi-­part polygons for formations of similar type/age), with a range of descriptive statistics calculated
for the polygons corresponding to each outcrop/formation. The 87Sr/86Sr ratios for the modern plants are
not normally distributed (Shapiro-­Wilk test: W = 0.949, df=228, p = 0.000) and the median and median
absolute deviation (MAD) are consequently used to describe the spatial variation in biologically available
strontium rather than the mean and standard deviation.
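A minimal sketch of this robust summary step (with hypothetical plant values, not the published baseline):

```python
import numpy as np

def median_and_mad(values):
    """Median and median absolute deviation (MAD): robust counterparts of
    the mean and standard deviation for non-normal 87Sr/86Sr data."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    return med, np.median(np.abs(values - med))

# hypothetical plant 87Sr/86Sr ratios aggregated for one formation
plants = [0.7099, 0.7104, 0.7108, 0.7110, 0.7113, 0.7160]
med, mad = median_and_mad(plants)
# values falling more than 3 MAD from the median are treated as outliers
print(med, mad, [v for v in plants if abs(v - med) > 3 * mad])
```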
The expected 87Sr/86Sr ratio for the outcrops of Silurian sandstone, greywacke and shale (Formation
49, Snoeck, Pouncett et al., 2016) in the immediate vicinity (<5km) of Annaghmare based on the BASr
baseline for Ireland is 0.71081 ± 0.00049¹ (Figure 11.1). The observed 87Sr/86Sr ratio for Individual A2
falls more than 3 MAD below the median for the Silurian sandstone, greywacke and shale (Figure 11.2),
consistent with the interpretation of this individual as a non-­local. Marked variation, however, can be seen
in the 87Sr/86Sr ratios for the modern plants from the Silurian sandstone, greywacke and shale (n=24). Six
of the plant samples have 87Sr/86Sr ratios which fall more than 3 MAD below the median for the forma-
tion, including two plant samples from the deposits of till overlaying the formation in County Kildare
to the south of Annaghmare, and two plant samples from the gravel deposits overlying the formation in
County Cavan to the west of Annaghmare. In theory, the low 87Sr/86Sr ratios for these plant samples raise
the possibility that Individual A2 could have originated from areas to the south and west of Annaghmare
where the Silurian sandstone, greywacke and shale are locally overlain by drift deposits. In practice, these
plant samples are outliers – the 87Sr/86Sr ratio from cremated bone represents an average of all of the foods
consumed by an individual over c. 10 years and, given the localised nature of the till and gravel, Individual
A2 is unlikely to have only eaten foods corresponding to these outliers (Warham, 2011).
Point-­based comparisons between observed 87Sr/86Sr ratios for archaeological samples and expected
87Sr/86Sr ratios based on baseline data from modern plants are problematic for several reasons: they fail
to account for imprecision in the coordinates for the sites from which samples were taken – a particular
problem with legacy samples from nineteenth century excavations; they do not account for localised dif-
ferences in surface geology; and past populations would not have obtained their food from a single source.
These problems can in part be addressed by calculating BASr catchments, with focal medians calculated
based on the expected values of 87Sr/86Sr ratios for all of the BASr comparanda within a specified distance
of a site (cf. Snoeck, Pouncett et al., 2016). The size of the BASr catchment should be appropriate to
the scale of analysis and the distance from which food would have been sourced, with catchments <5km
representing locally sourced food, catchments <20km representing food sourced from the wider region
and catchments >20km representing food sourced from further afield based on analysis of comparable
Neolithic and Bronze Age sites in Ireland. In contrast to point-­based comparisons which will only reflect
the 87Sr/86Sr ratio of food sourced from a single geological formation, comparisons based on BASr catch-
ments will also reflect the 87Sr/86Sr ratios of food sourced from multiple geological formations depending
upon the spatial extent of the geological formations and the size of the BASr catchments.
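A sketch of the focal median step (on a toy raster; a true 5km catchment on a 100m grid would use a 50-cell radius, reduced here so the example runs quickly):

```python
import numpy as np
from scipy.ndimage import generic_filter

def focal_median(raster, radius_cells):
    """Focal median over a circular window: the expected catchment value
    centred on every cell of the baseline raster."""
    r = radius_cells
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    footprint = (x ** 2 + y ** 2) <= r ** 2
    return generic_filter(raster, np.median, footprint=footprint, mode="nearest")

# toy baseline raster of 87Sr/86Sr values on a 100 m grid
baseline = np.random.default_rng(0).normal(0.710, 0.001, (60, 60))
catchment_medians = focal_median(baseline, 5)  # 50 cells for a true 5 km radius
```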
The BASr baseline for Ireland was converted to a raster dataset with a cell size of 100m to preserve
localised outcrops of bedrock and focal medians, representing the expected 87Sr/86Sr ratios for 5km BASr
catchments, were calculated for each cell in the raster dataset (Figure 11.3, left).
Figure 11.1 Expected 87Sr/86Sr ratios in the immediate vicinity of Annaghmare based on the BASr baseline for
Ireland after Snoeck et al. (2019) (top: median; bottom: median absolute deviation from the median). The black
dots indicate the locations of the modern plant samples used to generate the biologically available strontium (BASr)
baseline and the white dots indicate outliers for the Silurian sandstone, greywacke and shale (Formation 49).
Figure 11.2 Boxplot showing the variation in the observed 87Sr/86Sr ratios of modern plants from the outcrops of
Silurian sandstone, greywacke and shale (Formation 49) in the Annaghmare region. The grey shaded area shows
the 87Sr/86Sr ratios that lie within 1 median absolute deviation (MAD) of the median. Samples with 87Sr/86Sr ratios
less than 3 MAD below the median or greater than 3 MAD above the median can be considered to be non-­locals.

Figure 11.3 Focal medians for 5km BASr catchments calculated from the BASr baseline for Ireland (left) and
residuals between expected 87Sr/86Sr ratios based on the BASr catchments and the observed 87Sr/86Sr ratio for
Individual A2 (right). Locations from which Individual A2 could have originated are shown in white. Loca-
tions from which Individual A2 is unlikely to have originated are shown in blue (more depleted) and orange
(more enriched). A colour version of this figure can be found in the plates section.

Residuals between the observed 87Sr/86Sr ratio for Individual A2 and the expected values of the 87Sr/86Sr ratios for the 5km BASr catchments were subsequently calculated (Figure 11.3, right). The residuals were symbolised using gradu-
ated colours with defined values based on the sum of the sampling error for the baseline data – defined
as the median absolute deviation for the geological formation – and the analytical error for the observed
isotope ratio for Individual A2. Residuals with a magnitude less than the sum of the sampling error and
the analytical error represent possible locations from which Individual A2 could have originated, i.e.
spent the last decade or so of their lives. The possible locations from which Individual A2 could have
originated are largely confined to the area to the south of Annaghmare, including the Boyne Valley, and
the area to the west of Annaghmare. These areas mirror the spatial distribution of passage tombs – in
contrast, the distribution of court tombs is confined to the northern third of Ireland (Darvill, 1979). The
radiocarbon dates from Annaghmare fall within a similar time-­frame to the megalithic tombs at Ballyna-
hatty 1855 and Millin Bay in County Down, both of which have an affinity with the developed passage
tomb tradition of the late fourth millennium BC (Schulting et al., 2012).
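The residual classification used above might be sketched as follows (a toy surface of catchment medians; the thresholds echo the formation MAD and analytical error quoted in the text):

```python
import numpy as np

def possible_origins(catchment_medians, observed, sampling_err, analytical_err):
    """Residuals between expected catchment values and the observed ratio;
    cells within the combined error are possible geographic origins."""
    residuals = catchment_medians - observed
    return residuals, np.abs(residuals) < (sampling_err + analytical_err)

# toy surface of 5 km catchment medians for the BASr baseline
medians = np.random.default_rng(1).normal(0.7095, 0.001, (60, 60))
residuals, mask = possible_origins(medians, 0.70900, 0.00049, 0.00001)
print(mask.mean())  # proportion of the study area flagged as a possible origin
```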

Case study 2: Duggleby Howe, Yorkshire Wolds


The Neolithic round barrow at Duggleby Howe is located c. 300m to the south-­east of the village of
Duggleby at the head of the Great Wold Valley and lies at the centre of a circular enclosure that was
constructed during the Late Neolithic or Early Bronze Age (Gibson et al., 2011; Gibson & Bayliss, 2009;
Riley, 1980). It was partially excavated during the late eighteenth century by the Reverend Christopher
Sykes and was re-­opened by John Robert Mortimer in July and August 1890 (Cole, 1901; Mortimer,
1892, 1893, 1905). The structural sequence at Duggleby Howe is complex, with five phases of construc-
tion proposed (Pouncett, 2019) on the basis of recent radiocarbon dates (Gibson et al., 2011; Gibson &
Bayliss, 2009):

• Phase 1 (Early Neolithic) – the earliest phase of the monument was characterised by a shaft grave
(Grave B), marked by an up-­cast mound of chalk;
• Phase 2 (Middle Neolithic) – two burials on the old land surface and a burial in a shallow grave
(Grave A) were added respecting the position of the shaft grave;
• Phase 3 (Late Neolithic) – an interim mound was constructed, and a series of burials and cremations
were inserted into the mound;
• Phase 4 (Chalcolithic) – a circular enclosure, defined by a causewayed ditch c. 350m in diameter, was
built around the interim mound; and
• Phase 5 (Early Bronze Age) – the interim mound was enlarged substantially with the construction
of a chalk outer mound over 22 feet in height.

Thirteen inhumations and fifty-­three cremations, spanning a period of more than 1,000 years from the
middle of the fourth millennium BC to the late third millennium BC, were found within or beneath
the inner mound. This sequence of burials represents the full spectrum of Middle and Late Neolithic
funerary practices and is considered pivotal in understanding the transition between inhumation and
cremation as the dominant funerary rite during the Neolithic (Loveday, 2002). Prestige goods, including
a polished flint adze, a polished discoidal knife, a perforated antler mace head and a series of boar’s tusk
blades, were found with inhumations from the shaft grave and the old land surface. Duggleby Howe was
a lynch-­pin in the framework established for the classification and dating of Neolithic round barrows
and ring ditches (Kinnes, 1979).
87Sr/86Sr ratios and δ18Op values have been obtained for tooth enamel from seven of the burials from
Duggleby Howe, including the inhumation that was buried at the base of the shaft grave (Burial K:
87Sr/86Sr = 0.70859; δ18Op = 18.9‰) associated with the earliest phase of the monument (Evans, Chen-
ery, & Montgomery, 2012; Montgomery, Cooper, & Evans, 2007; Montgomery, Evans et al., 2007). The
87Sr/86Sr ratios and δ18Op values have been used to suggest that none of the individuals from Duggleby
Howe spent their childhood on the chalk of the Yorkshire Wolds and that Burial K could have come
from as far away as Western Scotland or Cornwall – assertions which have been woven into narratives
about mobility during the Neolithic and Early Bronze Age (Gibson, 2016; Hutton, 2014; Loveday, 2016).
These assertions have been accepted at face value for several reasons: (1) they fit with current models of
settlement practice which regard the uplands of the Yorkshire Wolds as a place where people buried their
dead and the lower-­lying areas to the south and east as the epicentre of Neolithic settlement (Carver,
2012; Harding, 2006; Manby, 1988); (2) they explain the prestige goods found with several of the burials,
including the polished flint adze and discoidal knife thought to have been manufactured in the specialist
workshops at North Dale and South Landing (Durden, 1995; Loveday, 2011; Pierpoint, 1980); (3) they fit
with narratives about mobility which prioritise the exotic over the mundane, with the possible origins of
the Amesbury Archer in the Austrian Alps (Evans et al., 2006) more captivating than an ‘everyday tale of
country folk’ in Wiltshire. Although the individuals buried at Duggleby Howe might not have spent their
childhood on the chalk of the Yorkshire Wolds, this does not necessarily mean that they were not locals.
The geographic origins of Burial K from Duggleby Howe are re-­evaluated below using Bayesian sta-
tistics and maximum likelihood estimation, by calculating probability density surfaces for the burial and

Figure 11.4 Expected 87Sr/86Sr ratios based on the BASr baseline for mainland Britain (after Snoeck et al.,
2018, Figure 2) (left) and expected δ18O values based on the ground water baseline for the United Kingdom
and Republic of Ireland (after Darling et al., 2003, Figure 6) (right). The expected δ18O values have been con-
verted from δ18Ow values to δ18Op values using the equation from Daux et al. (2008) to allow direct comparison
with the observed isotope value for Burial K at Duggleby Howe.

In contrast to the
Annaghmare case study that was reliant upon a single isotope tracer, two isotope tracers can be used to
determine locations from which Burial K could have originated. Two baselines were consequently used
for the purposes of the geographic assignment of Burial K (Figure 11.4): (1) a BASr baseline for mainland
Britain, based on published 87Sr/86Sr measurements (Chenery, Müldner, Evans, Eckardt, & Lewis, 2010;
Evans, Montgomery, Wildman, & Boulton, 2010; Schulting et al., 2019; Snoeck et al., 2018) and Brit-
ish Geological Survey DiGMapGB-­625 bedrock geology data (www.bgs.ac.uk/products/digitalmaps/
digmapgb_625.html); and (2) a δ18O baseline for mainland Britain, based on modern groundwater values
(Darling, Bath, & Talbot, 2003) converted to phosphate values using the equation δ18Op = 0.501 δ18Ow +
20.71 published by Daux et al. (2008). Both of these baselines are suitable for geographic assignment at
a national or regional scale. The expected 87Sr/86Sr ratio for the Cretaceous chalk of the Yorkshire Wolds
from the BASr baseline is 0.70818 ± 0.00036 and the expected δ18Op for the Yorkshire Wolds from the
converted modern δ18Ow values is 16.7 ± 0.3‰. No baseline is without limitations and the drawback of
the baselines currently available for mainland Britain is that they are based on bedrock geology and mod-
ern groundwater and do not directly take into account the other factors which might influence spatial
variation in 87Sr/86Sr ratios or δ18O values highlighted in the introduction to this chapter.
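As a one-line sketch, the conversion quoted above (with the groundwater value as a hypothetical input):

```python
def d18Ow_to_d18Op(d18Ow):
    """Convert groundwater δ18Ow to enamel phosphate δ18Op using the
    regression from Daux et al. (2008) quoted in the text."""
    return 0.501 * d18Ow + 20.71

print(d18Ow_to_d18Op(-8.0))  # c. 16.7 per mil, the Yorkshire Wolds figure
```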

Building on the approach used for the Annaghmare case study, the BASr and δ18Op baselines were
converted into raster datasets with a cell size of 100m, and focal means were calculated to represent 5km
BASr and δ18O catchments for every cell in the resultant raster datasets. Probability density surfaces were
calculated from the focal means using Bayes’ theorem, with the prior probability distribution defined
using either the probability density function for a continuous uniform distribution as a non-informative prior
(single) or using the posterior density distribution for strontium (dual), and the sampling probability
distribution defined using the probability density function for a normal distribution with location μ and
scale σ. Parameters for the normal distribution for each of the isotope tracers were estimated using the
observed 87Sr/86Sr ratio and δ18Op value for the burial and the standard deviation of the expected 87Sr/86Sr
ratio and the converted modern δ18Ow value for the Cretaceous chalk respectively, and the resultant pos-
terior probability densities were rescaled by the largest observed density with values ranging between 0
and 1. A Euclidean distance surface with a cell size of 100m was calculated for Duggleby Howe. Zonal
statistics were then calculated from the probability density and Euclidean distance surfaces using geo-
graphic regions based on National Character Areas (https://naturalengland-­defra.opendata.arcgis.com/
datasets/national-­character-­areas-­england), National Landscape Character Areas (https://landmap-­maps.
naturalresources.wales) and Landscapes of Scotland (https://gateway.snh.gov.uk/natural-­spaces/). Geo-
graphic assignments were determined for Burial K based on the zonal statistics, with the regions ranked
by highest probability density and lowest Euclidean distance (Figure 11.5).
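A compressed sketch of this dual-tracer workflow (with toy surfaces and zones standing in for the focal-mean rasters and character-area polygons, and the observed values and scales taken from the text):

```python
import numpy as np

def rescale(p):
    return p / p.max()  # rescale by the largest observed density

def normal_density(obs, expected, scale):
    return np.exp(-(expected - obs) ** 2 / (2 * scale ** 2)) / (scale * np.sqrt(2 * np.pi))

rng = np.random.default_rng(2)
sr_surface = rng.normal(0.7095, 0.0015, (50, 50))   # focal means, strontium
o_surface = rng.normal(17.5, 0.8, (50, 50))         # focal means, oxygen

# single-tracer posterior for strontium, then updated with oxygen (dual)
p_sr = rescale(normal_density(0.70859, sr_surface, 0.00036))
p_dual = rescale(p_sr * normal_density(18.9, o_surface, 0.3))

# zonal statistics: rank stand-in regions by their maximum posterior density
zones = rng.integers(0, 5, (50, 50))
print(sorted(range(5), key=lambda z: p_dual[zones == z].max(), reverse=True))
```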
The geographic assignments for Burial K based on Bayesian statistics and maximum likelihood esti-
mation paint a different picture to the previous analysis (Montgomery, Evans et al., 2007; Montgomery,
Cooper et al., 2007; Evans et al., 2012; Montgomery & Jay, 2013). Where a single isotope tracer is used,
the observed 87Sr/86Sr ratio for Burial K suggests that the individual spent their childhood in Eastern
Britain with the closest match on the Yorkshire Wolds while the observed δ18Op value suggests that the
individual spent their childhood in Western Britain with the closest match on Barra and Uist, Outer
Hebrides (Table 11.1). The geographic assignment based on the observed 87Sr/86Sr ratio for Burial K
reflects averaging of the expected 87Sr/86Sr ratios of the Cretaceous Chalk of the Yorkshire Wolds, and
the Jurassic Clay of the Howardian Hills and the Triassic Rocks of the Humberhead Levels, the Vale of
Pickering and the Vale of York adjacent to the Yorkshire Wolds. In contrast to point-­based comparisons,
comparisons based on catchments take into account the possibility of locally obtaining food/drink from
more than one source (cf. Montgomery, 2010). Where multiple isotope tracers are used, the observed
87Sr/86Sr ratio and δ18Op value for Burial K suggest that the individual spent their childhood in Western
Britain with the closest match on The Lizard Peninsula, Cornwall. Links with Western Scotland and
Cornwall cannot be evidenced on the basis of the observed 87Sr/86Sr ratio for the burial and are instead
based on the observed δ18Op value for the burial. This discrepancy is repeated for the other burials from
Duggleby Howe (Pouncett, 2019).
Whilst δ18O values are commonly used to narrow the range of possible locations based on 87Sr/86Sr
ratios in multi-isotope tracer approaches, the interpretation of Burial K from Duggleby Howe is dispro-
portionately skewed by the δ18O value – to the point where the geographic assignments are effectively
based on a single tracer isotope, with the local origins supported by the strontium isotope measurement
overridden by the distant origins supported by the oxygen isotope measurement. This analysis raises
significant questions about the utility of oxygen as a tracer isotope and the modern groundwater values
that are used as a baseline for the study of mobility and migration in mainland Britain. Analysis of the
oxygen isotope ratios carried out as part of the Beaker People Project has shown that burials from several
of the burial mounds from Eastern Yorkshire, including Garton Slack 37 on the Yorkshire Wolds, exhibit
more than half of the national variation (Pellegrini et al., 2016). This degree of variation is perhaps not
surprising given that δ18Op values from tooth enamel can be affected by a wide range of factors, including short-term climate conditions, sourcing waters from reservoirs, preparation of food and drink, analytical errors and physiological differences between individuals. The uncertainty introduced by this variability is compounded by the process of converting δ18O values for modern water to δ18O values for tooth enamel, which is known to be problematic (Pollard, Pellegrini, & Lee-Thorp, 2011). Different formulae for converting between δ18O values for modern groundwater and δ18O values for tooth enamel (Chenery, Pashley, Lamb, Sloane, & Evans, 2012; Daux et al., 2008; Longinelli, 1984; Luz et al., 1984; Pollard et al., 2011) would potentially result in an individual being assigned to different geographic regions.

Figure 11.5 Probability density surface (Top Left) and maximum likelihood estimations showing the locations where Burial K from Duggleby Howe could have originated based on the observed 87Sr/86Sr ratio (Top Right), the observed δ18O value (Bottom Left), and both the observed 87Sr/86Sr ratio and δ18O value (Bottom Right). The geographic assignments based on dual isotope tracers are unduly influenced by one of the isotopes (oxygen), raising further questions about the utility of oxygen as a tracer isotope. A colour version of this figure can be found in the plates section.

Table 11.1 Geographic assignments for Burial K from Duggleby Howe ranked by highest probability density and lowest Euclidean distance, using regions based on National Character Areas (England), National Landscape Character Areas (Wales) and Landscapes of Scotland (Scotland).

Rank | 87Sr/86Sr | δ18Op | 87Sr/86Sr + δ18Op
1 | Yorkshire Wolds | Uist and Barra | The Lizard
2 | Vale of Pickering | Harris | Shetland and Fair Isle
3 | Lincolnshire Wolds | Lewis | Skye
4 | The Fens | Isles of Scilly | Small Isles and Ardnamurchan
5 | North West Norfolk | West Penwith | South Devon
6 | Cheviot Fringe | Cornish Killas | Isle of Wight
7 | Rockingham Forest | Skye | Mull
8 | Bedfordshire and Cambridgeshire Claylands | Carnmenellis | Argour and Morven
9 | Central North Norfolk | The Lizard | Blackdowns
10 | Northamptonshire Vales | Lundy | South Coast Plain

Conclusion
The case studies used to illustrate the spatial approaches to assignment which are commonly used to
determine the possible geographic origins of humans and animals highlight two key points. First, the
analysis of the individuals from Annaghmare and Duggleby Howe highlights the ambiguity in the pos-
sible geographic regions from which the individuals originated. Where a single isotope tracer is used
more than one geographic region may have the same isotope measurement as the individual, and where
multiple isotope tracers are used each isotope may suggest that the individual originated from a differ-
ent geographic region. Secondly, the analysis of the individuals from Annaghmare and Duggleby Howe
highlights the importance of the baseline data that are used for the purposes of geographic assignment. If
the baseline data do not adequately account for key factors that influence variation in the isotope tracer
measurements, the resultant geographic assignments will not be reliable.
At Annaghmare, the residuals calculated between the expected 87Sr/86Sr ratio for 5km BASr catch-
ments and the observed 87Sr/86Sr ratio from cremated bone suggested that Individual A2 was non-­local
and could have spent the last decade or so of their life in Central or Western Ireland. The baseline data for
Ireland are based solely on bedrock geology; 87Sr/86Sr ratios comparable to the observed 87Sr/86Sr ratio
for Individual A2 can be found in the superficial deposits that locally overlie the geological formation
on which the court tomb is located, but are not reflected in the plant samples taken from the immediate
vicinity of the tomb (Snoeck, Pouncett et al., 2016). At Duggleby Howe, Bayesian statistics and maximum
likelihood estimation highlighted a discrepancy between the geographic regions from which Burial K
could have originated based on the observed 87Sr/86Sr ratio and δ18O value. The geographic assignment
based solely on the 87Sr/86Sr ratio suggested that the individual was local and could have spent part of
their childhood on the Yorkshire Wolds or adjacent regions, while the geographic assignments based on
the δ18Op value (either as a single isotope tracer or a multi-isotope tracer) suggested that the individual
was non-local and could have spent part of their childhood on the Lizard Peninsula, Cornwall. This dis-
crepancy raises significant questions about the utility of oxygen as an isotope tracer and in particular the
converted δ18O values of modern groundwater, which are often used as a baseline.
Although both of the case studies in this chapter related to the use of isotope tracers from tooth enamel
or cremated bone to ascertain the likelihood that an individual spent a period of c. 2–3 years during
childhood or the last c. 10 years of their life respectively in a particular geographic region, the spatial
approaches that were introduced can be applied to other types of archaeological samples and analytical
measurements, providing that suitable comparative data are available to create a robust baseline for the
purposes of geographic assignment.
Both the approach based on the calculation of residuals and the approach based on Bayesian statistics
and maximum likelihood estimation will yield similar results. The approach based on the calculation of
residuals retains a direct link to the measured values and, as such, is perhaps more intuitive and easier to
interpret – particularly in instances where sources of error are poorly understood at the time the analysis
is carried out.

Acknowledgements
This chapter would not have been possible without the support of Christophe Snoeck who undertook
the original analysis on the samples from Annaghmare, and Maura Pellegrini who carried out the research
which prompted the re-­analysis of the isotope data from Duggleby Howe. It arose from work carried
out with Joanna Ostapkowicz as part of the Black Pitch, Carved Histories project funded by the AHRC
(AH/L00268X/1), and the Stone Interchanges within the Bahama archipelago project funded by the
AHRC (AH/N007476/1). Emma Gowans, Chris Green, Stuart Pouncett, Mark Gillings and Rick Schult-
ing have commented on earlier drafts of this chapter and have greatly improved the final text and figures.
Any errors or omissions are entirely my own. Lastly, I would like to thank the editors for inviting me to
contribute to this volume.

Note
1 The expected 87Sr/86Sr ratio is quoted as the median ± median absolute deviation for the geological formation.

References
Bentley, R. A., & Knipper, C. (2005). Geographical patterns in biologically available strontium, car-
bon and oxygen isotope signatures in prehistoric SW Germany. Archaeometry, 47(3), 629–644. https://doi.
org/10.1111/j.1475-­4754.2005.00223.x
Bowen, G. J., Liu, Z., Vander Zanden, H. B., Zhao, L., & Takahashi, G. (2014). Geographic assignment with stable
isotopes in IsoMAP. Methods in Ecology and Evolution, 5(3), 201–206. https://doi.org/10.1111/2041-­210X.12147
Bowen, G. J., & Wilkinson, B. (2002). Spatial distribution of δ18O in meteoric precipitation. Geology, 30(4), 315–318.
https://doi.org/10.1130/0091-­7613(2002)030 < 0315:SDOOIM>2.0.CO;2
Bray, P., Cuénod, A., Gosden, C., Hommel, P., Liu, R., & Pollard, A. M. (2015). Form and flow: The “karmic cycle”
of copper. Journal of Archaeological Science, 56, 202–209. https://doi.org/10.1016/j.jas.2014.12.013
Brettell, R., Montgomery, J., & Evans, J. (2012). Brewing and stewing: The effect of culturally mediated behaviour
on the oxygen isotope composition of ingested fluids and the implications for human provenance studies. Journal
of Analytical Atomic Spectrometry, 27(5), 778–785. https://doi.org/10.1039/C2JA10335D
Budd, P., Montgomery, J., Barreiro, B., & Thomas, R. G. (2000). Differential diagenesis of strontium in archaeological
human dental tissues. Applied Geochemistry, 15(5), 687–694. https://doi.org/10.1016/S0883-­2927(99)00069-­4
Capo, R. C., Stewart, B. W., & Chadwick, O. A. (1998). Strontium isotopes as tracers of ecosystem processes: theory
and methods. Geoderma, 82(1), 197–225. https://doi.org/10.1016/S0016-­7061(97)00102-­X
Carver, G. (2012). Pits and place-­making: Neolithic habitation and deposition practices in East Yorkshire c. 4000–
2500 BC. Proceedings of the Prehistoric Society, 78, 111–134. https://doi.org/10.1017/S0079497X00027134
Chenery, C. A., Müldner, G., Evans, J., Eckardt, H., & Lewis, M. (2010). Strontium and stable isotope evidence
for diet and mobility in Roman Gloucester, UK. Journal of Archaeological Science, 37(1), 150–163. https://doi.
org/10.1016/j.jas.2009.09.025
Chenery, C. A., Pashley, V., Lamb, A. L., Sloane, H. J., & Evans, J. A. (2012). The oxygen isotope relationship between
the phosphate and structural carbonate fractions of human bioapatite. Rapid Communications in Mass Spectrometry,
26(3), 309–319. https://doi.org/10.1002/rcm.5331
Cole, E. (1901). Duggleby Howe. Transactions of the East Riding Antiquarian Society, 9, 57–61.
Darling, W. G., Bath, A. H., & Talbot, J. C. (2003). The O and H stable isotope composition of freshwaters in the
British Isles. 2. Surface waters and groundwater. Hydrology and Earth System Sciences, 7(2), 183–195. https://doi.
org/10.5194/hess-­7-­183-­2003
Darvill, T. C. (1979). Court cairns, passage graves and social change in Ireland. Man, 14(2), 311. https://doi.
org/10.2307/2801570
Daux, V., Lécuyer, C., Héran, M.-­A., Amiot, R., Simon, L., Fourel, F., . . . Escarguel, G. (2008). Oxygen isotope frac-
tionation between human phosphate and water revisited. Journal of Human Evolution, 55(6), 1138–1147. https://
doi.org/10.1016/j.jhevol.2008.06.006
Durden, T. (1995). The production of specialised flintwork in the later Neolithic: A case study from the Yorkshire
Wolds. Proceedings of the Prehistoric Society, 61, 409–432. https://doi.org/10.1017/S0079497X00003157
Emery, M. V., Prowse, T. L., Elford, S., Schwarcz, H. P., & Brickley, M. (2017). Geographic origins of a War of 1812
skeletal sample integrating oxygen and strontium isotopes with GIS-­based multi-­criteria evaluation analysis. Jour-
nal of Archaeological Science: Reports, 14(Supplement C), 323–331. https://doi.org/10.1016/j.jasrep.2017.06.007
Evans, J. A., Chenery, C. A., & Fitzpatrick, A. P. (2006). Bronze Age childhood migration of individuals near Stone-
henge, revealed by strontium and oxygen isotope tooth enamel analysis. Archaeometry, 48(2), 309–321. https://
doi.org/10.1111/j.1475-­4754.2006.00258.x
Evans, J. A., Chenery, C. A., & Montgomery, J. (2012). A summary of strontium and oxygen isotope variation
in archaeological human tooth enamel excavated from Britain. Journal of Analytical Atomic Spectrometry, 27(5),
754–764. https://doi.org/10.1039/C2JA10362A
Evans, J. A., Montgomery, J., Wildman, G., & Boulton, N. (2010). Spatial variations in biosphere 87Sr/86Sr in Britain.
Journal of the Geological Society, 167(1), 1–4. https://doi.org/10.1144/0016-­76492009-­090
Faure, G. (1986). Principles of isotope geology (2nd ed.). Chichester: Wiley.
Gat, J. R. (1971). Comments on the stable isotope method in regional groundwater investigations. Water Resources
Research, 7(4), 980–993. https://doi.org/10.1029/WR007i004p00980
Gibson, A. (2016). Who were these people? A sideways view and a non-­answer of political proportions. In
K. Brophy, G. MacGregor, & I. Ralston (Eds.), The neolithic of mainland Scotland (pp. 57–73). Edinburgh: Edinburgh
University Press.
Gibson, A., Allen, M., Bradley, P., Carruthers, W., Challinor, D., French, C., . . . Walmsley, C. (2011). Report on the
excavation at the Duggleby Howe causewayed enclosure, North Yorkshire, May–July 2009. Archaeological Journal,
168(1), 1–63. https://doi.org/10.1080/00665983.2011.11020828
Gibson, A., & Bayliss, A. (2009). Recent research at Duggleby Howe, North Yorkshire. Archaeological Journal, 166(1),
39–78. https://doi.org/10.1080/00665983.2009.11078220
Goodchild, M. F. (2003). The nature and value of geographic information. In M. Duckham, M. F. Goodchild, &
M. Worboys (Eds.), Foundations of geographic information science (pp. 18–30). London: Taylor & Francis.
Harding, J. (2006). Pit-­digging, occupation and structured deposition on Rudston Wold, Eastern Yorkshire. Oxford
Journal of Archaeology, 25(2), 109–126. https://doi.org/10.1111/j.1468-­0092.2006.00252.x
Hodell, D. A., Mueller, P. A., McKenzie, J. A., & Mead, G. A. (1989). Strontium isotope stratigraphy and
geochemistry of the late Neogene ocean. Earth and Planetary Science Letters, 92(2), 165–178. https://doi.
org/10.1016/0012-­821X(89)90044-­7
Hoppe, K. A., Koch, P. L., & Furutani, T. T. (2003). Assessing the preservation of biogenic strontium in fossil bones
and tooth enamel. International Journal of Osteoarchaeology, 13(1–2), 20–28. https://doi.org/10.1002/oa.663
Hutton, R. (2014). Pagan Britain. New Haven, CT: Yale University Press.
Kinnes, I. (1979). Round barrows and ring-­ditches in the British Neolithic. London: British Museum.
Laffoon, J. E., Sonnemann, T. F., Shafie, T., Hofman, C. L., Brandes, U., & Davies, G. R. (2017). Investigating human
geographic origins using dual-isotope (87Sr/86Sr, δ18O) assignment approaches. PLoS One, 12(2), e0172562. https://
doi.org/10.1371/journal.pone.0172562
Lightfoot, E., & O’Connell, T. C. (2016). On the use of biomineral oxygen isotope data to identify human migrants
in the archaeological record: Intra-­sample variation, statistical methods and geographical considerations. PLoS
One, 11(4), e0153850. https://doi.org/10.1371/journal.pone.0153850
Lin, G. P., Rau, Y. H., Chen, Y. F., Chou, C. C., & Fu, W. G. (2003). Measurements of δD and δ18O stable isotope
ratios in milk. Journal of Food Science, 68(7), 2192–2195. https://doi.org/10.1111/j.1365-­2621.2003.tb05745.x
Longinelli, A. (1984). Oxygen isotopes in mammal bone phosphate: A new tool for paleohydrological and paleoclimato-
logical research? Geochimica et Cosmochimica Acta, 48(2), 385–390. https://doi.org/10.1016/0016-­7037(84)90259-­X
Loveday, R. (2002). Duggleby Howe revisited. Oxford Journal of Archaeology, 21(2), 135–146. https://doi.
org/10.1111/1468-­0092.00153
Loveday, R. (2011). Polished rectangular flint knives–elaboration or replication? In A. Saville (Ed.), Flint and stone in
the Neolithic period (pp. 37–61). Oxford: Oxbow Books.
Loveday, R. (2016). Monuments to mobility? Investigating cursus patterning in Southern Britain. In J. Leary &
T. Kador (Eds.), Moving on in Neolithic studies: Understanding mobile lives (pp. 67–109). Oxford: Oxbow Books.
Luz, B., Kolodny, Y., & Horowitz, M. (1984). Fractionation of oxygen isotopes between mammalian bone-­
phosphate and environmental drinking water. Geochimica et Cosmochimica Acta, 48(8), 1689–1693. https://doi.
org/10.1016/0016-­7037(84)90338-­7
Manby, T. (1988). The Neolithic in Eastern Yorkshire. In T. Manby (Ed.), Archaeology in Eastern Yorkshire: Essays in hon-
our of T. C. M. Brewster (pp. 35–88). Sheffield: Department of Archaeology and Prehistory, University of Sheffield.
Montgomery, J. (2010). Passports from the past: Investigating human dispersals using strontium isotope analysis of
tooth enamel. Annals of Human Biology, 37(3), 325–346. https://doi.org/10.3109/03014461003649297
Montgomery, J., Budd, P., & Evans, J. (2000). Reconstructing the lifetime movements of ancient people: A
Neolithic case study from Southern England. European Journal of Archaeology, 3(3), 370–385. https://doi.
org/10.1177/146195710000300304
Montgomery, J., Cooper, R. E., & Evans, J. (2007). Foragers, farmers or foreigners? An assessment of dietary stron-
tium isotope variation in Middle Neolithic and Early Bronze Age East Yorkshire. In M. Larsson & M. Parker-­
Pearson (Eds.), From Stonehenge to the Baltic: Living with cultural diversity in the third millennium BC (pp. 65–75).
Oxford: Archaeopress.
Montgomery, J., Evans, J. A., & Cooper, R. E. (2007). Resolving archaeological populations with Sr-­isotope mixing
models. Applied Geochemistry, 22(7), 1502–1514. https://doi.org/10.1016/j.apgeochem.2007.02.009
Montgomery, J., & Jay, M. (2013). The contribution of Skeletal Isotope Analysis to understanding the Bronze Age in
Europe. In H. Fokkens & A. Harding (Eds.), The Oxford handbook of the European Bronze Age. Retrieved from www.
oxfordhandbooks.com/view/10.1093/oxfordhb/9780199572861.001.0001/oxfordhb-­9780199572861-­e-­10
Moreau, L., Brandl, M., Filzmoser, P., Hauzenberger, C., Goemaere, É., Jadin, I., . . . Schmitz, R. W. (2016). Geochemi-
cal sourcing of flint artifacts from Western Belgium and the German Rhineland: Testing hypotheses on Gravettian
period mobility and raw material economy. Geoarchaeology, 31(3), 229–243. https://doi.org/10.1002/gea.21564
Mortimer, J. (1892). An account of the opening of the tumulus “Howe Hill” Duggleby. Proceedings of the Yorkshire
Geological and Polytechnical Society, 12, 215–225.
Mortimer, J. (1893). Further observations on the contents of the Howe Tumulus [Duggleby]. Proceedings of the York-
shire Geological and Polytechnical Society, 12, 242–245.
Mortimer, J. (1905). Forty years’ researches in British and Saxon burial mounds of East Yorkshire: Including Romano-­British
discoveries, and a description of the ancient entrenchments of a section of the Yorkshire Wolds. London: A Brown and Sons,
Limited.
Neil, S., Evans, J., Montgomery, J., & Scarre, C. (2018). Isotopic evidence for landscape use and the role of causewayed
enclosures during the earlier Neolithic in Southern Britain. Proceedings of the Prehistoric Society, 1–21. https://doi.
org/10.1017/ppr.2018.6
Ostapkowicz, J., Brock, F., Wiedenhoeft, A. C., Snoeck, C., Pouncett, J., Baksh-­Comeau, Y., . . . Boomert, A. (2017).
Black pitch, carved histories: Radiocarbon dating, wood species identification and strontium isotope analysis of
prehistoric wood carvings from Trinidad’s Pitch Lake. Journal of Archaeological Science: Reports, 16, 341–358. https://
doi.org/10.1016/j.jasrep.2017.08.018
Pellegrini, M., Pouncett, J., Jay, M., Parker-­Pearson, M., & Richards, M. P. (2016). Tooth enamel oxygen “isoscapes”
show a high degree of human mobility in prehistoric Britain. Scientific Reports, 6. https://doi.org/10.1038/
srep34986
Pierpoint, S. (1980). Social patterns in Yorkshire prehistory 3500–750 B.C. Oxford: British Archaeological Reports.
Pollard, A. M. (Ed.). (2018). Beyond provenance: New approaches to interpreting the chemistry of archaeological copper alloys.
Leuven: Leuven University Press.
Pollard, A. M., Pellegrini, M., & Lee-­Thorp, J. A. (2011). Technical note: Some observations on the conversion of
dental enamel δ18Op values to δ18Ow to determine human mobility. American Journal of Physical Anthropology, 145(3),
499–504. https://doi.org/10.1002/ajpa.21524
Pouncett, J. (2019). Neolithic occupation and stone working on the Yorkshire Wolds (Unpublished doctoral dissertation).
University of Oxford, Oxford.
Price, T. D., Burton, J. H., & Bentley, R. A. (2002). The characterization of biologically available strontium isotope ratios
for the study of prehistoric migration. Archaeometry, 44(1), 117–135. https://doi.org/10.1111/1475-­4754.00047
Riley, D. (1980). Recent air photographs of Duggleby Howe and the Ferrybridge henge. Yorkshire Archaeological
Journal, 52, 174–178.
Ryan, S. E., Snoeck, C., Crowley, Q. G., & Babechuk, M. G. (2018). 87Sr/86Sr and trace element mapping of geosphere-­
hydrosphere-­biosphere interactions: A case study in Ireland. Applied Geochemistry, 92, 209–224. https://doi.
org/10.1016/j.apgeochem.2018.01.007
Schulting, R. J., le Roux, P., Gan,Y. M., Pouncett, J., Hamilton, J., Snoeck, C., . . . Lock, G. (2019). The ups & downs
of Iron Age animal management on the Oxfordshire Ridgeway, south-­central England: A multi-­isotope approach.
Journal of Archaeological Science, 101, 199–212. https://doi.org/10.1016/j.jas.2018.09.006
Schulting, R. J., Murphy, E., Jones, C., & Warren, G. (2012). New dates from the north and a proposed chronology
for Irish court tombs. Proceedings of the Royal Irish Academy. Section C: Archaeology, Celtic Studies, History, Linguistics,
Literature, 112C, 1–60.
Sharp, Z. (2007). Principles of stable isotope geochemistry. Upper Saddle River, NJ: Pearson Education.
Smith, A. G., Pilcher, J. R., & Pearson, G. W. (1971). New radiocarbon dates from Ireland. Antiquity, 45(178), 97–102.
https://doi.org/10.1017/S0003598X00069246
Snoeck, C., Lee-­Thorp, J., Schulting, R., Jong, J. de, Debouge, W., & Mattielli, N. (2015). Calcined bone provides
a reliable substrate for strontium isotope ratios as shown by an enrichment experiment. Rapid Communications in
Mass Spectrometry, 29(1), 107–114. https://doi.org/10.1002/rcm.7078
Snoeck, C., Pouncett, J., Claeys, P., Goderis, S., Mattielli, N., Parker-­Pearson, M., . . . Schulting, R. J. (2018). Strontium
isotope analysis on cremated human remains from Stonehenge support links with west Wales. Scientific Reports,
8(1), 10790. https://doi.org/10.1038/s41598-­018-­28969-­8
Snoeck, C., Pouncett, J., Ramsey, G., Meighan, I. G., Mattielli, N., Goderis, S., . . . Schulting, R. J. (2016). Mobility
during the Neolithic and Bronze Age in Northern Ireland explored using strontium isotope analysis of cremated
human bone. American Journal of Physical Anthropology, 160(3), 397–413. https://doi.org/10.1002/ajpa.22977
Snoeck, C., Ryan, S. E., Pouncett, J., Pellegrini, M., Claeys, P., Wainwright, A., . . . Schulting, R. J. (2019). Towards a
biologically available strontium baseline for Ireland. Manuscript submitted for publication.
Snoeck, C., Schulting, R. J., Lee-­Thorp, J. A., Lebon, M., & Zazzo, A. (2016). Impact of heating conditions on the
carbon and oxygen isotope composition of calcined bone. Journal of Archaeological Science, 65, 32–43. https://doi.
org/10.1016/j.jas.2015.10.013
Warham, J. O. (2011). Mapping biosphere strontium isotope ratios across major lithological boundaries: A systematic inves-
tigation of the major influences on geographic variation in the 87Sr/86Sr composition of bioavailable strontium above the
Cretaceous and Jurassic rocks of England (Doctoral dissertation). Retrieved from https://bradscholars.brad.ac.uk/
handle/10454/5500
Waterman, D. M., & Morton, W. R. M. (1965). The court cairn at Annaghmare, Co. Armagh. Ulster Journal of
Archaeology, 28, 3–46.
White, C. D., Spence, M. W., Longstaffe, F. J., & Law, K. R. (2004). Demography and ethnic continuity in the Tlai-
lotlacan enclave of Teotihuacan: The evidence from stable oxygen isotopes. Journal of Anthropological Archaeology,
23(4), 385–403. https://doi.org/10.1016/j.jaa.2004.08.002
Wilson, J. C., & Standish, C. D. (2016). Mobility and migration in late Iron Age and Early Medieval Ireland. Journal
of Archaeological Science: Reports, 6, 230–241. https://doi.org/10.1016/j.jasrep.2016.02.016
Wright, L. E. (2005). Identifying immigrants to Tikal, Guatemala: Defining local variability in strontium isotope
ratios of human tooth enamel. Journal of Archaeological Science, 32(4), 555–566. https://doi.org/10.1016/j.
jas.2004.11.011
Wunder, M. B. (2010). Using isoscapes to model probability surfaces for determining geographic origins. In J. West,
G. Bowen, T. Dawson, & K. Tu (Eds.), Isoscapes: Understanding movement, pattern, and process on earth through isotope
mapping (pp. 251–270). Dordrecht: Springer.
Wunder, M. B., Kester, C. L., Knopf, F. L., & Rye, R. O. (2005). A test of geographic assignment using isotope tracers
in feathers of known origin. Oecologia, 144(4), 607–617. https://doi.org/10.1007/s00442-­005-­0071-­y
12
Analysing regional environmental
relationships
Kenneth L. Kvamme

Introduction
The study of the distributions of archaeological settlements and sites across the landscape has a long
history. Frequently referred to as “settlement pattern” research or as “settlement archaeology” (Chang,
1968), most scholars agree that it first became a serious focus with the publication of Willey’s (1953)
Prehistoric Settlement Patterns in the Virú Valley, Peru. That work aimed to describe “prehistoric sites with
reference to geographic . . . position” and “to reconstruct cultural institutions insofar as these may be
affected by settlement configurations” (Willey, 1953, p. 1). Billman (1997, p. 1) argues that “no other
single project has so fundamentally changed the manner in which archaeology is conducted.”
Yet, there were antecedents. Trigger (1967) observes that archaeologists have long noticed environ-
mental associations, such as the one between Linearbandkeramic sites and loess soils in Europe. It was
Julian Steward, a cultural anthropologist active in archaeology, who probably played the largest role in
getting settlement archaeology off the ground by combining a focus in cultural ecology with the study
of regions as units in such pioneering papers as “Ecological Aspects of Southwestern Society” (Steward,
1937), where regional archaeological distributions yielded insights into the development of Puebloan
society. Murphy (1977, p. 26) concludes that Steward pioneered settlement archaeology, noting that the
Virú Valley project was largely planned by Steward (along with Wendall Bennett) who “assigned” Willey
to the settlement pattern investigations.
Just what is settlement archaeology and the study of settlement patterns? Before proceeding, it should
be clarified that although settlements are commonly regarded as villages or communities of multiple
households, the investigation of settlement patterns pertains to regional archaeological distributions of
any kind, so hereafter “site” and “settlement” are used interchangeably. Early investigators (e.g. Chang,
1968; Parsons, 1972; Trigger, 1968) commonly recognized several areas of investigation, but my focus
is on the distribution of settlements over the landscape. In this context Kantner (1996, p. 636) defines
settlement pattern as “the distribution of human activities across the landscape and the spatial relationship
between these activities and features of the natural and social environment.” These two environments are
critical to understanding the perspectives and methodologies that have developed.
It is generally argued that the natural environment is central to settlement because regional popula-
tion distributions are largely governed by the nature and availability of natural resources. At the same
time, social, political, and religious institutions also frame patterns of settlement (Trigger, 1968). Winters
(1969, p. 110) views settlement pattern as “geographic and physiographic relationships” (the natural
environment), while settlement system pertains to “the functional relationships among the sites con-
tained within a settlement pattern” (the social environment). The Southwestern Archaeological Research
Group (SARG), a consortium of researchers in the American Southwest that focused on the question
of why “prehistoric populations located sites where they did”, explicitly recognized that sites had to be
investigated in the context of these two environments (Plog & Hill, 1971, pp. 8–9). Separate analytical
approaches have formed around these distinct perspectives in settlement pattern studies.
The foregoing duality is emphasized because it fits well within contemporary spatial analytic perspec-
tives. At the scale of regions an archaeological distribution may be regarded as a point pattern, which is
a realization of one or more spatial processes (O’Sullivan & Unwin, 2003, pp. 64–66). Bevan, Crema,
Li, and Palmisano (2013) describe first-­and second-­order characteristics of a realized point pattern (see
Bevan, this volume). The former influence the intensity of points, causing regional variation. Favour-
able environmental circumstances (soils, access to water) commonly encourage such first-­order trends.
Second-­order effects occur when interactions between the points themselves impact their distributions,
such as repulsion or attraction. In other words, the presence of a site, such as a central place, may influence
the distribution of other sites. This chapter focuses exclusively on first-­order spatial characteristics and
ways that have been employed to analyse environmental relationships with archaeological distributions.

Core methods

Comparisons against environmental backgrounds: categorical data


Early approaches to the regional examination of settlement pattern focused on archaeological distribu-
tions by place, tallying site frequencies according to various environmental categories. The SARG research
design explicitly called for the recording of site frequencies in various environmental settings to under-
stand how they affect site locations (Plog & Hill, 1971). Such approaches have been dubbed “zonal” by
Chapman (2000). Pearson (1978), for example, undertook an “environmental analysis” of Late Mississip-
pian settlements in Georgia by examining frequencies of site size classes by soil type, forest community
type, and distance classes to creeks, finding that larger sites were associated with more valued settings.
Probably the earliest statistical test for investigating site location relationships with environment was
the chi-­square goodness-­of-­fit test because it was well-­suited for zonal approaches with its comparison of
observed site frequencies against expectations derived from the nature of the background environment.
Its pioneering use has been attributed to an MA thesis by Fred Plog (1968), but it soon formed an integral
part of SARG practice (Plog & Hill, 1971). Shennan (1988) devotes an entire chapter to it in Quantify-
ing Archaeology and it remains much-­used in many applications (e.g. Gaffney & Stančič, 1996; Mink II,
Stokes, & Pollack, 2006). Attwell and Fletcher (1987) present a different testing approach for the same
question based on randomization methods which also has received attention (e.g. Kammermans, 2000).
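A minimal sketch of the chi-square tactic (hypothetical counts; each zone's share of the study region supplies the expected frequencies):

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([32, 11, 5, 2])              # site counts by environmental zone
zone_share = np.array([0.30, 0.35, 0.20, 0.15])  # each zone's share of the region

expected = observed.sum() * zone_share           # counts expected under no preference
stat, p = chisquare(observed, f_exp=expected)
print(stat, p)  # a small p-value suggests siting is patterned by zone
```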

Site catchment analysis


Concurrent with much of the foregoing was the development of site catchment analysis (SCA), which
Chapman (2000, p. 533) suggests was a reaction against zonal approaches to settlement studies. It was origi-
nally established by Vita-­Finzi and Higgs (1970) for understanding relationships between settlements and
their environmental surroundings. They posited exploitation territories around sites that formed areas in which
economic activities were performed with catchments of 10 km (a two-­hour walk) for hunter-­gatherers and
5 km (one hour) for farming peoples. An application by Peebles (1978) examined siting preferences of Mis-
sissippian settlements in the vicinity of Moundville, a ceremonial centre in Alabama. Early twentieth century
pre-­hybrid corn yields in bushels per acre by soil type were employed as a proxy for Mississippian period
corn yields (recognizing that absolute yields might be overestimated, but relative scaling would be accurate).
Catchment radii of 0.6 and 1.2 miles (approximately 1 and 2 km, respectively) were investigated around
each site and corn productivity was cumulated by soil type. The data showed that larger settlements were
associated with more productive catchments and strong correlations, some as high as r = .87, were shown
between catchment productivity in bushels of corn and village size.
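In the spirit of Peebles' analysis, a hedged sketch of the catchment-productivity correlation (hypothetical figures, not his data):

```python
import numpy as np
from scipy.stats import pearsonr

# hypothetical per-site catchment productivity (bushels of corn within a
# 1 km radius, cumulated by soil type) and settlement size in hectares
productivity = np.array([420, 310, 560, 150, 480, 220, 390, 610])
village_size = np.array([5.1, 3.2, 7.4, 1.0, 6.0, 2.1, 4.4, 8.2])

r, p = pearsonr(productivity, village_size)
print(r, p)  # a strong positive r would echo the Moundville pattern
```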

Two-­sample comparisons for continuous data


Archaeologists also quantify characteristics observed at sites themselves, such as elevations and slope
gradients, requiring different statistical tests, all of which employ some sort of background comparison.
Two-­sample tests compare measurements acquired at site samples against measurements from a sample of
points taken at random loci from the region, sometimes referred to as “non-­sites” (Warren & Asch, 2000).
Using data from Maine, Kellogg (1987) analysed 190 coastal sites against 190 locations randomly sampled
from 100 m segments of the coastline. Distance to mudflats, distance to fresh water, and aspect azimuth
were compared using two-­sample Kolmogorov-­Smirnov tests, with only the last two variables showing
significant differences. Shermer and Tiffany (1985) compared an environmental “diversity index” (the
number of environmental communities) within catchment areas surrounding Woodland period sites
against 100 random points in Iowa through t-­tests that showed greater diversity in the site catchments.
Kvamme (1985, 2006) theorized that if people select particular contexts for placing settlements then the
locations selected should exhibit less variance than the region as a whole, a circumstance shown to hold
true in several empirical studies (Kvamme, 1990, 1996).

Comparisons between two or more site types


The foregoing has focused exclusively on methods for assessing locational tendencies of a single class of
archaeological sites with respect to the environment of a defined region. A related question is whether
the distributions of two (or more) site types differ in a region with respect to environment. To illustrate,
Maschner (1996) hypothesized that Late Phase villages should be sited in defensible locations compared
to more peaceful Middle Phase villages in the Tebenkof Bay of Alaska, owing to increasing warfare.
Reasoning that a commanding view of the bay on which all sites are situated forms a suitable proxy for
defensibility (because enemies could easily be seen approaching by water, the most likely approach), a
t-­test between the two site types showed significantly greater areas of the bay within viewsheds seen by
Late Phase villages.

One-­sample comparisons for continuous data


Hodder and Orton (1976, pp. 226–229) present a one-­sample testing strategy in a rare investigation of
a socially created landscape variable by examining Iron Age coin find spots and their proximities to the
late Roman road network in the south of England. By comparing the cumulative distribution of find
spot distances to roads against the estimated cumulative distribution of road distances for all the land
area of southern England, they show via a Kolmogorov-­Smirnov one-­sample test that coins exhibit a
significant tendency for proximity to these roads, suggesting that Roman roads may have followed the
courses of earlier ones. In this study the background “population distribution” of distances to roads was
approximated from only three data points – the remainder of the cumulative curve was simply “sketched
in” between them. It would have to wait until the advent of geographical information systems (GIS) more
than a decade later before more accurate approximations of background distributions could be achieved.

Comparing site intensity against environmental categories


Bevan et al. (2013) illustrate how environmental variables under investigation can be “binned” into
discrete ranges composed of a dozen or so levels. Within each bin, site density is also computed. A scat-
terplot is then generated by plotting bin centre-­points against corresponding site densities. This enables
a correlation coefficient to be computed with significant results pointing to relationships between the
covariate and site intensity, a tactic they successfully employ in a case study in Israel-­Palestine.

GIS modifications
By the 1990s GIS technology was making a tremendous impact in regional studies. Besides ease of data
handling, Harris and Lock (1990) realized that it provided a means to link the two primary approaches of
regional investigations: visual and subjective appraisals of map-­type information and quantitative analysis.
It also encouraged a change in the direction of “spatial analysis” which began to lean heavily toward
visualization and away from quantification. Yet, although data quality, volume, and ease of analysis were improved, in practice GIS approaches delivered few advances over analysis methods established decades earlier; they simply made those methods easier to carry out.

New variables
With GIS a variety of new variables could be explored that were previously difficult or impossible to
compute. Two that stand out are viewsheds, areas visible from a point or points, and cost distances, quan-
tifications of non-­linear distances computed by considering difficulty of human travel over landscapes
(see Gillings & Wheatley; Herzog, this volume; Conolly & Lake, 2006, p. 215). Wheatley (1995) exam-
ined inter-­visibility between barrows in the region of Stonehenge, in the United Kingdom, where their
prominence on the landscape has been long noted. A viewshed was computed from each barrow and
the multiple viewsheds were then summed to yield a cumulative viewshed, where each cell in the raster
data structure held the number of barrows visible. Comparing the cumulative distributions measured at
the barrows against the cumulative distributions for the entire region using a one-­sample Kolmogorov-­
Smirnov test showed significantly greater inter-­visibility between barrows, suggesting their siting for
visibility and a ritual authority that required them to be seen.
In GIS-based SCA, Hunt (1992) showed quite early how GIS increase the accuracy with which environmental features are represented within catchments and how more realistic boundaries could be formed utilizing drain-
ages or their basins (as opposed to arbitrary circular territories). However, more profound changes were
introduced by Gaffney and Stančič (1996) who proposed cost-­distance weighted catchments based on
difficulty of travel to or from a site that formed more realistic territories of irregular shape within which
catchment calculations could be based. Ullah (2011) gives a more contemporary example of this approach.

New methods
Raster GIS permit quantification of entire regions on a cell-­by-­cell basis (e.g. every 10 m) allowing
entire “population distributions” of covariates to be more closely approximated for better estimates of
background proportions in goodness-­of-­fit tests. Kvamme (1992a), for example, replicated Hodder and
Orton’s (1976, pp. 226–229) analysis of Iron Age coin finds against the Roman road network in the
south of England, but instead of approximating the background distribution of distances to roads with
only three data points, more than 23,000 were employed, one for each square kilometre in the study
region. This more accurate analysis yielded the same conclusion, but the statistical significance was
somewhat lower. Digital characterization of entire background environments also allows replacement
of two-­sample tests with one-­sample forms, because it is no longer necessary to sample the background.
The unusualness of a single archaeological sample’s locational tendencies can be examined against entire
background populations (Kvamme, 1990).
GIS also facilitate use of resampling or randomization methods where, given a sample of n archaeologi-
cal sites, raster methods can generate thousands of random samples of the same size (with replacement)
for a variable of interest. The result is a sampling distribution of some focus statistic (e.g. mean, median,
variance) against which the actual sample is compared. If the archaeological sample statistic lies in the
extreme five percent of all samples, then tendencies significantly different from the sampled region can be
claimed (Fisher, Farrelly, Maddocks, & Ruggles, 1997; Kvamme, 1996). As a form of permutation test, this
approach is resistant to extreme values and offers freedom from the limiting assumptions (e.g. normality,
homogeneity) associated with many statistical tests (Berry, Johnson, & Mielke, 2014, p. 6).

Archaeological location modelling


Significantly, GIS facilitated the growth of regional archaeological location modelling (ALM), also known
as “predictive modelling”, which permitted mappings of models and projections of archaeological distri-
butions across entire landscapes based largely on relationships between sites and environment (Mehrer &
Wescott, 2006; Verhagen & Whitley, 2012, this volume). Such work further promoted the investigation of
first-­order relationships in order that they could be combined as a basis for modelling. Warren and Asch
(2000), for example, compare measurements from 265 sites against 5,208 non-­sites using Mann-­Whitney
tests for 15 ratio scaled variables and chi-­square or Fisher’s exact test for nine nominally scaled variables
as a means to screen for possible predictors in an ALM in Illinois. Bevan et al. (2013) refer to ALM as one
of the primary tools by which archaeologists have investigated and modelled first-­order effects.

Multivariate approaches
Multivariate approaches came to the forefront with the advent of GIS and ALM. While some ALM only
focus on modelling as an end-­product, largely for cultural resource management and planning, others
employ multivariate statistical models for the insights they offer into relationships between sites and
environment. When all variables are considered jointly in a multivariate analysis many frequently lose
significance owing to inter-­correlations and redundancies between them. Moreover, some models yield
coefficient weights that have interpretive value. Warren and Asch (2000), for example, based on a logistic
regression analysis between samples of sites and non-­sites in central Illinois, interpret findings to show
increasing site probability near streams, in regions of higher local relief, on steep slopes, in floodplains,
and in upland knolls.
Another multivariate strategy comes from species distribution modelling in the biological sciences
(Browning, Beaupré, & Duncan, 2005; Dunn & Duncan, 2000; Rotenberry, Preston, & Knick, 2006)
where an unusual application of principal components analysis (PCA) offers an approach for isolating
dimensions most relevant to analysed locations. It recognizes that many measured variables are obtained
through convenience or ease of measurement through GIS, that some variables may be partially or highly
correlated representing redundant expressions of similar phenomena, and that other variables may not
actually be relevant. PCA permits new independent dimensions to be defined that exhibit minimum
variance at the locations under analysis (i.e. the lowest principal components), thereby removing the most
variable siting aspects of a point pattern and leaving location commonalities in a reduced set of dimen-
sions. This approach was investigated by Kvamme (in press) using historic farmstead data from northwest
Arkansas where the lowest location-­constraining components demonstrate natural and social features of
environment as dimensions relevant to farmstead placement (see the final case study in this chapter).

Issues
One of the greatest difficulties in the investigation of first-­order environmental relationships with archae-
ological distributions is constancy of environment. This is particularly true for zonal approaches that
investigate frequencies of site occurrence with respect to environmental communities, whose boundaries
may have varied through time. Modern alterations to landscape, changing courses of rivers and streams,
and climate change frequently challenge assumptions of unchanged environments (Lock & Harris, 2006).
Such considerations are frequently ignored in locational analyses.
Also ignored is the nature and extent of the “study region” itself. Most studies, particularly in GIS set-
tings, simply employ an arbitrary rectangle surrounding a point pattern of interest. Yet, in virtually all of
the foregoing, statistical conclusions are reached through comparisons against data from the background
region, so the nature of how that region is defined can profoundly affect results. A region that extends
beyond the point pattern might include higher elevations, lower slopes, poorer soils, and the like, that
will bias results when compared against the more restrictive space occupied by sites. These effects were
illustrated by Kvamme (1996) in a central Arizona study that analysed environmental variables measured
at 30 settlements. When the background was defined as the area within a minimum polygon encompass-
ing the settlement distribution, less significant and even insignificant differences were found compared
to an analysis that bounded the distribution with a larger and arbitrary rectangle containing large areas
devoid of settlements.

Method
In the following, a generic null hypothesis is one of “no difference” between sites and background or
between site types in the samples analysed. Depending on the statistical test, this hypothesis might vary
from no differences from expectation to no difference between means, medians, variances, or cumula-
tive distribution functions. Unless otherwise stated, two-­tailed testing forms are presented for simplicity.

Comparisons against environmental backgrounds: categorical data


A study region is categorized into discrete environmental zones when considering such nominal-­level
variables as soil types or plant communities. To illustrate, with k = 3 zones in a region, A, B, and C, sup-
pose respective site frequencies of 25, 25, and 50 are observed. Early approaches would typically argue that
category C was “preferred” because of its larger frequency. The chi-­square goodness-­of-­fit test examines
such frequencies against background expectations. The proportion of the region’s total area that each
class occupies is employed to generate an expected frequency under an assumption (or null hypothesis)
of “no zonal preferences.” In other words, if category C occupies 50% of the region then about half of
the sites, or 50, are expected simply by chance, which exactly matches the observed frequency so no real
tendency for preference or avoidance of that zone may be argued. However, if category A includes only
10% of the region, producing an expectation of 10 sites, a real orientation for sites in that zone is realized
because the 25 observed are far more than expected. The foregoing conclusions are subjective; statistical
evaluation is achieved by computing:

$$\chi^2_{\text{Obs}} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \tag{12.1}$$

(where $O_i$ are observed and $E_i$ are expected frequencies per category), which follows a chi-square distribution with $k - 1$ degrees of freedom (df; when, conventionally, all $E_i \geq 5$; Conover, 1999, p. 240). A
significant result (p << .05) indicates that observed frequencies in one or more categories deviate mark-
edly from expectation, showing preferences or avoidances for those categories.
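As a minimal sketch in R (the software used for this chapter's calculations; see Acknowledgements), the three-zone example above can be tested directly; the 10%/40%/50% split of zonal areas is an assumption chosen to match the illustration:

# Chi-square goodness-of-fit test of zonal site frequencies against
# expectations derived from the proportion of the region in each zone
observed  <- c(A = 25, B = 25, C = 50)          # observed site counts per zone
area_prop <- c(A = 0.10, B = 0.40, C = 0.50)    # assumed areal proportions
chisq.test(observed, p = area_prop)             # expected = n * area_prop
# chi-square = 28.125 with 2 df, p << .05: zone A is used far more than expected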

Two-­sample comparisons for continuous data


The classic t-­test for differences between means is frequently employed for comparing site against back-
ground samples, which technically assumes normality and equality of variances. A modification known
as Welch’s t-test, robust against both assumptions, is preferred:

$$t_{\text{Obs}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \tag{12.2}$$

where $\bar{x}_i$ are the respective sample means (sites and background, or two site types), $s_i^2$ the sample variances, and $n_i$ the sample sizes. This statistic is compared against a t-distribution with $\nu$ df, where

$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2(n_1 - 1)} + \frac{s_2^4}{n_2^2(n_2 - 1)}}$$

(Hays, 1994, pp. 325–328; R Core Team, 2016).
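In R, t.test() applies the Welch form by default; the slope vectors below are simulated stand-ins rather than data from any study:

# Welch's t-test (Eq. (12.2)) comparing site slopes against background slopes
set.seed(42)
site_slope <- rnorm(50, mean = 8.0, sd = 2.7)   # simulated stand-in samples
bg_slope   <- rnorm(50, mean = 5.6, sd = 4.0)
t.test(site_slope, bg_slope)      # var.equal = FALSE (Welch) is the default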


The nonparametric Mann-­Whitney test (also known as the Wilcoxon rank-­sum test) examines loca-
tion shifts in central tendency between two samples. It combines both samples, ranks them ignoring
group membership, and then assesses whether the sum of ranks in one group is unusually large (or small)
which suggests that values in that group are larger (smaller) than those in the other group. The test statistic is:

$$T = \sum_{i=1}^{n_k} R(x_i) \tag{12.3}$$

where $R(x_i)$ is the rank of the $i$th case in group $k$. When $n_1 < 20$ and $n_2 < 20$ and there are no ties, exact probabilities associated with $T$ may be obtained; otherwise, various approximations are employed (Conover, 1999, p. 272).
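The rank-based comparison is a one-liner in R, again with simulated stand-ins; wilcox.test() reports the rank-sum statistic as W:

# Mann-Whitney (Wilcoxon rank-sum) test (Eq. (12.3)) on the two samples
set.seed(42)
site_slope <- rnorm(50, mean = 8.0, sd = 2.7)   # simulated stand-in samples
bg_slope   <- rnorm(50, mean = 5.6, sd = 4.0)
wilcox.test(site_slope, bg_slope)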
The Kolmogorov-­Smirnov test looks for any distributional differences between two samples based on
their empirical cumulative distribution functions. It is nonparametric and has the advantage of a graphical
solution, useful for illustrating differences (see following). The test statistic is simply:

$$T = \sup_x \left| S_1(x) - S_2(x) \right| \tag{12.4}$$
which indicates the greatest vertical distance (denoted by “sup” for supremum) between the $S_i(x)$, the respective sample empirical distribution functions for variable $x$. Under the null hypothesis of no distri-
butional differences, the distribution of T follows the two-­sample Smirnov distribution (Conover, 1999,
p. 456).
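A sketch of the two-sample test and its graphical solution in R, with the same kind of simulated stand-ins:

# Two-sample K-S test (Eq. (12.4)); D is the greatest vertical ecdf separation
set.seed(42)
site_slope <- rnorm(50, mean = 8.0, sd = 2.7)   # simulated stand-in samples
bg_slope   <- rnorm(50, mean = 5.6, sd = 4.0)
ks.test(site_slope, bg_slope)
plot(ecdf(bg_slope), main = "Empirical cumulative distributions")
plot(ecdf(site_slope), add = TRUE, col = "red") # overlay to show the distance D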
The parametric approach for evaluating differences between two sample variances computes an F-ratio:

$$F_{\text{Obs}} = \frac{s_1^2}{s_2^2} \tag{12.5}$$

which follows an F-distribution with $n_1 - 1$ and $n_2 - 1$ df (Hays, 1994, p. 362). Yet this test is highly sensitive to departures from normality, so a nonparametric test, such as Levene’s (Levene, 1960; R Core Team, 2016), is preferred. In the two-sample case it computes:

$$W = \frac{(n_1 + n_2 - 2)\,\sum_{i=1}^{2} n_i \left(\bar{x}_i - \bar{x}_{\text{Tot}}\right)^2}{\sum_{i=1}^{2} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2} \tag{12.6}$$

where $\bar{x}_{\text{Tot}}$ is the grand mean over all cases in both groups. This statistic follows an F-distribution with 1 and $n_1 + n_2 - 2$ df. Kvamme, Stark, and Longacre (1996) demonstrate the superiority of a version of this test (based on medians instead of means) in an archaeological application.
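Both variance comparisons can be sketched in R; leveneTest() comes from the contributed car package and centres on group medians by default (the variant favoured in the application just cited), while the data remain simulated stand-ins:

# F-ratio (Eq. (12.5)) and Levene's test (Eq. (12.6)) for equality of variances
library(car)
set.seed(42)
site_slope <- rnorm(50, mean = 8.0, sd = 2.7)   # simulated stand-in samples
bg_slope   <- rnorm(50, mean = 5.6, sd = 4.0)
var.test(site_slope, bg_slope)                  # parametric F-ratio
values <- c(site_slope, bg_slope)
group  <- factor(rep(c("site", "background"), each = 50))
leveneTest(values, group)                       # robust nonparametric alternative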

One-­sample comparisons for continuous data


The Kolmogorov-­Smirnov one-­sample test is a goodness-­of-­fit test that examines the empirical cumula-
tive distribution function of a single sample, S(x), against a hypothesized theoretical distribution from
which the sample was drawn, F*(x). The full population of all raster cells in a GIS region is employed for
describing F*(x) (Kvamme, 1990). It also has a graphical solution useful for illustrating differences (see
following). The test statistic is:

$$T = \sup_x \left| F^*(x) - S(x) \right| \tag{12.7}$$

which indicates the greatest vertical distance between sample and theoretical distribution functions.
Under a null hypothesis of no distributional differences, the distribution of T follows the one-­sample
Kolmogorov distribution (Conover, 1999, p. 430).
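In R the second argument of ks.test() may itself be a cumulative distribution function, so the ecdf of the full raster population can supply F*(x); both vectors below are simulated assumptions:

# One-sample K-S test (Eq. (12.7)) against a full raster "population"
set.seed(42)
pop  <- rgamma(10000, shape = 2, scale = 3)     # stand-in for all cell values
site <- sample(pop, 50) + 2                     # stand-in, deliberately biased
ks.test(site, ecdf(pop))                        # ecdf(pop) plays the role of F*(x)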
With an entire regional population encoded within GIS, randomization methods, or Monte Carlo significance tests, may be performed. A sample of archaeological locations can be regarded as a subset ($S'$) of $n$ observations from the population. GIS can generate $k$ random samples of size $n$ from the same population ($S_1, S_2, \ldots, S_k$). A summary statistic of interest, $T$, is computed for each sample, yielding $k + 1$ values ($T', T_1, T_2, \ldots, T_k$). Assuming each sample is an equally probable outcome from the population, the probability of the observed test statistic, $T'$, or one more extreme, is the proportion of resampled statistics with values equal to or more extreme than that value. To illustrate, if $k = 999$ and $R(T') = 40$ (i.e. the fortieth smallest), then p = .04. This value should be doubled for two-tailed probabilities.
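A sketch of the Monte Carlo procedure in R, using the mean as the focus statistic and the simulated population from the previous sketch:

# Randomization test: observed sample mean versus k resampled means
set.seed(42)
pop   <- rgamma(10000, shape = 2, scale = 3)    # stand-in regional population
site  <- sample(pop, 50) + 2                    # stand-in site sample
k     <- 9999
T_obs <- mean(site)
T_rnd <- replicate(k, mean(sample(pop, 50, replace = TRUE)))
p_one <- (sum(T_rnd >= T_obs) + 1) / (k + 1)    # proportion as or more extreme
2 * min(p_one, 1 - p_one)                       # doubled for a two-tailed result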

Comparing site intensity against environmental categories


An environmental variable is binned into a dozen or so categories (Bevan et al., 2013). Within each bin, site density is computed (site frequency divided by bin area). Bin midpoints are then plotted against
corresponding site densities. Significant trends suggest site intensity variations with the covariate. With
a null hypothesis of “no correlation”, statistical significance is evaluated with Pearson’s r, the product-­
moment correlation coefficient, given by:

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} \tag{12.8}$$

(where the $x_i$ are bin midpoints, the $y_i$ are bin densities, $n$ is the number of bins, and $-1 \leq r \leq 1$). The ratio $r\sqrt{n - 2}\,/\sqrt{1 - r^2}$ follows a t-distribution with $n - 2$ df (Hays, 1994, p. 647). Owing to the bivariate normality assumption of this test a safer course is to rely on the nonparametric Spearman’s rho, $r_s$, which assesses correlation based on data ranks (Conover, 1999, p. 314).
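The whole binning tactic takes only a few lines in R; the simulated covariate, and the weighting that makes site intensity rise with it, are assumptions of the sketch:

# Bin a covariate, compute site density per bin, correlate with bin midpoints
set.seed(42)
cell_slope <- rgamma(10000, shape = 2, scale = 3)         # all raster cells
site_slope <- sample(cell_slope, 50, prob = cell_slope)   # intensity ~ slope
breaks <- seq(min(cell_slope), max(cell_slope), length.out = 13)   # 12 bins
mids   <- head(breaks, -1) + diff(breaks) / 2
dens   <- as.numeric(table(cut(site_slope, breaks, include.lowest = TRUE))) /
          as.numeric(table(cut(cell_slope, breaks, include.lowest = TRUE)))
cor.test(mids, dens)                        # Pearson's r (Eq. (12.8))
cor.test(mids, dens, method = "spearman")   # safer, rank-based check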

Multivariate approaches
Bevan et al. (2013) go a step further with the foregoing method by examining site intensity relationships
against a suite of environmental covariates which are then subjected to multiple linear regression (Hays,
1994, p. 687) as a means to examine (and model) multivariate relationships. This section, however, exam-
ines the more commonly employed multiple logistic regression model, a nonparametric method that compares
k-­independent variables for differences between two classes, such as site-­presence and site-­absence, which
form a dichotomous dependent variable, Y (coded 1, 0, respectively). Computing the logarithm of the odds ratio

$$\ln\!\left(\frac{P(Y=1)}{1 - P(Y=1)}\right)$$

is known as a logit transformation, which produces a dependent variable that varies between plus and minus infinity and becomes increasingly large as the odds ratio increases. The relationship between this dependent variable and the independent variables then becomes:

$$\text{logit}(Y) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$$

The probability for Class 1 can be found by:

$$P(Y=1) = \frac{e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k}}{1 + e^{\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k}} = \frac{1}{1 + e^{-(\alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k)}} \tag{12.9}$$

The significance of individual coefficients is evaluated by:

$$W_j = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$$

where the hat, “^”, means “estimate of ” and “SE” is the standard error of estimate. Under a null hypothesis
that these coefficients are zero (the variable has no effect), the Wj may be evaluated for significance against
the standard normal distribution to ascertain variables bearing noteworthy relationships with site presence
(Hosmer & Lemeshow, 2000, p. 37). Moreover, the coefficients lend themselves to interpretation. The signs,
when positive, indicate that high values of the associated variable are related to site presence, with the reverse
for negative coefficients. Coefficient sizes may also be compared when measurement scales are the same (standardization may achieve this), with larger absolute coefficients indicating greater influence.
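A hedged sketch of the model in R: glm() with family = binomial fits Eq. (12.9), and the site/background data frame below is simulated rather than drawn from any study:

# Multiple logistic regression for site presence (1) versus background (0)
set.seed(42)
d <- data.frame(
  presence  = rep(c(1, 0), times = c(50, 500)),
  slope     = c(rnorm(50, 8.0, 2.7), rnorm(500, 5.6, 4.0)),
  elevation = c(rnorm(50, 610, 5),   rnorm(500, 605, 8))
)
fit <- glm(presence ~ slope + elevation, data = d, family = binomial)
summary(fit)                     # Wald z-values test H0: beta_j = 0
# standardizing the predictors yields comparable "beta" coefficients:
d_std <- transform(d, slope = as.numeric(scale(slope)),
                      elevation = as.numeric(scale(elevation)))
coef(glm(presence ~ slope + elevation, data = d_std, family = binomial))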
PCA is a multivariate method that can be applied to an n × k matrix of k variables measured at n settle-
ments. The method produces k uncorrelated principal components (PCs) that represent independent sets
of relationships between a settlement distribution and the measured variables. Each PC, 1 through k, is
associated with an eigenvalue (variance), λ, representing the portion of the total variance included in the
k variables, such that λ1 > λ2 > . . . > λk. Each PC is also associated with an eigenvector, k coefficients
that when multiplied against the original variables and summed form the linear composites that are the
PCs. The absolute sizes of these coefficients indicate the relative importance of each variable to each
component, forming a basis for PCA interpretation. In other words, the meaning of a PC is gained by
determining which variables are associated with the largest absolute coefficients. The data rearrangement
of PCA can generate unanticipated insights into data structures and reveal latent underlying dimensions
(Jolliffe, 2002). Unstandardized PCA is based on the original or raw measurements where different scales,
magnitudes, and variances can profoundly influence results. Most applications therefore employ standard-
ized PCA where each variable is standardized to a variance of unity and therefore offers an equal influence
in the analysis. While the highest PCs maximize variation, the lowest exhibit minimum variance. When
the latter are derived from settlement data they portray dimensions that isolate least variable contexts for
settlement locations.
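A sketch of standardized PCA in R with prcomp(); the environmental matrix is simulated for illustration, and the lowest component is read off as the most constrained dimension:

# Standardized PCA of variables measured at n = 589 settlement locations
set.seed(42)
X <- data.frame(elevation = rnorm(589, 600, 20),
                slope     = rgamma(589, shape = 2, scale = 3),
                soil      = runif(589),
                road_dist = rexp(589, rate = 1/300))   # stand-in variables
pca <- prcomp(X, scale. = TRUE)     # standardize each variable to unit variance
pca$sdev^2                          # eigenvalues (variances), largest first
pca$rotation                        # eigenvectors: coefficients per component
# interpret the lowest PC via its largest absolute coefficients:
sort(abs(pca$rotation[, ncol(X)]), decreasing = TRUE)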

Case studies
Many of the foregoing methods are illustrated through a case study from the Sonoran Desert of southern
Arizona that contains a prehistoric Hohokam agricultural field complex of the Classic Period (13th–14th
centuries CE). This field, originally mapped by Fish, Fish, Miksicek, and Madsen (1985), contains an
abundance of agricultural features including terraces, check dams, and the ubiquitous “rock pile.” Rock
piles are circular lens-­shaped mounds of earth about 1.5 m in diameter and .75 m high covered with
fist-­sized cobbles. They were employed for agave growing, an important plant for food and fibre. Local
soils have a high clay content that causes rainfall to run off rather than penetrate. The mounds enhance
the plant-­growing environment because their relatively porous surfaces permit absorption of run-­off
and direct rainfall during the monsoon season (July–August). Moreover, the rocks act like a mulch by
preserving interior moisture. An early GIS-­based analysis of these rock piles (Kvamme, 1992b) revealed
tendencies for higher elevations (perhaps to reduce water run-­off volume), but on steep slopes (necessary
to capture run-­off), with an avoidance of drainages (where too much run-­off could be damaging) and
ridge-­like concave-­down surfaces (that reduce run-­off), and an insignificant tendency for north-­facing
slopes (reducing the effects of intense solar radiation). The same sample of n = 50 rock piles analysed by
that study, and for brevity only the slope data (as percent grade at 4 m spatial resolution), are examined
in the 400 × 400 m study region.

Comparisons against environmental backgrounds: categorical data


GIS methods were employed to generate three equal-­area categories, forcing identical expected values
for each category. A map and tally of the data are illustrated in Figure 12.1(a). The chi-­square statistic
(Eq. (12.1)) was considered as presented, but with the expected values, $E$, constant, it simplifies in this case to $\chi^2_{\text{Obs}} = E^{-1}\sum_{i=1}^{3} O_i^2 - n$, which follows a chi-square distribution with 2 df. Here, the observed chi-square value is $\chi^2_{\text{Obs}} = 31.36$ which, compared to the theoretical $\chi^2$, yields p < .0001, indicating profound location tendencies on the part of the rock piles relative to the defined study region. Inspection of the data reveals avoidance of level ground and preferences for steep slopes.
Figure 12.1 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope in 3 equal-­
area categories, (b) slope data showing rock pile and background samples with boxplot and cumulative graphs,
(c) slope data showing rock pile and check dam samples with boxplot and cumulative graphs, (d) slope data
and rock piles with cumulative graphs, (e) sampling distribution of 9,999 sample means from the region with
indication of realized sample mean.

Two-­sample comparisons for continuous data


A random sample of n = 50 background locations in the study area was generated (Figure 12.1(b)) and
a slope measurement was extracted from each for comparison against the n = 50 rock pile slopes. Basic
descriptive statistics are given in Table 12.1. Welch’s t-­test (Eq. (12.2)) was performed yielding p = .0019
(t = 3.198, df = 86.378), indicating a profound difference between means, with rock piles exhibiting the
larger. While this test is robust against its normality assumption, the boxplot of the background distri-
bution illustrates great skewness. For comparison, the nonparametric Mann-Whitney test (Eq. (12.3))
based on ranks yields p = .0003 (W = 1781). Finally, the Kolmogorov-­Smirnov test (Eq. (12.4)), which
detects distributional differences of any kind between the empirical cumulative distribution functions
(right, Figure 12.1(b)), yields a maximum difference of D = 0.44, significant at p < .0001, with the rock
pile distribution skewed markedly toward higher slopes.
Because a priori theory suggests that human activities and occupations are placed in lower variance
settings compared to a region at large (Kvamme, 1985, 2006), a conventional one-­tailed F-­test for
comparing two variances (Eq. (12.5)) is explored, which yields a highly significant result ($F_{\text{Obs}} = 2.159$; the theoretical $F_{49,49}$ gives p = .004). Yet, because this test is highly sensitive to departures from normality, the nonparametric Levene’s test (Eq. (12.6)) was also run, which yields a significant result ($F_{\text{Obs}} = 6.5763$; the theoretical $F_{1,98}$ gives p = .006), so evidence suggests rock pile placements occur in less variable settings
with respect to slope.

Comparisons between two site types


Another Hohokam agricultural feature is the check dam. Placed across small drainages to slow run-­off
and encourage dropping of sediments to enhance agricultural productivity (Fish et al., 1985), they were
sited differently, but is this true in terms of slope characteristics? The 82 check dams are depicted in
Figure 12.1(c). Boxplots and cumulative distribution graphs suggest little location difference in slopes,
agreeing with the descriptive statistics (Table 12.1). Confirming this, a t-test (Eq. (12.2)) yields p = .224 (t = −1.224, df = 117.53); the Mann-Whitney test (Eq. (12.3)) gives p = .3816 (W = 1863); and the Kolmogorov-Smirnov test (Eq. (12.4)) gives p = .6283 (D = 0.128).

One-­sample comparisons for continuous data


Making use of raster GIS, the rock pile sample may be compared against the full background population
of all 10,000 slope gradients in the study region, thereby removing the vagaries of background sampling
variation. The more robust and continuous cumulative distribution function in Figure 12.1(d) yields a
larger maximum distance of D = .45, significant by a one-­sample Kolmogorov-­Smirnov (Eq. (12.7)) test
at p < .0001.
A permutation test was also applied to these data. Focusing on the sample mean of $\bar{x} = 7.79\%$
(Table 12.1) computed from the n = 50 rock piles, 9,999 random samples of size 50 were extracted with
replacement from the study region and the sample mean was computed for each, yielding the sampling
distribution illustrated in Figure 12.1(e). With the realized (rock pile) sample the most extreme this result
is significant at p = .0002.

Table 12.1 Descriptive slope (percent grade) statistics for rock piles, check dams, and background samples.

Group         n    Min     Median   Mean    Max      Variance
Rock piles    50   1.462   8.170    7.792   14.760   7.326
Check dams    82   1.473   8.101    8.432   17.730   10.452
Background    50   1.189   4.498    5.617   15.470   15.813
Figure 12.2 Slope data and rock pile locations in the Hohokam agricultural field complex: (a) slope in eight
categories with midpoints plotted against rock pile density, (b) rock piles and slopes within a circumscribed
study region with cumulative distribution plots, (c) elevation, slopes, and a logistic regression model for rock
piles based on these data.

Comparing site intensity against environmental categories


The continuous slope data were divided into eight steepness categories (Figure 12.2(a)) and rock pile density
(rock piles per hectare) was computed in each. Slope category midpoints were then plotted against densities
(Figure 12.2(a)) and Pearson’s correlation coefficient (Eq. (12.8)) was run, yielding r = 0.885, significant at p = .0034 (Spearman’s $r_s$ = .837, p = .0096), indicating that point intensity increases with slope.

Issue: redefining “environmental background”


The study area, although well-­centred about the point distribution (Figures 12.1 and 12.2), is an arbitrary
rectangle (quite unlike the point distribution) with substantial regions devoid of rock piles. In these areas
the ground tends to be quite level, which undoubtedly has biased the foregoing analyses by making the
comparative background data appear flatter. One solution to this problem is to bound the study region
more closely to the point distribution of interest (Kvamme, 1996). This can be accomplished by defining
“background” as lying within a minimum bounding polygon surrounding the points, a solution not
without problems (e.g. when the point distribution is highly irregular or non-­contiguous). In any case,
for the present analysis the first nearest-­neighbour distance was computed for each rock pile in the sample
and the mean nearest-­neighbour distance was determined (21.75 m). This distance was then employed
as a buffer radius around each rock pile (via GIS), and a minimum bounding polygon was established
around these buffered areas to redefine a more relevant background region (Figure 12.2(b)). The empiri-
cal cumulative distribution function of the new background is substantially less different from the rock
pile distribution compared to the full background (Figure 12.2(b)), with a maximum difference in a one-sample Kolmogorov-Smirnov test (Eq. (12.7)) of D = .23, compared to the previous D = .45 (Figure 12.1(d)).
The result, nevertheless, remains significant (p < .01).
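The buffered bounding-polygon procedure can be sketched with R's sf package; the coordinates are simulated stand-ins, and the buffer radius follows the mean nearest-neighbour rule described above:

# Redefine "background" as a convex hull around buffered point locations
library(sf)
set.seed(42)
pts <- st_as_sf(data.frame(x = runif(50, 0, 400), y = runif(50, 0, 400)),
                coords = c("x", "y"))     # stand-in rock pile coordinates (m)
d <- st_distance(pts)                     # pairwise distance matrix
diag(d) <- Inf                            # ignore zero self-distances
nn_mean <- mean(apply(d, 1, min))         # mean first nearest-neighbour distance
background <- st_convex_hull(st_union(st_buffer(pts, dist = nn_mean)))
# raster cells inside 'background' would then supply the comparison data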

Multivariate approaches
The slope data are again considered together with the three variables from the original analysis of the
Marana field complex (Kvamme, 1992b) to examine how they simultaneously relate to the rock pile loca-
tions. Globally, these variables, elevation, a ridge-­drainage index, and aspect (measured on a north-­south
scale), exhibit low inter-correlations (Table 12.2, left), indicating they are largely independent (the largest correlation, between slope and aspect, is $r = -.2311$, indicating only $100r^2 = 5.3\%$ of the variance in common). Utilizing a random sample of n = 500 background points (to more fully characterize the background
environment) and the n = 50 rock piles, these data were subjected to a logistic regression analysis (Eq. (12.9))
with coefficients and associated statistics given in Table 12.2 (right). The positive coefficients indicate that
rock pile placement is associated with high values of slope, elevation, and the ridge-­drainage index (pointing
to ridge-­like settings), although the last is insignificant (p = .28), and aspect is inconsequential (p = .77). The
data set was standardized to remove scale differences and the analysis was re-­run to obtain beta coefficients
whose absolute values may be directly compared (Table 12.2). They indicate that slope is nearly twice as
influential as higher elevation settings in the rock pile placements. The coefficients may be directly applied
to the entire region via GIS map algebra to produce a mapping of the modelled relationships using:

$$p(\text{Rock pile}) = \frac{1}{1 + e^{-\left(-7.554 + \ln\left(\frac{500}{50}\right) + .160\,\text{Slope} + .129\,\text{Elevation} + .004\,\text{RidgeIndex} - .001\,\text{Aspect}\right)}}$$

(where the log-ratio removes sample size imbalances).

Table 12.2 Global correlation matrix (left) for the four variables measured in the agricultural field complex and logistic regression parameter estimates (right) indicating multivariate relationships between rock pile presence and the four variables.

              Correlation matrix                  Logistic regression

              Slope    Elevation  RidgeIndex     Estimated     Std. Error   z value   p(>|z|)   Beta Coeff
                                                 Coefficient
Slope         1                                  .160          .0424        3.780     .0002     .5834
Elevation     .2223    1                         .129          .0620        2.080     .0376     .3395
RidgeIndex    –.0011   .1218      1              .004          .0039        1.090     .2756     .1804
Aspect        –.2311   –.0234     .0033          –.001         .0036        –0.291    .7709     –.0440
(Intercept)                                      –7.5537       1.8716       –4.036    .0001     –2.567
Clearly, the effects of slope are dominant (Figure 12.2(c)), but the contribution of elevation may also be discerned.
A second case study utilizes the species distribution modelling technique pioneered in the biologi-
cal sciences based on the lowest PCs of a PCA analysis (Browning et al., 2005; Dunn & Duncan, 2000;
Rotenberry et al., 2006). An historic data set includes 589 farmsteads and roads in an 18 × 27 km area of northwest Arkansas derived from surveyor-grade maps made in 1892 (Kvamme, in press; Figure 12.3(a)).

Figure 12.3 A northwest Arkansas historic data set from 1892: (a) the 18 × 27 km study region with 589
historic farmsteads and roads plotted over topography with towns outlined, (b) maps of the four principal
components of historic settlement with central values of legend indicating most preferred locations. A colour
version of this figure can be found in the plates section.
Table 12.3 Lowest four principal components derived from 10 environmental variables in historic Northwest Arkansas with largest absolute coefficients of eigenvectors shown in boldface for interpretive purposes.

Principal Component:                               PC7      PC8        PC9         PC10
Interpretation:                                    Soils    Landform   Hydrology   Social
Eigenvalue, λ (variance at farmsteads):            .727     .518       .307        .201
% of total variance:                               7.3      5.2        3.1         2.0
Population variance, σ²
(when PC mapped over full region)*:                .962     .622       .336        .236

Eigenvectors
Aspect-NS: 0° (N) – 180° (S)                       –.300    .051       .014        –.001
Aspect-EW: 90° (E) – 270° (W)                      –.263    –.094      –.039       –.031
Elevation (m)                                      .103     –.831      .010        .099
Slope (%)                                          –.277    .086       .070        –.057
Soil quality index                                 .491     .104       .026        –.047
Soil neighbourhood variance                        .663     .017       –.203       –.013
Stream distance (m)                                .119     .394       .638        .125
Stream density (total distance in 400 m radius)    .083     –.318      .695        .222
Road distance (m)                                  .186     –.016      .148        –.682
Road density (total distance in 1 mi radius)       –.134    –.149      .198        –.674

* The ratio $(n - 1)\lambda / \sigma^2$ follows a chi-square distribution with $n - 1$ df (Hays, 1994, p. 362), where n = 589 farmsteads; all four PCs illustrate significantly constrained or lower variance (p < .001).

Eight GIS-­generated variables of the natural environment were acquired from the study region (aspect on
a north-­south scale, aspect on an east-­west scale, elevation, slope, a soil quality index, soil neighbourhood
variance, stream distance, stream density), and two of the social environment (road distance and road den-
sity), all at 10 m spatial resolution. These data were extracted from the loci of the farmsteads and subjected
to a standardized PCA. The four lowest components are of interest because they are location-­constraining,
illustrated by their low eigenvalues compared to their “population variances” when they are mapped to the
full region (Table 12.3). Collectively, they represent less than 18% of the total variance in the data. They also
each appear to isolate very different dimensions relevant to farmstead placement, pointing to the importance
of soils, terrain, hydrology, and the cultural network of roads as the most constraining dimensions pertinent
to farmstead placement as revealed by the farmsteads themselves. The farmsteads exhibit more constancy in
locational variation when measured on these minimum variance components making them suitable vari-
ables for subsequent locational analyses and modelling. GIS map algebra methods were employed to apply
the eigenvectors associated with each component (Table 12.3) to the corresponding standardized data from
throughout the region permitting visualization of these principal dimensions of settlement (Figure 12.3(b))
and an improved understanding of preferred siting contexts.

Conclusion
The analysis of first-­order characteristics of archaeological point patterns at the regional level generally
focuses on their relationships with characteristics of the natural environment, although fixed aspects of the
social environment, such as proximities to road networks or central places, have also been considered. A
wide variety of methods have been utilized, parametric and nonparametric, for examining distributional,
central tendency, or variance characteristics of site samples relative to a region of interest. Moreover,
differences may also be examined between specific site types to examine locational variations between
them. Findings can give insights that help address the question first posed by SARG of why archaeological
sites are located in the places we find them (Plog & Hill, 1971). Positive and negative results increase the
knowledge base necessary for building explanatory models of location, and some of those relationships
can yield unanticipated insights. These methods are also important for screening relevant variables in
ALM settings as a start-­point in the model-­building process. At the same time, ALM models that combine
multiple first-­order characteristics may be the best means for characterizing them. Bevan et al. (2013)
employ ALM to “remove the effects” of first-­order locational mechanisms from a regional point-­pattern
in an effort to better explore second-­order characteristics of a settlement system.
The investigation of first-­order environmental relationships has clear advantages over studies that
focus on second-­order processes of social influences that structure patterns of settlement. For the former,
archaeological samples can be widely spread in non-­contiguous survey areas through a region. Inferences
can then be drawn from the sample to the larger population of sites that hypothetically exist in the region
of study. This is not true in the study of second-­order characteristics where contemporaneity between
settlements and sites needs to be established (making their interaction possible) and broad areas of full-­
coverage survey must be considered because all components of the “system” should be exposed. The latter is necessary so that interactions between the full network of sites, settlements, central places, nearest neighbours, and the like, can be considered.
Finally, an important consideration in regional studies is the definition of “region” itself. Virtually
all analytical methods establish whether relationships exist between environmental features and archaeo-
logical distributions relative to the region investigated. The nature, size, and breadth of that region must
therefore be carefully considered because the nature of findings depends largely on the definition of
region. Regions should be defined either relative to the spread of the archaeological distribution in ques-
tion or according to some a priori construct of arguable relevance, such as a watershed, valley, or political
entity. Clearly, more work needs to be conducted in this critical domain.

Acknowledgements
All statistical calculations and associated graphs were generated using R software (The R Project for Statisti-
cal Computing, www.r-­project.org/). The GIS operations and all maps were generated using TerrSet, by
Clark Labs at Clark University (https://clarklabs.org/terrset/). Excellent comments and improvements to
this chapter were suggested by the editors.

References
Attwell, M. R., & Fletcher, M. (1987). An analytical technique for investigating spatial relationships. Journal of
Archaeological Science, 14, 1–11.
Berry, K. J., Johnson, J. E., & Mielke, Jr., P. W. (2014). A chronicle of permutation statistical methods: 1920–2000, and
beyond. Cham, Switzerland: Springer.
Bevan, A., Crema, E., Li, X., & Palmisano, A. (2013). Intensities, interactions, and uncertainties: some new approaches
to archaeological distributions. In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological spaces
(pp. 27–52). Walnut Creek, CA: Left Coast Press.
Billman, B. R. (1997). Settlement pattern research in the Americas: Past, present, and future. In B. R. Billman &
G. M. Feinman (Eds.), Settlement pattern studies in the Americas: Fifty years since Virú (pp. 1–5). Washington, DC:
Smithsonian Institution Press.
Browning, D. M., Beaupré, S. J., & Duncan, L. (2005). Using partitioned Mahalanobis D2(k) to formulate a GIS-­based
model of timber rattlesnake hibernacula. Journal of Wildlife Management, 69, 33–44.
Chang, K. C. (1968). Settlement archaeology. Palo Alto, CA: National Press Books.
Chapman, J. (2000). Settlement archaeology, theory. In L. Ellis (Ed.), Archaeological method and theory: An encyclopedia
(pp. 551–555). New York: Garland Publishing.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York: John Wiley.
Dunn, J. E., & Duncan, L. (2000). Partitioning Mahalanobis D2 to sharpen GIS classification. In C. A. Brebbia &
P. Pascolo (Eds.), Management information systems 2000: GIS and remote sensing (pp. 195–204). Boston: WIT Press.
Fisher, P., Farrelly, C., Maddocks, A., & Ruggles, C. (1997). Spatial analysis of visible areas from the Bronze Age
cairns of Mull. Journal of Archaeological Science, 24, 581–592.
Fish, S. K., Fish, P. R., Miksicek, C., & Madsen, J. (1985). Prehistoric agave cultivation in Southern Arizona. Desert
Plants, 7, 107–112.
Gaffney, V., & Stančič, Z. (1996). GIS approaches to regional analysis: A case study of the island of Hvar. Ljubljana: Uni-
versity of Ljubljana.
Harris, T. M., & Lock, G. R. (1990). The diffusion of a new technology: A perspective on the adoption of a geo-
graphic information systems within UK archaeology. In K. M. S. Allen, S. W. Green, & E. B. W. Zubrow (Eds.),
Interpreting space: GIS and archaeology (pp. 33–53). London: Taylor & Francis.
Hays, W. L. (1994). Statistics (5th ed.). Fort Worth, TX: Harcourt Brace.
Hodder, I. R., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge: Cambridge University Press.
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression (2nd ed.). New York: John Wiley.
Hunt, E. D. (1992). Upgrading site-­catchment analyses with the use of GIS: Investigating the settlement patterns of
horticulturalists. World Archaeology, 24, 283–309.
Jolliffe, I. T. (2002). Principal components analysis (2nd ed.). New York: Springer-­Verlag.
Kamermans, H. (2000). Land evaluation as predictive modelling: A deductive approach. In G. Lock (Ed.), Beyond
the map: Archaeology and spatial technologies (pp. 125–146). Amsterdam: IOS Press.
Kantner, J. (1996). Settlement pattern analysis. In B. M. Fagan (Ed.), The Oxford companion to archaeology (pp. 636–
638). Oxford: Oxford University Press.
Kellogg, D. C. (1987). Statistical relevance and site locational data. American Antiquity, 52, 143–150.
Kvamme, K. L. (1985). Determining empirical relationships between the natural environment and prehistoric site
locations: A hunter-­gatherer example. In C. Carr (Ed.), For concordance in archaeological analysis: Bridging data struc-
ture, quantitative technique, and theory (pp. 208–238). Kansas City: Westport Publishers.
Kvamme, K. L. (1990). One-­sample tests in regional archaeological analysis: New possibilities through computer
technology. American Antiquity, 55, 367–381.
Kvamme, K. L. (1992a). Geographic information systems and archaeology. In G. Lock & J. Moffett (Eds.), Com-
puter applications and quantitative methods in archaeology 1991 (pp. 77–84). BAR International Series S577. Oxford:
Tempus Reparatum.
Kvamme, K. L. (1992b). Terrain form analysis of archaeological location through geographic information systems.
In G. Lock & J. Moffett (Eds.), Computer applications and quantitative methods in archaeology 1991 (pp. 127–136).
BAR International Series S577. Oxford: Tempus Reparatum.
Kvamme, K. L. (1996). Randomization methods for statistical inference in raster GIS contexts. In A. Bietti, A.
Cazzella, I. Johnson, & A. Voorrips (Eds.), The colloquia of the XIII international congress of prehistoric and protohistoric
sciences, vol. 1: Theoretical and methodological problems (pp. 107–114). Forli, Italy: ABACO.
Kvamme, K. L. (2006). There and back again: Revisiting archaeological locational modeling. In M. W. Mehrer & K.
L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 3–38). Boca Raton, FL: CRC Press.
Kvamme, K. L. (in press). Defining and modeling the dimensions of settlement choice: An empirical approach. In
E. Robinson, S. Harris, & B. F. Codding (Eds.), Cultural landscapes and long-­term human ecology. Berlin: Springer.
Kvamme, K. L., Stark, M. T., & Longacre, W. A. (1996). Alternative procedures for assessing standardization in ceramic
assemblages. American Antiquity, 61, 116–126.
Levene, H. (1960). Robust tests for equality of variances. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H.
B. Mann (Eds.), Contributions to probability and statistics: Essays in honor of Harold Hotelling (pp. 278–292). Stanford:
Stanford University Press.
Lock, G., & Harris, T. (2006). Enhancing predictive archaeological modeling: Integrating location, landscape, and
culture. In M. W. Mehrer & K. L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 41–62). Boca
Raton, FL: CRC Press.
Maschner, H. D. G. (1996). The politics of settlement choice on the Northwest Coast: Cognition, GIS, and coastal
landscapes. In M. Aldenderfer & H. D. G. Maschner (Eds.), Anthropology, space, and geographic information systems
(pp. 175–189). Oxford: Oxford University Press.
Mehrer, M. W., & Wescott, K. L. (Eds.). (2006). GIS and archaeological site location modeling. Boca Raton, FL: CRC Press.
Mink II, P. B., Stokes, B. J., & Pollack, D. (2006). Points vs. polygons: A test case using statewide geographic infor-
mation. In M. W. Mehrer & K. L. Wescott (Eds.), GIS and archaeological site location modelling (pp. 219–239). Boca
Raton, FL: CRC Press.
Murphy, R. F. (1977). Introduction: The anthropological theories of Julian H. Steward. In J. C. Steward & R. F.
Murphy (Eds.), Evolution and ecology: Essays on social transformation by Julian H. Steward (pp. 1–39). Urbana, IL:
University of Illinois Press.
O’Sullivan, D., & Unwin, D. (2003). Geographic information analysis. New York: John Wiley.
Parsons, J. R. (1972). Archaeological settlement patterns. Annual Review of Anthropology, 1, 127–150.
Pearson, C. E. (1978). Analysis of Late Mississippian settlements on Ossabaw Island, Georgia. In B. D. Smith (Ed.),
Mississippian settlement patterns (pp. 53–80). New York: Academic Press.
Peebles, C. S. (1978). Determinants of settlement size and location in the Moundville phase. In B. D. Smith (Ed.),
Mississippian settlement patterns (pp. 369–416). New York: Academic Press.
Plog, F. (1968). Archaeological surveys: A new perspective (Unpublished master’s thesis). Department of Anthropology,
University of Chicago, Chicago.
Plog, F., & Hill, J. N. (1971). Explaining variability in the distributions of sites. In G. J. Gumerman (Ed.), The distri-
bution of prehistoric population aggregates (pp. 7–36). Prescott, AZ: Prescott College Press.
R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing,
Vienna, Austria. Retrieved from www.R-­project.org/
Rotenberry, J. T., Preston, K. L., & Knick, S. T. (2006). GIS-­based niche modeling for mapping species’ habitat.
Ecology, 87, 1458–1464.
Shennan, S. (1988). Quantifying archaeology. Edinburgh: Edinburgh University Press.
Shermer, S. J., & Tiffany, J. A. (1985). Environmental variables as factors in site location: An example from the Upper
Midwest. Midcontinental Journal of Archaeology, 10, 215–240.
Steward, J. H. (1937). Ecological aspects of Southwestern society. Anthropos, 32, 87–104.
Trigger, B. G. (1967). Settlement archaeology: Its goals and promise. American Antiquity, 32, 149–160.
Trigger, B. G. (1968). The determinants of settlement patterns. In K. C. Chang (Ed.), Settlement archaeology
(pp. 53–78). Palo Alto, CA: National Press.
Ullah, I. I. T. (2011). A GIS method for assessing the zone of human-­environmental impact around archaeological
sites: a test case from the Late Neolithic of Wadi Ziqlâb, Jordan. Journal of Archaeological Science, 38, 623–632.
Verhagen, P., & Whitley, T. G. (2012). Integrating archaeological theory and predictive modeling: A live report from
the scene. Journal of Archaeological Method and Theory, 19, 49–100.
Vita-­Finzi, C., & Higgs, E. S. (1970). Prehistoric economy in the Mt. Carmel area of Palestine: Site catchment analy-
sis. Proceedings of the Prehistoric Society, 36, 1–37.
Warren, R. E., & Asch, D. L. (2000). A predictive model of archaeological site location in the Eastern Prairie Pen-
insula. In K. L. Wescott, & R. J. Brandon (Eds.), Practical applications of GIS for archaeologists: A predictive modeling
toolkit (pp. 5–32). London: Taylor and Francis.
Wheatley, D. (1995). Cumulative viewshed analysis: A GIS-­based method for investigating intervisibility, and its
archaeological application. In G. Lock and Z. Stančič (Eds.), Archaeology and geographical information systems
(pp. 171–186). London: Taylor and Francis.
Willey, G. R. (1953). Prehistoric settlement patterns in the Virú Valley, Peru. Bureau of American Ethnology Bulletin 155.
Washington, DC: Smithsonian Institution Press.
Winters, H. D. (1969). The Riverton culture: A second millennium occupation in the Central Wabash Valley. Monographs 1.
Springfield, IL: Illinois State Museum.
13
Predictive spatial modelling
Philip Verhagen and Thomas G. Whitley

Introduction
Archaeological predictive modelling can be defined as a set of techniques employed to predict “the
location of archaeological sites or materials in a region, based either on a sample of that region or on
fundamental notions concerning human behavior” (Kohler & Parker, 1986, p. 400). The basic premise
of predictive modelling is that human spatial behaviour is to a large extent predictable, which implies
that the locations where people lived and performed their daily activities can be identified on the basis
of statistical and/or explanatory models.
The roots of predictive modelling can be traced back to the days of New Archaeology, and in particu-
lar the development of Site Catchment Analysis in the 1970s (Vita-­Finzi & Higgs, 1970). Archaeologists
became aware that human settlement is intimately linked to its environmental setting, which also implied
that it should be possible to predict the locations suitable for human settlement. Around the same time,
Cultural Resource Management (CRM) was developing in North America through legislation aimed at
the protection of cultural heritage. In the absence of sufficient data to identify all archaeological sites in a
region, predictive modelling answered the need for a more comprehensive mapping of cultural resources
and ways in which to avoid impacts to them by development. Although mainframe-­based Geographic
Information Systems (GIS) had already been applied to predictive modelling in a very limited fashion,
the arrival of desktop-­based GIS in the late 1980s paved the way for further proliferation of predictive
modelling, particularly in CRM contexts. Publication of seminal works on the theory and methods of
predictive modelling began initially in the USA (Kohler & Parker, 1986; Judge & Sebastian, 1988) and
later also in Europe (Van Leusen & Kamermans, 2005).
Predictive modelling is used in archaeology for two purposes. First, it is a planning aid for CRM in
order to assess the risks of disturbing archaeological remains during development projects. Here predictive
models serve to inform and influence the decision-­making processes of planners and to convince them
that developments should take place in the least sensitive areas. They also guide the archaeological inves-
tigations once developments have started, by making reasoned choices on where to concentrate research
efforts within the constraints of available time and money. The cost-­effectiveness of the approach has
been proven in CRM, and it has provided a basic degree of protection to zones of high archaeological
potential in those regions where it is included in CRM policies. Second, predictive modelling is applied
as a tool to develop and test scientific models of human locational behaviour. In academic contexts,
therefore, predictive models can be considered as heuristic devices (Verhagen & Whitley, 2012) that can
play an important role in formalizing and quantifying theoretical notions on the development of settle-
ment patterns and land use.
Despite its widespread application, predictive modelling has also encountered substantial criticism
from archaeologists. This is because it will never be able to accurately predict the locations or presence
of all archaeological remains. The models are only as good as the data and theories that have been used
to create them, since we can only extrapolate from the existing state of archaeological knowledge. In that
respect, there is no real difference between theory-­driven or data-­driven approaches; they both reflect
our existing understandings, or biases, about the human past. For many archaeologists the risk of ‘wrong
predictions’ and their undesired consequences for the protection and investigation of the archaeological
record in CRM is unacceptable. From a scientific point of view, however, discrepancies between predic-
tion and data can be the starting point to develop new theories about site location and/or to direct future
data collection practices, which could also be a guiding principle for decision making in CRM.

Method
Predictive models can be made following two different strategies, usually named ‘inductive’ and ‘deduc-
tive’ (Kamermans & Wansleeben, 1999), but more accurately described as ‘data-­driven’ and ‘theory-­
driven’ (Wheatley & Gillings, 2002). In data-­driven modelling, the locations of known archaeological
sites in a study region are compared to a number of parameters that are considered to be important to
settlement choice (such as slope gradient, soil type, or distance to water), using various statistical assess-
ments within a GIS. The results of this quantitative site location analysis can then be extrapolated to
areas where no archaeological data are yet available, and will thus result in a prediction of site densities
or probabilities for the area to be studied. A popular technique for data-­driven predictive modelling
is logistic regression (cf. Warren, 1990; Hudak et al., 2002; Conolly & Lake, 2006, Chapter 8.8). This
is a technique for fitting a prediction curve to a set of observations that is especially suited for vari-
ables that are measured at different scales (nominal, ordinal, interval and/or ratio). Also, fitting to a
logistic rather than to a linear curve has advantages for increasing the statistical contrast between site
and non-­site locations (Warren & Asch, 2000), and as such it has been the preferred tool for predictive
modelling for many years. However, many other options are becoming available, including ecological
niche modelling (cf. Kondo, 2015; Banks, 2017), Monte Carlo simulations (cf. Kvamme, 1997; Vana-
cker et al., 2001) and Bayesian statistics (cf. Finke, Meylemans, & Van de Wauw, 2008; Van Leusen,
Millard, & Ducke, 2009).
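By way of illustration, the core of a data-driven workflow can be sketched in a few lines of Python. The predictor names, input files and the use of scikit-learn's logistic regression below are illustrative assumptions rather than a prescription:

```python
# Minimal sketch of a data-driven predictive model (hypothetical inputs):
# fit a logistic regression of site presence/absence on three common
# environmental predictors, then map predicted probabilities onto a grid
# of unsurveyed locations.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# observations.csv is assumed to hold one row per location, with columns:
# site (1 = known site, 0 = non-site), slope (degrees),
# soil (categorical soil type) and dist_water (metres).
obs = pd.read_csv("observations.csv")
X = pd.get_dummies(obs[["slope", "soil", "dist_water"]], columns=["soil"])
y = obs["site"]

model = LogisticRegression(max_iter=1000).fit(X, y)

# Apply the fitted curve to every cell of the study region (grid.csv is
# assumed to hold the same predictors) to obtain a probability surface.
grid = pd.read_csv("grid.csv")
grid_X = pd.get_dummies(grid, columns=["soil"]).reindex(columns=X.columns, fill_value=0)
grid["p_site"] = model.predict_proba(grid_X)[:, 1]
```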
Despite the current state of sophistication of statistical modelling, it is still very difficult to determine
which statistical technique performs best since there are hardly any case studies available where methods are
compared. Additionally, regression and similar statistical analyses require large datasets of previously known
archaeological sites to produce significant results. Therefore, only well-­represented site types or behaviours
can be predicted using such techniques, and prior biases in those datasets dramatically affect the outcomes.
The theory-­driven approach bypasses most of the statistical complexity by defining theoretical
assumptions about the parameters influencing human spatial behaviour. For example, it can be assumed
that early farming communities preferentially located their settlements in environments well-­suited for
agriculture and animal husbandry, which in turn can be related to parameters such as soil quality and
texture, nitrogen content, and moisture potential or drainage. Those characteristics may be embodied
in, and extracted from, soil type classifications which exist within a GIS dataset. Weights are then given
to the parameters based on the nature of the assumptions about the people who may have been living
there, as well as different site functions or behaviours. The weights and variables are then combined into
predictive formulas which are compared to the known archaeological record in order to judge their per-
formance. This approach has the advantage of including more sophisticated theoretical frameworks that
are based on causal explanation; it can include human agency as a factor, and it is not directly dependent
on archaeological datasets (Verhagen & Whitley, 2012). As a result, there are few restrictions on the types
of sites or behaviours which might be predicted. Deciding on a best possible model however still implies
comparing various parameter weights to the actual archaeological data.
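In its simplest form, such a theory-driven model amounts to a weighted overlay of reclassified map layers. The following sketch uses entirely hypothetical rasters, suitability scores and weights:

```python
# Sketch of a theory-driven (weighted-additive) model: each raster holds
# suitability scores (0-1) assigned from theoretical assumptions, and the
# weights express the assumed relative importance of each parameter.
import numpy as np

soil_quality = np.load("soil_quality.npy")   # hypothetical reclassified rasters
drainage = np.load("drainage.npy")
slope_suit = np.load("slope_suitability.npy")

weights = {"soil": 0.5, "drainage": 0.3, "slope": 0.2}  # assumed, summing to 1

potential = (weights["soil"] * soil_quality
             + weights["drainage"] * drainage
             + weights["slope"] * slope_suit)

# The resulting surface is then compared with the known archaeological
# record to judge the performance of this particular set of weights.
```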

Problems and pitfalls


The potential flaws of predictive models were summarized by Van Leusen and Kamermans (2005):

1 the archaeological input data are usually not representative of the full archaeological record, and the archaeological record itself is a biased reflection of human activity in the past; therefore, we cannot expect that models based on existing archaeological datasets will accurately predict all locations of past human settlement;
2 the predictor variables used are often based on modern-day environmental datasets that may not be at the right level of detail and accuracy for predictive modelling purposes, and may not accurately reflect the situation in the past;
3 socio-cultural variables are usually not included;
4 the temporal resolution of predictive models is limited, since it is determined by the archaeological and environmental input datasets used; and
5 testing of predictive models is often done in a haphazard way and in most cases does not involve a representative field survey; the distinction made between areas of low and high probability has instead often led to a policy of not surveying the low-probability zones, and in this way a self-fulfilling prophecy is created (Wheatley, 2004).

Predictive modelling has often been criticized for restricting itself to a limited set of ‘environmental’ vari-
ables. This can partly be attributed to the fact that there are relatively few relevant datasets available that
cover large areas. The inputs most often used are derivatives of Digital Elevation Models (in particular
slope and aspect), and, to a lesser extent, topographical, geological and pedological maps. This approach
has been successfully applied in many cases, but the datasets are sometimes used in a very uncritical way.
Socio-­cultural variables (such as distance to roads, or to specific archaeological sites), on the other hand,
are much more difficult to implement in predictive models because of the scarcity of relevant data and
lack of quantifiable theoretical models, although there is no inherent barrier to including them (see e.g.
Whitley, Moore, Goel, & Jackson, 2010).
Ideally, the desired degree of accuracy of a predictive model should determine what is needed in terms
of data and knowledge. In practice, however, predictive models are often made on the basis of datasets
that happen to be available, and for this reason they can vary considerably in their accuracy. It is therefore
extremely important that the models are tested, both internally in order to establish the uncertainties
in the parameters and archaeological data used by means of sensitivity analysis, and externally by add-
ing independent, representative archaeological data. Establishing the representativeness of archaeological
datasets however is often very difficult, since in many cases there is insufficient information about the
intensity and methods of survey applied and about the influence of potential biases, such as the visibility
of archaeological remains on the surface. These factors highly influence not just the number of archaeo-
logical sites found but also the types of sites that can be discovered successfully (Verhagen, 2008).
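To illustrate what an internal test might look like in practice, the following sketch jitters the weights of a simple weighted-additive model and records how stable the proportion of known sites captured by the high-potential zone is; all input rasters, weights and thresholds are hypothetical:

```python
# Sketch of a simple internal sensitivity test: perturb the parameter
# weights and record how the proportion of known sites captured by the
# top 20% of the prediction surface varies.
import numpy as np

rng = np.random.default_rng(3)
layers = np.stack([np.load(f) for f in
                   ("soil_quality.npy", "drainage.npy", "slope_suitability.npy")])
sites = np.load("site_cells.npy")            # boolean raster of known sites
base_w = np.array([0.5, 0.3, 0.2])

captures = []
for _ in range(500):
    w = np.abs(base_w + rng.normal(0, 0.05, size=3))
    w /= w.sum()                             # renormalise the jittered weights
    surface = np.tensordot(w, layers, axes=1)
    zone = surface >= np.quantile(surface, 0.8)   # top 20% of cells
    captures.append(sites[zone].sum() / sites.sum())

print(np.mean(captures), np.std(captures))   # stability of site capture
```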
Creating and testing archaeological predictive models is therefore a complex exercise, at the end of
which the results need to be translated into terms that can be easily implemented in CRM and other
planning contexts. Formal statistical assessments do not necessarily play an important role in this. Instead,
a number of explicit and implicit assumptions about the importance of specific archaeological remains,
their state of preservation, and the costs of excavating them are also used to assess the archaeological and
financial risks of development plans. Thus, creating predictive models is only one stage in the decision-­
making process surrounding archaeology in spatial planning, and as such their role in CRM should not
be overemphasized. However, since they are used at the beginning of the planning process and will direct
decision making in subsequent stages, their accuracy should be a major concern to archaeologists, devel-
opers and planners alike.

Case studies

The Mn/Model
A classic example of large-scale agency-supported predictive modelling is the 'Minnesota Model' (abbreviated Mn/Model; Hudak et al., 2002). The Mn/Model was the first archaeological
predictive model to be applied to an entire US state, and was originally initiated in 1995 by the Minnesota
Department of Transportation with the financial support of the US Federal Highway Administration. It
was also the first widely applied, data-­driven model to consider survey bias and depth of deposits in its
application.
The main objective of the Mn/Model was to provide transportation planners with a GIS-­based tool
that would help them identify areas likely to contain archaeological sites, so that they could be avoided.
The idea was that these sites could be identified early on in the planning process, thereby saving time and
expense later on when transportation projects were underway. The primary methodological assumptions
of the model were:

1 That only pre-­1837 sites could be predicted using the methods employed. The year 1837 marks the
earliest permanent historic-­era settlement within Minnesota, and it was largely assumed that historic
Euro-­American settlement followed more complex patterns not defined by environmental variables.
2 That separate models were necessary for 24 different ‘environmental regions’ within Minnesota.
Each region was defined based on topographic distinctions, ecological communities, or geomorpho-
logical origins. They each also had a unique set of pre-­existing archaeological sites from which the
correlative analyses were drawn.
3 That paleo-landscape and geomorphological modelling would additionally add the 'third and fourth' dimensions of depth and time to the analysis.
4 That issues with sample size and pre-­existing survey bias could be overcome by using statistical
techniques. These techniques would allow the generation of appropriate datasets from which cor-
relations could be derived.

The Mn/Model is a set of 24 multiple logistic regression models (one for each environmental subregion)
that each identify correlations between a dependent variable (e.g. site presence/absence) and a wide range
of independent variables (the environmental predictors). In this case, predictor variables were derived
from elevation, watersheds, hydrology, soils, geology, vegetation maps, anthropogenic disturbances (usually
modern), paleoclimate models of temperature, precipitation, geomorphological events, and palynology,
as well as some historical cultural features in limited contexts (Hudak et al., 2002). Known site locations
were evaluated against 'non-sites' using these variables in a logistic regression analysis for each environmental region over three successive phases, each modified in response to data quality or other issues encountered during the process.
The resulting regional models met or exceeded the performance expected by the modellers, with an average gain statistic (cf. Kvamme, 1988) of about 0.71 for all regions combined during Phase 3, up from 0.37 in Phase 1 and 0.68 in Phase 2. Individual gain statistics, however, varied widely across regions, from as low as 0.40 to as high as 0.89, depending on the number of sites being modelled and the ability of each model to reduce the size of the high/medium probability areas.
Gain is calculated as follows (Kvamme, 1988):

G = 1 − p_a / p_s

where
p_a = the proportion of the total area covered by the zone of interest (usually the zone of high probability); and
p_s = the proportion of sites found within the zone of interest.

If the area likely to contain sites in a region is small (the model is very precise), and the sites found in
that area represent a large proportion of the total (the model is very accurate), then we will have a model
with a high gain.
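In code the statistic is trivial to compute; the sketch below assumes boolean rasters marking the zone of interest and the known site locations:

```python
import numpy as np

def kvamme_gain(zone, sites):
    """Kvamme's gain; zone and sites are boolean arrays over the region."""
    p_a = zone.mean()                      # proportion of area in the zone
    p_s = sites[zone].sum() / sites.sum()  # proportion of sites in the zone
    return 1 - p_a / p_s

# e.g. a zone covering 20% of the region that captures 70% of known sites
# gives a gain of 1 - 0.20/0.70 ≈ 0.71.
```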
The authors note that the positive results in some cases can be very misleading due to biased survey, few known sites in limited environments, and very few actual surveys having been conducted. The model's total cost over a period of seven years was $4.5 million. Nevertheless, the Mn/Model has been held up as a successful application of archaeological predictive modelling, since it is estimated to have saved the State of Minnesota about $3 million per year for the first four years of its implementation. Several other states have followed with their own statewide data-driven archaeological predictive models, including North Carolina and Washington.

The LAMAP approach


The most popular methods for probabilistic predictive modelling of unrecorded site locations – logistic
regression and weights-­of-­evidence modelling – are not well suited for dealing with site presence-­only
data. Unfortunately, many archaeological survey data sets have little or no information on site absence,
and predictions are then usually made using 'pseudo-absence' data, generated by assuming that non-sites are distributed randomly or uniformly over the whole study region. This is justified by the argument that the proportion of sites relative to non-sites is very small, and thus the distribution of non-sites will closely approximate such a random or uniform distribution (Kvamme, 1988).
Carleton, Conolly, and Iannone (2012) and Carleton et al. (2017) instead defined the notion of
archaeological potential as ‘the relative suitability of different land parcels within a confined region for
human occupation’, a measure that can be estimated from site presence-­only data. A similar concept is
applied for expert-­judgment based predictive models in the Netherlands (see Van Leusen & Kamermans,
2005), but these do not include quantitative estimations.
The underlying theoretical concept assumes that, when deciding where to settle, people will preferen-
tially select the best of the available options using ‘mental archetypes’ from nearby areas. So locations that
are more similar to nearby sites will be more likely to be settled than others. The mental archetypes
themselves are not directly accessible, but since existing sites are realizations of these archetypes, it can be
assumed that their characteristics can be used for predictive modelling purposes by finding the locations
that are most similar to them.
The authors developed a new methodology for this purpose, the Locally Adaptive Model of Archaeo-
logical Potential (LAMAP), that employs not just the information from a site’s location itself, but from
a predefined (circular) neighbourhood around the site. The characteristics of these known site location
surroundings are then compared to the whole study region, resulting in a measure of similarity of each
location in the region to the characteristics of the known sites. Such an approach is not completely new:
similar GIS-­based analyses had already been undertaken extensively in the south of France since the 1990s
(see Favory, Nuninger, & Sanders, 2012). However, these were not aimed at predictive modelling but only
at analysing site location preferences.
The LAMAP method is implemented by first calculating the frequency of each particular value of
a landscape characteristic, like elevation, within a neighbourhood around a site. Optionally, the model
accommodates distance weighting, so that locations closer to the site will be considered as more impor-
tant than those further away. Then, it is established how probable it is that the set of values observed jointly within the site's neighbourhood also occurs elsewhere.
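The following deliberately simplified sketch conveys the core idea for a single landscape variable (it is not the published LAMAP algorithm, and all inputs are hypothetical): characterise each site's neighbourhood by the empirical distribution of that variable, then score every cell by how typical its value is of those neighbourhoods:

```python
# Simplified LAMAP-style scoring for one variable (illustrative only).
import numpy as np

elevation = np.load("elevation.npy")     # hypothetical raster
site_cells = [(120, 340), (87, 512)]     # known site (row, col) positions
radius = 10                              # neighbourhood radius in cells

def neighbourhood_values(raster, r, c, radius):
    r0, c0 = max(r - radius, 0), max(c - radius, 0)
    return raster[r0:r + radius + 1, c0:c + radius + 1].ravel()

# Empirical distribution of elevation around each site, on common bins.
bins = np.linspace(elevation.min(), elevation.max(), 51)
hists = [np.histogram(neighbourhood_values(elevation, r, c, radius),
                      bins=bins, density=True)[0]
         for r, c in site_cells]

# Score each cell by the highest density its (binned) elevation value
# attains across all site neighbourhoods.
cell_bins = np.clip(np.digitize(elevation, bins) - 1, 0, len(bins) - 2)
potential = np.max([h[cell_bins] for h in hists], axis=0)
```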
Carleton et al. (2012) first prepared a test model for a set of Maya sites in Belize, using a conventional
set of environmental parameters and a distance radius of 1 km. The model’s performance, measured with
Kvamme’s gain statistic, was considered to be good, although the testing was only done by holding back
a portion of the site sample (split sampling; see Kvamme, 1988, for more details). Later, they also tested the
model using new field survey data and sites newly identified on Light Detection and Ranging (LiDAR)
images (Carleton et al., 2017). This resulted in a very close correlation between prediction and site occur-
rence. The high success rate coupled to the relatively simple implementation suggests that this method
is a good solution for data-­driven predictive modelling on the basis of site presence-­only data without
having to resort to using pseudo-­absence data.

Georgia Coast Model


In response to the overwhelming number of predictive models that are based on biased datasets and rely
on a data-­driven approach, Whitley (2003, 2005, 2010) addressed ideas of causality and determinism in
predictive modelling and the misleading use of the gain statistic as the sole measure of model success.
His approach to a theory-driven model was based on examining human energetics, or the idea that all human behaviour entails maintaining a balance between energy gains and expenditure, and that site 'selection' was a cognitive process involving both choice and risk. To that end, he initi-
ated the Georgia Coast Model in 2009, as part of a large-­scale analysis of the collection, storage, trade,
and consumption of faunal and floral resources in the coastal region of the US state of Georgia between
4500 and 300 BP (Whitley et al., 2010; Whitley, 2013). The goal was not to create a predictive model
per se, but to use geospatial analysis, driven by theory, to develop explanations for why certain areas may
have been chosen for settlement.
Rather than predict the locations of archaeological sites, the energetics approach was to predict the
locations of faunal and floral resources during different seasons and over long periods, as well as modelling
other environmental variables that limited the availability of, or access to, those resources. The approach
relies on key concepts from Optimal Foraging theory (MacArthur & Pianka, 1966; Emlen, 1966), Cen-
tral Place Foraging (Orians & Pearson, 1979; Stephens & Krebs, 1986), as well as Diet Breadth (Hames &
Vickers, 1982; O'Connell & Hawkes, 1984; Winterhalder & Kennett, 2006; Smith, 1991; Grayson &
Delpech, 1998) and Prospect theories (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992; Wak-
ker, Timmermans, & Machielse, 2003) to model how prehistoric people may have optimized their energy consumption and expenditure, given assumptions about their diet known from the archaeological
record. Through this it would be possible to model a variety of outcomes including the nature of catch-
ments from known sites, dietary preferences and sustainable populations, the evolution from land-­based
to maritime diets in the region, seasonal resource stability, and even complex concepts such as resource
competition and social dominance. Additionally, the approach could be used to generate an archaeologi-
cal predictive model for sites, based on their likely function, in non-­surveyed areas.
The overall model is designed as an intersection of a series of environmental and habitat models devel-
oped in the GIS. These are weighted-­additive predictive models, exactly like the kind used for predicting
archaeological site locations in other situations described above. But these are intended to predict the
habitat suitability for one specific resource at one specific time of year based on regional biological studies
of that particular organism and existing local, state, and regional habitat models. Fifteen different envi-
ronmental variables are used to develop geospatial models for 37 different forage categories (i.e. faunal
or floral species or groupings), for each month of the year. These models represent suitability as a range
of values from 0 (not suitable) to 1 (highest suitability) (Figures 13.1 and 13.2).
The habitat models are then converted to calorific surfaces based on estimates of species population
size, density, reproductive rates, mortality, and resilience, during each month. A calorific surface is a GIS
layer, which shows a prediction for the number of calories (kCal) one might acquire from any one pixel
(or map unit) from each resource at any given time of year. Instead of a decimal value between 0 and 1,
as in the habitat models, the calorific surfaces represent numbers of calories. By adding them all together,
one gets a total number of predicted calories at every GIS pixel in the study area. The resulting ‘available’
calorific surfaces are then modified into models of ‘returned’ calories by subtracting the calorific ‘costs’
of acquiring the ‘available’ resources (Figures 13.3 and 13.4).
The cost formulas are based on energy expenditures for individuals and families calculated by Thomas
(2008), for each of the resources used in the study. But they are also tempered by known archaeological faunal/floral assemblages and the periodic introduction of different technologies, such as the bow-and-arrow, crop staples, and grain storage. Subtracting calories based on rates of energy loss through decay and trade, as well as dietary preferences (e.g. personal tastes), yields a 'selected' energy model. In short, the available calories are the ones predicted by the habitat models. The
returned calories are the predicted calories but with the costs of accessing and processing them subtracted.
The selected calories are the ones remaining after some have been lost over time from decay, traded away
to someone else, or left uncollected for some other reason.
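In GIS terms the chain from habitat suitability to 'selected' calories is straightforward raster arithmetic, as the following sketch illustrates; the file names, yield figure and loss rate are all hypothetical:

```python
# Sketch of the habitat -> available -> returned -> selected chain as
# raster arithmetic (all inputs and conversion factors hypothetical).
import numpy as np

habitat = np.load("deer_habitat_sept.npy")   # suitability scores, 0-1
kcal_per_cell = 5000.0                       # assumed maximum kcal per cell

available = habitat * kcal_per_cell          # 'available' calorific surface
cost = np.load("acquisition_cost_sept.npy")  # kcal spent acquiring/processing
returned = np.maximum(available - cost, 0)   # 'returned' calories

loss_rate = 0.25                             # assumed decay/trade/preference loss
selected = returned * (1 - loss_rate)        # 'selected' calories

# Summing such surfaces across all 37 forage categories would give the
# total predicted calories per cell for a given month.
```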
Ultimately, the objectives of the model were meant to be explanatory, in that they helped answer
questions about periods of site occupation, seasonality, diet sufficiency, patterns of exploitation, regional
trade, and competition. Doing so meant applying to the analysis the locations of some 7000 known archaeological sites, of which only 103 were well-dated domestic sites. Although no predictive model was
actually intended from the outcome, Whitley used a sample area with a total of 308 known archaeological
sites, did a simple unweighted combination of all calorific variables, and split them into three equal parts
creating areas of low, moderate, and high total calorific value. These were then overlaid with the known
archaeological sites and compared to see how many occurred in high probability areas.
This simple analysis showed that even without formal constructs separating out sites by period, function, or seasonality, or a more sophisticated evaluation of the boundaries between high and low potential, the gain values were in excess of 0.80, higher than typical data-driven models and unheard of for areas without high terrain dissection or limitations on water availability. The objective here was not to evaluate the application of this particular simplistic predictive model for development purposes, but to illustrate that a theory-driven approach was far more powerful than a data-driven one in predicting how human behaviour shapes site selection. The gain statistic, even with its inherent flaws, was merely used here as a device for comparison with far more expensive data-driven models like the Mn/Model. This approach was also used in a more formal predictive model successfully applied in parts of Louisiana, Arkansas and Mississippi (Whitley et al., 2011), but which remained untested in the field since federal funding expired.

Figure 13.1 Southern portion of the coastal Georgia study area: maximum available calories for white-tailed deer (Odocoileus virginianus) for the month of September (ca. 500 BP). A colour version of this figure can be found in the plates section.
Figure 13.2 Southern portion of the coastal Georgia study area: maximum available calories for all shellfish species for the month of September (ca. 500 BP). A colour version of this figure can be found in the plates section.
Figure 13.3 Southern portion of the coastal Georgia study area: returnable calories for all resources combined for the month of January (ca. 500 BP). A colour version of this figure can be found in the plates section.
Figure 13.4 Southern portion of the coastal Georgia study area: returnable calories for all resources combined for the month of September (ca. 500 BP). A colour version of this figure can be found in the plates section.

The effects of pre-­existing settlement on location choice


The predictive modelling examples given so far have only employed the environmental characteristics of
site location. This set of predictors can be supplemented by analysing the spatial relationships between
settlements. However, modelling the spatial influence of existing settlement on new settlement location
choice was until recently mostly based on relatively simple theories about the effects of cost-­distance.
Archaeological studies in the 1970s, for example, already applied gravity models to infer networks within
(political) territories, departing from the assumption that central places have a cluster of dependent sites
surrounding them. Central places can be given weights to reflect different hierarchical ranks, and depen-
dent sites are then connected to the central place which is closest in terms of weighted (cost-­)distance
(e.g. Hodder, 1974; Alden, 1979; Renfrew & Level, 1979; Ducke & Kroefges, 2008). These predicted
territorial networks can also be used to infer the location of ‘missing’ sites in areas where connections are
modelled but no settlements have been recorded.
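A minimal sketch of such a weighted (cost-)distance allocation, here using straight-line distance and hypothetical coordinates and weights in the spirit of XTENT-type models:

```python
# Sketch: attach each dependent site to the central place whose weighted
# distance is smallest (coordinates and weights are hypothetical; a real
# application would use cost-distances rather than Euclidean ones).
import numpy as np

centres = np.array([[10.0, 20.0], [55.0, 40.0]])  # central place coordinates
weights = np.array([3.0, 1.5])                    # hierarchical ranks
sites = np.array([[12.0, 25.0], [40.0, 38.0], [60.0, 40.0]])

d = np.linalg.norm(sites[:, None, :] - centres[None, :, :], axis=2)
allocation = np.argmin(d / weights, axis=1)       # index of 'owning' centre
```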
It is only recently that researchers have started to explore more complex models of settlement interac-
tion, and translate these into spatial predictions. Verhagen, Nuninger, Bertoncello, and Castrorao Barba
(2016), for example, defined the concept of ‘memory of landscape’ on the assumption that not just the
presence of current, but also of previously existing settlement may influence the choice of site location.
Their case study of rural settlements from the Roman period in the south of France demonstrated that this effect existed, even though it did not appear to be a very strong predictor in this particular study area.
The effect also changed depending on the timeframe considered. In some periods, new settlements
seemed to be more pioneering, preferring locations in previously unexploited areas, whereas others
showed ‘opportunistic’ behaviour by choosing locations close to previously settled areas. This approach
was further elaborated by Nüsslein, Nuninger, and Verhagen (in press) in a case study in northeast France,
where the creation and persistence of settlement patterns was observed to depend on the structure of
local, hierarchical networks.
Rihll and Wilson (1987, 1991) modelled the presence of highly dominant sites without making initial
assumptions about their importance on the basis of size or other considerations. Instead, they departed
from equally weighted sites so that the weights were adapted in an iterative procedure depending on the
number and strength of the connections for each site. A site that occupies a central position in the first iteration (when all sites are still weighted equally) will receive a higher weight in the next one, and so on, until the
network stabilizes. This has the effect of prioritizing centrally located sites and creates a strong hierarchi-
cal structure of nodes with a limited number of very important sites or ‘terminals’. Bevan and Wilson
(2013) demonstrated that creating connections between settlements from the Bronze Age on Crete on
the basis of cost-­distances in this way quickly imposes a ‘hierarchy of activity’ on the landscape, favour-
ing specific connections for interactions. The effect is self-­reinforcing, leading to a system of highly
hierarchized settlement with only a few major arteries of movement, which broadly corresponds to the archaeological evidence. The model results however remain tentative, and are better seen as offering
new interpretations of settlement structure than as reliable predictions of ancient networks of social and
economic interaction. This approach has recently been applied in various other case studies (Rivers,
Knappett, & Evans, 2013; Davies et al., 2014; Paliou & Bevan, 2016).
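The following sketch illustrates the spirit of this iterative re-weighting, using an entropy-maximising-style interaction term with exponential distance decay; the parameter values and random site coordinates are purely illustrative:

```python
# Sketch of iterative site re-weighting in the spirit of Rihll and Wilson:
# interaction flows depend on destination weight and distance decay, and
# weights are updated towards total inflow until the network stabilises.
import numpy as np

rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(30, 2))         # 30 hypothetical sites
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
alpha, beta = 1.05, 0.15                           # attractiveness and decay (assumed)
w = np.ones(len(coords))                           # start with equal weights

for _ in range(200):
    attraction = (w ** alpha) * np.exp(-beta * d)  # pull of each destination
    np.fill_diagonal(attraction, 0)
    flows = attraction / attraction.sum(axis=1, keepdims=True)
    inflow = flows.sum(axis=0)                     # interaction received by each site
    w_new = inflow / inflow.mean()
    if np.allclose(w, w_new, atol=1e-6):           # network has stabilised
        break
    w = w_new

# After convergence a few sites typically carry most of the weight,
# corresponding to the 'terminals' of the stabilised network.
```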
The influence of pre-­existing settlement patterns on the development of a subsequent settlement
pattern is not easy to assess, since it needs archaeological datasets with a high chronological and spatial
resolution. In the absence of this information, we can therefore only model spatio-temporal patterns
by including uncertainty. Bevan and Wilson (2013) and Paliou and Bevan (2016) assessed the quality
of their models by including simulated additional settlements in the modelled network, based on a
prediction of suitable site locations. By repeating this procedure a large number of times, it could be
established whether the resulting networks’ characteristics were stable or not. In this exercise, conven-
tional predictive modelling was therefore used to support archaeological analysis, rather than the other
way around.

Conclusion
Predictive modelling has a long and controversial history in archaeology. There are almost as many
methods of creating a predictive model as there are actual models out there. Yet we routinely see new
models that rely on a data-­driven, correlative approach. These are almost always logistic regression-­based
analyses applied to strictly environmental parameters taken straight from the Judge and Sebastian (1988)
playbook. Such techniques have worked well in some situations, but they leave a great deal to be desired
from an explanatory perspective and are routinely criticized for their perceived environmental determin-
ism, along with many other issues. Repeated application of these methods, developed in the 1970s and
1980s, seems to have driven a philosophical wedge between academic and CRM opinions regarding the
value of predictive models (Verhagen & Whitley, 2012). Although there are new and innovative develop-
ments arising every year in archaeological predictive modelling, the general (incorrect) perception is one
of methodological stagnation and theoretical limitations. Changing such a perception will eventually
require new publications that can revisit the theory and methods of predictive modelling, and can put
them into a more modern context.

References
Alden, J. R. (1979). A reconstruction of Toltec period political units in the Valley of Mexico. In C. Renfrew & K. L.
Cooke (Eds.), Transformations: Mathematical approaches to cultural change (pp. 169–200). New York, NY: Academic
Press.
Banks, W. E. (2017). The application of ecological niche modeling methods to archaeological data in order to exam-
ine culture-­environment relationships and cultural trajectories. Quaternaire, 28, 271–276.
Bevan, A., & Wilson, A. (2013). Models of settlement hierarchy based on partial evidence. Journal of Archaeological
Science, 40, 2415–2427.
Carleton, W. C., Cheong, K. F., Savage, D., Barry, J., Conolly, J., & Iannone, G. (2017). A comprehensive test of the
Locally-­Adaptive Model of Archaeological Potential (LAMAP). Journal of Archaeological Science: Reports, 11, 59–68.
Carleton, W. C., Conolly, J., & Iannone, G. (2012). A Locally-­Adaptive Model of Archaeological Potential (LAMAP).
Journal of Archaeological Science, 39, 3371–3385.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge, UK: Cambridge University
Press.
Davies, T., Fry, H., Wilson, A., Palmisano, A., Altaweel, M., & Radner, K. (2014). Application of an entropy maximiz-
ing and dynamics model for understanding settlement structure: The Khabur Triangle in the Middle Bronze and Iron Ages. Journal of Archaeological Science, 43, 141–154.
Ducke, B., & Kroefges, P. C. (2008). From points to areas: Constructing territories from archaeological
site patterns using an enhanced Xtent model. In A. Posluschny, K. Lambers, & I. Herzog (Eds.), Layers of
perception: Proceedings of the 35th international conference on computer applications and quantitative methods in
archaeology (CAA), Berlin, Germany, April 2–6, 2007, Kolloquien zur Vor-­und Frühgeschichte (Vol. 10, p. 243).
Bonn: Dr. Rudolf Habelt GmbH, + CD-­ROM. Retrieved from http://proceedings.caaconference.org/
paper/78_ducke_kroefges_caa2007/
Emlen, J. M. (1966). The role of time and energy in food preference. American Naturalist, 100, 611–617.
Favory, F., Nuninger, L., & Sanders, L. (2012). Integration of geographical and spatial archeological concepts for the
study of settlement systems. L’Espace géographique, 41, 295–309.
Finke, P. A., Meylemans, E., & Van de Wauw, J. (2008). Mapping the possible occurrence of archaeological sites by
Bayesian inference. Journal of Archaeological Science, 35, 2786–2796.
Grayson, D. K., & Delpech, F. (1998). Changing diet breadth in the early Upper Palaeolithic of southwestern France.
Journal of Archaeological Science, 25, 1119–1129.
Hames, R., & Vickers, W. (1982). Optimal diet breadth theory as a model to explain variability in Amazonian hunt-
ing. American Ethnologist, 9, 358–378.
Hodder, I. (1974). Some marketing models for Romano-­British coarse pottery. Britannia, 5, 340–359.
Hudak, G. J., Hobbs, E., Brooks, A., Sersland, C. A., & Phillips, C. (Eds.). (2002). Mn/model final report 2002: A
predictive model of precontact archaeological site location for the state of Minnesota. St. Paul, MN: Minnesota Department
of Transportation.
Judge, J. W., & Sebastian, L. (Eds.). (1988). Quantifying the present and predicting the past: Theory, method and application
of archaeological predictive modelling. Denver, CO: U.S. Department of the Interior, Bureau of Land Management.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
Kamermans, H., & Wansleeben, M. (1999). Predictive modelling in Dutch archaeology, joining forces. In J. Barceló, I.
Briz, & A. Vila (Eds.), New techniques for old times-­CAA98: Computer applications and quantitative methods in archaeol-
ogy (pp. 225–230). Oxford: Archaeopress.
Kohler, T. A., & Parker, S. C. (1986). Predictive models for archaeological resource location. In M. B. Schiffer (Ed.),
Advances in archaeological method and theory (Vol. 9, pp. 397–452). New York: Academic Press.
Kondo, Y. (2015). An ecological niche modelling of Upper Palaeolithic stone tool groups in the Kanto-Koshinetsu
region, eastern Japan. The Quaternary Research, 54, 207–218.
Kvamme, K. L. (1988). Development and testing of quantitative models. In W. J. Judge & L. Sebastian (Eds.), Quanti-
fying the present and predicting the past: Theory, method, and application of archaeological predictive modelling (pp. 325–428).
Denver, CO: U.S. Department of the Interior, Bureau of Land Management Service Center.
Kvamme, K. L. (1997). GIS and statistical inference in Arizona: Monte Carlo significance tests. In I. Johnson &
M. North (Eds.), Archaeological applications of GIS: Proceedings of colloquium II, UISPP XIIIth congress, Forlí, Italy,
September 1996. Sydney: University of Sydney.
MacArthur, R. H., & Pianka, E. R. (1966). On the optimal use of a patchy environment. American Naturalist, 100,
603–609.
Nüsslein, A., Nuninger, L., & Verhagen, P. (in press). To boldly go where no one has gone before: Integrating social
factors in site location analysis and predictive modelling, the hierarchical types map. In J. B. Glover, J. M. Moss, &
D. Rissolo (Eds.), Digital archaeologies, material worlds. Proceedings of the CAA2017 Conference, Atlanta.
O’Connell, J. F., & Hawkes, K. (1984). Food choice and foraging sites among the Alyawara. Journal of Anthropological
Research, 40, 435–504.
Orians, G. F., & Pearson, N. E. (1979). On the theory of central place foraging. In D. J. Horn, R. D. Mitchell, &
C. R. Stairs (Eds.), Analysis of ecological systems (pp. 154–177). Columbus: Ohio State University Press.
Paliou, E., & Bevan, A. (2016). Evolving settlement patterns, spatial interaction and the socio-­political organisation
of late prepalatial South-Central Crete. Journal of Anthropological Archaeology, 42, 184–197.
Renfrew, C., & Level, E. (1979). Exploring dominance: Predicting polities from centers. In C. Renfrew & K. L.
Cooke (Eds.), Transformations: Mathematical approaches to cultural change (pp. 145–166). New York, NY: Academic
Press.
Rihll, T. E., & Wilson, A. G. (1987). Spatial interaction and structural models in historical analysis: Some possibilities
and an example. Histoire & Mesure, 2, 5–32.
Rihll, T. E., & Wilson, A. G. (1991). Modelling settlement structures in ancient Greece: New approaches to the polis.
In J. Rich & A. Wallace-­Hadrill (Eds.), City and country in the ancient world (Vol. 3, pp. 58–95). London: Routledge.
Rivers, R., Knappett, C., & Evans, T. (2013). What makes a site important? Centrality, gateways and gravity. In
C. Knappett (Ed.), Network analysis in archaeology: New approaches to regional interaction (pp. 125–150). Oxford:
Oxford University Press.
Smith, E. A. (1991). Inujjuamiut foraging strategies: Evolutionary ecology of an Arctic hunting economy. New York: Aldine.
Stephens, D. W., & Krebs, J. R. (1986). Foraging theory. Princeton: Princeton University Press.
Thomas, D. H. (2008). Native American landscapes of St. Catherines Island, Georgia. Anthropological Papers of the
American Museum of Natural History 88.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal
of Risk and Uncertainty, 5(4), 297–323.
Vanacker, V., Govers, G., Van Peer, P., Verbeek, C., Desmet, J., & Reyniers, J. (2001). Using Monte Carlo simulation
for the environmental analysis of small archaeologic datasets, with the Mesolithic in Northeast Belgium as a case
study. Journal of Archaeological Science, 28, 661–669.
Van Leusen, M., & Kamermans, H. (Eds.). (2005). Predictive modelling for archaeological heritage management: A research
agenda. Amersfoort: Rijksdienst voor het Oudheidkundig Bodemonderzoek.
Van Leusen, M., Millard, A. R., & Ducke, B. (2009). Dealing with uncertainties in archaeological prediction. In H.
Kamermans, M. van Leusen, & P. Verhagen (Eds.), Archaeological prediction and risk management: Alternatives to current
practice (pp. 123–160). Leiden: Leiden University Press.
Verhagen, P. (2008). Testing archaeological predictive models: A rough guide. In A. Posluschny, K. Lambers, &
I. Herzog (Eds.), Layers of perception: Proceedings of the 35th international conference on computer applications and quantita-
tive methods in archaeology (CAA), Berlin, Germany, April 2–6, 2007, Kolloquien zur Vor-­und Frühgeschichte (Vol. 10,
pp. 285–291). Bonn: Dr. Rudolf Habelt GmbH.
Verhagen, P., Nuninger, L., Bertoncello, F., & Castrorao Barba, A. (2016). Estimating the “memory of landscape” to
predict changes in archaeological settlement patterns. In S. Campana, R. Scopigno, G. Carpentiero, & M. Cirillo
(Eds.), CAA 2015: Keep the revolution going: Proceedings of the 43rd annual conference on computer applications and
quantitative methods in archaeology (pp. 623–636). Oxford: Archaeopress.
Verhagen, P., & Whitley, T. G. (2012). Integrating predictive modelling and archaeological theory: A live report from
the scene. Journal of Archaeological Method and Theory, 19, 49–100.
Vita-­Finzi, C., & Higgs, E. S. (1970). Prehistoric economy in the Mount Carmel area of Palestine: Site catchment
analysis. Proceedings of the Prehistoric Society, 36, 1–37.
Wakker, P. P., Timmermans, D. R. M., & Machielse, I. A. (2003). The effects of statistical information on insurance decisions
and risk attitudes. Amsterdam: Department of Economics, University of Amsterdam.
Warren, R. E. (1990). Predictive modeling in archaeology: A primer. In K. M. S. Allen, S. W. Green & E. B. W.
Zubrow (Eds.), Interpreting space: GIS and archaeology (pp. 90–111). London: Taylor and Francis.
Warren, R. E., & Asch, D. L. (2000). Site location in the Eastern Prairie Peninsula. In K. L. Wescott & R. J. Bran-
don (Eds.), Practical applications of GIS for archaeologists: A predictive modeling toolkit (pp. 5–32). London: Taylor and
Francis.
Wheatley, D. (2004). Making space for an archaeology of place. Internet Archaeology, 15. Retrieved from http://
intarch.ac.uk/journal/issue15/wheatley_index.html
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor and Francis.
Whitley, T. G. (2003). Causality and cross-purposes in predictive modeling. In Magistrat der Stadt Wien, Referat Kulturelles Erbe, & Stadtarchäologie Wien (Eds.), Enter the past: The E-way into four dimensions of cultural heritage. BAR International Series, 1227 (pp. 236–239). Oxford: Archaeopress.
Whitley, T. G. (2005). A brief outline of causality-based cognitive archaeological probabilistic modeling. In M.
van Leusen & H. Kamermans (Eds.), Predictive modelling for archaeological heritage management: A research agenda
(pp. 123–138). Amersfoort: Rijksdienst voor het Oudheidkundig Bodemonderzoek.
Whitley, T. G. (2010). Re-­thinking accuracy and precision in predictive modeling. In F. Niccolucci & S. Hermon
(Eds.), Beyond the artifact: Digital interpretation of the past (pp. 312–318). Budapest: Archaeolingua.
Whitley, T. G. (2013). A paleoeconomic model of the Georgia coast (4500 to 300 BP). In V. Thompson & D. H.
Thomas (Eds.), Life among the tides: Recent archaeology of the Georgia bight (pp. 235–285). New York, NY: American
Museum of Natural History Anthropological Papers.
Whitley, T. G., Moore, G., Goel, G., & Jackson, D. (2010). Beyond the marsh: Settlement choice, perception and spatial
decision-­making on the Georgia coastal plain. In B. Frischer, J. Webb Crawford, & D. Koller (Eds.), Making his-
tory interactive: Computer applications and quantitative methods in archaeology (CAA): Proceedings of the 37th international
conference, Williamsburg,Virginia, United States of America, March 22–26, 2009 (pp. 380–390). Oxford: Archaeopress.
Whitley, T. G., Moore, G., Jackson, D., Dellenbach, D., Goel, G., Bruce, J., . . . Futch, J. (2011). An archaeological predic-
tive model for the USACE,Vicksburg district: Western Mississippi, Northern Louisiana, and Southern Arkansas. American
Recovery and Reinvestment Act 2009: Section 110 Compliance Report for the U.S. Army Corps of Engineers,
Vicksburg District: NHPA, Cultural Resources Investigations Technical Report No. 7 (Vol. 3). Atlanta, GA:
Brockington and Associates, Inc.
Winterhalder, B., & Kennett, D. (2006). Behavioral ecology and the transition from hunting and gathering to agricul-
ture. In D. Kennett & B. Winterhalder (Eds.), Behavioral ecology and the transition to agriculture (pp. 1–21). Berkeley:
University of California Press.
14
Spatial agent-­based modelling
Mark Lake

Introduction
Spatial agent-­based modelling (ABM) is a method of computer simulation that can be used to explore
how the aggregate characteristics of a system – for example a settlement pattern, population dispersal
or distribution of artefacts – arise from the behaviour of artificial agents. In archaeological ABM the
agents are typically individual people or social units such as households. Agent-­based modelling is often
presented as part of the toolkit of complexity science (Beekman & Baden, 2005; Epstein & Axtell, 1996),
but it is a very flexible method which can be used in projects informed by many different theoretical
perspectives.
It should be noted that while the vast majority of agent-­based models used in archaeology are explic-
itly spatial, the representation of space is not a necessary feature of ABM (see e.g. Ferber, 1999). Moreover,
many of the issues that arise when using explicitly spatial ABM are essentially the same as those that
apply to the use of raster GIS, or various forms of statistical spatial analysis. For that reason, this chapter
focuses on issues which are specific to the use of ABM, most of which are relevant irrespective of whether
the model is spatial. It is therefore strongly recommended that this chapter be read alongside others in
this handbook. Chapters 2, 3, 7 and 19 may be particularly relevant to the task of preparing spatial input
data, while Chapters 4, 6, 8, 9 and 21 discuss methods that may be relevant to the statistical analysis and
presentation of spatial simulation outputs.
The primary purpose of this chapter is to explain the choices that must be made when designing,
building, experimenting with and disseminating an ABM. Readers seeking a practical tutorial complete
with sample models and code should consult Railsback and Grimm’s excellent (2012) Agent-­Based and
Individual-­Based Modelling: A Practical Introduction. Readers who are interested in the history and theory
of ABM in archaeology will find up-­to-­date reviews in Cegielski and Rogers (2016), and Lake (2014,
2015). Additional discussion of the relationship between ABM and archaeological theory can be found
in Aldenderfer (1998), Beekman and Baden (2005), Beekman (2005), Costopoulos (2010), Kohler (2000),
Kohler and van der Leeuw (2007a), McGlade (2005) and Mithen (1994). Useful textbooks on agent-­
based modelling include Grimm and Railsback (2005) (aimed at ecologists), the rather briefer Gilbert
(2008) (aimed at sociologists) and Ferber (1999) (aimed at artificial intelligence researchers and computer
scientists).
Agent-­based models in archaeology


Archaeologists were arguably using the forerunners of ABM as far back as the 1970s (Lake, 2014), but
there has undoubtedly been an explosion of archaeological interest in ABM since the publication of
Kohler and Gumerman’s (2000) influential collection of agent-­based models, Dynamics in Human and
Primate Societies (Lake, 2014), especially in the last ten years (Cegielski & Rogers, 2016).

Characteristics of archaeological ABM


The tens of ABMs now published in the archaeological literature (see Cegielski & Rogers, 2016; Lake,
2014) range in complexity from models that barely meet the minimum textbook definition of an agent-­
based model (Ferber, 1999, pp. 9–10) but were implemented using ABM software (e.g. Bentley, Hahn, &
Shennan, 2004), through relatively simple abstract models of one or a limited number of processes (e.g.
Crema & Lake, 2015; Premo, 2007), to much more complex models seeking greater realism in their
portrayal of human society (e.g. Aubán, Barton, Gordó, & Bergin, 2015; Kohler et al., 2012a; Wilkinson,
Christiansen, Ur, Widell, & Altaweel, 2007).
The Long House Valley ABM, presented as a case study later in this chapter, illustrates many of
the features of a modern spatial ABM (Figure 14.4). This agent-­based model was built to explore the
relationship between climatically determined resource availability, settlement location and population
growth in Long House Valley, Arizona in the period AD 400–1450. The agents are individual Puebloan
households which are endowed with rules by which they choose where to settle in Long House Valley
in order to grow sufficient maize to survive. The model is explicitly spatial because these agents inhabit a
geographically realistic model of Long House Valley, comprising GIS-­style raster maps of maize-­g rowing
potential and the location of water sources. The maize-­g rowing potential for each hectare in the valley
was determined by detailed palaeoenvironmental research. The simulation progresses in yearly steps, at
each of which the resource availability is modified according to high-resolution time-series estimates
of rainfall. At each time-­step settlements grow, fission, relocate or collapse depending on the ability of
individual agents (the households) to grow sufficient maize to support their ongoing maintenance and
reproduction. Over time, repeated individual household decision-­making and reproduction produces
a changing aggregate settlement pattern and population size, which can be compared to observed and
proxy evidence in the archaeological record.
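To give a flavour of how such a model is structured in code, the following heavily stripped-down sketch implements households, yearly time-steps, relocation, dissolution and fissioning; every rule and number is illustrative only and does not reproduce the published model:

```python
# Heavily simplified sketch of a Long House Valley-style household ABM.
import numpy as np

rng = np.random.default_rng(0)
NEED = 800                                    # kg maize per household per year (assumed)
yield_map = rng.uniform(300, 1200, (50, 50))  # stand-in for palaeo-derived yields

households = [{"cell": (int(rng.integers(50)), int(rng.integers(50))), "store": 0.0}
              for _ in range(25)]

def step(households, yield_map):
    survivors = []
    for h in households:
        h["store"] += yield_map[h["cell"]] - NEED      # harvest minus consumption
        if h["store"] < -NEED:                         # repeated shortfall: dissolve
            continue
        if h["store"] < 0:                             # shortfall: relocate (crudely,
            h["cell"] = np.unravel_index(np.argmax(yield_map), yield_map.shape)
        survivors.append(h)                            #  to the best-yielding cell)
        if h["store"] > 2 * NEED:                      # surplus: fission a daughter
            survivors.append({"cell": h["cell"], "store": h["store"] / 2})
            h["store"] /= 2
    return survivors

for year in range(1050):                               # yearly steps, AD 400-1450
    households = step(households, yield_map)
```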
Although the Long House Valley Model is among the best known archaeological spatial ABM (see
Kohler, Gumerman, & Reynolds, 2005 for a popular account), it is by no means exhaustive in terms of the
kinds of entities, relationships and processes which can be captured in an ABM. Particularly notable exten-
sions, each considered in turn, are sociality and cognition, evolution, environmental change and virtual
reality.

• Sociality and cognition. Archaeologists have increased the realism of human agents by incorporating
aspects of social interaction. This ranges from agents learning from one another (Kohler, Cockburn,
Hooper, Bocinsky, & Kobti, 2012b; Lake, 2000a; Mithen, 1989; Premo, 2012; Premo & Scholnick,
2011), through simple collective decision-­making (Lake, 2000a) to the exchange of goods (Bentley,
Lake, & Shennan, 2005; Kobti, 2012), group formation (Doran, Palmer, Gilbert, & Mellars, 1994;
Doran & Palmer, 1995) and the emergence of leaders (Kohler et al., 2012b). Another way of increas-
ing the realism of agents is to explicitly model learning and memory. These are a feature of a number
of models of hunter-­gatherer foraging, including Costopoulos’ (2001) investigation of the impact
of time-­discounting, Mithen’s (1989, 1990) model of decision-­making in Mesolithic hunting and
Spatial agent-­based modelling 249

Lake’s (2000a) spatial ABM of Mesolithic land-­use. The last of these extends to each agent having
its own geographically referenced cognitive map of its environment (Figure 14.1).
• Evolution. In ABMs built to explore change over longer periods of time it may be appropriate for the
population of agents to evolve as a result of agent reproduction involving recombination or mutation
of agent rules (or other attributes that are normally fixed for the lifetime of the agent). Examples
include Premo’s model of hominin prosociality (2005), Kachel, Premo, and Hublin’s (2011) evaluation
of the ‘grandmother hypothesis’ for human evolution, Lake’s (2001b) model of the evolution of the
hominin capacity for cultural learning and Xue, Costopoulos, and Guichard’s (2011) model of the
extent to which tracking the environment too closely can be detrimental in the long term. There are
also a number of ABMs which model the cultural transmission of traits across agent generations, for
example Premo and Kuhn’s (2010) investigation of the effects of local extinctions on culture change
and diversity in the Palaeolithic.
• Environmental change. Many archaeological ABMs include environmental change. One option is
external forcing, where the environment is altered over time to reflect palaeoenvironmental time-­
series data. For example, in the Long House Valley ABM (see the case study in this chapter) the
maize yield changes over time following rainfall data, while in Xue et al. (2011) model changes in
productivity are based on ice core data. Another option is to explicitly model the impact of agents
on the environment. For example, early versions of Kohler et al.’s Village ABM reduced yields from
continued farming (2000, 2012a), while recent versions also explicitly model the population growth
of prey species such as deer (Johnson & Kohler, 2012) thereby incorporating reciprocal human-­
environment interaction. Additionally, archaeologists interested in the socioecological (Barton, Riel-­
Salvatore, Anderies, & Popescu, 2011) dynamics of long-­term human environment interaction have
coupled ABMs of human behaviour with geographical information systems or other raster models
of natural processes such as soil erosion (e.g. Barton, Ullah, & Bergin, 2010; Barton, Ullah, & Mita-
sova, 2010; Kolm & Smith, 2012).

Figure 14.1 Schematic illustration of the features of an Agent Based Model (ABM) with cognitive agents,
based on the model described (Lake, 2000a). A colour version of this figure can be found in the plates section.
Source: Mapping data ©Crown copyright and database rights 2019 Ordnance Survey (100025252)
Figure 14.2 Example of the realistic rendering of a simulated landscape. A colour version of this figure can
be found in the plates section.
Source: Adapted from Ch’ng and Stone (2006)

• Virtual reality. Ch’ng and Stone (Ch’ng & Stone, 2006; Ch’ng, 2007; Ch’ng et al., 2011) have
combined ABM and gaming engine technology to generate dynamic vegetation models for archae-
ological reconstruction and interactive visualisation of Mesolithic hunter-­gatherers foraging in a
landscape now submerged under the North Sea (Figure 14.2).

Uses of ABM in archaeology


The general case for computer simulation in archaeology was initially advanced by Doran in 1970 and
more recent discussion can be found in Cegielski and Rogers (2016), Costopoulos (2009), Kohler (2000),
Kohler et al. (2005), Lake (2001a, 2014), Premo (2008) and Rogers and Cegielski (2017). Today, archae-
ologists typically use spatial ABM (and ABM more generally) for one or more of three main purposes.

• Understanding long-­term change. The notion that archaeology has much to offer contemporary society
as a science of long-­term societal change and human-­environment interaction (Johnson, Kohler, &
Cowan, 2005; van der Leeuw & Redman, 2002; van der Leeuw, 2008) has intellectual antecedents in
the mid-­20th century programs of cultural ecology and sociocultural evolution (Kohler & van der
Leeuw, 2007a), but we now have better understanding of the importance of non-­linearity, recur-
sion and noise in the evolution of living systems, whether that is couched in the language of chaos
(Schuster, 1988), complexity (Waldrop, 1992), evolutionary drive (Allen & McGlade, 1987), contin-
gency (Gould, 1989), niche construction (Odling-­Smee, Laland, & Feldman, 2003), or structuration
(Giddens, 1984). Since ABMs explicitly model and give causal force to the micro-­level parts (agents)
they are well suited to exploring how potentially non-­linear long-­term systemic change arises from
the decision-­making of agents interacting with and even modifying their physical and social envi-
ronment (see Kohler & van der Leeuw, 2007a; Barton, 2013 for manifestos, Kohler & Varien, 2012
for the history and role of simulation in one long-­running socionatural study, and Beekman &
Baden (2005) for a more overtly sociological perspective).
• Inferring behaviour from the archaeological record. ABM can be used in conjunction with ‘middle range
theory’ (Binford, 1977) to help infer what “organisational arrangements of behaviour” (Pierce, 1989,
Spatial agent-­based modelling 251

p. 2) and human decision-­making (Mithen, 1988) produced the observed archaeological evidence.
Archaeologists usually make the connection between past behaviour and its expected archaeological
outcome on the basis of “intuition or common sense, ethnographic analogies and environmental
regularities, or in some cases experimental archaeology” (Kohler et al., 2012a, p. 40), but computer
simulation is particularly advantageous for this purpose when the candidate behaviours can no lon-
ger be observed and have no reliable recent historical record. Moreover, simulation makes it possible
to explore the outcome of behaviour aggregated and sampled at the often coarse grained spatial and
temporal resolution of the archaeological record. Good examples of this are Mithen’s (1988, 1990)
use of ABM to generate virtual faunal assemblages resulting from different Mesolithic hunting goals
and Premo’s (2005) spatial ABM of Pleistocene hominin food sharing which revealed that the dense
artefact accumulations at Olduvai and Koobi I, long attributed to central place foraging, could alter-
natively have been formed by routed foraging in a patchy environment.
• Testing quantitative methods. The fact that computer simulation can be used to generate expected
outcomes of known behaviour also makes it well-­suited for testing the efficacy of other analyti-
cal techniques. The role of such ‘tactical’ (Orton, 1982) simulations is to provide data resulting
from known behaviour that can then be sampled in ways that mimic the various depositional and
post-­depositional processes which determine what evidence we eventually recover. By varying the
behaviour and/or the subsequent degradation of the data it is possible to investigate whether the
analytical technique in question is capable of retrieving a (typically statistical) ‘signature’ which is
unique to the original behaviour. Examples include tests of measures of the quantity of pottery
(Orton, 1982), the efficacy of multivariate statistics (Aldenderfer, 1981b) to differentiate functional
assemblages, the ability of cladistic methods to reconstruct patterns of cultural inheritance (Eerkens,
Bettinger, & McElreath, 2005), the relationship between temporal frequency distributions and pre-
historic demography (Surovell & Brantingham, 2007), the effect of field survey strategy in the recov-
ery of data from battlefields (Rubio-­Campillo, María Cela, & Hernàndez Cardona, 2011), and the
robustness of population genetic methods when applied to time-­averaged archaeological assemblages
(Premo, 2014).
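
To make the logic of such tactical simulations concrete, the following minimal Python sketch (purely illustrative, assuming the numpy library; the discard rates, sample sizes and recovery probability are invented, and simulate_assemblage and degrade are hypothetical stand-ins rather than functions from any of the studies cited) generates assemblages from two known discard behaviours, degrades them to mimic post-depositional loss, and asks whether a simple statistic still separates the behaviours:

    import numpy as np

    rng = np.random.default_rng(42)

    def simulate_assemblage(discard_rate, n_events, rng):
        # Known behaviour: each depositional event discards a Poisson
        # number of sherds at the given mean rate
        return rng.poisson(discard_rate, n_events)

    def degrade(assemblage, recovery_prob, rng):
        # Post-depositional loss: each sherd independently survives
        # to recovery with a fixed probability
        return rng.binomial(assemblage, recovery_prob)

    # Two candidate behaviours that differ only in their discard rate
    a = degrade(simulate_assemblage(5.0, 200, rng), 0.3, rng)
    b = degrade(simulate_assemblage(8.0, 200, rng), 0.3, rng)

    # Does a simple statistic still separate the behaviours after loss?
    print(a.mean(), b.mean())

In a genuine tactical study the degradation step would be varied systematically and the test statistic would be whichever analytical technique is under scrutiny.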

Method
Having decided on the purpose of an ABM, the next task is to determine what should be included and
in what detail (system bounding), followed by how exactly those components should be modelled (detailed design).
After this it will be necessary to choose software to implement the model and, having implemented it,
verify that it works correctly (in a software sense). As discussed next, creating a computer model can be
informative in itself, but ultimately the purpose is to run experiments, which should be carefully designed
according to the purpose of the model and earlier decisions about system bounding and detailed design.
Finally, the modeller should consider how to disseminate the model to promote reproducible research and
the longer-­term advancement of knowledge. Each of these major topics is discussed in turn.

Problem definition and system bounding


The different uses to which archaeologists put ABM potentially pose different requirements of a model,
particularly the extent to which it should produce output that can be directly compared with measurable
features of the archaeological record. However, it is always important to be clear about which aspects of
the system can be considered known and which are the unknown aspects about which new knowledge is
sought. Deciding what to include in a model – system bounding – requires an awareness of epistemic
issues which influence the capacity of a model to generate new knowledge (see Lake, 2015 for a more
detailed treatment).

• Informative models are generative. It is widely agreed (see Beekman, 2005; Costopoulos, 2009; Kohler,
2000; Premo, 2008) that the explanatory power of a simulation model lies in the fact that it “must
be observed in operation to find out whether it will produce a predicted outcome” (Costopoulos,
2009, p. 273). Models which must be run to determine their outcome are termed ‘generative’ (with
respect to the phenomenon of interest). The challenge when building generative models is to avoid
an infinite regress: imagine the complexity of a model in which social institutions emerge from the
actions of individual people whose self in turn emerges from explicit modelling of their underlying
neuropsychology, which is in turn modelled as an outcome of the replication and mutation of genes.
The outcomes of a model like this might be so sensitive to chance events that it would effectively
have little or no explanatory power and, in any case, it would very likely be computationally intrac-
table. The solution is to ‘bracket’ or hold constant those aspects of the world thought to be causally
distant from the question at hand. For example, in biology it is possible to gain useful insights into
cycles of mammalian population growth and collapse without modelling atomic vibrations within
the biomolecules that make up muscle fibres. Even sociologists who reject the ontological reality of
social institutions accept that for practical purposes it may be necessary “to assume certain back-
ground conditions which are not reduced to their micro dimensions” (King, 1999, p. 223). Ensuring
that an ABM is “generative with respect to its purpose” (Lake, 2015, p. 25) requires a clear statement
of what question(s) the model is intended to answer in order that it is clear what can be treated as
known and thus included in the model specification, and what is to be explained, and should there-
fore be left to be discovered by running simulations (see also Kohler et al., 2012a).
• There is a trade-­off between realism and generality. In practice it is impossible to simultaneously maximise
the generality, realism, and precision of models of complex systems (Levins, 1966). Broadly speaking,
one can have a generalised and probably relatively abstract model which fits many cases but none
of them in every detail, or a more specific and probably more realistic model which fits just one or
a few cases in greater detail. In the case of an ABM greater realism normally entails one or more of
the following:
1 Capturing a larger number of different properties of the modelled entities. For example, does
the environment contain woodland, or is it made up of several different tree species which have
different calorific output when burned?
2 Modelling more of the relationships between different entities and so capturing a larger number
of real-world processes. For example, when a hunter kills prey, does that have no effect on the
subsequent availability of prey, or does it deplete the prey population and, if so, does that in turn
impact on future prey population growth?
3 Less commonly, visual realism in the sense of being rendered in a virtual reality.
There are different views on the relative merits of realism versus generality. Kohler and van der
Leeuw (2007b, p. 3) argue that “A good model is not a universal scientific truth but fits some por-
tion of the real world reasonably well, in certain respects and for some specific purpose”, so, unsurprisingly, they suggest that the choice between realism and generality should be made accord-
ing to the scope and purpose of the model. Others see a strong presumption in favour of simplicity
(Premo, 2008; Costopoulos, 2017) on the grounds that: (a) understanding requires reducing complexity
to “intelligible dimensions” (Wobst, 1974, p. 151); (b) it is more parsimonious to discover how much
complexity is necessary to explain the observed phenomenon than it is to assume it from the outset
(Premo, 2007); and (c) models which have not been finely honed to fit a particular case but can
account for a greater diversity of cases have greater explanatory power because they allow one to
predict what should happen in a wider range of circumstances (Costopoulos, 2009).

Detailed design considerations


One of the advantages of ABM over other simulation paradigms is that it affords great flexibility in con-
ceptualising and implementing the modelled entities and processes.

Environment
It is possible to build an ABM in which the agents are not explicitly situated in any kind of space, although
in archaeology that is largely confined to tactical applications (e.g. Eerkens et al., 2005). Most archaeo-
logical ABM are spatial and the introduction of space requires consideration of three important issues.

• Geometry. Spatial ABM can have very different degrees of geometric specificity (Worboys & Duck-
ham, 2004). A purely topological network of agents explicitly models which agent is connected to
which. Adding edge-­weights to the network (see also Brughmans & Peeples, this volume) allows
the modeller to provide information about the relationship between the agents (which could be the
distance between them in Euclidean space or a non-­spatial property such as their similarity with
respect to some trait). More commonly, agents are located in Euclidean space, typically by placing
them on a regular grid of cells akin to a GIS raster map. The grid can be ‘empty’, simply serving to
locate agents with respect to one another, or it may contain values representing terrain or some other
aspect of the environment. Gridded environments can be abstract, or they can be a geographically
referenced representation of some part of the earth’s surface. Often the opposite edges of abstract-­
gridded environments are joined to form a continuous surface on a torus (doughnut), thereby avoid-
ing edge effects such as a reduction in spatial neighbourhood (see e.g. Premo, 2005).
• Updating. An important consideration is whether the agents’ environment should be updated as the
simulation runs. For example, in a simulation run for 100 years it would probably not be necessary
to update terrain height, whereas it might be appropriate to denude a resource exploited by agents
as and when they ‘harvest’ it. The latter would require a decision about whether, when and how
the resource should regenerate. A decision of this nature will require careful thought about system
bounding because it involves determining whether the resource can simply be ‘reset’ to some fixed
value, or whether it should be set to a new value which is itself the outcome of explicitly modelling
the process of regeneration. The latter blurs the boundary between agents and environment because
in a sense the environment has acquired ‘behaviour’ whose outcome may not be known without
running simulations – it too has become a generative phenomenon.
• Input data. The task of populating an ABM environment with appropriate values varies enormously
in magnitude. An abstract model might use a synthetic environment of resource availability in
which the absolute values may be arbitrary but perhaps the environment as a whole is characterised
by a particular property, for example a specific amount of spatial autocorrelation (Lake, 2001b). In
this case, a suitable grid of values can easily be created using GIS or statistical software (see Lloyd &
Atkinson, this volume). At the other extreme are ABMs with environments that represent the real
world at some point in time. The necessary paleoenvironmental reconstruction is often a significant
project in its own right, entailing both fieldwork and modelling (e.g. the case study in this chapter
and also Barton et al., 2010; Kohler et al., 2007; Wilkinson et al., 2007). Interpolating from sparse
point observations of environmental data to a spatially continuous map of the distribution of a
resource at an ethnographic scale (say 20–1000 m linear resolution) is likely to require use of eco-
logical models such as that developed by Cousins, Lavorel, and Davies (2003), or other methods of
downscaling such as that recently described by Contreras, Guiot, Suarez, and Kirman (2018).
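
The three issues just discussed can be illustrated with a deliberately minimal Python sketch of a gridded environment (assuming the numpy library; the TorusEnvironment class and its parameter names are invented for exposition, not taken from any published model). Opposite edges are joined by modulo arithmetic to form a torus, cells are depleted when harvested, and a logistic regrowth rule makes the environment itself generative:

    import numpy as np

    class TorusEnvironment:
        """Toy gridded environment wrapped on a torus to avoid edge effects."""

        def __init__(self, size, max_resource, regrowth_rate, rng):
            self.size = size
            self.max_resource = max_resource
            self.regrowth_rate = regrowth_rate
            # Synthetic input data: random initial resource values
            self.resource = rng.uniform(0.1, max_resource, (size, size))

        def neighbours(self, x, y):
            # Modulo arithmetic joins opposite edges of the grid
            return [((x + dx) % self.size, (y + dy) % self.size)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]

        def harvest(self, x, y, amount):
            # Deplete the cell; the agent receives what was actually there
            taken = min(amount, self.resource[x, y])
            self.resource[x, y] -= taken
            return taken

        def update(self):
            # Logistic regrowth: the environment has acquired 'behaviour'
            r = self.resource
            self.resource = r + self.regrowth_rate * r * (1 - r / self.max_resource)

A geographically referenced environment would instead be initialised from real input data of the kind discussed above.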

Agents
ABM is scale-­agnostic, so agents can be any entity which can be treated as an individual in the sense that
it acts as a cohesive whole in respect of the particular research problem (Ferber, 1999). In archaeological
ABM the agents are usually individual people, or groups of people such as households, so the most impor-
tant design decisions usually concern agent goals, behaviour and learning (sociality is discussed later in
the context of collectives). Note that many of the issues discussed here are not relevant to uncomplicated
abstract models such as, for example, ABMs of cultural transmission in which agents simply copy traits
from other agents (e.g. Lake & Crema, 2012).

• Attributes, states and behaviour. Attributes are enduring traits which an agent possesses throughout
its lifespan, for example whether it is male or female. In contrast, states change as a result of agent
behaviour and decision-­making (e.g. their location, energy reserves), the passage of time (age), or
possibly external agent or environmental impacts (e.g. theft of resources). Whether a given trait
counts as a fixed attribute or variable state depends on the framing of the research question. For
example, consider two different ways of building an ABM to explore the transition from foraging
to farming: endow agents with the decision-­making capacity to change their preferred subsistence
strategy (e.g. Bentley et al., 2005), or allow the relative proportion of lifetime foragers and farmers in
the overall population of agents to change as a result of differential reproduction, inter-­generational
cultural transmission, or land-­use competition (e.g. Angourakis et al., 2014). The former makes the
subsistence strategy a state, the latter an attribute. Which of these is the right approach depends
on the archaeological evidence, the duration of the transition and time-­scale of the model, and the
modeller’s views concerning the primacy of individual human agency.
• Goals. Agents are autonomous in the sense that their behaviour is directed by their own goals,
which may be different from those of other agents (Ferber, 1999, pp. 9–10). Ordinarily, an agent’s
ultimate goals will be determined by the modeller, but its proximal (immediate) goal at any par-
ticular time during the simulation may be variable if it has been endowed with the capacity for
meta decision-­making (see Mithen, 1990). In evolutionary ABMs, in which agents differentially
reproduce, the modeller usually determines a set of ultimate goals but does not specify which
individual agent has which goal except perhaps for the first generation. Evolutionary ABMs, in
which the suite of goals can evolve by recombination during agent reproduction, are uncommon
in archaeological ABM.
• Rules. An agent’s behaviour depends on decision-­making rules which determine how it ‘thinks’
it can best pursue its goals given the circumstances in which it finds itself. These rules are speci-
fied by the modeller (except in models where they can evolve), but if the model is generative it
will be necessary to run the simulation to discover how the agents actually behave. Ordinarily,
agents are rational in the sense that their decision-­making rules ensure a non-­random relationship
between their goals, circumstances and behaviour. Rationality in this sense requires that agents
have some measure of the absolute or relative ‘worth’ of the actual or predicted outcomes of differ-
ent behaviours – what biologists term ‘fitness’ and economists term ‘utility’ (Railsback & Grimm,
2012, p. 143). This terminology and the fact that many archaeological ABMs use insights from
behavioural ecology (see Kohler, 2000; Mithen, 1989 for arguments in favour) has led to criticism
of agent decision-­making rules on the grounds that they project modern rationality back into
the past (e.g. Clark, 2000; Cowgill, 2000; Shanks & Tilley, 1987; Thomas, 1991). There are two
issues at stake here: (a) is it appropriate to invoke a rationality grounded in modern evolutionary
biology or neoliberal economics, and (b) is it actually necessary to do so when using ABM? This
debate was reviewed by Lake (2004), who argued that ABM can in principle accommodate alterna-
tive rationalities.
• Agent prediction/learning. In an ABM learning can take place at the level of individual agents and/or
the system as a whole. The latter is discussed in the context of collectives. An individual agent can
be said to learn when it:
1 Discovers what resources are present in the environment as it moves through it. Note that a
cognitivist would require that the agent forms a representation of the environment that is sepa-
rate from the environment itself – a good test of this is whether the agent can ever have incor-
rect knowledge of its environment (perhaps due to the subsequent actions of other agents).
2 Forms a view about something that is not directly observable. For example, the likelihood of
encountering a particular type of animal is not directly observable, but must be inferred from
the number of actual encounters in a given duration and, as a result, different agents could end
up with different estimates based purely on chance. The accuracy of this kind of learning in a
changing environment depends on how much weight agents give to more distant events relative
to less distant ones, where distance could be in either time or space or both (see Costopoulos,
2001; Wren, Zue, Costopoulos, & Burke, 2014).
3 Copies behaviour or obtains knowledge from another agent. Note that use of the term ‘social
learning’ to describe this is intended to emphasise the fact that such learning eschews direct
observation of the environment, not that it necessarily entails a patterned (social) relationship
between the agents involved (Hinde, 1976).
The possibility of explicitly modelling learning means that ABM can be used to build formal quantitative
models in which humans are not perfect all-­knowing decision-­makers (see Bentley & Ormerod, 2012;
Mithen, 1991; Reynolds, 1987; Slingerland & Collard, 2012).
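
A toy Python agent illustrates several of these design decisions at once (it reuses the hypothetical TorusEnvironment sketched earlier; all names and numbers are invented). Sex is a fixed attribute, location and energy are states, the decision rule serves the goal of maintaining energy, and an exponentially weighted estimate implements simple individual learning about something not directly observable:

    import random

    class ForagerAgent:
        """Toy agent: sex is a fixed attribute; location and energy are states."""

        def __init__(self, location, rng):
            self.sex = rng.choice(["female", "male"])  # attribute: fixed for life
            self.location = location                   # state: changes each step
            self.energy = 10.0                         # state: changes each step
            self.estimate = 0.5        # learned belief about harvest success
            self.memory_weight = 0.1   # weight given to the latest observation

        def step(self, environment, rng):
            # Goal: maintain energy. Rule: move to the richest neighbouring cell
            options = environment.neighbours(*self.location)
            rng.shuffle(options)  # randomise tie-breaking between equal cells
            self.location = max(options, key=lambda c: environment.resource[c])
            gained = environment.harvest(*self.location, 1.0)
            self.energy += gained - 0.5  # fixed metabolic cost per timestep
            # Individual learning: exponentially weighted success estimate,
            # which may diverge from reality as other agents deplete cells
            observed = 1.0 if gained > 0 else 0.0
            self.estimate += self.memory_weight * (observed - self.estimate)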

Collectives
Both sociologists (Gilbert, 1995) and archaeologists (Kohler & van der Leeuw, 2007a; Beekman, 2005)
have advocated using ABM to study the emergence of social norms and institutions from the beliefs and
actions of individuals. Emergence is a thorny philosophical problem (see Bedau & Humphreys, 2008)
and readers are referred to Beekman (2005) and Lake (2015) for more detailed discussion of the issues
as they relate to archaeological ABM. Basically, the concept of emergence raises two main questions in
the social sciences. One is whether the apparently recursive relationship between individuals and society
means that social institutions actually exert irreducible causal influence on agents (see Gilbert, 1995, for an
overview). The other question is whether the fact that human agents reason about the emergent proper-
ties of their own societies makes emergence in human systems qualitatively different from emergence in
physical systems (Conte & Gilbert, 1995; Gilbert, 1995).
In practice one can distinguish three kinds of ‘collective’ phenomenon in ABM:

1 Robust population-­level patterning in the interactions of individual agents who are not, however,
aware of this patterning.
2 Patterned interaction in which agents are in some sense aware of the pattern and perhaps even adjust
their behaviour accordingly. For example, an agent might consider itself to belong to a group of
agents who share complementary goals, but not actually engage in collective decision-­making.
3 Agents contributing to and abiding by collective decision-­making. Examples of this kind of strong
collective can be found in archaeological ABM of hunter-­gatherer (Lake, 2000a) and small-­scale
agricultural societies (Kohler et al., 2012b).

The modeller must decide how far to pre-­program collectives or whether to allow them to emerge. The
first kind of collective can readily be obtained by true emergence, whereas the second and third types are
more commonly (Railsback & Grimm, 2012, p. 210) scaffolded by programming agents with additional
characteristics (such as a group ID) and/or programming the characteristics of the collective entities (for
example, specifying the possible states and behaviours of groups even before any agents actually belong
to them). At the present time, archaeological ABM typically offer either emergent collective phenomena,
or collectives with some causal influence over agents, but not both (see Lake, 2015, for a more detailed
assessment).

Treatment of time
Modelling how a process unfolds over time requires decisions about the appropriate temporal intervals
and the scheduling of events.

• Temporal intervals and duration. The temporal intervals (timesteps) should reflect the frequency and
duration of the relevant agent decision-­making and behaviour. It is not always necessary to calibrate
a simulation in terms of real-­world time: for example, a tactical simulation intended to help develop
measures of drift in cultural evolution might have timesteps which are just abstract generations. The
total duration should reflect the rate at which the outcomes of agent behaviour accumulate to pro-
duce detectable patterns, both in the simulation itself and in the archaeological record (if relevant).
Note that the minimum temporal envelope within which changes in behaviour can be observed in
the archaeological record will often be longer than the duration over which such changes are detect-
able in the simulation results. One of the advantages of ABM is that it can be used to investigate
what the results of ethnographic-­scale human behaviours might look like when time-­averaged in
the archaeological record (e.g. Premo, 2014), that is to say, what the accumulation of material from
multiple episodes of behaviour might look like when aggregated across the minimum time-­span that
archaeologists can differentiate given the effects of post-­depositional processes and available dating
techniques.
• Scheduling. An ABM can be event driven, in which case agents individually schedule their own
activities (e.g. Lake, 2000b), or programmed so that all agents undertake activities at the same set
intervals. The latter scenario is much more common, but unless the ABM is being run on specialised
parallel hardware, the simulation will proceed sequentially even if conceptually agents are considered
to be undertaking activities at the same time. In this case, it is good practice to ensure that agents
do not undertake activities in the same order at every timestep, so as to avoid arbitrarily advantaging
or disadvantaging those that come towards the front or back of the execution queue. It will also be
necessary to decide whether or not agents should be aware of the results of the behaviour of other
agents who preceded them in the queue. As an example, agents who are unaware that other agents
have already harvested a resource in the same timestep will base their decision-­making on imperfect
knowledge, so the question is whether perfect or imperfect knowledge better captures reality.
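
In code, the usual idiom is to shuffle the execution queue at every timestep, as in this minimal Python sketch (again reusing the hypothetical agents and environment from the earlier sketches):

    import random

    def run_timestep(agents, environment, rng):
        """Fixed-interval scheduling with a randomised execution order."""
        order = list(agents)
        rng.shuffle(order)    # no agent is systematically first or last
        for agent in order:
            # Later agents in the queue see the results of earlier harvests;
            # acting on a start-of-step snapshot would model the imperfect
            # knowledge alternative discussed above
            agent.step(environment, rng)
        environment.update()  # environmental processes run once per timestep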
Implementation and verification

Computer hardware
Productive ABMs have been run on hardware ranging from laptops to high performance computers
(HPC) offering hardware parallelism. Hardware requirements are a function of the complexity of the
model and the rigour of the experimental design (see the next section). In many cases it is the latter which
poses the greatest challenge – a simulation which completes in one hour becomes a different proposi-
tion if it is necessary to undertake 1000 runs for all possible combinations of three parameters which can
each take ten values! Hardware evolves very rapidly, but one general point worth noting is that simply
increasing the number of cores in a computer does not increase the speed of simulation unless either the
software supports parallel execution of the code, or it is possible to arrange simultaneous execution of
multiple different simulations.
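
The second arrangement is often trivial to set up. As a minimal Python sketch (run_simulation is an invented stand-in for any complete model; the worker and run counts are arbitrary), the standard library's multiprocessing module can distribute independent runs across cores:

    from multiprocessing import Pool
    import random

    def run_simulation(seed):
        """Stand-in for one complete, sequential model run."""
        rng = random.Random(seed)
        return sum(rng.random() for _ in range(10_000))

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            # Each run is sequential, but four runs execute at once
            results = pool.map(run_simulation, range(100))
        print(len(results), "runs completed")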

Software platforms
Implementation of an ABM invariably requires some computer programming, so the modeller will either
need to learn to program or collaborate with others who can. ABM can be implemented using a variety
of programming languages and software, each of which has pros and cons.

• General purpose programming languages (e.g. C++, Java, Python). These might be a good choice if the
modeller already knows the programming language and the model is relatively simple. An ABM
written in a general purpose compiled language such as C++ is likely to run very fast, but on the
other hand the lack of existing functionality may slow down development of a more complex
model, especially if a graphical user interface (GUI) and/or integration with GIS or statistical software is required.
• Statistical/mathematical programming languages. To judge from recent examples (e.g. Crema, 2014;
Crema & Lake, 2015) statistical programming languages such as R are probably better suited to
simpler abstract ABM, especially where a GUI is not required. Quantitatively inclined archaeologists
may already be conversant with languages such as R, but the greatest advantage of this approach
is the direct integration of the model into a powerful framework for the statistical analysis of the
simulation results (see for example Bevan, this volume and Crema, this volume), which can greatly
facilitate rigorous experimental design.
•	Dedicated simulation frameworks. Dedicated simulation frameworks (e.g. Ascape, Mason, Repast Simphony, SWARM) may provide a ‘drag-and-drop’ graphical model building tool, but in most cases
the modeller will end up writing at least some programming code using an object-­oriented lan-
guage such as Objective C, Java or Python. The main advantage of a simulation framework is that
it provides code for functionality such as controlling the simulation, setting parameters, scheduling
agents, drawing them on screen, logging results and often also exchanging data with other software
such as GIS. The most popular frameworks are largely ‘paradigm agnostic’ in that they do not impose
a particular concept of what constitutes an agent or how to model the environment. Additionally,
some frameworks (e.g. Pandora, Repast for HPC) support implementing ABM on high performance
computers. Taken together, these attributes make the popular simulation frameworks well-­suited for
implementing complex computationally intensive ABMs.
• Integrated modelling environment. An integrated modelling environment provides a ‘one-­stop’ solu-
tion for implementing an ABM by providing a single GUI for writing program code, running
simulations, visualizing and logging the results and even automating multiple runs with different
parameters. The best known is NetLogo, which provides an excellent vehicle for learning ABM
(Railsback & Grimm, 2012 uses it) while at the same time being capable of supporting useful sci-
entific experiments in archaeology (e.g. Premo, 2014). Indeed, a particular advantage of NetLogo
is the built-­in support for sensitivity analysis, which facilitates and encourages the experimentation
required to actually learn from an ABM. NetLogo was originally designed around a particular
concept of agents and their environment, so it may occasionally be unnatural, or perhaps even impossible, to use it to implement a specific conceptual model.

Integrating ABM and GIS


Spatial ABM require spatial input data and produce spatial results, so a means of connecting to GIS is
invaluable. Additionally, an ABM which explicitly models environmental processes, for example soil
erosion, might benefit from access to relevant GIS functionality. The various methods of coupling or
integrating ABM and GIS are discussed at length by Westervelt (2002) and Crooks and Castle (2012),
but are briefly sketched here.

• Loose coupling entails moving data between the ABM and GIS by saving and importing files that
both can read, typically a real or de facto interchange format such as ESRI’s shapefile and ASCII grid
formats. The most popular simulation frameworks and integrated modelling environments provide
the necessary functionality to achieve this kind of coupling, which generally occurs at the beginning
and end of each simulation (a minimal code sketch of reading such a file follows this list).
• Tight coupling involves one or both of two enhancements over loose coupling. One is that the ABM
can directly access the GIS data in its native format by connecting to the geodatabase maintained by
the GIS software. Avoiding the need to convert data into an intermediate format and/or write it to
disk potentially increases the speed of data exchange, thereby facilitating the second enhancement,
which is synchronisation of the ABM and GIS, usually so that the GIS can actually be used to modify
the environment occupied by agents at intervals during the simulation (e.g. Barton et al., 2015).
Tight coupling of this nature generally requires that both the ABM and GIS can be controlled by a
meta-­program (typically a Unix shell script or Python script).
• Integration takes tight coupling one step further and dissolves the distinction between the ABM and
GIS software by embedding one in the other. One option is to model environmental change by
implementing the relevant algorithms within the ABM, even to the extent of treating aspects of the
environment (such as woodland) as being made up of agents (individual trees). Another is to modify
the GIS software to implement agent behaviour and dynamic updating of the GIS data (e.g. Lake,
2000b); this requires that the GIS software has a rich scripting language or that its source code is
available for modification (as with open source software).
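
As an illustration of the file-based exchange used in loose coupling, the following Python sketch reads an ESRI ASCII grid exported from a GIS into a numpy array (read_ascii_grid is an invented helper, and the sketch assumes the optional NODATA_value header line is present):

    import numpy as np

    def read_ascii_grid(path):
        """Read an ESRI ASCII grid exported from a GIS into a numpy array."""
        header = {}
        with open(path) as f:
            for _ in range(6):  # ncols, nrows, xllcorner, yllcorner,
                                # cellsize, NODATA_value
                key, value = f.readline().split()
                header[key.lower()] = float(value)
            data = np.loadtxt(f)  # the remaining lines hold the grid values
        return header, data

The cell size and corner coordinates in the header allow agent locations to be georeferenced, and writing simulation results back out in the same format reverses the process.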

Verification
Verification is the process of ensuring that the ABM program code correctly implements the conceptual
model (Aldenderfer, 1981a). Verification is not intended to determine whether the underlying concep-
tual model is a good model of the world, although it can sometimes reveal flaws of logic, typically where
the conceptual model simply does not specify what should happen under certain circumstances. Readers
should consult Railsback and Grimm (2012), Chapter 6, for practical advice about how to verify ABM
program code.
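
Verification can be partly automated by expressing statements of the conceptual model as unit tests. A minimal sketch, using the hypothetical TorusEnvironment from the earlier example (a real test suite would cover every rule and edge case):

    import numpy as np

    def test_harvest_never_exceeds_available():
        """Conceptual model: a cell cannot yield more than it currently holds."""
        env = TorusEnvironment(size=5, max_resource=10.0,
                               regrowth_rate=0.2, rng=np.random.default_rng(0))
        available = env.resource[2, 2]
        taken = env.harvest(2, 2, amount=1e9)  # demand far more than exists
        assert taken == available
        assert env.resource[2, 2] == 0.0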
Experimentation and analysis


Building an ABM can be said to have “conceptual utility” (Innis, 1972, p. 33) if it has served to “create
new problems and view old ones in new and interesting ways” (Zubrow, 1981, p. 143). Nevertheless, the
full potential of simulation is only realised if enough time and resource is reserved for extensive experi-
mentation to generate an ensemble of “‘what if ’ scenarios” (Premo, 2008, p. 50) or “alternative cultural
histories” (Gumerman & Kohler, 2001). Moreover, devising a generative ABM capable of matching
patterns in the archaeological record is not sufficient to prove that the modelled process is what actually
caused that pattern. The basic problem is that of underdetermination or equifinality: the possibility that
other processes might also be able to produce the observed pattern. Mitigating the risk of making false
inferences by replicating the past for the wrong reason requires rigorous experimentation designed to
answer a series of questions.

What is the impact of chance events?


Most ABMs are stochastic, meaning that one or more processes have a random component. Randomness
might reflect genuine randomness in the real world, but it is usually included to create initial variability
in the model and/or as a means of bracketing out unnecessary detail and avoiding an infinite regress in
the processes that must be modelled (Railsback & Grimm, 2012, pp. 201–202). For example, in an ABM
of hunting, an agent might probabilistically encounter prey, not because the movement of prey is actually
random, but to avoid modelling the decision-­making of each prey animal while maintaining the realism of
the relevant aspect of prey movement (the frequency with which prey are found in different parts of the
landscape); in this case implementing probabilistic prey encounter may also reflect the fact that agents are
uncertain about whether they will encounter prey. When incorporating random events it is important to
choose an appropriate probability distribution: in this example a Poisson distribution would be appropriate
(the probability of a number of events occurring in a given time period), but a uniform or normal distribu-
tion would better characterise variability in many attributes. Note, however, that drawing quantities from
a normal distribution can potentially produce extreme values that would simply be impossible in the real
world; although such values will (by definition) be rare, they could invalidate the model or even halt the simulation.
The impact of stochasticity on simulation results should be explored by running multiple simulations
which are identical apart from the seed used to initialise the random number generator. In this way it is
possible to build up an ‘envelope’ containing all possible simulation results and thus to determine whether
chance alone is sufficient to produce different outcomes that support different substantive conclusions. It
is difficult to provide a hard-­and-­fast rule for how many runs should be made, but one way of deciding
is to observe the declining rate at which new results fall outside the existing envelope and then stop once
this falls to a level which suggests any possible outcomes not yet observed will be extremely rare. Note
that multiple runs should be made for each possible parameter combination (see next), so experimenta-
tion can rapidly become computationally demanding even if the model itself is relatively simple.
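
The experimental loop itself is simple, as the following Python sketch shows (the stand-in model, encounter rate and run count are all invented): identical parameters, a different seed for each run, and a record of how often a new run still falls outside the envelope observed so far:

    import random

    def run_model(seed):
        """Stand-in stochastic model: prey encounters over 100 timesteps."""
        rng = random.Random(seed)
        # Each timestep an encounter occurs with probability 0.3; a Poisson
        # draw would be the natural choice for counts per unit time
        return sum(1 for _ in range(100) if rng.random() < 0.3)

    outcomes = set()
    for seed in range(1000):  # identical parameters, different seeds
        result = run_model(seed)
        if result not in outcomes:
            outcomes.add(result)
            print(f"seed {seed}: new outcome {result} (envelope size {len(outcomes)})")
    # Stop adding runs once new outcomes have become vanishingly rare.
    # When drawing agent attributes from a normal distribution, guard
    # against impossible extremes, e.g. age = max(0.0, rng.gauss(30, 5)).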

How sensitive is the model to parameter changes?


It is usually desirable to conduct multiple simulations, each with a different combination of parameters
(Figure 14.3). There are three main reasons for this, each requiring a slightly different approach.

• Dealing with parameter uncertainty. If there is uncertainty about what parameter values best represent
the state of the world in the past then it will be necessary to run multiple simulations with different
values in order to establish the likelihood of different outcomes (rather as for dealing with stochas-
ticity). Note, however, that the likelihood of different outcomes can only be estimated if attention
is paid to both the range of parameter values and the probability that they are correct, since propor-
tionately more simulations should be run with the more likely parameter values. Consequently, the
parameter values should be drawn from a distribution which reflects the nature of the uncertainty:
for example a uniform distribution if all values are equally plausible, but perhaps a normal distribu-
tion if a certain value is most likely and more distant values increasingly unlikely.

Figure 14.3 Graphed Agent Based Model (ABM) simulation results which collectively illustrate several aspects
of experimental design: (a) plotted points of the same colour and k value differ due to stochastic effects alone;
(b) two different parameters s and k are varied; and (c) two different agent rules, “CopyTheBest” and “Copy-
IfBetter” are explored. A colour version of this figure can be found in the plates section.
Source: Reproduced with permission from Figure 4 in Crema and Lake (2015)
• Establishing what is possible. If the aim is to establish what could have happened in history under cer-
tain circumstances then it will be necessary to investigate what outcomes are possible given different
assumptions and starting points. As with the case of parameter uncertainty this requires multiple
simulation runs made with parameter values of interest. However, since the aim is not to establish
the likelihood of different outcomes there is no need to attach a probability to different parameter
values.
• Estimating unknown parameters. The aim here turns the conventional approach on its head by making
the parameter values the unknowns that are to be estimated by running simulations. The logic is to
vary the parameters and discover what values most reliably produce simulation results that match
the archaeological record. A good example of this approach is Crema, Edinborough, Kerig, and
Shennan’s (2014) use of simulation to investigate what kind of cultural transmission best explains
observed changes in European Neolithic arrowhead assemblages. Formal models of cultural trans-
mission usually have population size (of ‘teachers’) and innovation rate as important parameters, but
these values are rarely known with certainty and indeed are often quantities that the modeller would
like to infer. Crema et al. adopted an approximate Bayesian computation framework in which they
provided prior probability distributions for these parameters and then ran multiple simulations
which collectively sampled possible combinations of parameter values. By comparing the simulation
results to the observed changes in the archaeological data they were then able to provide posterior
probabilities for the parameters, in other words, to infer which values were more or less likely than
others given both initial knowledge and the results of the simulations.
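
Stripped to its bare bones, this logic amounts to a rejection sampler, sketched below in Python (a drastic simplification of the framework actually used by Crema et al.; the stand-in model, prior, observed value and tolerance are all invented):

    import random

    rng = random.Random(1)
    observed = 42.0  # summary statistic measured on the archaeological data

    def run_model(innovation_rate, rng):
        """Stand-in simulation returning the same summary statistic."""
        return 100 * innovation_rate + rng.gauss(0, 2)

    accepted = []
    for _ in range(10_000):
        # Prior: innovation rate uniform on [0, 1] (all values equally plausible)
        theta = rng.uniform(0.0, 1.0)
        if abs(run_model(theta, rng) - observed) < 1.0:  # tolerance for a 'match'
            accepted.append(theta)

    # The accepted values approximate the posterior distribution of the parameter
    print(len(accepted), sum(accepted) / max(len(accepted), 1))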

How sensitive is the model to structural changes?


The behaviour of a simulation model depends on the structure of the model (e.g. the agent rules) as well
as parameter values and chance events. Given the problem of equifinality there is a case for comparing
“alternative model scenarios” (Railsback & Grimm, 2012, p. 113) rather than simply conducting sensitiv-
ity analysis of just one model. Railsback and Grimm note that this is not yet common and is therefore a
“less formalized and more creative” process (Railsback & Grimm, 2012, p. 306) because – unlike param-
eter values – alternative rules cannot be drawn from some defined quantitative distribution, but must
instead be chosen on the basis of theoretical understanding. For example, in a spatial ABM of foraging the
modeller might swap the goal of maximising intake with that of what is technically termed ‘satisficing’
(obtaining sufficient calories). However, pursuing this example, it can be argued that making this change
is as much a comparison of two alternative models as it is a test of the robustness of the original model.
Indeed, there is a case for abandoning hypothesis-­testing using single models in favour of ‘multi-­model
selection’ (Rubio-­Campillo, 2016), which also carries with it a subtle epistemic shift from attempting to
discover if one model is true to attempting to discover which of the currently available models is ‘best’
(Burnham & Anderson, 2002). Adopting a model-­selection approach opens up the possibility of more
formalised methods for choosing between models. Again, Crema et al.’s (2014) investigation of cultural
transmission in European Neolithic arrowhead assemblages provides a good example of how this might
be achieved in practice.
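
Structural comparisons of this sort are straightforward to set up if decision rules are written as interchangeable functions passed into an otherwise identical model, as in this toy Python sketch of maximising versus satisficing foragers (all numbers invented):

    import random

    def maximise(options, need):
        return max(options)  # always take the richest patch

    def satisfice(options, need):
        # Take the first patch that meets the need, else the best on offer
        return next((o for o in options if o >= need), max(options))

    def run_model(rule, seed, need=5.0, steps=100):
        """Identical model structure; only the decision rule is swapped."""
        rng = random.Random(seed)
        intake = 0.0
        for _ in range(steps):
            options = [rng.uniform(0, 10) for _ in range(4)]
            intake += rule(options, need)
        return intake

    for rule in (maximise, satisfice):
        mean = sum(run_model(rule, s) for s in range(50)) / 50
        print(rule.__name__, round(mean, 1))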

Can the model account for multiple patterns?


One way of increasing confidence that simulation results fit observed data for the right reasons – in other
words, that the model is a good representation of reality – is to adopt an approach known as “pattern ori-
ented modelling” (POM; see Railsback & Grimm, 2012, p. 291; Altaweel, Alessa, Kliskey, & Bone, 2010).
The basic idea is that it is often relatively easy to ‘tune’ a model to replicate a single dataset comprising just
one variable, but rather more difficult to replicate multiple datasets and/or multiple variables. Achieving
the latter suggests that the model is ‘structurally realistic’. Railsback and Grimm (2012) provide an excel-
lent introduction to POM, but a brief archaeologically oriented example serves to illustrate the concept.
Mithen (1993) built a computer simulation in which human hunting impacted on the population growth
of mammoths. Rather than simply attempt to replicate the decline in the overall mammoth population, he
explicitly modelled the age structure of the mammoth population. This not only provided an additional
point of contact with the archaeological record (one more readily available than overall population size)
but also better captured the real-­world causal dynamics – that it might matter whether or not humans
hunted animals of reproductive age.
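
In code, pattern-oriented evaluation simply means that a run only counts as a fit if it matches every observed pattern simultaneously, along these schematic lines (the pattern names, values and tolerances are invented):

    def matches_all_patterns(simulated, observed, tolerances):
        """A run counts as a fit only if it matches every pattern at once."""
        return all(abs(simulated[k] - observed[k]) <= tolerances[k]
                   for k in observed)

    # Illustrative targets: overall population decline AND age structure
    observed = {"population_decline": 0.8, "juvenile_fraction": 0.35}
    tolerances = {"population_decline": 0.1, "juvenile_fraction": 0.05}
    simulated = {"population_decline": 0.75, "juvenile_fraction": 0.38}
    print(matches_all_patterns(simulated, observed, tolerances))  # True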

Dissemination and re-­use


Archaeological knowledge advances not just by collecting more data, but by subjecting existing interpre-
tations to new scrutiny. However, although much archaeological interpretation relies on the use of com-
puters and complex software, “their role in the analytical pipeline is rarely exposed for other researchers
to inspect or reuse” (Marwick, 2017, Abstract). There is growing awareness of the need to rectify this
situation (see also Ducke, 2013; Rollins, Barton, Bergin, Janssen, & Lee, 2014) and Marwick has recently
applied to archaeology some principles of reproducible research that have emerged in other fields. In the
specific case of ABM, reproducibility requires the following.

• Dissemination of the program code and input data. Ideally it should be possible for other researchers to
run the simulation, both to verify the published results and to explore other scenarios. Program
code and data can be disseminated as ‘supplementary material’ hosted alongside published journal
articles, placed on a Web-­based hosting service such as GitHub, or perhaps better still, uploaded to
a collective repository such as Open ABM (www.openabm.org/). The ABM program code should
include inline comments to help others understand how it works and should be accompanied by
information about the computational environment required to run it.
• Documentation of the conceptual model. Researchers may be able to infer many aspects of the conceptual
model from the program code itself, but that presupposes that the program is actually an accurate
reflection of the original modeller’s intention and, in any case, it is helpful to have further informa-
tion about assumptions that have been made. The ODD (Overview, Design concepts, and Details)
protocol has been proposed as a standard for describing agent-­based models and ODD-­style docu-
mentation has been incorporated into the NetLogo integrated simulation environment. The full
specification can be found in Grimm et al. (2006, 2010) but here is the outline:
Overview: The purpose of the model (which aspects of reality are included and why). What are the entities and how are they characterised? What processes are included and when do they occur?
Design concepts: For example, is the model intended to produce emergent phenomena? Does it involve individual or population-level adaptation? Does it include stochastic elements? What is the nature of any collectives?
Details: How is the model initialised? What are the external inputs? A fuller mathematical and/or verbal description of the model.
• Documentation of the experimental design. In order to reproduce and/or extend published results, other
researchers will also need to know the exact range of parameters ‘swept’ during multiple runs. Any
post-­processing of the raw simulation output (for example, the aggregation or averaging of agent
state variables) should also be documented.

Case study
As noted above, the Long House Valley ABM (Dean et al., 2000; Axtell et al., 2002) is a well-­known
archaeological model (Kohler et al., 2005) which illustrates many of the features of a modern spatial ABM
(Figure 14.4). There are several reasons for drawing attention to this model as a case study. One is that it
tackles the kind of research question (collapse of societies) that excites interest beyond academe and to that
extent, at least, is a good advertisement for the use of spatial ABM in archaeology. Moreover, and
not unrelated, a version of the model (called “Artificial Anasazi”) is available as part of the standard release
of the popular and easy to install NetLogo ABM software (Stonedahl & Wilensky, 2010b). Consequently,
the interested reader can quite quickly get to the point of running the model, experimenting with it and
ultimately exploring and even modifying the code. Finally – and unusually – this model has a history

(Swedlund, Sattenspiel, Warren, & Gumerman, 2015) to the extent that it has been re-implemented and studied by researchers who were not part of the original modelling effort, and this includes an analysis of what actually causes the model outcomes (Janssen, 2009; Stonedahl & Wilensky, 2010a). This history of use is an instructive lesson in how to ‘do science’ with archaeological spatial ABM.

Figure 14.4 Comparison of Long House Valley simulation results with archaeological evidence. A colour version of this figure can be found in the plates section.
Source: Adapted with permission from Kohler et al. (2005)

Research question
Long House Valley, in northeastern Arizona, was sparsely occupied by hunters and gatherers until the intro-
duction of maize at around 1800 BC initiated the gradual development of substantial permanent settlements
and the Puebloan Anasazi cultural tradition. The valley was abruptly abandoned around AD 1300 and the
population migrated elsewhere. A key question is what caused the abandonment and, in particular, to what
extent it can simply be explained by the onset of climatic deterioration at circa AD 1270.
Three features of Long House Valley make it particularly suitable for the application of ethnographic-­
scale spatial ABM. One is that the valley is a topographically discrete entity which, given the focus on
agricultural subsistence, provides a natural ‘edge’ for the simulated world. The second feature is the avail-
ability of very rich and high resolution palaeoenvironmental data which make it possible to estimate the
maize growing potential of every hectare in the valley annually from AD 400–1450. Third, the valley has
been intensively surveyed, so there is relatively complete knowledge of the Puebloan settlement pattern,
much of it dated by dendrochronology. Additionally, it is claimed that ethnographic studies of historic
Pueblo groups can be used to parameterise aspects of the model, such as the nutritional requirements of
agents.

Model design
The two main components of the Long House Valley model are the landscape and agents. The landscape
is a 100 × 100m raster representation of Long House Valley in which each cell is allocated to one of seven
different zones. These differ in their agricultural yield (of maize) and are variably susceptible to changes
in the Palmer Drought Severity index (a measure of the impact of moisture and temperature on crop
growth). Additionally, the model includes a raster map of water sources. In later versions of the model,
variability in soil quality within zones is modelled stochastically by the simple expedient of adding a
random number drawn from a uniform distribution between zero and some upper bound representing
the spatial harvest variance.
Each agent represents a household of five persons. Agents farm one map cell and occupy a separate
unfarmed residential location which must be within 1km of their farmland. Agents have a fission age,
at which they spawn a new household, and an age of death, when they are removed from the model.
In the first version of the model these attributes were the same for all agents, but in later versions some
stochastic heterogeneity was introduced by randomly drawing these values from a uniform distribution
with specified lower and upper bounds. The goal of agents is to grow sufficient maize to meet their
annual requirement for survival. Agents who anticipate falling short search for a new cell to farm as per
the rules in Table 14.1 and, if successful, move there. Agents who exceed their fission age have a chance
of spawning a new household, which takes a fraction of the parent household’s stored maize.
The model is run from AD 800–1350 in annual time steps. At each time step the Palmer Drought
Severity Index is updated, which alters the yield of map cells. The map of water sources is also updated,
which is one of the criteria used by agents attempting to move to a new cell to farm. Agents also pursue
their goals (harvesting maize, possibly relocating and possibly fissioning) once per time step. The result of
iterating these processes is a simulated annual record of population size and settlement location.
Table 14.1 Rules for choosing new farming and settlement locations (from Axtell et al., 2002, Table 2).

A. Identification of agricultural location:
i) The location must be currently unfarmed and uninhabited.
ii) The location must have potential maize production sufficient for a minimum harvest of 160 kg per person per year. Future maize production is estimated from that of neighbouring sites.
iii) If multiple sites satisfy these criteria the location closest to the current residence is selected.
iv) If no site meets the criteria the household leaves the valley.
B. Identification of a residential location:
i) The residence must be within 1 km of the agricultural plot.
ii) The residential location must be unfarmed (although it may be inhabited, i.e. multihousehold sites are permitted).
iii) The residence must be in a less productive zone than the agricultural land identified in A.
If multiple sites satisfy the above criteria the location closest to the water resources is selected.
If no site meets these criteria they are relaxed in order of iii then i.
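
Rule A translates into code along roughly the following lines. This Python sketch is a paraphrase of the published rules rather than the released source (the Cell class, distance helper and choose_farm_plot function are invented; estimated_yield stands in for the neighbour-based yield estimate of rule A.ii):

    import math
    from dataclasses import dataclass

    @dataclass
    class Cell:
        x: float
        y: float
        farmed: bool = False
        inhabited: bool = False

    def distance(a, b):
        return math.hypot(a.x - b.x, a.y - b.y)

    def choose_farm_plot(residence, cells, estimated_yield, min_harvest=160 * 5):
        """Rule A: nearest unfarmed, uninhabited cell whose estimated maize
        production meets the household minimum (160 kg per person, 5 persons)."""
        candidates = [c for c in cells
                      if not c.farmed and not c.inhabited
                      and estimated_yield(c) >= min_harvest]
        if not candidates:
            return None  # rule A.iv: no suitable cell, the household emigrates
        return min(candidates, key=lambda c: distance(c, residence))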

Further details of the model can be found in several sources. The version of the model distributed as
part of the standard NetLogo model library includes an ODD-­like description, which can also be viewed
at http://ccl.northwestern.edu/netlogo/models/ArtificialAnasazi. More detail, including tables of agent
attributes and rules in the original model are published in Axtell et al. (2002). Similar information is
provided by Janssen (2009), who also describes certain submodels (for example, how exactly
the agricultural yield is calculated).

Experiments
The first version of the model has 17 parameters, and the model was initially run with values based on
ethnographic accounts of historic Pueblo groups, as per Table 14.2. It was found that with these “base
case” (Axtell et al., 2002) parameter values the model could reproduce qualitative features of the history
of demographic changes and settlement patterns in Long House Valley, but the actual population sizes
were up to six times too large (Axtell et al., 2002; Kohler et al., 2005). Subsequent adjustment of farming
yields to reflect characteristics of prehistoric maize coupled with the introduction of landscape and agent
heterogeneity, as mentioned above, resulted in the model closely matching the historic population sizes
(estimated from room counts).
The experimental design for the version of the model with greater stochasticity entailed calibrating
the model by varying the upper and lower bounds of the stochastic parameters to find the values which
produced the best fit between the simulated and historic population sizes (Axtell et al., 2002). This was
undertaken for both individual runs and for averages of 15 runs, the latter reflecting the fact that runs
with identical parameters can produce different results by chance alone.
Janssen (2009) subsequently conducted a further round of experiments on a version of the Long House Valley model re-implemented in NetLogo. He was able to replicate the results reported by Axtell et al.
(2002), although it is interesting to see (Janssen, 2009, Figure 3) that even the calibrated model can pro-
duce quite variable results, some of which do not so convincingly match the qualitative features of the
population history (Figure 14.5). Perhaps more importantly, Janssen (2009, Paragraph 4.1) also conducted
experiments designed specifically to answer the question “What leads to the good fit of the simulation with the aggregated population data?” He found that the fit between the simulated and historic population is primarily a function of landscape carrying capacity rather than parameters determining the longevity of households or at what age they might fission.

Table 14.2 Original ‘base’ parameter values for the Long House Valley model (from Axtell et al., 2002, Table 4).

Parameter                                    Value
Random seed                                  Varies
Year at model start                          AD 800
Year at model termination                    AD 1350
Nutritional need per individual              800 kg
Maximum length of grain storage              2 years
Harvest adjustment                           1
Annual variance in harvest                   0.1
Spatial variance in harvest                  0.1
Household fission age                        16 years
Household death age                          30 years
Fertility (annual probability of fission)    0.125
Grain store given to new household           0.33
Maximum farm to residence distance           1600 m
Initial corn stocks, minimum                 2000 kg
Initial corn stocks, maximum                 2400 kg
Initial household age, minimum               0 years
Initial household age, maximum               29 years

Figure 14.5 Population curves produced by 100 runs of the calibrated Long House Valley Model, differing only in random seed.
Source: Reproduced under a CC-BY-4.0 license from Janssen (2009)

Implications
The best fitting runs of the calibrated model produce annual population sizes that track the estimated
historic values uncannily well up until abandonment of Long House Valley. If Janssen’s analysis is correct,
this may be primarily a function of the quality of the carrying capacity estimates derived from painstaking
palaeoenvironmental research. On the other hand, even the best-­fitting runs fail to predict the complete
depopulation of Long House Valley at circa AD1300 and so all those who have analysed the model are
in agreement that it has convincingly demonstrated that environmental factors alone cannot account
for the abrupt abandonment of the valley. Indeed, Kohler et al. (2005) suggest that archaeologists should
instead look for sociopolitical or ideological drivers of this event. The role that ABM might play in this
next instalment of research is discussed by Janssen (2009).

Conclusion
Archaeological ABM are used for a variety of purposes and vary greatly in their complexity. Twenty –
perhaps even ten – years ago, ABM was almost always computationally ‘cutting edge’ in some way, and
this is still true of some more complex models, especially those requiring high performance computing
and/or generating virtual reality visualisations. On the other hand, many recent archaeological ABMs
have been implemented using well-­established software and run on relatively mainstream hardware. This
does not mean that those archaeological ABMs are not computationally demanding, but that hardware
and software are now sufficient to permit greater focus on other issues such as experimental design. The
fact that the technological aspects of ABM have in many cases become less remarkable (literally so in
recent publications) suggests that the technique has genuinely come of age as a useful part of the archaeo-
logical toolkit. As the technology of ABM becomes ever more accessible it is hoped that this chapter will
help users understand what makes an archaeological ABM scientifically productive.

References
Aldenderfer, M. S. (1981a). Computer simulation for archaeology: An introductory essay. In J. A. Sabloff (Ed.),
Simulations in archaeology (pp. 67–118). Albuquerque: University of New Mexico.
Aldenderfer, M. S. (1981b). Creating assemblages by computer simulation: The development and uses of ABSIM. In
J. A. Sabloff (Ed.), Simulations in archaeology (pp. 11–49). Albuquerque: University of New Mexico.
Aldenderfer, M. S. (1998). Quantitative methods in archaeology: A review of recent trends and developments. Journal
of Archaeological Research, 6, 91–120.
Allen, P., & McGlade, J. (1987). Evolutionary drive: The effect of microscopic diversity. Foundations of Physics, 17,
723–738.
Altaweel, M., Alessa, L., Kliskey, A., & Bone, C. (2010). A framework to structure agent-­based modeling data for
social-­ecological systems. Structure and Dynamics, 4(1), article 2.
Angourakis, A., Rondelli, B., Stride, S., Rubio-­Campillo, X., Balbo, A. L., Torrano, A., . . . Gurt, J. M. (2014). Land use
patterns in Central Asia. step 1: The musical chairs model. Journal of Archaeological Method and Theory, 21, 405–425.
Aubán, J. B., Barton, C. M., Gordó, S. P., & Bergin, S. M. (2015). Modeling initial Neolithic dispersal: The first agricultural groups in west Mediterranean. Ecological Modelling, 307, 22–31.
Axtell, R. L., Epstein, J. M., Dean, J. S., Gumerman, G. J., Swedlund, A. C., Harburger, J., . . . Parker, M. (2002).
Population growth and collapse in a multiagent model of the Kayenta Anasazi in Long House Valley. Proceedings
of the National Academy of Sciences of the United States of America, 99(Suppl 3), 7275–7279.
Barton, C., Riel-­Salvatore, J., Anderies, J., & Popescu, G. (2011). Modeling human ecodynamics and biocultural
interactions in the late Pleistocene of western Eurasia. Human Ecology, 39, 1–21.
Barton, C., Ullah, I., & Mitasova, H. (2010). Computational modeling and Neolithic socioecological dynamics: A
case study from southwest Asia. American Antiquity, 75(2), 364–386.
Barton, C. M. (2013). Stories of the past or science of the future? Archaeology and computational social science. In
A. Bevan & M. Lake (Eds.), Computational Approaches to Archaeological Spaces (pp. 151–178). Walnut Creek, CA:
Left Coast Press.
Barton, C. M., Ullah, I., Mayer, G., Bergin, S., Sarjoughian, H., & Mitasova, H. (2015). MedLanD modeling laboratory
v.1 (version 1.0.0). Technical Report, CoMSES Computational Model Library. Retrieved from www.comses.net/
codebases/4609/releases/1.0.0/
Barton, C. M., Ullah, I. I., & Bergin, S. (2010). Land use, water and Mediterranean landscapes: Modelling long-­term
dynamics of complex socio-­ecological systems. Philosophical Transactions of the Royal Society A: Mathematical, Physical
and Engineering Sciences, 368(1931), 5275–5297.
Bedau, M. A., & Humphreys, P. (2008). Introduction. In M. A. Bedau & P. Humphreys (Eds.), Emergence: Contempo-
rary readings in philosophy and science (pp. 1–6). Cambridge, MA: The MIT Press.
Beekman, C. S. (2005). Agency, collectivities and emergence: Social theory and agent based simulations. In C. S.
Beekman & W. W. Baden (Eds.), Nonlinear models for archaeology and anthropology (pp. 51–78). Aldershot, UK:
Ashgate.
Beekman, C. S., & Baden, W. W. (2005). Continuing the revolution. In C. S. Beekman & W. W. Baden (Eds.),
Nonlinear models for archaeology and anthropology (pp. 1–12). Aldershot, UK: Ashgate.
Bentley, A., & Ormerod, P. (2012). Agents, intelligence, and social atoms. In M. Collard & E. Slingerland (Eds.),
Creating consilience: Reconciling science and the humanities (pp. 205–222). Oxford: Oxford University Press.
Bentley, R. A., Hahn, M. W., & Shennan, S. J. (2004). Random drift and culture change. Proceedings of the Royal Society
of London B, 271, 1443–1450.
Bentley, R. A., Lake, M. W., & Shennan, S. J. (2005). Specialisation and wealth inequality in a model of a clustered
economic network. Journal of Archaeological Science, 32(9), 1346–1356.
Binford, L. R. (1977). For theory building in archaeology. New York: Academic Press.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multi-­model inference: A practical information-­theoretic
approach (2nd ed.). New York: Springer.
Cegielski, W., & Rogers, J. (2016). Rethinking the role of agent-­based modeling in archaeology. Journal of Anthro-
pological Archaeology, 41, 283–298.
Ch’ng, E. (2007). Using games engines for archaeological visualisation: Recreating lost worlds. Proceedings of CGames
2007 (11th international conference on computer games: AI, animation, mobile, educational & serious games), La Rochelle,
France (2007), 7, 26–30.
Ch’ng, E., Chapman, H., Gaffney, V., Murgatroyd, P., Gaffney, C., & Neubauer, W. (2011). From sites to landscapes:
How computing technology is shaping archaeological practice. Computer, 44(7), 40–46.
Ch’ng, E., & Stone, R. J. (2006). 3D archaeological reconstruction and visualisation: An artificial life model for determining
vegetation dispersal patterns in ancient landscapes. Proceedings of the International Conference on Computer Graph-
ics, Imaging and Visualisation (CGIV’06), Sydney, Australia.
Clark, J. E. (2000). Towards a better explanation of hereditary inequality: A critical assessment of natural and historic
human agents. In M. A. Dobres & J. E. Robb (Eds.), Agency in archaeology (pp. 92–112). London: Routledge.
Conte, R., & Gilbert, N. (1995). Introduction: Computer simulation for social theory. In N. Gilbert & R. Conte
(Eds.), Artificial societies: The computer simulation of social life (pp. 1–18). London: UCL Press.
Contreras, D., Guiot, J., Suarez, R., & Kirman, A. (2018). Reaching the human scale: A spatial and temporal downscal-
ing approach to the archaeological implications of paleoclimate data. Journal of Archaeological Science, 93, 54–67.
Costopoulos, A. (2001). Evaluating the impact of increasing memory on agent behaviour: Adaptive patterns in an
agent-­based simulation of subsistence. Journal of Artificial Societies and Social Simulation, 4. Retrieved from www.
soc.surrey.ac.uk/JASSS/4/4/7.html
Costopoulos, A. (2009). Simulating society. In H. Maschner, R. A. Bentley, & C. Chippindale (Eds.), Handbook of
archaeological theories (pp. 273–281). Lanham, MD: Altamira Press.
Costopoulos, A. (2010). For a theory of archaeological simulation. In A. Costopoulos & M. Lake (Eds.), Simulating
change: Archaeology into the twenty-­first century (pp. 21–27). Salt Lake City: University of Utah Press.
Costopoulos, A. (2017). Can you model my valley? Particular people, places and times in archaeological simulation.
tDAR id: 431522.
Cousins, S. A. O., Lavorel, S., & Davies, I. (2003). Modelling the effects of landscape pattern and grazing regimes on
the persistence of plant species with high conservation value in grasslands in south-­eastern Sweden. Landscape
Ecology, 18, 315–332.
Cowgill, G. E. (2000). “Rationality” and contexts in agency theory. In M.-­A. Dobres & J. E. Robb (Eds.), Agency in
archaeology (pp. 51–60). London: Routledge.
Crema, E., Edinborough, K., Kerig, T., & Shennan, S. (2014). An approximate bayesian computation approach for
inferring patterns of cultural evolutionary change. Journal of Archaeological Science, 50, 160–170.
Crema, E. R. (2014). A simulation model of fission-­fusion dynamics and long-­term settlement change. Journal of
Archaeological Method and Theory, 21, 385–404. doi:10.1007/s10816-­013-­9185-­4
Crema, E. R., & Lake, M. W. (2015). Cultural incubators and spread of innovation. Human Biology, 87(3), 151–168.
Crooks, A. T., & Castle, C. J. E. (2012). The integration of agent-­based modelling and geographical information for
geospatial simulation. In A. J. Heppenstall, et al. (Eds.), Agent-­based models of geographical systems (pp. 219–251).
Dordrecht: Springer.
Dean, J. S., Gumerman, G. J., Epstein, J. M., Axtell, R. L., Swedlund, A. C., Parker, M. T., & McCarroll, S. (2000).
Understanding Anasazi culture change through agent-­based modeling. In T. A. Kohler & G. J. Gumerman (Eds.),
Dynamics in human and primate societies: Agent-­based modelling of social and spatial processes. Santa Fe Institute Studies
in the Sciences of Complexity (pp. 179–205). New York: Oxford University Press.
Doran, J. E. (1970). Systems theory, computer simulations, and archaeology. World Archaeology, 1, 289–298.
Doran, J. E., & Palmer, M. (1995). The EOS project: Integrating two models of Palaeolithic social change. In
N. Gilbert & R. Conte (Eds.), Artificial societies: The computer simulation of social life (pp. 103–125). London: UCL
Press.
Doran, J. E., Palmer, M., Gilbert, N., & Mellars, P. (1994). The EOS project: Modelling Upper Palaeolithic social
change. In N. Gilbert & J. Doran (Eds.), Simulating societies (pp. 195–221). London: UCL Press.
Ducke, B. (2013). Reproducible data analysis and the open source paradigm in archaeology. In A. Bevan & M. Lake
(Eds.), Computational approaches to archaeological spaces (pp. 307–318). Walnut Creek, CA: Left Coast Press.
Eerkens, J. W., Bettinger, R. L., & McElreath, R. (2005). Cultural transmission, phylogenetics, and the archaeological
record. In C. P. Lipo, M. J. O’Brien, M. Collard, & S. J. Shennan (Eds.), Mapping our ancestors: Phylogenic methods
in anthropology and prehistory (pp. 169–183). Somerset, NJ: Transaction Publishers.
Epstein, J. M., & Axtell, R. (1996). Growing artificial societies: Social science from the bottom up. Washington, DC: Brookings Institution Press and MIT Press.
Ferber, J. (1999). Multi-­agent systems: An introduction to distributed artificial intelligence. Harlow, England: Addison-­Wesley.
Giddens, A. (1984). The constitution of society: Outline of a theory of structuration. Cambridge: Polity Press.
Gilbert, N. (1995). Emergence in social simulation. In N. Gilbert & R. Conte (Eds.), Artificial societies: The computer
simulation of social life (pp. 144–156). London: UCL Press.
Gilbert, N. (2008). Agent-­based models. Quantitative Applications in the Social Sciences. Thousand Oaks, CA: Sage.
Gould, S. J. (1989). Wonderful life: The Burgess Shale and the nature of history (paperback ed.). London: Vintage.
Grimm, V., Berger, U., Bastiansen, F., Eliassen, S., Ginot, V., Giske, J., . . . DeAngelis, D. L. (2006). A standard protocol
for describing individual-­based and agent-­based models. Ecological Modelling, 198, 115–126.
Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J., & Railsback, S. F. (2010). The ODD protocol: A
review and first update. Ecological Modelling, 221, 2760–2768.
Grimm, V., & Railsback, S. (2005). Individual-­based modeling and ecology. Princeton: Princeton University Press.
Gumerman, G. J., & Kohler, T. A. (2001). Creating alternative cultural histories in the prehistoric Southwest: Agent-­
based modelling in archaeology. In Examining the course of Southwest archaeology: The Durango conference, September
1995 (pp. 113–124). Albuquerque: New Mexico Archaeological Council.
Hinde, R. A. (1976). Interactions, relationships and social structure. Man, 11, 1–17.
Innis, G. S. (1972). Simulation of ill-­defined systems, some problems and progress. Simulation, 19, 33–36.
Janssen, M. A. (2009). Understanding Artificial Anasazi. Journal of Artificial Societies and Social Simulation, 12(4), 13.
Retrieved from http://jasss.soc.surrey.ac.uk/12/4/13.html
Johnson, C. D., & Kohler, T. A. (2012). Modeling plant and animal productivity and fuel use. In T. A. Kohler,
M. D. Varien, & A. M. Wright (Eds.), Emergence and collapse of early villages: Models of Central Mesa Verde archaeology
(pp. 113–128). Berkeley: University of California Press.
Johnson, C. D., Kohler, T. A., & Cowan, J. (2005). Modeling historical ecology, thinking about contemporary sys-
tems. American Anthropologist, 107(1), 96–107.
Kachel, A. F., Premo, L. S., & Hublin, J.-­J. (2011). Grandmothering and natural selection. Proceedings of the Royal
Society B: Biological Sciences, 278(1704), 384–391.
King, A. (1999). Against structure: A critique of morphogenetic social theory. The Sociological Review, 47(2), 199–227.
Kobti, Z. (2012). Simulating household exchange with cultural algorithms. In T. A. Kohler & M. D. Varien (Eds.),
Emergence and collapse of early villages: Models of central Mesa Verde archaeology (pp. 165–174). Berkeley: University
of California Press.
Kohler, T. A. (2000). Putting social sciences together again: An introduction to the volume. In T. A. Kohler & G. J.
Gumerman (Eds.), Dynamics in human and primate societies: Agent-­based modelling of social and spatial processes. Santa
Fe Institute Studies in the Sciences of Complexity (pp. 1–44). New York: Oxford University Press.
Kohler, T. A., Bocinsky, R. K., Cockburn, D., Crabtree, S. A., Varien, M. D., Kolm, K. E., . . . Kobti, Z. (2012a).
Modelling prehispanic Pueblo societies in their ecosystems. Ecological Modelling, 241, 30–41.
Kohler, T. A., Cockburn, D., Hooper, P. L., Bocinsky, R. K., & Kobti, Z. (2012b). The coevolution of group size and
leadership: An agent-­based public goods model for prehispanic Pueblo societies. Advances in Complex Systems,
15(1 & 2), 1150007-­1–1150007-­29.
Kohler, T. A., & Gumerman, G. J. (Eds.). (2000). Dynamics in human and primate societies: Agent-­based modeling of social
and spatial processes. Oxford: Oxford University Press.
Kohler, T. A., Gumerman, G. J., & Reynolds, R. G. (2005). Simulating ancient societies. Scientific American, 293, 76–84.
Kohler, T. A., Johnson, C. D., Varien, M., Ortman, S., Reynolds, R., Kobti, Z., . . . Yap, L. (2007). Settlement ecody-
namics in the prehispanic central Mesa Verde region. In T. A. Kohler & S. E. van der Leeuw (Eds.), Model-­based
archaeology of socionatural systems (pp. 61–104). Santa Fe, NM: SAR Press.
Kohler, T. A., Kresl, J., West, C. V., Carr, E., & Wilshusen, R. H. (2000). Be there then: A modeling approach to settle-
ment determinants and spatial efficiency among Late Ancestral Pueblo populations of the Mesa Verde region, U.S.
Southwest. In T. A. Kohler & G. J. Gumerman (Eds.), Dynamics in human and primate societies. Santa Fe Institute
Studies in the Sciences of Complexity (pp. 145–178). New York: Oxford University Press.
Kohler, T. A., & van der Leeuw, S. E. (2007a). Introduction: Historical socionatural systems and models. In T. A.
Kohler & S. E. van der Leeuw (Eds.), The model-­based archaeology of socionatural systems (pp. 1–12). Santa Fe: School
for Advanced Research Press.
Kohler, T. A., & van der Leeuw, S. E. (Eds.). (2007b). The model-­based archaeology of socionatural systems. Santa Fe: School
for Advanced Research Press.
Kohler, T. A., & Varien, M. D. (2012). Emergence and collapse of early villages in the central Mesa Verde: An intro-
duction. In T. Kohler & M. Varien (Eds.), Emergence and collapse of early villages in the Central Mesa Verde: Models of
Central Mesa Verde archaeology (pp. 1–14). Berkeley: University of California Press.
Kolm, K. E., & Smith, S. M. (2012). Modeling paleohydrological system structure and function. In T. A. Kohler,
M. D. Varien, & A. M. Wright (Eds.), Emergence and collapse of early villages: Models of Central Mesa Verde archaeology
(pp. 73–83). Berkeley: University of California Press.
Lake, M. (2015). Explaining the past with ABM: On modelling philosophy. In G. Wurzer, K. Kowarik, & H. Resch-
reiter (Eds.), Agent-­based modeling and archaeology (pp. 3–35). Switzerland: Springer.
Lake, M. W. (2000a). MAGICAL computer simulation of Mesolithic foraging. In T. A. Kohler & G. J. Gumerman
(Eds.), Dynamics in human and primate societies: Agent-­based modelling of social and spatial processes (pp. 107–143).
New York: Oxford University Press.
Lake, M. W. (2000b). MAGICAL computer simulation of Mesolithic foraging on Islay. In S. J. Mithen (Ed.), Hunter-­
gatherer landscape archaeology: The Southern Hebrides Mesolithic project, 1988–98,Volume 2: Archaeological Fieldwork on
Colonsay, Computer Modelling, Experimental Archaeology, and Final Interpretations (pp. 465–495). Cambridge: The
McDonald Institute for Archaeological Research.
Lake, M. W. (2001a). Numerical modelling in archaeology. In D. R. Brothwell & A. M. Pollard (Eds.), Handbook of
archaeological sciences (pp. 723–732). Chichester: John Wiley & Sons.
Lake, M. W. (2001b). The use of pedestrian modelling in archaeology, with an example from the study of cultural
learning. Environment and Planning B: Planning and Design, 28, 385–403.
Lake, M. W. (2004). Being in a simulacrum: Electronic agency. In A. Gardner (Ed.), Agency uncovered: Archaeological
perspectives on social agency, power and being human (pp. 191–209). London: UCL Press.
Lake, M. W. (2014). Trends in archaeological simulation. Journal of Archaeological Method and Theory, 21(2), 258–287.
Lake, M. W., & Crema, E. R. (2012). The cultural evolution of adaptive-­trait diversity when resources are uncertain
and finite. Advances in Complex Systems, 15(1 & 2), 1150013-­1–1150013-­19.
Levins, R. (1966). The strategy of model building in population biology. American Scientist, 54(4), 421–431.
Marwick, B. (2017). Computational reproducibility in archaeological research: Basic principles and a case study of
their implementation. Journal of Archaeological Method and Theory, 24(2), 424–450.
McGlade, J. (2005). Systems and simulacra: Modeling, simulation, and archaeological interpretation. In H. D. G.
Maschner & C. Chippindale (Eds.), Handbook of archaeological methods (pp. 554–602). Oxford: Altamira Press.
Mithen, S. J. (1988). Simulation as a methodological tool: Inferring hunting goals from faunal assemblages. In
C. L. N. Ruggles & S. P. Q. Rahtz (Eds.), Computer applications and quantitative methods in archaeology 1987. Number
393 in International Series (pp. 119–137). Oxford: British Archaeological Reports.
Mithen, S. J. (1989). Modeling hunter-­gatherer decision making: Complementing optimal foraging theory. Human
Ecology, 17, 59–83.
Mithen, S. J. (1990). Thoughtful foragers: A study of prehistoric decision making. Cambridge: Cambridge University Press.
Mithen, S. J. (1991). “A cybernetic wasteland”? Rationality, emotion and Mesolithic foraging. Proceedings of the
Prehistoric Society, 57, 9–14.
Mithen, S. J. (1993). Simulating mammoth hunting and extinction: Implications for the Late Pleistocene of the Cen-
tral Russian Plain. Archeological Papers of the American Anthropological Association, 4(1), 163–178.
Mithen, S. J. (1994). Simulating prehistoric hunter-­gatherers. In N. Gilbert & J. Doran (Eds.), Simulating societies:
The computer simulation of social phenomena (pp. 165–193). London: UCL Press.
Odling-­Smee, F. J., Laland, K. N., & Feldman, M. W. (2003). Niche construction: The neglected process in evolution. Princ-
eton, NJ: Princeton University Press.
Orton, C. (1982). Computer simulation experiments to assess the performance of measures of quantity of pottery.
World Archaeology, 14, 1–19.
Pierce, C. (1989). A critique of middle-­range theory in archaeology. Paper completed at first year of graduate school.
Premo, L. S. (2005). Patchiness and prosociality: An agent-­based model of Plio/Pleistocene hominid food sharing.
In P. Davidsson, K. Takadama, & B. Logan (Eds.), MABS 2004, volume 3415 of lecture notes in artificial intelligence
(pp. 210–224). Berlin: Springer-­Verlag.
Premo, L. S. (2007). Exploratory agent-­based models: Towards an experimental ethnoarchaeology. In J. T. Clark &
E. M. Hagemeister (Eds.), Digital discovery: Exploring new frontiers in human heritage, CAA 2006: Computer applica-
tions and quantitative methods in archaeology (pp. 29–36). Budapest: Archeolingua Press.
Premo, L. S. (2008). Exploring behavioral terra incognita with archaeological agent-­based models. In B. Frischer &
A. Dakouri-­Hild (Eds.), Beyond illustration: 2D and 3D technologies as tools of discovery in archaeology. British Archaeo-
logical Reports International Series (pp. 46–138). Oxford: ArchaeoPress.
Premo, L. S. (2012). Local extinctions, connectedness, and cultural evolution in structured populations. Advances in
Complex Systems, 15(1 & 2), 1150002-­1–1150002-­18.
Premo, L. S. (2014). Cultural transmission and diversity in time-­averaged assemblages. Current Anthropology, 55(1),
105–114.
Premo, L. S., & Kuhn, S. L. (2010). Modeling effects of local extinctions on culture change and diversity in the
Paleolithic. PLoS One, 5(12), e15582.
Premo, L. S., & Scholnick, J. B. (2011). The spatial scale of social learning affects cultural diversity. American Antiquity,
76(1), 163–176.
Railsback, S. F., & Grimm, V. (2012). Agent-­based and individual-­based modeling: A practical introduction. Princeton:
Princeton University Press.
Reynolds, R. G. (1987). A production system model of hunter-­gatherer resource scheduling adaptations. European
Journal of Operational Research, 30(3), 237–239.
Rogers, J., & Cegielski, W. H. (2017). Building a better past with the help of agent-­based modeling. Proceedings of
the National Academy of Sciences, 114(49), 12841–12844.
Rollins, N. D., Barton, C. M., Bergin, S., Janssen, M. A., & Lee, A. (2014). A computational model library for pub-
lishing model documentation and code. Environmental Modelling & Software, 61, 59–64.
Rubio-­Campillo, X. (2016). Model selection in historical research using Approximate Bayesian Computation. PLoS
One, 11(1), e0146491.
Rubio-­Campillo, X., María Cela, J., & Hernàndez Cardona, F. (2011). Simulating archaeologists? Using agent-­based
modelling to improve battlefield excavations. Journal of Archaeological Science, 39, 347–356.
Schuster, H. G. (1988). Deterministic Chaos. New York: VCH Publishers.
Shanks, M., & Tilley, C. (1987). Re-constructing archaeology. Cambridge: Cambridge University Press.
Slingerland, E., & Collard, M. (2012). Introduction: Creating consilience: Toward a second wave. In E. Slingerland &
M. Collard (Eds.), Creating consilience: Integrating the sciences and humanities (e ed., pp. 123–740). Oxford: Oxford
University Press.
Stonedahl, F., & Wilensky, U. (2010a). Evolutionary robustness checking in the Artificial Anasazi Model. Proceedings of the
AAAI Fall symposium on complex adaptive systems: Resilience, robustness, and evolvability, November 11–13,
Arlington, VA.
Stonedahl, F., & Wilensky, U. (2010b). Netlogo Artificial Anasazi Model. Retrieved from http://ccl.northwestern.edu/
netlogo/models/ArtificialAnasazi
Surovell, T., & Brantingham, P. (2007). A note on the use of temporal frequency distributions in studies of prehistoric
demography. Journal of Archaeological Science, 34(11), 1868–1877.
Swedlund, A. C., Sattenspiel, L., Warren, A. L., & Gumerman, G. J. (2015). Modeling archaeology: Origins of the
Artificial Anasazi Project and beyond. In G. Wurzer, K. Kowarik, & H. Reschreiter (Eds.), Agent-based modeling and
archaeology (pp. 37–52). Switzerland: Springer.
Thomas, J. (1991). The hollow men? A reply to Steven Mithen. Proceedings of the Prehistoric Society, 57, 15–20.
van der Leeuw, S., & Redman, C. L. (2002). Placing archaeology at the center of socio-­natural studies. American
Antiquity, 67(4), 597–605.
van der Leeuw, S. E. (2008). Climate and society: Lessons from the past 10000 years. AMBIO: A Journal of the Human
Environment, 37(sp14), 476–482.
Waldrop, M. (1992). Complexity: The emerging science at the edge of order and chaos. New York: Simon & Schuster.
Westervelt, J. D. (2002). Geographic information systems and agent-­based modelling. In H. R. Gimblett (Ed.),
Integrating geographic information systems and agent-­based modeling techniques for simulating social and ecological processes.
Santa Fe Institute Studies in the Sciences of Complexity (pp. 83–104). Oxford: Oxford University Press.
Wilkinson, T., Christiansen, J., Ur, J., Widell, M., & Altaweel, M. (2007). Urbanization within a dynamic environ-
ment: Modeling Bronze Age communities in upper Mesopotamia. American Anthropologist, 109(1), 52–68.
Wobst, H. M. (1974). Boundary conditions for Palaeolithic social systems: A simulation approach. American Antiquity,
39, 147–178.
Worboys, M., & Duckham, M. (2004). GIS: A computing perspective (2nd ed.). London: Taylor and Francis.
Wren, C. D., Zue, J. X., Costopoulos, A., & Burke, A. (2014). The role of spatial foresight on models of hominin
dispersal. Journal of Human Evolution, 69, 70–78.
Xue, J. Z., Costopoulos, A., & Guichard, F. (2011). Choosing fitness-­enhancing innovations can be detrimental under
fluctuating environments. PloS One, 6(11), e26770.
Zubrow, E. (1981). Simulation as a heuristic device in archaeology. In J. A. Sabloff (Ed.), Simulations in archaeology
(pp. 143–188). Albuquerque: University of New Mexico Press.
15
Spatial networks
Tom Brughmans and Matthew A. Peeples

Introduction

What are spatial networks?


A network is a formal representation of the structure of relations among a set of entities of interest. In
many cases, networks are analysed as mathematical graphs where the entities are defined as nodes with
the connections among pairs of nodes defined as edges representing a formal dyadic relationship (edges
are also sometimes called arcs, ties, or links). Nodes and edges can be used to represent any features and
relationships of interest, the only requirements being that they can be formally described and that their
boundaries can be unambiguously defined (at least for analytical purposes). Networks can be described
and visualized in a variety of formats (see Figure 15.1) which can provide information on the presence/
absence or weights of edges or the direction of flows across a network, as well as attributes of nodes and
edges defined without direct reference to the network itself (e.g. node age, size, population estimates, edge
length, etc.). The methods and models used to collect, manage, analyse, present, and interpret network
data are diverse, but generally connected by the notion that the properties of nodes, edges, attributes, and
global structures of a network (or any combination thereof) depend on one another in ways that can
provide us with unique insights and testable ideas about the drivers of a range of social processes (Brandes,
Robins, McCranie, & Wasserman, 2013, pp. 9–11).
Here we focus on a specific class of networks that have received considerable attention in archaeol-
ogy: spatial networks. Spatial networks refer to any set of formally defined nodes and edges where these
network features are located in geometric space, and where network topology (the structural arrangement
of network elements) is at least partly constrained by the spatial relationships among them (Barthelemy,
2011). Common examples include road networks, power grids, or even the internet as a spatialized set of
connections among computers and routers. In archaeology, spatial networks have been used to investigate
a range of phenomena including transportation or flows across roads, rivers, currents, or other cost-­paths;
line-­of-­sight networks for exploring intervisibility; space-­syntax graphs for exploring the accessibility of
features, settlements, or broader landscapes (see Thaler, this volume); and material culture networks of
exchange, interaction, or similarity constrained by the geographic loci of production and consumption
of those materials. We discuss these various applications in detail in this chapter.
Figure 15.1 Four different network data representations of the same hypothetical Mediterranean transport
network. (a) adjacency matrix with edge length (in km) in cells corresponding to a connection; (b) node-­
link-diagram where edge width represents length (in km); please refer to the colour plate for a breakdown by
transport type (red lines = sea, green = river, grey = road); (c) edge list; (d) geographical layout. Once
again, please refer to the colour plate for a breakdown of transport type. A colour version of this figure can be
found in the plates section.
Source: Background © Openstreetmap

Spatial network data allow us to directly explore the systematic spatial relationships among nodes,
edges and attributes that would otherwise be difficult to characterize. The abstract transport network
shown in Figure 15.1 provides an instructive example. The different roles played in the Roman transport
system by Cosa and Portus cannot be understood with reference to their spatial locations and proximity to
other towns alone; they also depend on the opportunities afforded by their relationships with all other towns
by way of connections across roads, rivers, and seas. From Portus all other towns can be reached directly
in one step over the transport network, whereas from Cosa two steps are needed to reach either Puteoli
or Carthage. Moreover, the maritime route between Cosa and Portus could come into use or become
popular as a result of the slower alternative route via Rome. When such dependencies are of interest, spa-
tial network methods, often coupled with GIS analytical tools, can offer extremely valuable approaches.
Before we proceed with the archaeological application of spatial networks, we want to briefly
consider the interchangeable use of the words network and graph. The word graph is more commonly
used in the fields of mathematics, computer science and computational geometry. Indeed, graph the-
ory is a long-­established subdiscipline of mathematics and one of the fundamentals of computer sci-
ence (Harary, 1969). In many disciplines where graph theory is applied to real-­world phenomena the
term network is used, and this is the case for the two disciplines with the most active traditions of net-
work research: Social Network Analysis (SNA) and statistical physics. However, in practice the terms
graph and network are commonly used interchangeably and we will here consistently use the term
network.

An overview of archaeological network research: introduction


Spatial networks have a long history in archaeology and many of the earliest applications of network
methods drew upon tools for creating and analysing geographically explicit networks to explore
settlement patterns and exchange systems in particular (see Stjernquist, 1966; Doran & Hodson, 1975,
pp. 12–15; Hodder & Orton, 1976, pp. 68–73 for some early examples of network visualizations).
Building on these early calls, formal spatial network analytical approaches have been sporadically
applied by archaeologists to a host of issues since the 1970s (Terrell (1977) is often cited as the first
formal example) but perhaps surprisingly given the frequent use of spatial data in archaeology, network
methods in general and spatial networks in particular have only recently seen a dramatic increase in
popularity (see Brughmans & Peeples, 2017; Collar, Coward, Brughmans, & Mills, 2015). In this sec-
tion we briefly discuss four of the most common applications of spatial network data in archaeology.
Some of these applications concern network representations of observed relationships such as roads
connecting places, whereas others concern network representations of relationships derived from
archaeological data through an intermediary method, such as the use of similarity measures to repre-
sent material culture similarity networks. This overview is by no means exhaustive, but our discussion
highlights the most common ways that archaeological data are abstracted and formally represented as
network data.

Roads, rivers, oceans, traversal and transportation


Perhaps the most direct method for representing a network based on archaeological spatial data involves
the assessment of transportation and flows at various scales based on formal features like roads, trails, or
rivers or simply the likely paths across various landscapes or waterways. In such networks, nodes are typi-
cally defined as discrete features at a site or on the landscape (rooms, sites, settlements, etc.) and edges are
defined by the features or paths that connect them. In some cases, edges represent easily identifiable for-
mal features like roads and trails (Isaksen, 2007, 2008; Jenkins, 2001; Pailes, 2014; Menze & Ur, 2012) or
riverine paths (Peregrine, 1991) where the edges themselves have clear spatial information. In other cases
the connections between pairs of nodes may be defined using models of the costs of traversal or proximity
across the physiographic environment (Bevan & Wilson, 2013; Hill, Peeples, Huntley, & Carmack, 2015;
Mackie, 2001; Verhagen, Brughmans, Nuninger, & Bertoncello, 2013; White & Barber, 2012), or seas/
oceans (Broodbank, 2000; Evans, 2016; Hage & Harary, 1991, 1996; Irwin, 1978; Knappett, Evans, & Riv-
ers, 2008; Terrell, 1977) that are derived from analyses using GIS, spatially explicit models or related tools.
Networks based on either formal features or models of traversal have been used to explore a broad range
of social processes from the relationship between node position and prominence to the rise of expansive
trade systems, pilgrimages, and settlement hierarchies.
Visibility networks
Another common topic in archaeological spatial network research is the study of visibility, usually repre-
sented as lines-­of-­sight: the ability for an observer to observe an object of interest within a natural or built-­up
environment or to be observed (see Brughmans & Brandes, 2017, for a recent overview). Visibility networks
are typically defined based on line-­of-­sight data, often derived through GIS analyses (see Gillings & Wheat-
ley, this volume). In line-­of-­sight networks the set of nodes represents the observation locations and the edges
represent lines-­of-­sight. A pair of nodes is connected by an edge if a line-­of-­sight starting at the eye level of
an observer at one observation point can reach the second observation point, i.e. if the line-­of-­sight is not
blocked by a natural or cultural feature. In some studies, this point-­to-­point model of visibility is expanded
to landscape scale assessments of viewsheds where the total cumulative area viewable from a given viewpoint
is defined and networks are created based on areas with overlapping viewsheds or when certain key features
are mutually viewable (see O’Sullivan & Turner, 2001; Brughmans & Brandes, 2017; Bernardini & Peeples,
2015). The method is most commonly used to study hypothesised visual signalling networks, communities
sharing visual landmarks and to explore processes of site positioning and the possible expression of power
relationships through visual control (Bernardini & Peeples, 2015; Brughmans, Keay, & Earle, 2014, 2015,
Brughmans, de Waal, Hofman, and Brandes, 2017; Brughmans & Brandes, 2017; De Montis & Caschili,
2012; Earley-­Spadoni, 2015; Fraser, 1980, 1983; Ruestes Bitrià, 2008; Shemming & Briggs, 2014; Swanson,
2003; Tilley, 1994, pp. 156–166). Analyses of visibility network data frequently involve assessments of the
relative importance of different nodes for sending or receiving information or resources across that network,
or evaluations of the likelihood that a given configuration suggests a concern for signaling, defense, or other
factors among the people who built those features.
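
To make the line-of-sight rule concrete, the following minimal sketch derives such a network in Python with NumPy and NetworkX, assuming a digital elevation model held as a raster array and observation points given as raster cells; the eye height, sampling density and toy landscape are illustrative assumptions rather than values taken from the studies cited above.

```python
import itertools
import numpy as np
import networkx as nx

def line_of_sight(dem, a, b, eye_height=1.6, n_samples=200):
    """True if an observer at cell a can see an observer at cell b.

    A straight sightline is interpolated between the two eye positions and
    compared against the terrain sampled along the same transect.
    """
    z_a = dem[a] + eye_height
    z_b = dem[b] + eye_height
    t = np.linspace(0.0, 1.0, n_samples)[1:-1]           # skip the endpoints
    rows = np.round(a[0] + t * (b[0] - a[0])).astype(int)
    cols = np.round(a[1] + t * (b[1] - a[1])).astype(int)
    terrain = dem[rows, cols]
    sightline = z_a + t * (z_b - z_a)
    return bool(np.all(terrain <= sightline))

def visibility_network(dem, points):
    """Build an undirected network with one edge per unblocked line-of-sight."""
    G = nx.Graph()
    G.add_nodes_from(range(len(points)))
    for (i, a), (j, b) in itertools.combinations(enumerate(points), 2):
        if line_of_sight(dem, tuple(a), tuple(b)):
            G.add_edge(i, j)
    return G

# Example on a toy 50 x 50 "landscape" with three hypothetical viewpoints:
dem = np.zeros((50, 50))
dem[20:30, 20:30] = 100.0                                # a blocking hill
G = visibility_network(dem, [(5, 5), (45, 45), (5, 45)])
print(G.edges())   # the pair (0, 1) is blocked by the hill; the others are not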

Access analyses
A somewhat different use for network methods in spatial data draws on a body of work referred to as
space syntax (Hillier & Hanson, 1984; Hillier, 1996; for a detailed discussion see Thaler, this volume).
The access analysis approach in space syntax is particularly popular in archaeological research. It uses
network graphs and related visualizations to explore the nature of physical or sometimes visible access
within features, buildings, or larger landscapes. The basic idea behind the approach is that we can think of
discrete spaces being “reachable” from one another through tree-­like networks that let us both examine
the overall structure of mutual reachability among spaces and also assess the relative depth (the number
of edges crossed) from one space to another. In this way individual spaces (however they are defined) are
characterized as nodes, and edges are drawn between pairs of nodes that are reachable (i.e. that share a
doorway or are mutually visible). A number of studies have employed space syntax graphs to argue that
tracking or comparing the cultural logics of spatial organization can provide insights into a range of issues
including social organization, public versus private spaces, the distribution of urban services, and social
stratification (see Branting, 2007; Brusasco, 2004; Cutting, 2003; Fairclough, 1992; Ferguson, 1996; Foster,
1989; Grahame, 1997; Wernke, 2012). Analyses of space syntax graphs are often limited to qualitative
assessments, due in part to concerns over incomplete data in archaeological contexts (see Cutting, 2003)
but archaeologists are also starting to take advantage of quantitative tools for assessing the topology of
access networks (e.g. Wernke, 2012; Wernke, Kohut, & Traslaviña, 2017).
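
As an illustration of the basic depth calculation, the sketch below represents a hypothetical four-room building (the room names are invented) as an access graph in Python with NetworkX, and computes each space's depth from the exterior as a shortest path length.

```python
import networkx as nx

# Nodes are discrete spaces, edges are doorways connecting them.
access = nx.Graph([
    ("exterior", "hall"),
    ("hall", "courtyard"),
    ("courtyard", "kitchen"),
    ("courtyard", "shrine"),
])

# Depth = number of edges crossed from the exterior to reach each space.
depth = nx.single_source_shortest_path_length(access, "exterior")
print(depth)   # e.g. the shrine lies three edges "deep" from the exterior

# Mean depth of the building, excluding the exterior node itself.
mean_depth = sum(d for s, d in depth.items() if s != "exterior") / (len(depth) - 1)
```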

Spatial material culture networks


The final common approach to spatial networks in archaeology involves analyses of network data gener-
ated through material cultural data which are assessed in relation to the spatial arrangement of nodes and
edges. The methods used to abstract networks from archaeological material cultural data are quite diverse
but often involve the use of geochemically sourced materials or regions (e.g. Golitko, Meierhoff, Fein-
man, & Williams, 2012), or the shared presence or similarities in material cultural assemblages to define
edges among settlements or regions (e.g. Mills et al., 2013). Although the presence and/or weights of
edges in such material cultural networks are typically defined using aspatial data (such as artefact type
frequencies) the samples from which these data are drawn are often associated with spatial locations that
allow for a consideration of the propinquity of social and spatial relations. In many cases, geographic
proximity or other spatial information is used to generate a null model of geographic connections
expected under certain constraints which is then compared to the network based on material cultural
data. For example, Mills and colleagues (2013) created a two-­mode network of obsidian distribution in
the late Prehispanic Southwest and compared the obsidian network to geographic expectations based on
the costs of travel across the landscape, to identify times and places where the material networks deviated
from the geographic expectation. Most material cultural networks explored using archaeological data
have a spatial component and such direct comparisons between material and geographic distance are
becoming increasingly common (e.g. Gjesfjeld, 2015; Gjesfjeld & Phillips, 2013; Hill et al., 2015).

Method
In this section we will introduce some key concepts in spatial network research, commonly applied ana-
lytical techniques and a range of spatial network models.

Building spatial networks


The range of archaeological applications of spatial networks reviewed above reveals that spatial network
data can either be generated through models, such as those introduced below, or derived from observa-
tions. Regardless of their source, at least three things are needed to build a spatial network dataset: a set
of nodes, a set of edges connecting these nodes, and information about their spatial embeddedness. The
latter could take the form of spatial coordinates of nodes’ point locations or of edges’ starting and ending
locations. Such information is commonly included in attributes attached to the nodes and edges, along
with other additional information about nodes and edges. The most common network data formats are
shown in Figure 15.1, and network data represented in these formats can be imported into most network
science software. The adjacency matrix (Figure 15.1(a)) represents the set of nodes as the column and row
headers and includes a value in the cell referring to a pair of nodes that have an edge. The node-­link-­
diagram (Figure 15.1(b)) represents nodes as points and edges as lines between them, and is a particularly
appropriate data format for emphasising the presence of edges, unlike the adjacency matrix, which is a more
powerful representation of the absence of edges. The edge list (Figure 15.1(c)) consists of three columns
listing the pair of nodes that are connected by an edge and the value of their connection.
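
By way of illustration, the following Python sketch (using the NetworkX library) builds the hypothetical transport network of Figure 15.1 from an edge list with spatial and transport-type attributes, and recovers its adjacency matrix; the coordinates are approximate and the pairing of individual lengths and transport types with specific edges is assumed for illustration.

```python
import networkx as nx

# Edge list with attributes: length in km and transport type.
edge_list = [
    ("Cosa", "Portus", {"km": 132, "type": "sea"}),
    ("Cosa", "Rome", {"km": 126, "type": "road"}),
    ("Rome", "Portus", {"km": 29, "type": "river"}),
    ("Puteoli", "Portus", {"km": 214, "type": "sea"}),
    ("Portus", "Carthage", {"km": 603, "type": "sea"}),
]
# Approximate longitude/latitude pairs providing the spatial embedding.
coords = {
    "Rome": (12.48, 41.89), "Portus": (12.26, 41.78), "Cosa": (11.29, 42.41),
    "Puteoli": (14.12, 40.82), "Carthage": (10.32, 36.85),
}

G = nx.Graph()
G.add_edges_from(edge_list)
nx.set_node_attributes(G, coords, name="lonlat")

# The same network as an adjacency matrix with km values in the cells
# (compare Figure 15.1(a)); zeros mark absent edges.
A = nx.to_numpy_array(G, weight="km")
print(list(G.nodes()))
print(A)
```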

Planar and non-­planar networks


A planar network is a network whose edges can be drawn so that they do not cross but instead always end in nodes (Figure 15.2).
A key feature of many spatial networks is planarity, which is often enforced precisely because nodes
and sometimes edges are spatially embedded. Planar spatial networks have traditionally received more
attention in network science than non-­planar spatial networks, and many network analysis methods and
models have been purposely developed to study planar networks, some of which are introduced below
(Barthelemy, 2011).
Figure 15.2 A planar network representing transport routes plotted geographically (a) and topologically (b).
A non-­planar social network representing social contacts between communities plotted geographically (c) and
topologically (d). Note the crossing edges in the non-­planar network. A colour version of this figure can be
found in the plates section.
Source: Background © Openstreetmap

Local and global spatial network analysis measures


A range of statistical network methods can be used to explore the structure of spatial networks. Spatial
network measures usually take the form of aspatial network science techniques modified to include a
physical distance variable reflecting edge distance. Many of these network science measures, when applied
to spatial networks, reveal particular properties of spatial networks such as the generally limited density
of planar networks (discussed further below). In this chapter we will limit ourselves to listing the most
common network science analytical measures with spatial variants, some of which will be applied in
the case-­study below, but see Barthelemy (2011) for an exhaustive overview of spatial network analysis
measures and properties.
Network analysis measures are commonly divided into local measures that reveal structural properties
of nodes or small sets of nodes, and global measures that reveal structural properties of the network as a
whole. The most common procedure for creating spatial variants of all these measures is to consider the
physical distance of edges, or any other spatially derived attributes of edges such as transport time or effort
of moving between two places, as a repelling “weight” in the algorithm: the higher the physical distance
between two nodes, the lower the score of the measure.
Local measures include degree, paths, centralities, and a node clustering coefficient. A node’s degree
refers to the number of edges it has, and spatial degree weights these edges by their physical length, for
example as the summed distance of all of a node’s edges. A path is a sequence of connected node pairs from one node to another in the network.
The shortest path from any one node i to any other node j is the path requiring the minimum number of
edges to be traversed in order to reach j from i. A spatial variant of the shortest path
includes the summed distance of all edges on the path as a weight.

Figure 15.3 Examples of three different node centrality measures on the hypothetical Mediterranean transport
network: (a) nodes scaled by degree centrality, (b) nodes scaled by betweenness centrality with path segment
lengths shown, (c) nodes scaled by closeness centrality with path segment lengths shown.

Centrality refers to a very large number
of network measures that each reflect a node’s importance in the network according to different structural
features, the most popular of which are degree, closeness and betweenness. A node’s closeness centrality
refers to the summed network or spatial distance from this node to every other node along shortest paths: the lower this total distance, the more central the node.
A node’s betweenness centrality refers to the number of all shortest paths between all node pairs in the
network that this node is positioned on. A node’s clustering coefficient is the proportion of all edges that
could exist between its direct network neighbours that actually do exist, i.e. the density of the node’s direct
neighbourhood in the network (see O’Sullivan & Turner (2001) for a spatial variant applied to total viewsheds).
Global measures include average degree, degree distribution, density, average shortest path length,
diameter, and network clustering coefficient. The network’s average degree is the average of all nodes’
degree scores. A network’s node degree scores are most commonly explored as a distribution (see the case
study in this chapter for examples). The density is the existing proportion of all edges that could exist
in a network. Spatial networks where the edges are spatially embedded such as transport systems tend
to have very low densities, whereas spatial networks where only the nodes are explicitly embedded such
as artefact similarity networks typically have much higher densities. The average shortest path length is
the average of all shortest path lengths between all node pairs in the network. The network diameter is
the longest shortest path between any pair of nodes in the network. The network clustering coefficient
is the average of all nodes’ clustering coefficient scores.
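
All of these measures are implemented in standard network science software. The sketch below computes them on the toy transport network of Figure 15.1 in Python with NetworkX, where passing the km edge attribute as a weight or distance yields the spatial variants (the edge lengths, as before, are illustrative assumptions).

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("Cosa", "Portus", 132), ("Cosa", "Rome", 126), ("Rome", "Portus", 29),
    ("Puteoli", "Portus", 214), ("Portus", "Carthage", 603)], weight="km")

# Local measures: passing "km" as a repelling weight/distance gives the
# spatial variants; omitting it gives the purely topological ones.
degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G, weight="km")
closeness = nx.closeness_centrality(G, distance="km")
clustering = nx.clustering(G)

# Global measures.
density = nx.density(G)
average_degree = sum(degree.values()) / G.number_of_nodes()
average_shortest_path = nx.average_shortest_path_length(G, weight="km")
lengths = dict(nx.shortest_path_length(G, weight="km"))
diameter = max(max(d.values()) for d in lengths.values())   # longest shortest path
network_clustering = nx.average_clustering(G)
```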

Spatial network models


A body of techniques has been developed, mainly in computational geometry, to represent core structures
and patterns of a set of spatially embedded nodes. These models are used in archaeological research as
representations of archaeological theories of the interactions or interaction opportunities between the
entities under study. Only a set of nodes and their spatial location are required to apply them, and the
fundamental patterning they derive from this information can be compared to observed network pat-
terning to understand how far removed the empirical network structure is from ideal theorized network
structures. Here we will limit ourselves to introducing some of the most fundamental spatial network
models, but additional and more elaborate models can be found in computational geometry and physics
handbooks and reviews (Chorley & Haggett, 1967; Barthelemy, 2011). More complex models that have
received a lot of attention in archaeology but fall outside the scope of the current overview are models
that evaluate the cost of interaction between node pairs to propose interaction probabilities and derive
hierarchical relationships between nodes. These include gravity models and their modification by Rihll
and Wilson (1987) for the study of the emergence of Greek city-­states (see Bevan & Wilson (2013) for a
further applied example), as well as the ARIADNE model which has been used for the study of interac-
tions between island communities in the Middle Bronze Age Aegean (Evans & Rivers, 2017; Knappett
et al., 2008).

Relative neighbourhood networks, beta skeletons and Gabriel graphs


A pair of nodes are relative neighbours and are connected by an edge if they are at least as close to each
other as they are to any other point (Toussaint, 1980). It can be derived for a pair of nodes Ni and Nj in a
set of nodes N by considering a circle around each with a radius equal to the distance between Ni and Nj.
If the almond-­shaped intersection of the two circles does not include any other points then Ni and Nj are
relative neighbours (Figure 15.4(a–b)). The relative neighbourhood network is a subset of the Delaunay
triangulation and contains the minimum spanning tree (introduced below). A Gabriel graph is derived
a) b)
C

A B A B

c) C d)

A B A B

Figure 15.4 Examples showing relative and Gabriel graph neighborhood definitions: (a) A is a relative neigh-
bor of B because there are no nodes in the shaded overlap between the circles around A and B, (b) A and B
are not relative neighbors because C falls within the shaded overlap. (c) A and B are Gabriel neighbors because
there are no nodes within the circle with a diameter AB, (d) A and B are not Gabriel neighbors because C falls
within the circle with a diameter AB.

when the same principle is applied to a circular (rather than almond-­shaped) region between every pair
of nodes: if no other nodes lie within the circular region with diameter d (i,j) between Ni and Nj then
Ni and Nj are connected in the Gabriel graph (Figure 15.4(c–d)). The concept of relative proximity can
be controlled and varied in an interesting way using the concept of beta skeletons (Kirkpatrick & Radke,
1985). Rather than fixing the diameter of the circle as in the Gabriel graph, the diameter can be varied
using a parameter β. Varying the value of β leads to interesting alternative network structures that are
denser with lower values of β, sparser with higher values of β, and the beta skeleton equals the Gabriel
graph when β = 1 (i.e. when the diameter of the circles equals d(i,j)). These models create planar net-
works and have been applied in archaeology to study site and artefact distributions as well as to represent
the theoretical flow of ceramics between settlements (Brughmans, 2010; Jiménez-­Badillo, 2012).
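
The circle and lune tests described above translate directly into code. The following brute-force sketch, adequate for the modest node counts typical of archaeological applications, derives both the Gabriel graph and the relative neighbourhood network from a set of 2D point coordinates.

```python
import itertools
import numpy as np
import networkx as nx
from scipy.spatial.distance import pdist, squareform

def gabriel_and_rng(points):
    """Return the Gabriel graph and relative neighbourhood network of a point set."""
    d = squareform(pdist(np.asarray(points, dtype=float)))  # pairwise distances
    n = len(points)
    gabriel, rng = nx.Graph(), nx.Graph()
    gabriel.add_nodes_from(range(n))
    rng.add_nodes_from(range(n))
    for i, j in itertools.combinations(range(n), 2):
        others = [k for k in range(n) if k not in (i, j)]
        # Gabriel test: no third point inside the circle with diameter ij.
        if all(d[i, k] ** 2 + d[j, k] ** 2 > d[i, j] ** 2 for k in others):
            gabriel.add_edge(i, j, km=float(d[i, j]))
        # Relative neighbour test: no third point closer to both i and j
        # than they are to each other (the almond-shaped lune is empty).
        if all(max(d[i, k], d[j, k]) > d[i, j] for k in others):
            rng.add_edge(i, j, km=float(d[i, j]))
    return gabriel, rng
```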

Minimum spanning tree


In a set of nodes in the Euclidean plane, edges are created between pairs of nodes to form a tree where
each node can be reached by each other node, such that the sum of the Euclidean edge lengths is less
than the sum for any other spanning tree. The model has been applied by Per Hage and Frank Harary
(1996) to study a diverse range of phenomena in Pacific archaeology: kinship networks and descent, the
evolution and devolution of social and linguistic networks, and classification systems. They also used a
model dynamically generating a minimum spanning tree edge by edge as a theoretical representation of
the growth of a past social network. Herzog (2013) also uses minimum spanning trees as one representa-
tion of least-­cost paths between places.
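
A Euclidean minimum spanning tree can be derived in a few lines by building the complete distance-weighted graph over a set of points and extracting the spanning tree with the smallest summed edge length; the site names and coordinates below are invented for illustration.

```python
import itertools
import numpy as np
import networkx as nx

sites = {"A": (0, 0), "B": (2, 1), "C": (3, 4), "D": (6, 2)}  # invented coordinates

# Complete graph weighted by Euclidean distance, then its minimum spanning tree.
G = nx.Graph()
for (u, pu), (v, pv) in itertools.combinations(sites.items(), 2):
    G.add_edge(u, v, km=float(np.hypot(pu[0] - pv[0], pu[1] - pv[1])))
mst = nx.minimum_spanning_tree(G, weight="km")
print(sorted(mst.edges(data="km")))
```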
Delaunay triangulation
A triangulation network aims to create as many triangles as possible without allowing for any crossing
edges and therefore creates planar networks. The Delaunay triangulation specifically is derived from the
Voronoi diagram or Thiessen polygons: a pair of nodes are connected by an edge if and only if their
corresponding tiles in a Voronoi diagram (or Thiessen polygons) share a side. The model has seen wide-
spread application for representing archaeological theories, but mainly for the study of transport systems.
To name just a few, Fulminante (2012) used Delaunay triangulation as a theoretical model for a road and
river transport system between Iron Age towns in Central Italy (Latium Vetus), and Herzog (2013) used
it as a representation of least-­cost path networks. Evans and Rivers (2017) apply Delaunay triangulation
for exploring the rise of Greek city-­states.
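
In practice the triangulation is usually obtained directly from a computational geometry library rather than via Thiessen polygons. The sketch below uses SciPy and NetworkX on randomly generated points standing in for site locations; every side of every Delaunay triangle becomes a network edge, yielding a planar network by construction.

```python
import itertools
import numpy as np
import networkx as nx
from scipy.spatial import Delaunay

pts = np.random.default_rng(0).uniform(0, 100, size=(20, 2))  # 20 random "sites"
tri = Delaunay(pts)

G = nx.Graph()
G.add_nodes_from(range(len(pts)))
for simplex in tri.simplices:                  # each simplex is a triangle (i, j, k)
    for i, j in itertools.combinations(simplex, 2):
        G.add_edge(int(i), int(j))             # triangle sides become edges
```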

K-­nearest neighbours and maximum distance


In the previously discussed models nodes were connected to their nearest neighbours relative to the location
of all other nodes. However, a simpler way of creating nearest neighbour networks is to connect a node to
the closest other nodes regardless of the location of all other nodes. This is the approach taken in K-­nearest
neighbour networks, where each node is connected to the K other nodes closest to it. The method is
sometimes called Proximal Point Analysis (Terrell, 1977). Another alternative to relative neighbourhood
networks is offered by maximum distance networks: a node pair Ni and Nj is connected if the distance between
them d(i,j) is lower than or equal to a threshold distance value dmax. In archaeological applications of these
two models the edges are usually considered to represent the most likely channels for the flow of material
or immaterial resources between individuals, settlements or island communities (Broodbank, 2000; Collar,
2013; Terrell, 1977). An applied example of these two models is given in the case study.
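
Both models reduce to simple operations on a pairwise distance matrix, as in the following sketch (the choice of K and of dmax is, as in any application, an analytical decision).

```python
import numpy as np
import networkx as nx
from scipy.spatial.distance import pdist, squareform

def knn_network(points, k=3):
    """Connect each node to its k nearest neighbours (treated as undirected)."""
    d = squareform(pdist(np.asarray(points, dtype=float)))
    G = nx.Graph()
    G.add_nodes_from(range(len(points)))
    for i in range(len(points)):
        for j in np.argsort(d[i])[1:k + 1]:    # index 0 is the node itself
            G.add_edge(i, int(j), km=float(d[i, j]))
    return G

def max_distance_network(points, dmax):
    """Connect every node pair whose separation is at most dmax."""
    d = squareform(pdist(np.asarray(points, dtype=float)))
    n = len(points)
    G = nx.Graph()
    G.add_nodes_from(range(n))
    G.add_edges_from((i, j, {"km": float(d[i, j])})
                     for i in range(n) for j in range(i + 1, n) if d[i, j] <= dmax)
    return G
```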

Case study
We will illustrate some of the network measures and models introduced in this chapter through an
exploration of the structure of the Roman transport system. By applying a wide range of spatial network
models and methods we will illustrate how interesting insights can be gained by taking a topological as
well as spatial look at a past phenomenon. The following research questions will guide our exploration
of the transport system:

• In what regions is the transport system particularly dense and in what regions is it particularly sparse?
• How important is each urban settlement as an intermediary in the flow of information or goods
between all other settlements?
• How did the Roman transport system structure flows of supplies to the capital of Rome, and which
regions and supplying towns were better positioned in the system to supply Rome?
• Does the Roman transport system reveal a particular spatial structure: nearest-­neighbour, relative-­
neighbour or maximum distance?

An abstract representation of the Roman transport system will be used here: the Orbis geospatial network
model of the Roman world (Scheidel, 2015; Meeks, Scheidel, Weiland, & Arcenas, 2014). Orbis offers a
static and hypothetical representation of the Roman transport system with limited detail. Therefore, our
present analysis merely aims to explore our research questions within the context of the coarse-­g rained
structure of the Roman transport system in the second century AD as hypothesised by the Orbis team.
Data
We decided to use the Orbis dataset because it is well-­studied and well-­known among Roman archaeol-
ogy scholars, it is open access and reusable for research purposes (Meeks et al., 2014), and it provides the
only functional network dataset covering the entire Roman Empire at its largest extent. However, a key
limitation of Orbis is that it is not as detailed as our current knowledge of Roman settlements and routes
allows, precisely because it aims to represent the broad Empire-­wide structure of the Roman transport
system in a comparable way. Moreover, the selection of nodes and edges, as well as the distance assigned to
edges, reflect decisions by its creators and should be submitted to sensitivity analyses (which is not within
the scope of this chapter). Finally, Orbis represents a static picture of what the Roman transport system
might have looked like in the second century AD, and does not offer the ability to explore how this
system changed through time. The longitude and latitude of all nodes were cross-checked with the Ple-
iades gazetteer of ancient placenames (Bagnall et al., 2018) and corrected where necessary. The resulting
network dataset includes a set of 678 nodes, 570 of which represent urban settlements and the remainder
cultural features such as crossroads or natural features such as capes. The node attributes include the settle-
ment name and latitude longitude coordinates. These nodes are connected by a set of 2208 directed links
representing the ability to travel between a node pair in a particular direction. Edge attributes include the
type of transport link (road, river, sea) and the distance in kilometres.
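
A node/edge dataset of this kind can be read into network analysis software in a few lines. In the Python sketch below the file names and column names (nodes.csv with id/label/lon/lat; edges.csv with source/target/km/type) are assumptions made for illustration and do not necessarily match the published Orbis data format.

```python
import pandas as pd
import networkx as nx

nodes = pd.read_csv("nodes.csv")   # assumed columns: id, label, lon, lat
edges = pd.read_csv("edges.csv")   # assumed columns: source, target, km, type

# A directed graph, since travel between a node pair runs in a particular
# direction along each of the 2208 links.
G = nx.from_pandas_edgelist(edges, source="source", target="target",
                            edge_attr=["km", "type"], create_using=nx.DiGraph)
for row in nodes.itertuples():
    G.add_node(row.id, label=row.label, lon=row.lon, lat=row.lat)
```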

Spatial network visualisation


An initial visual exploration of this network can be performed to identify key structural features, using
both geographical and topological layout algorithms. A geographical visualisation places the nodes in
their correct geographical positions, which allows for an intuitive and recognisable exploration of the
regional differences in node and edge distribution (Figure 15.5(a)). For example, we can easily identify
the difference between maritime and terrestrial routes, the geographical extent of the Roman Empire, the
Rhine and Danube Rivers making up the edges of the system at the northern borders of the empire, and
the strong difference in node and edge density between Italy and the rest of the system. However, this fig-
ure has a high degree of node and edge overlap, making the structure of the network particularly difficult
to interpret. The topological visualisation shown in Figure 15.5(b) aims to avoid such overlap, revealing
at a glance a number of interesting structural features that allow us to provide an informal answer to our
first research question: the Aegean region is particularly dense; another dense cluster at the centre of the
network consists of present-­day Italy, France and Spain; the river Nile creates a tree-­like pattern at the
periphery of the network; provinces along the border of the empire have sparser transport networks.
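
Both kinds of layout can be generated with standard tools. The sketch below, assuming the directed graph G with lon/lat node attributes loaded in the previous sketch, draws the network once with geographical coordinates as node positions and once with a force-directed (topological) layout.

```python
import matplotlib.pyplot as plt
import networkx as nx

pos_geo = {n: (d["lon"], d["lat"]) for n, d in G.nodes(data=True)}
pos_topo = nx.spring_layout(G, seed=42)          # force-directed layout

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
for ax, pos, title in zip(axes, [pos_geo, pos_topo],
                          ["geographical", "topological"]):
    nx.draw_networkx(G, pos=pos, ax=ax, node_size=20, with_labels=False)
    ax.set_title(title)
plt.show()
```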

Distance weighted betweenness centrality


Betweenness centrality allows us to answer our second research question because it measures how impor-
tant a node is as an intermediary in the flow of information or goods between all other nodes, and it is
therefore a particularly appropriate measure to study transport systems. It is calculated by counting how
often each node is positioned on the shortest paths between all node pairs. Applying this measure to the
Orbis network gives the results shown in Table 15.1 and Figure 15.5(c, d). The topological visualisation
(Figure 15.5(d)) reveals that nodes at the centre of the network, and in particular those bridging dense
clusters score very high whilst nodes at the periphery score very low. The geographical visualisation
(Figure 15.5(c)) further reveals that these highly scoring nodes are a chain of port sites connecting Egypt
with Britain circling around the Iberian Peninsula.
Figure 15.5 Network representation of the Orbis network: geographical layout (a, c) and topological layout
(b, d). Node size and colour represent betweenness centrality weighted by physical distance in (a) and (b), and
they represent unweighted betweenness centrality in (c) and (d): the larger and darker blue the node, the more
important it is as an intermediary for the flow of resources in the network. By comparing (a, b) with (c, d),
note the strong differences in which settlements are considered central depending on whether physical
distance is taken into account (a, b) or not (c, d). Edge colours represent edge type: red = sea, green = river,
grey = road. A colour version of this figure can be found in the plates section.
Source: Background © Openstreetmap

However, this unweighted betweenness centrality measure completely ignores physical distance and
considers the traversal of each edge equally: all that is considered is the number of hops over the network
to get from one node to the other. To make this network analysis more representative of the physical
reality of the system we can weigh the edges according to their physical distance, where a shortest path is
now defined as the path between a pair of nodes with the lowest summed distance. Results of the distance
Table 15.1 Top 20 highest ranking towns according to the topological betweenness centrality measure and the
distance weighted betweenness centrality measure. Towns highly ranked according to both measures are highlighted.

Rank Betweenness Distance weighted betweenness

1 Messana Puteoli
2 Alexandria Delos
3 Rhodos Hispalis
4 Gades Roma
5 Apollonia-­Sozousa Palantia
6 Olisipo Pisae
7 Sallentinum Pr. Ascalon
8 Flavium Brigantium Aquileia
9 Acroceraunia Pr. Rhodos
10 Lilybaeum Isca
11 Civitas Namnetum Apollonia-­Sozousa
12 Portus Blendium Lydda
13 Paphos Iuliobona
14 Ostia/Portus Placentia
15 Carthago Constantinopolis
16 Corcyra Histria
17 Aquileia Ephesus
18 Caralis Mothis
19 Sigeion Patara
20 Constantinopolis Lancia

weighted betweenness centrality measure are shown in Table 15.1 and Figure 15.5(a, b). Note how differ-
ent the top scoring towns are (Table 15.1): only four towns occur in both measures’ top 20 lists. The high
scoring towns are still mostly ports but are now more equally spread throughout the system, often with
one or a few high scoring towns per province (Figure 15.5a). These high scoring towns can be inter-
preted as the most important intermediaries for the flow of goods and information through this abstract
representation of the Roman transport system if we assume that the shortest possible path between towns
was always preferred. The same method can of course be applied to represent other assumptions such as
the shortest path in terms of time or financial cost.
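
The same sketch can be adapted to the distance weighted variant; again this assumes networkx, and the kilometre values attached to the edges are illustrative rather than taken from Orbis.

```python
# Sketch: distance weighted betweenness centrality. Edge distances (km)
# are illustrative stand-ins for the Orbis edge costs.
import networkx as nx

G = nx.Graph()
G.add_edge("Roma", "Ostia/Portus", distance=25)
G.add_edge("Ostia/Portus", "Carthago", distance=600)
G.add_edge("Carthago", "Lilybaeum", distance=220)
G.add_edge("Lilybaeum", "Messana", distance=230)
G.add_edge("Messana", "Roma", distance=700)

# With weight="distance", a shortest path is the path with the lowest
# summed distance rather than the fewest hops; storing travel time or
# monetary cost on the edges instead would implement the other
# assumptions mentioned above.
bc_weighted = nx.betweenness_centrality(G, weight="distance")
print(bc_weighted)
```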

Distance from Rome


We now turn to our third research question centred on Rome: the capital of the Roman Empire and a
mega city with more than one million inhabitants. The city needed a constant supply of all types of goods
and was the largest market for staple goods. Indeed, much of the Roman economy was structured by the
need to supply the huge population of the city of Rome. One approach to understanding this structur-
ing is to explore how the Roman transport system could have structured flows of supplies to Rome, and
which regions and supplying towns were better positioned on this network to supply Rome. We already
know that “All roads lead to Rome”, but from some towns the roads take you there much faster than from other towns.

Figure 15.6 Geographical network representation of the Orbis network: geographical layout (a) and topologi-
cal layout (b). Node size and colour represent increasing physical distance over the network away from Rome:
the larger and darker the node, the further away this settlement is from Rome following the routes of the
transport system. Note the fall-­off of the results with distance away from Rome structured by the transport
routes rather than as-­the-­crow-­flies distance. Edge colours represent edge type: red = sea, green = river, grey =
road. A colour version of this figure can be found in the plates section.
Source: Background © OpenStreetMap

These differences can be identified using spatial network methods, by calculating the shortest paths from all towns to Rome according to the sum of their physical distance.
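
Computationally this is a single-source shortest path problem; the following is a hedged sketch using Dijkstra's algorithm in networkx, with illustrative distances rather than the Orbis values.

```python
# Sketch: summed physical distance of the shortest network path from
# every town to Rome, via Dijkstra's algorithm. Distances (km) are
# illustrative, not Orbis values.
import networkx as nx

G = nx.Graph()
G.add_edge("Roma", "Ostia/Portus", distance=25)
G.add_edge("Ostia/Portus", "Carthago", distance=600)
G.add_edge("Carthago", "Lilybaeum", distance=220)
G.add_edge("Lilybaeum", "Messana", distance=230)

dist_to_rome = nx.single_source_dijkstra_path_length(G, "Roma", weight="distance")
for town, km in sorted(dist_to_rome.items(), key=lambda kv: kv[1]):
    print(f"{town}: {km} km from Roma over the network")
```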
The results of this analysis (Figure 15.6) reveal, of course, a fall-­off with distance away from Rome.
But note that this does not merely represent a fall-­off of towns’ scores with as-­the-­crow-­flies distance
from Rome, as could be easily calculated in GIS, but rather with their distance to Rome over the short-
est path of the network. It offers a representation of physical distance morphed and structured by the
Roman transport system. We can observe differences between the outlying regions, like Britain being
closer than much of Syria and Egypt. But a more interesting result is the proximity of areas that became
the earliest overseas provinces: the proximity of Tunisian towns around Carthage, Sardinia, as well as
the relatively short distances to towns in Southern France and Western Spain as compared to much of
Greece, for example. These results also offer an appropriate visualisation of what we know about the
well-­documented large-­scale and possibly partly state-­organised supplies of foodstuffs to Rome from
Tunisia especially from the second century AD onwards, and it highlights the huge organisational efforts
that must have gone into the long distance and equally well-­documented transport of foodstuffs from
Southern Spain and, in particular, Egypt.

Network models
The network models discussed earlier in this chapter can be applied to the Orbis settlement distribution
pattern to answer our fourth research question. What spatial structuring does the settlement distribution
included in Orbis reveal? To what extent does the Orbis network align with or deviate from this struc-
turing? Does the Roman transport system reveal a nearest-­neighbour, relative-­neighbour or maximum
distance structure? We will use global network measures to compare how similar the structures of the simulated network models are to that of the Orbis network. The models presented in this section were
implemented in NetLogo, a very accessible programming language with an intuitive user-­interface and
comprehensive network science and GIS libraries (Wilensky, 1999).
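
For readers who prefer a general-purpose language, the global measures used for the comparison below (Table 15.2) can equally be reproduced in Python with networkx; the random graph in this sketch is merely a stand-in of roughly the Orbis network's size.

```python
# Sketch: the global network measures of Table 15.2, computed with
# networkx on a random stand-in graph of roughly Orbis's size
# (~570 nodes, 805 edges).
import networkx as nx

G = nx.gnm_random_graph(570, 805, seed=42)

print("edges:", G.number_of_edges())
print("average degree:", sum(d for _, d in G.degree()) / G.number_of_nodes())
print("density:", nx.density(G))
print("average clustering coefficient:", nx.average_clustering(G))
```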

K-­nearest-­neighbour networks
This model is very sensitive to the proximity of sets of nodes, and reveals clusters of densely settled areas in
the Orbis set of towns (Figure 15.7; Table 15.2). The nearest-­neighbour networks with K equals 1 and 2
are very disconnected, although for K equals 2 the global network measures are very similar to the Orbis
network but more clustered (Table 15.2). The network becomes connected with 4-­nearest-­neighbours
and the 10-­nearest-­neighbours network emphasises the clusters in areas where the settlement pat-
tern is densest, but both these networks are much denser and more clustered than the Orbis network
(Table 15.2). The degree distributions for these K-­nearest-­neighbour networks show very little variance.
The lower limit always equals K, and just a few towns have a higher degree than most other towns, a dif-
ference that increases as K increases. In contrast, the degree distribution of the real Orbis network is very
skewed (Figure 15.5): the large majority of towns are connected to less than eight other towns, whereas
very few towns have a much higher degree. The towns with the highest degree are important port towns
or large population centres: Delos, Rhodos, Carthago, Ostia/Portus, Lilybaeum, Paphos, Messana, Rome
(the first two in this list have the highest degree, but this is partly caused by the very high density of
nodes in the Aegean area). The K-­nearest-­neighbour networks clearly do not capture this feature of the
Orbis network. The maritime routes in the Orbis dataset which cross long distances through the Atlantic
Ocean and the Mediterranean and Black Sea, are also not recreated by this model. However, aspects of the
structure of the terrestrial roads and the dense connections between Aegean islands, as well as the coastal
and riverine connections, are better captured by this model where K equals 4.
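
The construction itself is simple; below is a minimal sketch with hypothetical site coordinates (the chapter's own implementation was in NetLogo).

```python
# Sketch: a K-nearest-neighbour network built from point coordinates.
# Site labels and coordinates are hypothetical.
import math
import networkx as nx

sites = {"A": (0, 0), "B": (1, 0), "C": (5, 1), "D": (6, 0), "E": (3, 4)}
K = 2

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

G = nx.Graph()
G.add_nodes_from(sites)
for s in sites:
    nearest = sorted((t for t in sites if t != s),
                     key=lambda t: dist(sites[s], sites[t]))
    for t in nearest[:K]:
        G.add_edge(s, t)  # directionality is dropped in the undirected graph

print(nx.number_connected_components(G))
```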

Maximum distance networks


The maximum distance networks have very different network patterns and degree distributions com-
pared to the previously discussed models (Figure 15.8; Table 15.2). At maximum distances of up to 165km, only the densest settled areas in the Orbis dataset (in Central Italy, the Aegean and Phoenicia) reveal dense
clusters. Only at a maximum distance of 220km does the outline of the Orbis transport network start
to appear and around a maximum distance of 440km the network becomes connected. However, the
220km and 440km networks are much denser than the Orbis network. The 82.5km and 99km maxi-
mum distance networks show a density, number of edges and average degree more similar to the Orbis network, but like all other maximum distance networks the degree of clustering is much too high
(Table 15.2). Like the other models, this model does not succeed in capturing the long distance maritime
routes of the Orbis network but it does slightly better at representing the terrestrial, coastal and riverine
connections. The degree distribution is very different from both Orbis and the other models: the higher
the maximum distance, the higher the maximum degree; the degree distribution is only very slightly
skewed towards the lower degrees but tends to be very spread out.
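
A corresponding sketch for the maximum distance model, again with hypothetical coordinates:

```python
# Sketch: a maximum distance network; two sites are linked whenever
# they lie within the threshold of one another. Coordinates (km) are
# hypothetical.
import math
from itertools import combinations
import networkx as nx

sites = {"A": (0, 0), "B": (50, 10), "C": (120, 0), "D": (300, 40)}
MAX_DIST = 99  # km, one of the thresholds tested above

G = nx.Graph()
G.add_nodes_from(sites)
for s, t in combinations(sites, 2):
    if math.dist(sites[s], sites[t]) <= MAX_DIST:
        G.add_edge(s, t)

print(list(G.edges()))
```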

Gabriel graph and relative neighbourhood network


Aside from the long distance overseas routes, the relative neighbourhood network captures the shape of the transport system rather well (Figure 15.9; Table 15.2). It offers an outline of the Orbis network including most coastal routes, includes some of the maritime connections between the African and Eurasian continents and shows some similarities in the density and structure of the terrestrial routes.
Figure 15.7 Nearest neighbour network results of the Orbis set of nodes. Node size represents degree. Insets
show degree distributions. Note how the network only becomes connected into a single component when
assuming 4-­nearest-­neighbours.

Table 15.2 Results of global network measures for all tested models and the undirected Orbis network (in bold).
Highlighted results show some similarity in global network measures with the Orbis network.

| Network | Edges | Average degree | Density | Average clustering coefficient |
|---------|-------|----------------|---------|--------------------------------|
| Orbis (undirected) | 805 | 2.825 | 0.005 | 0.235 |
| 1-nearest-neighbour | 391 | 1.372 | 0.002 | 0.665 |
| 2-nearest-neighbour | 743 | 2.607 | 0.005 | 0.447 |
| 4-nearest-neighbour | 1416 | 4.968 | 0.009 | 0.551 |
| 10-nearest-neighbour | 3488 | 12.239 | 0.022 | 0.614 |
| 82.5km-maximum-distance | 684 | 2.4 | 0.004 | 0.818 |
| 99km-maximum-distance | 981 | 3.442 | 0.006 | 0.771 |
| 220km-maximum-distance | 3631 | 12.74 | 0.022 | 0.668 |
| 440km-maximum-distance | 11321 | 39.723 | 0.07 | 0.697 |
| Relative-neighbourhood | 663 | 2.326 | 0.004 | 0.079 |
| Gabriel-graph | 1040 | 3.649 | 0.006 | 0.239 |

However, the degree distribution is normally distributed and there is very little variance in nodes’
degrees. The Gabriel graph similarly shows little variance in its normally distributed degree distribu-
tion, but its triangular structure does succeed in recreating some of the long distance maritime con-
nections. Moreover, it is the only model used here that has an average clustering coefficient close to
that of the Orbis network.
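
The Gabriel criterion itself reduces to a simple squared-distance test, sketched below with hypothetical coordinates; swapping the test for max(d(s,r), d(t,r)) < d(s,t) yields the relative neighbourhood network instead.

```python
# Sketch: a Gabriel graph. An edge s-t is kept unless some third site r
# lies inside the circle whose diameter is the segment s-t, i.e. unless
# d(s,r)^2 + d(t,r)^2 < d(s,t)^2 for some r.
from itertools import combinations
import networkx as nx

sites = {"A": (0, 0), "B": (4, 0), "C": (2, 1), "D": (6, 3)}  # hypothetical

def d2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

G = nx.Graph()
G.add_nodes_from(sites)
for s, t in combinations(sites, 2):
    blocked = any(
        d2(sites[s], sites[r]) + d2(sites[t], sites[r]) < d2(sites[s], sites[t])
        for r in sites if r not in (s, t)
    )
    if not blocked:
        G.add_edge(s, t)

print(list(G.edges()))
```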

Conclusions of network modelling results


This comparison of models suggests that the density, average degree and number of edges can be
approximated by a number of models: 2-­nearest-­neighbour, 82.5km and 99km maximum-­distance,
relative-­neighbourhood-­network, and Gabriel graph. However, only the latter two show similarities
in the shape of the Orbis network, and only the Gabriel graph succeeds in capturing the degree of
clustering. None of the models succeed in reproducing the very skewed degree distribution, suggest-
ing alternative models should be tested that include preferential attachment effects giving rise to a few
very highly connected nodes. These modelling results suggest that theories about the structure of the
Roman transport system, as hypothesised in the static, coarse resolution Orbis network, should: include
a tendency for settlements to be connected to a limited number of their nearest neighbours (e.g. 2–3);
mostly avoid the creation of very long distance routes (e.g. > 100km); and, crucially, take into account the position of pairs of nodes relative to all other nearby nodes by avoiding connections between settlement pairs which have other settlements located within the circle whose diameter is the line between them (i.e. the Gabriel graph). The results further suggest that these models should include
an effect to allow for high degree nodes to reproduce the skewed degree distribution (e.g. preferential
attachment), a pattern that is rarely reproduced in the explicitly spatial relative or nearest neighbour-
hood network models presented here.
Figure 15.8 Maximum distance network results of the Orbis set of nodes. Node size represents degree. Insets
show degree distributions. Note how the network only becomes connected into a single component when
assuming 440 km as the maximum distance.

Figure 15.9 Results of the Orbis set of nodes; (a) relative neighbourhood network and (b) Gabriel graph. Node
size represents degree. Insets show degree distributions. Note how the networks, as compared to the results
shown in Figures 15.7 and 15.8, better succeed in representing the shape of the Orbis transport network and
the long-­distance maritime routes crossing the Mediterranean.

Conclusion
In this chapter we have introduced spatial networks as consisting of sets of spatially embedded nodes and
edges whose topology is partly restricted by physical space. A strong research tradition in the archaeo-
logical application of spatial networks has focused on a few key themes: transport networks, visibility
networks, space syntax and material culture networks. The most commonly applied local and global
network measures have been introduced, along with a range of fundamental spatial network models.
Many of the methods and models introduced in this chapter were illustrated through a case study
which aimed at exploring the structure of the Roman transport system, as hypothesised by the Orbis
network. Geographical and topological visualisations of the Orbis network revealed complemen-
tary insights into regional differences in transport network density. The use of a distance weighted
betweenness centrality measure identified settlements that are particularly crucial as intermediaries
for the flow of information, people and goods in this system. Calculating the summed distance of
the shortest paths from all settlements to Rome highlighted regional differences in the proximity to
Rome following the transport network, which has implications for their ability to supply foodstuffs
to the capital. Finally, spatial network modelling results suggest that theories about the structure of
the Roman transport system should include nearest-­neighbourhood, relative-­neighbourhood and
maximum-­distance effects, and a preferential attachment effect is hypothesised to be a further key
explanatory factor.
Spatial network applications have a long history in archaeological research, but they have only recently
received more attention in the research traditions at the core of network science: social network analysis
and physics. We believe the strong archaeological research tradition in spatial networks reveals an impor-
tant opportunity for archaeologists to contribute to the future development of spatial network methods
and models and to their multi-­disciplinary application. More intense interaction with the broader net-
work science community will in turn lead to a richer toolbox of spatial network methods and models for
archaeologists to let loose on their research topics.

References
Bagnall, R., Talbert, R., Elliot, T., Holman, L., Becker, J., Bond, S., . . . Turner, B. (2018). Pleiades: A gazetteer of past places. Retrieved from http://pleiades.stoa.org/
Barthelemy, M. (2011). Spatial networks. Physics Reports, 499(1–3), 1–101. https://doi.org/10.1016/j.physrep.2010.11.002
Bernardini, W., & Peeples, M. A. (2015). Sight communities: The social significance of shared visual landmarks. American Antiquity, 80(2), 215–235. https://doi.org/10.7183/0002-7316.80.2.215
Bevan, A., & Wilson, A. (2013). Models of settlement hierarchy based on partial evidence. Journal of Archaeological Science, 40(5), 2415–2427.
Brandes, U., Robins, G., McCranie, A., & Wasserman, S. (2013). What is network science? Network Science, 1(1), 1–15. https://doi.org/10.1017/nws.2013.2
Branting, S. (2007). Using an urban street network and a PGIS-T approach to analyze ancient movement. In J. T. Clark & E. M. Hagenmeister (Eds.), Digital discovery: Exploring new frontiers in human heritage: Proceedings of the 34th CAA conference, Fargo, 2006 (pp. 87–96). Budapest: Archaeolingua.
Broodbank, C. (2000). An island archaeology of the early Cyclades. Cambridge: Cambridge University Press.
Brughmans, T. (2010). Connecting the dots: Towards archaeological network analysis. Oxford Journal of Archaeology, 29(3), 277–303.
Brughmans, T., & Brandes, U. (2017). Visibility network patterns and methods for studying visual relational phenomena in archaeology. Frontiers in Digital Humanities: Digital Archaeology, 4(17). https://doi.org/10.3389/fdigh.2017.00017
Brughmans, T., de Waal, M. S., Hofman, C. L., & Brandes, U. (2017). Exploring transformations in Caribbean indigenous social networks through visibility studies: The case of late pre-colonial landscapes in East-Guadeloupe (French West Indies). Journal of Archaeological Method and Theory, 25(2), 475–519. https://doi.org/10.1007/s10816-017-9344-0
Brughmans, T., Keay, S., & Earl, G. (2014). Introducing exponential random graph models for visibility networks. Journal of Archaeological Science, 49, 442–454. https://doi.org/10.1016/j.jas.2014.05.027
Brughmans, T., Keay, S., & Earl, G. (2015). Understanding inter-settlement visibility in Iron age and Roman Southern Spain with exponential random graph models for visibility networks. Journal of Archaeological Method and Theory, 22, 58–143. https://doi.org/10.1007/s10816-014-9231-x
Brughmans, T., & Peeples, M. (2017). Trends in archaeological network research: A bibliometric analysis. Journal of Historical Network Research, 1(1), 1–24. https://doi.org/10.25517/jhnr.v1i1.10
Brusasco, P. (2004). Theory and practice in the study of Mesopotamian domestic space. Antiquity, 78, 142–157.
Chorley, R. J., & Haggett, P. (1967). Models in geography. London: Methuen.

Collar, A. (2013). Re-thinking Jewish ethnicity through social network analysis. In C. Knappett (Ed.), Network analysis in archaeology: New approaches to regional interaction (pp. 223–246). Oxford: Oxford University Press.
Collar, A., Coward, F., Brughmans, T., & Mills, B. J. (2015). Networks in archaeology: Phenomena, abstraction, representation. Journal of Archaeological Method and Theory, 22, 1–32. https://doi.org/10.1007/s10816-014-9235-6
Cutting, M. (2003). The use of spatial analysis to study prehistoric settlement architecture. Oxford Journal of Archaeology, 22, 1–21.
De Montis, A., & Caschili, S. (2012). Nuraghes and landscape planning: Coupling viewshed with complex network analysis. Landscape and Urban Planning, 105(3), 315–324. https://doi.org/10.1016/j.landurbplan.2012.01.005
Doran, J. E., & Hodson, F. R. (1975). Mathematics and computers in archaeology. Cambridge, MA: Harvard University Press.
Earley-Spadoni, T. (2015). Landscapes of warfare: Intervisibility analysis of early Iron and Urartian fire beacon stations (Armenia). Journal of Archaeological Science: Reports, 3, 22–30. https://doi.org/10.1016/j.jasrep.2015.05.008
Evans, T. (2016). Which network model should I use? Towards a quantitative comparison of spatial network models in archaeology. In T. Brughmans, A. Collar, & F. Coward (Eds.), The connected past: Challenges to network studies in archaeology and history (pp. 149–173). Oxford: Oxford University Press.
Evans, T. S., & Rivers, R. J. (2017). Was Thebes necessary? Contingency in spatial modelling. Frontiers in Digital Humanities: Digital Archaeology, 4(8), 1–21. https://doi.org/10.3389/fdigh.2017.00008
Fairclough, G. (1992). Meaningful constructions: Spatial and functional analysis of medieval buildings. Antiquity, 66, 348–366.
Ferguson, T. J. (1996). Historic Zuni architecture and society: An archaeological application of space syntax. Papers of the University of Arizona No. 60. Tucson: University of Arizona Press.
Foster, S. M. (1989). Analysis of spatial patterns in buildings (access analysis) as an insight into social structure: Examples from the Scottish Atlantic Iron age. Antiquity, 63, 40–50.
Fraser, D. (1980). The cutpoint index: A simple measure of point connectivity. Area, 12(4), 301–304.
Fraser, D. (1983). Land and society in neolithic Orkney. BAR British Series, 117. Oxford: Archaeopress.
Fulminante, F. (2012). Social network analysis and the emergence of central places: A case study from Central Italy (Latium Vetus). BABESCH, 87, 27–53.
Gjesfjeld, E. (2015). Network analysis of archaeological data from hunter-gatherers: Methodological problems and potential solutions. Journal of Archaeological Method and Theory, 22(1), 182–205.
Gjesfjeld, E., & Phillips, S. C. (2013). Evaluating adaptive network strategies with geochemical sourcing data: A case study from the Kuril Islands. In C. Knappett (Ed.), Network analysis in archaeology: New approaches to regional interaction (pp. 281–306). Oxford: Oxford University Press.
Golitko, M., Meierhoff, J., Feinman, G. M., & Williams, P. R. (2012). Complexities of collapse: The evidence of Maya obsidian as revealed by social network graphical analysis. Antiquity, 86, 507–523.
Grahame, M. (1997). Public and private in the Roman house: The spatial order of the Casa del Fauno. In R. Laurence & A. Wallace-Hadrill (Eds.), Domestic space in the Roman world: Pompeii and beyond. Journal of Roman Archaeology, Supplementary Series, 22 (pp. 137–164). Portsmouth, Rhode Island.
Hage, P., & Harary, F. (1991). Exchange in Oceania: A graph theoretic analysis. Oxford: Clarendon Press.
Hage, P., & Harary, F. (1996). Island networks: Communication, kinship and classification structures in Oceania. Cambridge: Cambridge University Press.
Harary, F. (1969). Graph theory. Reading, MA and London: Addison-Wesley.
Herzog, I. (2013). Least-cost networks. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos, I. Romanowska, & D. Wheatley (Eds.), Archaeology in the digital era: Papers from the 40th annual conference of computer applications and quantitative methods in archaeology (CAA), Southampton, 26–29 March 2012 (pp. 237–248). Amsterdam: Amsterdam University Press.
Hill, J. B., Peeples, M. A., Huntley, D. L., & Carmack, H. J. (2015). Spatializing social network analysis in the late pre-contact U.S. Southwest. Advances in Archaeological Practice, 3(1), 63–77. https://doi.org/10.7183/2326-3768.3.1.63
Hillier, B. (1996). Space is the machine. Cambridge: Cambridge University Press.
Hillier, B., & Hanson, J. (1984). The social logic of space. Cambridge: Cambridge University Press.
Hodder, I., & Orton, C. (1976). Spatial analysis in archaeology. Cambridge: Cambridge University Press.
Irwin, G. J. (1978). Pots and entrepots: A study of settlement, trade and the development of economic specialization in Papuan prehistory. World Archaeology, 9(3), 299–319.

Isaksen, L. (2007). Network analysis of transport vectors in Roman Baetica. In J. T. Clark & E. M. Hagenmeister (Eds.), Digital discovery: Exploring new frontiers in human heritage: Proceedings of the 34th CAA conference, Fargo, 2006 (pp. 76–87). Budapest: Archaeolingua.
Isaksen, L. (2008). The application of network analysis to ancient transport geography: A case study of Roman Baetica. Digital Medievalist, 4. Retrieved from www.digitalmedievalist.org/journal/4/isakse
Jenkins, D. (2001). A network analysis of Inka roads, administrative centers, and storage facilities. Ethnohistory, 48(4), 655–687.
Jiménez-Badillo, D. (2012). Relative neighbourhood networks for archaeological analysis. In M. Zhou, I. Romanowska, Z. Wu, P. Xu, & P. Verhagen (Eds.), Revive the past: Proceedings of computer applications and quantitative techniques in archaeology conference 2011, Beijing (pp. 370–380). Amsterdam: Amsterdam University Press. Retrieved from http://proceedings.caaconference.org/files/2011/42_Jimenez-Badillo_CAA2011.pdf
Kirkpatrick, D. G., & Radke, J. D. (1985). A framework for computational morphology. Machine Intelligence and Pattern Recognition, 2, 217–248.
Knappett, C., Evans, T., & Rivers, R. (2008). Modelling maritime interaction in the Aegean Bronze age. Antiquity, 82(318), 1009–1024.
Mackie, Q. (2001). Settlement archaeology in a Fjordland archipelago: Network analysis, social practice and the built environment of Western Vancouver Island, British Columbia, Canada since 2,000 BP. BAR International Series 926. Oxford: Archaeopress.
Meeks, E., Scheidel, W., Weiland, J., & Arcenas, S. (2014). ORBIS (v2) network edge and node tables. Stanford Digital Repository. Retrieved from http://purl.stanford.edu/mn425tz9757
Menze, B. H., & Ur, J. A. (2012). Mapping patterns of long-term settlement in Northern Mesopotamia at a large scale. Proceedings of the National Academy of Sciences of the United States of America, 109(14), E778–E787. https://doi.org/10.1073/pnas.1115472109
Mills, B. J., Clark, J. J., Peeples, M. A., Haas, W. R., Roberts, J. M., Hill, J. B., . . . Shackley, M. S. (2013, March). Transformation of social networks in the late pre-hispanic US Southwest. Proceedings of the National Academy of Sciences of the United States of America, 1–6. https://doi.org/10.1073/pnas.1219966110
O'Sullivan, D., & Turner, A. (2001). Visibility graphs and landscape visibility analysis. International Journal of Geographical Information Science, 15(3), 221–237. https://doi.org/10.1080/13658810010011393
Pailes, M. (2014). Social network analysis of early classic Hohokam corporate group inequality. American Antiquity, 79(3), 465–486. https://doi.org/10.7183/0002-7316.79.3.465
Peregrine, P. (1991). A graph-theoretic approach to the evolution of Cahokia. American Antiquity, 56(1), 66–75.
Rihll, T. E., & Wilson, A. G. (1987). Spatial interaction and structural models in historical analysis: Some possibilities and an example. Histoire & Mesure, 2, 5–32.
Ruestes Bitrià, C. (2008). A multi-technique GIS visibility analysis for studying visual control of an Iron age landscape. Internet Archaeology, 23. http://intarch.ac.uk/journal/issue23/4/index.html
Scheidel, W. (2015). Orbis: The Stanford geospatial network model of the Roman world. Princeton/Stanford Working Papers in Classics. https://doi.org/10.1093/OBO/97801953896610075.2
Shemming, J., & Briggs, K. (2014). Anglo-Saxon communication networks. Retrieved from http://keithbriggs.info/AS_networks.html
Stjernquist, B. (1966). Models of commercial diffusion in prehistoric times. Scripta Minora, 2, 1–43.
Swanson, S. (2003). Documenting prehistoric communication networks: A case study in the Paquimé polity. American Antiquity, 68(4), 753–767.
Terrell, J. E. (1977). Human biogeography in the Solomon Islands. Fieldiana Anthropology, 68(1), 1–47.
Tilley, C. (1994). A phenomenology of landscape: Places, paths and monuments. Oxford: Berg.
Toussaint, G. T. (1980). The relative neighbourhood graph of a finite planar set. Pattern Recognition, 12(4), 261–268. https://doi.org/10.1016/0031-3203(80)90066-7
Verhagen, P., Brughmans, T., Nuninger, L., & Bertoncello, F. (2013). The long and winding road: Combining least cost paths and network analysis techniques for settlement location analysis and predictive modelling. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-Flores, C. Papadopoulos, I. Romanowska, & D. Wheatley (Eds.), Archaeology in the digital era: Papers from the 40th annual conference of computer applications and quantitative methods in archaeology (CAA), Southampton, 26–29 March 2012 (pp. 357–366). Amsterdam: Amsterdam University Press.

Wernke, S. A. (2012). Spatial network analysis of a terminal prehispanic and early colonial settlement in Highland Peru. Journal of Archaeological Science, 39(4), 1111–1122. https://doi.org/10.1016/j.jas.2011.12.014
Wernke, S. A., Kohut, L. E., & Traslaviña, A. (2017). A GIS of affordances: Movement and visibility at a planned colonial town in Highland Peru. Journal of Archaeological Science, 84, 22–39. https://doi.org/10.1016/j.jas.2017.06.004
White, D. A., & Barber, S. B. (2012). Geospatial modeling of pedestrian transportation networks: A case study from Precolumbian Oaxaca, Mexico. Journal of Archaeological Science, 39(8), 2684–2696. https://doi.org/10.1016/j.jas.2012.04.017
Wilensky, U. (1999). NetLogo. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL. Retrieved from http://ccl.northwestern.edu/netlogo/
16
Space syntax methodology
Ulrich Thaler

Introduction
You may not have heard of Harry Beck (Garfield, 2012, pp. 291–296), but he has probably made your life
easier at one time or another (though nobody asked him to). Beck was working as a technical draughts-
man for the London Underground Signals Office when, in 1931, he submitted a radical new draft, pre-
pared in his spare time, for the London Tube map, which had been inspired by electrical diagrams and
the realization that, once on an Underground train, passengers did not care overly much about physical
distance. You will recognize Beck’s diagram (Figure 16.1(b)) immediately, because this is, in principle,
the Tube map still used today, a design icon printed on tourist mugs and emulated by local transport
authorities worldwide. Although schematic single line diagrams had appeared as early as 1909, the pre-­Beck map of the entire network was still drawn as a geographical map and often also shown superimposed over a road map of the city (Figure 16.1(a)). What Beck had done, in effect, was to transform a geographical map into
a topological one, forgoing metric properties for relative relationships and thus producing a simplified
model of the system which still retains the properties essential to its users. This, in a nutshell, is what space
syntax methodology (Al Sayed, Turner, Hillier, Iida, & Penn, 2014; Hanson, 1998; Hillier, 1996; Hillier &
Hanson, 1984) aims to do as a statistical topological or network analysis of built contexts at the settlement
or building level, as explained below.
Before looking into details of methodology, however, we should take a brief look at the history of
space syntax and, perhaps, the distinction between space syntax theory and space syntax methodology.
Space syntax was formulated as an approach to the configurational analysis of built contexts at University
College London’s Bartlett School of Architecture from the mid to late 1970s on by a group of researchers
around Bill Hillier and, later, Julienne Hanson. As results from space syntax analyses, of integration in
particular, and real-­word observations of movement and traffic flows showed good correlations and due
to its resulting ability to model the global effects and repercussions of local changes within a given spatial
configuration (cf. Hillier, Penn, Hanson, Grajewski, & Xu, 1993), space syntax quickly gained traction as a
predictive tool for architectural and urban planning, fostering an extensive research community. Although
it never became part of archaeology’s methodological mainstream (if such a thing exists), most likely due
to the high quality and density of data it demands, space syntax was also picked up by some archaeologists
(e.g. Gilchrist, 1988; Foster, 1989) with surprising alacrity given the discipline’s usual tendency to adopt
more seasoned approaches from other fields. Indeed, even an early emphatic criticism (Leach, 1978) of space syntax in an archaeological volume (though formulated by a social anthropologist) predated by six years the publication, in 1984, of the volume familiarly known as the ‘Old Testament’ in the space syntax community, Hillier and Hanson’s “The social logic of space” (Hillier & Hanson, 1984).
Figure 16.1 London Tube maps: (a) the 1908 version superimposed on a city plan. (b) the 1933 version featur-
ing H. Beck’s topological redesign.
Source: Figures 16.1(a–b): © TfL from the London Transport Museum collection (ref. nos. 2002/264, 1999/321)

It should also be noted that the analytical techniques we are concerned with here, then labelled
“alpha analysis” and “gamma analysis” for settlement and building level analyses respectively (Hillier &
Hanson, 1984, pp. 90–123, 147–155), were only introduced as one part of a wider intellectual agenda
in “The social logic of space”, a book that drew widely on ethnographic examples and sought to arrive
at fairly broad generalizations on its title matter. Another strong focus besides analytical techniques
was on considerations of how seemingly complex settlements as ‘global’ structures could arise from
specific, but potentially simple ‘local’ rules. Indeed, such “generative syntaxes” appear to have been the
first topic of discussion in the development of space syntax (Hillier, Leaman, Stansall, & Bedford, 1976),
which thus started out from a perspective that can be broadly termed as structuralist. Nonetheless, at
least some degree of appreciation of the recursive relationship of social (acts) and built structures can be
found in “The social logic”, and by the time Hanson and Hillier published their next major volumes on
urban/settlement and architectural/building studies respectively, “Space is the machine” (Hillier, 1996)
and “Decoding homes and houses” (Hanson, 1998), their stance can certainly be characterized as post-­
structuralist. This shift, perhaps, might serve as an indication that the analytical techniques developed
under the label of ‘space syntax’ are actually compatible with different theoretical perspectives and frame-
works. Efforts (or at least calls) to align space syntax with a phenomenological perspective (Seamon, 1994,
2003) may provide a particularly interesting illustration of this point for the archaeologist, who is elsewhere
reminded that phenomenological approaches do “not translate well into a formal theory, nor a fixed set
of methodological techniques” (Tilley, 2005, p. 202). Indeed, the continued advocacy of space syntax
as an encompassing theoretical framework by Hillier in particular (e.g. Hillier, 1999a, p. 165, 2008; cf.
Batty, 2004, p. 3) seems to contrast somewhat, at least from the etic perspective of an archaeologist, with
a mainstream within the space syntax community that is strongly oriented towards practical applications
in architecture and urban planning. Emically speaking, it is certainly true that in archaeology itself, as well
as related disciplines such as social anthropology (cf. Dafinger, 2010, pp. 125–127, 134–140), space syn-
tax’s broader theoretical aspirations have mostly been ignored in favour of a pragmatic approach which
conceives of space syntax as a methodological tool. Notwithstanding the inspiration that may be found
in the more abstract considerations presented by Hillier, Hanson and others, the present text is therefore
deliberately framed as an introduction into “space syntax methodology”.

Method

Basic principles
Let us begin with a concrete historical example, the ground floor plan (Figure 16.2(a)) of a 17th century
residence of minor German nobles, Schloss Friedeburg in Saxony-­Anhalt (Schwarzberg, 2002). In the
terms of network analysis, each room can be understood as a node and each (inside) door as an edge
connecting two nodes. For a corresponding visual representation, we can inscribe a dot in each room and
link these with straight lines where a door connects two rooms and thus arrive at a topological graph (Fig-
ure 16.2(b)). Since the information content of this graph no longer depends on the nodes’/dots’ relative
position to one another – the relevant relationships are shown by the lines representing the edges – this
graph can then be rearranged. The most commonly used form is the so-­called justified graph (Al Sayed
et al., 2014, pp. 13–14; Hillier & Hanson, 1984, pp. 106–108, 149), or j-­g raph for short (Figure 16.2(c)).
In this, the outside of the building, referred to as the carrier space, is shown as a further node at the root of a dendritic graph in which nodes are arranged in horizontal lines according to their distance from the outside, i.e. the minimum number of doors/edges through which they can be accessed.

Figure 16.2 Schloss Friedeburg, Saxony-­Anhalt, Germany. The ground floor of the main residential building
(17th c. CE): (a) the state plan of 1930. (b) a simplified plan with points of access marked by arrows, topological
graph superimposed. (c) the justified graph with room types, rings and depth from carrier indicated. (d) the
path matrix with sums of path lengths.
Source: Figure 16.2(a); Schwarzberg (2002), Figure 1

The j-­graph already permits the determination of some numerical indicators of specific spatial prop-
erties. Ease of access from the outside as a quality of a given room or an entire building, for example, is
reflected in its depth or mean depth as numerical indicators.1 In the graph, the depth of each node can
easily be determined by simply numbering the horizontal lines of nodes (for the simple reason that this
is how the graph is organized in the first place), which in turn allows us to calculate the mean depth, a
first important indicator of how accessible the building as a whole is from its surroundings. As to internal
structure and relationships, the number of rings identifiable in the j-­g raph (and thus the system) offers
a first indicator of route choice options, while each individual room/node can be characterized by, on
the one hand, its connectivity, i.e. the number of immediate links with other nodes, and, on the other, its classification as an a-­, b-­, c-­ or d-­type space (Al Sayed et al., 2014, p. 14; Hanson, 1998, pp. 173–174; Hillier, 1996, pp. 318–320; Figure 16.2(c)). Spaces of type a display a single edge, i.e. are ‘dead-­end’ rooms accessible through but one door and thus strongly controlled by other spaces. In contrast to the a-­type tips of the branches in a dendritic system, b-­type spaces constitute the (stems of) the branches themselves, i.e. they are at least and indeed most typically two-­edged – both literally and figuratively – in that they offer some degree of control, at least locally, but contribute only moderately to linking up a system in global terms. The latter
is more characteristic of c-­type spaces, i.e. those two-­ or more-­edged nodes which form part of a (or
more precisely: one single) ring, and d-­type spaces, which form part and thus link two or more rings and
consequently are the main connectors within a building (if, indeed, the configuration contains d-­type
spaces at all).
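
Computationally, these depths amount to a breadth-first search from the carrier; the following is a minimal sketch in Python with networkx, using hypothetical room labels rather than the Friedeburg plan.

```python
# Sketch: depth from the carrier and mean depth of an access graph,
# via breadth-first search. Rooms and door connections are hypothetical.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("carrier", "hall"), ("hall", "parlour"), ("hall", "kitchen"),
    ("parlour", "chamber"), ("kitchen", "pantry"), ("pantry", "parlour"),
])

depths = nx.single_source_shortest_path_length(G, "carrier")
mean_depth = sum(d for room, d in depths.items() if room != "carrier") / (len(G) - 1)
print(depths)
print("mean depth from carrier:", round(mean_depth, 3))
```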
While this categorization permits a first idea of the ‘connectedness’ – a local quality – and furthermore
the ‘centrality’ of a space within the network of spaces – a global quality of (literally) central importance
in space syntax analyses and referred to as integration – a better idea of the latter is gained if we do some
sums. The easiest way to do so (without a computer), but one rarely explicitly discussed (somewhat
surprisingly, unless you consider the ubiquity of computers), is a path matrix (Figure 16.2(d)) in which
the path length or step distance, i.e. the number of edges traversed, between each pair of nodes is noted
(Blanton, 1994, pp. 34–35). This permits the calculation of a sum of path lengths for each space/room.
The most integrated room in a building, i.e. the one most easily reached from all others, will be charac-
terized by the lowest, the least integrated by the highest sum of path lengths. From the sums we can also
calculate the mean path length (or mean distance, MD) for every space and from this and the number of
spaces (k) in a given system, Hillier and Hanson derived a numerical indicator of integration which they
termed ‘relative asymmetry’:

$$RA = \frac{2 \times (MD - 1)}{k - 2}$$
As the terminology indicates, this measure was intended to compare “how deep the system is from
a particular point with how deep or shallow it theoretically could be” (Hillier & Hanson, 1984, p. 108)
and express this in values from 0 to 1; the word ‘asymmetry’ denotes the non-­correspondence of actual
and theoretically possible depth. Higher asymmetry, however, means lesser integration of a space, so that,
somewhat counter-­intuitively, relative asymmetry as an indicator of integration – i.e. the spatial quality we are, after all, interested in – gives higher values for less integrated and lower values for better-­integrated
spatial units. The crucial advance over simply doing sums, on the other hand, is captured in the word
‘relative’: asymmetry and integration are now considered in relation to the size of the system, i.e. relative
asymmetry takes into account how big a building any given room is in.
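
These sums and the resulting RA values are easily computed; below is a sketch, assuming networkx and a small hypothetical access graph, in which the path matrix is implicit in the shortest-path calls.

```python
# Sketch: sum of path lengths, mean depth (MD) and relative asymmetry
# RA = 2(MD - 1) / (k - 2) for every space of a hypothetical access graph.
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3), (2, 4), (4, 5), (5, 1)])

k = G.number_of_nodes()
for space in sorted(G):
    lengths = nx.single_source_shortest_path_length(G, space)
    total = sum(lengths.values())   # one row-sum of the path matrix
    md = total / (k - 1)            # mean path length / mean depth
    ra = 2 * (md - 1) / (k - 2)     # relative asymmetry, between 0 and 1
    print(space, total, round(md, 3), round(ra, 3))
```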
If instead of a single room, we look at and want to characterise the building in its entirety, not only
can we calculate mean values for depth, relative asymmetry and (as we will see) other numerical indica-
tors, but we can also consider its ‘core’ and ‘genotype’. The integration core is defined as the subset of
the (typically 10%) most highly integrated spaces (Al Sayed et al., 2014, p. 15; Hillier & Hanson, 1984,
p. 115); it may be of interest, e.g., in how far the core penetrates certain sections of a larger building or
bypasses others. The notion of a topological genotype (Hillier & Hanson, 1984, pp. 143–175; Hillier,
Hanson, & Graham, 1987), by contrast, does not aim at a more detailed internal description of a layout,
but a simplified one for comparison with other contexts. If specific room functions are attested across a
sample of buildings in a consistent hierarchy of integration – e.g. kitchen > (read: more highly integrated
than) reception room(s) > work space(s) > bedroom(s) – then this recurrent organizational scheme,
which may be obscured by very different ‘phenotypical’ built forms, is referred to as a building genotype,
which can be characteristic, e.g. of certain building functions and/or cultural or social contexts.
Comparison between buildings, however, also leads us to a crucial methodological difficulty: while
relative asymmetry as a standardised value is easier to work with than sums of path lengths and while
one might thus expect it to facilitate reliable comparisons between different buildings and, in particular,
buildings of different sizes with accordingly very different sums of path lengths, the latter, unfortunately,
is not actually the case; we will address this issue later.

Nodes and edges


For now, there are two directions in particular in which we can and need to expand on the first principles
set out above. While we will later take a look at different spatial properties and their numerical indicators,
the first vector of expansion concerns the nodes and edges which constitute the configurational frame-
work. So far, we have spoken of the rooms of a building as nodes and the doors between them as the cor-
responding edges. But although rooms may constitute the most intuitive set of nodes for configurational
network analysis, any set of consistently defined spatial entities or units which are meaningfully linked to
one another by corresponding edges can be subjected to the same kind of analysis (Hanson, 1998, p. 270;
Hillier, 1999b, p. 169). Indeed, rather than with the ‘rooms’ of everyday parlance, space syntax analyses
more typically are concerned with ‘convex spaces’ (Al Sayed et al., 2014, p. 12; Hillier & Hanson, 1984,
pp. 97–98). A convex space is defined as an area whose perimeter is not intersected by the connecting line
of any pair of points within it or, differently put, within which any two points are directly intervisible.
Edges are then defined by adjacency; this is not, nota bene, the adjacency of rooms sharing a door-­less
wall, but the congruence of part of the perimeter of two spaces. Happily, in most architectural contexts
at the level of the individual building (rather than the context of an entire settlement), the differentiation
between rooms and convex spaces is largely a notional one in principle and a matter of resolution in
practice (Figure 16.3(a)):2 any recessed window or projecting half-­column will, strictly speaking, break up
a given room into two or more convex spaces, but not any minor deviation from a basically rectangular
(and thus convex) shape need be included in a meaningfully analysable ground plan.
The formally defined convex break-­up, i.e. the set of “fewest and fattest” convex spaces that cover
the entire accessible area within a built context is, however, fundamental for the definition of a second
set of nodes and edges, referred to as the axial break-­up or axial map (Al Sayed et al., 2014, pp. 11–12;
Hillier & Hanson, 1984, pp. 99–100). This comprises the smallest set of longest lines of sight that cov-
ers, i.e. reaches, all convex spaces in a given spatial layout and additionally replicates all rings within it
(Figure 16.3(b)). When axial lines are taken as nodes of a network, their intersections will form the edges
(Figure 16.3(d)); once the axial break-­up is mapped in this manner, a j-­g raph (Figure 16.3(f)), path matrix
and numerical indicators of spatial properties can be derived from it in the same way set out above for
rooms (/convex spaces). Early on, a connection was proposed between “axiality [and] movement into
and through the system” as well as between “convexity [and the system’s] organisation from the point of
view of those who are already statically present in the system” (Hillier & Hanson, 1984, p. 96; cf. Al Sayed
et al., 2014, p. 15). In practice, although both perspectives can offer meaningful results at both levels,
convex analysis has found broader application in analyses at the building level, whereas axial analysis has
developed into a mainstay of analyses at the settlement level. It is therefore perhaps not surprising, given
the strong focus on urban planning within the wider space syntax community, that axial analysis has seen
a number of further developments.
These include, most notably, the line segment break-­up and the weighting of nodes and, more typically,
edges; both developments are combined in segment angular analysis (Al Sayed et al., 2014, pp. 71–98,
116–117; Dalton, 2001; Hillier & Iida, 2005; Turner, 2000, 2001a; cf. Stöger, 2011, pp. 63–64, 215–219, as
an example of the still rare implementation in archaeology). The line segment break-­up is based on (and
as a visual representation congruent with) the axial map, but the unit of analysis, i.e. the node, is no longer
defined as the entirety of an axial line but rather as the line segment between two intersections of axial
lines (or one such intersection and the end of an axial line; Figure 16.3(b)). In principle, edges could then,
as in convex analysis, be defined in terms of adjacency; but in fact, angular analysis diverges from ‘classical
space syntax’ in that it no longer uses a binary concept of an absent or present connection between two
spatial units, nor consequently step distance (i.e. fewest turns in the conventional axial map) as a measure of distance between two spatial entities.

Figure 16.3 (a–c) Simplified ground floor plan of the main residential building of Schloss Friedeburg: (a) with
the non-­convex rooms highlighted and a suggestion for (approximately convex) subdivision of non-­convex
rooms 1 and 6. (b) with the axial map superimposed and line segments indicated for the longest and most
integrated axial line. (c) with three overlapping isovists and their centre-­points indicated. (d) the axial map of
the ground floor of the main residential building of Schloss Friedeburg, with the topological graph of axial
break-­up superimposed. (e) examples of diamond-­shaped topological graphs. (f) justified graph of the axial
break-­up of the ground floor of the main residential building of Schloss Friedeburg.
Source: Figures 16.2(b–d), 16.3–16.4: by the author

Instead, the distance between two spatial units is given as the least
sum of angles that need to be turned on a connecting path. Thus, geometrical properties of the spatial
layout under study are reintroduced into the formerly purely topological analysis.
In a similar though perhaps less direct way, the realm of ‘topology simple and pure’ is transcended when
isovists are taken as the nodes in a network analysis, with the isovists’ overlaps establishing the edges of
the network graph (Al Sayed et al., 2014, pp. 27–38; Turner, Doxa, O’Sullivan, & Penn, 2001; Turner &
Penn, 1999). The isovist (Benedikt, 1979) is defined as the volume or, in the present context, area of space
visible from a given point (Figure 16.3(c)); essentially this is what in geographical terms would be called
a viewshed (see Gillings & Wheatley, this volume). In contrast to the number of convex spaces or the
minimum number of axial lines in a given spatial layout, the number of points – each of which allows the
construction of an isovist – within that layout is, of course, infinite. Hence, Visual Graph Analysis (VGA)
starts by superimposing an arbitrary grid over a layout and then constructing isovists of the centre points
of the raster cells. From this point on, analysis again follows the established methodology of ‘classical space
syntax’, yet at least two marked differences between VGA and convex or axial analysis deserve mention.
The first lies in the fact, already alluded to, that the regular grid brings with it a notable degree of sen-
sitivity for metric properties; the metric size of a given room within a building will influence the visual
integration of the points within it (which are, after all, normally intervisible). The second concerns the
way we use the results of VGA; as the centre point of a raster cell is not an intuitively meaningful spatial
entity in the same way as a visual line or – a fortiori and even under the designation ‘convex space’ – a
room are, the interpretation of results from VGA will typically focus less on numerical indicators for
individual spatial units and more on the ‘heat map’ of integration as, perhaps aptly, a visual representation,
in which red/light denotes highly and blue/dark weakly integrated areas. This does not diminish VGA’s
potential for study both at the building and the settlement level.

Qualities and indicators


Having looked at edges and nodes, we will now turn to spatial qualities and their numerical indicators,
starting with the problem left open earlier: if we are to compare integration as a configurational quality
between different built contexts of different sizes or individual rooms within them, then the numerical
indicators we use to describe this quality need to be robust with regard to differences in size between the
contexts under comparison. Yet, in reality, spatial configurations themselves display some sensitivity to
size. Linear topological arrangements such as a single string of b-­type rooms ending in an a-­type space
offer a simple, but telling illustration: a three-­room structure following this configurational principle will
provide a habitable building or part thereof. But a 13-­or even 43-­room building configured in accor-
dance with the same principle, i.e. without c-­and d-­type spaces or even branching, would be eminently
impractical (at least for most purposes). So, for any meaningful comparison between different contexts,
numerical indicators have to be calculated in a way that corrects for the size-­sensitivity of configurations,
a process referred to as ‘normalisation’.
For this reason, Hillier and Hanson quickly followed their theoretically derived relative asymmetry
(RA; see earlier) with a second, empirically adjusted numerical indicator of integration termed ‘real rela-
tive asymmetry’ (RRA; Hillier & Hanson, 1984, pp. 109–113). The underlying idea is simple: In a first
step, RA values are calculated for a standard set of j-­g raphs of different sizes but of the same basic shape
and organizational structure (Figure 16.3(e)), i.e. of systems that should ‘behave’ similarly irrespective of
their size. These RA values, designated as ‘D-­values’ in reference to the diamond shape of the underlying
j-­graph series, can be compiled in a reference table and then used as correction factors for values from
real-­life contexts. This means that RRA for a spatial unit in a system of a given size k is calculated by
dividing the unit’s RA by the D-­value corresponding to the system’s size:
$$RRA = \frac{RA}{D_k}$$
In practice, there happily is no need to regularly consult a reference table, as D-­values for differently-­
sized systems can also be calculated by a logarithmic formula:

$$D_k = \frac{2\left\{k\left[\log_2\left(\frac{k + 2}{3}\right) - 1\right] + 1\right\}}{(k - 1)(k - 2)}$$

Inserting this formula for Dk as well as that for RA in the calculation of RRA, we arrive at:

$$RRA = \frac{RA}{D_k} = \frac{2 \times (MD - 1) \times (k - 1)(k - 2)}{2\left\{k\left[\log_2\left(\frac{k + 2}{3}\right) - 1\right] + 1\right\} \times (k - 2)}$$

With RRA, we have arrived at a numerical indicator for integration that has proven robust in practical application with regard to size-­sensitivity. However, because its calculation no longer produces values in a standardized range, as RA’s did, while it retains RA’s counterintuitive trait of giving high values for low and low ones for high integration, RRA eventually came to be replaced by an indicator simply termed, like the quality it describes, ‘integration’ (I; Al Sayed et al., 2014, pp. 15, 114) and calculated, equally simply, as the reciprocal of RRA:

   
$$I_{HH} = \frac{1}{RRA} = \frac{2\left\{k\left[\log_2\left(\frac{k + 2}{3}\right) - 1\right] + 1\right\} \times (k - 2)}{2 \times (MD - 1) \times (k - 1)(k - 2)}$$
If this seems like a less than perfectly elegant way of arriving at a crucial measure, we should not be
surprised by criticism or efforts at improvement. For axial analysis, these include, e.g. the proposals of an
alternative series of gridded rather than diamond-­shaped standard j-­g raphs for the production of correc-
tion factors (Kruger, 1989), and even of an alternative calculation of integration based not on the prob-
lematic RA, but directly on the sum of path lengths (Teklenburg, Timmermans, & van Wagenberg, 1993):

$$I_{Tekl} = \frac{\ln\left(\frac{k - 2}{2}\right)}{\ln\left(\sum D - k + 1\right)}$$

Similarly, in the context of VGA, both the revival of P-­values (de Arruda Campos & Fong, 2003),
developed by Hillier and Hanson (1984, pp. 113–114, cf. pp. 73, 95) as an alternative to D-­values in a
very specific form of analysis of building-­to-­settlement relationships, but never widely used, and, more
radically, the abandonment of standardized integration measures in favour of simple mean path lengths
(Sailer, 2010, pp. 132–133) have been suggested. To the best of my knowledge, however, such proposed
alternatives seem to have been mostly ignored rather than refuted (let alone accepted and adopted),
leaving the calculation of I a black box that a thriving research community mostly seems loath to open.
Recent discussions of normalisation in the context of angular analysis need to be noted as an exception
(Al Sayed et al., 2014, pp. 77–78, 117; Hillier, Yang, & Turner, 2012), but the results are not transferable
to other types of analysis.
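
For reference, the chain of calculations from RA through RRA to I_HH is short enough to set out in a few lines of code; the sketch below follows the formulas given above, with a hypothetical mean depth as input.

```python
# Sketch: D-value, RRA and Hillier-Hanson integration, implementing the
# formulas above. The mean depth fed in at the end is hypothetical.
import math

def d_value(k):
    # RA of the diamond-shaped standard j-graph with k nodes
    return 2 * (k * (math.log2((k + 2) / 3) - 1) + 1) / ((k - 1) * (k - 2))

def integration_hh(md, k):
    ra = 2 * (md - 1) / (k - 2)   # relative asymmetry
    rra = ra / d_value(k)         # real relative asymmetry
    return 1 / rra                # integration as the reciprocal of RRA

print(round(integration_hh(md=2.4, k=12), 3))
```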
This might encourage us to look at other spatial qualities and their numerical indicators within space
syntax, of which there are a number. We have already encountered, in the introductory section, con-
nectivity as a simple local measure, i.e. the number of other nodes with which a given node shares edges.
Control is another local measure, calculated by assigning, for each space, the reciprocal of its connectiv-
ity to each of its neighbours and then summing up the apportioned values for each space; it is taken to
capture the degree to which a space controls access from other parts of the network to its immediate
neighbours (Hillier & Hanson, 1984, p. 109). The correlation of connectivity and integration (which,
of course, cannot escape potential problems with the latter) is referred to as intelligibility and taken to
describe how far the global connective role of a spatial unit can be inferred from within that space (Al
Sayed et al., 2014, p. 15; Conroy, 2000, pp. 61–88; Hillier, 1996, p. 120).3
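
As a worked illustration of the control calculation just described (a sketch with networkx and a hypothetical access graph):

```python
# Sketch: control values. Each space distributes the reciprocal of its
# connectivity to every neighbour; a space's control is the sum it receives.
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3), (2, 4), (4, 5)])  # hypothetical

control = {n: 0.0 for n in G}
for n in G:
    share = 1 / G.degree(n)
    for neighbour in G.neighbors(n):
        control[neighbour] += share

print(control)  # values above 1 mark strongly controlling spaces
```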
Despite their more specific uses, however, none of these measures comes close to integration in
either its apparent predictive potential or in its popularity among researchers, whereas a more seri-
ous ‘contender’ or complement to integration has come to the fore in recent years for some types of
analysis: choice (Al Sayed et al., 2014, pp. 15, 77, 114–115, 117; Freeman, 1977; Hillier & Iida, 2005,
p. 483; Turner, 2007). Like integration, choice is defined in reference to the set of shortest paths
between any pair of nodes within a system. But in contrast to the integration value of a given space,
which takes into account the shortest routes from that space to all others (which are the same as those
from all others to this particular space), its choice value reflects how many shortest routes between
pairs of other nodes pass through the space under consideration; fittingly, ‘betweenness’ has been used
as an alternative designation (Al Sayed et al., 2014, pp. 114, 117; Turner, 2007, p. 540). Consequently,
choice has been advocated as an indicator of the “through-­movement potential” (Al Sayed et al.,
2014, pp. 26, 73; Hillier, 2007/2008, p. 2) of a node, i.e. its likelihood of attracting passing traffic, by contrast with integration, which is held to capture a node's "destination potential" (Hillier, 2007/2008, p. 2) or "to-movement potential" (Al Sayed et al., 2014, p. 73), i.e. its likelihood of attracting visitors or
simply its accessibility. This reflects, to some degree, the earlier opposition of axial space as connected
with movement and convex space as linked to static activity. It is therefore perhaps not surprising that
choice, considered “descriptive of movement rather than occupation” (Al Sayed et al., 2014, p. 15), is
not usually used in convex analysis, despite the popularity it has gained in forms of axial analysis and
in the context of line segment analysis in particular.
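In graph-theoretic terms, choice is simply betweenness centrality, and standard implementations can be used directly; a brief sketch, again assuming networkx and using a toy access graph of my own devising:

import networkx as nx

# A toy access graph: court -> porch -> vestibule -> throne room,
# with an alternative route via a corridor.
g = nx.Graph([("court", "porch"), ("porch", "vestibule"),
              ("vestibule", "throne"), ("court", "corridor"),
              ("corridor", "throne")])
# Unnormalised counts of shortest paths between pairs of *other* nodes
# passing through each node, i.e. choice/betweenness:
choice = nx.betweenness_centrality(g, normalized=False)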
A last methodological aspect that we need to address concerns the contrast between local indicators,
such as connectivity and control, and global indicators, like integration and choice: while the latter make
reference to a space’s relationship with all other spaces within a given layout, the former take into account
only the relationships of a space to its immediate neighbours. There is a third possibility, which is consid-
ering a space’s relationship to all those which fall within a certain radius around it (Al Sayed et al., 2014,
pp. 15, 25, 114; Hillier, 1996, pp. 99–101). A space’s integration in the context of all those spaces within
three topological steps from it, for example, is referred to as integration at r = 3 or, with a more general term that
can refer to other small radii as well, as local integration (global integration, in this sense, is at r = n, while
local indicators in the strict sense are established at r = 1). While calculating integration locally offers a
useful supplement to global integration values in, e.g. a convex break-­up where it can help to establish
independent hubs of circulation, in studies of angular choice the analysis of different metric rather than
topological radii takes on even greater significance in that it promises, at least in present-­day contexts,
a means of distinguishing between factors influencing vehicular and pedestrian traffic flows (Al Sayed
et al., 2014, pp. 25, 74). How readily the latter can be transposed into different archaeological contexts
may be debatable, but an example for the usefulness of local convex integration will be given in the fol-
lowing case study.
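Computationally, restricting a measure to a radius simply means truncating the shortest-path search. A sketch of local mean depth (mine, again assuming networkx), from which local integration follows via the formulae given earlier:

import networkx as nx

def local_mean_depth(g, node, r=3):
    """Mean depth of node, counting only spaces within r topological steps."""
    lengths = nx.single_source_shortest_path_length(g, node, cutoff=r)
    del lengths[node]   # drop the node itself (depth 0)
    # note: at r = 1, len(lengths) is simply the node's connectivity
    return sum(lengths.values()) / len(lengths)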

Case study
The Late Bronze Age palace of Pylos in Western Messenia (Blegen & Rawson, 1966), one of the early
state centres of Mycenaean Greece, offers very favourable conditions for space syntax analysis in two
regards in particular: First, a careful study of changes to the building over the course of the 13th century
BC allows us to distinguish – and then analytically compare – an earlier and a later state of the building
(Nelson, 2017, pp. 360–365, Figures 4.7–4.8; Thaler, 2018, pp. 39–59; Wright, 1984; Figure 16.4(a–b)).4
Figure 16.4 The Palace of Pylos (Ano Englianos), Messenia, Greece (13th c. BCE): (a) a simplified plan of the
earlier building state with results of VGA and the shortest convex routes to throne room 6 superimposed as a
partial topological graph. (b) a simplified plan of the later building state with results of VGA and the shortest
convex routes to throne room 6 superimposed as a partial topological graph. (c) a simplified plan of the later
building state with shading indicating areas of convex spaces most easily accessible from the three different
points of access (indicated by arrows) and separate j-­graphs for access through each of the latter (grey indicating
the spaces of the service ‘wing’). (d) a simplified plan of the later building state with shading indicating areas
of convex spaces most easily accessible from the three main courts 58, 63 and 88, “A” marking the archive,
“NEB” the Northeastern Building (the presumed clearing-­house) and “P” pantries (courts assumed to be
served from these are indicated by subscript nos.). (e) j-­g raph of the later building state (grey indicating the
spaces of the service ‘wing’).
Sources: Figures 16.2(b–d), 16.3, 16.4: by the author
Second, a more impressionistic comparison of those building states has already been crucial in formulat-
ing a hypothesis that dominated our understanding of the Pylian palace complex for over two decades, i.e.
the assumption that its architectural development reflected a long-­term economic decline (Shelmerdine,
1987; Wright, 1984), in reaction to which, among other things, “changes to the palace [. . .] consistently
[. . .] restrict[ed] access and circulation” (Shelmerdine, 1987, p. 564); sometimes this has even been associ-
ated with defensive considerations in a military sense (e.g. Shelmerdine, 1998, p. 87).
Restriction of access (from outside) and circulation (within) translates readily into space syntax terms
as a significant lowering of, on the one hand, the integration value of the carrier space, i.e. the outside
of the building, and, on the other, mean integration for the entire system. The carrier space does indeed
display a noticeable, if not dramatic, loss of integration, from 0.84 to 0.69 for the convex and 1.27 to 1.11
for the axial break-­up; yet even these lowered values remain virtually identical or even slightly higher
than the mean for integration, calculated at 0.70 for the convex and 0.96 for the axial break-­up in the
later state. If the carrier is as well integrated with the building as a whole as is the average space within
it, this hardly constitutes a defensive architecture. As to circulation within, the just-cited mean values of integration hardly change at all from the earlier state, for which they are calculated as 0.72 and 1.01 in the convex and axial break-ups respectively; in fact, if one of the aforementioned alternative suggestions for
the calculation of integration values (Teklenburg et al., 1993) is followed, this minimal drop is reversed
into an (equally insignificant) rise in integration (Thaler, 2005, p. 327).
This is remarkable not only in that it clearly contradicts (one of the underlying assumptions of) the
decline hypothesis, but also when viewed against relative proportions of space types, particularly the
increase of spaces of type b, 29% in the earlier and 38% in the later building state, and the concomitant
decrease of d-­type ring-­connectors, which account for 18% of all spaces in the earlier state, but only 9%
later on. It is not ease of movement that decreased, but options of route choice; i.e. circulation was not
restricted, but rather channelled. Channelling of traffic towards distinct routes is an aspect of the grow-
ing architectural differentiation of the palace complex and is particularly evident with regard to what
might be described as its service ‘wing’: If we compare the j-­g raph for the palace (Figure 16.4(e)) and a
mapping of the areas most quickly (i.e. with the least topological steps) reached from each of its three
points of access (Figure 16.4(c)), then an area of store rooms at and around the back of the palace’s main
building stands out as a coherent subsystem with only few connections to the remainder of the complex.
A j-­graph constructed for access only through this ‘tradesmen’s entrance’ (as it was termed in another
earlier and rather perceptive study, Kilian, 1984, p. 43), i.e. omitting the two other access points, shows the
palace as a remarkably deep and inaccessible structure; clearly, the larger (and more representative) part of
the complex was not meant to be accessed from this direction.
If we look at how official visitors were meant to enter a Mycenaean palace (or, at least, the most high-­
ranking visitors, since differentiations in rights of access seem to have held great importance), there is a
canonical route through first a propylon and then a courtyard (both elements could be repeated) into
the megaron; inside the megaron itself, there was first an open porch, from which a vestibule could be accessed, which in turn led into the hearth/throne room. Although the latter, numbered as room 6 by
the excavators, was the most integrated a-­type space, thus combining an accessible/commanding position
and privacy, in both the earlier and later state of the Pylos palace, it was only in the later state that the
topologically shortest route through the convex map into the throne room came to coincide with the
canonical route just set out (Figure 16.4(a–b)); concomitantly, VGA documents a shift of visual integra-
tion within the large courtyard in front of the main building from its sides to its centre and thus towards
the propylon. This may well be a case of a specific social practice, i.e. a canonical way of approaching
the ruler’s seat, becoming embodied in the (architectural but also highly) social structure of the palace
building, which then, of course, was instrumental in perpetuating it.
Another aspect of the growing differentiation of the palace complex besides such channelling can be
found in the comparison not of routes, but of individual spaces and none are more informative in this
regard than the major courts. While the earlier building state displayed a largely undifferentiated ring of
hypaethral spaces around the main building, three distinct and separate courtyards develop there in the
course of the 13th century, courts 58, 63 and 88 of the later building state (Figure 16.4(d)). Their crucial
role in the palace complex is documented by the fact that, of all convex spaces, the three display the high-
est values for local integration (58 > 63 > 88), connectivity (58 > 63 = 88) and control (58 > 88 > 63).
Clearly, together these were the circulation hubs of the later palace, but nonetheless their roles were not
one and the same, as a comparison between 58 and 63 in particular illustrates.
Court 58 displays a significantly lower depth from the carrier space, indeed it is a mere two steps
from two access points to the palace (out of a total of three, the third being the aforementioned ‘trades-
men’s entrance’) and no route from these two entrances into the deeper sections of the palace can bypass
58. Unsurprisingly, the palace archive is nearby and what has been identified as a clearing-house for
the redistributive palace economy opens directly onto 58, which can be identified as the interface for
all official outside contacts of both political/ceremonial and formal economic nature (anything other
than deliveries of goods for consumption within the palace, it would seem). And yet, in terms of global
integration, i.e. in its relevance for circulation within the palace, 58 is eclipsed by court 63 as the most
highly integrated convex space of the entire complex. To add insult to injury, both courts can be associ-
ated with pantries containing, among other things, thousands of drinking vessels, presumably employed
during palatial feasts; but, by comparison with the kylikes apparently used in the more deeply sited and
thus apparently more exclusive court 63, those associable with 58 are of noticeably inferior quality, indi-
cating how architectural differentiation could be translated into social differentiations during specific
events hosted within the palace (Bendall, 2004). As to court 88, mapping those areas topologically more
closely associated with this courtyard rather than either 58 or 63 (Figure 16.4(d)) produces a result that is
almost completely congruent with the aforementioned mapping of primary access through the ‘trades-
men’s entrance’; court 88 was the hub of the service ‘wing’ and presumably no more than a staging area
in the context of feasts.

Conclusion
Given its underlying (and, it would appear, empirically proven) premise that a three-dimensional Euclidean space can be reduced to a two-dimensional topological one and still be meaningfully analysed in social
terms, it should not surprise us that space syntax methodology is not difficult to apply, both in general
and in particular, i.e. to archaeological contexts (cf. Hacıgüzeller & Thaler, 2014). This ease of applica-
tion is further helped, to no small degree, by the fact that the Bartlett School of Architecture has made
fairly user-­friendly analytical software available free of charge for academics, first with Alasdair Turner’s
Depthmap (Turner, 2011; cf. Turner, 2001b, 2004), which largely replaced an earlier bundle of Macintosh-­
based programmes, and more recently with Tasos Varoudis’s depthmapX (Varoudis, 2012). It is so simple
(and, indeed, takes so little understanding of the underlying procedures) to produce a plausible(-­looking)
output with just a few mouse-­clicks, that anyone planning to work with space syntax more extensively
or in some depth should perhaps consider starting with a ‘manual’ analysis or two.
Any prospective user should also keep in mind that space syntax does not produce any meaningful
results in a vacuum. I have elsewhere (Thaler, 2006, 2018, pp. 8–26) proposed an analytical framework
that uses different levels of diachronic stability as a guiding principle to meaningfully relate different
perspectives on the social definition and the archaeological documentation of architectural spaces, includ-
ing space syntax. Yet a more general and two-­fold caveat should be emphasized in the present context,
namely that both a critical assessment of the source materials, i.e. the plans intended for analysis, and a
considered contextualization of results are important and, at least in the latter case, indispensable steps in
archaeological studies employing space syntax methods. We are no longer in a position to see whether
the high integration value of space X, Y or Z in any building or settlement under analysis correlates with
actual movement patterns, but have to assume that it does based on the analogy with observations on
present-­day contexts. Similarly, we cannot question occupants on what specific spaces are used for and
therefore will have to relate analytical results to archaeological indications of space use. Finds inventories
for specific spaces are the obvious example, but the study of wall-­painting locations through space syntax
(Letesson, 2012) – and thus of a category of non-­movable, ‘diachronically stable’ finds less prone to pre-­
and postdepositional dislocation than most other finds – provides a good illustration that we need not
confine ourselves to the obvious.
That said, the fact remains – or comes into focus even more clearly – that space syntax approaches
entail high demands on archaeological data. This holds true with regard to both ‘coverage’ – understand-
ably, the analysis of incomplete building plans is not an issue widely discussed in the non-­archaeological
space syntax community – and level of detail. Some of the most exciting approaches in archaeological
space syntax research, although ones whose potential may not have been fully realized in extant case
studies, are therefore those which explicitly address the weaknesses in archaeological data quality, e.g.
by harnessing the concept of topological genotypes in order to reconstruct incomplete building plans
(Romanou, 2007), by trying to open up the large-­scale coverage of geophysical survey to space syntax
analysis (Spence-­Morrow, 2009) or by aligning space syntax perspectives with data recovery methods
that promise very fine-­g rained information on space use in built contexts, such as micro-­refuse analysis
(Milek, 2006).
In the light of these latter studies, it could be suggested that the greatest hope for space syntax in
archaeology does not lie in specialist desk-­ and literature-­based studies, though given the inherently
comparative stance of space syntax their potential contribution remains great, but in research designs for
fieldwork that take into account the needs of topological analyses of social space, not in order to ‘cater
for’ specialists, but in order to enlist a further useful tool for the detailed published study that is the aim
and raison d’être of research excavations and surveys.

Notes
1 It should be noted that the use of the term ‘depth’ in the present introductory text differs slightly – and deliber-
ately – from much of the extant literature, where a wider meaning is adopted. Hence, what is here termed simply
‘depth’ may be read as ‘depth from carrier’ in more conventional terms, whereas what in the following will be
referred to as ‘path length’ or ‘step distance’, i.e. the distance between any two nodes in a system, will often be
described simply as ‘depth’. There is a clear logic in the latter usage in that, e.g. the mean depth of (or for) a given
node will indicate how deep the system is from that node, but the more descriptive designation ‘path length’ was
felt to offer a more intuitive appreciation of methodological foundations and is thus preferred here. Correspond-
ingly, in the formulae given in the text ‘MD’ can be read as either ‘mean distance’ (in the terms chosen in this paper)
or ‘mean depth’ (in the more conventional usage). For analytical purposes, calculating the integration (cf. below)
of the carrier will often be a strong alternative to considering the mean depth (from the carrier) of a system.
2 Further illustration of this point can be seen in deliberate divergences from the strict definition of convexity in
Figure 16.3(a). While it seems clear – to the author and hopefully the reader, too – that the insertion of a staircase
in the room labelled ’1’ breaks the latter up into two separate spaces, the non-­convexity of rooms 5 and 9 was
considered too little pronounced to warrant subdividing these spaces. The most arbitrary decision was certainly
to subdivide room 6, which like room 1 houses a staircase in one corner, into two spaces in such a manner that
the larger one, room 6 a, ‘controls’ the door to corridor 3 at the cost of its strict convexity (rather than assign that
control to the ‘nook’ 6 b or sharing it between 6 a and 6 b). In the strictest sense, the floor area inside each door
opening would have to be considered a separate convex space; these connecting spaces would, coincidentally, paral-
lel the ‘connectors’ needed in the early space syntax software ‘Pesh’. Elsewhere, I have used the term “semiconvex
breakup” to indicate “a suitable compromise between the analysis of convex and bounded spaces in a complex
containing both roofed and open areas” (Thaler, 2005, p. 327; cf. Thaler, 2018), but arguably such a designation
could be considered to imply an impractically absolute concept of convexity.
3 In a similar vein, entropy, calculated through a logarithmic formula, and other measures derived from it were
studied, particularly in the context of VGA, for their potential to “give an insight into how ordered the system
is from a location" (Turner, 2001b, p. 9); but while high entropy could be associated with even distributions of
path lengths from a given space to other spaces, a low entropy value for a space could either indicate many other
spaces in close proximity or the opposite, a clustering of spaces at a distance.
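Point depth entropy takes, in essence, the Shannon form; in my notation (reconstructed from the description just given rather than quoted from the original):

$$s_i = -\sum_{d=1}^{d_{max}} p_d \log_2 p_d$$

where $p_d$ is the proportion of spaces at path length $d$ from location $i$.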
4 The case study briefly set out here is presented in more detail in: Thaler (2005, 2018, pp. 39–185) and Hacıgüzeller
and Thaler (2014). Its results are contextualized in: Thaler (2006, 2018). As also briefly discussed in n. 2, the con-
cept of convexity was applied with a deliberate degree of latitude in this case study, but corresponding terms like
‘convex space’, ‘convex break-­up’ etc. are retained in their conventional form in the present introductory text.

References
Al Sayed, K., Turner, A., Hillier, B., Iida, S., & Penn, A. (2014). Space Syntax methodology (4th ed.). London: University
College London, Bartlett School of Architecture.
Batty, M. (2004). CASA working papers: Vol. 75. A new theory of space syntax. Retrieved from www.casa.ucl.ac.uk/working_papers/paper75.pdf
Bendall, L. (2004). Fit for a king? Hierarchy, exclusion, aspiration and desire in the social structure of Mycenaean
banqueting. In P. Halstead & J. C. Barrett (Eds.), Sheffield studies in Aegean archaeology: Vol. 5. Food, cuisine and
society in prehistoric Greece (pp. 105–135). Oxford: Oxbow Books.
Benedikt, M. L. (1979). To take hold of space: Isovists and isovist fields. Environment and Planning B, 6, 47–65.
Blanton, R. E. (1994). Houses and households: A comparative study. New York, NY: Springer.
Blegen, C. W., & Rawson, M. (1966). The palace of Nestor at Pylos in Western Messenia: Vol. 1. The buildings and their
contents. Princeton: Princeton University Press.
Conroy, R. (2000). Spatial navigation in immersive virtual environments (Doctoral dissertation). Retrieved from www.
thepurehands.org/phdpdf/thesis.pdf
Dafinger, A. (2010). Die Durchlässigkeit des Raums: Potenzial und Grenzen des Space Syntax-­Modells aus sozialan-
thropologischer Sicht. In P. Trebsche, N. Müller-­Scheeßel, & S. Reinhold (Eds.), Der gebaute Raum: Bausteine einer
Architektursoziologie vormoderner Gesellschaften (pp. 123–142). Münster: Waxmann.
Dalton, N. (2001). Fractional configuration analysis and a solution to the Manhattan problem. In J. Peponis, J. D.
Wineman, & S. Bafna (Eds.), Proceedings of the 3rd international symposium on space syntax (pp. 26.1–26.13). Ann
Arbor: University of Michigan, College of Architecture & Urban Planning.
de Arruda Campos, M. B., & Fong, P. S. P. (2003). A proposed methodology to normalise total depth values
when applying the visibility graph analysis. In J. Hanson (Ed.), 4th international space syntax symposium, confer-
ence, London, 17–19 June 2003 (pp. 35.1–35.10). Retrieved from http://217.155.65.93:81/symposia/SSS4/
fullpapers/35Campos-­Fongpaper.pdf
Foster, S. M. (1989). Analysis of spatial patterns in buildings (access analysis) as an insight into social structure:
Examples from the Scottish Atlantic Iron Age. Antiquity, 63, 30–40.
Freeman, L. (1977). A set of measures of centrality based on betweenness. Sociometry, 40, 35–41.
Garfield, S. (2012). On the map: Why the world looks the way it does. London: Profile Books.
Gilchrist, R. (1988). The spatial archaeology of gender domains: A case study of medieval English nunneries. Archaeo-
logical Review from Cambridge, 7, 21–28.
Hacıgüzeller, P., & Thaler, U. (2014). Three tales of two cities? A comparative analysis of topological, visual and
metric properties of archaeological space in Malia and Pylos. In E. Paliou, U. Lieberwirth, & S. Polla (Eds.), Topoi:
Berlin studies of the ancient world: Vol. 18. Spatial analysis in past built environments: Proceedings of the international and
interdisciplinary workshop (pp. 203–262). Berlin: de Gruyter.
Hanson, J. (1998). Decoding homes and houses. Cambridge: Cambridge University Press.
Hillier, B. (1996). Space is the machine: A configurational theory of architecture. Cambridge: Cambridge University Press.
Hillier, B. (1999a). Guest editorial: The need for domain theories. Environment and Planning B, 26, 163–167.
Hillier, B. (1999b). The hidden geometry of deformed grids: Or, why space syntax works, when it looks as though
it shouldn’t. Environment and Planning B, 26, 169–191.
Hillier, B. (2007/2008). Using DepthMap for urban analysis: A simple guide on what to do once you have an analysable map
in the system. Unpublished manuscript, Bartlett School of Architecture, University College London, London, UK.
Hillier, B. (2008). Space and spatiality: What the built environment needs from social theory. Building Research &
Information, 36, 216–230.
Hillier, B., & Hanson, J. (1984). The social logic of space. Cambridge: Cambridge University Press.
Hillier, B., Hanson, J., & Graham, H. (1987). Ideas are in things: An application of the space syntax method to dis-
covering house genotypes. Environment and Planning B, 14, 363–385.
Hillier, B., & Iida, S. (2005). Network and psychological effects in urban movement. In A. G. Cohn & D. M. Mark
(Eds.), Lecture notes in computer science: Vol. 3693. Proceedings of spatial information theory: International conference
(pp. 475–490). Berlin: Springer-­Verlag. Retrieved from http://eprints.ucl.ac.uk/1232/
Hillier, B., Leaman, A., Stansall, P., & Bedford, M. (1976). Space syntax. Environment and Planning B, 3, 147–185.
Hillier, B., Penn, A., Hanson, J., Grajewski, T., & Xu, J. (1993). Natural movement: Or, configuration and attraction
in urban pedestrian movement. Environment and Planning B, 20, 29–66.
Hillier, B., Yang, T., & Turner, A. (2012). Normalising least angle choice in Depthmap: And how it opens up new
perspectives on the global and local analysis of city space. Journal of Space Syntax, 3, 155–193.
Kilian, K. (1984). Pylos – Funktionsanalyse einer Residenz der späten Palastzeit. Archäologisches Korrespondenzblatt,
14, 37–48.
Kruger, M. J. T. (1989). On node and axial grid maps: Distance measures and related topics. Paper presented at the European
Conference on the Representation and Management of Urban Change, Cambridge.
Leach, E. (1978). Does space syntax really “constitute the social”? In D. R. Green, C. Haselgrove, & M. Spriggs
(Eds.), British archaeological reports: International series: Vol. 47ii. Social organisation and settlement: Contributions from
anthropology, archaeology and geography (pp. 385–401). Oxford: BAR Publishing.
Letesson, Q. (2012). ‘Open day gallery’ or ‘private collections’? An insight on Neopalatial wall paintings in their spatial
context. In D. Panagiotopoulos & U. Günkel-­Maschek (Eds.), Aegis: Vol. 5. Minoan realities: Approaches to image,
architecture, and society in the Aegean Bronze Age (pp. 27–61). Louvain-­la-­Neuve: Presses universitaires de Louvain.
Milek, K. (2006). Houses and households in early Icelandic society: Geoarchaeology and the interpretation of social space (Doctoral
dissertation). Retrieved from www.dropbox.com/sh/qirb8c8m5o7lisa/AAB7UD7cwNIYAF9WDTJqVB_6a
Nelson, M. C. (2017). The architecture of Epano Englianos, Greece. In F. A. Cooper & D. Fortenberry (Eds.), Brit-
ish archaeological reports: International series:Vol. 2856. The Minnesota Pylos project, 1990–98 (pp. 283–418). Oxford:
BAR Publishing.
Romanou, D. (2007). Residence design and variation in residential group structure: A case study, Mallia. In R. West-
gate, N. R. E. Fisher, & J. Whitley (Eds.), Building communities: House, settlement and society in the Aegean and beyond:
Proceedings of a conference held at Cardiff University, 17–21 April 2001 (pp. 77–90). London: British School at Athens.
Sailer, K. (2010). The space-­organisation relationship: On the shape of the relationship between spatial configuration and collective
organisational behaviours (Unpublished doctoral dissertation). TU Dresden, Dresden.
Schwarzberg, H. (2002). Zu Geschichte und baulicher Entwicklung von Schloß Friedeburg im Mansfelder Land.
Burgen und Schlösser in Sachsen-­Anhalt: Mitteilungen der Landesgruppe Sachsen-­Anhalt der Deutschen Burgenvereinigung,
11, 217–238.
Seamon, D. (1994). The life of the place: A phenomenological commentary on Bill Hillier’s theory of space Syntax.
Nordisk Arkitekturforskning, 7, 35–48.
Seamon, D. (2003). Review of the book Space is the machine, by B. Hillier. Environmental and Architectural Phenomenol-
ogy, 14(3), 6–8.
Shelmerdine, C. W. (1987). Architectural change and economic decline at Pylos. Minos, 20–22, 557–568.
Shelmerdine, C. W. (1998). The palace and its operations. In J. L. Davis (Ed.), Sandy Pylos: An archaeological history
from Nestor to Navarino (pp. 81–96). Austin, TX: University of Texas Press.
Spence-­Morrow, G. (2009). Analyzing the invisible syntactic interpretation of archaeological remains through
geophysical prospection. In D. Koch, L. Marcus, & J. Steen (Eds.), Proceedings of the 7th international space syntax
symposium (pp. 106.1–106.10). Stockholm: KTH Royal Institute of Technology.
Stöger, H. (2011). Archaeological studies Leiden University: Vol. 24. Rethinking Ostia: A spatial enquiry into the urban society
of Rome’s imperial port-­town. Leiden: Leiden University Press.
Teklenburg, J. A. F., Timmermans, H. J. P., & van Wagenberg, A. F. (1993). Space syntax: Standardised integration
measures and some simulations. Environment and Planning B, 20, 347–357.
Thaler, U. (2005). Narrative and syntax: New perspectives on the Late Bronze Age palace of Pylos, Greece. In A. van
Nes (Ed.), Space Syntax: 5th international symposium (pp. 323–339). Amsterdam: Techne Press.
Thaler, U. (2006). Constructing and reconstructing power: The palace of Pylos. In J. Maran, C. Juwig, H. Schwen-
gel, & U. Thaler (Eds.), Geschichte: Forschung und Wissenschaft: Vol. 19. Constructing power: Architecture, ideology and
social practice (pp. 93–116). Münster: LIT Verlag.
Thaler, U. (2018). Universitätsforschungen zur prähistorischen Archäologie: Vol. 320. me-­ka-­ro-­de: Mykenische Paläste als
Dokument und Gestaltungsrahmen frühgeschichtlicher Sozialordnung. Bonn: Habelt.
Tilley, C. (2005). Phenomenological archaeology. In C. Renfrew & P. Bahn (Eds.), Archaeology: The key concepts
(pp. 201–207). London: Routledge.
Turner, A. (2000). CASA working papers:Vol. 23. Angular analysis: A method for the quantification of space. Retrieved from
http://discovery.ucl.ac.uk/1368/
Turner, A. (2001a). Angular analysis. In J. Peponis, J. D. Wineman, & S. Bafna (Eds.), Proceedings of the 3rd interna-
tional symposium on space syntax (pp. 30.1–30.11). Ann Arbor: University of Michigan, College of Architecture &
Urban Planning.
Turner, A. (2001b). Depthmap: A program to perform visibility graph analysis. In J. Peponis, J. D. Wineman, &
S. Bafna (Eds.), Proceedings of the 3rd international symposium on space syntax (pp. 31.1–31.9). Ann Arbor: University
of Michigan, College of Architecture & Urban Planning.
Turner, A. (2004). Depthmap 4: A researcher’s handbook. London: University College London, Bartlett School of Gradu-
ate Studies. Retrieved from http://eprints.ucl.ac.uk/2651/
Turner, A. (2007). From axial to road-­centre lines: A new representation for Space Syntax and a new model of route
choice for transport network analysis. Environment and Planning B, 34, 539–555.
Turner, A. (2011). UCL Depthmap: Spatial network analysis software (Version 10.14) [Computer software]. London:
University College London, VR Centre of the Built Environment.
Turner, A., Doxa, M., O’Sullivan, D., & Penn, A. (2001). From isovists to visibility graphs: A methodology for the
analysis of architectural space. Environment and Planning B, 28, 103–121.
Turner, A., & Penn, A. (1999). Making isovists syntactic: Isovist integration analysis. Proceedings 2nd International Symposium on Space Syntax, Universidade de Brasília, Brazil. Retrieved from www.vr.ucl.ac.uk/publications/
turner1999-­000.html
Varoudis, T. (2012). DepthmapX – Multi-­platform spatial network analysis software [Computer software]. London:
University College London and The Bartlett School of Architecture. Retrieved from http://varoudis.github.io/
depthmapX/
Wright, J. C. (1984). Changes in the form and function of the palace at Pylos. In C. W. Shelmerdine & T. G.
Palaima (Eds.), Pylos comes alive: Industry and administration in a Mycenaean palace (pp. 19–29). New York: Fordham
University Press.
Plate 13.2 Southern portion of the coastal Georgia study area: maximum available calories for all shellfish
species for the month of September (ca. 500 BP).
Plate 13.3 Southern portion of the coastal Georgia study area: returnable calories for all resources combined
for the month of January (ca. 500 BP).
Plate 13.4 Southern portion of the coastal Georgia study area: returnable calories for all resources combined
for the month of September (ca. 500 BP).
Plate 14.1 Schematic illustration of the features of an Agent Based Model (ABM) with cognitive agents, based on the model described in Lake (2000a).

Plate 14.2 Example of the realistic rendering of a simulated landscape.

Plate 14.3 Graphed Agent Based Model (ABM) simulation results which collectively illustrate several aspects
of experimental design: (a) plotted points of the same colour and k value differ due to stochastic effects alone;
(b) two different parameters σ and k are varied; and (c) two different agent rules, "CopyTheBest" and "CopyIfBetter", are explored.
Plate 14.4 Comparison of Long House Valley simulation results with archaeological evidence.

Plate 15.1 Four different network data representations of the same hypothetical Mediterranean transport
network. (a) adjacency matrix with edge length (in km) in cells corresponding to a connection; (b) node-link diagram where edge width represents length (in km); please refer to the colour plate for a breakdown by transport type (red lines = sea, green = river, grey = road); (c) edge list; (d) geographical layout. Once again, please refer to the colour plate for a breakdown of transport type.
Plate 15.2 A planar network representing transport routes plotted geographically (a) and topologically (b). A
non-­planar social network representing social contacts between communities plotted geographically (c) and
topologically (d). Note the crossing edges in the non-­planar network.

Plate 15.5 Network representation of the Orbis network: geographical layout (a, c) and topological layout (b,
d). Node size and colour represent betweenness centrality weighted by physical distance in (a) and (b), and
they represent unweighted betweenness centrality in (c) and (d): the bigger and darker blue the node, the more
important it is as an intermediary for the flow of resources in the network. By comparing (a, b) with (c, d), note
the strong differences in which settlement is considered a central one depending on whether physical distance
is taken into account (a, b) or not (c, d). Edge colours represent edge type: red = sea, green = river, grey = road.
Plate 15.6 Geographical network representation of the Orbis network: geographical layout (a) and topological
layout (b). Node size and colour represent increasing physical distance over the network away from Rome: the
larger and darker the node, the further away this settlement is from Rome following the routes of the trans-
port system. Note the fall-­off of the results with distance away from Rome structured by the transport routes
rather than as-­the-­crow-­flies distance. Edge colours represent edge type: red = sea, green = river, grey = road.

Plate 17.7 (a) The cumulative viewshed generated by summing the binary viewsheds of the 17 coin hoard loca-
tions depicted in Figure 17.4 with a maximum view radius of 6,880 m. Colours at the red end of the green-­red
scale indicate locations from which higher numbers of hoard locations are visible. (b) The total viewshed calculated for
the entire study region (in this case, the convex hull depicted in Figure 17.4 with a 500 m buffer – 8,938 view-
point locations). This encodes views-­from the individual viewpoints, where the red end of the green-­red scale
indicates those locations from which a larger area is modelled as being visible. (c) The above analysis repeated
with viewpoint/target offsets adjusted to encode views-­to the viewpoint locations.
Plate 17.8 Viewsheds generated for each of the tower-­kivas. The green zone represents the view from ground
level and the red the top of the tower. Blue dots indicate Puebloan archaeological sites in the landscapes of the
tower kivas. The radiating buffers extend for 20 km around each site – the maximum viewing range used for
the analyses (Kantner & Hobgood, 2016, Figure 3).
Plate 20.1 Multisensor (8 sensors) vertical magnetic gradient survey with SENSYS at Pella, northern Greece. Left
image indicates the original data suffering from various spikes, traverse striping and grid mismatching. Right image
indicates the results of processing that tried to remove those specific effects.

Plate 20.2 Application of the Fast Fourier Transform (FFT) power spectrum analysis of the magnetic data
obtained from the Bronze Age cemetery of the Békés Koldus-­Zug site cluster – Békés 103 (BAKOTA proj-
ect). The depth of the various targets (h) is easily determined by measuring the slope of the power spectrum
at different segments and dividing it by 4π (Spector & Grant, 1970). The radially averaged spectrum was cal-
culated and used to separate the magnetic signals coming from deep sources (h=2.87 m) and shallow sources
(h=0.73 m) below the sensor. The spectrum was also used as a guide to define a bandwidth filter in order to
eliminate the sources with wavenumber more than 550 radians/m and less than 100 radians/m respectively,
and enhance the magnetic signal coming from the potential archaeological structures.
Plate 20.6 Results of a seismic refraction survey at the area of the assumed ancient port of Priniatikos Pyrgos in East Crete,
Greece: (a) 2D image representing the depth to the bedrock, which reaches about 40 m below the current surface (bluish
colors). The black dots represent the position of the geophones along the seismic transects. The area has been completely
covered by alluvium deposits and other conglomerate formation fragments as a result of past landslide and tectonic activity.
The interpretation of the velocity of propagation of the acoustic waves revealed the spatial distribution of (b) the alluvium
deposits at the top (velocity of 491 m/sec), (c) the lower and upper terrace deposits (velocity of 1830 m/sec), (d) the medium
depth sandstones and conglomerates (velocity of 2400 m/sec) and (e) the deeper weathered limestone or cohesive conglom-
erates (velocity of 4589 m/sec) (Sarris, Papadopoulos, & Soupios, 2014).
Plate 20.7 Results of the geophysical surveys at Velestino Mati. The magnetic data (a) indicates the nucleus of the settlement
at the west top of the magoula with some expansion towards the east top. A number of high dipolar magnetic anomalies are
associated with burnt daub foundations that were also confirmed from the Electromagnetic Induction (EMI) soil magnetic
susceptibility (b) and the soil resistance data (c). Magnetic susceptibility also confirmed the existence of enclosures around
the tell.
Plate 20.8 Results of the geophysical surveys at Almyriotiki. The magnetic data (a) presented a clear image of the
internal planning of the settlement: Burnt daub structures follow a circular orientation around the top of the tell.
The houses expand further to the south, where some weaker magnetic anomalies representing stone houses with
internal divisions are also present. An irregular wide ditch system encloses the settlement from the east and the
north and it is confirmed from the EMI magnetic susceptibility (b) and soil conductivity measurements (c). The
high soil conductivity to the north coincides with an area susceptible to periodic flooding. The above were also
confirmed from the soil viscosity measurements (d) as an indicator of the soil permittivity.
Plate 20.9 Results of the geophysical surveys at Almyros 2. The magnetic data (a) depict clearly the concentration
of burnt daub structures at the centre of the tell, expanding further to the south. The settlement is surrounded by
a double ditch system, which is confirmed by both EMI magnetic susceptibility (b) and soil conductivity data (c).
A number of breaks in this double enclosure are most probably associated with multiple entrances to the settle-
ment. Soil conductivity seems also to increase outside the settlement to the south and west directions (north to
the top), namely in the area which is most susceptible to flooding.
Plate 21.5 The TimeMap Data Viewer (TMView) (from Johnson and Wilson 2003, 127).

Plate 21.6 Sample frame from an animation in the ‘Up In Flames’ study that combined synchronised animated
density graphs (produced in the R Software Environment) with animated density maps (produced in ArcGIS).
Plate 21.8 Time-­GIS (TGIS) screenshot showing dates symbolised according to temporal topology. The
colour coding is according to the temporal topological relationship between each date and the currently
selected time period.

Plate 23.1 A 3D model of the house of Caecilius Iucundus visualized through Unity 3D.

Plate 23.5 3D Model of the plaster head (21666) after conservation. The model was generated using Agisoft
PhotoScan pro version 1.2.6. with acquisition campaign and processing done by Nicoló Dell’Unto.
Plate 24.1 A citation network of results returned from a Google Scholar search of ‘geographic + visualization’,
using Ed Summers’ python package ‘Etudier’. “Google Scholar aims to rank documents the way researchers do,
weighting the full text of each document, where it was published, who it was written by, as well as how often
and how recently it has been cited in other scholarly literature.” https://scholar.google.com/intl/en/scholar/
about.html. The results give us a sense of the most important works by virtue of these citation patterns. Thus,
MacEachren, Boscoe, Hau, & Pickle (1998); Slocum et al. (2001); Brewer, MacEachren, Abdo, Gundrum, &
Otto (2000); Crampton (2002); Howard & MacEachren (1996) are most functionally important in tying
scholarship together. This is not the same thing as being the most often cited work. Rather, these are the works
whose ideas bridge otherwise disparate clumps; they are most central.

Plate 24.2 Citation analysis using Summers’ Etudier package, from a Google Scholar Search for ‘data + soni-
fication’. Colours are works that have similar patterns of citation; size are central works that tie scholarship
together. This is not the same thing as ‘most cited’. On this reading, one should begin with Madhyastha and
Reed (1995); Wilson and Lodha (1996); Zhao, Plaisant, Shneiderman, and Duraiswami (2004); De Campo
(2007); Zhao, Plaisant, Shneiderman, and Lazar (2008).
17
GIS-­based visibility analysis
Mark Gillings and David Wheatley

Introduction

What do archaeologists mean by visibility and why are they so interested in it?
Archaeologists have long recognised that the visual properties of landscape locations (and configurations
of such) were sometimes significant factors in the structuring of past activities and as a consequence
analyses of visibility have become common within landscape-­based archaeological research. Whether
carried out through rich description, simple mapping or formal modelling and statistical analysis, the
visualisation, exploration and interrogation of visual patterns have increasingly relied upon the function-
ality of GIS (Jerpåsen, 2009).
The thoroughness with which visual properties of landscape have been investigated has varied widely.
At its simplest it has taken the form of anecdote or simple description, perhaps noting that a given loca-
tion was visually-­commanding or acknowledging that visual relationships played a role in the organisa-
tion of a given landscape (e.g. Cummings & Pannett, 2005; Bongers, Arkush, & Harrower, 2012). More
methodologically informed studies have sought to interrogate the relationships observed, grounding such
investigations in explicit bodies of theory. Some have taken a broadly functionalist approach, in which
observed visual properties have been related to how the landscape operated or was used in the past. Good
examples concern questions of defensibility and the assumed inter-­visibility of watchtowers and signal-
ling systems (e.g. Gaffney & Stančič, 1991; Sakaguchi, Morin, & Dickie, 2010). Others have adopted a
more experiential stance, approaching visibility as a perceptual act carried out by an animal in an envi-
ronment. A variety of theoretical frameworks have been deployed in these latter studies, ranging from
Gibson’s theory of direct perception and his concept of affordance (Llobera, 1996 see also Webster, 1999;
Gillings, 2012) to Higuchi’s landscape theories and proposed visual indices (Wheatley & Gillings, 2000;
Ogburn, 2006; Ruestes Bitriá, 2008; Murrieta-­Flores, 2014; Williamson, 2016). Many of these studies
have drawn direct inspiration from landscape phenomenology, which seeks to investigate the relational
properties of looking and seeing in order to understand how people comprehended and engaged with the
world around them. This might involve the deliberate positioning of a sequence of monuments in order
to ‘dominate’ views or effect a gradually unfolding visual choreography of concealment and revelation,
or the careful positioning of motifs on a rock-­art panel (e.g. Tilley, 1994, 2004; Cummings & Pannett,
2005; Rennell, 2012).
From the previous paragraph, it will be clear that ‘visibility’ has been characterised by archaeologists
in a variety of ways: a property (functional, aesthetic or otherwise) inherent to certain locations; some-
thing that manifests only in a network or configuration of locations; or an act of perception on the part
of an animal in an environment. In turn these characterisations have resulted in different methodologies
and interpretative strategies, ranging from explicit attempts to simulate (or replicate) individual acts of
looking and seeing, to the identification, extraction and interrogation of visual linkages and, finally, more
abstracted analyses of global visual affordance.
Each of these engages its own set of concerns and considerations, and they often differ in terms of the
degree of quantitative rigour that is demanded of a given analysis. Compare, for example, approaches that
simply aim to map and describe visual zones (e.g. Risbøl, Petersen, & Jerpåsen, 2013) with studies that
instead seek to assess the statistical veracity of the visual patterns observed in order to explain locational
choices in the past (e.g. Lopez-­Romero de la Aleja, 2008; Lake & Ortega, 2013; Wright, MacEachern, &
Lee, 2014). The focus of analyses can also be direct, insofar as the analysis of visibility is the desired end-­
product of the research (e.g. Loots, 1997), or more circumspect, the resultant analysis merely an ingredi-
ent (or step) in a more complex programme of analysis or exploration (e.g. Llobera, 2003; Paliou, 2013).
What they all share is a series of core concerns with questions of scale, clarity, acuity and verisimilitude, with
the particular emphasis shifting as a result of the type of analysis being attempted.

Method

How is visibility analysed in the GIS?


Regardless of the theoretical reasons for undertaking visibility analyses, researchers have increasingly
turned to Geographical Information Systems (GIS) in order to carry them out (for useful reviews of the
history of this process see Lake & Woodman, 2003; Kantner & Hobgood, 2016). At the heart of any
GIS-­based study is a model of the real world, usually a Digital Elevation Model (DEM) in the form of
an altitude matrix or Triangulated Irregular Network (TIN) in which the values represent topographic
height within the terrain to be analysed.
Geocomputation, usually within a GIS, then permits determination of a line-­of-­sight (LOS) between
two points of interest, which is the basic building-­block of more complex methods. Taken individually
LOS calculations can be used to make useful assessments of inter-­visibility: for example, where two
sites are thought to have been linked in a signalling system, calculation of an LOS (if based on correct
assumptions, as outlined below) can establish whether communication was, in fact, possible. Combin-
ing many LOS determinations enables delineation and mapping of the potential field-­of-­view from a
given location – often termed a binary viewshed, or simple viewshed (Figure 17.1). For example, to map
all locations that could potentially be seen from a given signal station, you would carry out LOS cal-
culations from the ‘origin’ viewpoint (i.e. the site location) to all potential locations in the study area
landscape, resulting in the viewshed for that particular site (under a particular set of assumptions, as
will be discussed). A range of other visual properties can also be mapped, depending on the algorithm
used. These include the angle-­of-­view (in which the azimuth, or compass direction, of each line of
sight is mapped), the elevation-­of-­view (in which the angle of elevation of each LOS is mapped) or
the distance-­of-­view (in which the length of each LOS is recorded). A further property that may also
be available for mapping is Above Ground Level (AGL) where the values mapped correspond to how
much higher each invisible location would have to be in order to come into view (Gillings, 2015).
Figure 17.1 A single, binary viewshed (the hatched area) designed to explore the visual impact of a 5 m high
wooden post (the white dot) that had been erected at the Avebury monument complex in Wiltshire, England.
A substantial structure, the post had been raised in the early Neolithic period at a location that would eventu-
ally be traversed by a megalithic setting of paired standing stones. The viewshed identifies those areas of the
Avebury landscape where a 1.65 m high viewer could have theoretically seen the post.
Source: Incorporates data © Crown database right 2007. An Ordnance Survey/(EDINA) supplied service
This is effectively the complement of the viewshed (or less elegantly a ‘how-­f ar-­out-­of-­view-­shed’)
and can be used to assess how hidden locations appear to be. These basic ingredients (the LOS, the
viewshed and its variants) lie at the heart of routine GIS-­based visibility studies. One point of note is
that in urban studies, and within the field of space syntax (see Thaler, this volume), the terms 'Isovist'
and ‘Isovist Field’ have usually been used to describe maps that are functionally equivalent to the view-
shed (Benedikt, 1979; Turner, Doxa, O’Sullivan, & Penn, 2001) although sometimes calculated on a
slightly different basis.

LOS and viewshed algorithms


Whilst visibility can be calculated both on raster elevation models (digital elevation models (DEMs) and digital terrain models (DTMs)) (see Conolly, this volume) and on vector triangulated digital terrain models (such as Triangulated Irregular Networks – TINs), routine implementations of the latter are currently rare, and this perhaps explains why the vast majority of visibility applications in archaeology have
focused upon gridded elevation data.
Given this predominance we will limit our discussion of algorithms to those applicable to raster eleva-
tion models. It should be noted, however, that depending upon the quality and density of elevation points
used in their creation, TINs can be much better at capturing changes in topography and the nuances and
detail of surface terrain (Chapman, 2006, pp. 72–74; Hengl & Evans, 2009, p. 41). Whilst determining
which portion(s) of each component triangle of a triangulated surface falls within a given viewshed is a
computationally complex process, research is on-­going into the development of efficient algorithms for
the determination of continuous visibility on triangulated surfaces (e.g. De Floriani & Magillo, 1994;
Hurtado et al., 2013).
Taking a viewpoint (or series of such) and a gridded elevation model as the input, various algorithms
can be used in order to determine the LOS between a given viewpoint and other locations on the eleva-
tion model. Algorithms for calculating LOS can be usefully broken down into what are termed accurate
and approximate (or optimised) techniques (Franklin & Ray, 1994; Larsen, 2015). The former is exem-
plified by the R3 algorithm that iteratively calculates the LOS from the viewpoint to each of the cells
in the DEM (Figure 17.2(a)). If this intersects the surface en route (i.e. is obstructed by intervening terrain) the target cell is deemed to be out-of-view; otherwise the target is flagged as visible. In practice, whilst LOS that are drawn along the eight cardinal and ordinal directions from the viewpoint will cross the centres of the component grid cells (and can thus read the elevation of the intersection directly from the DEM cell value), LOS carried out in any other direction require some form of interpolation (e.g. bilinear or nearest neighbour) in order to determine the elevations of the grid lines crossed by the LOS.
If the gradient of the line linking the viewpoint to the intersection falls below that of the LOS then it has
not blocked the view and vice versa. This is termed a discrete approach insofar as the status of each cell
(in-­view or out-­of-­view) is determined on an individual basis, the final viewshed comprising the set of
all visible cells (Toma, 2012). Whilst the R3 approach is considered the most accurate, the sheer number of LOS calculations required to generate entire viewsheds (where the number of cells may be extremely high in high-resolution data sets) makes it slow in operation.
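The logic of a single R3-style LOS test is easily sketched in code. The following Python fragment is a deliberately simplified illustration of our own (not the implementation of any particular GIS package), in which nearest-neighbour sampling stands in for the interpolation just described:

import numpy as np

def line_of_sight(dem, cell_size, viewer, target,
                  viewer_offset=1.65, target_offset=0.0):
    """R3-style LOS on a gridded DEM; viewer and target are (row, col).
    The target is visible if no intervening sample subtends a steeper
    elevation gradient from the viewer than the target itself does."""
    (r0, c0), (r1, c1) = viewer, target
    z0 = dem[r0, c0] + viewer_offset
    dist = np.hypot(r1 - r0, c1 - c0) * cell_size
    if dist == 0:
        return True
    target_grad = (dem[r1, c1] + target_offset - z0) / dist
    n = 2 * int(np.hypot(r1 - r0, c1 - c0))   # sub-cell sampling steps
    for i in range(1, n):
        t = i / n
        r, c = round(r0 + t * (r1 - r0)), round(c0 + t * (c1 - c0))
        if (r, c) in ((r0, c0), (r1, c1)):
            continue                          # skip the end cells
        if (dem[r, c] - z0) / (t * dist) > target_grad:
            return False                      # view blocked en route
    return True

An AGL surface of the kind described above could be derived from the same walk, by recording for each blocked target how much target_offset would need to increase before target_grad cleared every intervening gradient.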
Whilst this is not necessarily an issue in the case of individual viewshed determinations, in the case
of Cumulative Viewshed or Total Viewshed generation (see later in this chapter) with potentially millions of
possible viewpoints, the time taken to carry out such analyses can be prohibitive. The growing availability
of large, extremely high-­resolution DEMs (e.g. through LiDAR or time-­of-­flight photogrammetry) and
the realisation of the heuristic value of complex viewshed products has led to a growing body of research
into the development of more optimised algorithms, such as R2 (and its variants), Radial Sweep, Radar,
Figure 17.2 (a) Conceptualisation of the basic line-­of-­sight (LOS) algorithm: LOS between two locations in an
altitude matrix can be established by comparing the height of each cell that intersects the line with the height
of the line at that location, interpolating where necessary. (b) Note that view-­to and view-­from are not neces-
sarily reciprocal because they represent different assumptions about the location of the viewer. An R3 viewshed
algorithm essentially repeats this calculation for every cell in the altitude matrix (except the viewpoint) and
records the result(s) for each cell.

Concentric Sweep and XDraw, often in creative combination. In each case the aim is to trade accuracy
for speed of calculation, which is why the latter approaches are termed approximate.
In brief, the R2 approach optimises R3 by reducing the number of individual LOS calculations that
need to be carried out (Figure 17.3). It does this by first running an LOS from the viewpoint to each
location on the horizon or study area boundary. It then works outwards from the viewpoint along each
of the LOS, determining the elevations of the grid intersections crossed for each grid cell adjacent to the
line. By calculating the gradients in each case the status of the intersection can be determined (in-­view
or out-­of-­view) and this is then assigned to the closest grid cell (Larsen, 2015, pp. 26–30; Kaučič & Žalik,
2002, p. 179). Where cells receive multiple approximations (as multiple LOS may pass by) it is the closest
that determines the visibility status (Franklin & Ray, 1994, p. 6).
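The shared intuition behind such optimised approaches can be conveyed by a simplified sketch of our own (for illustration only; it omits the 'closest approximation wins' bookkeeping of a true R2 implementation). Rays are cast from the viewpoint to every boundary cell, and a running maximum elevation gradient classifies every cell along each ray in a single outward pass, so that the number of LOS walks falls from one per DEM cell to one per boundary cell:

import numpy as np

def viewshed_rays(dem, cell_size, viewer, viewer_offset=1.65):
    """Approximate viewshed: one outward pass per boundary ray, tracking
    the steepest elevation gradient met so far from the viewer."""
    rows, cols = dem.shape
    r0, c0 = viewer
    z0 = dem[r0, c0] + viewer_offset
    visible = np.zeros((rows, cols), dtype=bool)
    visible[r0, c0] = True
    boundary = [(r, c) for r in range(rows) for c in range(cols)
                if r in (0, rows - 1) or c in (0, cols - 1)]
    for r1, c1 in boundary:
        n = 2 * max(abs(r1 - r0), abs(c1 - c0))
        max_grad = -np.inf
        for i in range(1, n + 1):
            t = i / n
            r, c = round(r0 + t * (r1 - r0)), round(c0 + t * (c1 - c0))
            if (r, c) == (r0, c0):
                continue
            grad = (dem[r, c] - z0) / (np.hypot(r - r0, c - c0) * cell_size)
            if grad >= max_grad:
                visible[r, c] = True          # nothing nearer blocks it
            max_grad = max(max_grad, grad)
    return visible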
In much the same way as R2, Sweep (Haverkort, Toma, & Zhuang, 2008; van Kreveld, 1996) and Radar
(Ben-­Moshe, Carmmi, & Katz, 2008) approaches rely upon approximating the visibility status of cells
along the line of a given LOS, in this case rotating the LOS around the selected viewpoint like the second
hand of a clock. Many other algorithms exist, both variants of the above (such as Distribution and Concentric sweeping) and innovative horizon-based approaches such as the XDraw algorithm (Franklin & Ray, 1994; Wang, Robinson, & White, 2000), and it is important to be aware that research into optimised
viewshed algorithms continues apace (e.g. Izraelevitz, 2003; Xu & Yao, 2009; Ferreira, Andrade, Magal-
hães, Franklin, & Pena, 2014). The key point to stress here is that different algorithms are available that –
in an attempt to balance speed of computation with accuracy – will produce output of varying quality. In
any given GIS analysis it is therefore important to be aware of the particular strengths and weaknesses of
the algorithm being deployed. Unfortunately, outside of artificial (and carefully controlled) test datasets,
assessing the accuracy of a given LOS determination or viewshed can be extremely difficult and this has
been exacerbated by a tendency for analysts to treat one algorithm (invariably black-­boxed but usually an
R3 implementation) as a de-­facto standard against which optimised algorithms can be tested (e.g. Fisher,
1993; Kaučič & Žalik, 2002). As we discuss below, GIS-based visibility determinations are perhaps best treated as probabilities established using imperfect models rather than actualities. As a consequence, field testing of results is still strongly recommended; whilst such testing is very much in the spirit of approaches like landscape phenomenology, it also presents difficulties, for reasons that will be discussed next.

Figure 17.3 The concept of the R2 viewshed algorithm which operates by: (a) generating a 'view horizon' (the white box), noting the visibility of each cell on that horizon, and storing the elevation angle of view to the observer a1, a2 etc.; (b) expanding the horizon by one cell and calculating new angles of view B for each cell on the new view horizon; (c) the angle of intersection with the previous horizon A is inferred (from a2 and a3) and the new angle B is compared with A to determine whether a new cell is visible or not.

2 or 3D?
In landscape and urban studies, visibility has traditionally been mapped as a 2D property of landscape in
the form of a map (generally stored as a 2D matrix). More recently, there has been a growing interest in
the investigation of the 3D properties of visibility fields. At its simplest, this has acknowledged the vertical
dimension of landscape through the analysis of Above Ground Level (AGL) metrics and the calculation of
vertical visibility indices (Nutsford, Reitsma, Pearson, & Kingham, 2015); however, these are still recorded as 2D matrices of values, albeit related to 3D properties of space.
In highly complex environments, such as the interior of buildings or urban settings, this simplification
can be too restrictive and 3D matrices of visibility values which record variation on three, rather than
two, axes become important. Instead of an altitude matrix, fully 3D methods analyse 3D models which
can represent the complex forms of rooms and buildings, permitting the representation of archways, for
example. This has led to the development of the 3D isovist (e.g. Suleiman, Joliveau, & Favier, 2013) and
to visibility analyses conducted within 3D modelling systems. A good example can be found in the work
of Paliou (2011, 2013, 2014), in which the visibility of wall paintings within complex buildings and from
the outside of buildings (through windows and doorways) was formally analysed to gain insights into the
consumption of Theran mural art.

Issues relating to the elevation model


Viewshed and LOS calculations are sensitive to imperfections and errors in the DEM, particularly where these lie very close to the viewing point, or on ridges or crests where they may open or close large vistas beyond them. To minimise this, some workers recommend filtering or smoothing the DEM
before carrying out visibility analyses (Reuter, Hengl, Gessler, & Soille, 2009), although it is not clear
that this improves the absolute accuracy of the result. Many kinds of viewshed or visibility calculations
(particularly cumulative or total viewsheds, see below) also suffer from ‘edge effects’. These tend to occur
when viewpoints close to the edge of a study area (i.e. edge of the DEM) result in LOS or viewshed
estimates that are artificially truncated (Figure 17.4). The simplest solution is to ensure that the DEM is sufficiently buffered to negate such effects.

Figure 17.4 Buffering to avoid edge effects. The map depicts a group of Roman coin hoards in the Don Valley in northern England. A series of visibility analyses were carried out in order to determine whether the hoards were preferentially placed in relatively concealed (or hidden) locations. (a) In the centre is the convex hull bounding the group of hoard locations. (b) Assuming a maximum view radius of 3,440 m (corresponding to Ogburn's (2006) limit of normal 20/20 vision for a 1 m wide object) we would need to process the area included in this buffer to avoid edge effects. (c) If we increased this to 6,880 m (the limit of human acuity for a 1 m wide object) we would need to extend our processing area accordingly – in this case to the outer buffer. Hoard data taken from Bland et al. (2019).
Source: (Incorporates data © Crown database right 2007. An Ordnance Survey/(EDINA) supplied service.)

Figure 17.5 Using a scaling factor to compensate for edge effects.

For example, if your maximum viewing range is set to
6,880m – the absolute limit of human resolution and recognition acuity under ideal conditions (Ogburn,
2006, p. 410) – the DEM used to carry out the analysis needs to cover an area equivalent to the study area
plus a 6,880m-­wide buffer, and results in the buffer zone can be discarded. An alternative to buffering
would be to use a scaling factor to account for the loss of available area as we approach the edge of the
region. For example, we could apply a factor ranging from 1 (for viewpoint locations with no edge effect)
through 0.5 (for locations on the edge of the DEM) to 0.25 (for those in the corners). It should be noted
that this would only offer a partial solution as whilst it would correct the scaling error, the accuracy of
the parameter estimates would still be compromised (Figure 17.5).
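The scaling factor itself is easily estimated numerically as the fraction of the maximum-radius view disc that falls inside the DEM. A minimal sketch (the function name is ours; coordinates are in map units):

    import numpy as np

    def edge_scaling_factor(viewpoint, extent, radius, n=10_000, seed=0):
        # Fraction of the view disc around viewpoint (x, y) lying inside the
        # rectangular DEM extent (xmin, ymin, xmax, ymax):
        # ~1.0 = no edge effect, ~0.5 on an edge, ~0.25 in a corner.
        rng = np.random.default_rng(seed)
        r = radius * np.sqrt(rng.random(n))       # sqrt gives uniform area density
        theta = rng.random(n) * 2.0 * np.pi
        x = viewpoint[0] + r * np.cos(theta)
        y = viewpoint[1] + r * np.sin(theta)
        xmin, ymin, xmax, ymax = extent
        inside = (x >= xmin) & (x <= xmax) & (y >= ymin) & (y <= ymax)
        return inside.mean()

Dividing a viewpoint's total-viewshed count by this factor corrects the scaling error but, as noted, cannot restore information about terrain that was never processed.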
It is important to understand that the crisp, binary viewshed (or definitive LOS determination) is a
simplified model of visibility within a virtual environment, and so we should perhaps regard the results of
such analyses more as theoretical possibilities than definitive statements of in-­view or out-­of-­view. One
particularly elegant solution to this was proposed by Fisher (1994, 1995) through the notion of the probable
and fuzzy viewshed though these important ideas have not gained traction within archaeological studies
(for notable exceptions see Nackaerts & Govers, 1997; Ruggles & Medyckyj-­Scott, 1996). Fisher set out a
probabilistic model of the errors in DEMs, including a term for spatial autocorrelation (because errors in DEMs are spatially correlated), and advocated using a Monte Carlo approach involving repeated vis-
ibility determinations on different simulated DEMs. The resulting viewshed estimates were then summed
to produce the final output in the form of mapped view probability (the ‘probabilistic viewshed’) (Fig-
ure 17.6). Fuzzy viewsheds, according to Fisher’s definition, employ fuzzy set theory to incorporate acuity
effects (Loots, Nackaerts, & Waelkens, 1999) (see de Runz & Fusco, this volume). Rather than being clear
cut, the edges of a given viewshed are instead graded between 1 and 0 depending upon distance from the
viewpoint. Needless to say, probable and fuzzy viewshed analyses can also be combined.
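Fisher's Monte Carlo procedure is straightforward to express in outline. In the sketch below, binary_viewshed stands for any binary viewshed function (such as one built on the LOS test sketched earlier), and spatial autocorrelation in the error field is induced by a crude box smooth standing in for Fisher's more formal error model; all names are ours.

    import numpy as np

    def smooth(field, passes=3):
        # crude box smoothing to make white noise spatially correlated
        # (np.roll wraps at the grid edges; acceptable for a sketch)
        for _ in range(passes):
            field = (field + np.roll(field, 1, 0) + np.roll(field, -1, 0)
                     + np.roll(field, 1, 1) + np.roll(field, -1, 1)) / 5.0
        return field

    def probabilistic_viewshed(dem, viewpoint, binary_viewshed,
                               rmse=3.0, n_sims=100, seed=0):
        # sum binary viewsheds over simulated DEMs -> per-cell view probability
        rng = np.random.default_rng(seed)
        total = np.zeros(dem.shape)
        for _ in range(n_sims):
            noise = smooth(rng.normal(0.0, 1.0, dem.shape))
            noise *= rmse / noise.std()          # rescale to the stated RMSE
            total += binary_viewshed(dem + noise, viewpoint)
        return total / n_sims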
A further issue is that the vast majority of the DEMs on which we base our analyses reflect contemporary topography, which may have changed (through e.g. erosion, deposition or later activity) from the period we are interested in.

Figure 17.6 (a) A binary viewshed generated from the prehistoric post setting at Avebury depicted in Figure 17.1 (circled in white) shown over a shaded relief model. (b) A probabilistic viewshed calculated from the same location (digital elevation model (DEM) errors are modelled as normally distributed with a root mean square error (RMSE) of 3 m). White areas represent 100% probability, with the probability declining as the shading becomes darker.
Source: (Incorporates data © Crown database right 2007. An Ordnance Survey/(EDINA) supplied service.)

DEMs generally also lack the vegetation which may significantly reduce the
level of visibility possible within a landscape, and which can introduce a strong seasonal dynamic. Whilst
contemporary vegetation may be represented in LiDAR data (for example) or can be added to a ‘bare’
DEM (as a series of height stands), it cannot model sufficient detail to identify where views through sparse
stands of trees are possible (particularly in winter), for example. A more obvious problem is not knowing
where, exactly, vegetation grew in antiquity. Here a Monte Carlo approach similar to that used by Fisher
for modelling DEM errors (see earlier) may be fruitful: a stochastic model of the vegetation distribution
can be used to simulate many different modified DEM-­plus-­vegetation surfaces, and the results summed
to produce an estimate of the probability of view, given the vegetation model.

Enriching basic LOS and binary viewshed determinations


In modelling LOS or viewsheds in a GIS, it is also important to remember that something, or someone,
needs to be doing the looking. As a result, a number of variables need to be considered and controlled,
and most implementations of LOS and viewshed functions therefore permit considerable variability in
the specific model. The physics of light propagation over longer distances means that algorithms should account for factors such as the curvature of the earth, while specific environmental conditions (fog, mist, haze) may place restrictions on the level of visibility and may change with time of day or season (requiring an estimate of the refractive index of the environment to be input).
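The magnitudes involved are easy to check with the standard surveying correction: treating the earth as a sphere of radius R and partially offsetting curvature with a refraction coefficient k (a conventional value of around 0.13 is assumed here; it varies with atmospheric conditions), a target at distance d appears to drop by roughly (1 − k)d²/2R.

    def curvature_refraction_drop(distance_m, k=0.13, earth_radius_m=6_371_000):
        # apparent drop (m) of a distant target due to earth curvature,
        # partially offset by atmospheric refraction (coefficient k)
        return (1.0 - k) * distance_m ** 2 / (2.0 * earth_radius_m)

    print(curvature_refraction_drop(6880.0))   # ~3.2 m at the 6,880 m acuity limit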
The most obvious parameters that need to be considered are elevation offsets to represent the height
of the viewing and observed locations. For example, to generate a viewshed for the region visible from
a walkway atop a defensive structure would require an offset to be entered for the viewer height at the
origin location, which will in this case be an estimate of the eye-­height of an individual plus the height
of the walkway they are standing on (e.g. Mitcham, 2002). It is also important to remember that LOS
determinations cannot be assumed to be reciprocal (Figure 17.2(b)) so that, to extend the earlier example,
if we are interested instead in establishing where in the landscape a guard on the walkway could be visible
from, we would also need to enter an offset for every other location (i.e. each other cell in the DEM) to
represent the eye-­height of the potential viewer looking towards the walkway. This issue was recognised
in some of the earliest systematic archaeological studies of visibility (e.g. Fraser, 1983) and confirmed in
a GIS context by Fisher (1996, p. 1298). Loots (1997) even proposed that two distinct terms were needed
to describe viewsheds depending on what was being modelled: projective (views-­from) and reflective
(views-­to) though this terminology does not appear to have been widely adopted. A conventional and
widely used estimate of the height of the human eye above the ground is 1.65m, although that should be
considered for each context as, clearly, it may not be an appropriate figure (or may not be the only appro-
priate figure) for a given human population, and would be inappropriate for modelling, for example, the
visual field of children or seated viewers. In some circumstances, we may also wish to model the direction
in which the viewer is looking. If not instructed otherwise, most algorithms will rather artificially assume
that a viewer will rotate through the full 360 degrees whilst viewing at all possible elevations of the head,
much like a terrestrial laser scanner. Needless to say, people rarely ‘look’ in that way. Fortunately, viewing
angles in the horizontal and vertical planes can usually be controlled.
Acuity is more difficult to model, despite the fact that a number of robust metrics are available to
allow the limits of vision to be established for different distances and scales of target object (Wheatley &
Gillings, 2000; Ogburn, 2006). Being able to see something and being able to recognise what it is that
you are looking at are not necessarily the same thing and we must assume that the quality of eyesight in
the past would have varied in much the same way that it does today. As we have already noted, a host of
dynamic effects such as the weather and air quality can equally impact upon the distance that we can see
and the maximum viewing distance we establish can have a marked impact on the final results generated
(Figure 17.6). It is also the case that there is more to acuity than simply distance, as some targets are easier
to see than others (large, distinctively coloured, moving versus small, camouflaged and still). Although
there are methods for accounting for some of these factors (e.g. the ‘Fuzzy Viewsheds’ discussed earlier),
such factors bring into question the usefulness of many simple binary visibility analyses. With the excep-
tion of horizons, fields of view do not abruptly end, and the precise viewshed possible on different days
and for different viewers will vary.
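As an illustration of the kind of grading involved, membership in a fuzzy viewshed might decline with distance roughly as follows (the particular decay function and the b1/b2 parameters are illustrative choices, not a prescription: b1 is the radius within which visibility is treated as certain, and b2 controls how quickly membership falls towards 0 beyond it):

    import numpy as np

    def fuzzy_membership(distance, b1=1000.0, b2=2440.0):
        # grade in-view cells from 1 (certain) towards 0 with distance (m)
        d = np.asarray(distance, dtype=float)
        mu = 1.0 / (1.0 + ((d - b1) / b2) ** 2)
        return np.where(d <= b1, 1.0, mu)

A fuzzy viewshed is then simply the binary viewshed multiplied, cell by cell, by the membership value for each cell's distance from the viewpoint.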
The upshot of all of this is twofold. First, all LOS and viewshed calculations must be carefully con-
sidered, and then the parameters modelled through the appropriate selection of variables such as offsets
and maximum view distances. Second, great care should be taken in the selection and preparation of the
DEM upon which they are based.

Scaling up – building more complex methods


Analysis of single locations can be surprisingly informative, but more complex methodologies can
also be built up by combining the calculations for groups of viewpoints or even entire landscapes.
In practice, this usually begins by summing viewsheds that have been generated from each location
of interest to create a new map. That map may be a super-set of the viewsheds, in which cells that can see (or can be seen from) one or more of the locations of interest are identified (often called a Multiple Viewshed), or a numerical summary in which each cell encodes the frequency of locations of interest that are in-view. A consistent terminology for these has not emerged although, where the number of viewsheds being summed is relatively small (e.g. fewer than 100) and they relate to a set of related locations, these tend to be referred to as Cumulative Viewsheds (Wheatley, 1995; though see Lake,
Woodman, & Mithen, 1998) and statistical analysis of these as Cumulative Viewshed Analysis (CVA)
(see Figure 17.7).
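Computationally the summation is trivial once binary viewsheds are available. A minimal sketch (binary_viewshed is again an assumed helper):

    import numpy as np

    def cumulative_viewshed(dem, viewpoints, binary_viewshed):
        # frequency grid: how many viewpoints can each cell see / be seen from
        cum = np.zeros(dem.shape, dtype=int)
        for vp in viewpoints:
            cum += binary_viewshed(dem, vp)
        return cum

    # A Multiple Viewshed is then just: cumulative_viewshed(...) >= 1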
Where the goal is to obtain the visibility characteristics of an entire landscape such that, ideally, each
cell in the DEM is iteratively treated as a separate viewpoint, a range of terms have been proposed for the
resulting map: Total Viewsheds (Llobera, 2003); Inherent Viewsheds (Llobera, Wheatley, Steele, Cox, &
Parchment, 2010); visual exposure density (Berry, 1993, p. 169); visibility index (Olaya, 2009, p. 157);
viewgrid, dominance-­viewgrid (Lee & Stucky, 1998, p. 893); affordance viewshed (Gillings, 2009); vis-
ibility fields (Eve & Crema, 2014); and visibility surfaces (Caldwell, Mineter, Dowers, & Gittings, 2003).
Incidentally, the same operations can be carried out with AGL layers in order to produce maps of summed
‘hiddenness’ in relation to the sample of viewpoints. The utility of the “Affordance Viewshed” (to pick
one) is twofold – it provides a visual and quantitative summary of some visual property of the landscape
(usually area-­of-­view) that can reveal subtle patterns in the opportunities afforded by the landscape with
respect to visibility. It also represents the statistical population of that property against which hypothesis
testing of groups of locations is possible, enabling significant patterns to be more rigorously identified
(Figure 17.7).
Taking inspiration from Graph theory, an alternative approach is to integrate multiple LOS deter-
minations into a visibility network (e.g. De Montis & Caschili, 2012; Brughmans, Keay, & Earle, 2015;
Brughmans, Waal, Hofman, & Brandes, 2018; Van Dyke, Bocinsky, Windes, & Robinson, 2016; for a
detailed methodological discussion see Brughmans & Brandes, 2017). In these studies individual loca-
tions (viewpoints) are linked by edges if an LOS exists between them. In the same way as viewsheds these
viewpoints can correspond to particular sites of interest or simply locations in the landscape. The edges
linking inter-­visible points can be coded in terms of factors such as direction and can also be weighted.
Once generated, the configuration and density of the resultant network can then be analysed to explore
factors such as centrality, neighbourhood size, degree of clustering and mean shortest path length. The
networks can also be used to derive second-­order graphs (e.g. where locations lacking direct LOS can be
connected if they share other visible locations) (O’Sullivan & Turner, 2001; Brughmans & Brandes, 2017,
pp. 7–9). As such they offer a powerful set of heuristics and an intriguing alternative to more standard
raster approaches.
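With a general-purpose graph library, assembling such a network from pairwise LOS tests is compact. A sketch using NetworkX (the line_of_sight function is assumed, e.g. the one sketched earlier; with equal offsets at both ends the relation is symmetric, whereas unequal offsets call for a directed graph and a test in each direction):

    import itertools
    import networkx as nx

    def visibility_network(dem, sites, los_fn, **los_kwargs):
        # undirected graph with an edge wherever two sites are intervisible
        g = nx.Graph()
        g.add_nodes_from(sites)
        for a, b in itertools.combinations(sites, 2):
            if los_fn(dem, a, b, **los_kwargs):
                g.add_edge(a, b)
        return g

    # g = visibility_network(dem, sites, line_of_sight, target_offset=1.65)
    # nx.degree_centrality(g); nx.betweenness_centrality(g)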

What can you do with viewsheds?


A single LOS can serve to address very specific questions of the form ‘can X see (or be seen by) Y’. In
the case of individual viewsheds this can be extended to ‘what can be seen from (or see) X’. Cumulative
viewsheds and visibility networks extend the range of questions possible to include variations on ‘how
often can a location see or be seen’.
Cumulative Viewshed Analysis can be used to establish whether patterns of inter-­visibility are statisti-
cally significant, or whether two groups of locations within a landscape differ significantly from each
other with respect to their visibility characteristics. Typically, this is done by comparing the Cumulative
Viewshed generated from the viewpoints of interest against a random sample of non-­site locations (e.g.
Wheatley, 1995; Lake et al., 1998; Bongers et al., 2012; Wright et al., 2014).
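In outline, such a comparison often reduces to a simple Monte Carlo significance test: draw many random groups of background locations of the same size as the site sample, compute the same visibility summary for each, and see where the observed sites fall in the resulting distribution. A hedged sketch (names are ours; NoData and buffer cells should be masked out of the grid before calling):

    import numpy as np

    def cva_test(visibility, site_cells, n_sims=999, seed=0):
        # one-sided Monte Carlo test: do site cells have unusually high
        # values in the visibility grid (e.g. a cumulative or total viewshed)?
        rng = np.random.default_rng(seed)
        vals = visibility.ravel()
        observed = np.mean([visibility[r, c] for r, c in site_cells])
        k = len(site_cells)
        sims = np.array([vals[rng.integers(0, vals.size, k)].mean()
                         for _ in range(n_sims)])
        p = (1 + np.sum(sims >= observed)) / (n_sims + 1)
        return observed, p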
Figure 17.7 (a) The cumulative viewshed generated by summing the binary viewsheds of the 17 coin hoard
locations depicted in Figure 17.4 with a maximum view radius of 6,880 m. Colours at the red end of the
green-red scale indicate locations from which higher numbers of hoards are visible. (b) The total viewshed
calculated for the entire study region (in this case, the convex hull depicted in Figure 17.4 with a 500 m buf-
fer – 8,938 viewpoint locations). This encodes views-­from the individual viewpoints, where the red end of the
green-­red scale indicates those locations from which a larger area is modelled as being visible. (c) The above
analysis repeated with viewpoint/target offsets adjusted to encode views-­to the viewpoint locations. A colour
version of this figure can be found in the plates section.
Source: (Incorporates data © Crown database right 2007. An Ordnance Survey/(EDINA) supplied service.)

There are also a host of descriptive properties that can be extracted from viewsheds in order to furnish
information about the perceptual character of a given view: for example, shape, compactness, directionality, eccentricity and degree of fragmentation, to name but a few (e.g. Aguiló & Iglesias, 1995; Wheatley &
Gillings, 2000; Llobera, 2003; Trick, 2004). In the case of Cumulative Viewsheds and Total Viewsheds,
these can be treated as visibility surfaces and subjected to a range of geomorphometric analyses in order
to extract descriptive parameters such as roughness and texture (Olaya, 2009; Gillings, 2015). Neighbour-
hood based analyses are also possible in order to move beyond discretely bounded viewpoints to consider
instead the visual properties of chunks of landscape (Brughmans, van Garderen, & Gillings, 2018).
As well as being end-products in their own right, viewsheds can be fed forward into GIS-based studies of other phenomena. When applied to the entire landscape, Total Viewsheds offer what is in effect a location-independent global index of visibility that can serve as a key ingredient in the study of landscape
affordances such as visual prominence (Llobera, 1996, 2001; Gillings, 2009; Bernardini, Barnash, & Wong,
2013), concealment (Gillings, 2015) and visual exposure (Llobera, 2003). They can also enrich analyses
of properties not immediately thought of as visual such as liminality (e.g. Gillings, 2017) and movement
(e.g. Murrieta-­Flores, 2014), the latter through the use of viewsheds as frictions in the generation of
cost-­surfaces and least-­cost pathways (Lee & Stucky, 1998; Lu, Zhang, & Fan, 2008; Lock, Kormann, &
Pouncett, 2014; see also Herzog, this volume).

Case study

Visibility analysis in action: a GIS-based viewshed analysis of Chacoan tower kivas in the US Southwest
The following case study is drawn from the work of Kantner and Hobgood (2016). For further detail
on this topic, including a complementary set of GIS analyses, interested readers should also refer to the
work of Van Dyke et al. (2016). The study centres upon the Puebloan landscapes of the American
Southwest with particular focus upon tower-­kiva structures (stacked circular rooms with as many as
four storeys) and their associated monumental great houses. Dating to the late 11th century AD, these
monument complexes were often constructed in high, visually prominent locations in desert land-
scapes characterised by dramatic landforms (mountains and mesas) and long uninterrupted views (Van
Dyke et al., 2016, pp. 205–206). To date, interpretations of the tower kivas have tended to follow one
of two paths, drawing upon the monumental character of the structures, general location and their
pronounced verticality. These paths can broadly be termed functional and symbolic-­ceremonial. Utilitarian
approaches have argued that they served as defensive structures or elements of a long distance signal-
ling system (in each case locations to be looked out from). The second group of interpretations have
argued instead that these dramatic structures served to visually enhance the status of the settlements
with which they were attached, directly referencing wider cosmological and symbolic concerns (i.e.
designed to be looked at). Given that visual properties, characteristics and capacities lie at the heart of
all of these explanatory schemata, it is perhaps not surprising that GIS-based visibility studies have been
used in order to evaluate them. In the following discussion, as well as outlining the methodologies
employed by the researchers and results gained, we also include some thoughts as to how the analyses
might be enhanced or further extended.
Kantner and Hobgood analysed a 2,500 km² study area containing two tower kivas – Kin Ya'a and Haystack – as well as over 2,000 other archaeological sites (Figure 17.8). A range of viewsheds were generated with a maximum viewing distance of 20 km, established on the basis of visual acuity multipliers as the maximum range at which a substantial piece of architecture such as a great house or tower kiva would be discernible.

Figure 17.8 Viewsheds generated for each of the tower-kivas. The green zone represents the view from ground level and the red the top of the tower. Blue dots indicate Puebloan archaeological sites in the landscapes of the tower kivas. The radiating buffers extend for 20 km around each site – the maximum viewing range used for the analyses (Kantner & Hobgood, 2016, Figure 3). A colour version of this figure can be found in the plates section.

To evaluate the signalling station hypothesis, intervisibility between the two
sites was assessed. In practice two viewsheds were generated for each site, one based upon a 1.7m high
viewer on the ground surface and the second placing the same viewer on top of the tower, reconstructed
to a height of 12m (i.e. 13.7m in total). The complete lack of either direct intervisibility or overlap
between the viewsheds generated (where a relay station could have been placed) led the researchers to
conclude that signalling was not a function.

POINTS TO CONSIDER: it could be argued that the problem of reciprocity would have been better
addressed by also controlling target height offsets – for example, where a viewer on a tower (i.e.
viewing location offset = 13.7m) could see a viewer on a tower (i.e. target offset also of 13.7m). Given
the historically documented use of smoke as a signalling mechanism, a sensitivity analysis would also
have been prudent. This could have taken the form of repeated viewshed calculations with increasing
target heights to ascertain how high a smoke plume would have had to rise in order to have been
visible. Incidentally, the same information could be extracted directly from an AGL layer.

To analyse the defensive hypothesis, a sensitivity analysis was carried out to assess the extent to which
vertical height enhanced the location’s ability to see over long distances (thus providing critical early-­
warning of any approach). In practice a series of viewsheds were generated with an incremental increase
in viewing location height from 1.7m (i.e. on the ground) to 17.6m in steps of 2m. By measuring the area
of viewshed within a series of radiating 1km bands (Buffers) around each site the results demonstrated
that beyond a distance of 7km (Kin Ya’a) and 9km (Haystack) increases in maximum viewing distance
were negligible, suggesting that defence was not the driver for constructing the tower kivas.
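Such a sensitivity analysis reduces to a loop over viewer heights. The sketch below is illustrative only: binary_viewshed and the precomputed distance grid (the distance of every cell from the site) are assumed helpers, and the signatures are ours.

    import numpy as np

    def band_areas(dem, site, viewer_offset, binary_viewshed, distance,
                   band_width=1000.0, max_radius=20000.0, cell_area=100.0):
        # viewshed area (m^2) within radiating distance bands around the site;
        # cell_area=100.0 assumes 10 m cells
        vs = binary_viewshed(dem, site, viewer_offset=viewer_offset)
        edges = np.arange(0.0, max_radius + band_width, band_width)
        return [np.sum(vs & (distance >= lo) & (distance < hi)) * cell_area
                for lo, hi in zip(edges[:-1], edges[1:])]

    # for h in np.arange(1.7, 17.8, 2.0):      # 2 m steps, as in the case study
    #     print(h, band_areas(dem, site, h, binary_viewshed, distance))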

POINTS TO CONSIDER: no detail was given with regard to target offsets – i.e. what observers
in the tower kivas were expected to provide early warning of. Were they looking for approaching individuals (on foot or horseback, for example) or the dust clouds generated by groups of them?
Either way, appropriate target heights could have been factored into the analyses. Likewise, the
maximum viewing distance could have been calibrated to reflect the maximum range at which a
person could be discerned against the background. The assumption is that 20km was used again,
but how likely is it that objects 20m in height and 10m in width would have been sneaking up on
the tower kiva?

To investigate the question of status, and enhanced visual presence, views-­to the tower kivas were inves-
tigated and cross-­referenced against the number of contemporary sites in the surrounding landscape.
The aim was to determine to what extent the tower-­structures were a prominent part of daily life in the
landscape; i.e. afforded a tangible visual presence. The results were argued to demonstrate that construc-
tion of the tower kiva resulted in 34% more sites within 5km of Kin Ya’a being able to see the great house
complex. At Haystack the figure was 12.5%. In each case increases beyond 5km were extremely low. This
result was used to argue that the role of the tower kivas was to enhance the visibility of the monumental
great house complex within its immediate community, thus reinforcing its status as a social focus and
political centre. Indeed, the authors likened them to minarets or church steeples.

POINTS TO CONSIDER: once again offsets and acuity are key factors that are not described in the
case-­study. There are also other ways in which local visual prominence could have been assessed.
For example, using the workflows described by Llobera (1996) and Bernardini et al. (2013) or
perhaps a views-­to Total Viewshed to assess the extent to which the locations of the monument
complexes were more (or less) visible than any other locations in the immediate landscape. Is
there preferential clustering of sites in certain parts of the visual envelope? Does the envelope itself
have any inherent directionality? At present the population of sites surrounding the tower kivas are
also presented as a uniform, homogenous block (all sites – dots on the map – are the same) with no
sense of chronology or temporality (they were all seemingly in place and active at the same time).
A network approach here would enable the researchers to determine precisely which kinds of site
make up the 34% and 12.5% increases effected by the tower kiva’s height. This approach has been
taken by Van Dyke et al. (2016) in their analyses of great house location through the creation of
inter-visibility networks they term 'viewnets'. It would also be prudent to take a careful look
at sites that are spatially close yet visually excluded.

This research demonstrates nicely the value of GIS-based approaches to the carefully structured analysis and exploration of visibility. A series of hypotheses, grounded in clear bodies of archaeological theory, has been posited and carefully explored. The questions and suggestions we have raised as to where the study might go next are designed to stress the key point that, rather than an end in themselves, the results of GIS-based visibility studies are always best thought of as the first stage in the analytical process, generating as many provocative and productive questions as they answer.

Conclusion

Looking ahead
Whilst the full range of possible visibility heuristics was defined at an early stage in the development of GIScience (e.g. Nagy, 1994; Aguiló & Iglesias, 1995), the traditional barrier to realising the full potential of computational approaches to visibility has been the time it takes to generate them. With the introduction of optimised algorithms and more efficient architectures this has now been overcome. Archaeology has also
worked hard to address early criticisms of visibility analyses from within the discipline itself that highlighted
the lack of any robust theoretical (or indeed archaeological) rationale for carrying them out (e.g. Rennell,
2012; Brughmans et al., 2015; Gillings, 2017). Whilst the research field is currently a vibrant one, we would
argue that three essential developments need to take place if GIS-­based visibility research in archaeology is to
reach its full potential. First, we need to unite what has been a rather fragmented field (based upon 30 or so
years of often piecemeal proof-­of-­method experiments and developments) into a single place with a coher-
ent nomenclature. In this way we will at the very least shift the balance between true innovation and the
repeated re-­discovery of the same good ideas more towards the former (Gillings, 2017). Second, the veracity
of the myriad of tweaks and refinements to simple, binary visibility analyses that have been proffered need
to be evaluated and assessed. Finally, in order to prevent analyses from becoming formulaic and limiting, the
trend towards treating the results of a given visibility analysis as merely the first stage in a process of analysis
rather than the end-­product, needs to be embraced and encouraged.

References
Aguiló, M., & Iglesias, E. (1995). Landscape inventory. In E. Martínez-­Falero & S. González-­Alonso (Eds.), Quantita-
tive techniques in landscape planning (pp. 47–85). Boca Raton: CRC.
Benedikt, M. L. (1979). To take hold of space: Isovists and Isovist fields. Environment and Planning B: Urban Analytics
and City Science, 6, 47–65.
Ben-Moshe, B., Carmi, P., & Katz, M. J. (2008). Approximating the visible region of a point on terrain. GeoIn-
formatica, 12(1), 21–36.
Bernardini, W., Barnash, A., & Wong, M. (2013). Quantifying visual prominence in social landscapes. Journal of
Archaeological Science, 40, 3946–3954.
Berry, J. K. (1993). Beyond mapping: Concepts, algorithms and issues in GIS. Fort Collins: GIS World Books.
Bland, R., Chadwick, A., Haselgrove, C., Mattingly, D., Rogers, A., & Taylor, J. (2019). Iron age and Roman coin hoards
in Britain. Oxford: Oxbow.
Bongers, J., Arkush, E., & Harrower, M. (2012). Landscapes of death: GIS-­based analyses of chullpas in the western
Lake Titicaca basin. Journal of Archaeological Science, 39, 1687–1693.
Brughmans, T., & Brandes, U. (2017). Visibility network patterns and methods for studying visual relational phenomena in archeology. Frontiers in Digital Humanities, 4, 1–17. https://doi.org/10.3389/fdigh.2017.00017
Brughmans, T., Keay, S., & Earle, G. (2015). Understanding inter-­settlement visibility in Iron age and Roman Southern
Spain with exponential random graph models for visibility networks. Journal of Archaeological Method and Theory,
22(1), 58–143.
Brughmans, T., van Garderen, M., & Gillings, M. (2018). Introducing visual neighbourhood configuration for total
viewsheds. Journal of Archaeological Science, 96, 14–25. https://doi.org/10.1016/j.jas.2018.05.006
Brughmans, T., Waal, M. S. de, Hofman, C. L., & Brandes, U. (2018). Exploring transformations in Carib-
bean indigenous social networks through visibility studies: The case of late pre-­colonial landscapes in East-­
Guadeloupe (French West Indies). Journal of Archaeological Method and Theory, 25(2), 475–519. doi:10.1007/
s10816-017-9344-0
Caldwell, D. R., Mineter, M. J., Dowers, S., & Gittings, B. M. (2003). Analysis and visualisation of visibility surfaces
[Poster]. Retrieved from www.geocomputation.org/2003/Papers/Caldwell_Paper.pdf
Chapman, H. (2006). Landscape archaeology and GIS. Stroud: Tempus.
Cummings, V., & Pannett, A. (2005). Island views: The settings of the chambered cairns of southern Orkney. In
V. Cummings & A. Pannett (Eds.), Set in Stone: New approaches to Neolithic monuments in Scotland (pp. 14–24).
Oxford: Oxbow.
De Floriani, L. D., & Magillo, P. (1994). Visibility algorithms on triangulated digital terrain models. Geographical
Information Systems, 8(1), 13–41.
De Montis, A., & Caschili, S. (2012). Nuraghes and landscape planning: Coupling viewshed with complex network
analysis. Landscape and Urban Planning, 105, 315–324.
Eve, S., & Crema, E. (2014). A house with a view? Multi-­model inference, visibility fields, and point process analysis
of a Bronze Age settlement on Leskernick Hill (Cornwall, UK). Journal of Archaeological Science, 43, 267–277.
Ferreira, C., Andrade, M. V., Magalhães, S. V., Franklin, W. R., & Pena, G. C. (2014). A parallel algorithm for viewshed
computation on grid terrains. Journal of Information and Data Management, 5, 171–180.
Fisher, P. F. (1993). Algorithm and implementation uncertainty in viewshed analysis. International Journal of Geographi-
cal Information Systems, 7(4), 331–347.
Fisher, P. F. (1994). Probable and fuzzy models of the viewshed operation. In M. F. Worboys (Ed.), Innovations in
GIS: Selected papers from the First national conference on GIS research UK (pp. 161–175). London: Taylor and Francis.
Fisher, P. F. (1995). An exploration of probable viewsheds in landscape planning. Environment and Planning B: Planning
and Design, 22, 527–546.
Fisher, P. F. (1996). Extending the applicability of viewsheds in landscape planning. Photogrammetric Engineering &
Remote Sensing, 62(11), 1297–1302.
Franklin, W. R., & Ray, C. K. (1994). Higher isn’t necessarily better: Visibility algorithms and experiments. In
T. Waugh & R. Healey (Eds.), Advances in GIS research: Sixth international symposium on spatial data handling
(pp. 751–770). Edinburgh: Taylor and Francis.
Fraser, D. (1983). Land and society in neolithic Orkney. British Series 117. Oxford: British Archaeological Reports.
Gaffney, V., & Stančič, Z. (1991). GIS approaches to regional analysis: A case study of the island of Hvar. Ljubljana: Znanst-
veni inštitut, Filozofske fakultete.
Gillings, M. (2009). Visual affordance, landscape and the megaliths of Alderney. Oxford Journal of Archaeology, 28(4),
335–356.
Gillings, M. (2012). Landscape phenomenology, GIS and the role of affordance. Journal of Archaeological Method and
Theory, 19(4), 601–611. https://doi.org/10.1007/s10816-012-9137-4
Gillings, M. (2015). Mapping invisibility: GIS approaches to the analysis of hiding and seclusion. Journal of Archaeologi-
cal Science, 62, 1–14. http://dx.doi.org/10.1016/j.jas.2015.06.015
Gillings, M. (2017). Mapping liminality: Critical frameworks for the GIS-­based modelling of visibility. Journal of
Archaeological Science, 84, 121–128. http://dx.doi.org/10.1016/j.jas.2017.05.004
Haverkort, H., Toma, L., & Zhuang, Y. (2008). Computing visibility on terrains in external memory. ACM Journal
of Experimental Algorithmics, 13, Article 1.5, 1–23.
Hengl, T., & Evans, I. S. (2009). Mathematical and digital models of the land surface. In T. Hengl & H. I. Reuter (Eds.), Geomorphometry: Concepts, software, applications (pp. 31–63). Amsterdam: Elsevier.
Hurtado, F., Löffler, M., Matos, I., Sacristán, V., Saumell, M., Silveira, R., & Staals, F. (2013). Terrain visibility with
multiple viewpoints. In L. Cai, S. Cheng, & T. Lam (Eds.), Algorithms and computation: ISAAC 2013: Lecture notes
in computer science (Vol. 8283, pp. 317–327). Berlin: Springer.
Izraelevitz, D. (2003). A fast algorithm for approximate viewshed computation. Photogrammetric Engineering & Remote
Sensing, 69(7), 767–774.
Jerpåsen, G. B. (2009). Application of visual archaeological landscape analysis: Some results. Norwegian Archaeological
Review, 42(2), 123–145.
Kantner, J., & Hobgood, R. (2016). A GIS-­based viewshed analysis of Chacoan tower kivas in the US Southwest:
Were they for seeing or to be seen? Antiquity, 90(353), 1302–1317.
Kaučič, B., & Žalik, B. (2002). Comparison of viewshed algorithms on regular spaced points. In A. Chalmers (Ed.),
Proceedings of the 18th spring conference on computer graphics (pp. 177–183). New York: ACM.
Lake, M., & Ortega, D. (2013). Compute-­intensive GIS visibility analysis of the settings of prehistoric stone circles.
In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological space (pp. 213–242). Walnut Creek: Left
Coast Press.
Lake, M., & Woodman, P. (2003). Visibility studies in archaeology. Environment & Planning B: Planning and Design,
30, 689–707.
Lake, M., Woodman, P. E., & Mithen, S. (1998). Tailoring GIS software for archaeological applications: An example
concerning viewshed analysis. Journal of Archaeological Science, 25, 27–38.
Larsen, M. V. (2015). Viewshed algorithms for strategic positioning of vehicles. Norwegian Defence Research Establishment
(FFI): FFI-­Rapport 2015/01300.
Lee, J., & Stucky, D. (1998). On applying viewshed analysis for determining least-­cost paths on Digital Elevation
Models. International Journal of Geographical Information Science, 12(8), 881–905.
Llobera, M. (1996). Exploring the topography of mind: GIS, social space and archaeology. Antiquity, 70, 612–622.
Llobera, M. (2001). Building past landscape perception with GIS: Understanding topographic prominence. Journal
of Archaeological Science, 28, 1005–1014.
Llobera, M. (2003). Extending GIS-­based visual analysis: The concept of the visualscape. International Journal of
Geographical Information Science, 17(1), 25–48.
Llobera, M., Wheatley, D., Steele, J., Cox, S., & Parchment, O. (2010). Calculating the inherent visual structure of a
landscape (inherent viewshed) using high-­throughput computing. In F. Niccolucci & S. Hermon (Eds.), Beyond
the artefact: Digital interpretation of the past: Proceedings of CAA2004, Prato, 13–17 April 2004 (pp. 146–151). Buda-
pest: Archaeolingua.
Lock, G., Kormann, M., & Pouncett, J. (2014). Visibility and movement: Towards a GIS-­based integrated approach.
In S. Polla & P. Verhagen (Eds.), Computational approaches to the study of movement in archaeology: Theory, practice and
interpretation of factors and effects of long term landscape formation and transformation (pp. 23–42). Topoi – Berlin Studies
of the Ancient World/Topoi – Berliner Studien der Alten Welt, 23. Berlin: De Gruyter.
Loots, L. (1997). The use of projective and reflective viewsheds in the analysis of the Hellenistic City defence system
at Sagalassos, Turkey. Archaeological Computing Newsletter, 49, 12–16.
Loots, L., Nackaerts, K., & Waelkens, M. (1999). Fuzzy viewshed analysis of the Hellenistic City defence system
at Sagalassos, Turkey. In L. Dingwall, S. Exon, V. Gaffney, S. Laflin, & M. van Leusen (Eds.), Archaeology in the
age of the internet, CAA97: Computer applications and quantitative methods in archaeology: Proceedings of the 25th
anniversary conference, University of Birmingham, April 1997. BAR International Series, 750 [CD-­ROM]. Oxford:
Archaeopress.
Lopez-­Romero Gonzalez de la Aleja, E. (2008). Characterising the evolution of visual landscapes in the late prehistory
of south-­west Morbihan (Brittany, France). Oxford Journal of Archaeology, 27(3), 217–239.
Lu, M., Zhang, J., & Fan, Z. (2008). Least visible path analysis in raster terrain. International Journal of Geographical
Information Science, 22(6), 645–656.
Mitcham, J. (2002). In search of a defensible site: A GIS analysis of Hampshire hillforts. In D. Wheatley, G. Earl, & S. Poppy (Eds.), Contemporary themes in archaeological computing (pp. 73–79). Oxford: Oxbow Books.
Murrieta-­Flores, P. (2014). Developing computational approaches for the study of movement: Assessing the role of
visibility and landscape markers in terrestrial navigation during Iberian Late Prehistory. In S. Polla & P. Verhagen
(Eds.), Computational approaches to the study of movement in archaeology: Theory, practice and interpretation of factors and
effects of long term landscape formation and transformation (pp. 99–132). Topoi – Berlin Studies of the Ancient World/
Topoi – Berliner Studien der Alten Welt, 23. Berlin: De Gruyter.
Nackaerts, K., & Govers, G. (1997). A non-­deterministic use of a DEM in the calculation of viewsheds. Archaeological
Computing Newsletter, 49, 3–11.
Nagy, G. (1994). Terrain visibility. Computers and Graphics, 18(6), 763–773.
Nutsford, D., Reitsma, F., Pearson, A., & Kingham, S. (2015). Personalising the viewshed: Visibility analysis from the
human perspective. Applied Geography, 62, 1–7.
Ogburn, D. E. (2006). Assessing the level of visibility of cultural objects in past landscapes. Journal of Archaeological
Science, 33, 405–413.
Olaya, V. (2009). Basic land surface parameters. In T. Hengl & H. I. Reuter (Eds.), Geomorphometry: Concepts, software,
applications (pp. 141–169). Amsterdam: Elsevier.
O’Sullivan, D., & Turner, A. (2001). Visibility graphs and landscape visibility analysis. International Journal of Geo-
graphical Information Science, 15(3), 221–237.
Paliou, E. (2011). The communicative potential of theran murals in late Bronze age Akrotiri: Applying viewshed
analysis in 3D townscapes. Oxford Journal of Archaeology, 30(3), 30–33.
Paliou, E. (2013). Reconsidering the concept of visualscapes: Recent advances in three-­dimensional visibility analysis.
In A. Bevan & M. Lake (Eds.), Computational approaches to archaeological spaces (pp. 243–263). Walnut Creek: Left
Coast Press.
Paliou, E. (2014). Visibility analysis in 3D built spaces: A new dimension to the understanding of social space.
In S. Polla, U. Lieberwirth, & E. Paliou (Eds.), Spatial analysis and social spaces: Interdisciplinary approaches to
the interpretation of prehistoric and historic built environments (pp. 91–114). Berlin: De Gruyter. http://dx.doi.org/10.1515/9783110266436.91
Rennell, R. (2012). Landscape, experience and GIS: Exploring the potential for methodological dialogue. Journal of
Archaeological Method and Theory, 19(4), 510–525.
Reuter, H. I., Hengl, T., Gessler, P., & Soille, P. (2009). Preparation of DEMs for geomorphometric analysis. In T.
Hengl & H. I. Reuter (Eds.), Geomorphometry: Concepts, software, applications (pp. 141–169). Amsterdam: Elsevier.
Risbøl, O., Petersen, T., & Jerpåsen, G. (2013). Approaching a mortuary monument landscape using GIS-­and ALS-­
generated 3D models. International Journal of Heritage in the Digital Era, 2(4), 509–525.
Ruestes Bitriá, C. (2008). A multi-­technique GIS visibility analysis for studying visual control of an Iron age land-
scape. Internet Archaeology, 23. http://intarch.ac.uk/journal/issue23/4/
Ruggles, C., & Medyckyj-Scott, D. (1996). Site location, landscape visibility, and symbolic astronomy: A Scottish case
study. In H. D. G. Maschner (Ed.), New methods, old problems: Geographic information systems in modern archaeologi-
cal research (pp. 127–146). Carbondale: Center for Archaeological Investigations Occ. Paper No. 23, Southern
Illinois University.
Sakaguchi, T., Morin, J., & Dickie, R. (2010). Defensibility of large prehistoric sites in the Mid-­Fraser region on the
Canadian Plateau. Journal of Archaeological Science, 37(6), 1171–1185.
Suleiman, W., Joliveau, T., & Favier, E. (2013). A new algorithm for 3D isovists. In S. Timpf & P. Laube (Eds.),
Advances in spatial data handling: Advances in geographic information science (pp. 157–173). Berlin: Springer.
Tilley, C. (1994). A phenomenology of landscape. London: Berg.
Tilley, C. (2004). The materiality of stone: Explorations in landscape phenomenology. London: Berg.
Toma, L. (2012). Viewsheds on terrains in external memory. SIGSPATIAL Special, 4(2), 13–17.
Trick, S. (2004). Bringing it all back home: The practical visual environments of Southeast European Tells. Internet
Archaeology, 16. https://doi.org/10.11141/ia.16.7
Turner, A., Doxa, M., O’Sullivan, A., & Penn, A. (2001). From isovists to visibility graphs: A methodology for the
analysis of architectural space. Environment and Planning B: Urban Analytics and City Science, 28(1), 103–121.
Van Dyke, R., Bocinsky, R. K., Windes, T. C., & Robinson, T. J. (2016). Great houses, shrines, and high places: Inter-
visibility in the Chacoan world. American Antiquity, 81(2), 205–230.
van Kreveld, M. (1996). Variations on sweep algorithms: Efficient computation of extended viewsheds and class
intervals. In M. J. Kraak & M. Molenaar (Eds.), Proceedings of the 7th international symposium on spatial data handling
(pp. 15–27). Delft: TU Delft.
Wang, J., Robinson, G., & White, K. (2000). Generating viewsheds without using sightlines. Photogrammetric Engineer-
ing & Remote Sensing, 66(1), 87–90.
Webster, D. (1999). The concept of affordance and GIS: A note on Llobera (1996). Antiquity, 73, 915–917.
Wheatley, D. W. (1995). Cumulative viewshed analysis: A GIS-­based method for investigating intervisibility, and
its archaeological application. In G. Lock & Z. Stančič (Eds.), Archaeology and geographical information systems: A
European perspective (pp. 171–186). London: Taylor and Francis.
Wheatley, D. W., & Gillings, M. (2000). Vision, perception and GIS: Developing enriched approaches to the study of
archaeological visibility. In G. Lock (Ed.), Beyond the map (pp. 1–27). Amsterdam: IOS Press.
Williamson, C. G. (2016). Mountain, myth and territory: Teuthrania as focal point in the landscape of Pergamon. In
J. McInerney & I. Sluiter (Eds.), Valuing landscape in classical antiquity: Natural environment and cultural imagination
(pp. 70–100). Leiden: BRILL.
Wright, D. K., MacEachern, S., & Lee, J. (2014). Analysis of feature intervisibility and cumulative visibility using
GIS, Bayesian and spatial statistics: A study from the Mandara mountains, Northern Cameroon. PLoS One, 9(11),
e112191. doi:10.1371/journal.pone.0112191
Xu, Z., & Yao, Q. (2009). A novel algorithm for viewshed based on digital elevation model. 2009 Asia-­Pacific Confer-
ence on Information Processing, 2, 294–297.
18 Spatial analysis based on cost functions
Irmela Herzog

Introduction
Many archaeologists are no longer satisfied with presenting distribution maps; instead they aim to identify the patterns of movement that explain how the people and artefacts of the period considered reached the sites (e.g. Rademaker, Reid, & Bromley, 2012). For most regions and periods of
time, the distribution of artefacts provides the most important evidence for the movement of people.
Archaeological remains of ancient paths, roads and shipwrecks are fairly rare, and in most cases indicate
only small sections of the original trajectory. Moreover, dating land routes is often difficult due to con-
tinuous use after initial path creation and absence of diagnostic finds. However, the archaeological record
of human movement can sometimes be supplemented by historical sources.
Each model of past movement based on the historical and archaeological evidence nowadays relies
implicitly or explicitly on a cost function estimating costs of movement in terms of time, calories or some
other currency for the study area and period of time considered. Evidence for the popularity of such
approaches in archaeology are not only numerous case studies published since 2000 but also several ses-
sions at the annual Computer Applications in Archaeology (CAA) conference dealing with this subject
as well as two edited volumes with contributions focusing solely on least-­cost methods or applications
(Polla & Verhagen, 2014; White & Surface-­Evans, 2012).
Two popular GIS-­based areas of spatial analysis in archaeology are based on cost functions: site catch-
ments and least-­cost paths. A site catchment is the region accessible from a site, and often archaeological
studies analyse the resources within this region (Conolly & Lake, 2006, p. 214). A least-­cost path (LCP)
ideally is the route minimizing the costs of movement between two given locations (Conolly & Lake,
2006, pp. 294, 252–255).
In fact, the most basic application of a cost function is the generation of a least-­cost site catchment
(LCSC). The LCSC includes all areas that can be reached by expending less than a user-­selected cost limit.
The term isochrone is often used for the boundary when costs are measured in terms of time. According
to Wheatley and Gillings (2002, p. 159), the concept of LCSC was derived from defining the exploitation
territory of a site. Beyond the boundary of this territory the costs of exploitation exceed the benefit.
This concept is closely related to time geography introduced by Mlekuz (2013) into archaeological least-­
cost modelling. Site catchment analysis for foraging societies mainly focuses on the types and quantities
of resource areas within each catchment zone (e.g. Surface-­Evans, 2012). Most publications presenting
site catchments for sedentary agrarian cultures study the potentials for crop production, in terms of soil,
topography, slope etc. (e.g. Korczyńska, Cappenberg, & Kienlin, 2015). Wheatley and Gillings (2002,
p. 160) refer to a widely cited paper published in 1970 suggesting a cost limit of a 1 hour walk for a sed-
entary agricultural site and 2 hours for a herding/hunting community. The LCSC derived from several
cost limits may be appropriate, if each settlement is surrounded by rings of different utilisation, as was
described in the early 19th century by von Thünen’s model of rural land use (Waugh, 2002, pp. 471–475).
For instance, Posluschny (2010) uses the popular Tobler hiking function (Tobler, 1993) with two time
limits, 60 and 15 minutes, with 15 minutes delimiting the area of daily farming activities. For Posluschny's
study area in southwestern Germany, early Iron Age settlements with overlapping catchments on the
15 minute scale are probably not contemporary. So catchment overlap may indicate some issues with
dating or the cost limit selected. The aim of Gaffney and Stančič (1992) was to define realistic mutually
exclusive exploitation areas for the seven principal hillforts on the island of Hvar, Croatia. Therefore, they
calculated LCSC based on a 90-­minute walking time limit. For a project reconstructing land use patterns
of sites, catchments may define the survey area (Peeples, Barton, & Schmich, 2006). Comparing catch-
ment sizes may provide insights into the function of settlements. For instance, the study of Posluschny
(2010) mentioned above compares the catchment sizes for early Iron Age princely sites with those of
non-­princely settlements of the same period and comes to the conclusion that agriculture was more
important for the normal settlements. Site catchment analysis was introduced by processual archaeology
(Conolly & Lake, 2006, p. 209), focusing on economic costs, although it is also possible to include social
aspects such as visibility or taboo zones in a cost model (Table 18.1). Lee and Stucky (1998) provide a
comprehensive overview of approaches for including viewsheds in least-­cost calculations. As the LCSC
comprises all LCPs that expend the catchment’s cost limit or less, it can be spoken of as the ‘potential
path area’ (Mlekuz, 2013).
In archaeological studies, LCPs often provide reconstructions of ancient routes or route sections (e.g.
Chapman, 2006, pp. 110–111; Herzog, 2013e; Rademaker et al., 2012; Verhagen & Jeneson, 2012); take, for example, Rogers, Collet, and Lugon (2015), who calculate LCPs in an attempt to predict high moun-
tain passes in prehistoric times. LCPs may also be applied to identify the principal factors governing the
construction of known roads or road segments (e.g. Bell & Lock, 2000; Fovet & Zakšek, 2014; Güimil-­
Fariña & Parcero-­Oubiña, 2015; van Lanen, 2017, pp. 123–134). If LCPs coincide with known roads
only after forcing the LCP to visit an intermediate location, this is evidence for the importance of this
additional node (e.g. Güimil-­Fariña & Parcero-­Oubiña, 2015).
In most studies, a set of points is connected by LCPs (e.g. Canosa-­Betés, 2016). Alternatively, LCPs in
all directions can be constructed starting from a given site, resulting in focal mobility networks (Fábrega
Álvarez & Parcero Oubiña, 2007; Herzog, 2013c; Llobera, Fábrega-­Álvarez, & Parcero-­Oubiña, 2011;
Lynch & Parcero-­Oubiña, 2017). If the movement costs depend on topography, topographic data (mostly
a digital elevation or surface model) with adequate resolution is required, whereas in ancient cities, each
house is a barrier and should be modelled accordingly (Branting, 2007). Often the reconstruction of past
routes by LCPs is the basis of further research, e.g. Hudson (2012).
In many cultural groups, sites favour locations close to ancient roads or paths (e.g. Fovet & Zakšek, 2014); therefore, successful road reconstruction often allows predictive modelling of road-related sites
such as mansiones, i.e. resting places along Roman roads, but also of archaeological features such as rock
art or burial mounds. Another focus has been on what might be termed ‘natural’ pathways in a given
landscape. An early example for a site prediction approach based on pathway reconstruction using a cost
model was presented by Bellavia (2002), who sought to derive “natural pathways” from a digital elevation model (DEM) in several areas of the UK including Stonehenge.

Table 18.1 Cost components applied in selected archaeological least-cost studies published in 2010 or later.

Reference | Slope cost component | Additional cost components
Canosa-Betés (2016) | Tobler (1993); walker cost function of Llobera and Sluckin (2007); Herzog (2013a, based on Minetti, Moia, Roi, Susta, & Ferretti, 2002) | 4 categories of water courses (breadth: 200 m, 150 m, 50 m and 25 m); no extra costs for possible locations of fords or bridges
Fovet and Zakšek (2014) | 3rd degree polynomial based on Minetti et al. (2002) | Visibility, based on a variable similar to sky view
Güimil-Fariña and Parcero-Oubiña (2015) | Tobler (1993); Pandolf, Givoni, and Goldman (1977); Herzog (2013a, based on Minetti et al., 2002); walker cost function of Llobera and Sluckin (2007) | Penalty for crossing rivers equivalent to ascending a 15° gradient
Groenhuijzen and Verhagen (2017) | Velocity estimate derived from Pandolf et al. (1977) assuming constant values for metabolic rate, weight and load | Terrain coefficients based on Soule and Goldman (1972); coefficient 20 for rivers and streams
Herzog (2013e) | Vehicle cost function, Herzog (2013a) | Avoiding wet soils including streams; lower costs for fords
Korczyńska et al. (2015) | Tobler (1993) | none
Van Lanen (2017) | Slope classes based on natural breaks; slopes > 10% are considered impassable | Terrain classification: factor 1.2 for higher sandy heath land, 1.8 for lower wetlands; groundwater level
Lynch and Parcero Oubiña (2017) | Walker cost function of Llobera and Sluckin (2007) | Impedance factor 2 for areas from which no high mountain top is visible
Posluschny (2010) | Tobler (1993) | none
Rademaker et al. (2012) | Pandolf et al. (1977) with various values for variables W, L, V (see Table 18.2) | Terrain coefficients based on Soule and Goldman (1972)
Rogers et al. (2015) | Tobler (1993); alternatively: Swiss 15th degree polynomial | landcover
Surface-Evans (2012) | Tobler (1993) | none
Verhagen and Jeneson (2012) | Tobler (1993) | Alternative to slope: visibility based on low-pass filtered openness

Some evidence has also been published
for animals travelling on least-­effort routes (Ganskopp, Cruz, & Johnson, 2000), so if early hunters fol-
lowed the paths of animals as has been suggested by some authors (e.g. Whitley & Burns, 2008) they
most probably walked on LCPs.
An appropriate cost function determines the costs of movement in the region studied and should take
the means of transportation available to the people living at that time into account, e.g. the use of pack
or draft animals, wheeled vehicles or boats. Nearly all archaeological case studies applying cost functions
include the cost factor slope, often combined with factors depending on soil, land use, the presence of
streams, or visibility (Table 18.1; cf. Herzog, 2014b for some additional archaeological LCSC and LCP
publications).
336 Irmela Herzog

Method

Overview
The initial step in LCSC and LCP calculation is deciding on the principal factors governing movement costs and establishing an appropriate cost function that combines the costs of these factors.
The next step is the creation of an accumulated cost surface (ACS), that is a raster grid storing the costs
of movement from the origin to every other cell in the raster grid. The ACS is normally calculated by
spreading out from the origin and accumulating the costs of the cells as each is visited. For LCSC, the
origin is the site location. For LCPs, the origin is one of the two locations to be connected. The LCSC
is derived from the ACS by stopping the spreading process for cells whose accumulated costs exceed the
predefined cost limit. These cells form the boundary of the catchment. Alternatively, an isoline at the
cost limit value may be derived from the ACS. The LCP is derived from the ACS by backtracking from
the target location to the origin. These three steps will be described in more detail in the next sections.
Finally, some validation and analysis of the stability of the outcomes should be included in each study
creating LCSC or LCPs; this is discussed in the Conclusion below.

Estimating movement costs


In archaeological case studies, movement costs are typically measured in time or energy expenditure. The two measurement systems differ, and some authors give reasons for preferring energy expenditure to time costs (e.g. Rademaker et al., 2012) or vice versa; the best option may depend on the culture considered. Some studies are based on cost estimations using other units, such as slope in percent or degrees (e.g. Bell & Lock, 2000; Bellavia, 2002). In all applications of a cost function, validation of the function chosen should be part of the study.
Table 18.1 illustrates that the most popular cost component in recent archaeological least-­cost studies
is slope, with most of the cost functions applied and listed in Table 18.2 being rules of thumb rather than
based on a large sample of measurements. Formulae derived from measurements of modern humans who
do not walk as frequently as people of the period considered may not necessarily outperform cost func-
tions based on practical experience. Some of the slope-­dependent cost functions in Table 18.2 (including
the most popular ones) have already been discussed in Herzog (2013a).
Tobler (1993) proposed the most popular cost function (no. 1 in Table 18.2), but some care has to be taken to implement this formula properly, because the original estimates velocity (and not time) and relies on mathematical slope, which differs substantially from slope in percent or degrees (Herzog, 2014a). The modified Tobler function (no. 2 in Table 18.2) was suggested by Márquez-Pérez et al. (2017) based on GPS data for 21 trails located in Spain. The original Tobler estimates are about 1.35 times faster than those of the modified version, with a low standard deviation of 0.064 (tested in steps of 1 percent in the range of –70 to +70 percent). The estimates provided by this function outperformed those of MIDE, Langmuir, and standard Tobler for 10 of the 21 trails. Tobler (personal communication, October 17, 2017) recommended the study by Irmischer and Clarke (2017); the formulae presented there (no. 3 in Table 18.2) rely on data collected from 200 volunteer cadets between 17 and 23 years of age. Compared to Tobler's formula, which refers to soldiers hiking a known route, a lower average speed is recorded in this study, which the authors attribute to wayfinding costs.
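To make these unit pitfalls concrete, here is a minimal Python sketch of formula no. 1 in Table 18.2 (function names are illustrative); note that the input must be mathematical slope (e.g. 0.1 for a 10 percent climb), not percent or degrees:

```python
import math

def tobler_velocity(s):
    """Walking velocity in km/h on a gradient with mathematical slope s
    (e.g. s = 0.1 for a 10 percent climb), after Tobler (1993)."""
    return 6 * math.exp(-3.5 * abs(s + 0.05))

def tobler_time_minutes(s, distance_km):
    """Time in minutes for covering distance_km on a gradient with slope s."""
    return 60 * distance_km / tobler_velocity(s)

print(tobler_velocity(0.0))           # ~5.04 km/h on level ground
print(tobler_time_minutes(0.1, 1.0))  # ~16.9 minutes for 1 km at a 10% climb
```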
Most of the slope-dependent cost functions are anisotropic, i.e. the costs for descending a gentle gradient are lower than those for climbing it (Figure 18.1). This may result in different optimal paths connecting two locations A and B, depending on the direction of movement.
Table 18.2 Slope-dependent cost functions, with ŝ percent slope and s = ŝ/100 mathematical slope. If Δd (or ΔD) is missing in the cost formula, the result of the cost formula is to be multiplied by the distance covered. Rows 1 to 7 list cost functions estimating time; the formulae in rows 10 to 12 estimate energy consumption. The cost functions listed in rows 8 and 9 measure abstract cost units, which can best be understood by comparing estimates resulting from movement on a gradient with that on level ground.

No. | Name/Reference | Formula | Properties
1 | Tobler (1993) | V(s) = 6*exp(–3.5*abs(s+0.05)); cost(s, ΔD) = 60*(ΔD/V(s)) | V(s) estimates the velocity (km/h) on a gradient; cost(s, ΔD) estimates the time in minutes for covering the distance ΔD in km on a gradient with slope s.
2 | Márquez-Pérez, Vallejo-Villalta, and Álvarez-Francoso (2017) | V(s) = 4.8*exp(–5.3*abs((s*0.7)+0.03)); V(s) = 4.8*exp(–3.71*abs(s+0.04286)) | Modified Tobler: first formula as published; second formula is equivalent except for round-off errors.
3 | Irmischer and Clarke (2017) | Von(ŝ) = f*(0.11 + exp(–(ŝ+5)²/1800)); Voff(ŝ) = f*(0.11 + 0.67*exp(–(ŝ+2)²/1800)) | Von(ŝ) estimates the on-road velocity (m/s) of walkers; Voff(ŝ) refers to off-road movement; f = 1.00 for male and f = 0.95 for female walkers.
4 | Garmy, Kaddouri, Rozenblat, and Schneider (2005) | V(α) = 4*exp(–0.008*α²) | α is slope in degrees.
5 | Langmuir (2004); implemented in r.walk (GRASS) | cost(Δd, ΔH_up, ΔH_gd, ΔH_sd) = a*Δd + b*ΔH_up + c*ΔH_gd + d*ΔH_sd; Langmuir: a=0.72, b=6.0, c=1.9998, d=–1.9998; downhill default slope value threshold at 21.25% | Estimates time in seconds. All Δ values are in m: Δd = horizontal distance covered, ΔH_up = positive height change, ΔH_gd = gentle descent, ΔH_sd = steep descent.
6 | Ericson and Goldstein (1980) | cost(Δd, ΔH_up, ΔH_dn) = Δd + 3.168*ΔH_up + 1.2*abs(ΔH_dn) | Δd and ΔH_up as in row 5; ΔH_dn = negative height change.
7 | MIDE: París Roche (2008, p. 11) | cost(N, Δd, ΔH_up, ΔH_dn) = N*0.012*Δd + 0.15*ΔH_up + 0.1*abs(ΔH_dn) | Δd, ΔH_up, ΔH_dn as in row 6. Estimates time in minutes. N is a terrain factor.
8 | Bellavia (2002) | Cost(N, α) = N*(abs(α)+1) | α is slope in degrees. N is a terrain factor.
9 | Vehicle cost function: Herzog (2013a) based on Llobera and Sluckin (2007) | Cost(ŝ) = 1 + (ŝ/š)² | š is the critical slope, i.e. for slopes exceeding š, hairpin turns are more effective than direct ascent or descent. The abbreviation Q(š) refers to this cost function; Q is short for quadratic.
10 | Llobera and Sluckin (2007) | Cost(s) = 2.635 + 17.37*s + 42.37*s² – 21.43*s³ + 14.93*s⁴ | Walker cost function: estimates energy consumption in kJ/m.
11 | Herzog (2013a) based on Minetti, Moia, Roi, Susta, and Ferretti (2002) | Cost(s) = 1337.8*s⁶ + 278.19*s⁵ – 517.39*s⁴ – 78.199*s³ + 93.419*s² + 19.825*s + 1.64 | Walker cost function: estimates energy consumption in kJ/(m*kg).
12 | Pandolf, Givoni, and Goldman (1977) | Cost(W, L, N, V, ŝ) = 1.5*W + 2.0*(W+L)*(L/W)² + N*(W+L)*(1.5*V² + 0.35*V*|ŝ|) | Estimates metabolic rate in watts. W = weight (kg), L = load (kg), N = terrain factor, V = velocity (m/s).

Figure 18.1 Cost functions estimating walking time; on the x-­axis downhill slopes are negative.

However, most paths are used in both directions, so a cost function averaging the costs of movement in both directions seems appropriate in many situations. By averaging, the asymmetric cost curve is converted to a symmetric curve (Herzog, 2013a; Figure 18.2). Likewise, if the load carried by a descending walker, pack animal or vehicle differs from the load on the way up, the cost functions used should vary accordingly. Note that the slope-dependent cost functions do not include the costs of climbing stairs or ladders.
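A minimal sketch of this averaging, using Tobler's function (no. 1 in Table 18.2) expressed as time per kilometre; the function names are illustrative:

```python
import math

def tobler_minutes_per_km(s):
    # time (minutes) to cover 1 km on mathematical slope s (no. 1, Table 18.2)
    return 60 / (6 * math.exp(-3.5 * abs(s + 0.05)))

def symmetric_cost(cost, s):
    # average the uphill and downhill costs so that cost(s) equals cost(-s)
    return 0.5 * (cost(s) + cost(-s))

print(tobler_minutes_per_km(0.10), tobler_minutes_per_km(-0.10))  # asymmetric
print(symmetric_cost(tobler_minutes_per_km, 0.10))                # symmetric
```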
Several publications provide terrain factors that model the reduced speed or increased energy consumption of a walker (Table 18.3). In many least-cost studies, water is considered a barrier; for instance, Rogers et al. (2015) apply a factor of 5 for traversing water courses and 499.5 for water bodies. In general, the effort needed for crossing a stream depends on many factors including width, depth and current (Langmuir, 2004, pp. 185–199).
The formula of Pandolf et al. (1977) shows one way of combining cost components, i.e. slope and a terrain factor. It consists of a term depending on weight and load only (estimating the energy consumption of standing) plus the extra energy required for movement, and only the latter term is multiplied by the terrain coefficient. The formula presented by Givoni and Goldman (1971), which is also used by Soule and Goldman (1972), is the product of the terrain coefficient with a factor depending on weight, load, velocity and slope. Alternatively, a (weighted) sum of two or more cost components may be applied (e.g. Fovet & Zakšek, 2014), provided the cost components are independent. Additional approaches for combining cost components and their drawbacks are discussed by Herzog (2013a).
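As a concrete illustration of the first approach, the following minimal Python sketch spells out the Pandolf et al. (1977) formula (no. 12 in Table 18.2), in which only the movement term is multiplied by the terrain factor; the function name and the example values for weight, load, terrain factor, velocity and slope are illustrative only:

```python
def pandolf_watts(W, L, N, V, slope_pct):
    """Metabolic rate in watts after Pandolf, Givoni and Goldman (1977),
    cf. no. 12 in Table 18.2: the standing term depends on weight and load
    only; the terrain factor N multiplies the movement term alone."""
    standing = 1.5 * W + 2.0 * (W + L) * (L / W) ** 2
    movement = N * (W + L) * (1.5 * V ** 2 + 0.35 * V * abs(slope_pct))
    return standing + movement

# e.g. a 70 kg walker carrying 10 kg at 1.2 m/s on a 5% gradient,
# terrain factor 1.2 (hard-surface road, cf. Table 18.3):
print(pandolf_watts(W=70, L=10, N=1.2, V=1.2, slope_pct=5))
```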
Figure 18.2 Cost functions estimating walking time: uphill and downhill costs are averaged.

The time and energy consumption required for walking a path also depend on factors that vary throughout the year and on weather conditions: rain, muddy paths due to recent rain, snow, storm, fog, high humidity, and very high or very low temperatures may slow progress considerably. Moreover, movement costs depend on the sex, age, weight, load and fitness of the walker as well as on the number of hikers in the walking group. Therefore, the stability of any least-cost result should be analysed by varying the model parameters.
Estimating the costs of movement by boat or ship is even more difficult than estimating the costs of
walking, due to differences in boat or ship technology, seasonal variations, currents and substantial changes
of the rivers or coastlines since the period considered. Some estimates for the costs of water transport
provided by different studies can be found in Herzog (2014a).
Animals play different roles in path creation: according to Lay (1992, pp. 6–7), the first human ways originated as animal paths (cf. Whitley & Burns, 2007). Moreover, in some areas special paths for herds existed in the past. Horse riding and the use of pack or draft animals also had an impact on the velocity of the traveller. It is very difficult to find appropriate cost functions taking animal movement into account, due to the large variation within each species (e.g. oxen or horses) and the high number of possible species to be considered (Ganskopp & Vavra, 1987).

Table 18.3 Published terrain factors for cost functions measuring time (unit: hour) or energy consumption (unit: joule) of a walker. Note: 'm asl' refers to metres above sea level.

Factor | Terrain | Unit | Formula/reference
1.00 | Blacktop roads and improved dirt paths | hour | MIDE: París Roche (2008, p. 11)
1.00 | Pavement (cement) | hour | de Gruchy, Caswell, and Edwards (2017)
1.03 | Lawn grass | hour | de Gruchy et al. (2017)
1.19 | Loose beach sand | hour | de Gruchy et al. (2017)
1.24 | Disturbed ground (former stone quarry) | hour | de Gruchy et al. (2017)
1.25 | Horse riding paths, flat trails and meadows | hour | MIDE: París Roche (2008, p. 11)
1.35 | Tall grassland (with thistle and nettles) | hour | de Gruchy et al. (2017)
1.50 | Open space above the treeline, i.e. 2000 m asl | hour | Rogers et al. (2015)
1.67 | Bad trails, stony outcrops and river beds | hour | MIDE: París Roche (2008, p. 11)
1.67 | Off-path | hour | Tobler (1993)
1.79 | Bog | hour | de Gruchy et al. (2017)
2.00 | Off-path areas below the treeline including pastures, forests, heathland, beaches etc. | hour | Rogers et al. (2015)
2.50 | Rock | hour | Rogers et al. (2015)
5.00 | Swamp, water course | hour | Rogers et al. (2015)
1.00 | Asphalt/blacktop | joule | de Gruchy et al. (2017)
1.10 | Dirt road or grass | joule | de Gruchy et al. (2017)
1.20 | Hard-surface road | joule | Givoni and Goldman (1971)
1.20 | Light brush | joule | de Gruchy et al. (2017)
1.30 | Ploughed field | joule | de Gruchy et al. (2017)
1.50 | Ploughed field | joule | Givoni and Goldman (1971)
1.50 | Heavy brush | joule | de Gruchy et al. (2017)
1.60 | Hard-packed snow | joule | de Gruchy et al. (2017)
1.80 | Swampy bog | joule | de Gruchy et al. (2017)
1.80 | Sand dunes | joule | Givoni and Goldman (1971)
2.10 | Loose sand | joule | de Gruchy et al. (2017)

Creating the accumulated cost surface (ACS)


In the early days of LCSC and LCP applications in archaeology, these studies were mainly based on an isotropic cost grid. Such a cost grid stores for each cell the costs of traversing the cell, independent of the direction of movement. In this case, the costs of movement from a cell to one of its four direct neighbours are the average of the costs of the start and the target cell. If the movement is diagonal, to a corner-connected cell, the length of the move must be taken into account, i.e. the average of the costs has to be multiplied by √2 (Figure 18.3).
The cell centres and the possible moves to the neighbouring cell centres form a graph. For graphs,
efficient algorithms calculating the least-­cost route from a given origin to all other nodes (i.e. cell centres)
are known if all cost distances are positive (Figure 18.4 is based on Cormen, Leiserson, Rivest, & Stein,
2001, pp. 476–495, 595–599; Dijkstra, 1959).
3 × 3 cost grid:
 8    8    80
10   10    90
12   16   100

Corresponding ACS:
√2*0.5*(10+8)    0.5*(10+8)    √2*0.5*(10+80)
0.5*(10+10)      0             0.5*(10+90)
√2*0.5*(10+12)   0.5*(10+16)   √2*0.5*(10+100)

Figure 18.3 Simple example of an isotropic cost grid (top) and the corresponding accumulated cost surface (ACS) (bottom). The origin of the accumulation process is the centre of the cost grid (cost value = 10).
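This move-cost rule can be written down in a few lines; the following sketch (function name illustrative) reproduces two of the ACS entries of Figure 18.3:

```python
import math

def move_cost(cost, r0, c0, r1, c1, cell_size=1.0):
    """Cost of moving between the centres of two neighbouring cells of an
    isotropic cost grid: the average of the two cell values, multiplied by
    the length of the move (sqrt(2) * cell_size for diagonal moves)."""
    length = math.hypot(r1 - r0, c1 - c0) * cell_size
    return 0.5 * (cost[r0][c0] + cost[r1][c1]) * length

# Reproducing two ACS entries of Figure 18.3 (origin at the centre, (1, 1)):
cost = [[8, 8, 80], [10, 10, 90], [12, 16, 100]]
print(move_cost(cost, 1, 1, 0, 1))  # north: 0.5*(10+8) = 9.0
print(move_cost(cost, 1, 1, 0, 0))  # diagonal: sqrt(2)*0.5*(10+8) ≈ 12.73
```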

Figure 18.4 Dijkstra's algorithm applied to a cost grid. The flowchart corresponds to the following procedure:

Step 1: Mark all cell centres as unvisited. Create a candidate set that contains only the origin, and assign an ACS value of 0 to the origin.

Step 2: Find the cell centre with the lowest ACS value in the candidate set; this becomes the current position.

Step 3: For each unvisited neighbouring cell centre of the current position, calculate the tentative costs as the ACS value of the current position plus the neighbour cost distance. If the neighbour is not yet in the candidate set, insert it and set its ACS value to the tentative costs; if it is already in the candidate set and the tentative costs are lower than its ACS value, update its ACS value to the tentative costs. When all neighbours have been checked, mark the current position as visited and remove it from the candidate set. If the candidate set is not empty, proceed with Step 2; otherwise the algorithm ends.
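A compact Python sketch of the procedure in Figure 18.4 is given below; it uses a priority queue in place of an explicit scan of the candidate set, and its optional cost_limit parameter anticipates the LCSC modification described in the next paragraph. This is a minimal illustration under the isotropic cost model above, not a substitute for the optimised implementations in GIS packages:

```python
import heapq
import math

def accumulate_costs(cost, origin, cost_limit=math.inf):
    """Accumulated cost surface (ACS) over an isotropic cost grid, spreading
    out from `origin` (Dijkstra, 1959). Cells whose tentative costs exceed
    cost_limit are never entered, which yields the LCSC modification."""
    rows, cols = len(cost), len(cost[0])
    acs = [[math.inf] * cols for _ in range(rows)]
    backlink = [[None] * cols for _ in range(rows)]
    acs[origin[0]][origin[1]] = 0.0
    candidates = [(0.0, origin)]                 # priority queue = candidate set
    while candidates:
        acc, (r, c) = heapq.heappop(candidates)
        if acc > acs[r][c]:
            continue                             # stale entry: cell already improved
        for dr in (-1, 0, 1):                    # 8 nearest neighbours (queen moves)
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                length = math.hypot(dr, dc)      # sqrt(2) for diagonal moves
                tentative = acc + 0.5 * (cost[r][c] + cost[nr][nc]) * length
                if tentative < acs[nr][nc] and tentative <= cost_limit:
                    acs[nr][nc] = tentative
                    backlink[nr][nc] = (r, c)    # backlink for later LCP extraction
                    heapq.heappush(candidates, (tentative, (nr, nc)))
    return acs, backlink

# ACS for the cost grid of Figure 18.3, origin at the centre:
acs, _ = accumulate_costs([[8, 8, 80], [10, 10, 90], [12, 16, 100]], (1, 1))
```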



For LCSCs, the algorithm has to be modified: in Step 3 only those cell centres are inserted in the
candidate set, whose ACS value is below the predefined cost limit.
For LCP generation, modifications of the algorithm are also needed. Firstly, whenever a new ACS value is assigned to a cell centre in Step 3, a backlink to the current position is stored for this cell. Secondly, when the target of the LCP is selected in Step 2, the LCP is generated by following the backlinks from the target to the origin. An optional procedure may save computation time: initially, the costs of the most direct connection to the target are calculated; only those candidates are then inserted in the set for which the sum of the current ACS value and the minimum costs for the straight-line distance to the target is below this initial cost limit.
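A sketch of the backtracking step follows; the backlink grid below is the one the algorithm would store for the cost grid of Figure 18.3, with the origin at the centre:

```python
def least_cost_path(backlink, origin, target):
    """Trace the LCP from the target back to the origin via the backlinks
    stored during ACS generation, then reverse the result."""
    path, cell = [target], target
    while cell != origin:
        cell = backlink[cell[0]][cell[1]]
        if cell is None:
            return None  # unreachable, e.g. beyond a predefined cost limit
        path.append(cell)
    return path[::-1]

# Backlinks for the 3 x 3 cost grid of Figure 18.3, origin at (1, 1):
backlink = [[(1, 1), (1, 1), (0, 1)],
            [(1, 1), None,   (1, 1)],
            [(1, 1), (1, 1), (2, 1)]]
print(least_cost_path(backlink, (1, 1), (0, 2)))  # [(1, 1), (0, 1), (0, 2)]
```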
The results of the LCP algorithm are independent of the units of measurement chosen, i.e. they do
not change if all costs are multiplied by a constant factor. So applying the multiplier of 0.8 for horse-­
riding suggested by Tobler (1993) will result in the same calculated paths as the initial formula for hikers.
Similarly, the LCPs generated for male and female walkers based on one of the formulas proposed by
Irmischer and Clarke (2017; no. 3 in Table 18.2) do not differ.
Both the conversion of vector data to a cost raster and the subsequent conversion of this raster to a
graph may produce unexpected results. These issues are illustrated in Figures 18.5 and 18.6. Figure 18.5(a)
shows an isotropic cost grid with a linear barrier (cost value of 100) in an area of uniform costs (white
cells are assigned a cost value of 10). The grey cells indicate three ways of converting the barrier to raster
cell values: including all cells whose centre is within a 5, 7.5 and 10 m distance from the line.

Figure 18.5 (a) Simple isotropic cost grid, (b) possible moves starting at the origin, (c) traversing the barrier
cells by long moves, (d) subdividing long moves.

Figure 18.6 (a–f) ACS results based on the cost grid shown in Figure 18.5(a). The outcomes of an inadequate barrier radius of merely 5 m are shown in (a–c); for (d–f) an adequate barrier radius of 7.5 m was chosen. The N values indicate the number of nearest neighbouring cells that can be reached from the origin without detour. Images (g) and (h) illustrate the impact of different N values by grids showing the differences in accumulated costs.

Three ways of converting grid cells to a graph have been used in archaeological least-­cost calculations:
(a) linking each cell with its 8 nearest neighbours (queen moves in Figure 18.5(b–d)), (b) ensuring that
all cells within a 24 cell neighbourhood can be reached without detour (queen and knight moves), or
(c) connecting to cells in all directions in a 48 cell neighbourhood (all lines starting from the origin in
Figure 18.5(b)).
Figure 18.5(c) illustrates how a simple diagonal move may traverse the 5 m radius barrier without paying due costs; knight moves can jump over the 7.5 m radius barrier, and the long 3–1 and 3–2 moves cross the 10 m radius barrier without touching a high-cost cell. This issue can be avoided by cutting the long moves into two or three sub-moves respectively, as indicated by the lines with arrows in Figure 18.5(d). The cost values of the cut points are the weighted averages of the two values stored in the cells connected by the arrow lines, with the weights depending on the distance to the cut point.
Figure 18.6(a–c) depicts the ACS based on the cost grid with the 5 m radius barrier shown in Fig-
ure 18.5(a). Due to the diagonal moves that jump over the barrier, some ACS cells beyond the barrier
have a lower accumulated cost value than any barrier cell. This unwanted effect is avoided with the 7.5 m
barrier (Figure 18.6(d–f)). The cut point implementation of the long moves ensures that due costs are
paid for the barrier. The correct shortest paths from the origin to the cells in the west half of the cost
grid are straight lines, so that cost distance and straight line distance should coincide in this area, i.e. cells
of equal cost distance from the origin should form a semicircle in the west. The image for N=48 shows
more of a semi-­circular structure than the N=8 image. In fact, by increasing the number of move direc-
tions, the detours necessary for reaching the target locations in general are made smaller. This is illustrated
by the difference images in Figure 18.6(g) and 18.6(h): in the uniform terrain area the largest difference
is about 6 m, the distance of the corresponding cells to the origin is about 77 m. So with N=8, the larg-
est detour is about 7.8 percent of the true shortest distance. This elongation error decreases substantially
for N=24 (Figure 18.6(h)): in the uniform area, it is about 1.5 m on covering a straight line distance of
63 m, i.e. about 2.4 percent of the true shortest distance (Herzog, 2013b).

Figure 18.7 Small digital elevation model (DEM) with a cell size of 10 m and a constant slope value of 10%, and the corresponding ACS for the cost function Q(ŝ) = 1 + (ŝ/š)², with š = 10 (cf. no. 9 in Table 18.2).

Various algorithms for calculating slope exist (Conolly & Lake, 2006, pp. 191–192; Lock & Pouncett, 2010; Wheatley & Gillings, 2002, pp. 120–121), and they provide different outcomes. Moreover, confusion of the units for measuring slope may result in unrealistic ACS grids. This is why deriving slope directly from the DEM in the process of ACS calculation is recommended, as illustrated in Figure 18.7.
Even with an isotropic slope-dependent cost function, the direction of movement is important. The move from the centre cell with an altitude of 100 to the north has a slope of 10 percent, accumulating costs of 2, whereas the moves to the east or west remain at the same altitude, accumulating only half of the costs. The moves to the diagonal cells are longer, resulting in a lower slope than for the moves in the main directions (ŝ = 10/√2 ≈ 7.1), but still a higher accumulated cost value. Figure 18.8 presents another visualisation of anisotropic movement on different gradients with constant slope and different cost functions.
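A minimal sketch of this recommendation, deriving slope directly from the elevation difference between cell centres and applying the quadratic cost function Q (no. 9 in Table 18.2); the DEM values reproduce the constant 10 percent gradient of Figure 18.7, and all names are illustrative:

```python
import math

def q_cost(slope_pct, critical=10.0):
    # quadratic vehicle cost function Q(s) = 1 + (s/critical)^2 (no. 9, Table 18.2)
    return 1.0 + (slope_pct / critical) ** 2

def dem_move_cost(dem, r0, c0, r1, c1, cell_size=10.0, critical=10.0):
    """Cost of a move between two DEM cells, with slope derived directly
    from the elevation difference and the length of the move."""
    length = math.hypot(r1 - r0, c1 - c0) * cell_size
    slope_pct = 100.0 * (dem[r1][c1] - dem[r0][c0]) / length
    return q_cost(abs(slope_pct), critical) * length

# DEM rising at a constant 10% from south (bottom row) to north (top row):
dem = [[102, 102, 102], [101, 101, 101], [100, 100, 100]]
print(dem_move_cost(dem, 2, 1, 1, 1))  # north: slope 10%, cost 2 * 10 m = 20
print(dem_move_cost(dem, 2, 1, 2, 2))  # east: level, cost 1 * 10 m = 10
print(dem_move_cost(dem, 2, 1, 1, 2))  # diagonal: slope 10/sqrt(2) %, cost ~21.2
```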
According to the approach presented in Figure 18.7, movement along a contour line of a DEM is equivalent to movement on level ground. On contour lines of a steep gradient, this is only true if some construction work has been done to create the path (Figure 18.9). For informal routes that did not involve any building effort, very steep gradients may be considered barriers. It is important to note that, with the exception of contour line routes, construction work such as the removal of outcrops or the building of bridges and tunnels is mostly not included in ACS generation. Clearly, the outcome of the approach outlined above depends on the accuracy and resolution of the DEM (Herzog & Posluschny, 2011).
An anisotropic cost grid may be combined with an isotropic grid by multiplying or adding accumulated cost values (Herzog, 2013a). The terrain factors listed in Table 18.3 suggest multiplication, and multiplication is independent of the resolution of the cost grids. However, several authors prefer adding cost components (e.g. Fovet & Zakšek, 2014); often a weighted sum of cost grids is created (e.g. Whitley & Burns, 2008). For modelling anisotropic costs such as currents or wind directions, more complex approaches are required (Collischonn & Pilar, 2000; Indruszewski & Barton, 2005, 2007).
After deciding on the relevant cost model, the main difficulty for LCSC generation is the choice of a cost limit. For single farmsteads with a crop-based economy, the farm sizes listed in the study by Kerig (2008) may provide some guidance: the minimum farmland is 2 hectares, and the maximum is 4 to 5 hectares if all work is carried out by humans. With oxen, larger farmlands of up to 10 hectares can be ploughed; horses allow ploughing even larger plots, up to 33 hectares.

Figure 18.8 ACS for different slope dependent cost functions (nos. 1, 6, and 9 in Table 18.2) on three gra-
dients, with an additional barrier as in Figure 18.5(a) (radius 7.5 m). For each cost function, the costs vary
from 0 at the origin, depicted in white, to the largest accumulated cost value depicted in black. (a) Ericson &
Goldstein, (b) Tobler, (c) Q(12).

Figure 18.10 clearly shows that the LCP may deviate from the true shortest path depending on N (for
details see Herzog, 2013b). By increasing N, calculations will be rendered more accurate but the com-
putation time will increase as well. The figure shows LCPs created with Dijkstra’s algorithm for paths in
both directions, based on the ACS grids presented in Figure 18.6(d–f). In an area of uniform costs such
as the western part of the cost grid displayed in Figure 18.5(a), the optimal path is a straight line. But for
N=8 and uniform costs, only the LCPs in the eight directions considered coincide with the straight line; an example is the LCP to target no. 3. For paths in other directions, such as to target no. 5, the LCP deviates from the correct shortest path (dotted line in Figure 18.10(a)); this deviation decreases as N increases. Moreover, two different LCPs are depicted for most targets: the return path accumulates the same costs.

Figure 18.9 A path built to be level on a steep slope in the hilly area east of Cologne, Germany.

Case study
In the hilly rural area southwest of Cologne, quite a few Roman sites, including the remains of farms (villae rusticae), temples and roads, have been recorded. The aim of the case study is to compare the site catchments of a farm (a villa in Blankenheim) and a temple (known as Görresburg). The cost model derived from the Roman roads in this area is applied to the catchments. Although the movement patterns of travelling on a Roman road may differ from small-scale movement within a site catchment, no data concerning the latter is readily available. The first step is to identify the principal factors governing the construction of Roman roads in this area (Figure 18.11).
In this study area, large parts of the Roman road known as the Agrippa Road have been recorded by
aerial photography, ALS data and some small-­scale excavations (Grewe, 2004, 2007; Horn, 2014, map
p. 169).1 Another Roman road section in this area was proposed and verified by Hagen (1931, p. 176).
The Roman road section suggested by Schneider (1879, p. 21) relies mainly on straight-line sections of roads that were still in use in the mid-19th century and passes a known Roman road site; therefore it is also tentatively included in this set of Roman roads. Unfortunately, landscape reconstruction is beyond the scope of this small case study, so Figure 18.11 shows some modern features, mostly roads, such as a modern motorway east of the site labelled "Roman road remains" in the northeast of the map.

Figure 18.10 Least-cost paths (LCPs) (black lines) from the origin in the centre to five different targets. The outcome of the LCP algorithm depends on the number of nearest neighbours that can be reached without detour (N) and the width of the barrier ((a) width = 5 m; (b, e, g) width = 7.5 m; (c, f, h) width = 10 m; cf. Figure 18.6).
When discussing the Agrippa Road, Grewe (2004) pointed out that Roman roads avoided steep slopes in order to allow horse- or oxen-drawn carts to proceed. According to Grewe, the slopes of Roman roads in the Rhineland normally do not exceed 8 percent, although at some exceptional locations 16 to 20 percent have been recorded. Due to the slope restrictions for Roman roads, the cost factor slope is an obvious choice. Slope is derived from the two DEMs available for the study area (Table 18.4).

Figure 18.11 The study area southwest of Cologne covering approximately 13 × 10 km.

Table 18.4 DEM data provided by the ordnance survey institution (Geobasis NRW) responsible for this part of Germany.

Name | Cell size | Projection (EPSG) | Data collection | Median slope
DEM10 | 10 m | Gauss-Krüger (31466) | Photogrammetry, ALS | 8.4%
DEM25 | 25 m | ETRS89 (25832) | ALS | 8.0%

Experience with another hilly region in the Rhineland suggested including another cost factor that models streams as barriers (Herzog, 2013a, 2013b, 2013c, 2013e). This was tested for DEM10: a buffer with a radius of 7.5 m was created for the streams, and isotropic costs of 5 were assigned to the cell centres within the buffer (Figure 18.12: iso = 5). All streams in the neighbourhood of the Roman routes considered belong to the class "width below 3 m"; therefore a uniform penalty for traversing streams is considered appropriate. The LCPs derived from this cost model agree quite well with the Roman road proposed by Hagen, but deviate from the route suggested by Schneider. With respect to the Agrippa Road, the results are not very convincing. Modifying the penalty for crossing streams does not change the outcome, because the number of stream crossings is about the same for the initial LCPs and the Agrippa Road. An alternative is a model derived from the soil map that takes the wet soils in the stream valleys into account. Moreover, ford and bridge locations were digitized from the maps created in the years 1846–1847 and assigned costs of 2. Based on this model, the LCP generated from the slope-dependent cost function with a critical slope of 6 percent reconstructs the Agrippa Road somewhat better than the other LCPs (Figure 18.12).

Figure 18.12 LCPs based on the formulas by Tobler, Irmischer, and Herzog/Minetti (see nos. 1, 3, and 11 in
Table 18.2) as well as quadratic slope dependent cost functions (see no. 9 in Table 18.2) combined with costs
for traversing water courses and/or wet soils.

Moreover, LCPs based on Tobler's hiking function were generated; though they do not pay penalties for crossing water, they agree quite well with some of those derived from the vehicle cost function with streams modelled as barriers. The LCPs generated from a cost model combining the Irmischer off-road cost function with penalties for wet soils (except at ford locations) are often more direct than the rest of the LCPs presented, and often do not reconstruct the known roads as well.
After these first sobering results, the Görresburg and the Roman smelting site were included as additional possible origins besides Pt1 (cf. Figure 18.11). This allows testing whether the road made a deliberate detour to pass these sites.
The Agrippa Road and the Q(10) LCP are of similar length, but the total of the elevation differences derived from a trail elevation profile is considerably lower for the LCP (height change in Table 18.5). So neither minimising height change nor avoiding the crossing of streams or wet soils was the principal factor governing the construction of the Agrippa Road. Long sections of the Q(10) LCP run in the stream valleys, whereas the Agrippa Road, after crossing a stream, immediately climbs to more elevated terrain (Figure 18.13(a)). Viewsheds probably did not play an important role in this forest area, but a method for calculating local visual prominence can be applied to altitude data to identify elevated areas (Llobera, 2003; Figure 18.13(b)).

Table 18.5 Comparison of the Agrippa Road section and the LCP generated based on the cost function Q(10) with a critical slope of 10 percent (see no. 9 in Table 18.2), combined with a penalty factor of 5 for crossing streams. For the two routes, the percentage in each prominence category is given.

Route | Length (km) | Height change | Prominence, DEM10, 100 m radius (–6.3 to –1.0 / –1.0 to 0.0 / 0.0 to 1.0 / 1.0 to 5.2) | Prominence, DEM25, 250 m radius (–15.0 to –2.0 / –2.0 to 0.0 / 0.0 to 2.0 / 2.0 to 13.4)
Agrippa | 10.87 | 432 m | 11 / 24 / 36 / 30 | 12 / 8 / 29 / 46
Q(10) | 10.67 | 266 m | 43 / 21 / 33 / 4 | 52 / 19 / 18 / 12

Figure 18.13 (a) Comparison of the Q(10) LCPs with the Agrippa Road; (b) comparison of the local prominence for these two routes (white = low, black = high prominence); (c) LCPs with increased isotropic costs in areas of low prominence.

Table 18.5 clearly shows that the Agrippa Road avoids areas of low prominence. Therefore, LCPs with different cost multipliers (w values in Figure 18.13(c)) attributed to low prominence areas were calculated. The cost multiplier 2 generated the best results when combined with slope-dependent cost functions that assign lower costs to steep slopes than Q(10) does. Such cost functions tend to generate the straight road sections typical of Roman roads. But omitting the slope-dependent cost component produces LCPs that are not as close to the Agrippa Road as the LCPs highlighted in Figure 18.13(c).
Only the LCPs starting at the Roman iron smelting site coincide well with the Agrippa Road, suggesting that this site determined the layout of the road to some extent. Figure 18.13(c) also shows that the choice of the DEM can have an impact on the result when considering the LCPs connecting Pt1 and Pt2. However, the more important LCPs connecting the Roman iron smelting site with Pt3 coincide quite well, independent of the DEM chosen.
Based on the Q(14) cost function and a cost multiplier of 2 for low prominence areas, LCSCs were calculated for the temple on the Görresburg and the villa near Blankenheim (Figure 18.14). Cost limits are measured in terms of movement on level ground without stepping into low prominence areas, and are given in multiples of 250 m. With respect to size, the two sets of catchments do not differ substantially (Table 18.6). South of the Görresburg hill, excavations found a Roman settlement (Horn, 2014, pp. 196–197), so agricultural use was probably important for both locations.
This case study, covering only a small area, mainly suggests hypotheses to be tested in larger areas with a larger number of Roman sites. It should be noted that the cost model for Roman roads found in this example bears similarity to that found by Verhagen and Jeneson (2012) for a Dutch Roman road section close to the German border.

Figure 18.14 The best performing LCPs for the models considered and the Least-Cost Site Catchments
(LCSCs) derived from the Q(14) cost model for the Görresburg temple and the villa.

Table 18.6 Areas included within the LCSCs in hectares.

LC-distance | 250 m | 500 m | 750 m | 1000 m | 1250 m | 1500 m | 1750 m
Temple | 10.3 | 33.2 | 69.3 | 117.2 | 180.2 | 272.4 | 405.2
Villa | 9.9 | 29.4 | 63.6 | 121.5 | 207.5 | 316.3 | 448.7

Conclusion

Validation, assessing the accuracy, and analysis of the stability of the outcomes
For a convincing application of a cost function for LCSC or LCP generation, analysis of the archaeological or historical evidence and some validation are required. For instance, Garmy et al. (2005) mention that the cost function chosen reproduces already known old footpaths in their study region.
GPS trails (Márquez-­Pérez et al., 2017) and walking experiments (Kondo et al., 2011) can provide
some data for validating the cost function applied in the study area considered, but in general, modern
people are not as used to walking as people in past times. Energy expenditure might be overestimated by
cost functions based on modern measurements because the energy consumption of walking or running
is lower for people used to walking or running most of the day compared to that of modern white collar
workers (Pontzer et al., 2012).
If the aim of an LCP study is to reconstruct a known road, the similarity of the LCP to the known route can be assessed by determining the proportion of the LCP that lies within a buffer distance from the known road (Goodchild & Hunter, 1997). Applications of this simple similarity measure in archaeological LCP studies were published by Canosa-Betés (2016), Güimil-Fariña and Parcero-Oubiña (2015), as well as Lynch and Parcero-Oubiña (2017).
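Assuming both routes are available as vector lines, this proportion can be computed in a few lines, for example with the shapely package; the coordinates below are purely illustrative:

```python
from shapely.geometry import LineString

def buffer_similarity(lcp, known_road, buffer_dist):
    """Proportion of the LCP lying within buffer_dist of the known road
    (Goodchild & Hunter, 1997)."""
    return lcp.intersection(known_road.buffer(buffer_dist)).length / lcp.length

# Illustrative coordinates in map units (e.g. metres):
lcp = LineString([(0, 0), (100, 10), (200, 0)])
road = LineString([(0, 5), (200, 5)])
print(buffer_similarity(lcp, road, buffer_dist=10))  # 1.0: LCP entirely inside
```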
If the location of the roads to be reconstructed is not known, validation often relies on road indicator sites, for example grave monuments or milestones that can be found close to Roman roads (e.g. Güimil-Fariña & Parcero-Oubiña, 2015). On a larger scale, archaeologists often assume that settlements are located close to main roads. For instance, Lynch and Parcero-Oubiña (2017) calculated the distance from each site in their study area to the closest calculated path. If the road indicator sites are clustered (e.g. Güimil-Fariña & Parcero-Oubiña, 2015), statistical tests relying on independent observations are problematic. Therefore, validation based on such sites is not straightforward.
Finally, validation of LCP results by survey is another possibility (e.g. Rademaker et al., 2012; Rogers et al., 2015). However, the number of finds in the vicinity of old routes that can be discovered by field walking is limited: Rogers et al. (2015) detected only one artefact older than 200 years during two days of prospection. The remains of minor roads such as sunken lanes may lead to inadequate conclusions. Moreover, the continuous use of routes until today and stray finds are issues in LCP validation by field survey.

Some conclusions
A wide variety of cost functions for walkers is available, but validated cost functions for the movement
of pack or draft animals as well as for water transport can rarely be found. Moreover, footpaths following
animal tracks might exhibit a large variation of preferred slopes, because the study by Ganskopp and Vavra
(1987) shows that different species prefer different slopes: the average slopes of sites utilised by cattle, feral
horses, mule deer, and bighorn were 5.8, 11.2, 15.7, and 42.5% respectively within one study area. For
non-­pedestrian transport further research is required to provide reliable cost functions.
Some authors of archaeological LCP studies believe that the selection of the cost model is of minor importance (Bellavia, 2002; Verhagen & Jeneson, 2012). But many publications present quite different LCP results for different slope-dependent cost functions (e.g. Canosa-Betés, 2016; Güimil-Fariña & Parcero-Oubiña, 2015; Rademaker et al., 2012, Plate 2). Often, LCPs derived from several cost models coincide only in areas where the route is the obvious choice, such as mountain passes or flat areas.

Most archaeological LCP and LCSC studies rely on software created by somebody else. Gietl, Doneus,
and Fera (2008) showed some time ago that it is often not possible to recreate the LCP results of one soft-
ware package with another. Some of the LCP software used by Gietl and his colleagues has been improved
in the last decade, but there are still substantial differences in their potential for modelling anisotropic
friction and movement steps in more than eight directions.

Cost function based approaches beyond LCPs and LCSCs


This contribution has discussed LCPs connecting point pairs. Different concepts of connecting points by a network of routes exist, depending on the frequency with which a route is used and the effort required to construct roads or paths. Overviews of approaches for connecting a set of points are presented in Herzog (2013c) as well as Groenhuijzen and Verhagen (2017).
Several least-cost approaches have been proposed for identifying corridors of movement: adding the ACS grids of two locations results in a raster with low values where progress is easy. The low-value cells indicate possible corridors of movement between the two locations (e.g. Palmisano, 2017).
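A sketch of this corridor approach with numpy; note that the tolerance defining 'low values' is an assumption here, as the choice of cut-off is left open above:

```python
import numpy as np

def movement_corridor(acs_a, acs_b, tolerance=0.05):
    """Boolean raster marking cells whose summed accumulated costs from both
    locations lie within `tolerance` of the optimal A-B cost; these cells
    form a possible corridor of movement between the two locations."""
    total = np.asarray(acs_a) + np.asarray(acs_b)
    return total <= total.min() * (1.0 + tolerance)
```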
Adding the LCSCs for each cell in the study area, independent of selected target locations, is an approach for calculating the accessibility of each cell (Mlekuz, 2013). This and additional methods for calculating accessibility based on cost functions are discussed in Herzog (2013d). An approach for identifying zones of high accessibility, based on focal mobility networks for random points and kernel density estimation of the cells visited frequently by the focal paths, was presented by Canosa-Betés (2016). Verhagen (2013) suggested integrating indicators of accessibility based on least-cost models into a predictive modelling framework. A method for avoiding overlapping site catchments is to stop the spreading process whenever two site catchments meet, resulting in the generation of least-cost Thiessen polygons (Herzog, 2013c).
People who walk into unknown territory see only part of the landscape ahead, whereas LCPs are based on total knowledge of the landscape. For modelling dispersal processes into unknown terrain starting from a given location, paths consisting of locally optimal steps in the direction chosen initially may be generated. An agent-based algorithm for modelling such dispersal processes was proposed by Herzog (2016). The approach presented by Lock and Pouncett (2010) is also based on progress in local neighbourhoods and introduces the term "corridor of intentionality" in this context. Moreover, they point out the importance of cultural landscape features as mid-distance waypoints.
LCP technology can also be applied to reconstruct Roman long-distance water supply systems (Orengo & Miró i Alaix, 2013). Wood and Wood (2006) suggest applying least-cost distances for economic modelling in archaeology, assuming that "artifacts derived from a particular resource will inversely correlate with the energetic distance from the origin of that resource". Least-cost approaches for kernel density estimation (Herzog & Yépez, 2013) and Ripley's K (Negre, Muñoz, & Barcelo, 2017) have been applied in archaeological studies. In fact, any method of spatial statistics relying on Euclidean distances can be modified so that another distance measure is used. But it is important to remember that least-cost distances in general are not distances in the mathematical sense, because the path from A to B may involve different costs than the return path from B to A; therefore, cost functions based on averaging the costs in both directions should be used when modifying an algorithm designed for straight-line distances to use cost distances.

Note
1 VIA Erlebnisraum Römerstraße, Stationen [VIA adventure area Roman road, stops] (www.erlebnisraum-­
roemerstrasse.de/stationen).

References
Bell, T., & Lock, G. (2000). Topographic and cultural influences on walking the Ridgeway in later prehistoric times.
In G. Lock (Ed.), Beyond the map: Archaeology and spatial technologies (pp. 85–100). Amsterdam: IOS Press.
Bellavia, G. (2002). Extracting “Natural Pathways” from digital elevation model: Applications to landscape archaeo-
logical studies. In G. Burenhult & J. Arvidson (Eds.), Archaeological informatics: Pushing the envelope. CAA, 2001,
BAR International Series, 1016, 5–12. Oxford: Archaeopress.
Branting, S. (2007). Using an urban street network and a PGIST approach to analyze ancient movement. In J.
Clark & E. Hagemeister (Eds.), Digital discovery: Exploring new frontiers in human heritage: Computer applications and
quantitative methods in archaeology: Proceedings of the 34th conference, Fargo, United States, April 2006 (pp. 99–108).
Budapest: Archaeolingua.
Canosa-­Betés, J. (2016). Border surveillance: Testing the territorial control of the Andalusian defense net-
work in center-­south Iberia through GIS. Journal of Archaeological Science: Reports, 9, 416–426. doi:10.1016/j.
jasrep.2016.08.026
Chapman, H. (2006). Landscape archaeology and GIS. Stroud: Tempus Publishing.
Collischonn, W., & Pilar, J. V. (2000). A direction dependent least cost path algorithm for roads and canals. Interna-
tional Journal of Geographical Information Science, 14(4), 397–406.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge Manuals in Archaeology.
Cambridge, UK: Cambridge University Press.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2001). Introduction to algorithms (2nd ed.). Cambridge, MA:
The MIT Press and McGraw-­Hill Book Company.
de Gruchy, M., Caswell, E., & Edwards, J. (2017). Velocity-­based terrain coefficients for time-­based models of human
movement. Internet Archaeology, 45. doi:10.11141/ia.45.4
Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269–271.
doi:10.1007/BF01386390
Ericson, J. E., & Goldstein, R. (1980). Work space: A new approach to the analysis of energy expenditure within site
catchments. Anthropology UCLA, 10(1&2), 21–30.
Fábrega Álvarez, P., & Parcero Oubiña, C. (2007). Proposals for an archaeological analysis of pathways and movement.
Archeologia e Calcolatori, 18, 121–140.
Fovet, É., & Zakšek, K. (2014). Path network modelling and network of aggregated settlements: A case study in
Languedoc (Southeastern France). In Polla & Verhagen (2014, pp. 43–72).
Gaffney, V., & Stančič, Z. (1992). Diodorus Siculus and the island of Hvar, Dalmatia: Testing the text with GIS. In G. Lock & J. Moffet (Eds.), Computer applications and quantitative methods in archaeology 1991. BAR International Series, 577, 113–125. Oxford: Tempvs Reparatvm.
Ganskopp, D., Cruz, R., & Johnson, D. E. (2000). Least-­effort pathways?: A GIS analysis of livestock trails in rugged
terrain. Applied Animal Behaviour Science, 68, 179–190.
Ganskopp, D., & Vavra, M. (1987). Slope use by cattle, feral horses, deer, and bighorn sheep. Northwest Science, 61(2),
74–81.
Garmy, P., Kaddouri, L., Rozenblat, C., & Schneider, L. (2005). Logiques spatiales et “systèmes de villes” en Lodévois
de l’Antiquité à la période moderne. [Spatial investigations concerning village distributions in Lodévois from
antiquity until modern times]. In Temps et espaces de l’homme en société, analyses et modèles spatiaux en archéologie.
XXVème rencontres internationales d’archéologie et d’histoire d’Antibes (pp. 1–12). Editions APDCA.
Gietl, R., Doneus, M., & Fera, M. (2008). Cost distance analysis in an alpine environment: Comparison of differ-
ent cost-­surface modules. In A. Posluschny, K. Lambers, & I. Herzog (Eds.), Layers of perception: Proceedings of the
35th international conference on computer applications and quantitative methods in archaeology (CAA), Berlin, Germany,
April 2–6, 2007. Kolloquien zur Vor-­und Frühgeschichte, 10, (p. 342, full paper on CD). Bonn: Rudolf Habelt.
Givoni, B., & Goldman, R. (1971). Predicting metabolic energy cost. Journal of Applied Physiology, 30, 429–433.
Goodchild, M. F., & Hunter, G. J. (1997). A simple positional accuracy measure for linear features. International Journal
of Geographical Information Science, 11(3), 299–306. doi:10.1080/136588197242419
Grewe, K. (2004). Alle Wege führen nach Rom – Römerstraßen im Rheinland und anderswo [All roads lead to Rome:
Roman roads in the Rhine area and elsewhere]. In H. Koschik (Ed.), Alle Wege führen nach Rom: Internationales
Römerstraßenkolloquium Bonn. Materialien zur Bodendenkmalpflege im Rheinland, 16, 9–42.

Grewe, K. (2007). Die Agrippastraße zwischen Köln und Trier. [The Agrippa road between Cologne and Trier]. In
Erlebnisraum Römerstraße Köln-­Trier. Erftstadt-­Kolloquium, 31–64.
Groenhuijzen, M. R., & Verhagen, P. (2017). Comparing network construction techniques in the context of
local transport networks in the Dutch part of the Roman limes. Journal of Archaeological Science: Reports, 15,
235–251.
Güimil-­Fariña, A., & Parcero-­Oubiña, C. (2015). “Dotting the joins”: A non-­reconstructive use of least cost paths
to approach ancient roads: The case of the Roman roads in the NW Iberian Peninsula. Journal of Archaeological
Science, 54, 31–44. doi:10.1016/j.jas.2014.11.030
Hagen, J. (1931). Römerstraßen der Rheinprovinz. [Roman roads of the Rheinprovinz] Erläuterungen zum
Geschichtlichen Atlas der Rheinprovinz, Publikationen der Gesellschaft für Rheinische Geschichtskunde
XII 8 (2nd ed.).
Herzog, I. (2013a). Theory and practice of cost functions. In F. Contreras, M. Farjas, & F. J. Melero (Eds.), Fusion of
cultures: Proceedings of the 38th annual conference on computer applications and quantitative methods in archaeology, Granada,
Spain, April 2010. BAR International Series, 2494, 375–382, Granada. Oxford: Archaeopress.
Herzog, I. (2013b). The potential and limits of optimal path analysis. In A. Bevan & M. Lake (Eds.), Computational
approaches to archaeological spaces (pp. 179–211). Walnut Creek, CA: Left Coast Press.
Herzog, I. (2013c). Least-­cost networks. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-­Flores, C. Papadopoulos, I.
Romanowska, & D. Wheatley (Eds.), Archaeology in the Digital Era: CAA 2012: Proceedings of the 40th annual confer-
ence of computer applications and quantitative methods in archaeology (CAA) (pp. 237–248). Amsterdam: Amsterdam
University Press.
Herzog, I. (2013d). Calculating accessibility. In G. Earl, T. Sly, A. Chrysanthi, P. Murrieta-­Flores, C. Papadopoulos,
I. Romanowska, & D. Wheatley (Eds.), Archaeology in the Digital Era, Volume II: CAA 2012: Proceedings of the
40th annual conference of computer applications and quantitative methods in archaeology (CAA), Southampton, 720–734.
Retrieved from http://dare.uva.nl/cgi/arno/show.cgi?fid=545855
Herzog, I. (2013e). Medieval mining sites, trade routes, and least-­cost paths in the Bergisches Land, Germany. In P.
Anreiter, K. Brandstätter, G. Goldenberg, K. Hanke, W. Leitner, K. Nicolussi, . . . P. Tropper (Eds.), Mining in
European history and its impact on environment and human societies: Proceedings for the 2nd mining in European history
conference of the FZ HiMAT, November 7–10, 2012, Innsbruck (pp. 201–206). Innsbruck: Innsbruck University
Press.
Herzog, I. (2014a). Least-­cost paths: Some methodological issues. Internet Archaeology, 36. doi:10.11141/ia.36.5
Herzog, I. (2014b). A review of case studies in archaeological least cost analysis. Archeologia e Calcolatori, 25, 223–239.
Herzog, I. (2016). Dispersal versus optimal path calculation. In S. Campana, R. Scopigno, G. Carpentiero, & M.
Cirillo (Eds.), CAA 2015: Keep the revolution going: Proceedings of the 43rd annual conference on computer applications
and quantitative methods in archaeology held in Siena 2014 (pp. 567–577). Oxford: Archaeopress.
Herzog, I., & Posluschny, A. (2011). Tilt: Slope-­dependent least cost path calculations revisited. In E. Jerem, F. Redö, &
V. Szeverényi (Eds.), On the road to reconstructing the past: Proceedings of the 36th CAA conference 2008 in Budapest,
Budapest, 212–218 (on CD: pp. 236–242). Budapest: Archaeolingua.
Herzog, I., & Yépez, A. (2013). Least-­cost kernel density estimation and interpolation-­based density analysis applied
to survey data. In F. Contreras, M. Farjas, & F. J. Melero (Eds.), Fusion of cultures: Proceedings of the 38th annual
conference on computer applications and quantitative methods in archaeology, Granada, Spain, April 2010 (pp. 367–374).
BAR International Series, 2494. Oxford: Archaeopress.
Horn, H. G. (2014). Agrippa Straße: Von Köln bis Dahlem in 4 Etappen und 8 Exkursen. [Agrippa road: Cologne to
Dahlem in 4 stages and 8 excursions]. Cologne: Bachem Verlag.
Hudson, E. (2012). Walking and watching: New approaches to reconstructing cultural landscapes. In White &
Surface-­Evans (2012, pp. 97–108).
Indruszewski, G., & Barton, C. M. (2005). DSMs, anisotropic spread and least cost paths for simulating Viking Age
routes in the Baltic Sea. In Stadtarchäologie Wien (Ed.), Workshop 9, Archäologie und Computer, November 3–5, 2004
(PDF file on CD). Vienna: Phoibos Verlag.
Indruszewski, G., & Barton, C. M. (2007). Simulating sea surfaces for modeling Viking Age seafaring in the Bal-
tic sea. In J. Clark & E. Hagemeister (Eds.), Digital discovery: Exploring new frontiers in human heritage: Computer
applications and quantitative methods in archaeology: Proceedings of the 34th conference, Fargo, United States, April 2006
(pp. 616–629). Budapest: Archaeolingua.

Irmischer, I. J., & Clarke, K. C. (2017). Measuring and modeling the speed of human navigation. Cartography and
Geographic Information Science, 45(2), 177–186. doi:10.1080/15230406.2017.1292150
Kerig, T. (2008). Als Adam grub . . . Vergleichende Anmerkungen zu landwirtschaftlichen Betriebsgrössen in
prähistorischer Zeit. [When Adam was digging . . . remarks on farm sizes in prehistoric times]. Ethnographisch-­
Archäologische Zeitschrift, 48, 375–402.
Kondo, Y., Ako, T., Heshiki, I., Matsumoto, G., Seino, Y., Takeda, Y., & Yamaguchi, H. (2011). FIELDWALK@
KOZU: A preliminary report of the GPS/GIS-­aided walking experiments for remodelling prehistoric pathways
at Kozushima Island (East Japan). In E. Jerem, F. Redő, & V. Szeverényi (Eds.), On the road to reconstructing the past:
Computer applications and quantitative methods in archaeology (CAA): Proceedings of the 36th international conference,
Budapest, April 2–6, 2008 (pp. 226–232; CD-­ROM 332–338). Budapest: Archaeolingua.
Korczyńska, M., Cappenberg, K., & Kienlin, T. L. (2015). Lauter Lausitzer Burgwälle? Zur Bedeutung land-
wirtschaftlicher Gunstfaktoren während der späten Bronzezeit und frühen Eisenzeit entlang des Dunajec [So-­
called Lusatian Ramparts? Significance of agricultural factors favouring settlement in the Dunajec Valley in the
Late Bronze Age and early Iron Age] In J. Gancarski (Ed.), Pradziejowe osady obronne w Karpatach/Prehistoric fortified
settlements in the Carpathians (pp. 215–244). Krosno.
Langmuir, E. (2004). Mountaincraft and leadership (3rd ed.). Cordee, UK: Mountain Leader Training England &
Mountain Leader Training Scotland.
Lay, M. G. (1992). Ways of the world: A history of the world’s roads and of the vehicles that used them. New Brunswick,
NJ: Rutgers University Press.
Lee, J., & Stucky, D. (1998). On applying viewshed analysis for determining least-­cost paths on digital elevation
models. International Journal of Geographical Information Science, 12(8), 891–905.
Llobera, M. (2003). Extending GIS-­based visual analysis: The concept of visualscapes’. International Journal of Geo-
graphical Information Science, 17(1), 25–48. doi:10.1080/713811741
Llobera, M., Fábrega-­Álvarez, P., & Parcero-­Oubiña, C. (2011). Order in movement: A GIS approach to accessibility.
Journal of Archaeological Science, 38(4), 843–851.
Llobera, M., & Sluckin, T. J. (2007). Zigzagging: Theoretical insights on climbing strategies. Journal of Theoretical
Biology, 249, 206–217.
Lock, G., & Pouncett, J. (2010). Walking the Ridgeway revisited: The methodological and theoretical implications
of scale dependency for the derivation of slope and the calculation of least-­cost pathways. In B. Frischer, J. Webb
Crawford, & D. Koller (Eds.), Making history interactive: Computer applications and quantitative methods in archaeology
(CAA): Proceedings of the 37th international conference, Williamsburg, Virginia, United States of America, March 22–26
(pp. 192–203). BAR International Series, S2079. Oxford: Archaeopress.
Lynch, J., & Parcero-­Oubiña, C. (2017). Under the eye of the Apu: Paths and mountains in the Inka settlement
of the Hualfín and Quimivil valleys, NW Argentina. Journal of Archaeological Science: Reports, 16(Supplement
C), 44–56.
Márquez-­Pérez, J., Vallejo-­Villalta, I., & Álvarez-­Francoso, J. I. (2017). Estimated travel time for walking trails in
natural areas. Geografisk Tidsskrift – Danish Journal of Geography, 117(1), 53–63. doi:10.1080/00167223.2017.
1316212
Minetti, A. E., Moia, C., Roi, G. S., Susta, D., & Ferretti, G. (2002). Energy cost of walking and running at extreme
uphill and downhill slopes. Journal of Applied Physiology, 93, 1039–1046.
Mlekuz, D. (2013). Time geography, GIS and archaeology. In F. Contreras, M. Farjas, & F. J. Melero (Eds.), Fusion of
cultures: Proceedings of the 38th annual conference on computer applications and quantitative methods in archaeology, Granada,
Spain, April 2010 (pp. 359–366). BAR International Series, 2494. Oxford: Archaeopress.
Negre, J., Muñoz, F., & Barcelo, J. (2017). A cost-­based Ripley’s K function to assess social strategies in settlement
patterning. Journal of Archaeological Method and Theory, 25(3), 777–794. doi:10.1007/s10816-­017-­9358-­7
Orengo, H. A., & Miró i Alaix, C. (2013). Reconsidering the water system of Roman Barcino (Barcelona) from
supply to discharge. Water History, 5(3), 243–266. doi:10.1007/s12685-­013-­0090-­2
Palmisano, A. (2017). Drawing pathways from the Past: The trade routes of the old Assyrian Caravans across upper
Mesopotamia and Central Anatolia. In F. Kulakoglu & G. Barjamovic (Eds.), Movement, resources, interaction: Proceed-
ings of the 2nd Kültepe international meeting, Kültepe, July 26–30, 2015. Studies Dedicated to Klaas Veenhof (pp. 29–48).
Kültepe International Meetings 2 (SUBARTU, 39). Turnhout: Brepols.

Pandolf, K. B., Givoni, B., & Goldman, R. F. (1977). Predicting energy expenditure with loads while standing or
walking very slowly. Journal of Applied Physiology, 43(4), 577–581. doi:10.1152/jappl.1977.43.4.577
París Roche, A. (2008). MIDE: Método para la información de excursiones. Manual de procedimientos [Methodology
for assessing the difficulty of a walking route]. Versión 1.1. Retrieved from www.montanasegura.com/MIDE/
manualMIDE.pdf
Peeples, M. A., Barton, C. M., & Schmich, S. (2006). Resilience lost: Intersecting land use and landscape dynamics in
the prehistoric southwestern United States. Ecology and Society, 11(2), 22. Retrieved from www.ecologyandsociety.
org/vol11/iss2/art22/
Polla, S., & Verhagen, P. (Eds.). (2014). Computational approaches to movement in archaeology: Theory, practice and interpreta-
tion of factors and effects of long term landscape formation and transformation. Topoi Berlin Studies of the Ancient World:
Vol. 23. Berlin: De Gruyter. Retrieved from www.degruyter.com/viewbooktoc/product/182464
Pontzer, H., Raichlen, D. A., Wood, B. M., Mabulla, A. Z. P., Racette, S. B., & Marlowe, F. W. (2012). Hunter-­gatherer
energetics and human obesity. PLoS One, 7(7), e40503.
Posluschny, A. G. (2010). Over the hills and far away? Cost surface based models of prehistoric settlement Hinterlands.
In B. Frischer, J. Webb Crawford, & D. Koller (Eds.), Making history interactive: Computer applications and quantita-
tive methods in archaeology (CAA): Proceedings of the 37th international conference, Williamsburg,Virginia, United States of
America, March 22–26 (pp. 313–319). BAR International Series, S2079. Oxford: Archaeopress.
Rademaker, K., Reid, D. A., & Bromley, G. R. M. (2012). Connecting the dots. In White & Surface-­Evans (2012,
pp. 32–45).
Rogers, S. R., Collet, C., & Lugon, R. (2015). Least cost path analysis for predicting glacial archaeological site potential
in central Europe. In A. Traviglia (Ed.), Across space and time: Papers from the 41st conference on computer applications
and quantitative methods in archaeology, Perth, March 25–28, 2013 (pp. 261–275). Amsterdam: Amsterdam University
Press.
Schneider, J. (1879). Römische Heerstraßen auf der linken Rhein-­und Moselseite. [Roman military roads left of the
rivers Rhine and Mosel]. Bonner Jahrbücher, 67, 21–28.
Soule, R., & Goldman, R. (1972). Terrain coefficients for energy cost prediction. Journal of Applied Physiology, 32,
706–708.
Surface-­Evans, S. L. (2012). Cost catchments: A least cost application for modeling hunter gatherer land use. In
White & Surface-­Evans (2012, pp. 128–151).
Tobler, W. (1993). Non-­isotropic geographic modeling. Technical Report No. 93-­1. Retrieved from https://cloudfront.
escholarship.org/dist/prd/content/qt05r820mz/qt05r820mz.pdf
Van Lanen, R. (2017). Changing ways: Patterns of connectivity, habitation, and persistence in Northwest European lowlands
during the first millennium AD. Utrecht: Utrecht Studies in Earth Science.
Verhagen, P. (2013). On the road to nowhere? Least cost paths and the predictive modelling perspective. In F. Contre-
ras, M. Farjas, & F. J. Melero (Eds.), Fusion of cultures: Proceedings of the 38th annual conference on computer applications
and quantitative methods in archaeology, Granada, Spain, April 2010 (pp. 383–389). BAR International Series, 2494.
Oxford: Archaeopress.
Verhagen, P., & Jeneson, K. (2012). A Roman puzzle: Trying to find the via Belgica with GIS. In A. Chrysanthi,
P. Murrieta Flores, & C. Papadopoulos (Eds.), Thinking beyond the Tool (pp. 123–100). BAR International Series,
2344. Oxford: Archaeopress.
Waugh, D. (2002). Geography: An integrated approach. Cheltenham: Nelson Thornes.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor & Francis.
White, D., & Surface-­Evans, S. (Eds.). (2012). Least cost analysis of social landscapes: Archaeological case studies. Salt Lake
City: The University of Utah Press.
Whitley, T. G., & Burns, G. (2007). An explanatory framework for predictive modeling using an example from Mar-
ion, Horry, Dillon, and Marlboro Counties, South Carolina. In J. Clark & E. Hagemeister (Eds.), Digital discov-
ery: Exploring new frontiers in human heritage: Computer applications and quantitative methods in archaeology:
Proceedings of the 34th conference, Fargo, United States, April 2006 (pp. 121–129). Budapest: Archaeolingua.
Whitley, T. G., & Burns, G. (2008). Conditional GIS surfaces and their potential for archaeological predictive mod-
elling. In A. Posluschny, K. Lambers, & I. Herzog (Eds.), Layers of perception: Proceedings of the 35th international
358 Irmela Herzog

conference on computer applications and quantitative methods in archaeology (CAA), Berlin, Germany, April 2–6, 2007
(pp. 292–298). Kolloquien zur Vor-­und Frühgeschichte, 10. Bonn: Rudolf Habelt.
Wood, B., & Wood, Z. (2006). Energetically optimally travel across terrain: Visualizations and new metrics of geo-
graphic distance with anthropological applications. In R. F. Erbacher, J. C. Roberts, M. T. Gröhn, & K. Börner
(Eds.), Visualization and data analysis 2006. SPIE Proceedings, Volume 6060. doi:10.1117/12.644376
19
Processing and analysing satellite data
Tuna Kalaycı

Introduction
The human production of landscapes requires substantial labour investment. Settlement mounds, field
boundaries, cairns, water canals, road systems and many other anthropogenic features create a cultural
signature within natural space. Furthermore, human disturbance of natural soil results in differences in
mineralogy, chemical constituents, soil moisture, soil structure, particle size and organic material content.
As a consequence, archaeologists can still trace ancient landscapes today. In this respect, satellite remote
sensing has increasingly provided better and more efficient tools for the detection, identification, and
understanding of past human activities – with spatial consequences.
Human intentionality results in cultural features with shapes which rarely occur naturally; for example,
ancient canals and roadways may follow long, linear paths and defensive forts have regular forms. Archae-
ological features are also structurally different from the remainder of the landscape and at times they are
significantly obtrusive within a view; settlement mounds were not only places for habitation, for example,
but also structures against which one could physically and/or socially orientate oneself. Indeed, some are
visible from space and they continue to guide archaeologists today. But it is not only monumental struc-
tures that leave detectable traces. When archaeological features degrade and lose their original forms, they
may create moisture differences with the natural environment in which they are situated; ancient ditches
now filled with loose soil have differential water retention capabilities, so they can be detected in satellite imagery as soil marks.
The main axiom in archaeological remote sensing is that human-­made features are detectable in satel-
lite imagery since they create a measurable data contrast with their immediate surroundings. There are
two contrast types: direct and proxy (Beck, 2007). Through direct contrast a sensor registers electromag-
netic data from an archaeological feature (or its remnants) which remains visible on the surface so that
the emitted/reflected energy has clear archaeological origins. Proxy contrast, on the other hand, arises when a sensor reads data which is only indirectly related to an archaeological feature. This
type of data is formed by the interaction between a cultural feature and its natural setting, such as crop
marks.
Direct or proxy, the level of contrast determines the success of a remote sensing analysis. The degree
of separation between cultural and natural (i.e. contrast) is determined by a significantly large number
of variables.1 A simplified but representative list might include local geological, geographical and geo-
morphological background conditions; chemical and physical composition of archaeological materials;
mechanics of degradation; time of data acquisition; and sensor resolutions.
The use of high-­resolution imagery in archaeology goes back as early as the beginning of the 20th
century, with pictures of Stonehenge taken from a military balloon in September 1906 revealing the
value of high-­altitude imagery for archaeology (Barber, 2012). Since that date, aerial photographs have
been widely used in archaeological studies (Brophy & Cowley, 2005; Riley, 1987; Wilson, 2000). On the
other hand, collecting and accessing aerial imagery is restricted in some parts of the world; or the tradi-
tion of practising aerial archaeology may not exist at all, and thus, only a handful of images may exist
for archaeological use.2 Space-­borne imagery has become a competent alternative to aerial archaeology
thanks to the constant increase in the spatial resolutions of satellite systems.
With resolutions of less than 1 metre, Very High Resolution (VHR) Earth observation satellites can
now resolve archaeological features in great detail, as has been reflected in the number of related studies
(e.g. Deroin, Téreygeol, & Heckes, 2011; Garrison et al., 2008; Malinverni, Pierdicca, Bozzi, Colosi, &
Orazi, 2017; Scardozzi, 2012). In particular, scholars have been using VHR imagery extensively for moni-
toring cultural heritage in areas where political turmoil and social unrest prohibit travelling to sites (e.g.
Casana & Panahipour, 2014; Lasaponara, Danese, & Masini, 2012).
While numerous research projects make use of high-­spatial-­resolution true-­colour satellite imagery
(where the software emulates how we see the earth’s surface using the visible portion of the electro-
magnetic spectrum), it is also possible to go beyond the visible spectra and exploit unique relationships
between materials and the ways in which electromagnetic energy is reflected/emitted from them. This is
to say that materials respond differentially to different portions of the electromagnetic spectrum. There-
fore, by determining unique ‘material signatures’ it is theoretically possible to discriminate archaeologi-
cal sites from non-­sites and thoroughly map archaeological features using multispectral analysis3 (e.g.
Oltean & Lauren, 2012).
Between 1982 and 2012, the Landsat 4 and 5 satellites carrying Thematic Mapper (TM) multispec-
tral sensors spearheaded Earth observation, providing invaluable datasets for archaeologists from as early
as the 1980s, especially useful for better understanding environments (Clark, Garrod, & Pearson, 1998;
Cooper, Bauer, & Cullen, 1991; Pope & Dahlin, 1989). Later studies continued this trend, making use
of other satellite systems with higher spatial or temporal resolutions (Kouchoukos, 2001; Schmid, Koch,
DiBlasi, & Hagos, 2008), as well as finer spectral resolutions – also called hyperspectral data (Alexakis,
Sarris, Astaras, & Albanakis, 2009; Savage, Levy, & Jones, 2012). In the latter work, scholars focus on
specific parts of the spectrum to investigate how particular archaeological features are depicted in sensor
data (Altaweel, 2005) or they use mathematical operations to put different parts of the spectrum together
(also called indexing) in order to highlight the differential properties of materials for easier detection
(Agapiou, Hadjimitsis, & Alexakis, 2012; Lasaponara & Masini, 2006). Finally, researchers make use of
derived variables from satellite sensing data; digital elevation models (DEMs) in particular having been
widely used (Hritz & Wilkinson, 2006; Siart, Bubenzer, & Eitel, 2009).

Method

The principles of satellite remote sensing and electromagnetic radiation


Space-­borne sensors collect data by measuring the electromagnetic radiation reflected and/or emit-
ted from the ground surface. In return, these measurements provide information about the physical
[Figure 19.1 diagram: a propagating wave labelled with its amplitude, wavelength and direction of propagation.]
Figure 19.1 The electric field (E – dashed line) remains perpendicular to the magnetic field (H – solid line)
as the electromagnetic energy propagates.

characteristics and chemical compositions of (archaeological) features. The main source of reflection/
emission is the Sun, as it produces a full spectrum of electromagnetic radiation (from gamma rays to
long waves).
The radiation travelling to the Earth at the speed of light is composed of electric (E) and magnetic (H)
fields. These two components are perpendicular to each other (Figure 19.1). The electrical field (E) var-
ies in amplitude in the direction of propagation and the magnetic field (H) propagates in-­phase with the
electrical field (Jackson, 1962, p. 204). The characteristics of the electromagnetic radiation, and thus the
ways in which it is finally reflected back and/or emitted from surface matter, can be defined through its
wavelength (λ, the distance from one crest to the next), frequency (ν, the number of crests passing a fixed point in a given period of time), and amplitude (the height of each crest, also called spectral irradiance).
The speed of electromagnetic energy (c) is constant at 299,792 km/sec (in vacuum) and the relationship
between the wavelength and the frequency at a given time is:

c = λ × ν (19.1)

This is also to say that the wavelength and frequency of radiation are inversely proportional to each
other and as frequency increases (or wavelength decreases) the energy of the system increases since the
speed of light is always constant (as measured in a vacuum).
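To make the inverse relationship concrete, the following minimal Python sketch evaluates Equation 19.1 for two illustrative wavelengths (the 550 nm and 5.6 cm values are examples chosen here, not drawn from the text):

```python
# A numerical check of Equation 19.1: c = wavelength * frequency.
C = 299_792_458.0  # speed of light in a vacuum, metres per second

def frequency(wavelength_m):
    """Return the frequency (Hz) corresponding to a wavelength (m)."""
    return C / wavelength_m

# Green light near the middle of the visible range (~550 nm; illustrative value):
print(f"{frequency(550e-9):.3e} Hz")  # ~5.451e+14 Hz
# A 5.6 cm microwave wavelength (typical of C-band radar; illustrative value):
print(f"{frequency(0.056):.3e} Hz")   # ~5.353e+09 Hz
```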
The parameters of electromagnetic radiation are helpful for describing the portions of the spectrum
(Figure 19.2). The most significant part of the spectrum for humans is visible light with a wavelength
range from (approximately) 400 nanometres to 700 nanometres (1 nm = 10⁻⁹ m), a narrow limit set by
human visual physiology. The other major categories relevant to remote sensing are infrared and micro-
wave, all of which have their own unique ways of interaction with the surface material and, thus, have
designated areas of scholarly interest (Campbell & Wynne, 2011, Chapters 17–21).
Once the light starts travelling in the atmosphere it interacts with its immediate environment (gases,
particles, differential temperature, etc.) and scatters, refracts, and is absorbed in numerous ways until it
Figure 19.2 A simplified depiction of electromagnetic radiation. The visible light, which is mainly made up
of red, green and blue, constitutes only a small fraction of the full spectrum.

reaches the ground. Therefore, atmospheric conditions can greatly affect sensor readings. Sensors rely-
ing solely on the energy of the Sun (in the form of electromagnetic radiation) are called passive sensors
and they can only operate within specific atmospheric windows (Campbell & Wynne, 2011, pp. 45–48).
Active satellite sensors, however, have the capability of emitting their own energies (mainly in the micro-
wave portion of the spectrum), and they detect the backscatter of their own emitted radiation. Active sensor measurements can penetrate through clouds and can operate even at night4 (Chapman & Blom,
2013). Both passive (e.g. Agapiou, 2017; Schuetter et al., 2013) and active sensor types (e.g. Holcomb &
Shingiray, 2007; Stewart, 2017) have a wide range of applications in archaeology, but their coupled use
also attracts attention (e.g. Stewart, Oren, & Cohen-­Sasson, 2018).

Sensor design
Modern sensor data is collected in digital format (for a historical analogue counterpart, see the case study
section on CORONA) and is delivered ready for processing and analysis through computer software.
CCD (Charged-­Coupled Device) and CMOS (Complementary Metal Oxide Semiconductor) are con-
sidered to be important digital sensor designs for the collection of digital imagery. Nevertheless, optical-­
mechanical systems are still the dominant sensor types and provide a robust technology for collecting data
across the electromagnetic spectrum (Campbell & Wynne, 2011, Chapter 4).
As the satellite bus passes over a region of interest, the optical-­mechanical sensor scans the surface
of the Earth and collects electromagnetic energy. On-­board filtering (i.e. grating) splits (i.e. diffracts)
radiation into different segments (also called spectral channels), resulting in an orderly arrangement of
wavelengths. Next, detectors convert channelled radiation into a series of electric currents as a representa-
tion of land surface brightness at different wavelengths. Finally, electrical signals are amplified and data is
recorded digitally after conversion (Lillesand, Kiefer, & Chipman, 2004, Chapter 5).

Sensor resolutions
In this complex engineering design, there are some key parameters which are significant determinants
in archaeological applications. First and foremost is the spatial resolution, which reflects the ability of a
sensor to capture geometric details of the area or object under investigation, where each pixel represents
a corresponding area on the ground (Jensen, 2014, pp. 14–16). For instance, a 15-­metre sensor resolution
indicates that the value of a pixel derives from averaging the electromagnetic energy reflected/emitted from a ground surface area of 15 metres by 15 metres. Therefore, the smaller the unit of
ground area, the more detailed the final dataset. A very generic rule of thumb in remote sensing is that the spatial resolution of a sensor should be at most half the size of the object of interest. While very high-resolution
sensor data (0.5–2.0 metres) can provide detailed geometric descriptions of archaeological features, high
acquisition costs and limited ground coverage impede common scientific usage.
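As a minimal sketch, the half-size rule of thumb can be expressed directly (the feature widths and pixel sizes below are hypothetical):

```python
# A minimal sketch of the rule of thumb that pixel size should be at most
# half the size of the object of interest for the object to be resolvable.

def resolvable(feature_width_m, pixel_size_m):
    """Apply the generic half-size rule of thumb described in the text."""
    return pixel_size_m <= feature_width_m / 2

# A hypothetical 10 m wide feature against 15 m and 0.5 m pixels:
print(resolvable(10, 15))   # False: the feature averages into its background
print(resolvable(10, 0.5))  # True: several pixels fall within the feature
```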
Spectral resolution is determined by the width of the spectral channels. A narrower channel is a
more accurate descriptor of collected radiation as it focuses on a smaller portion of the electromagnetic
spectrum (Jensen, 2014, p. 14). A narrower width, however, also requires a larger number of channels (or
bands in sensor data) to meaningfully represent the spectrum. A multi-­spectral sensor is one that collects
data beyond the visible spectra, such as Landsat-8 with 11 bands, or Sentinel-2A with 13 bands. When
the sensor is designed with higher spectral resolution, it is called hyper-­spectral, such as EO-­1 Hyperion
with more than 200 bands (Savage et al., 2012).
Radiometric resolution is the ability of a sensor to record brightness at finer levels; it represents the smallest detectable difference in energy (Jensen, 2014, p. 18). A coarse radiometric resolution indicates
that limited levels of data are recorded in smaller computer data bits (2-­bits or 4-­bits), usually resulting in
a high-­contrast image. Fine resolution is the representation of ground brightness in greater detail (8-­bits
or beyond), resulting in many levels of brightness.
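The number of grey levels follows directly from the bit depth (2 raised to the number of bits); a short Python illustration (the bit depths shown are common examples, not an exhaustive list):

```python
# Grey levels implied by a given radiometric resolution: 2 ** bits.
for bits in (2, 4, 8, 11):
    print(f"{bits}-bit data -> {2 ** bits} grey levels")
# 2-bit data yields only 4 levels (hence the high-contrast look); 8-bit data
# and beyond yield hundreds to thousands of levels of brightness.
```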
Temporal resolution is the frequency with which the sensor visits the same area of interest in a given
time period (Jensen, 2014, pp. 17–18). A fine temporal resolution means data are available at close time intervals, so it is possible to detect and/or monitor changes in the landscape (e.g. before and after
looting (Lasaponara et al., 2012)). A finer resolution also increases the chances of acquiring the best scene
from a region, such as one without cloud coverage obstructing the view.

A generic workflow for satellite remote sensing analysis


Satellite remote sensing analysis can range from quick feature reconnaissance to complex mathematical
modelling. Therefore, it is not possible to suggest a one-­size-­fits-­all methodology. Nevertheless, a generic
description of a workflow can still be illuminating. The workflow is composed of six packages in a
recursive pipeline (Figure 19.3).

[Figure 19.3 diagram: Problem Definition → Defining Data Needs → Data Acquisition → Data Pre-processing → Data Analysis / Image Classification → Evaluation, with Evaluation feeding back into earlier steps.]
Figure 19.3 The main steps in a satellite remote sensing analysis. The workflow has a hierarchical structure.
Evaluation of results may reset the workflow until a satisfactory solution is achieved.
Problem definition
A project should start with an in-­depth definition of the problem as this determines the next steps of the
analysis. For instance, a mapping project’s needs are completely different from those of remote-­sensing-­
based paleoenvironmental modelling. The former might be conducted by a single researcher using an
online viewing platform (such as Google Earth) and a budget computer with stable internet connection,
while the latter might need the collaboration of multiple researchers from different disciplines working
with large sets of remote sensing data on a computer with high processing power.
Desired outcomes should be clearly defined, since it might not be possible to easily alter the scope
during the workflow; because the process is hierarchical, a change in one step would require changes
in previous steps. An explicit designation of the project boundary will help during data acquisition and
will prevent later additions which would require rolling back the workflow. Finally, it is beneficial to
assign a specific coordinate system at the outset of the project, as later conversions between coordinate
systems and projection types might lead to spatial mismatches with other geographic datasets of the
project.

Defining data needs


Once the problem is set, sensor resolutions can help to define the data needs of the project. If the project’s
aim is to map archaeological features in detail, the use of a very high-­resolution sensor will be appropri-
ate. A basic reconnaissance, however, can be conducted with coarser resolutions. If the target of analysis
is well defined, it is in the researcher’s best interest to investigate its physical characteristics and potential
responses to electromagnetic radiation in different wavelengths. In return, the researcher may pick a sen-
sor which is the most responsive to the object of analysis, as different sensors have different bands (and
spectral ranges). For instance, if the project requires an analysis in time-­series form, the researcher should
concentrate on systems with more frequent ground repeat cycles (i.e. higher temporal resolution). Some satellite systems
also allow users to request data on-­demand, with additional data acquisition cost.

Data acquisition
Governmental agencies and companies in the private sector provide remotely sensed data. For publicly
funded satellite systems (e.g. Landsat, Sentinel) it is customary to have open access through an online
portal (e.g. https://earthexplorer.usgs.gov/; https://scihub.copernicus.eu/) or through project proposals
which are subject to evaluation. Private sector launches usually involve very high-­resolution systems (e.g.
WorldView-­3, RapidEye) and data can be purchased directly or via third-­party vendors.

Data pre-­processing
Pre-­processing is a crucial step for correcting data issues which are not strictly related to the relation-
ship between the ground and its radiometric manifestations on satellite sensors. Eliminating these issues
increases the chances of highlighting usually subtle contrasts between features and their backgrounds.
Furthermore, pre-­processing improves data integrity and interoperability. Acquired data might already
be processed at different levels (Arvidson et al., 1986, p. 5), but the researcher may still need to take into
consideration possible sensor malfunctions, atmospheric effects on measurements, and spatial referenc-
ing problems. Data pre-­processing may be divided into two major groups: radiometric and geometric
pre-­processing.
Radiometric pre-­processing involves adjusting the digital values of sensor data. Optical-­mechanical
malfunctions might lead to artificial data stripes or gaps due to sensor design issues, failures in scan line correctors, eventual degradation of detectors, saturation, or problems during data downlink from satellite
to ground stations. Some of these issues might be corrected using statistical techniques ranging from
histogram equalization to interpolation (Pringle, Schmidt, & Muir, 2009; Rakwatin, Takeuchi, & Yasuoka,
2007).
The atmosphere can have considerable effects on remote sensing data collection. A sensor registers
radiation not only from the ground, but also from atmospheric scattering. There are three main types
of corrections: image-­based, modelling-­based, and corrections based on ground data (Hadjimitsis et al.,
2010). Among these, image-­based methodologies are the most common. In this methodology, features
with known/assumed spectral values are used for an adjustment through Dark Object Subtraction (DOS).
Water, for instance, absorbs solar radiation, and ideally water bodies should be spectrally registered at zero
values. Therefore, any value other than zero can be attributed to the atmospheric column over that water
body. Subtracting this value from the whole scene provides a simple approximation for atmospheric cor-
rection. Further modifications and improvements to DOS have been suggested (Chavez, 1996).
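A minimal sketch of the basic DOS adjustment, assuming a single band already loaded as a NumPy array (for instance via rasterio); the refinements of Chavez (1996) are not reproduced here:

```python
import numpy as np

def dark_object_subtraction(band, nodata=0):
    """Subtract the darkest valid pixel value (assumed to be atmospheric
    path radiance, e.g. over deep water) from every pixel in the band."""
    valid = band[band != nodata]
    dark_value = valid.min()
    corrected = band.astype(np.int32) - dark_value
    return np.clip(corrected, 0, None)  # negative values are not physical

# Synthetic 8-bit example: haze adds a constant offset of ~12 digital numbers.
rng = np.random.default_rng(0)
scene = rng.integers(12, 200, size=(100, 100), dtype=np.uint8)
print(dark_object_subtraction(scene).min())  # 0 after correction
```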
Geometric pre-­processing involves manipulating a remote sensing scene so that it is integrable with
other spatial data and represents detected ground geometry more accurately. Resampling is one of the
most common geometric pre-­processing techniques. It is especially necessary when the sampling size of
the ground (i.e. the pixel resolution) of a dataset does not match with other imagery, since such a discrep-
ancy in resolution between different rasters can, for instance, prohibit researchers from applying algebraic
operations. Resampling is performed by interpolation. The most common techniques are nearest-neighbour, bilinear and cubic convolution. Each interpolation technique has its advantages and
disadvantages (Zitová & Flusser, 2003). Resampling also creates the foundation for the ortho-­rectification
process, where the effects of terrain are removed from the scene and features are represented in their true
positions (Leprince, Barbot, Ayoub, & Avouac, 2007).
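As a sketch of the three interpolation families, scipy's zoom function can resample a band held in memory; a real pipeline would use rasterio/GDAL so that the georeferencing is updated alongside the pixels (the toy array below is invented):

```python
import numpy as np
from scipy.ndimage import zoom

band_30m = np.arange(16, dtype=float).reshape(4, 4)  # toy 4 x 4 'scene'

# Resample from a 30 m to a 15 m grid (zoom factor 2):
nearest = zoom(band_30m, 2, order=0)   # nearest-neighbour
bilinear = zoom(band_30m, 2, order=1)  # bilinear
cubic = zoom(band_30m, 2, order=3)     # cubic
print(nearest.shape, bilinear.shape, cubic.shape)  # all (8, 8)
```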
Georeferencing is the process of matching image coordinates with Earth coordinates (Ground Control
Points, or GCPs) so that features represented on imagery can be accurately located on the ground for fur-
ther analysis and evaluation (Longley, Goodchild, Maguire, & Rhind, 2005, pp. 109–126). It is routine to
acquire satellite data already in georeferenced form: for example, locational information might be embed-
ded in an image within the header file or come as an auxiliary file. It is also possible that the researcher
may need the camera parameters and flight information in the metadata so that the scene can be treated
photogrammetrically for a geometric correction. Finally, locational information may be available through
Rational Polynomial Coefficients (RPCs), where the relationship between the image and the ground is mathematically described by high-order polynomial functions (conventionally with 20 coefficients each) (Tao & Hu, 2001). The RPCs
are preferred when camera vendors do not publicly reveal the parameters of their sensors. Furthermore,
RPCs can be used for any sensor since the mathematical description between an image and the ground
is system independent. Finally, RPCs are considered to be a fast solution to photogrammetric problems
(Dowman & Dolloff, 2000).

Data analysis
Satellite remote sensing has a wide variety of applications, ranging from agriculture to urban planning
and from disaster management to wetland mapping. Therefore, the objectives of research applications are
diverse, resulting in a vast body of literature on data analysis. The discussion below only provides selected
information, with a focus on the techniques which archaeologists employ most frequently.
Panchromatic sharpening is a data fusion technique in which a higher resolution panchromatic band is
used to increase the spatial resolution of a multi-­spectral band5 (Lasaponara & Masini, 2012a). Sharpening
is usually performed with transformations (e.g. Intensity-Hue-Saturation), statistical methods (e.g. Principal Component Analysis), or algebraic operations (e.g. Brovey). The final result is a spatially improved
spectral layer which is ready for visual investigation. However, it is important to note that pan-sharpening is a form of data manipulation, and the altered layers may not be suitable for further quantitative analysis and modelling.
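Of the algebraic methods mentioned, the Brovey transform is simple enough to sketch; the toy bands below are synthetic, and the multispectral bands are assumed already co-registered and resampled to the panchromatic grid:

```python
import numpy as np

def brovey(red, green, blue, pan, eps=1e-6):
    """Brovey pan-sharpening: rescale each multispectral band by the
    ratio of the panchromatic band to the sum of the bands."""
    ratio = pan / (red + green + blue + eps)  # eps avoids division by zero
    return red * ratio, green * ratio, blue * ratio

rng = np.random.default_rng(1)
r, g, b = (rng.random((8, 8)) for _ in range(3))   # toy multispectral bands
pan = (r + g + b) / 3 + 0.01 * rng.random((8, 8))  # toy higher-detail pan band
print([band.shape for band in brovey(r, g, b, pan)])
```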
Image enhancement is the rearrangement of image brightness levels with the aim of increasing the
level of contrast between archaeological features and their natural backgrounds. Enhancement strictly
depends on the data distribution of a particular scene, so there is no rule of thumb for its implementa-
tion. Radiometric approaches include, but are not limited to, contrast enhancement, linear stretching,
histogram equalization, and density slicing. Like panchromatic sharpening, image enhancement variably
alters sensor data values, so it is not advised to use enhanced images in further data analysis if the aim is
to physically model feature signatures.
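As one example of a radiometric approach, a percentile-based linear stretch can be sketched as follows (the 2nd–98th percentile limits are a common convention rather than a fixed rule, and the input band is synthetic):

```python
import numpy as np

def percentile_stretch(band, low=2, high=98):
    """Linearly rescale values between two percentiles to the 0-255 range."""
    lo, hi = np.percentile(band, [low, high])
    stretched = (band.astype(float) - lo) / (hi - lo)
    return (np.clip(stretched, 0, 1) * 255).astype(np.uint8)

rng = np.random.default_rng(2)
dull = rng.normal(120, 5, size=(64, 64))  # low-contrast synthetic band
print(dull.std(), percentile_stretch(dull).std())  # spread increases
```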
Image classification is the process of assigning classes to pixels/objects in such a way that the separation
is meaningful and final classes form internally consistent homogeneous entities. In this respect, a classi-
fier is the algorithmic implementation of a formal mathematical/statistical method applied to a remotely
sensed dataset with the aim of providing a robust separation between classes. Due to the complexity of
this process, there have been countless efforts to provide the most powerful classifier, and thus, the remote
sensing literature is rich with a variety of applications. It is important to emphasize that no single clas-
sification method or algorithm is superior to any other, as each remotely sensed scene is unique. Owing
to the same complexity, there is no consensus on how to categorize approaches to the classification prob-
lem. However, considering the latest developments in computational technologies, a comparison between
pixel-­based and object-­based classifications can be proposed (Duro, Franklin, & Dubé, 2012).
Pixel-­based classification methodologies involve the analysis of the spectral properties of pixels
in isolation and disregard potential spatial and contextual relationships among neighbouring pixels.
These methodologies can be further divided into two major categories. In unsupervised classification (e.g. k-means, ISODATA), the spectral characteristics of classes are not known a priori and an algorithm divides the scene into clusters based on a statistical criterion. In supervised classification (e.g. parallelepiped, maximum likelihood), there is some information about potential classes based on ground work,
expert knowledge, or other geospatial analysis. This information, in turn, is used to train a classification
algorithm for the detection and delineation of classes with similar spectral properties to the training set.
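A minimal sketch of the unsupervised route using k-means from scikit-learn, applied to a synthetic four-band scene stacked as (rows, columns, bands):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
scene = rng.random((50, 50, 4))              # toy 4-band image
pixels = scene.reshape(-1, scene.shape[-1])  # one spectral vector per pixel

# Each pixel is clustered purely on its spectral values, ignoring neighbours.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(pixels)
class_map = labels.reshape(scene.shape[:2])  # back to image space
print(np.unique(class_map))                  # five spectral clusters
```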
Object-­based classification (or object-­based image analysis, OBIA) methodologies decompose a
remotely sensed scene into unclassified image objects (also called “object primitives”) using a segmenta-
tion process (Szeliski, 2010, Chapter 5). Through image segmentation each pixel is flagged so that pixels
with the same flag form a group. A pixel shares a specific characteristic with other pixels in the same
group. Finally, statistical analysis is used to determine the characteristics of image objects (shape, size,
colour, texture, and context) for a final classification.
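The segmentation step can be sketched with SLIC superpixels from scikit-image, one of many possible segmentation algorithms; a full OBIA workflow would go on to compute per-segment shape, size, colour, texture and context statistics and classify them (the input scene below is synthetic):

```python
import numpy as np
from skimage.segmentation import slic

rng = np.random.default_rng(4)
image = rng.random((60, 60, 3))  # toy 3-band scene

# Group pixels into unclassified 'object primitives' (superpixels).
segments = slic(image, n_segments=40, compactness=10)
print(segments.shape, int(segments.max()) + 1, "object primitives")
```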
Archaeological feature detection can be considered a specific form of image classification, since the
ultimate aim of feature detection is to delineate pixels with specific anthropogenic meanings surrounded
by the pixels of natural background. One of the benefits of satellite remote sensing in archaeology is that
it enables wide-­area detection of archaeological sites and features (Casana, 2014). Also, with its ability
to collect data beyond the visible portion of the electromagnetic spectrum, it highlights features which
are not immediately observable (Oltean & Lauren, 2012). Combined, these two attributes
open up the possibility for identifying and documenting archaeological material in a given area in its
entirety. Furthermore, advancements in information technologies – but more so the introduction of very
high-resolution sensors in archaeology – help in a wide variety of (semi-)automated mapping processes during feature detection. For instance, De Laet, Paulissen, and Waelkens (2007) employ a pixel-based
maximum-­likelihood classification approach for the identification and extraction of features. Larsen,
Trier, and Solberg (2008) use template matching by moving a synthetic circular shape across the image
to detect potential ring-­shaped structures. Jahjah and Ulivieri (2010) follow a mathematical morphol-
ogy approach where variations in the intensity of data are transformed onto two-­dimensional space for
extracting geometric features. Menze and Ur (2012) investigate multi-­spectral data in time series form
and employ machine learning techniques to locate mounded settlements. Finally, Lauricella, Cannon,
Branting, and Hammer (2017) apply Principal Component Analysis (PCA) to highlight potential targets
of interest (evidence of looting) and use geometric filters to identify false positives for a more accurate
feature detection workflow.

Evaluation
The final step in a generic archaeological remote sensing project is the evaluation of results. Evaluation
can be qualitative, based on visual investigation, or results can be compared quantitatively with another dataset (or against a set threshold value) to assess the viability of the preceding pre-processing and analysis steps.
Therefore, evaluation provides feedback for the overall model, and as a result the researcher may need to
re-­run data analysis with different parameters, employ other algorithms, consider using different sensor
data, or even re-­evaluate the research question.
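Where ground-truth labels exist, the quantitative comparison is often summarised with a confusion matrix and an overall accuracy figure; a minimal sketch with invented labels:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

truth = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 0])      # e.g. field validation
predicted = np.array([0, 1, 1, 1, 2, 2, 2, 2, 0, 0])  # classifier output
print(confusion_matrix(truth, predicted))
print("overall accuracy:", accuracy_score(truth, predicted))  # 0.7
```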

Case study
A very special form of high-­spatial-­ and temporal-­resolution satellite sensor pre-­dates many of the
available sensors today and has been extensively used in archaeological studies. The CORONA spy sat-
ellite was developed as part of the US intelligence program (1960 to 1972) in the Cold War Era (Day,
Logsdon, & Latell, 1999). Due to its historicity, panchromatic CORONA images provide snapshots of
archaeological landscapes prior to recent large constructions, industrial agriculture, and urban expan-
sion. The impact of such land-­use/land-­cover change on the preservation of ancient material culture
is immense, and in many cases, there is complete loss (Casana, Cothren, & Kalayci, 2012). Among
many CORONA missions, the declassified series with Keyhole-­KH 4A and 4B designators provide the
most suitable data for archaeological research, since they offer high-­spatial-­resolution imagery (2.74m
and 1.83m at nadir, respectively) and have considerably larger ground coverage (17 × 232 and 14 ×
188 kilometres, respectively).6 Traditionally, CORONA is well-­known for its applications in Meso-
potamia, but there are also other examples from Central Asia (Goossens, Wulf, Bourgeois, Gheyle, &
Willems, 2006), China (Min, 2013), Egypt (Moshier & El-­Kalani, 2008), Greece (Kalayci, 2014), and
India (Conesa et al., 2015).
CORONA has been most useful when integrated with other satellite systems for the exploration of archaeological landscapes. To give a few examples, Richason and Hritz (2007) visually compare
CORONA imagery with later-­dated sensors and detect ancient canals before the destruction of modern
land-­use practices. Menze and Ur (2012) propose CORONA and spectral satellite data coupling in order
to build a multi-­temporal classification methodology with the intention of exploring long-­term settle-
ment patterns. Parcak, Mumford, and Childs (2017) compare CORONA with later-­generation very
high-­resolution sensors in order to assess landscape scale changes.
Details of a specific case study will further highlight the use of CORONA in archaeological research.
The Bronze Age landscapes of Upper Mesopotamia contain a distinct archaeological feature called hollow
Figure 19.4 A CORONA scene (DS1102–1025DF007, December 1967) showing the location and extent
of hollow ways radiating from Tell Brak, Syria. The terminal points of hollow ways can be mathematically
modelled in order to estimate the area of agricultural production around sites.

ways (Ur, 2003, 2017). Wilkinson (1994) convincingly argues that hollow ways were formed due to the
repetitive movement of flocks between settlements and open pasture. While on the move, flocks were kept
in groups in order to minimize damage to agricultural fields. Once the terminal point was passed, animals
were allowed to disperse in the landscape. Therefore, the end points of hollow ways can be considered as
the markers of agricultural production boundaries (Figure 19.4).
Following this theory, Kalayci (2016) explores the relationship between the size of a settlement –
as a proxy of its ancient population – and its crop production potential. The CORONA imagery is
used to map the hollow ways and determine the boundaries of agricultural production from a sample
of Bronze Age sites. Next, he proposes using a combination of precipitation reconstructions and a rainfall-dependent, remotely sensed vegetation growth model. This model relies on the use of Normal-
ized Difference Vegetation Index (NDVI) values which are calculated from the Advanced Very High
Resolution Radiometer (AVHRR) (Figure 19.5). Finally, he estimates how much yield a site might have
produced within its production boundaries and compares these metrics with corresponding settlement
areas.
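The text does not spell out the index itself, but NDVI is conventionally defined as (NIR − Red) / (NIR + Red); a minimal sketch (the reflectance values below are invented):

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Normalized Difference Vegetation Index; values range from -1 to 1."""
    nir, red = nir.astype(float), red.astype(float)
    return (nir - red) / (nir + red + eps)

red = np.array([[0.10, 0.30], [0.05, 0.25]])  # toy red reflectances
nir = np.array([[0.50, 0.35], [0.45, 0.20]])  # toy near-infrared reflectances
print(ndvi(nir, red))  # dense vegetation approaches 1; bare soil nears 0
```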
Figure 19.5 Multi-­spectral data analysis with vegetation indices provides a detailed and dynamic representa-
tion of agricultural landscapes. These models surpass static descriptions of agro-­economic zones, which are
usually based on strict assumptions about productivity. The circles in the figure show production boundaries
of Bronze Age settlements.

Figure 19.6 Scatter plots revealing the strength of the relationship between settlement size and estimates of
total production. The plot (a) shows a weak relationship when estimated production values are directly com-
pared with settlement size. However, when a biennial fallowing strategy is introduced for settlements smaller
than 50 hectares (b), the relationship becomes much stronger.

When settlement area and total production estimates are compared directly, the relationship between
these two variables appears to be weak (correlation coefficient, Pearson’s r=0.3) (Figure 19.6(a)). How-
ever, when a biennial fallowing is introduced to the model for settlements smaller than 50 hectares, the
correlation coefficient rises to 0.85 (Figure 19.6(b)). Thanks to the coupling of CORONA with multi-­
spectral satellite data analysis, the study provides a dynamic representation of land-­use practices and chal-
lenges normative assumptions about population pressure and food production practices.
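The statistic reported here is Pearson's r; for readers wanting to reproduce the comparison on their own data, a minimal NumPy sketch (the paired values below are invented, not the study's):

```python
import numpy as np

area = np.array([8, 15, 22, 40, 55, 70, 90, 110])           # ha, invented
production = np.array([12, 30, 35, 80, 95, 130, 160, 210])  # t, invented
r = np.corrcoef(area, production)[0, 1]
print(round(r, 2))  # values near 1 indicate a strong linear relationship
```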

Conclusion
Satellite remote sensing is changing the ways in which scholars undertake landscape archaeology projects.
Today there are specialized books dedicated solely to satellite remote sensing applications in archaeology
(Comer & Harrower, 2013; Lasaponara & Masini, 2012b; Parcak, 2009). As a result, satellite remote sens-
ing in archaeology is moving beyond methodological progression and emerging as its own sub-­discipline.
It is now possible to document archaeological landscapes in their entirety and purposefully cascade scales
of analysis from site to regional to supra-­regional levels, in both space and time.
Very high-­resolution satellite imagery can now be acquired at relatively low prices. Governmental
agencies provide open access for advanced multi-­spectral (e.g. SENTINEL-­2) and hyper-­spectral (e.g.
Earth Observing One-­Hyperion) data. In particular, the use of online geospatial imagery platforms (e.g.
Google Earth) in archaeological research is a significant shift. These viewing platforms provide free access
to high-­resolution satellite imagery, at times, in time series form, making it possible to detect changes in
the landscape. In the field, viewing high-­resolution satellite data on smartphones with GPS capabilities is
now almost a standard routine. Scholarship has made astonishing progress since the earlier
pioneering studies (e.g. Behrens & Sever, 1991), which clearly manifests itself in the increasing trend of
scholarly publications (Agapiou & Lysandrou, 2015).
Semi- or fully automated detection of archaeological features in satellite imagery has been attracting interest from scholars, and it appears that this trend will continue as data resolutions improve, software attains more user-friendly interfaces, and custom algorithms are developed (e.g. Zingman, Saupe, Penatti, &
Lambers, 2016). However, new research domains are also emerging thanks to big data analytics (see also
Green, this volume). Remote sensing archaeologists have started to explore cloud computing opportuni-
ties in order to access petabytes of global-­scale data with parallel computational power at the server side
(Agapiou, 2017; Liss, Howland, & Levy, 2017).
Also moving in parallel with the advancements in internet technologies, crowdsourcing for the
analysis of remotely sensed data is a further step towards citizen science in archaeology. Among notable
examples of this approach, the GlobalXplorer Project (www.globalxplorer.org) has been the most visible
in the public sphere, while other initiatives continue to provide web platforms for online participatory
projects, e.g. TerraWatchers (www.terrawatchers.org).
A continuous critical reading of geospatial technologies – and especially of satellite remote sensing – is
a matter of the utmost importance. Satellite remote sensing brings exceptional insights into archaeological
questions and opens up innovative research avenues, especially in the landscape domain. However, there
are important concerns relating to the science and technology of this advancement, and it is vital to recog-
nize and understand how power mediates through remote sensing instruments, algorithms and software.7
There is a long list to be critically examined, but only two issues will be briefly highlighted here. First, the
relationship between the military-industrial complex and archaeology should be unearthed (Hamilakis,
2009). This relationship has materialized through the use of geospatial tools as they constitute dual-­use
technologies (Pollock, 2016, p. 220). In particular, the CORONA spy satellite system, as discussed in the case study section, was a continuation of the constant seepage of army surveillance ideology into archaeological practice. The second critique is woven around the transformation of surveillance into
scientific objectivity, which finds theoretical underpinnings in Haraway (1988): the findings of a remote
sensing archaeologist are sterilized through electromagnetic energy, disciplined by regular pixels, bounded by the imagery, and testable for their accuracy at the very least. Archaeology, on the other hand, is messy.

Notes
1 The delineation of sites is not only a methodological problem in satellite remote sensing. There has been no
consensus on the epistemological and ontological status of the archaeological site (Dunnell, 1992). Hence, the
separation between a cultural deposit and its background as observed from space is intrinsically a subjective
archaeological narrative despite its strictly empirical nature.
2 Recent advancements in Remotely Piloted Aircraft Systems (RPAS, also known as drones or Unmanned Aerial
Vehicles (UAVs)) have created an alternative to conventional aerial photography in archaeology. Feasible alterna-
tives, however, still suffer from system constraints, such as limited flight time, payload weight, and sensitivity to
weather conditions.
3 The determination of unique ‘material signatures’ of archaeological sites and features suffers from two major
issues. First, there is no clear way to define the boundary between sites and non-­sites, if such a boundary can even
be said to exist (see endnote 1). Second, in very rare cases a pixel is composed of a single homogeneous material.
Almost always, a pixel value is the average of reflected/emitted radiation from various surface features. Therefore,
a ‘material signature’ of a feature exists only at a hypothetical level. Signal estimation of material contributions to a
pixel value (i.e. spectral mixture analysis) is one of the main research avenues in remote sensing (Adams, Smith, &
Johnson, 1986; Somers, Asner, Tits, & Coppin, 2011).
4 Active Radar sensor data is also based on electromagnetic theory, but sensor conceptualization and data analysis
require a different approach than do multi-­spectral sensors (Wiseman & El-­Baz, 2007). For brevity of discussion,
the remainder of the text is written with multi-­spectral systems in mind.
5 A panchromatic band is the combination of red, green and blue bands and sometimes the near-­infrared band.
Therefore, during the scan a panchromatic sensor needs less radiation to register a value than does a multi-­spectral
sensor. This is to say that relative to multi-­spectral sensors, panchromatic sensors can collect the same energy from
a smaller portion of the ground, translating into finer spatial resolution.
6 A photogrammetrically corrected comprehensive CORONA KH4-­B inventory is hosted at http://corona.cast.
uark.edu (accessed January 2019).
7 Geographic Information Systems (GIS) have faced similar critiques (e.g. Gaffney &
Van Leusen, 1995; Wickstead, 2009). It appears that the discussion is now settled (or has been marginalized!) as
one considers the widespread and somewhat uncritical use of geospatial technologies, although efforts to offer
informed practice continue (Gillings, 2012).

References
Adams, J. B., Smith, M. O., & Johnson, P. E. (1986). Spectral mixture modeling: A new analysis of rock and soil types
at the Viking Lander 1 site. Journal of Geophysical Research: Solid Earth, 91(B8), 8098–8112.
Agapiou, A. (2017). Remote sensing heritage in a petabyte-­scale: Satellite data and heritage Earth Engine© applica-
tions. International Journal of Digital Earth, 10(1), 85–102.
Agapiou, A., Hadjimitsis, D. G., & Alexakis, D. D. (2012). Evaluation of broadband and narrowband vegetation
indices for the identification of archaeological crop marks. Remote Sensing, 4(12), 3892–3919.
Agapiou, A., & Lysandrou, V. (2015). Remote sensing archaeology: Tracking and mapping evolution in European
scientific literature from 1999 to 2015. Journal of Archaeological Science: Reports, 4, 192–200.
Alexakis, D., Sarris, A., Astaras, T., & Albanakis, K. (2009). Detection of Neolithic settlements in Thessaly (Greece)
through multispectral and hyperspectral satellite imagery. Sensors, 9(2), 1167–1187.
Altaweel, M. (2005). The use of Aster satellite imagery in archaeological contexts. Archaeological Prospection, 12,
151–166.
Arvidson, R., Billingsley, R., Chase, R., Chavez, P., Devirian, M., Estes, J., . . . Rossow, W. (1986). Report of the EOS
data panel on the data and information system, Vol. IIa of NASA TM-87777. Washington, DC: National Aeronautics
and Space Administration (NASA).
Barber, M. (2012). A history of aerial photography and archaeology: Mata Hari’s glass eye and other stories. Swindon: English
Heritage.
Beck, A. R. (2007). Archaeological site detection: The importance of contrast. In Proceedings of the 2007 annual con-
ference of the Remote Sensing and Photogrammetry Society (pp. 307–312). Red Hook, NY: The Remote Sensing and
Photogrammetry Society.
Behrens, C. A., & Sever, T. L. (Eds.). (1991). Applications of space-­age technology in anthropology. Mississippi:
NASA.
Brophy, K., & Cowley, D. (2005). From the air: Understanding aerial archaeology. London: The History Press Ltd.
Campbell, J. B., & Wynne, R. H. (2011). Introduction to remote sensing. New York: Guilford Publications.
Casana, J. (2014). Regional-­scale archaeological remote sensing in the age of Big Data: Automated site discovery vs.
brute force methods. Advances in Archaeological Practice, 2(3), 222–233.
Casana, J., Cothren, J., & Kalayci, T. (2012). Swords into ploughshares: Archaeological applications of CORONA
satellite imagery in the Near East. Internet Archaeology, 32.
Casana, J., & Panahipour, M. (2014). Satellite-­based monitoring of looting and damage to archaeological sites in
Syria. Journal of Eastern Mediterranean Archaeology & Heritage Studies, 2(2), 128–151.
Chapman, B., & Blom, R. G. (2013). Synthetic aperture radar, technology, past and future applications to archaeology.
In D. C. Comer & M. J. Harrower (Eds.), Mapping archaeological landscapes from space (pp. 113–131). New York:
Springer Science & Business Media.
Chavez, P. S. (1996). Image-­based atmospheric corrections-­revisited and improved. Photogrammetric Engineering and
Remote Sensing, 62(9), 1025–1035.
Clark, C. D., Garrod, S. M., & Pearson, M. P. (1998). Landscape archaeology and remote sensing in Southern Mada-
gascar. International Journal of Remote Sensing, 19(8), 1461–1477.
Comer, D. C., & Harrower, M. J. (2013). Mapping archaeological landscapes from space (Vol. 5). New York: Springer
Science & Business Media.
Conesa, F. C., Madella, M., Galiatsatos, N., Balbo, A. L., Rajesh, S. V., & Ajithprasad, P. (2015). CORONA photographs
in monsoonal semi-­arid environments: Addressing archaeological surveys and historic landscape dynamics over
North Gujarat, India. Archaeological Prospection, 22(2), 75–90.
Cooper, F. A., Bauer, M. E., & Cullen, B. C. (1991). Satellite spectral data and archaeological reconnaissance in West-
ern Greece. In C. A. Behrens & T. L. Sever (Eds.), Applications of space-­age technology in anthropology (pp. 63–79).
Mississippi: NASA.
Day, D. A., Logsdon, J. M., & Latell, B. (Eds.). (1999). Eye in the sky: The story of the CORONA spy satellites. Wash-
ington, DC: Smithsonian Institution Press.
De Laet, V., Paulissen, E., & Waelkens, M. (2007). Methods for the extraction of archaeological features from very
high-­resolution IKONOS-­2 remote sensing imagery, Hisar (southwest Turkey). Journal of Archaeological Science,
34(5), 830–841.
Deroin, J.-­P., Téreygeol, F., & Heckes, J. (2011). Evaluation of very high to medium resolution multispectral satel-
lite imagery for geoarchaeology in arid regions: Case study from Jabali, Yemen. Journal of Archaeological Science,
38(1), 101–114.
Dowman, I., & Dolloff, J. T. (2000). An evaluation of rational functions for photogrammetric restitution. International
Archives of Photogrammetry and Remote Sensing, 33(B3/1; PART 3), 254–266.
Dunnell, R. C. (1992). The notion site. In J. Rossignol & L. Wandsnider (Eds.), Space, time, and archaeological landscapes
(pp. 21–41). New York: Plenum Press.
Duro, D. C., Franklin, S. E., & Dubé, M. G. (2012). A comparison of pixel-­based and object-­based image analysis
with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-­5 HRG
imagery. Remote Sensing of Environment, 118, 259–272.
Gaffney, V., & Van Leusen, M. (1995). Postscript-­GIS, environmental determinism and archaeology: A parallel text. In
G. R. Lock & Z. Stancic (Eds.), Archaeology and geographical information systems: A European perspective (pp. 367–383).
London: Taylor and Francis.
Garrison, T. G., Houston, S. D., Golden, C., Inomata, T., Nelson, Z., & Munson, J. (2008). Evaluating the use of
IKONOS satellite imagery in lowland Maya settlement archaeology. Journal of Archaeological Science, 35(10),
2770–2777.
Gillings, M. (2012). Landscape phenomenology, GIS and the role of affordance. Journal of Archaeological Method and
Theory, 19(4), 601–611.
Goossens, R., Wulf, A., Bourgeois, J., Gheyle, W., & Willems, T. (2006). Satellite imagery and archaeology: The
example of CORONA in the Altai mountains. Journal of Archaeological Science, 33(6), 745–755.
Hadjimitsis, D. G., Papadavid, G., Agapiou, A., Themistocleous, K., Hadjimitsis, M. G., Retalis, . . . Clayton, C. R. I.
(2010). Atmospheric correction for satellite remotely sensed data intended for agricultural applications: Impact
on vegetation indices. Natural Hazards and Earth System Sciences, 10(1), 89–95.
Hamilakis, Y. (2009). The “war on terror” and the military–archaeology complex: Iraq, ethics, and neo-colonialism.
Archaeologies, 5(1), 39–65.
Haraway, D. (1988). Situated knowledges: The science question in feminism and the privilege of partial perspective.
Feminist Studies, 14(3), 575–599.
Holcomb, D. W., & Shingiray, I. L. (2007). Imaging radar in archaeological investigations: An image processing
perspective. In J. Wiseman & F. El-­Baz (Eds.), Remote sensing in archaeology (pp. 11–45). New York: Springer.
Hritz, C. A., & Wilkinson, T. J. (2006). Using shuttle radar topography to map ancient water channels in Mesopo-
tamia. Antiquity, 80(308), 415–424.
Jackson, D. J. (1962). Classical electrodynamics. New York: John Wiley & Sons, Inc.
Jahjah, M., & Ulivieri, C. (2010). Automatic archaeological feature extraction from satellite VHR images. Acta
Astronautica, 66(9–10), 1302–1310.
Jensen, J. R. (2014). Remote sensing of the environment: An Earth resource perspective. Essex: Pearson Education Limited.
Kalayci, T. (2014). A review on the potential use of CORONA images of Greece. In Proceedings of the Computer
Applications in Archaeology conference (CAA-GR) (pp. 55–63). Rethymno, Crete.
Kalayci, T. (2016). Settlement sizes and agricultural production territories: A remote sensing case study for the Early
Bronze Age in Upper Mesopotamia. Science and Technology of Archaeological Research, 2(2), 217–234.
Kouchoukos, N. (2001). Satellite images and the representation of near Eastern landscapes. Near Eastern Archaeology,
1–2, 80–91.
Larsen, S. O., Trier, D., & Solberg, R. (2008). Detection of ring shaped structures in agricultural land using high
resolution satellite images. In Proceedings of the pixels, objects, intelligence: Geographic object based image analysis for the
21st century conference (pp. 81–86). Alberta, Canada.
Lasaponara, R., Danese, M., & Masini, N. (2012). Satellite-­based monitoring of archaeological looting in Peru. In
Satellite remote sensing (pp. 177–193). New York: Springer Science.
Lasaponara, R., & Masini, N. (2006). Identification of archaeological buried remains based on the Normalized Dif-
ference Vegetation Index (NDVI) from Quickbird satellite data. IEEE Geoscience and Remote Sensing Letters, 3(3),
325–328.
Lasaponara, R., & Masini, N. (2012a). Pan-­sharpening techniques to enhance archaeological marks: An overview.
In R. Lasaponara & N. Masini (Eds.), Satellite remote sensing: A new tool for archaeology (pp. 87–109). New York:
Springer Science.
Lasaponara, R., & Masini, N. (Eds.). (2012b). Satellite remote sensing: A new tool for archaeology (Vol. 16). New York:
Springer Science.
Lauricella, A., Cannon, J., Branting, S., & Hammer, E. (2017). Semi-­automated detection of looting in Afghanistan
using multispectral imagery and principal component analysis. Antiquity, 91(359), 1344–1355.
Leprince, S., Barbot, S., Ayoub, F., & Avouac, J.-­P. (2007). Automatic and precise orthorectification, coregistration,
and subpixel correlation of satellite images, application to ground deformation measurements. IEEE Transactions
on Geoscience and Remote Sensing, 45(6), 1529–1558.
Lillesand, T. M., Kiefer, R. W., & Chipman, J. W. (2004). Remote sensing and image interpretation. New Jersey: John
Wiley & Sons.
Liss, B., Howland, M. D., & Levy, T. E. (2017). Testing Google Earth Engine for the automatic identification and
vectorization of archaeological features: A case study from Faynan, Jordan. Journal of Archaeological Science: Reports,
15, 299–304.
Longley, P. A., Goodchild, M., Maguire, D. J., & Rhind, D. W. (2005). Geographic information systems and science. West
Sussex: John Wiley & Sons.
Malinverni, E. S., Pierdicca, R., Bozzi, C. A., Colosi, F., & Orazi, R. (2017). Analysis and processing of nadir and stereo
VHR Pleiadés images for 3D mapping and planning the land of Nineveh, Iraqi Kurdistan. Geosciences, 7(3), 80.
Menze, B. H., & Ur, J. A. (2012). Mapping patterns of long-­term settlement in Northern Mesopotamia at a large
scale. Proceedings of the National Academy of Sciences, 109(14), E778–E787.
Min, L. (2013). Archaeological landscapes of China and the application of CORONA images. In Mapping archaeological
landscapes from space (Vol. 5, pp. 45–54). New York: Springer Science & Business Media.
Moshier, S. O., & El-­Kalani, A. (2008). Late bronze age paleogeography along the ancient Ways of Horus in North-
west Sinai, Egypt. Geoarchaeology, 23(4), 450–473.
Oltean, I. A., & Lauren, L. A. (2012). High-­resolution satellite imagery and the detection of buried archaeological
features in ploughed landscapes. In R. Lasaponara & N. Masini (Eds.), Satellite remote sensing: A new tool for archaeol-
ogy (Vol. 16, pp. 291–305). New York: Springer Science.
Parcak, S. (2009). Satellite remote sensing for archaeology. Abingdon, UK: Routledge.
Parcak, S., Mumford, G., & Childs, C. (2017). Using open access satellite data alongside ground based remote sensing:
An assessment, with case studies from Egypt’s delta. Geosciences, 7(4), 94.
Pollock, S. (2016). Archaeology and contemporary warfare. Annual Review of Anthropology, 45, 215–231.
Pope, K. O., & Dahlin, B. (1989). Ancient Maya wetland agriculture: New insights from ecological and remote sens-
ing research. Journal of Field Archaeology, 16, 87–106.
Pringle, M. J., Schmidt, M., & Muir, J. S. (2009). Geostatistical interpolation of SLC-­off Landsat ETM+ images.
ISPRS Journal of Photogrammetry and Remote Sensing, 64(6), 654–664.
Rakwatin, P., Takeuchi, W., & Yasuoka, Y. (2007). Stripe noise reduction in MODIS data by combining histogram
matching with facet filter. IEEE Transactions on Geoscience and Remote Sensing, 45(6), 1844–1856.
Richason, B. F., & Hritz, C. (2007). Remote sensing and GIS use in the archaeological analysis of the central Mesopo-
tamian plain. In J. Wiseman & F. El-­Baz (Eds.), Remote sensing in archaeology (pp. 283–325). New York: Springer.
Riley, D. N. (1987). Air photography and archaeology. Philadelphia: University of Pennsylvania Press.
Savage, S. H., Levy, T. E., & Jones, I. W. (2012). Prospects and problems in the use of hyperspectral imagery for
archaeological remote sensing: A case study from the Faynan copper mining district, Jordan. Journal of Archaeo-
logical Science, 39(2), 407–420.
Scardozzi, G. (2012). Integrated methodologies for the archaeological map of an ancient city and its territory: The
case of Hierapolis in Phrygia. In R. Lasaponara & N. Masini (Eds.), Satellite remote sensing: A new tool for archaeology
(pp. 129–156). New York: Springer Science.
Schmid, T., Koch, M., DiBlasi, M., & Hagos, M. (2008). Spatial and spectral analysis of soil surface properties for an
archaeological area in Aksum, Ethiopia, applying high and medium resolution data. CATENA, 75(1), 93–101.
Schuetter, J., Goel, P., McCorriston, J., Park, J., Senn, M., & Harrower, M. (2013). Autodetection of ancient Arabian
tombs in high-­resolution satellite imagery. International Journal of Remote Sensing, 34(19), 6611–6635.
Siart, C., Bubenzer, O., & Eitel, B. (2009). Combining digital elevation data (SRTM/Aster), high resolution satellite
imagery (Quickbird) and GIS for geomorphological mapping: A multi-­component case study on Mediterranean
karst in Central Crete. Geomorphology, 112(1), 106–121.
Somers, B., Asner, G. P., Tits, L., & Coppin, P. (2011). Endmember variability in spectral mixture analysis: A review.
Remote Sensing of Environment, 115(7), 1603–1616.
Processing and analysing satellite data 375

Stewart, C. (2017). Detection of archaeological residues in vegetated areas using satellite synthetic aperture radar.
Remote Sensing, 9(2), 118.
Stewart, C., Oren, E., & Cohen-­Sasson, E. (2018). Satellite remote sensing analysis of the Qasrawet archaeological
site in North Sinai. Remote Sensing, 10(7), 1090.
Szeliski, R. (2010). Computer vision: Algorithms and applications. London: Springer.
Tao, C. V., & Hu, Y. (2001). Use of rational function model for image rectification. Canadian Journal of Remote Sens-
ing, 27(6), 593–602.
Ur, J. A. (2003). CORONA satellite photography and ancient road networks: A Northern Mesopotamian case study.
Antiquity, 77(295), 102–115.
Ur, J. A. (2017). WorldMap: Hollow ways in Northern Mesopotamia. Boston, MA: Harvard Dataverse. Retrieved from
http://worldmap.harvard.edu/maps/14984
Wickstead, H. (2009). The uber-­archaeologist: Art, GIS and the male gaze revisited. Journal of Social Archaeology, 9(2),
249–271.
Wilkinson, T. J. (1994). The structure and dynamics of dry-­farming states in Upper Mesopotamia. Current Anthro-
pology, 35(5), 483–520.
Wilson, D. R. (2000). Air photo interpretation for archaeologists. London: The History Press Ltd.
Wiseman, J., & El-­Baz, F. (2007). Remote sensing in archaeology. New York: Springer.
Zingman, I., Saupe, D., Penatti, O. A. B., & Lambers, K. (2016). Detection of fragmented rectangular enclosures in
very high resolution remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 54(8), 4580–4593.
Zitová, B., & Flusser, J. (2003). Image registration methods: A survey. Image and Vision Computing, 21(11), 977–1000.
20
Processing and analysing geophysical data
Apostolos Sarris

Introduction
Despite the specificities of each geophysical technique, the goal is to maximize the information content
of the measurements taken and to transform the registered signals into a simple, clear and accurate form
that will allow a direct interpretation of them in relation to the suspected targets and the properties of
the surrounding soil matrix. Since subsurface soil interactions depend on different environmental and
climatic variables and anthropogenic interventions, it is necessary to apply specific processing to high-
light the information that is partially obscured by them. The need for the filtering of geophysical data
is obvious because experimental data contain external noise levels that mask the essential content of the
information that exists in the original measurements.
The processing of geophysical data follows a more or less standard flow, consisting of the preprocessing
and editing of the raw data, the basic processing routines based on signal or image processing, the applica-
tion of more sophisticated filters and algorithms, and the creation of maps for visualization purposes. Cer-
tain pre-­processing steps are required to improve the quality and accuracy of the original measurements.
Simple processing, such as pre-­amplification of the signal and noise reduction, can be achieved with
digital filtering. A number of standard processing techniques can be used for all kinds of data (magnetic,
resistance, conductivity, etc.) acquired with a normal sampling strategy within a grid (mapping mode).
More specialized processing is required for techniques that deal with data related to Ground Penetrating
Radar (GPR), Electrical Resistivity Tomography (ERT), seismic techniques, Electromagnetic Induction
(EMI) techniques and microgravity measurements.
The sections below present a short review of the particular processes utilised for the most commonly
applied techniques. The aim has been to provide an indicative, rather than exhaustive account, since more
detailed discussions of individual methods can be found elsewhere. For example, a thorough summary of
the most sophisticated processing techniques applied to principally magnetic archaeological survey data
has been given by Scollar, Weidner, and Segeth (1986) and Scollar, Tabbagh, Hesse, and Herzog (1990,
pp. 491–513). The fundamental methodologies on GPR data processing are also described in Jol (2009)
and Conyers (2013).
Method

The processing of shallow depth geophysical data: magnetic, soil resistance and electromagnetic mapping

Preprocessing techniques
De-­spiking of the original data is necessary in the initial stage of processing. Point-­like random interfer-
ence (expressed as extreme values in the data) caused by the existence of isolated features, instrumental
malfunctions or by improper measuring conditions, needs to be removed and usually replaced by inter-
polated values, based on a subset (array) of the surrounding measurements. This is usually achieved by
the continuous shifting of the filter to the next subset of the observed data (moving window method).
A situation frequently encountered during the zig-­zag acquisition of measurements (i.e. where mea-
surements are logged up and down adjacent transect lines) is the defect of traverse striping, namely a
different average value existing along alternating transects according to the alignment of the sensor (most
common in fluxgate magnetometry data due to an improper alignment of the sensor). The reduction
of the values along each transect to a common value removes the striping effect and at the same time
can contribute to the matching of adjacent grids. Common to the zig-­zag mode of surveying is also the
problem of staggering effects, namely the slight misalignment of the measurements along the alternat-
ing transects, depending on the survey pace (mainly in the automatic mode of surveying). This kind of
displacement has to be corrected accordingly by shifting the values to their exact position.
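A minimal sketch of two of these steps, de-spiking with a moving window and the removal of traverse striping, is given below, assuming the readings are held in a NumPy array with one traverse per row; the window size and spike threshold are illustrative choices rather than fixed prescriptions:

import numpy as np
from scipy.ndimage import median_filter

def despike(grid, window=3, threshold=3.0):
    # Replace point-like extreme values with the median of a moving window.
    # A reading counts as a spike when it deviates from the local median by
    # more than `threshold` robust standard deviations (MAD-based).
    local_median = median_filter(grid, size=window)
    residual = grid - local_median
    mad = 1.4826 * np.median(np.abs(residual))
    spikes = np.abs(residual) > threshold * mad
    cleaned = grid.copy()
    cleaned[spikes] = local_median[spikes]
    return cleaned

def destripe(grid):
    # Remove traverse striping by reducing every traverse (here: every row)
    # to a common value, the overall grid mean.
    traverse_means = grid.mean(axis=1, keepdims=True)
    return grid - traverse_means + grid.mean()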
The above processes are usually followed by reduction of the data to a common reference value (e.g.
the average background resistance for soil resistance data or the 0-­mean for vertical magnetic gradient
data). The reduction of the measurements to a common reference level also contributes to the matching
of the measurements that have been carried out in different grids (grid matching or grid equalization).
In situations where surveys are carried out over a long period of time or because of incorrect balancing
of the instruments, the background values of different grids can be shifted with the consequence that the
grid results can be difficult to match. In soil resistivity/conductivity surveys, the mismatching of grids
can also result from changes of climatic conditions between grids and the repositioning and separation of
the remote probes. In magnetic surveys, mismatching of grids originates due to the balancing of the sen-
sors at different locations or due to diurnal variations in the magnetic field. After achieving an optimum
matching among grids, the entire geophysical map of a site is processed as a whole (Sarris, 1992; Geoscan
Research, 2005) (Figure 20.1).

Figure 20.1 Multisensor (8 sensors) vertical magnetic gradient survey with SENSYS at Pella, northern Greece. The left image shows the original data suffering from various spikes, traverse striping and grid mismatching; the right image shows the results of processing aimed at removing those specific effects. A colour version of this figure can be found in the plates section.

Main convolution

All the techniques that measure the total magnitude of a physical quantity include an arbitrary amount of background noise, usually systematic in nature, due mainly to the geological nature of the site and the top soil uniformity. Alongside these broad, long-range regional trends originating from the underlying geology are instrumental random errors or noise spiking signals, which are also often present in the geophysical maps. Filters of different ranges with the appropriate cut-off thresholds can be used to isolate anomalies of interest. Most of the filtering treatment of geophysical data from archaeological sites is performed in the spatial domain. Here, low pass filters use a high value threshold in order to permit long-range geological features to pass, whereas high pass filters use a low cut-off value to
allow only the high values to pass. A large search radius (or large dimensions of the moving window)
of a low pass smoothing filter can result in the virtual elimination of small anomalies and noise. In
archaeological surveys, where some of the anomalies are weak compared to the general background,
a small moving window is suggested (3×3 or 5×5), whereas for very large surveys, larger windows can
be employed to eliminate any regional trends. Band pass filters are also used to allow a certain range
(band) of values to pass by combining a broad filter to reject the regional trends and a narrow filter to
remove high values.
Residual filtering, by means of subtracting the mean of the measurements from each reading, either along a profile (line equalization) or throughout the map, can eliminate overall trends and emphasize subtle
anomalies. Low pass filtered residuals (the difference of low pass data from raw data) are used to achieve
an enhancement of magnetic data by reducing high frequency noise. On the other hand, high pass filters
(e.g. by means of measuring the curvature of the map and assigning +/– values for +/– curvature and a
0 value at inflection points) can outline the edges of features and increase the degree of details of certain
anomalies, in a similar way to residual filtering (MacLeod & Dobush, 1990).
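As an illustration of these spatial-domain filters, the sketch below builds low pass, high pass (low pass filtered residual) and band pass operators from a simple moving-average smoother; the window sizes are arbitrary examples:

import numpy as np
from scipy.ndimage import uniform_filter

def low_pass(grid, window=5):
    # Moving-average smoothing: passes the long-range (regional) variation.
    return uniform_filter(grid, size=window)

def high_pass(grid, window=5):
    # Low pass filtered residual: the raw data minus the smoothed regional field.
    return grid - low_pass(grid, window)

def band_pass(grid, narrow=3, broad=15):
    # Combine a broad filter that rejects regional trends with a narrow one
    # that removes high frequency noise, passing the band in between.
    return low_pass(grid, narrow) - low_pass(grid, broad)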

Processes applied specifically to magnetic potential field measurements


Like gravity methods, magnetic methods of prospection are based on the potential field theory (Reford,
1980). There are, however, basic differences as the magnetic methods are based on the measurement of the
Earth’s magnetic field, which is dipolar, time dependent, and of variable direction, in contrast to gravity
methods that target the measurement of the earth’s gravitational field, which is monopolar, radial, and
mostly time independent.
Processing of the magnetic data depends critically on the methodology and the instrumentation used.
When total magnetic field measurements are involved, diurnal variations of the magnetic field may cause
distortions if insufficient corrections are applied. Weymouth and Lessard (1986) have summarized the
different methods of correction for diurnal variation. Smooth temporal variations can be corrected by
interpolation of the values of a base station magnetometer that records the changes of the geomagnetic
field over regular (or even irregular) time intervals. If one total field magnetometer is used, then it is
necessary to return to the base station after the end of each traverse and interpolate between base sta-
tion measurements with the assumption that each traverse is scanned within the same period. If two
magnetometers are used, one of them can record the geomagnetic field at regular intervals and then time
interpolation can be used with the assumption that the geomagnetic field does not change rapidly, as is the
case, for example, with magnetic storms. In the difference mode, the reference (base station) magnetom-
eter reads the local magnetic field simultaneously with the moving magnetometer that scans a particular
grid, and their difference caused by diurnal variations is corrected, leaving only the spatial differences of
the local magnetic field (Weymouth, 1976, p. 11; Bevan & Kenyon, 1975, p. 13). The tie-­line method
can also be used for correcting the raw data of a magnetic survey. According to this technique, a tie-­line
(traverse) is run perpendicular to the direction of the parallel traverses of the scanned grid, repeating the
measurements of the magnetic field on a grid point for each of the parallel traverses. The difference (shift)
of the duplicate readings adjusts each traverse in a way similar to the base station method (Weymouth &
Lessard, 1986, p. 38; Bevan & Kenyon, 1975, p. 12; Lessard, 1981, p. 2; Scollar et al., 1990, p. 474). Other
techniques that address the topic of diurnal variation corrections include least squares, polynomial power
series and spline approximations (Scollar et al., 1990, pp. 475–476). In general, in order to achieve the
high precision (<1nT), which is usually required in archaeological surveys, corrections for diurnal varia-
tions of the geomagnetic field are necessary, independently of how intense the geomagnetic activity is
(Weymouth & Lessard, 1986, p. 46).
Similar to the above, the total magnetic field measurements suffer from regional magnetic trends,
which are capable of masking short-­range magnetic anomalies. The regional trends can be approximated
by either employing a single dipole approximation of the geomagnetic field or calculating the latitudinal
and longitudinal gradient components. In the majority of archaeological prospection surveys, a simple
trend analysis and removal of a least square fitted surface from the actual data are sufficient due to the
small extents of the areas surveyed. As a consequence of the removal of trend, regional patterns are
reduced and the local anomalies are emphasized in a similar way as the computation of high pass residuals
(Kearey & Brooks, 1984, pp. 188–189). In the frequency domain, the removal of trend is a necessary pre-­
treatment procedure prior to the Fourier transformation of the raw data, since linear trends and non-­zero
means of the data increase the power of the spectrum in the lower wavenumber range.
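A minimal version of such a trend removal, fitting a first-order (planar) surface by least squares and subtracting it from the grid:

import numpy as np

def remove_planar_trend(grid):
    # Fit the surface a*x + b*y + c to the map by least squares and subtract it.
    ny, nx = grid.shape
    y, x = np.mgrid[0:ny, 0:nx]
    A = np.column_stack([x.ravel(), y.ravel(), np.ones(grid.size)])
    coeffs, *_ = np.linalg.lstsq(A, grid.ravel(), rcond=None)
    trend = (A @ coeffs).reshape(ny, nx)
    return grid - trend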
Dean (1958) and Scollar (1970) have summarized the advantages of the frequency analysis of potential
magnetic data. In the frequency domain, where signals are converted to frequencies, operations are less
intensive, and the effects of convolution processes are more obvious than in the spatial domain, where
processing is applied directly to the pixel of an image. In the frequency (or wavenumber) domain, different
frequency filters are applied to enhance a certain range of frequencies. The process involves the transforma-
tion from the space domain (x-­y plane) to the frequency domain (u-­v plane), by a two dimensional Fourier
series bounded by both upper and lower frequencies due to the finite nature of the area and the sampling
size of the survey. Thus, the Fourier series becomes a representation of the different frequencies that form
the potential field data. Due to the finite nature of magnetic potential data, instead of a Fourier series, a Fou-
rier transform (usually a Fast Fourier Transform, FFT) can be used to generate a correspondence between
spatial and frequency domain data (Scollar, 1970, p. 11; Jahne, 1991, p. 53). The FFT conserves the image
information and the inverse transform can fully recover the spatial representation of data. In the frequency
domain, each point of the grid (image) is represented by the amplitude (relative position) and the phase of
a periodic function. Features are resolved into different frequency components, including regional trends,
local influences and noise, adding a new perspective to the two dimensional spatial data.
The power spectrum (i.e. the squared amplitude of the Fourier components) of the transformation
indicates the amplitude of the wavenumbers and carries slightly less information due to the loss of the
phase component. In general, deep geological structures are responsible for generating low amplitude
anomalies of large horizontal extent and low horizontal gradients (Mares, 1984, p. 127). Thus, the power
spectrum of the FFT regional (geological) component is represented in the lower wavenumber (k) range
of the spectrum (Figure 20.2). It has to be noted that the low wavenumbers (k) correspond to low fre-
quencies (f) and large wavelengths (λ). On the other hand, local features in shallower strata are character-
ized by high amplitude anomalies, of small horizontal extent and more abrupt changes of their horizontal
gradients. These high frequency features are best represented in the higher wavenumber range of the FFT
power spectrum of the magnetic data (high k, high f, small λ). The bimodal nature of the energy spec-
trum with respect to the wavelength permits the distinction between shallow and deep magnetic sources,
which are usually separated by a neutral (no energy) wavelength band (Alldredge, Van Voorhis, & Davis,
1963). The power spectrum can also be used for approximating the horizontal dimensions of the subsur-
face targets, and estimates of the depth of the target features can be calculated from the decay rate of the
power spectrum (Bhattacharyya, 1966, p. 98). Although Fourier analysis is limited by the spectral overlap
of the anomalies, some more information can be obtained by the examination of the phase component of
the anomalies (Zurflueh, 1967, p. 1017). In total, FFT and the frequency processing of magnetic potential
data are useful in resolving specific anomalies, changing the effective field inclination, calculating the
vertical derivatives of the total magnetic field intensity and identifying areas of interest. A summary of
the different frequency filtering techniques is given by Telford, Geldart, and Sheriff (1990, pp. 107–110).
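The radially averaged power spectrum and the depth estimate illustrated in Figure 20.2 can be sketched as follows; the binning is illustrative and, in line with the slope-divided-by-4π rule cited there from Spector and Grant (1970), the spectrum is taken against spatial frequency (unit conventions for wavenumber vary between implementations):

import numpy as np

def radial_power_spectrum(grid, dx=1.0, nbins=40):
    # Radially averaged FFT power spectrum of a de-trended, zero-mean map.
    ny, nx = grid.shape
    power = np.abs(np.fft.fft2(grid)) ** 2
    fy = np.fft.fftfreq(ny, d=dx)
    fx = np.fft.fftfreq(nx, d=dx)
    f = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    bins = np.linspace(0, f.max(), nbins + 1)
    idx = np.digitize(f.ravel(), bins)
    spectrum = []
    for i in range(1, nbins + 1):
        vals = power.ravel()[idx == i]
        spectrum.append(vals.mean() if vals.size else np.nan)
    centres = 0.5 * (bins[:-1] + bins[1:])
    return centres, np.array(spectrum)

def depth_estimate(freq, spectrum, band):
    # Mean source depth h = -slope / (4*pi) over a linear segment of the
    # log spectrum (after Spector & Grant, 1970).
    m = (freq >= band[0]) & (freq <= band[1]) & (spectrum > 0)
    slope = np.polyfit(freq[m], np.log(spectrum[m]), 1)[0]
    return -slope / (4 * np.pi)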
The selective two-­dimensional filtering of high or low wavenumber components enhances the cor-
responding shallow or deep features associated with those components (Kearey & Brooks, 1984, p. 169).
Figure 20.2 Application of the Fast Fourier Transform (FFT) power spectrum analysis of the magnetic data
obtained from the Bronze Age cemetery of the Békés Koldus-­Zug site cluster – Békés 103 (BAKOTA proj-
ect). The depth of the various targets (h) is easily determined by measuring the slope of the power spectrum
at different segments and dividing it by 4π (Spector & Grant, 1970). The radially averaged spectrum was cal-
culated and used to separate the magnetic signals coming from deep sources (h=2.87 m) and shallow sources
(h=0.73 m) below the sensor. The spectrum was also used as a guide to define a bandwidth filter in order to
eliminate the sources with wavenumber more than 550 radians/m and less than 100 radians/m respectively,
and enhance the magnetic signal coming from the potential archaeological structures. A colour version of this
figure can be found in the plates section.

High frequency anomalies from shallow sources can be enhanced by high pass filters. Low pass filtering
can eliminate high frequency effects created by inhomogeneities close to the magnetometer sensor or
in the vicinity of the electrodes in an electrical resistance survey (Pattantyus, 1986, p. 566). In order to
achieve a separation of regional and residual anomalies with frequency filters, the spectral contents of the
anomalies from the two different depths are assumed to be somewhat different.
In magnetic or EMI data, any change in the height of the sensor from the ground, due to deep plough-
ing, changing of operators or even imperfect balancing of the sensors, may cause striping effects either
within a grid or among grids. This will result in periodic noise that is removed by examination of the
power spectrum of the data, which specifies the degree of periodicity and the filter that has to be applied
to remove the effects of striping.
In upward and downward field continuation, the potential field is calculated above or below the plane
of measurement in order to give emphasis to deeper or shallower features correspondingly (Sheriff, 1989,
p. 141). It can enhance or depress the regional or residual component of the maps and even reduce topo-
graphic and sensor height variation effects (Telford et al., 1990, p. 106). Upward continuation of the mag-
netic field is applied in order to emphasize the regional trends originating from deep geological sources.
The opposite effect is achieved by downward continuation of the magnetic potential data, resulting in
an attenuation of the deep sources and accentuation of near surface targets. The successive application of
downward continuation is able to resolve the overlapping anomalies of adjacent features, give an estimate
of their vertical extent and enhance the high wavenumber components of the magnetic map, namely
those related to the shallow subsurface features. Downward continuation is accurate enough up to a depth
of two to three times the separation between the source and the sensor (Bhattacharyya, 1965, p. 857).
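Both continuation and the vertical derivatives discussed below reduce to single multiplications in the wavenumber domain. A sketch under the usual flat-earth assumptions (grid spacing in metres, dz positive for upward continuation):

import numpy as np

def _wavenumbers(shape, dx):
    ny, nx = shape
    ky = np.fft.fftfreq(ny, d=dx) * 2 * np.pi
    kx = np.fft.fftfreq(nx, d=dx) * 2 * np.pi
    return np.hypot(*np.meshgrid(ky, kx, indexing="ij"))

def continuation(grid, dz, dx=1.0):
    # Continue the field by dz metres: dz > 0 upward (smooths, emphasizes
    # regional trends), dz < 0 downward (sharpens shallow sources, but
    # amplifies high-wavenumber noise).
    k = _wavenumbers(grid.shape, dx)
    return np.real(np.fft.ifft2(np.fft.fft2(grid) * np.exp(-k * dz)))

def vertical_derivative(grid, dx=1.0, order=1):
    # n-th vertical derivative: multiply the spectrum by k**n; large values
    # outline the edges of shallow targets.
    k = _wavenumbers(grid.shape, dx)
    return np.real(np.fft.ifft2(np.fft.fft2(grid) * k ** order))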
When dealing with total magnetic field measurements, the calculation of the vertical derivatives can
emphasize the outline of features present near the inflection points of the magnetic field. Thus, the verti-
cal derivatives are used as a measure of the curvature of the magnetic map, the large values of which cor-
respond to the shallow targets. Similarly, the reduction of magnetic potential data to the pole has been one
of the standard geophysical processing steps in eliminating the distortion of the magnetic map due to the
obliquity of the magnetic field, thus locating the magnetic anomalies right above the sources responsible
for them. The frequency response of the filter (Scollar et al., 1990, p. 494; Spector, 1975) depends on a
factor for the direction of the measurement and a factor for the direction of magnetization. The method
produces a simpler and more symmetrical appearance of extensive features, suppresses weak anomalies
and removes the general distortion of the magnetic maps.

Processes applied specifically to multi-­sensor magnetic measurements


Usually in the case of multi-­sensor magnetic surveys, a large number of measurements are collected with
hand-­towed carts or mechanically driven vehicles, based on the automatic triggering of a differential
GPS unit. The navigation of the systems also relies on GPS units, but obviously, the acquired data are not
gridded and need more bespoke processing methods.
Due to the large number of sensors, it is common either to have malfunction issues of one or more
sensors, or have a number of overlapping measurements from different sensors, which create problems of
interpolation in the final dataset. A statistical analysis of the measurements for each sensor is needed in
order to examine problems related to their proper operation during a survey. Furthermore, a graphical
display of the paths followed by each configuration of sensors is required to decide upon the removal of
specific values through either a sensor or transect based approach.
The lack of gridded data makes most of the normal filtering solutions (such as moving window fil-
tering) tedious to apply. For this reason, most of the processing is applied for each transect (each sensor)
separately. For example, de-­spiking and de-­striping can be achieved through the reduction of the values to
0-­mean along specific lengths of each transect. As transects are usually of significant lengths and exhibit
a large variability in their measurements, they can be divided into sections defined by their local minima
and maxima along the transect. The mean value is calculated for each section and subtracted to reduce
all values within the sections to the 0 level. As this method is susceptible to removing features along the
transects, an alternative solution is to work with FFT de-­striping as in the gridded data. This however
requires that any gaps within the survey are filled with interpolated values, as the FFT does not accept
measurement voids or non-­orthogonal area shapes (Kalayci & Sarris, 2016).
Interpolation procedures are also problematic in non-gridded data, as the proximity of the sensors, the overlap of measurements from different sensors, the unequal balancing of the sensors, the tilting of the sensors during the survey and other practical factors introduce inconsistencies among the measurements obtained. For this reason, different functions (such as a convex hull or an alpha-shape algorithm) are applied to create a buffer of a piecewise linear simple curve associated with the distribution of the finite set of measurements (Bernardini & Bajaj, 1997). The particular buffers indicate the overlap of the regions, leading to the elimination of spatially close measurements. Then, interpolation algorithms (such as Inverse Distance Weighted (IDW), kriging, minimum curvature or nearest neighbour; see Conolly, this volume; Lloyd & Atkinson, this volume) can be used to create the resulting map.

The creation of maps


Processing of the original measurements is followed by the creation of corresponding maps. The location
of each measurement is either defined through actual projected coordinates (usually obtained through
a GPS) or through relative coordinates corresponding to their location inside a grid. Having applied a
reduction factor to normalize the data of each grid to a common reference value, the individual values
are gridded through different interpolation algorithms (usually through kriging, nearest neighbour
or inverse distance algorithms). Interpolation is usually carried out with an equal or smaller spacing than
the original sampling frequency of the measurements in order to create a smooth image and avoid a pixel
like appearance. Selective compression of the dynamic range of values (namely reduction of the original
range of data that are mapped) can be employed to intensify vague anomalies existing close to the back-
ground level and a mask file needs to be created to isolate the areas that have not been surveyed due to the
existence of thick vegetation, fences, modern structural remains, and other surface features and obstacles.
Colour and grey-­scale maps of the interpolated data are produced. Usually, hot colours (i.e. reddish)
in colour scale maps and light (white) colours in grey-­scale maps represent high intensity values. Cold
colours (bluish) in colour-­scale maps and dark (black) colours in grey-­scale maps represent low intensity
anomalies. Shading relief can be used to provide a better representation of the produced maps, based on
the artificial illumination of the data from a specific angle in such a way as to provide a relief impression.
By adjusting the declination and inclination of the artificial light source, certain features, mainly long-­
range linear patterns, are emphasized at the expense of small-­localized anomalies.
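The shading itself is a standard hillshade computation; a sketch in which the azimuth and altitude of the light source play the role of the declination and inclination mentioned above:

import numpy as np

def hillshade(grid, azimuth_deg=315.0, altitude_deg=45.0, cell=1.0):
    # Artificial illumination of the map from a chosen direction; rotating
    # the azimuth emphasizes linear patterns perpendicular to the light.
    az, alt = np.radians(azimuth_deg), np.radians(altitude_deg)
    dy, dx = np.gradient(grid, cell)
    slope = np.arctan(np.hypot(dx, dy))
    aspect = np.arctan2(-dx, dy)
    shaded = (np.sin(alt) * np.cos(slope)
              + np.cos(alt) * np.sin(slope) * np.cos(az - aspect))
    return np.clip(shaded, 0.0, 1.0)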

The processing of electrical resistivity tomography data


In contrast to normal soil resistance mapping, Electrical Resistivity Tomography (ERT) employs multi-
plexed measuring systems, which use different combinations of electrodes of different spacing to acquire
measurements of the vertical variations of the soil resistivity. This kind of technique provides stratigraphic
information of the subsurface as it maps the lateral and vertical variations of the soil’s resistivity.
The vertical distribution of the resistivity is mapped as a 2D pseudo-­section, based on the assumption
that the depth of resistivity values increases proportionally to the electrode separation. In this way, the
measured resistivity values are placed at the intersection of two 45-­degree lines through the centres of
the corresponding electrodes (Loke, 2004) or at the median depth of investigation (Edwards, 1977). As
the subsurface is not a simple layered model, but instead consists of heterogeneous localized materials,
inversion algorithms are employed to make an approximation of the real values of the resistivity of the
soil strata. Different 2D/3D resistivity inversion algorithms (Loke & Barker, 1996; Loke, 2004) can be
applied and most of them partition the subsurface into a number of layers, which are also subdivided to a
number of individual elements, which are allowed to have an independent resistivity value. These inver-
sion algorithms run iteratively, continuously changing the resistivity of these elements until they achieve
the best possible match between the theoretical values of the electrode configurations with the observed
measurements. The divergence between the actual measured values and the theoretical model is evalu-
ated through the calculation of the Root Mean Square (RMS) error. The stability of the RMS is a sign
of the best matching between the calculated and apparent resistivity values and leads to a more accurate
reconstruction of the distribution of the subsurface resistivity (Papadopoulos et al., 2011).
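The misfit itself is a one-line calculation; a sketch of the percentage RMS error that the inversion monitors between iterations:

import numpy as np

def rms_misfit(observed, calculated):
    # Percentage RMS divergence between measured and modelled apparent
    # resistivities; iteration stops once this value stabilises.
    rel = (observed - calculated) / observed
    return 100.0 * np.sqrt(np.mean(rel ** 2))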
In order to avoid large inconsistencies in the inversion model, it is necessary to eliminate any extreme measurements; these are caused by poor contact resistance of the electrodes or by connectivity problems, and their removal leaves a more confined dynamic range of resistivity values. Topographic corrections are also required
to define the actual location of the electrodes and create a more realistic image of the pseudo-­section in
terms of the existing terrain.
Having a number of parallel ERT transects, it is further possible to create a 3D model of the apparent
resistivity and from this to extract slices of the horizontal distribution of the resistivity with increasing
depth (Figure 20.3), in a similar way to the depth slices of the GPR survey (Papadopoulos, Tsourlos,
Tsokas, & Sarris, 2006, 2007).

Figure 20.3 (a) 3D resistivity model. (b) Three dimensional distribution of the calculated apparent resistivity
results from model A. (c) Pseudo-­3D slices of the resistivity resulting from the 2D inversions along the X, Y
and XY axes. (d) 3D resistivity model from the three dimensional inversion. Due to the wide range of the
resistivity values, a logarithmic scale is used (Papadopoulos et al., 2006).
The processing of ground penetrating radar measurements


Ground Penetrating Radar (GPR) transects are either collected with the help of a GPS that attaches exact
coordinates to each one of the GPR traces or they are acquired along parallel transects within a rectan-
gular grid and then subsequently georeferenced based on the exact coordinates of the edges of the grid.
The processing of the GPR signals is initiated with the manipulation of the individual transects (radar-
grams), which consist of a number of traces according to the sampling strategy. Their processing follows
the same fundamental directives as seismic processing methods (see later in this chapter). Preprocessing
deals with the application of algorithms that adjust each radargram to the rest in terms of their length,
location and signal initialization. The next stage is dealing with the enhancement of the anomalies at
different depths and the removal of noise, followed by the creation of a three-­dimensional volume of
the stratigraphy out of which depth slices representing the horizontal distribution of the reflectors are
extracted.
Having established the correct placement of the individual radargrams, it is important to apply correc-
tions that deal with the exact positioning of the traces along the survey profiles (transects). This procedure
(trace reposition) eliminates any systematic or random offsets in the starting and ending locations of the
GPR profiles. This is important in surveys on rough terrain or cases that have to deal with a number of
obstacles (e.g. trees, rocks, standing monuments, etc.). The correction is vital when no GPS navigation
is used and the profiles have to be adjusted in order to correspond to their correct extent. Interpolation
algorithms are used to expand or contract the GPR profiles to correspond to the right length, whilst
simultaneously making the appropriate corrections to the step size of the GPR traces. Topographic cor-
rections are also needed in order to place the radargrams in their appropriate position on the terrain,
especially if the elevation is changing abruptly within the survey area. Since the merging of the various
transects is important in the following processing steps, a time zero reduction to the initial signal of each
transect is necessary to equalize the different radargrams to the same base line, namely to normalize the
time (translated to the correct vertical position) that the first pulse entered the subsurface. The time
zero reduction corresponds to a threshold value expressed as a percentage (%) of the absolute maximum
amplitude on a data trace (Szymczyk & Szymczyk, 2013; Sensors & Software, 2013).
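A minimal sketch of such a threshold-based time zero correction, assuming the radargram is an array with one sample per row and one trace per column; the 5% threshold is only an example:

import numpy as np

def time_zero_correction(radargram, threshold=0.05):
    # Shift every trace so that its first break (the first sample exceeding
    # threshold * trace maximum in absolute amplitude) sits at sample zero.
    n_samples, n_traces = radargram.shape
    out = np.zeros_like(radargram, dtype=float)
    for j in range(n_traces):
        trace = radargram[:, j]
        first = np.argmax(np.abs(trace) >= threshold * np.abs(trace).max())
        out[: n_samples - first, j] = trace[first:]
    return out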
Enhancement of the electromagnetic (EM) signal proceeds mainly through the removal of the low frequency noise (the so-called "wow"), which derives from low frequency energy near the transmitter associated with electrostatic and inductive fields (dewow removal or signal saturation correction). A zero-phase high pass filter is applied: for each point along the trace, the average of a surrounding window is subtracted from the central point (Figure 20.4).
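One common realisation of this correction subtracts a running mean from each trace; a sketch with an illustrative window length:

import numpy as np

def dewow(trace, window=31):
    # Signal saturation ('wow') correction: subtract a running mean so that
    # the low frequency energy near the transmitter is removed.
    pad = window // 2
    padded = np.pad(trace, pad, mode="edge")
    background = np.convolve(padded, np.ones(window) / window, mode="valid")
    return trace - background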
As the reflection signal from deeper reflectors gets attenuated, a gain function is usually necessary to
amplify them. Various gain functions are applied: SEC (Spreading Exponential Compensation) gain provides an equalization of the amplitudes of the reflectors along a trace through the multiplication of a time-dependent exponential function with the trace; AGC (Automatic Gain Control) normalizes the level of
intensity of the weak and intense signals through the application of a function which is inversely propor-
tional to the signal strength. Obviously this kind of procedure enhances the background noise and for this
reason a regional or local background subtraction filter, calculated by averaging all traces in a transect, is
applied to reduce the background trend noise from the data. In a similar way to the processing of other
geophysical signals, frequency domain filters (low pass, high pass and band pass filters) defined as a per-
centage (%) of the Nyquist frequency (i.e. the highest signal frequency that can be faithfully reconstructed at the given sampling rate), are applied to the signals to separate specific frequency ranges. The Average Frequency Spectrum
(AFS) plot (amplitude (mV) vs frequency (MHz)) is examined in order to define the parameters of the
specific filters. As with seismic data, there are a number of other filters available which are applied either
on the adjacent traces in the spatial direction (spatial filters) or along the traces in the vertical direction (down-trace or vertical filters) (Annan, 2009; Sensors & Software, 2013).

Figure 20.4 An example of processing approaches applied to a radargram obtained at the Lechaion archaeological site with a 250 MHz antenna: (a) raw data without any processing, (b) application of Dewow, (c) Spreading Exponential Compensation (SEC) gain with an attenuation of 6.16, start gain of 2.56 and maximum gain of 542, (d) application of automatic gain with a window width of 1.5 and maximum gain of 500, (e) application of regional background removal, (f) migration process, (g) application of a high pass filter, and (h) envelope transformation.
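Of the gain functions above, AGC is the simplest to sketch: each sample is divided by the mean absolute amplitude in a window around it, so weak (deep) and strong (shallow) reflections end up at comparable levels (the window length is illustrative):

import numpy as np

def agc(trace, window=51):
    # Automatic Gain Control: scale each sample by the inverse of the local
    # mean absolute amplitude along the trace.
    pad = window // 2
    padded = np.pad(np.abs(trace), pad, mode="edge")
    local = np.convolve(padded, np.ones(window) / window, mode="valid")
    return trace / np.maximum(local, 1e-12)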
Having processed each radargram, a Hilbert Transform computes the instantaneous amplitude, con-
verts it to a magnitude value based upon which time slices are extracted. Usually, the specific process is
accompanied by a migration process, which reduces the reflectors’ hyperbolas into point like sources.
Having an estimate of the velocity of propagation of the EM signal through the specific ground condi-
tions, time slices are transformed to depth slices, namely maps indicating the horizontal distribution of
the reflectors (reflector amplitude) with increasing depth. In depth slices, intense values indicate strong
changes of high reflectivity, namely changes of the electrical properties of the soil strata. For the accu-
rate determination of the velocity of propagation of the EM waves (~0.10m/ns on average), Common
Mid-­Point (CMP) or Wide Angle Reflection and Refraction (WARR) GPR surveys are needed. Due to
the compactness of the antennas used, this is not a common practice in most archaeological prospection
surveys; instead an estimate of the velocity is calculated through hyperbola shape fitting to one of the
registered reflectors (Sensors & Software, 2015, pp. 87–92).
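The envelope extraction and the time-to-depth conversion can be sketched as follows; the velocity default is the ~0.10 m/ns average mentioned above and should be replaced by a site-specific estimate from CMP/WARR or hyperbola fitting:

import numpy as np
from scipy.signal import hilbert

def envelope(radargram):
    # Instantaneous amplitude (Hilbert envelope) of each trace (samples along
    # axis 0); this magnitude is what time slices are usually built from.
    return np.abs(hilbert(radargram, axis=0))

def time_to_depth(two_way_time_ns, velocity_m_per_ns=0.10):
    # Two-way travel time (ns) to depth (m): half the travel time times v.
    return 0.5 * velocity_m_per_ns * two_way_time_ns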
Depth slices are usually exported to 3D visualization software to create a volumetric representation of
the reflectors’ distribution. Through the isolation of the weak background noise, it is possible to separate
the most intense reflectors and have a 3D visualization of their extent.
The processing of micro-­gravity measurements


Micro-­gravity measurements fall within the potential field methods category and aim towards the deter-
mination of the subsoil’s lateral density by measuring variations in the acceleration of gravity (g) within
the gravitational field of the earth. Readings are usually taken along transects or within a grid and they
always need to be compared to a nearby base station as they are sensitive to fluctuations in temperature,
atmospheric pressure, altitude, etc. On the other hand, in contrast to the magnetic anomalies which
behave like magnetic dipoles, gravitational anomalies are monopolar, as they depend only on the density distribution below the ground (and in the area around it), and thus their interpretation is more direct.
Due to the sensitivity of the measurements and the slight variations in them, in order to make a read-
ing valuable in terms of its interpretation, various processes need to be applied to the measurements.
Initially, readings are balanced to the specifications of the instruments, as each spring of the gravimeter
has a different elasticity factor. A drift correction based on repeated measurements at the base station is
also required due to changes of the mechanical operation of the springs and the tidal diurnal variations
(0.05 mGal/hour).
Since the acceleration of gravity varies with the geographic latitude (increasing from the equator
towards the poles), a latitude correction based on theoretical acceleration values is necessary. Similar kinds
of variation exist due to the Earth’s geoid, namely the distance from the Earth’s centre, with the accel-
eration of gravity decreasing with the increase of the altitude. This “free air” correction is also based on
the reduction of the measurements taken at the base station with g increasing (or decreasing) by about
0.3086 mGal/m in case the base station is above (or below) the local datum (Reynolds, 2011).
The Bouguer correction (Equation 20.1) is crucial as it depends on the density of the features that lie
beneath the location of the measurements. The correction is applied (subtracted from the actual values)
based on the gravitational attraction exerted by a body (e.g. a slab of thickness h in metres) of known (or
approximated) density (ρ in kg/m³):

Δg(Bouguer) = 0.0004191 × ρ × h  (20.1)

with Δg expressed in gravity units (g.u.) and h in metres.

Complementary to the Bouguer correction is a topographic/terrain correction, as the variations of the topography (e.g. measurements on an elevation datum close to a low altitude valley or a high elevation mountain) influence the microgravity readings due to either mass deficiency or mass increase. The above adjustments of the microgravity readings lead to a corrected version of the measurements which represents
the so called Bouguer anomaly (Equation 20.2; Parasnis, 1997).
Bouguer anomaly = measured value of g + free-air correction − Bouguer correction + terrain correction + latitude correction  (20.2)
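A sketch combining the reductions of Equation 20.2, working in gravity units (g.u.); the latitude term here uses the 1967 international gravity formula, one of several reference fields that could be substituted, and the terrain correction is assumed to be pre-computed:

import numpy as np

def bouguer_anomaly(g_measured, h, latitude_deg, rho=2670.0, terrain=0.0):
    # g_measured in g.u., h in metres above the local datum, rho in kg/m3.
    phi = np.radians(latitude_deg)
    # Normal (theoretical) gravity on the reference ellipsoid, in g.u.
    g_normal = 9780318.5 * (1 + 0.005278895 * np.sin(phi) ** 2
                            + 0.000023462 * np.sin(phi) ** 4)
    free_air = 3.086 * h            # g.u. per metre of elevation
    bouguer = 0.0004191 * rho * h   # Equation 20.1
    return g_measured + free_air - bouguer + terrain - g_normal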
The measurements of the gravitational field are caused by features of different densities existing at various
depths (deep or shallow). In 2D, Bouguer anomaly maps represent horizontal differences in the accelera-
tion of gravity caused by the horizontal changes in density (Figure 20.5). In a similar way to the potential
theory of magnetic fields, the power spectrum of the microgravity measurements can be used in order to
separate deep seated features, regional anomalies (broad anomalies of low amplitudes), from the shallow
features, residual anomalies (confined anomalies of large amplitudes).
Figure 20.5 Gravity residual anomalies recorded above two tombs (Tombs 4 (above) and 8 (below)) of the Roman cemetery on the Koutsongila Ridge at Kenchreai on the Isthmus of Corinth, Greece. The centres of the tomb chambers are located approximately at the middle of the transects. According to the resulting graphs, it is estimated that both tombs have a width of about 4.5–5 m. The gravity signature of tomb T4 is better defined than that of tomb T8, probably because T4 is located within a more homogeneous geological unit (valley fill deposits), whereas T8 is located at the border between the valley fill deposits and conglomerate outcrops that extend to the central section of the ridge. Both tombs have created a well-defined gravity anomaly with at least 0.04–0.08 mGal maximum variation with respect to the average background (Sarris et al., 2007).

When microgravity measurements are carried out within monuments and historical buildings, emphasis is given to the relative variations of microgravity instead of absolute gravity measurements. Thus, when dealing with microgravity measurements inside structures, we either apply standard correction
approaches or simplified sensor position (taking into account the actual location of the sensor) or down-
ward continuation (taking into account the height differences of the sensor) corrections (Panisova &
Pasteka, 2009). The gravitational effect of the surrounding walls of the structures is modelled through
finite elements or prisms (having a specified thickness, density and height) reaching a range of about
10–100 mGal (Debeglia & Dupont, 2002).
Processing of the electromagnetic induction measurements


Most recent shallow depth electromagnetic induction (EMI) instruments (e.g. GEM2 or Profiler EMP)
consist of a broadband multi-­frequency electromagnetic sensor and a fixed transmitter-­receiver geometry
that allows multiple datasets to be collected. The transmitter coil generates a primary electromagnetic
field and the receiver registers a secondary electromagnetic field, which is induced through the flow of
the electrical currents through the ground (so called “eddy currents”). The secondary field consists of
two orthogonal components, the real component (in-­phase component) and the imaginary component
(quadrature or out-of-phase component). A crucial step in an EMI survey remains the transformation of the measurements into actual values of the physical properties of the soils.
For the calibration of EMI measurements, a vertical electrical sounding (taking measurements as the
instrument is raised at different heights) is carried out and the resulting measurements are compared to
a simplified model consisting of geological layers having different electrical properties (the layered earth
model). This comparison allows a conversion factor to be defined between the field measurements of the
out-­of-­phase component (electrical conductivity) to ppm (theoretical EMI response). The calculation of
the offset factor for the in-­phase component is similar, since at a large height (~2m above the surface),
the response induced by the susceptibility is negligible.
Following the above calibration, a transformation of EMI in-­phase and out-­of-­phase measurements is
carried out to convert them to actual values of magnetic susceptibility, magnetic viscosity, and electrical
conductivity. The difference of the out-­of-­phase measurements for two frequencies is calculated and it is
compared against a theoretical curve in order to approximate the actual conductivity value. In a similar
way, the values of the magnetic susceptibility and magnetic viscosity are obtained, considering that the
magnetic susceptibility is frequency (3–300 kHz) dependent, while magnetic viscosity is independent of
the frequency and always remains the same (Simon, Sarris, Thiesson, & Tabbagh, 2015). Recently, Simon,
Tabbagh, Donati, and Sarris (2018) have also proposed a way to determine the apparent permittivity,
when dealing with frequency responses of more than 20 kHz. This seems to be of significance in cases where we are dealing with very conductive soils, where the in-phase component may be influenced more by the soil's permittivity (a proxy for the type of soils) than by the soil's magnetic susceptibility (a proxy for human activities).
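Where a full layered-earth calibration is not available, a commonly used shortcut is the low induction number (LIN) approximation, which converts the quadrature response directly to an apparent conductivity; this is a simplification of the calibration procedure described above and is valid only for relatively resistive ground:

import numpy as np

def lin_apparent_conductivity(quadrature_ppm, frequency_hz, coil_sep_m):
    # Low induction number approximation: apparent conductivity (mS/m) from
    # the quadrature response expressed in ppm of the primary field.
    mu0 = 4e-7 * np.pi
    omega = 2.0 * np.pi * frequency_hz
    sigma = 4.0 * (quadrature_ppm * 1e-6) / (omega * mu0 * coil_sep_m ** 2)
    return sigma * 1000.0  # S/m -> mS/m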

Processing of seismic measurements


Seismic methods belong to the suite of active geophysical techniques, where a source (hammer or explo-
sive) creates a ground movement, which is measured by a series of geophones placed in different locations.
In refraction seismic techniques, individual seismograms are plotted and the first arrival times (first breaks)
of the refracted acoustical waves are picked. Diagrams of travel times versus distance (offset) between
the source and the geophones are created, corresponding to the depths of the subsurface interfaces. Esti-
mation of the number and depth of the subsurface layers is based on their physical properties and the
approximation of the velocity of propagation of the acoustic waves through them. Static corrections such
as weathering (upper surface layer) and elevation corrections of the individual shots and geophones are
required for a better interpretation of the depths (Scott & Markiewicz, 1990). Repeated measurements
taken by applying the same source can also increase the signal to noise ratio due to the repeatability of the
actual seismic signals. Delay-­time methods are also preferred in the interpretation of the seismic refraction
data (Green, 1974; Burger, Sheehan, & Jones, 2006).
Reflection seismic survey makes use of the energy intensity of the acoustical waves that is reflected
from the different subsurface layers (interfaces). The Common-­Mid-­Point (CMP) technique is usually
employed in reflection seismic surveys. Poor source coupling due to rough surface conditions, the effect of the weathering layer and the topography of the survey area attenuate the seismic signals and produce weak reflections. Signals registered by the geophones represent a time series, as the acoustic waves are
recorded at different times depending on the distance of the geophones from the source. The effect of
multiple reflections in a layer (ringing effect), which is often encountered in the seismic measurements,
is addressed through a deconvolution process, which collapses the wavelets to their approximate point
locations. As the travel times of the acoustic waves are recorded with respect to the distance offset, time
(Normal Move Out – NMO) corrections are applied based on a primary velocity analysis to separate the
uncorrelated coherent noise (multiples) from the primary reflections. Frequency multiples can be further
removed through CMP stacking, which also increases the signal-to-noise ratio. Frequency-
wavenumber migration is the final main processing stage as it reduces the hyperbolas of the diffraction
signals to their actual point-­like locations (Yilmaz, 1976; Hatton, Worthington, & Makin, 1986). Trace
to trace decorrelation attenuates even further the background noise. At the end of the processing and
based on the modeling of each seismic profile, transects are integrated to provide a 3D layered model of
the subsurface (Figure 20.6).
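The NMO step can be sketched as follows for a single constant velocity; real processing uses a velocity field that varies with time and offset, estimated from the velocity analysis mentioned above:

import numpy as np

def nmo_correction(gather, dt, offsets, velocity):
    # Map each sample of a CMP gather back to its zero-offset time using
    # t(x) = sqrt(t0**2 + (x / v)**2), prior to stacking.
    n_samples, n_traces = gather.shape
    t0 = np.arange(n_samples) * dt
    corrected = np.zeros(gather.shape, dtype=float)
    for j, x in enumerate(offsets):
        t = np.sqrt(t0 ** 2 + (x / velocity) ** 2)
        corrected[:, j] = np.interp(t, t0, gather[:, j], left=0.0, right=0.0)
    return corrected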

Figure 20.6 Results of a seismic refraction survey at the area of the assumed ancient port of Priniatikos Pyrgos
in East Crete, Greece: (a) 2D image representing the depth to the bedrock, which reaches about 40 m below
the current surface (bluish colors). The black dots represent the position of the geophones along the seismic
transects. The area has been completely covered by alluvium deposits and other conglomerate formation frag-
ments as a result of past landslide and tectonic activity. The interpretation of the velocity of propagation of
the acoustic waves revealed the spatial distribution of (b) the alluvium deposits at the top (velocity of 491 m/
sec), (c) the lower and upper terrace deposits (velocity of 1830 m/sec), (d) the medium depth sandstones and
conglomerates (velocity of 2400 m/sec) and (e) the deeper weathered limestone or cohesive conglomerates
(velocity of 4589 m/sec) (Sarris, Papadopoulos, & Soupios, 2014). A colour version of this figure can be found
in the plates section.

From processing to interpretation


Most of the image processing techniques used in the image manipulation of a geophysical map tend to
eliminate background noise, highlight the most obvious anomalies and produce as reliable an image of
the subsurface plan of a particular site as is possible. Still, no amount of processing can yield more information than is contained in the original data. The aim of data, signal or image processing is to reveal or decode the information hidden in the raw data. As a result of the transformations used and the finite limitations of the data, a number of by-products of the filtering processes can be generated, distorting the information content of the produced maps. These artificial anomalies have to be identified
by comparing the processed data with the raw data. The success of a filtering method lies in the clarity
of representing the desired kind of information.

Case study
A number of geophysical campaigns have been carried out in recent years aiming to explore the internal organization of a number of Neolithic settlements in the area of Thessaly, Central Greece, and their environs. Thessaly is the locus of a high density of mounded (magoules) and flat settlements which played a critical role in the origins of the Neolithic in Europe. The limited number of systematic excavations and the scarcity of large scale non-invasive investigations have been a serious obstacle to understanding the spatial organization of these agricultural villages. In order to explore the structural diversity and similarities of
the settlements, a manifold geophysical survey and airborne/space-­borne remote sensing approach was
implemented to map a number of them. The geophysical surveys made use of handheld (Bartington
G601) and cart-based multisensor magnetometry (SENSYS GmbH MX Compact system carrying 8
fluxgate gradiometer sensors), GPR (Sensors & Software Noggin Plus unit with a 250MHz antenna), soil
resistance (Geoscan Research RM85) and EMI (Geophex GEM-­2, GF Instrument CMD Explorer, and
Geonics EM-­31) techniques, accompanied by analysis of the chemical and magnetic properties of soil
samples. Differential GPS (DGPS) units were employed for navigation of the multisensor cart and the
EMI surveys. Processing of the data followed the pipeline that has been outlined in the above sections.
In total, within the auspices of the IGEAN project (Innovative Geophysical Approaches for the Study of
Early Agricultural Villages of Neolithic Thessaly), 21 Neolithic settlements were scanned covering a total
area of more than 70 hectares (Sarris et al., 2017).
Processing of the magnetic data was able to map the architectural structures and the layout of the
sites. Intense magnetic anomalies were caused by burnt (intentionally?) daub houses, whereas the stone
foundations exhibited weaker magnetic values but stronger GPR or resistivity signals, for example at
Velestino-­Mati (Figure 20.7). The stone foundations most probably belong to different occupation
phases and indicate either re-­occupation or expansion and growth of the settlement, as at Almyriotiki
(Figure 20.8). Different layouts were distinguished, following either a circular arrangement on mounded
settlements as at Almyros 2 (Figure 20.9), Almyriotiki and Velestino-­Mati, or rectangular planning in
flat settlements. Magnetic data were also revealing with respect to the clusters/neighborhoods within the
settlement and the differences in terms of house size and orientation. The extent of most of the settle-
ments was revealed through magnetic techniques and most of the detected enclosures were confirmed
to have ditches through the EMI measurements (increased magnetic susceptibility and soil conductivity).
Geophysical results did not resolve the ambiguity of the function of the enclosures, whether defensive (Runnels et al., 2009), social (Halstead, 1999), or for burials or water storage (Pappa & Besios, 1999). However, flood-
ing simulation modeling based on the analysis of satellite derived DEMs suggested that the ditches might
have acted as a counter-­measure against periodic flooding episodes. This hypothesis was strengthened
by the EMI measurements that indicated high conductivity of the soils in particular sections outside the ditches, providing further support for a flood farming production strategy, as originally suggested by Van Andel and Runnels (1995). Breaks along the perimeter of the (sometimes multiple and even concentric) enclosures were also revealed, designating the entrances to the settlements.

Figure 20.7 Results of the geophysical surveys at Velestino Mati. The magnetic data (a) indicate the nucleus of the settlement at the west top of the magoula with some expansion towards the east top. A number of high dipolar magnetic anomalies are associated with burnt daub foundations that were also confirmed by the Electromagnetic Induction (EMI) soil magnetic susceptibility (b) and the soil resistance data (c). Magnetic susceptibility also confirmed the existence of enclosures around the tell. A colour version of this figure can be found in the plates section.
The above techniques provided an exceptional contribution to the study of the Neolithic landscapes
of the region of Thessaly and they managed to produce new evidence regarding the spatial characteristics
of the settlements, their extent and internal structure, raising further questions about their social organization and exploitation of the surrounding land.

Conclusion
Geophysical data processing offers an almost unlimited number of choices that depend on the method of
measurement adopted, survey conditions, instrumentation and the configuration of the sensors, the
physical properties measured, the location and properties of the targets, and the various automatic or
semi-­automatic algorithms that are able to enhance the original signals. Whatever the case, the common
denominator remains the interpretation of the measurements and the recognition of the subsurface fea-
tures, and even if the topic is being increasingly approached through more automated techniques such
as machine learning or pattern recognition, the interpretation of geophysical features will remain one of
the main challenges of shallow-depth archaeological prospection. The interpretation process can never be considered irrefutable; it will always be conditioned by the geological, soil and surface characteristics, the target’s preservation condition, our experience, and our theoretical models and hypotheses. To this
Figure 20.8 Results of the geophysical surveys at Almyriotiki. The magnetic data (a) presented a clear image
of the internal planning of the settlement: Burnt daub structures follow a circular orientation around the top
of the tell. The houses expand further to the south, where some weaker magnetic anomalies representing stone
houses with internal divisions are also present. An irregular wide ditch system encloses the settlement from the east and the north, and is confirmed by the EMI magnetic susceptibility (b) and soil conductivity measurements (c). The high soil conductivity to the north coincides with an area susceptible to periodic flooding. The above were also confirmed by the soil viscosity measurements (d), an indicator of the soil permittivity.
A colour version of this figure can be found in the plates section.
Figure 20.9 Results of the geophysical surveys at Almyros 2. The magnetic data (a) clearly depict the concentration of burnt daub structures at the centre of the tell, expanding further to the south. The settlement is surrounded by a double ditch system, which is confirmed by both EMI magnetic susceptibility (b) and soil conductivity data (c). A number of breaks in this double enclosure are most probably associated with multiple entrances to the settlement. Soil conductivity also seems to increase outside the settlement to the south and west (north is to the top), namely in the area most susceptible to flooding. A colour version
of this figure can be found in the plates section.
end, it will continue to be a joint undertaking involving all the different disciplines and experts that are
concerned with unravelling the secrets of our past that are hidden beneath the surface of our planet.

References
Alldredge, L. R., Van Voorhis, G. D., & Davis, T. M. (1963). A magnetic profile around the world. Journal of Geophysi-
cal Research, 68, 3679–3692.
Annan, A. P. (2009). Electromagnetic principles of ground penetrating radar. In H. M. Jol (Ed.), Ground penetrating
radar theory and applications (pp. 1–40). Amsterdam: Elsevier.
Bernardini, F., & Bajaj, C. L. (1997). Sampling and reconstructing manifolds using alpha-­shapes. Computer Science Techni-
cal Reports. Department of Computer Sciences, Purdue University, 1–11.
Bevan, B., & Kenyon, J. (1975). Ground penetrating radar for historical archaeology. MASCA Newsletter, 11(2), 2–7.
Bhattacharyya, B. K. (1965). Two-­dimensional harmonic analysis as a tool for magnetic interpretation. Geophysics,
30(5), 829–857.
Bhattacharyya, B. K. (1966). Continuous spectrum of the total magnetic field anomaly due to a rectangular prismatic
body. Geophysics, 31(1), 97–121.
Burger, H. R., Sheehan, A. F., & Jones, C. H. (2006). Introduction to applied geophysics: Exploring the shallow subsurface.
New York: W. W. Norton & Company.
Conyers, L. B. (2013). Ground-­penetrating radar for archaeology (3rd ed). Series Editors: L. B. Conyers & K. L. Kvamme.
Geophysical Methods for Archaeology No. 4. Lanham, MD: AltaMira Press.
Dean, W. C. (1958). Frequency analysis for gravity and magnetic interpretation. Geophysics, 23, 97–127.
Debeglia, N., & Dupont, F. (2002). Projet de Réalisation d’un Nouveau Réseau Gravimétrique Français – Liaisons Gravimé-
triques Entre des Bases du Réseau Français de Référence RGF83 et des Bases Absolues Récentes, BRGM/RP-­51502-­FR.
Edwards, L. S. (1977). A modified pseudosection for resistivity and induced polarization. Geophysics, (42), 1020–1036.
Geoscan Research. (2005). Instruction manual 1.97 (Geoplot 3). Bradford: Geoscan Research.
Green, R. (1974). The seismic refraction method: A review. Geoexploration, 12(4), 259–284.
Halstead, P. (1999). Neighbours from Hell? The household in Neolithic Greece. In P. Halstead (Ed.), Neolithic society
in Greece (pp. 77–95). Sheffield: Sheffield Academic Press.
Hatton, L., Worthington, M. H., & Makin, J. (1986). Seismic data processing theory and practice. Oxford: Blackwell
Scientific Publications.
Jähne, B. (1991). Digital image processing: Concepts, algorithms, and scientific applications. Berlin: Springer-Verlag.
Jol, H. M. (2009). Ground penetrating radar theory and applications. Amsterdam: Elsevier.
Kalayci, T., & Sarris, A. (2016). Multi-­sensor geomagnetic prospection: A case study from Neolithic Thessaly, Greece.
Remote Sensing, 8(11), 966, doi: 10.3390/rs8110966, OPEN ACCESS: www.mdpi.com/2072-­4292/8/11/966/html
Kearey, P., & Brooks, M. (1984). An introduction to geophysical exploration. Oxford: Blackwell Scientific Publications.
Lessard, Y. A. (1981). Simulation of magnetic surveying techniques to study the effects of diurnal variations. Senior Project
CS498. Lincoln: Dept. of Computer Science, University of Nebraska-­Lincoln.
Loke, M. H. (2004). Tutorial: 2-­D and 3-­D electrical imaging surveys. Retrieved from www.geoelectrical.com
Loke, M. H., & Barker, R. D. (1996). Rapid least-squares inversion of apparent resistivity pseudo-sections using quasi-Newton method. Geophysical Prospecting, 44(1), 131–152.
MacLeod, I. N., & Dobush, T. M. (1990). Geophysics: More than numbers: Processing and presentation of geophysical data.
4th National Outdoor Action Conference on Aquifer Restoration, Ground Water Monitoring and Geophysical
Methods, 14–17 May, Las Vegas, Nevada (pp. 1081–1095).
Mares, S. (1984). Introduction to applied geophysics. Prague: D. Reidel Publishing Company.
Panisova, J., & Pasteka, R. (2009). The use of microgravity technique in archaeology: A case study from the
St. Nicolas Church in Pukanec, Slovakia. Contributions to Geophysics and Geodesy, 39(3), 237–254.
Papadopoulos, N. G., Tsourlos, P., Papazachos, C., Tsokas, G. N., Sarris, A., & Kim, J.-­H. (2011). An algorithm for
the fast 3-­D resistivity inversion of surface electrical resistivity data: Application on imaging buried antiquities.
Geophysical Prospecting, 59, 557–575.
Papadopoulos, N. G., Tsourlos, P., Tsokas, G. N., & Sarris, A. (2006). 2D and 3D resistivity imaging in archaeological
site investigation. Archaeological Prospection, 13(3), 163–181.
Papadopoulos, N. G., Tsourlos, P., Tsokas, G. N., & Sarris, A. (2007). Efficient ERT measuring and inversion strategies
for 3D imaging of buried antiquities. Near Surface Geophysics, 5(6), 349–362.
Pappa, M., & Besios, M. (1999). The Neolithic settlement at Makriyalos, Northern Greece: Preliminary report on the
1993–1995 excavations. Journal of Field Archaeology, 26(2), 177–195.
Parasnis, D. S. (1997). Principles of applied geophysics. London: Chapman & Hall.
Pattantyus, M. A. (1986). Geophysical results in archaeology in Hungary. Geophysics, 51(3), 561–567.
Reford, M. S. (1980). History of geophysical exploration, magnetic method. Geophysics, 45, 1640–1658.
Reynolds, J. M. (2011). An introduction to applied and environmental geophysics (2nd ed.). Chichester: John Wiley &
Sons Ltd.
Runnels, C. N., White, C., Payne, C., Wolff, N. P., Rifkind, N. V., & LeBlanc, S. A. (2009). Warfare in Neolithic
Thessaly: A case study. Hesperia, 78(2), 165–194.
Sarris, A. (1992). Shallow depth geophysical investigation through the application of magnetic and electric resistance techniques
(Ph.D. Dissertation). University of Nebraska-­Lincoln, Dept. of Physics and Astronomy, Lincoln, USA: A Bell &
Howell Company.
Sarris, A., Dunn, R. K., Rife, J. L., Papadopoulos, N., Kokkinou, E., & Mundigler, C. (2007). Geological and geo-
physical investigations in the Roman cemetery at Kenchreai (Korinthia), Greece. Journal of Archaeological Prospec-
tion, (14), 1–23.
Sarris, A., Kalayci, T., Simon, F.-­X., Donati, J., Garcia, C. C., Manataki, M., . . . Stamelou, E. (2017). Opening a
new frontier in the Neolithic settlement patterns of Eastern Thessaly, Greece. In A. Sarris, E. Kalogiropoulou,
T. Kalayci, & L. Karimali (Eds.), Communities, landscapes, and interaction in Neolithic Greece: Proceedings of international
conference, Rethymno 29–30 May 2015 (pp. 27–48). Ann Arbor, MI: International Monographs in Prehistory.
Sarris, A., Papadopoulos, N., & Soupios, S. (2014). Contribution of geophysical approaches to the study of priniatikos
pyrgos. In B. P. C. Molloy & C. N. Duckworth (Eds.), A cretan landscape through time: Priniatikos pyrgos and environs
(pp. 61–69). Oxford: BAR International Series 2634.
Scollar, I. (1970). Fourier transform methods for the evaluation of magnetic maps. Prospezioni Archeologiche, (5), 9–41.
Scollar, I., Tabbagh, A., Hesse, A., & Herzog, I. (1990). Archaeological prospecting and remote sensing. Cambridge: Cam-
bridge University Press.
Scollar, I., Weidner, B., & Segeth, K. (1986). Display of archaeological magnetic data. Geophysics, 51(3), 623–633.
Scott, J. H., & Markiewicz, R. D. (1990). Dips and chips-­PC programs for analyzing seismic refraction data. Proceedings
of SAGEEP 1990, Golden, Colorado (pp. 175–200).
Sensors & Software. (2013). EKKO_Project. Mississauga, Canada: Sensors & Software.
Sensors & Software. (2015). LineView. Mississauga, Canada: Sensors & Software.
Sheriff, R. E. (1989). Geophysical methods. New Jersey: Prentice Hall.
Simon, F.-­X., Sarris, A., Thiesson, J., & Tabbagh, A. (2015). Mapping of quadrature magnetic susceptibility/magnetic
viscosity of soils by using multi-­frequency EMI. Journal of Applied Geophysics, 120, 36–47.
Simon, F.-X., Tabbagh, A., Donati, J., & Sarris, A. (2018). Permittivity mapping in the VLF-LF range using a multi-
frequency EMI device: First tests in archaeological prospection. Near Surface Geophysics, 17(1), 27–41.
Spector, A. (1975). Application of aeromagnetic data for porphyry copper exploration in areas of volcanic cover. 45th Annual
International Meeting of the Society of Exploration Geophysicists, 15 October, Denver, Colorado.
Spector, A., & Grant, F. S. (1970). Statistical models for interpreting aeromagnetic data. Geophysics, 35(2), 293–302.
Szymczyk, M., & Szymczyk, P. (2013). Preprocessing of GPR Data. Image Processing & Communication, 18(2–3), 83–90.
Telford, W. M., Geldart, L. T., & Sheriff, R. E. (1990). Applied geophysics (2nd ed.). Cambridge: Cambridge University
Press.
Van Andel, T. H., & Runnels, C. (1995). The earliest farmers in Europe. Antiquity, 69(264), 481–500.
Weymouth, J. W. (1976). A magnetic survey of the Walth Bay site (39WW203). Lincoln, Nebraska: Midwest Archaeologi-
cal Center, National Park Service, U.S. Department of the Interior.
Weymouth, J. W., & Lessard, Y. A. (1986). Simulation studies of diurnal corrections for magnetic prospection. Pros-
pezioni Archeologiche, (10), 37–47.
Yilmaz, O. (1976). A short note on deep seismic sounding in Turkey. Journal of Geophysical Society of Turkey, (3), 54–58.
Zurflueh, E. G. (1967). Applications of two dimensional linear wavelength filtering. Geophysics, 32(6), 1015–1035.
21
Space and time
James S. Taylor

Introduction

Concepts of spatiotemporality
Concepts of ‘space’ and ‘time’ are fundamental to the discipline of archaeology, which deals with the distri-
bution of human material culture at various scales through the timespan of human existence (see Lucas,
2005). More broadly, the integrated nature of spatiotemporality has been well established in science for
over a century, since the acceptance of the special theory of relativity (Einstein, 1905). Since then, as Daly
and Lock outlined in their comprehensive review of the subject – “Timing is Everything” – (Daly &
Lock, 1999, p. 289), over the course of the 20th century a robust corpus of theoretical literature has devel-
oped across the humanities, especially in anthropology (Evans-Pritchard, 1939; Lévi-Strauss, 1948, 1961; Bloch, 1977; Bourdieu, 1977; Fabian, 1983; Gell, 1992), sociology (Giddens, 1984), philosophy (McTag-
gart, 1908; Heidegger, 1953; Husserl, 1966) and of course geography (Carlstein, Parkes, & Thrift, 1975;
Hägerstrand, 1975; Carlstein & Thrift, 1978; Parkes & Thrift, 1978; Soja, 1989; Harvey, 1991; Soja, 1996).
Of particular interest to archaeology are those conceptions of temporalities that relate to similar spatial
concepts of landscape and place (as espoused by Soja, 1989; Harvey, 1991; adapted by Ingold, 1993). It is
commonly accepted that there are many alternative forms of temporal perception, generally based upon
the perspective of the observer (whether that be the emic agent of a ‘past society’, or the etic archaeolo-
gist, see Headland, Pike, & Harris, 1990; and, for an in-depth discussion, Taylor, 2016, pp. 43–44).
Daly and Lock also note clear engagement by many theoretical archaeologists with the way in which
“constructs of time can be relevant to and applied in archaeology” and “how archaeology can contrib-
ute to the overall understanding of time as it relates with humans and human processes” (Daly & Lock,
1999, p. 289). They specifically highlight works by Bradley (1991), Clark (1992), Ingold (1993), Barrett
(1994), Gosden (1994), Thomas (1996), Terrell and Welsch (1997), Bradley (1998), and Frachetti (1998);
but one might also factor in work by Braudel (1972), Leone (1978), Braudel (1980), and Shanks and Til-
ley (1987) as well as Bailey (1983, 1987, 2007, 2008), much of which is neatly discussed by Lucas in his
Archaeology of Time (2005).
Despite this longstanding disciplinary awareness of the relevance of time, temporality and the affordances of spatiotemporal computing, fully integrated spatiotemporal synthesis remains uncommon within archaeological narratives. Perhaps more significantly, even as digital methods continue to gain traction within archaeology, the integration of spatial and temporal data, and their subsequent analysis, remains relatively simplistic, and even somewhat elusive, within the fundamental data structures that underpin the discipline.
From a computational angle, the integration of time or temporal data within Database Management
Systems (DBMS) and Geographic Information Systems (GIS) has also been discussed in detail, both in
relation to, and outside of the discipline of Archaeology (see for example Langran, 1989, 1992; Roddick &
Patrick, 1992; Lock & Daly, 1998; Abraham & Roddick, 1999; Daly & Lock, 1999; Peuquet, 2002; Green,
2011a, 2011b; De Roo, Bourgeois, & Maeyer, 2013; Taylor, 2016). However, there has been a distinct
lack of development in this area within commonly used GIS software, and, given the long history of the
development of GIS and spatial technologies (dating back to the 1960s), any sort of integrated temporal
functionality has been a relatively recent innovation.
Given the potential ways in which temporality might be addressed in computational terms (see
following) the reasons for this are not entirely clear. It may be that the specific spatiotemporal require-
ments of archaeologists are too niche to warrant the sustained research and development of integrated
spatiotemporal technologies. This seems unlikely, as the need for integrated spatiotemporal analysis of data is not unique to archaeology, and similar issues apply to other disciplines (for example, geography, the
environmental and social sciences and project management within the engineering, construction and
planning industries all have complex spatiotemporal data requirements). More likely, the apparent simplicity of current computational approaches to space and time is rooted in one fundamental issue: ‘that it is complicated’. Constructing a computational spatiotemporal system and/or data structure that serves the varied and subtle analytical needs of these disciplines is difficult, and such systems generally lack real-world development.
A key aspect is that within computational spatial technologies Space is by definition Cartesian in its
conception, and therefore framed (or perhaps limited) by the Euclidean geometric and algebraic tools
most commonly used to describe and understand it (maps, coordinates, projections, etc.). This approach
to space has been extended into our understanding of time (at least within the constraints of spatial
technologies) which is commonly understood in terms of “cartographic time” (see Langran, 1992,
pp. 28–29). This is a distinctly ‘Newtonian’ view of time as a linear fourth Cartesian dimension that
flows from past infinity to future infinity and can be measured separately from the other three spatial
dimensions (ibid.; Taylor, 2016). Thus, by extension, time is most commonly defined by GIS practitioners, in archaeology and beyond, as an extension of a spatial Cartesian system: a measurable fourth
dimension. This approach to time fits neatly with an archaeological (and more general social) concept of
a linear, chronological time, and usually culminates in a form of numeric spatiotemporal data (i.e. lists of
coordinates, dates and timestamps).
This definition of time clearly privileges the linear, sequential, ‘measurable’ temporality that also
neatly fits within the dominant relational data structures that we use to manage our spatiotemporal data.
For most archaeologists time is almost a fourth coordinate which generally manifests as some sort of
timestamp at a relevant level of granularity (a broad archaeological period, or a year/range of years given
by an absolute dating technique). This is a hugely simplistic way of dealing with temporality. Perhaps
the challenge when seeking an integrated form of spatiotemporality comes in defining a more nuanced
and abstract temporality, reflecting the presence of multiple real-­world temporalities. As geographers and
spatial theorists have increasingly sought to characterise a sense of ‘Place’ as opposed to ‘Space’, so perhaps
there is a need to distinguish a different sense of ‘Temporality’ from the common linear, numerical con-
cept of ‘time’ itself (see for example Soja, 1989; or Ingold, 1993). Archaeology, as a discipline predomi-
nantly concerned with the subtleties of these concepts, has much to offer.
So where are the T-­GIS?


The concept of a Temporal-­GIS (T-­GIS) is by no means new. In 1992 Gail Langran published her seminal
work, Time in Geographic Information Systems, in which she outlined an approach to the “philosophical,
conceptual and technical” decisions required for the development of a T-­GIS (Langran, 1992, p. 9). As
a geographer, she based her approach to temporal modeling upon Sinton’s much earlier representational
framework (Sinton, 1978) and the concepts of temporal DBMS developed throughout the 1980s (Peu-
quet, 2002, p. 304). Defining geographic data as having three basic components: time, location and attri-
bute, she explicitly set out to construct a conceptual model for GIS that would enable the “tracing and
[analysis of] changes in spatial information” (Langran, 1992, p. 4).
The mapping of these components is conventionally achieved by fixing one attribute at a constant
value, controlling the second within a range of values, and measuring the third on an interval or scale.
Langran argued that generally, mapped data fixes time, which results in spatial data represented at a spe-
cific point in time. Consider, for example, the consecutively dated editions of British Ordnance Survey maps: their time/temporality is fixed by the year the data were collected and printed, the landscape features are
controlled by a standardised symbology, and spatial attributes (i.e. their size/location) are measurable (as
coordinates) at intervals of the scale at which they were printed. Langran goes on to assert that freeing
the time component is central to the creation of a true T-­GIS, something which to date has still not been
fully implemented in computational terms (predominantly because of the inherent complexity of this
task). Notably (perhaps for the same reason), there has been very little development in the literature that has progressed the underlying conceptualisation of time in GIS since the late nineties and early noughties
(see for example Castleford, 1992; Langran, 1992; Renolen, 1997; Daly & Lock, 1999; Peuquet, 2002).
Similarly, although some GIS have begun to develop integral temporal tools (such as, for example, ArcGIS’
‘time-­slider’ function), none have substantially changed their underlying structure, to truly embed time or
temporality. Ultimately the temporal dimension tends to be bolted-­on as an attribute to existing spatial
data. For the most part our dominant relational data-­management systems have not radically changed in
the intervening period.
Unsurprisingly (with a few notable exceptions, see for example Green, 2011a or De Roo et al., 2013),
most of the real conceptual work on the modeling of space and time has been undertaken outside of the discipline, in computer science or, more specifically, within the emerging sub-discipline of geography known as Geographical Information Science (or GISci). What follows is an overview of some of these approaches from a Computer Science/GISci perspective. Almost none of these concepts
have been deployed effectively at a general level, even fewer have been recognised or developed by digital
practitioners in archaeology.

Conceptual modeling of space and time in computer systems


Generally, then, there remains a limited number of ‘off-the-shelf’ tools for spatio-temporal analysis, and
certainly there is no common methodological approach to the way time is handled by GIS. Bespoke solu-
tions offered by archaeological GIS practitioners (and more broadly across the social sciences again, see for
example Green, 2011a, 2011b) tend to be targeted to address specific research problems and are perhaps
too niche to warrant significant research and development by commercial software developers. To this
end it is difficult to discuss a discrete methodological approach in a volume such as this.
Efforts to integrate space and time in computing and spatial technologies have generally (consciously
or otherwise) been founded upon a relatively small number of approaches to conceptual modelling (see
Table 21.1), most of which are rooted in Gail Langran’s seminal work. Langran was not explicitly writing
Space and time 411

Table 21.1 Table summarising the three key computational approaches to the integrated conceptual modeling of
spatiotemporality in spatial technologies.

The Space-­Time Cube (Langran, 1992).


Where time is represented by the height attribute of a 3-­dimensional dataset.
Sequent Snapshots, ‘Time Slicing’ (Langran, 1992)
Where 2-­dimensional datasets are duplicated as a sequence of layers, ordered by a different temporal attribute,
usually codified as a chronon.
Event Oriented approaches to Modelling
A cluster of approaches that shift the temporal focus on to changes of state in spatial data, including:
Base State With Amendments (Langran, 1992)
Where the earliest single layer of spatial data (representing the base state) is recorded and all subsequent layers
document areas of change, ordered by chronon.
Space-­Time Composites (Langran, 1992)
Where a single layer of spatial data holds all its temporal information as attributes, that can be coded and
symbolised to represent spatial change through time.
Event-based spatio-temporal datasets (Peuquet & Duan, 1995)
Where an ‘event list’ is generated that serves as a ‘time-­line’ or ‘temporal-­vector’, forming the basis for the
organisation and storage of specific changes associated with each time interval.

for an archaeological audience; however, she outlined four popular conceptions of computational spatio-
temporality, detailing their pros and cons at a pragmatic level (Langran & Chrisman, 1988, p. 11; Langran,
1992, pp. 37–44). These, alongside a further event-­based approach posited by Peuquet and Duan (1995),
are based upon the simplified codification of a discrete (linear) temporal attribute, or chronon. That is, a
single “nondecomposable unit of time with an arbitrary duration” (Snodgrass, 1992, p. 24); i.e. a second,
a minute, or a year, for example.
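By way of illustration, the chronon concept is straightforward to express in code. The following minimal Python sketch (all names and values are invented for illustration) codes a feature’s lifespan as an interval of one-year chronons:

```python
from dataclasses import dataclass

CHRONON = 1  # the nondecomposable unit of time: here, one year

@dataclass
class SpatialRecord:
    """A spatial feature whose temporality is reduced to a chronon-coded interval."""
    feature_id: str
    geometry: tuple  # e.g. an (x, y) coordinate pair
    start: int       # first chronon of the feature's lifespan (calendar year)
    end: int         # last chronon of the feature's lifespan (calendar year)

    def active_in(self, year: int) -> bool:
        """True if the feature exists during the given chronon."""
        return self.start <= year <= self.end

ditch = SpatialRecord("F101", (352410.0, 4361220.0), start=-6400, end=-6100)
print(ditch.active_in(-6250))  # True
```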

Method

Implementation
Despite a well-­established theoretical discourse on how time might function alongside space within
spatial technologies, and with some notable prototypes, there still remains no fully functional T-­GIS.
Peuquet’s (2002) review and critique of the state of these spatiotemporal conceptual models offers some
insight as to why. She notes the extension of both conventional relational (or otherwise) DBMS and
spatial data models to “include temporal data, or vice versa, will [. . .] result in forms of implementation
that are both complex and voluminous”, particularly if one wants to capture the nuances of “temporal
interrelationships, such as temporal coexistence of specific entities or relative temporal configuration of
various events that are not explicitly stored” (Peuquet, 2002, p. 307). Indeed, even Peuquet’s own conceptual answer to this, the ‘Event-Based Spatio-Temporal Data Model’ (ESTDM), which ultimately (despite being problematic) proposes a more versatile and efficient approach to temporal modelling (Peuquet & Duan, 1995), has only seen limited (if any) realisation within current GIS technologies.
Outside of the relatively small corpus of geographic examples (predominantly research-driven), there has also been relatively little published work to date on the integration of space
and time within GIS or spatial technologies (compared to other research and development innovations in
spatial technologies), and certainly no literature advocating a standard or best practice for the structure of
spatiotemporal data. There is no broad research into a common methodology for its analysis (still less within
archaeological applications). Despite the huge amount of potential in this field of research, there remains
much work to be done.
That said, things are beginning to move on as processing power has become faster, and software more
sophisticated. Work on more integrated spatiotemporal data management has continued in the wider
commercial GIS industry. Most notably, ESRI (Environmental Systems Research Institute) began to
improve the functionality of temporal data in its 2010 software release: ArcGIS 10, with the addition of a
fairly straightforward ‘time slider’ (but see also the discussion of ‘space-time cubes’ later in this chapter). These
tools facilitate temporal animation (based upon start and end date attributes, effectively time-­stamps, of
objects within the geodatabase) in order to visualise the evolution of features in a geodatabase. Again
reliant on chronon-­based data, this approach is closely related to basic time-­slicing techniques (outlined
earlier) and as such, is predominantly useful for the consideration of time instants (single events) and
extents (features with a lifespan). Such functionality again privileges temporal models rooted in absolute time, which are less able to cope with the vague period ranges and fuzzy boundaries of our absolute dating methods (ceramic spot dates, radiocarbon probability ranges, etc.) or the relative chronologies that dominate archaeological temporalities.
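The logic underlying such a time slider is easy to reproduce outside any particular package. The sketch below (plain Python; the feature list and attribute names are hypothetical) filters the features ‘active’ at each slider position, which is effectively what a temporal animation does frame by frame:

```python
# Each feature carries start/end timestamps, here simply calendar years (BCE negative).
features = [
    {"id": "building_1", "start": -6500, "end": -6300},
    {"id": "ditch_2",    "start": -6400, "end": -6000},
    {"id": "oven_3",     "start": -6200, "end": -6150},
]

def visible_at(features, t):
    """Return the features whose lifespan contains the slider position t."""
    return [f for f in features if f["start"] <= t <= f["end"]]

# Stepping the 'slider' in 100-year increments yields the frames of an animation.
for t in range(-6500, -5999, 100):
    print(t, [f["id"] for f in visible_at(features, t)])
```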
Ultimately all of these conceptual approaches and software solutions are problematic for one funda-
mental reason: that time is still not truly represented as a continuum, but as a list of events or chronons
that represent incremental changes to space. The temporal data is effectively constrained by its location
and is still not dealt with as a free and discrete entity. Generally these spatial systems have one thing in
common in terms of the way they handle time: they all adopt an approach that requires the tabulation
of temporal data. Rather than being fully integrated, time is simply appended as metadata to spatial data
sets; an issue which, by definition, prohibits the development of a true T-­GIS (Langran, 1992, pp. 11–12).
Consequently, from a spatiotemporal perspective our main methodological challenge currently boils
down to one simple question: how do we code time in order to fully integrate it with our spatial data?

Conceptualisation: the spatiotemporal model


Given the lack of any coherent methodology for addressing the problem of integrating space and time
computationally, solutions have tended to be isolated and creative, repurposing existing GIS tools and functionality to address specific research questions. In theory, any of the computational
conceptual modeling approaches outlined in the introduction above could be implemented as real-­world
methodological solutions. As such, in lieu of any sort of standardised or ‘best practice’ approach to spa-
tiotemporal analysis, these broad approaches warrant some further discussion in order to contextualise
the case studies presented in the following section:

The space-­time cube


Possibly the best known approach to the integrated modeling of spatiotemporality is the space-­time cube
(after Hägerstrand, 1967; see also Rucker, 1977; Szegö, 1987). This is essentially a three-­dimensional cube
of data, where one of the spatial dimensions (conventionally height) is substituted for a time dimension.
In this approach “processes of two-­dimensional space [. . .] are played out along a third temporal dimen-
sion” (Langran, 1992, p. 37; see Figure 21.1). Although space-­time cubes have existed conceptually for
some time, this approach to modeling has rarely been applied within archaeological contexts (see John-
son, 2003; Bezzi, Bezzi, Francisci, & Gietl, 2006; Lieberwirth, 2008; Mlekuž, 2010; Crema, 2011; Scheder
Black, 2011; Orengo, 2013), because historically they have been particularly difficult to implement.
Figure 21.1 Lin and Mark’s conceptual data models, where raster datasets perhaps based upon spatial interpola-
tion (SI)/generalisation (SG) methods are converted into voxels, which may in turn be manipulated through
temporal interpolation (TI)/generalisation (TG) methods. The image highlights the difference between 2D and
3D raster data (voxels) in GIS (redrawn by Neil Gevaux after Lin & Mark, 1991, p. 988).
From a temporal perspective the most literal and difficult way to implement this concept is through
the temporal “voxelisation” of spatial data, where rasterised two-­dimensional data is converted into a
three-­dimensional voxel structure (a regular grid of 3D cells), “in which the height of the voxels is a time
interval” (Lin & Mark, 1991, p. 987), as opposed to the more conventional use of voxel height to represent
a Cartesian z-­coordinate. Hypothetically, it has been suggested that interpolation between the ‘original
data based time slices’ could be used to construct (or re-­construct) missing temporal layers, i.e. gaps in
the data (Lin & Mark, 1991, p. 987). Daly and Lock (1999) highlight the inevitable questions about what
would make “appropriate interpolation techniques”.
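A minimal NumPy sketch of this temporal voxelisation is given below. Two rasterised time slices on a common grid are stacked into a cube whose first axis is time, and a missing layer is reconstructed by linear interpolation; the choice of linear interpolation is purely illustrative and sidesteps precisely the question that Daly and Lock raise:

```python
import numpy as np

# Two 4 x 4 raster 'snapshots' of some measured value, at chronons t=0 and t=10.
slice_t0 = np.random.rand(4, 4)
slice_t10 = np.random.rand(4, 4)

# Stack into a space-time cube: axes are (time, y, x); voxel 'height' is a time interval.
cube = np.stack([slice_t0, slice_t10], axis=0)

def interpolate_slice(cube, t, t0=0, t1=10):
    """Reconstruct a missing temporal layer by linear interpolation between slices."""
    w = (t - t0) / (t1 - t0)
    return (1 - w) * cube[0] + w * cube[1]

slice_t4 = interpolate_slice(cube, t=4)  # an inferred layer filling the gap at t=4
print(slice_t4.shape)  # (4, 4)
```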
To date, therefore, most experimentation in large-scale voxel modeling has tended to focus upon the
field of commercial geological prospection and the interpolation of subsurface geology for analysis at
the landscape level (see, for example, van der Meulen et al., 2013, for a good example of this approach).
Recently, however, the process of making space-time cubes has become more attainable, since major
software releases have begun to incorporate tools for their production (alongside a suite of other tempo-
ral tools), and there is a growing trend in experimentation with the potential of voxel modelling within
archaeology (see, for example, Landeschi, 2018).
Related to the space-­time cube, Langran notes that “the trajectory of a two-­dimensional object
through time creates a worm-like pattern in this phase space” (Langran, 1992, p. 37) – a space-time path
(STP). Specifically, this concept builds upon the earlier ‘Time Geography’ of Hägerstrand (1967; see
also for example Kraak, 2003; Miller, 2005; Yu, 2006; Miller & Bridwell, 2009). Halls and Miller (1995,
1996) articulate a similar approach to modeling temporality along a space-­time path. They suggest that
a data object’s ‘lifespan’ can be represented as a mathematical curve, or ‘worm’, which can be viewed as
a ‘temporal arc’, constrained by a series of ‘temporal nodes’, or ‘todes’, which can influence the trajec-
tory of the worm. In real terms is it possible to repurpose the three-­dimensional utility of modern GIS
to represent time as the third variable in a three-­dimensional model (Lock & Harris, 1997)? Again, one
Cartesian dimension (height) is generally sacrificed so that time can be represented as the third axis of a
two-­dimensional spatial dataset (Daly & Lock, 1999, p. 288). This approach has been effectively deployed
in a number of cases, perhaps most evocatively by Kwan (2002a, 2002b; see also Kwan, 2008; Kwan &
Ding, 2008), in her efforts to visualise the everyday social geographies of individuals (outlined in the case
studies in this chapter).
The STPs have been used to great effect in the approach adopted by ‘Feminist GIS’ geographer Kwan
(2008) in her study of the experiences of Muslim women in Columbus, Ohio, within the context of a
post ’9/11’ USA. It highlights the potential of STPs in the visualisation of qualitative aspects of spatio-
temporality using spatial technologies. Kwan’s approach clearly demonstrates that it is possible to adapt
some of the conceptual approaches for visualising time and temporality in GIS, to explore an implicit
and somewhat intangible social context of spatiotemporality. This may have far reaching implications for
archaeology, where there may be considerable potential for exploring the lifespan of archaeological finds
(see, for example, Kraak, 2003).
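The geometry of a space-time path is equally simple to compute with. In the sketch below (coordinates and timestamps are invented), an STP is held as a sequence of (x, y, t) fixes; segment-by-segment speeds expose the ‘stations’ where the worm rises vertically through time without moving in space:

```python
import numpy as np

# A space-time path: (x, y, t) fixes, with time treated as the third axis.
stp = np.array([
    [0.0,   0.0,  0.0],
    [120.0, 40.0, 1.0],
    [120.0, 40.0, 3.0],   # a 'station': time passes with no spatial movement
    [300.0, 90.0, 4.5],
])

dxy = np.linalg.norm(np.diff(stp[:, :2], axis=0), axis=1)  # spatial displacement
dt = np.diff(stp[:, 2])                                    # elapsed time
print(dxy / dt)  # speed along each temporal arc; zero marks the station
```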

Sequent snapshots (time-­slicing)


Perhaps the simplest approach to spatiotemporal modeling is the ‘sequent snapshot’ or ‘time-­slice’
approach (see Figure 21.2). In this approach every chronon replicates a discrete layer of spatial data, gener-
ally manifesting as sequentially overlaid spatially-­registered grids. Time is therefore an appended attribute
of the spatial component, and each grid essentially represents the same area or ‘world state’ at a different
point in time (Langran, 1992, p. 39; Peuquet & Duan, 1995; Daly & Lock, 1999). Apart from the obvious
implications for data redundancy by (potentially massive) repetition, this time-­slicing approach has been
Figure 21.2 Example of Langran’s ‘Snapshot Approach’. In this case ‘snapshot’ (Si) presents a particular ‘world
state’ at time (ti).
Source: Note here that the temporal distance between ‘snapshots’ need not be uniform (redrawn by Neil Gevaux after Peuquet & Duan, 1995, p. 9; Langran, 1992).

criticised as being restrictive as it also constrains temporality to known points in time: “the events that
change one state to the next” are not explicitly recorded (Langran, 1992, p. 39). This is a linear approach,
and while much temporal data is distinctly non-­linear in character, this is not explicit when visualised as
a sequence of time-­slices (Halls & Miller, 1996, p. 12).
Although the case studies in the following section deploy more sophisticated approaches, utilising
bespoke software, it is remarkably straightforward to produce a simple time-­slice sequence in any GIS
through the creation of a simple ‘time’ field in the attribute table of a spatial dataset. This is done regularly with archaeological datasets to define and produce archaeological period maps and phased plans of excavation data, for example – which are, in effect, sequent snapshots.
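By way of a sketch (GeoPandas is used here; the file and field names are hypothetical), grouping on such a ‘time’ field splits one dataset into a sequence of spatially registered snapshots:

```python
import geopandas as gpd

# Hypothetical excavation dataset with a 'period' attribute acting as the chronon.
gdf = gpd.read_file("excavation_features.shp")  # assumed fields: id, period, geometry

# Each distinct value of the time field yields one sequent snapshot (a phase plan).
for period, snapshot in gdf.groupby("period"):
    snapshot.to_file(f"phase_{period}.shp")  # one layer per 'world state'
```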

Event-­based modeling
The remaining methods of spatiotemporal modeling might be distinguished from the others, as being
‘event oriented’ (see Table 21.1). Like all of the other approaches, time is distilled as a spatial ‘attribute’,
which can be symbolised accordingly and discrete changes in space through time can be modeled as dis-
crete temporal events usually in a separate (but linked) data table (Langran, 1992, p. 44). In the ‘Base State
With Amendments’ approach a single spatial layer forms the so-­called ‘base state’ of a specific geographic
region, and subsequent ‘amendments’ to this image are superimposed (Langran, 1992, pp. 39–41).
‘Space-­Time Composite Modeling’ builds upon this method by “flattening” all the temporal data into
a single layer, where formal coding and topology are utilised to reconstruct the temporal sequence. This
can manifest either as a raster-­based “temporal grid” solution or a vector-­based approach (Langran, 1992,
pp. 43 & 46–47; see also Hazelton, 1991; Kelemis, 1991). The former sees a ‘temporal list’ attached to each
pixel, representing a specific location on a spatially registered grid (Figure 21.3). In the latter, polygon
‘entities’ are imbued with inherent temporal attributes representing incremental change (Figure 21.4)
distinct from their neighbours (Langran, 1992, p. 47). With the polygon approach it is technically pos-
sible to generate new ‘regions’ from the intersection of superjacent polygons holding information about
the ‘change’ between them (see Lin & Mark, 1991).
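The temporal-grid idea reduces to a very small amount of code. The sketch below (cell values invented) attaches a variable-length list of (chronon, value) amendments to each cell of a base state, from which any past ‘world state’ can be reconstructed:

```python
# Base state with amendments: each grid cell keeps a 'temporal list' of changes.
base_state = {(0, 0): "field", (0, 1): "field", (1, 0): "field", (1, 1): "field"}
amendments = {
    (0, 1): [(1850, "urban")],                  # cell changed once
    (1, 1): [(1850, "urban"), (1920, "park")],  # cell changed twice
}

def world_state(cell, t):
    """Reconstruct a cell's value at chronon t from the base state plus amendments."""
    value = base_state[cell]
    for chronon, new_value in amendments.get(cell, []):
        if chronon <= t:
            value = new_value
    return value

print(world_state((1, 1), 1900))  # 'urban'
```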
These event-­based approaches have the advantage over conventional snapshot approaches in that they
only store temporal data related to specific locations, also reducing data redundancy (Peuquet & Duan,
1995, p. 10). However, they afford little insight into the process of change itself (Daly & Lock, 1999,
p. 288).
Figure 21.3 Example of Langran’s ‘Temporal Grid’ solution – here a temporal grid is created and a variable
length ‘list’ is attached to each grid cell to denote successive changes.
Source: redrawn by Neil Gevaux after Langran, 1992, p. 46

Figure 21.4 Example of Langran’s ‘Amendment Vector Approach’ – showing urban encroachment represented as a base state (left) with incremental amendment vectors.
Source: redrawn by Neil Gevaux after Langran, 1992, p. 40

Case studies
The following case studies can be related to the conceptual models outlined above. As noted, given the
scale of digital practice in archaeology there are few who have fully engaged with computing space and
time. Consequently, there are relatively few robust archaeological case studies.
Dynamic mapping: TimeMap


Perhaps one of the best-­known examples of an applied and integrated spatiotemporality is the Time-
Map project (Figure 21.5) (Johnson, 1997, 2002a, 2002b, 2004; Johnson & Wilson, 2002, 2003), which is essentially a method of ‘dynamic mapping’. As a wholly
digital approach, dynamic mapping differs from ‘traditional’ analogue sequences of static maps (i.e.
period or phased plans) or traditional sequent snapshot approaches (such as those adopted by Snow,
1997; or Spikins, 1997) in that the time-­based maps can be used to visualise spatial change through
map-­based animations (Johnson & Wilson, 2003, p. 125). At the time of its development the Time-
Map project “developed a methodology for recording time-­dependent features, based upon vector
GIS” which Johnson and Wilson (2003) called the “Snapshot-­Transition Model”. Their approach was
designed to be visualised through a bespoke software interface, the TimeMap Data Viewer (TMView)
(Johnson & Wilson, 2003).
This is essentially bespoke two-­dimensional cartographic display software, with explicit support for
‘fuzzy’-­temporal manipulation and querying. It models the history of ‘features’ “as a series of [raster or

Figure 21.5 The TimeMap Data Viewer (TMView) map space. A colour version of this figure can be found
in the plates section.
Source: from Johnson, 2002a
vector] snapshots at known points in time, and a series of transitions between these snapshots” (Johnson,
1997). It allows geographically registered historical features, maps and satellite imagery to be superim-
posed and animated in an event-­based system. TimeMap is not a topological system. It does not record
the relationship between features in space and time; it simply records their location (Johnson, 1997, p. 6).
As such TimeMap is not a true ‘spatiotemporal system’. Its primary function is the dynamic representa-
tion of the past with limited capability for more complex spatiotemporal analysis. More recently, the
addition of temporal functionality in ArcGIS (including a time slider and temporal animation function)
has meant the notion of dynamic mapping pioneered by the TimeMap project is rapidly becoming an
integral spatiotemporal tool in modern GIS.

‘Object lifespan’ approaches


The following case studies represent variant approaches to spatiotemporal visualisation that utilise existing
data-­structures and functionality of current GIS to ‘embed’ lifespans into (usually) vector-­based spatial
objects, by assigning them start and end times.

Stratigraphically derived spatiotemporal modeling in GIS at Çatalhöyük


Work at the Neolithic site of Çatalhöyük has seen experimentation with this approach at an intra-­site
level with spatiotemporal modeling of some of the site’s stratigraphic data (Taylor et al., 2015; Taylor,
2016). This multidisciplinary, collaborative study sought to embed data pertaining to the material culture
of a well-­preserved burnt building within a spatiotemporal model, rooted in the relative temporality of
the structure’s stratigraphic sequence. Drawing upon the large amount of digitised data stored in the site’s
intra-­site GIS and databases (see Berggren et al., 2015; Taylor et al., 2018), the aim of the project was to
produce a series of

spatiotemporal animations to present the results of this collaborative study as a form of prototype
‘visual biography’, [that would be] more dynamic and nuanced than conventional phasing, that
might be used to underpin and illustrate a social narrative of the building.
(Taylor et al., 2015, p. 127)

In the absence of absolute dates for every stratigraphic unit in the sequence, the project took an
approach that involved parsing through the stratigraphic matrix of the structure (Harris, 1989), to define
a minimum number of stratigraphic events that could be cross-­correlated and coded in relation to one
another, in such a way that they serve as a relative start/end ‘node’ and allow individual units to be defined
in terms of their lifespan, or ‘temporal arc’ (Taylor et al., 2015, pp. 133–146). This ‘temporally-­enabled’
spatial data allowed the in-­built time-­slider functionality of ArcGIS to visualise the spatial data as a series
of dynamic animations that could be symbolised using any of the other data linked to the stratigraphic
units as attribute tables. The result is a powerful spatial visualisation that can be linked to other visualisa-
tions of statistical approaches (such as density, or cumulative frequency of material culture) in order to
demonstrate a wide variety of aspects of the material culture and depositional sequence through time
(Figure 21.6). Although the project was a pilot study, the overall goal was to produce a tool to illustrate
and underpin rich multidisciplinary “visual narratives” about the depositional sequence and its relation-
ship to the material culture it yields, and the overall ‘life-­cycle’ of the structure (Chadwick, 2001; after
Lucas, 2001; Taylor et al., 2015, pp. 146–148).
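The general logic of deriving relative temporal arcs from a stratigraphic matrix can be sketched as a graph problem (an illustration of the principle using NetworkX, not the project’s published workflow; the unit names are invented):

```python
import networkx as nx

# A toy Harris matrix: each edge runs from an earlier unit to the unit that seals it.
matrix = nx.DiGraph([
    ("floor_1", "make_up_2"), ("make_up_2", "floor_3"),
    ("oven_4", "floor_3"), ("floor_3", "collapse_5"),
])

# A topological ordering of the matrix yields relative 'nodes' that can stand in
# for absolute dates when coding each unit's start and end.
order = {unit: rank for rank, unit in enumerate(nx.topological_sort(matrix))}

# A unit's 'temporal arc' runs from its own node to the earliest unit sealing it.
arcs = {unit: (order[unit],
               min((order[s] for s in matrix.successors(unit)),
                   default=order[unit] + 1))
        for unit in matrix}
print(arcs)  # unit -> (relative start node, relative end node)
```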
Figure 21.6 Sample frame from an animation in the ‘Up In Flames’ study that combined synchronised ani-
mated density graphs (produced in the R Software Environment) with animated density maps (produced in
ArcGIS). A colour version of this figure can be found in the plates section.
Source: from Taylor et al., 2015, p. 145

Object lifespans in the urban cityscape of Tours


A variant on the concept of an ‘object lifespan’ approach to spatiotemporal modelling in spatial technolo-
gies has been utilised fairly effectively in the study of the historic development of the urban cityscape
of Tours, France (Lefebvre, Rodier, & Saligny, 2008; Lefebvre, 2009). In this case study Lefebvre et al.
essentially develop a conceptual approach towards the data structure of spatial objects within the city.
They define a data structure called ‘Historical Objects’ (OH), within the urban fabric, which is subject to
three systems: social use [i.e. function] (EF), space (ES) and time (ET) (abbreviated as FET) (Lefebvre et al.,
2008, p. 218). In terms of data structure, the ontology of this ‘OH-­FET’ model focuses upon the ‘social
use’ (function) of buildings within the urban environment as the basis of symbolisation within the spatial
system, whilst the temporal system (phasing, accepted as being linear) was based upon the interpreted
periodisation of buildings within the urban fabric (Figure 21.7). The result is an effectively balanced
model that does not privilege one facet of the data over another, “[t]hese different analyses lead to an
overall view of the social, spatial and temporal structure of the selected [Historical Object]” (Lefebvre
et al., 2008, p. 218).
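The shape of the data structure can be sketched as follows (a deliberately simplified rendering of the published model; the attribute names are invented):

```python
from dataclasses import dataclass

@dataclass
class HistoricalObject:
    """An 'OH' in the spirit of the OH-FET model: one object, three balanced
    systems - social use (EF), space (ES) and time (ET)."""
    function: str      # EF: social use, the basis for symbolisation
    footprint: object  # ES: a geometry (a placeholder here)
    start: int         # ET: interpreted periodisation (linear phasing)
    end: int

church = HistoricalObject("worship", footprint=None, start=1050, end=1562)
print(church.function, church.start, church.end)
```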
Figure 21.7 Diagram showing the relationship between the various inputs and outputs of the ‘OH_FET’ urban
fabric model (social use, space and time) and the dynamics of the potential analytical outputs.
Source: from Rodier & Saligny, 2010, p. 32

Towards an archaeological temporal GIS


Acknowledging the fact that at an archaeological disciplinary level “chronological time is itself com-
plex”, Green (2011b, pp. 213–214) highlights a number of theoretical fundamentals, including that it
is both ‘multi-­linear and topological’ (cf. Harris matrix diagrams), often ‘uncertain’ or probabilistic (cf.
radiocarbon dates), or can simply represent points in time that “may only act as a terminus post quem for
its context of discovery” (cf. numismatics or dendronology). Noting that practice relating to chronologi-
cal time in archaeology has been developing quickly in recent years with advances in Bayesian model-
ling (Bayliss, 2007) or biographical approaches (see discussion below), Green (2011a, 2011b) goes on to
identify a need for bespoke spatiotemporal tools for archaeologists, specifically the development of an
archaeological temporal GIS (T-­GIS).
Green’s (2011a) research sought to bring bespoke temporal functionality to ArcGIS, and the resulting
TGIS, created specifically for archaeologists, took into account the particular idiosyncrasies and complexi-
ties of archaeological temporal data outlined earlier. It was designed to visualise and analyse different types
of dating evidence and work with the most commonly used dating techniques in British archaeology
(specifically: typological, numismatic/historical, dendrochronological, radiocarbon, and thermolumines-
cence/OSL dates). A particularly useful innovation of this TGIS was that it could calculate and symbolise
a number of probabilistic functions and higher-level analyses of the spatiotemporal data stored in the
GIS (including the probability of calibrated OxCal dates, and inferred dates generated through Bayesian
modelling). This included, for example, the probability and topological implications of a date within a
Figure 21.8 Time-­GIS (TGIS) screenshot showing dates symbolised according to temporal topology. The
colour coding is according to the temporal topological relationship between each date and the currently
selected time period. A colour version of this figure can be found in the plates section.
Source: from Green, 2011b, p. 217

layer falling within a selected period, based upon the percentage overlap between the date’s range and the
selected period (see Figure 21.8) (Green, 2011b).
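The core calculation is easy to reproduce. Assuming, for simplicity, a uniform probability across a date’s range (Green’s TGIS works with full calibrated distributions, so this is only a crude approximation), the fractional overlap with a selected period is:

```python
def overlap_fraction(date_range, period):
    """Fraction of a date's range falling within a selected period; both arguments
    are (start, end) year tuples, and a uniform distribution is assumed."""
    d0, d1 = date_range
    p0, p1 = period
    overlap = max(0, min(d1, p1) - max(d0, p0))
    return overlap / (d1 - d0)

# A dating range of cal AD 540-660 against a selected period of AD 600-700:
print(overlap_fraction((540, 660), (600, 700)))  # 0.5
```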
The development of this TGIS has helped to establish ‘aoristic’ or ‘fuzzy’ approaches to handling
archaeological chronological data (see also Fusco & de Runz, this volume), such as those implemented in
recent work considering the way in which time and space might be addressed in large complex archaeo-
logical datasets (e.g. those generated by the Portable Antiquities Scheme – PAS), whereby

rather than assigning artefacts to relative typochronological phases (e.g. the appropriate coin period
or pottery phase), this approach considers the probability that the objects under consideration
belong to one or more time-­slices of equal (or less commonly unequal) length (e.g. 50 years) across
a given study period.
(Cooper & Green, 2017).
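A minimal sketch of such an aoristic weighting (again assuming, crudely, a uniform distribution across the object’s date range) distributes each object’s probability across equal time-slices:

```python
def aoristic_weights(start, end, bin_width=50):
    """Spread an object's dating probability across equal time-slices, assuming
    a uniform distribution over its (start, end) date range."""
    weights = {}
    t = (start // bin_width) * bin_width  # first time-slice containing 'start'
    while t < end:
        overlap = min(end, t + bin_width) - max(start, t)
        weights[(t, t + bin_width)] = overlap / (end - start)
        t += bin_width
    return weights

# A coin of AD 260-410 spread across 50-year slices; the weights sum to 1.
print(aoristic_weights(260, 410))
```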

Conclusion
Recent releases of many off-­the-­shelf GIS packages have seen the inclusion of increasingly complex spa-
tiotemporal functionality, suggesting that the tide is turning with regard to the way in which time and
temporality might be integrated into spatial datasets. For archaeologists, the importance of the spatiotem-
poral integration of archaeological data cannot be overstated, particularly as the discipline continues to
benefit from recent innovations in the Bayesian modelling of radiocarbon dates (Bayliss, Bronk-­Ramsey,
Van der Plicht, & Whittle, 2007; Bayliss et al., 2015; Whittle, Richardson, Healy, Alton, & Bayliss, 2011).
Taken together, the various models and methodological implementations presented in the
discussion above highlight that there are a number of ways to approach the embedding of temporal data within GIS and spatial technologies.
However, the methodologies discussed in this chapter tend to share a strong focus on codifying time
so it fits within common relational data structures (complete with the inherent restrictions of linear spa-
tiotemporality and tabular structure), either as an integral attribute of spatial data, or as a layer of related
data that can be analysed and visualised separately. Even with renewed focus on digital technologies in
recent decades, and radical advancements in computer science, we have still not achieved even the simplest fully functional TGIS as defined by Langran (1992) or Peuquet (2002).
poral analysis to progress in Archaeology there is a distinct need to rethink the underlying data-­structures
that drive spatial technologies, and carry out more concerted research into how space and time might be
integrated to inform our analysis and outputs (see discussion in De Roo et al., 2013, pp. 619–620).

The potential of objects, graphs and ontologies


Since the early 1990s, there have been increasing attempts to move away from conventional relational
database models and efforts to develop more efficient models based upon archaeological entities defined
within a relational object-­oriented (O-­O) database model (see for example Andresen & Madsen, 1992;
Feder, 1993; Andresen & Madsen, 1996a, 1996b; Tschan, 1998; Madsen, 2003). The potential value of
O-O data structures was quickly recognised by the archaeological community that engages with spatial technologies. Indeed, in 1999 Daly and Lock stated:

the flexibility in defining object relationships that OO GIS and databases provide has a tremendous
potential for redefining how archaeologists can manage temporal variables. The possibility exists
for unique temporal relationships to be constructed, unfettered by predetermined categories (for
the basis of OO is the construction of the categories and the extents of the relationships that can
exist between them).
(Daly & Lock, 1999, p. 289)

In traditional relational database models (which are much easier to design and implement) the archaeo-
logical entity is represented by a table and relationships between archaeological entities (temporal or
otherwise) are reflected in the relationships between the database tables. By contrast object-­oriented
databases focus upon modelling the archaeological entity as an object, which can “participate in events”.
This means they are defined both by what they are and what they do (Richards, 1998, p. 333). As such
temporal information may be embedded within the objects themselves by definition. The capacity for object entities to inherit properties of the entities from which they are comprised offers a potentially seamless transition between various temporal scales and granularities.
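The point can be illustrated with a few lines of Python (a toy sketch, not a database implementation): a composite entity ‘inherits’ its temporal extent from the parts that constitute it, giving a seamless move between granularities:

```python
class ArchaeologicalEntity:
    """An object that embeds its own temporal information by definition."""
    def __init__(self, name, start=None, end=None):
        self.name, self.start, self.end = name, start, end
        self.parts = []  # constituent entities at a finer granularity

    def lifespan(self):
        """A composite inherits its temporal extent from its constituent parts."""
        if self.parts:
            spans = [p.lifespan() for p in self.parts]
            return (min(s for s, _ in spans), max(e for _, e in spans))
        return (self.start, self.end)

building = ArchaeologicalEntity("Building 52")
building.parts = [ArchaeologicalEntity("floor", -6380, -6340),
                  ArchaeologicalEntity("oven", -6360, -6320)]
print(building.lifespan())  # (-6380, -6320), inherited from the parts
```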
More recently, archaeologists have been experimenting with ‘Semantic Web’ technologies, which are reliant
on a ‘graph data’ structure, using subjects, predicates and objects (often referred to as triples). Semantic
Web data is mapped to appropriate controlled vocabularies, thesauri and/or ontologies, allowing interop-
erability with other data mapped to the same authoritative structure (Wright, 2011, p. 13). Within
archaeology, data is often mapped to the domain ontology known as the CIDOC (International Council
for Documentation) Conceptual Reference Model (CRM). CIDOC-­CRM is the ISO standard for the
cultural heritage domain, and may prove to be a particularly useful means of coding sophisticated multi-­
layered temporal information into spatial objects (Taylor & Wright, 2012), potentially offering a more
holistic and interoperable form of spatiotemporality. This ultimately begs the question as to whether
there may be potential in this suite of technologies for handling a more elegant integration of different
Table 21.2 Table summarising the seven baseline temporal operators of Allen’s interval algebra (1983), which, along
with their inversions, define a total of 13 relationships between two temporal intervals.

Relation    Inverse     Interpretation

X < Y       Y > X       X takes place before Y
X m Y       Y mi X      X meets Y (i = inverse)
X o Y       Y oi X      X overlaps with Y
X s Y       Y si X      X starts with Y
X d Y       Y di X      X during Y
X f Y       Y fi X      X finishes with Y
X = Y       –           X is equal to Y

types of temporal information within a single data structure (such as absolute and relative temporal data)
perhaps drawing upon the complex nuances offered by ‘Allen operators’ (Allen, 1983). These operators
define 13 base relations for temporal reasoning that capture the relationship between a pair of intervals, as tabulated in Table 21.2.
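Allen’s algebra is compact enough to implement directly. The sketch below classifies a pair of intervals into one of the 13 base relations (closed intervals with start < end are assumed):

```python
def allen_relation(x, y):
    """Return Allen's (1983) base relation holding between intervals x and y,
    each given as a (start, end) pair with start < end."""
    x0, x1 = x
    y0, y1 = y
    if x1 < y0: return "<"                          # X before Y
    if x0 > y1: return ">"                          # X after Y (inverse)
    if x1 == y0: return "m"                         # X meets Y
    if x0 == y1: return "mi"                        # X met by Y
    if (x0, x1) == (y0, y1): return "="             # X equals Y
    if x0 == y0: return "s" if x1 < y1 else "si"    # X starts Y / started by Y
    if x1 == y1: return "f" if x0 > y0 else "fi"    # X finishes Y / finished by Y
    if y0 < x0 and x1 < y1: return "d"              # X during Y
    if x0 < y0 and y1 < x1: return "di"             # Y during X
    return "o" if x0 < y0 else "oi"                 # X overlaps Y / overlapped by Y

print(allen_relation((-6400, -6200), (-6200, -6000)))  # 'm': the intervals meet
```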
Extensions to the CRM, including the CRM-­EH and particularly the CRM-­archaeo, have begun
to define and model these operators, and take into account other forms of temporal models (including
‘Spacetime volume’). In experimenting with the application of the CRM, considerable work has been
done demonstrating how dates and timespans (instances and intervals) can be aligned at a disciplinary
level for use with Semantic Web modeling (Binding, 2011). Further research has also been conducted
considering ways to semantically handle spatial data (Wright, 2011; Doerr & Hiebel, 2013; Hiebel,
Doerr, & Eide, 2013; Hiebel, Doerr, Hanke, & Masur, 2014; Hiebel, Doerr, & Eide, 2016) and to some
extent stratigraphic data (Cripps, Greenhalgh, Fellows, May, & Robinson, 2004; Cripps & May, 2010;
Tudhope, Binding, Jeffrey, May, & Vlachidis, 2011). This work has culminated in the construction of
some interesting prototypes for geo-­based Semantic Web applications (see, for example, the Pelagios
Commons project: http://commons.pelagios.org; Isaksen, Barker, Simon, & de Soto, 2014; Barker, Simon,
Isaksen, & de Soto Cañamares, 2016) and robust efforts to define broader temporal definitions in order
to facilitate the semantic interoperability of temporal data (see, for example, the PeriodO project: http://perio.do; Rabinowitz, 2014; Rabinowitz, Shaw, Buchanan, Golden, & Kansa, 2016). However, despite
this, there is considerable work to be done to make these technologies user friendly enough for wider
implementation. These fields of research will undoubtedly have an important impact upon the future of
‘spatio-­temporal technologies’.

The potential of qualitative GIS: towards landscapes, taskscapes and spatiotemporal narratives

Ultimately, as archaeological practitioners of spatial technologies move away from the limitations of a
Cartesian and Euclidean concept of space/time as a measurable set of ‘2→3→4 Dimensions’, there is a
commensurate need to continue to refine space/time in this sense from a basic software development
perspective. Perhaps what practitioners in archaeological spatial technologies should really be striving
for, given the importance of various temporalities to the discipline, is a move towards a more inferred,
interpretative spatiotemporality, taking into account past perceptions of time, including the landscapes
and taskscapes of Ingold (1993), or the narrative biographical concepts of temporality explored by the
likes of Lucas (2005), Yamin (1998, 2001), Beaudry (1998), and King (2006). These qualitative analytical
methods resonate with trends laid out in the growing body of ‘critical GIS’ literature that has emerged
since the mid 1990s, in response to post-­modern critiques of GIS technologies (Pickles, 1995), and the
consolidation of a discrete and complementary field of GIScience (for a review of this critique and history
of this sub-­discipline see Elwood, 2006; O’Sullivan, 2006; Pavlovskaya, 2006). Much of the discourse of
critical GIS highlights the obvious tension between the ease of producing more conventional ‘represen-
tational’ outputs based upon Euclidean spatial and temporal data constructs and the difficulties of using
GIS to offer more fluid, qualitative and interpretative ‘non-­representational’ spatiotemporal outputs. This
is further echoed in recent calls for a more non-­representational approach to applied GIS in archaeology,
which seek to understand the world as being “spatio-­temporally contingent”, where “the past [is not]
understood as a frozen and pre-­g iven entity [. . .] but rather as something that continuously melts down
and is remade in the present” (Hacıgüzeller, 2012, p. 255). Perhaps the potential affordances of the ‘graph
data’ approaches emerging from research into semantic ontologies and the still underexploited ‘O-­O’ DB
systems will ultimately help archaeologists to visualise a more complex, socially oriented and integrated
spatiotemporality.

References
Abraham, T., & Roddick, J. F. (1999). Survey of spatio-­temporal databases. GeoInformatica, 3(1), 61–99.
Allen, J. F. (1983). Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11), 832–843.
Andresen, J., & Madsen, T. (1992). Data structures for excavation recording: A case of complex information manage-
ment. In C. U. Larsen (Ed.), Sites & monuments: National archaeological records (pp. 49–67). Copenhagen: National
Museum of Denmark.
Andresen, J., & Madsen, T. (1996a). IDEA: The integrated database for excavation analysis. In H. Kamermans &
K. Fennema (Eds.), Interfacing the past: Computer applications and quantitative methods in archaeology CAA95
(pp. 3–14). Leiden: Analecta Praehistorica Leidensia.
Andresen, J., & Madsen, T. (1996b). Dynamic classification and description in the IDEA. Archeologia e Calcolatori,
7, 591–602.
Bailey, G. N. (1983). Concepts of time in quaternary prehistory. Annual Review of Anthropology, 12, 165–192.
Bailey, G. N. (1987). Breaking the time barrier. Archaeological Review from Cambridge, 6, 5–20.
Bailey, G. N. (2007). Time perspectives, palimpsests and the archaeology of time. Journal of Anthropological Archaeol-
ogy, 26, 197–223.
Bailey, G. N. (2008). Time perspectivism: Origins and consequences. In S. Holdaway & L. Wandsnider (Eds.), Time
in archaeology: Time perspectivism revisited (pp. 13–30). Salt Lake City: University of Utah Press.
Barker, E., Simon, R., Isaksen, L., & de Soto Cañamares, P. (2016). The pleiades gazetteer and the pelagios project.
In M. L. Berman, R. Mostern, & H. Southall (Eds.), Placing names: Enriching and integrating gazeteers (pp. 97–109).
Bloomington: Indiana University Press.
Barrett, J. C. (1994). Fragments from antiquity: An archaeology of social life in Britain, 2900–1200 BC. Oxford: Blackwell.
Bayliss, A., Brock, F., Farid, S., Hodder, I., Southon, J., & Taylor, R. E. (2015). Getting to the bottom of it all: A
Bayesian approach to dating the start of Çatalhöyük. Journal of World Prehistory, 28(1). https://doi.org/10.1007/s10963-015-9083-7
Bayliss, A., Bronk Ramsey, C., Van der Plicht, J., & Whittle, A. (2007). Bradshaw and Bayes: Towards a timetable for
the Neolithic. Cambridge Archaeological Journal, 17(1), 1–28.
Beaudry, M. C. (1998). Farm journal: First person, four voices. Historical Archaeology, 32(1), 20–33.
Berggren, Å., Dell’Unto, N., Forte, M., Haddow, S., Hodder, I., Issavi, J., . . . Taylor, J. (2015). Revisiting reflex-
ive archaeology at Çatalhöyük: Integrating digital and 3D technologies at the trowel’s edge. Antiquity, 88,
433–448.
Bezzi, A., Bezzi, D., Francisci, D., & Gietl, R. (2006). L'utilizzo di Voxel in campo archeologico. Paper given at Settimo Meeting degli Utenti Italiani di GRASS, febbraio, Genova, Italy.
Binding, C. (2011). Relatively speaking: Temporal alignment for archaeological data. Paper given at KCL, London,
PELAGIOS Project Workshop.
Bloch, M. (1977). The past and the present in the present. Man, 12, 278–292.
Bourdieu, P. (1977). Outline of a theory of practice. Cambridge: Cambridge University Press.
Bradley, R. (1991). Ritual, time, and history. World Archaeology, 23, 209–219.
Bradley, R. (1998). The significance of monuments. London: Routledge.
Braudel, F. (1972). The Mediterranean and the Mediterranean world in the age of Philip II. New York: Harper and Row.
Braudel, F. (1980). On history. London: Weidenfeld & Nicolson.
Carlstein, T., Parkes, D., & Thrift, N. (Eds.). (1975). Human activity and time geography. London: Unwin Hyman.
Carlstein, T., Parkes, D., & Thrift, N. (Eds.). (1978). Timing space and spacing time, Vol. II: Human activity and time geography. London: Edward Arnold Ltd.
Castleford, J. (1992). Archaeology, GIS and the time dimension: An overview. In G. Lock & J. Moffett (Eds.), CAA91.
Computer applications and quantitative methods in archaeology 1991 (BAR International Series S577) (pp. 95–106).
Oxford: Tempus Reparatum.
Chadwick, A. M. (2001). What have the post-­processualists ever done for us? Towards an integration of theory and practice;
and radical field archaeologies. Paper given at University of York, Interpreting Stratigraphy Group Meeting, York
(pp. 9–36).
Clark, G. (1992). Space, time, and man: A prehistorian's view. Cambridge: Cambridge University Press.
Cooper, A., & Green, C. (2017). Big questions for large, complex datasets: Approaching time and space using com-
posite object assemblages. Internet Archaeology, 45. https://doi.org/10.11141/ia.45.1
Crema, E. R. (2011). Aoristic approaches and voxel models for spatial analysis. In E. Jerem, F. Redő, & V. Szeverényi
(Eds.), On the road to reconstructing the past: Computer applications and quantitative methods in archaeology (CAA): Pro-
ceedings of the 36th international conference, Budapest, April 2–6, 2008 (pp. 179–186) (CD-­ROM 199–106). Budapest:
Archaeolingua.
Cripps, P., Greenhalgh, A., Fellows, D., May, K., & Robinson, D. (2004). Ontological modelling of the work of the centre for archaeology (unpublished report, September 2004).
Cripps, P., & May, K. (2010). To OO or not to OO? Revelations from defining an ontology for an archaeological
information system. In F. Niccolucci & S. Hermon (Eds.), Beyond the artefact – Digital interpretation of the past.
Computer applications and quantitative methods in archaeology (CAA). Proceedings of the 32nd international conference,
Prato, Italy, 13–17 April 2004 (pp. 57–61). Budapest: Archaeopress.
Daly, P. T., & Lock, G. R. (1999). Timing is everything: Commentary on managing temporal variables in geographic information
systems. Paper given at Barcelona, Computer Applications and Quantitative Methods in Archaeology: CAA98,
March 1998 (pp. 287–293).
De Roo, B., Bourgeois, J., & Maeyer, P. D. (2013). On the way to a 4D archaeological GIS: State of the art, future
directions and need for standardization. Proceedings of the 2013 Digital Heritage International Congress, 2.
Doerr, M., & Hiebel, G. (2013). CRMgeo: Linking the CIDOC CRM to GeoSPARQL through a spatiotemporal refinement (unpublished report, April 2013).
Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17, 891–921.
Elwood, S. (2006). Critical issues in participatory GIS: Deconstructions, reconstructions, and new research directions.
Transactions in GIS, 10(5), 693–708.
Evans-­Pritchard, E. (1939). Nuer time reckoning. Africa, 12, 189–216.
Fabian, J. (1983). Time and the other: How anthropology makes its object. New York: Columbia University Press.
Feder, J. (1993). Museums index: An object oriented approach to the design and implementation of a data driven
data base management system. In J. Andresen, T. Madsen, & I. Scollar (Eds.), Computer applications and quantitative
methods in archaeology: CAA 1992 (pp. 221–228). Aarhus: Aarhus University Press.
Frachetti, M. (1998). Two times for Val Camonica (unpublished Masters Dissertation). University of Cambridge,
Cambridge.
Gell, A. (1992). The anthropology of time: Cultural constructions of temporal maps and images. Oxford: Berg.
Giddens, A. (1984). The constitution of society: Outline of the theory of structuration. Cambridge: Polity Press.
Gosden, C. (1994). Social being and time. Oxford: Blackwell.
Green, C. (2011a). Winding Dali’s clock: The construction of a fuzzy temporal-­GIS for archaeology. BAR International Series
2234. Oxford: BAR Publishing.
Green, C. (2011b). It's about time: Temporality and intra-site GIS. In E. Jerem, F. Redő, & V. Szeverényi (Eds.), CAA
2008: On the road to reconstructing the past. Budapest: Archaeolingua.
Hacıgüzeller, P. (2012). GIS, critique, representation and beyond. Journal of Social Archaeology, 12(2), 245–263.
Hägerstrand, T. (1967). Innovation diffusion as a spatial process. Chicago, IL: The University of Chicago Press.
Hägerstrand, T. (1975). Survival and arena: On the life history of individuals in relation to their geographic environment. In T. Carlstein, D. Parkes, & N. Thrift (Eds.), Human activity and time geography. London: Unwin Hyman.
Halls, P. J., & Miller, A. P. (1995). Moving GIS into the fourth dimension . . . or the case for todes. Paper given at the GIS
Research UK 1995 Conference, Department of Surveying, University of Newcastle upon Tyne. Newcastle upon
Tyne (pp. 41–43).
Halls, P. J., & Miller, A. P. (1996). Of todes and worms: An experiment in bringing time to ArcInfo. Paper given at the ESRI
European Users Conference. Watford (pp. 1–15).
Harris, E. C. (1989). Principles of archaeological stratigraphy (2nd ed.). London: Academic Press.
Harvey, D. (1991). The condition of postmodernity: An enquiry into the origins of cultural change. Oxford: Blackwell.
Hazelton, N. W. J. (1991). Integrating time, dynamic modelling and geographic information systems: Development
of four-­dimensional GIS (unpublished Ph.D). Melbourne: Department of Surveying and Land Information, The
University of Melbourne.
Headland, T., Pike, K., & Harris, M. (1990). Emics and etics: The insider/outsider debate. Newbury Park, CA: Sage
Publications.
Heidegger, M. (1953). Being and time. New York: State University of New York Press.
Hiebel, G., Doerr, M., & Eide, Ø. (2013). CRMgeo: Integration of CIDOC CRM with OGC standards to model spatial
information. Paper given at CAA 2013, 41st Conference in Computer Applications and Quantitative Methods in
Archaeology. Perth, Australia.
Hiebel, G., Doerr, M., & Eide, Ø. (2016). CRMgeo: A spatiotemporal extension of CIDOC-­CRM. International
Journal on Digital Libraries, 18(4), 271–279. https://doi.org/10.1007/s00799-­016-­0192-­4
Hiebel, G., Doerr, M., Hanke, K., & Masur, A. (2014). How to put archaeological geometric data into context?
Representing mining history research with CIDOC CRM and extensions. International Journal of Heritage in the
Digital Era, 3(3), 557–578. doi: 10.1260/2047-­4970.3.3.557
Husserl, E. (1966 [1887]). The phenomenology of internal time consciousness. Bloomington: Indiana University Press (Midland Books).
Ingold, T. (1993). The temporality of the landscape. World Archaeology, 25, 152–174.
Isaksen, L., Barker, E., Simon, R., & de Soto, P. (2014). Pelagios and the emerging graph of ancient world data. WebSci’14
Proceedings of the ACM Conference on Web Science, 22–26 June, Bloomington, IN, USA (pp. 197–201).
Johnson, I. (1997). Mapping the fourth dimension: The TimeMap project. Paper given at University of Birmingham,
Computer Applications and Quantitative Methods in Archaeology: CAA97, 1999, 82/CDROM.
Johnson, I. (2002a). Contextualising archaeological information through interactive maps. Internet Archaeology, 12.
https://doi.org/10.11141/ia.12.9
Johnson, I. (2002b). Mapping the humanities: The whole is greater than the sum of its parts. In Proceedings of digital
resources in the humanities. Sydney: Research Institute of Humanities and Social Sciences.
Johnson, I. (2003). Aoristic analysis: Seeds of a new approach to mapping archaeological distributions through time.
In Proceedings of the computer applications into archaeology conference (pp. 448–452). BAR International Series 1227.
Oxford: BAR Publishing.
Johnson, I. (2004, July/August). Putting time on the map: Using TimeMap for map animation and web delivery.
Geoinformatics, 26–29.
Johnson, I., & Wilson, A. (2002). The TimeMap Kiosk: Delivering historical images in a spatio-temporal context. Paper given
at Gotland, Sweden, Computer Applications and Quantitative Methods in Archaeology: CAA 2001, April 2001
(pp. 71–78).
Johnson, I., & Wilson, A. (2003). The TimeMap project: Developing time-based GIS display for cultural data. Journal
of GIS in Archaeology, 1, 123–135.
Kelmelis, J. A. (1991). Time and space in geographic information: Toward a four-dimensional spatiotemporal data model
(unpublished PhD). Pennsylvania: The Pennsylvania State University.
King, J. A. (2006). Historical archaeology, identities and biographies. In D. Hicks & M. C. Beaudry (Eds.), The
Cambridge companion to historical archaeology. Cambridge: Cambridge University Press.
Kraak, M. J. (2003). The space-time cube revisited from a geovisualization perspective. In ICC 2003: Proceedings of the 21st international cartographic conference: Cartographic renaissance (pp. 1988–1996). Durban, South Africa: International Cartographic Association (ICA).
Kwan, M.-­P. (2002a). Time, information technologies, and the geographies of everyday life. Urban Geography, 23(5),
471–482.
Kwan, M.-­P. (2002b). Feminist visualization: Re-­Envisioning GIS as a method in feminist geographic research. Annals
of the Association of American Geographers, 92(4), 645–661.
Kwan, M.-­P. (2008). From oral histories to visual narratives: Re-­presenting the post-­September 11 experiences of the
Muslim women in the USA. Social & Cultural Geography, 9(6), 653–669.
Kwan, M.-­P., & Ding, D. (2008). Geo-­Narrative: Extending geographic information systems for narrative analysis in
qualitative and mixed-­method research. The Professional Geographer, 60(4), 443–465.
Landeschi, G. (2018). Rethinking GIS, three-­dimensionality and space perception in archaeology. World Archaeology,
1–16. https://doi.org/10.1080/00438243.2018.1463171
Langran, G. (1989). A review of temporal database research and its use in GIS applications. International Journal of
Geographical Information Systems, 3, 215–232.
Langran, G. (1992). Time in geographic information systems. London: Taylor & Francis.
Langran, G., & Chrisman, N. R. (1988). A framework for temporal geographic information. Cartographica, 25(3),
65–99.
Lefebvre, B. (2009). How to describe and show dynamics of urban fabric: Cartography and chronometry? Paper given at Wil-
liamsburg, Virginia, USA, CAA 2009, 22–26 March 2009 (pp. 1–15).
Lefebvre, B., Rodier, X., & Saligny, L. (2008). Understanding urban fabric with the OH_FET model based on social
use, space and time. Archeologia e Calcolatori, 19, 195–214.
Leone, M. (1978). Time in American archaeology. In C. Redman (Ed.), Social archaeology: Beyond subsistence and dat-
ing. London: Academic Press.
Lévi-Strauss, C. (1952). Race et histoire. Paris: UNESCO.
Lévi-Strauss, C. (1961). Tristes tropiques (A world on the wane). London: Hutchinson.
Lieberwirth, U. (2008). Voxel-­based 3D GIS: Modelling and analysis of archaeological stratigraphy. In B. Frischer &
A. Dakouri-­Hild (Eds), Beyond illustration: 2D and 3D digital tools for discovery in archaeology, British archaeological
reports international series 1805 (pp. 250–271). Oxford: Archaeopress.
Lin, H., & Mark, D. (1991). Spatio-­Temporal INtersection (STIN) and volumetric modeling. GIS/LIS Proceedings, 28
October–1 November, Atlanta, Georgia, USA (pp. 982–991).
Lock, G. R., & Daly, P. T. (1998). Looking at change, continuity and time in GIS: An example from the Sangro Valley,
Italy. In J. A. Barceló, I. Briz, & A. Vila (Eds.), Computer applications and quantitative methods in archaeology: CAA98
(pp. 259–264). Barcelona: Archaeopress.
Lock, G. R., & Harris, T. M. (1997). Analyzing change through time within a cultural landscape: Conceptual and
functional limitations of a GIS approach. In P. Sinclair (Ed.), Urban origins in Eastern Africa, world archaeological
congress, one world archaeology series. London: Routledge.
Lucas, G. (2001). Critical approaches to fieldwork: Contemporary and historical archaeological practice. London: Routledge.
Lucas, G. (2005). The archaeology of time. Abingdon, Oxon.: Routledge.
Madsen, T. (2003). ArchaeoInfo: Object-­oriented information system for archaeological excavations. Paper given at Vienna,
Computer Applications and Quantitative Methods in Archaeology, 8–12 April.
McTaggart, J. M. E. (1908). The unreality of time. Mind, 17, 457–474.
Miller, H. J. (2005). A measurement theory for time geography. Geographical Analysis, 37(1), 17–45.
Miller, H. J., & Bridwell, S. A. (2009). A field-­based theory for time geography. Annals of the Association of American
Geographers, 99(1), 49–75.
Mlekuž, D. (2010). Time geography, GIS and archaeology. In F. Contreras & F. J. Melero (Eds.), CAA 2010 “fusion
of cultures”: Proceedings of the 38th conference on computer applications and quantitative methods in archaeology, Granada, Spain, April 2010 (BAR International Series 2494). Oxford: Archaeopress.
Orengo, H. A. (2013). Combining terrestrial stereophotogrammetry, DGPS and GIS-­based voxel modelling in the
volumetric recording of archaeological features. ISPRS Journal of Photogrammetry and Remote Sensing, 76, 49–55.
https://doi.org/10.1016/j.isprsjprs.2012.07.005
O’Sullivan, D. (2006). Geographical information science: Critical GIS. Progress in Human Geography, 30(6), 783.
Parkes, D., & Thrift, N. (1978). Putting time in its place. In T. Carlstein, D. Parkes, & N. Thrift (Eds.), Timing space
and spacing time, Vol. 1: Making sense of time (pp. 119–129). London: Edward Arnold, Ltd.
Pavlovskaya, M. (2006). Theorizing with GIS: A tool for critical geographies? Environment and Planning A, 38(11),
2003–2020.
Peuquet, D. J. (2002). Representations of space and time. New York: The Guilford Press.
Peuquet, D. J., & Duan, N. (1995). An Event-­Based Spatio-­Temporal Data Model (ESTDM) for temporal analysis
of geographical data. International Journal of Geographical Information Systems, 9(1), 7–24.
Pickles, J. (Ed.). (1995). Ground truth: The social implications of geographic information systems. New York: The Guilford Press.
Rabinowitz, A. (2014). It’s about time: Historical periodization and linked ancient world data. ISAW Papers, 7(22).
http://dlib.nyu.edu/awdl/isaw/isaw-­papers/7/rabinowitz/
Rabinowitz, A., Shaw, R., Buchanan, S., Golden, P., & Kansa, E. (2016). Making sense of the ways we make sense of the past:
The PeriodO Project. Bulletin of the Institute of Classical Studies, 59(2), 42–55. doi:10.1111/j.2041-­5370.2016.12037.x
Renolen, A. (1997). Temporal maps and temporal geographical information systems (Review of Research), Department of
Surveying and Mapping, The Norwegian Institute of Technology.
Richards, J. (1998). Recent trends in computer applications in archaeology. Journal of Archaeological Research, 6(4),
331–382.
Roddick, J. F., & Patrick, J. D. (1992). Temporal semantics in information systems: A survey. Information Systems,
17(3), 249–267.
Rodier, X., & Saligny, L. (2010). Modélisation des objets historiques selon la fonction, l’espace et le temps pour
l’étude des dynamiques urbaines dans la longue durée. Cybergeo: European Journal of Geography [online], Systèmes,
Modélisation, Géostatistiques, document 502. doi: 10.4000/cybergeo.23175
Rucker, R. (1977). Geometry, relativity and the fourth dimension. New York: Dover Books.
Scheder Black, A. (2011). Visualising the past: A prototype temporal GIS for archaeology. (unpublished Archaeology
M.Sc. Dissertation). University of York.
Shanks, M., & Tilley, C. (1987). Abstract and substantial time. Archaeological Review from Cambridge, 6, 32–41.
Sinton, D. (1978). The inherent structure of information as a constraint to analysis: Mapped thematic data as a case
study. In G. Dutton (Ed.), Harvard papers on GIS. Reading, MA: Addison-­Wesley.
Snodgrass, R. T. (1992). Temporal databases. In A. U. Frank, I. Campari, & U. Formentini (Eds.), Proceedings of the
international conference on GIS: From space to territory: Theories and methods of spatio-­temporal reasoning in geographic
space, Pisa, Italy, 21–23, 1992 (pp. 22–64). Berlin: Springer-­Verlag.
Snow, D. (1997). GIS and northern Iroquoian demography. In I. Johnson & M. North (Eds.), Archaeological applications of GIS: Sydney University archaeological methods series #5. Sydney: Prehistoric & Historical Archaeology, University of Sydney. CD-ROM.
Soja, E. W. (1989). Postmodern geographies: The reassertion of space in critical social theory. London: Verso Press.
Soja, E. W. (1996). Thirdspace: Journeys to Los Angeles and other real-­and-­imagined places. Oxford: Basil Blackwell.
Spikins, P. (1997). GIS modelling of Holocene vegetation dynamics in Northern England. In I. Johnson & M. North (Eds.), Archaeological applications of GIS: Sydney University archaeological methods series #5. Sydney: Prehistoric & Historical Archaeology, University of Sydney. CD-ROM.
Szegö, J. (1987). Human cartography: Mapping the world of man. Stockholm: Swedish Council for Building Research.
Taylor, J. S. (2016). Making time for space at Çatalhöyük: GIS as a tool for exploring intra-­site spatiotemporality
within complex stratigraphic sequences (unpublished PhD thesis). University of York, UK.
Taylor, J. S., Bogaard, A., Carter, T., Charles, M., Haddow, S., Knüsel, C. J., Mazzucato, C., Mulville, J., Tsoraki, C.,
Tung, B., & Twiss, K. (2015). “Up in flames”: A visual exploration of a burnt building at Çatalhöyük in GIS. In
I. Hodder & A. Marciniak (Eds.), Assembling Çatalhöyük (pp. 128–149). Leeds: Maney Publishing.
Taylor, J. S., Issavi, J., Berggren, Å., Lukas, D., Mazzucato, C., Tung, B., & Dell’Unto, N. (2018). The rise of the
machine: The impact of digital tablet recording in the field at Çatalhöyük. Internet Archaeology, 47. https://doi.
org/10.11141/ia.47.1
Taylor, J. S., & Wright, H. (2012). Getting into time: Exploring semantic web-­based research questions for the spatio-­temporal
relationships at Çatalhöyük. Paper given at Southampton, UK, Archaeology in the Digital Era: 40th Computer
Applications and Quantitative Methods in Archaeology (CAA) Conference.
Terrell, J. E., & Welsch, R. (1997). Lapita and the temporal geography of prehistory. Antiquity, 71, 548–572.
Thomas, J. (1996). Time, culture, and identity. London: Routledge.
Tschan, A. P. (1998). An introduction to object-­oriented GIS in archaeology. In J. A. Barceló, I. Briz, & A. Vila
(Eds.), Computer applications and quantitative methods in archaeology: CAA98 (pp. 303–316). Barcelona: Archaeopress.
Tudhope, D., Binding, C., Jeffrey, S., May, K., & Vlachidis, A. (2011). A sTellar role for knowledge organization
systems in digital archaeology. Bulletin of the American Society for Information Science and Technology, 37(4), 15–18.
van der Meulen, M. J., Doornenbal, J. C., Gunnink, J. L., Stafleu, J., Schokker, J., Vernes, R. W., . . . van Daalen,
T. M. (2013). 3D geology in a 2D country: Perspectives for geological surveying in the Netherlands. Netherlands
Journal of Geoscience, 92(4), 217–241.
Whittle, A., Healy, F., & Bayliss, A. (2011). Gathering time: Dating the early Neolithic enclosures of southern Britain and Ireland. Oxford: Oxbow Books.
Wright, H. E. (2011). Seeing triple: Archaeology, field drawing and the semantic web (unpublished PhD). University
of York.
Yamin, R. (1998). Lurid tales and homely stories of New York's notorious Five Points. Historical Archaeology, 32(1),
74–85.
Yamin, R. (2001). Alternative narratives: Respectability at New York's Five Points. In A. Mayne & T. Murray (Eds.), The archaeology
of urban landscapes: Explorations in slumland. Cambridge: Cambridge University Press.
Yu, H. (2006). Spatio-­temporal GIS design for exploring interactions of human activities. Cartography and Geographic
Information Science, 33(1), 3–19.
22
Challenges in the analysis of
geospatial ‘big data’
Chris Green

Introduction
It has become a cliché, although that does not make it any less true, to state that we now live in the
information age and are all members of an information society (Huggett, 2012, p. 539). Archaeology as
a discipline is going through its own 21st century computing revolution (Levy, 2014), with many of its
‘grand challenges’ requiring sophisticated computer modelling and large-­scale synthetic research. The
greatest potential payoff in this regard will arise from systematic exploitation of data resulting from the
explosion in archaeological investigation that has taken place in many countries (especially in Europe
and North America) since the mid to late 20th century (Figure 22.1), due in part to laws protecting
archaeological resources (that are particularly relevant in the context of commercial projects) (Kintigh
et al., 2014, p. 879). This is because data possess two key characteristics: they can be processed again and again without losing value (although repeatedly reprocessing already-processed data will inevitably cause deterioration in quality due to the Second Law of Thermodynamics [Groot, 2017, p. 208]), and their value arises from what they may reveal in aggregate: recombining data in novel ways can trigger new insights (Gattiglia, 2015, pp. 113, 117).
Our immersion in data has triggered concerns about information overload (Huggett, 2012, p. 539),
both within archaeology as a discipline and society as a whole. Archaeologists, despite some concerns to
the contrary, are not alone in having to deal with data that is difficult to reconcile or that means differ-
ent things to different people. A good example from medical science is the International Classification
of Diseases (ICD) database maintained by the World Health Organisation, which is subject to constant
modification and to constant local reinterpretation according to conditions and opinions at the medi-
cal ‘coalface’. The standards defined therein mean different things to different users and any impression
of international standardisation is only really skin deep (Bowker & Star, 1999). These are issues that are
similarly true of archaeological datasets, which are equally subject to the creative reinterpretation of so-­
called standards and the metamorphosis of ‘agreed’ terminologies over time.
At the time of writing archaeology stands on the brink of an immense challenge, which we must not
fail to embrace: to dive in and attempt to reinterpret our models on a grand spatial and temporal scale
(Cooper & Green, 2016; see Atici, Kansa, Lev-­Tov, & Kansa, 2013). As stated, archaeological data has been
multiplying at an increasing rate and we should not let concerns about data quality and data coherency
unduly hold us back. If we wish to justify our continuing existence in a world of increasing pressure
Figure 22.1 Archaeological investigations recorded over time in England from the 17th century until 2013
as listed in the National Record of the Historic Environment Excavation Index (based upon data extracted from
Historic England, 2011). The picture is certainly not complete, but is representative of the immense increase
in archaeological investigation seen from the mid to late 20th century.

on archaeological resources, both in material and personnel terms, then we need to continually strive to
demonstrate that our data can produce exciting new models and insights into the past (and potentially
the future) of humanity. Part of the solution to answering that challenge lies in big data analytics and,
more particularly, analytics of geospatial big data (McCoy, 2017, p. 74):

The need for larger and more integrated geospatial data and analyses cross-­cuts virtually all of our
goals and aspirations . . . These require us to produce data and results that are scientific (testable,
replicable), authentic (a faithful representation of the archaeological record and the human past),
and ethical (protects cultural resources).

Big data
Big data is a buzzword that has achieved little consensus as to its meaning, which varies significantly
between disciplines. Its definitions should be relative, not absolute (Gattiglia, 2015, p. 114), and should be
related to the resources available to those dealing with the data in question. The threshold for data that
is challenging due to its size or complexity will be much lower within a less well-­resourced discipline
such as archaeology (e.g. compared to earth sciences) (Austin & Mitcham, 2007, p. 12). McCoy defines geospatial big data as datasets including locational information that exceed the capacities of hardware, software and human resources, a criterion he argues is not yet met by archaeological datasets, with the exception of some remotely sensed data (2017, p. 74). By contrast, Gattiglia suggests that
big data could alternatively be taken to mean working with the maximum amount of data available or
useful to approach answering a question (‘big data as all data’), which would mean working not neces-
sarily just on a broad scale but (in combination or alternatively) also working at an intense level of detail
(2015, p. 114). In essence, we can see in this a distinction between two broad categories of big data:
datasets that are large but primarily numeric (referred to here as ‘scientific’ big data) and datasets that
are potentially not so large but which contain a complex mix of numeric and textual detail (referred to
here as ‘human’ big data).

Large numeric datasets: ‘scientific’ big data


This former category is where most definitions of big data in the hard sciences would fall, for example
the immense datasets gathered by the Large Hadron Collider or the Sloan Digital Sky Survey. These
datasets probably vastly dwarf in digital terms all of the data ever gathered for archaeological purposes
and, as such, few of our datasets suffer from the same difficulties in terms of data storage and analytical
processing. However, some datasets used for archaeological purposes do fall within the lower limits of this
category of big data: specifically, (mostly) geospatial datasets collected by what Opitz and Limp (2015)
call High Density Survey and Measurement (HDSM). HDSM includes an expanding list of technologies
such as: laser scanning (both airborne lidar and terrestrial); Global Navigation Satellite System (GNSS)
and Global Positioning System (GPS) survey and site recording; and Structure from Motion (SfM)/Close
Range Photogrammetry (CRP) recording of sites and objects (sometimes from the air; see also Kalaycı,
this volume; Sarris, this volume); all of which facilitate the measuring/recording/analysing of the spatial/
morphological properties of archaeological entities, at a range of scales from the object, through the site,
to the landscape (Opitz & Limp, 2015, p. 348).
Essentially, HDSM records x, y, and z coordinates for the surfaces of entities, spaced closely enough
to suggest the form of the entities in question (Opitz & Limp, 2015, p. 348). Geospatial datasets that fall
within this category have immense potential to transform how certain aspects of archaeological analysis
are performed, particularly in regard to the direct analysis of shape and place (e.g. lidar analysis of monu-
ment forms across large landscapes), with the increasing availability of the tools, software, and expertise
meaning that many of these techniques are moving from being specialised to normal practice (Opitz &
Limp, 2015, p. 348), especially structure from motion photogrammetry as part of site recording (which
has rapidly become a near essential part of the daily practice of field archaeology). However, these are
datasets that can only very rarely be constructed from archival materials and, therefore, they largely rely
upon the creation of new datasets during new investigations of archaeological entities. As such, although
a topic of great potential and archaeological interest, the greater current challenges for the discipline in
getting to grips with big data lie elsewhere, due to the large amount of data requiring further analysis that
was generated prior to the development of HDSM methods.
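As a brief illustration of the numeric, point-based character of such data, the following Python sketch (using the laspy [v2.x] and numpy libraries; the file name and the one-metre resolution are illustrative assumptions) grids a lidar point cloud into a simple mean-elevation raster:

import laspy
import numpy as np

las = laspy.read("survey.las")                 # assumed input point cloud
x = np.asarray(las.x)
y = np.asarray(las.y)
z = np.asarray(las.z)

cell = 1.0                                     # 1 m grid resolution
nx = int(np.ceil((x.max() - x.min()) / cell))
ny = int(np.ceil((y.max() - y.min()) / cell))

# Sum of elevations and point counts per cell; their ratio is the mean.
sums, _, _ = np.histogram2d(x, y, bins=(nx, ny), weights=z)
counts, _, _ = np.histogram2d(x, y, bins=(nx, ny))
dem = np.divide(sums, counts, out=np.full_like(sums, np.nan), where=counts > 0)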

Complicated ‘alphanumeric’ datasets: ‘human’ big data


The vast majority of archaeological datasets fall within a paradigm that might be better classified within
the humanities or social sciences than within the pure sciences: they consist of a mixture of textual and
numerical attributes of differing degrees of standardisation and often have a large geospatial component.
For example, an excavation site archive constructed today might include any of the following (amongst
many others): standardised context sheets describing individual archaeological features, photographs,
photogrammetric models, section drawings and plans, diaries, possibly videos, scientific samples and their
quantifications, quantifications of finds, etc. This complexity is only added to when moving up the chain
of interpretation towards synthetic projects and regional archaeological records. Furthermore, there are
a multiplicity of different data storage formats and data structures, and a whole spectrum of attitudes
towards data accessibility (McCoy, 2017; Dam, Austin, & Kenny, 2010; Kintigh, 2006; Snow et al., 2006).
Although what data can reveal in aggregate is one of its great strengths, data aggregation of this type of
material can amplify the volume of meaningless noise (Wesson & Cottier, 2014, p. 2) and will almost
always entail increased levels of ‘messiness’ (Gattiglia, 2015, p. 114).
Working with ‘human’ big data of this type means that we have to accept this degree of messiness,
simply because it becomes impossible to achieve the high levels of accuracy seen (at least on the surface) in
traditional methods (Gattiglia, 2015, p. 114): Harris’s (2013) eighth law of data quality states that a smaller
emphasis on data quality can sometimes enable bigger data-­driven insights, meaning that using a larger
amount of lower-­quality data can sometimes be better than using smaller amounts of higher-­quality data.
There are several ways in which we can mitigate these problems of ‘messiness’: through better planning
before data capture in the case of newly acquired datasets (Wesson & Cottier, 2014, p. 2; Huggett, 2012,
p. 540; Austin & Mitcham, 2007, p. 23); through thorough exploration of the histories and topographies
of existing datasets to better understand their qualities (Cooper & Green, 2016, pp. 283–284; Huggett,
2012, pp. 539–540, 546–547); through documentation of dataset quality in standalone reports (McCoy, 2017, pp. 91–92); through use of common ontologies and controlled vocabularies (e.g. Binding, May, & Tudhope, 2008; Vlachidis, Binding, Tudhope, & May, 2010); and through the application of big data analytics to highlight anomalies in datasets and to verify data quality (Gattiglia, 2015, p. 118).
‘Human’ big data, then, is less about the size of datasets themselves but more the capacity to aggre-
gate, search, and cross-­reference multiple large datasets (boyd & Crawford, 2012, p. 663). Performing
analyses of this nature requires the increased usage of automated processing methodologies, as it rapidly
becomes impossible within reasonable time and cost constraints to process data using manual method-
ologies. This is thus the first definition of ‘human’ big data used here following the discussion above:
datasets that are too complex and/or large to process without the use of computer algorithms/scripts.
Further, applying the big data paradigm to complex alphanumeric datasets involves a shift away from
traditional hypothesis-­driven approaches towards evidence-­based and data-­driven approaches, which
are perhaps better able to avoid researcher-­derived biases. In this approach, hypotheses and models
come after data analysis, born from data rather than born from theory (Hey, Tansley, & Tolle, 2009;
Gattiglia, 2015, pp. 115–116; Kitchin, 2014, pp. 5–7; also see case study later in this chapter). Therefore,
this is the second complementary definition of ‘human’ big data used here: analyses conducted in an
exploratory manner and then used to construct models and hypotheses after the (initial) data processing
event. This does not preclude further data analysis to test models once they have been constructed, of
course. As Kitchin suggested (2014, p. 2):

Big data analytics enables an entirely new epistemological approach for making sense of the world;
rather than testing a theory by analysing relevant data, new data analytics seek to gain insights ‘born
from the data’.

The rest of this chapter will discuss methods that can be applied to ‘human’ big data, followed by a case
study. In summary, ‘human’ big data is defined here as (a) datasets that are too large to be processed using
manual methodologies that are (b) analysed using an exploratory data-­driven paradigm. Examples of this
type of study in archaeology are rare due to the only recent availability of large collated datasets, tools,
and sufficient computer processing power. Embracing these ideas is an essential next step for large scale
synthesis projects in archaeology, as projects conducted using conventional methodologies are rapidly
becoming impossible within the budget constraints of the discipline. For example, the recent Rural Settle-
ment of Roman Britain (RSRB) project at the University of Reading (Smith, Allen, Brindle, & Fulford,
2016) produced a rich and detailed dataset relating to 3,652 rural Roman period sites in England and
Wales excavated over the last few decades (Allen et al., 2015). The production of this dataset involved the
employment of three (later four) full time postdoctoral staff over a period of around four years, reading
reports and entering data into a database. Yet the National Record of the Historic Environment Exca-
vation Index (Historic England, 2011) suggests that at least 9,000 Roman sites have been subjected to
excavation in England since 1990 (a figure that should be considered an underestimate, although many
of these sites will not be ‘rural’), so their dataset would struggle to claim to be ‘complete’ (as impressive
as it is). As excavation and investigation of archaeological sites continues to take place at a rapid pace, we
have probably already reached the point where a project like RSRB could not now be undertaken without
the extensive use of data harvesting algorithms and other big data methodologies. A notable example of
a (very) large-­scale geospatial ‘human’ big data project would be Endangered Archaeology in the Middle
East and North Africa (EAMENA), which has been attempting to gather data on archaeology across a
massive swathe of the planet (largely through remote sensing methods), in order to understand and pro-
tect vulnerable archaeological resources (Rayne et al., 2017); however, EAMENA is only able to function
by operating with a much larger team and budget than a project such as RSRB. Even so, the progress
of the EAMENA team would still undoubtedly be much more rapid with the availability of improved
algorithms to handle pattern recognition from remote sensed data and algorithms that ‘read’ and extract
information from textual sources through natural language processing.

Method
There are many different ways in which one can approach complicated large datasets computationally.
Some applicable methods have been discussed in detail elsewhere in this volume (see, for instance, Bevan,
this volume; Conolly, this volume; Hacıgüzeller, this volume; Lloyd & Atkinson, this volume) and a few
others will be briefly outlined here.

The spatial binning method


One relatively straightforward way of analysing complex data is through applying automated simplifica-
tion systems. For archaeology, these can be any combination of spatial, temporal, or typological. The most
obvious way to simplify objects spatially is through reconfiguring data into spatial bins (Cooper & Green,
2016, p. 290; Green, 2013). This is a widely used technique in other disciplines (especially biology/zool-
ogy) and is also applied in archaeology to the results of extensive field walking surveys. In simple terms,
one creates a tessellating surface of polygons (usually squares, increasingly hexagons, very rarely triangles)
and then samples data falling within each polygon (or ‘bin’) (see also Banning, this volume).
If dealing with a single dataset or non-overlapping datasets (whether spatially or in terms of content), one can summarise the material by summing, counting, or any other mathematical operation. For overlapping datasets, where one cannot be certain which features are counted two or more times across datasets, one can either summarise by taking the maximum value in any one dataset or simply record the presence/absence of particular types of feature. This characteristic is especially useful
when dealing with large aggregated datasets built from multiple sources with minimal manual oversight
(see case study in this chapter), as it will minimise problems caused by multiple counting of the same
feature(s). There are many ways of implementing the spatial binning of a dataset (e.g. using 'Generate Tessellation' followed by 'Summarize Within' in ArcGIS Pro); one open-source implementation is sketched below. Hexagons are increasingly preferred over squares for sampling tessellations because they present fewer perceptual issues (the human brain is less prone to spotting false linear alignments in the binned data) and because neighbouring cells are simpler to calculate (Birch, Oom, & Beecham, 2007).
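By way of an open-source sketch of the procedure, the following Python code (using the geopandas and shapely libraries; the input file, the 2.5km circumradius and the attribute names are illustrative assumptions) builds a flat-topped hexagonal tessellation over a point dataset and records both counts and presence/absence per bin:

import math
import geopandas as gpd
from shapely.geometry import Polygon

def hex_grid(bounds, r, crs):
    """Flat-topped hexagons of circumradius r covering (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bounds
    w, h = 1.5 * r, math.sqrt(3) * r           # column and row spacing
    cells, col, x = [], 0, minx
    while x < maxx + r:
        y = miny - (h / 2 if col % 2 else 0)   # offset alternate columns
        while y < maxy + h:
            cells.append(Polygon(
                [(x + r * math.cos(math.radians(a)),
                  y + r * math.sin(math.radians(a))) for a in range(0, 360, 60)]))
            y += h
        x += w
        col += 1
    return gpd.GeoDataFrame(geometry=cells, crs=crs)

points = gpd.read_file("sites.gpkg")                      # assumed input layer
grid = hex_grid(points.total_bounds, 2500, points.crs)    # 2.5 km circumradius
joined = gpd.sjoin(points, grid, how="inner", predicate="within")
grid["n_records"] = joined.groupby("index_right").size().reindex(
    grid.index, fill_value=0)
grid["present"] = grid["n_records"] > 0                   # presence/absence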

Exploratory statistics
Statistical analyses are inevitably key to the big data approach. These potentially include a vast number of
techniques applicable to particular problems or datasets, although exploitation of these is not yet common
in archaeology. An important theme within these statistical approaches is the data-­driven paradigm based
on exploratory statistics and in line with the second definition of ‘human’ big data used above. Instead
of hypothesising and modelling a relationship between a large dataset and a specific set of other selected
variables and then testing that model statistically, the data-­driven approach can involve testing many
relationships, both singly and in combination with a much larger set of potentially explanatory variables,
with little initial user selection other than gathering the maximum possible available input data sources.
A notable example of this methodology is the Exploratory Regression tool in ArcGIS (Rosenshein,
Scott, & Pratt, 2011), which tests a dataset against any number of potential explanatory variables using
Ordinary Least Squares (OLS) regression (see Hacıgüzeller, this volume). In each iteration, the dataset is
tested against an increasing number of combined other variables (generally from one up to a maximum
set by the user). At the end of the process, a report is generated which can be used to assess how well
each modelled set of variables explains variation in the original dataset. However, each extra iteration
massively increases the run time for the tool, which limits the usability of the technique on current stan-
dard desktop hardware. Nevertheless, techniques such as this possess immense potential for building stronger models to explain spatial variation in archaeological data, and they escape some (but not all) of the problems associated with creator subjectivity in conventional modelling exercises.
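The underlying logic is not tied to any one package, however. The following Python sketch (using pandas and statsmodels; the file and column names are illustrative assumptions, and this is a minimal analogue of the data-driven approach rather than a reimplementation of the ArcGIS tool) fits an OLS model for every combination of candidate explanatory variables and ranks the resulting models by adjusted R-squared:

from itertools import combinations

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("bins.csv")                   # one row per spatial bin
y = df["n_records"]                            # density of archaeological records
candidates = ["elevation", "soil_ph", "dist_river", "modern_popn"]

results = []
for k in range(1, len(candidates) + 1):
    for combo in combinations(candidates, k):
        X = sm.add_constant(df[list(combo)])
        model = sm.OLS(y, X).fit()
        results.append((combo, model.rsquared_adj))

# Report the best-explaining variable sets first.
for combo, r2 in sorted(results, key=lambda t: -t[1])[:5]:
    print(f"{r2:.3f}  {' + '.join(combo)}")

The run-time issue noted above follows directly from the combinatorics: n candidate variables yield 2^n - 1 possible models, so each additional variable roughly doubles the work.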

Semantic technologies and linked geospatial big data


The so-­called ‘Semantic Web’ (Berners-­Lee, Hendler, & Lassila, 2001) has been on the verge of revolu-
tionising how we interact with data and how data interacts with other data for the past two decades, albeit
arguably never quite coming to its full fruition. However, the combination of multiple datasets is greatly
aided by the use of ontologies designed for rendering datasets semantically interoperable and by the use
of standards in recording (Vlachidis et al., 2010; Niccolucci, 2017). Although the application of standards
is necessarily constraining in terms of how phenomena are recorded, especially where standards are nested
within other standards (Huggett, 2012, p. 548), they remain an immense aid when one is attempting to
integrate what would potentially otherwise be very disparate datasets. Even though standard terminolo-
gies may be applied in different ways by different researchers, they at the very least bring in some sense of
coherence across multiple datasets. Assessing the degree of coherence in the application of standards thus
becomes a necessary step in the combination of datasets (i.e. understanding an element of their character).
The application of semantic ontologies to geospatial datasets is a required step for further data mining
techniques. The most obvious of these would be natural language processing, which allows the ‘excavat-
ing’ of locational information from the text of existing documents. These methodologies enrich text with
associated semantic annotations that can then aid in information retrieval and in making inferences across
multiple documents/data sources (e.g. Vlachidis et al., 2010, pp. 468–469; geospatial examples include:
Smits & Friis-­Christensen, 2007; Yue, Gong, Di, He, & Wei, 2011). These developments go hand in hand
with new and improved cyber infrastructures for archaeological geospatial information (Kintigh, 2006;
Snow et al., 2006; e.g. Meghini et al., 2017; Elliott, Heath, & Muccigrosso, 2014), which would ideally be
referenced against schema designed to aid semantic interoperability (such as CIDOC-­CRM; cf. Doerr,
2003; Binding et al., 2008), and which should themselves no longer be seen as simply an additional tool
in the archaeologist’s toolbox, but rather as the key system that allows us to unlock meaning from within
our information (Llobera, 2011, p. 217).
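Even a toy example illustrates the principle of 'excavating' locational information from text. The following Python sketch uses a simple regular expression to pull British Ordnance Survey grid references from a passage of report text; a genuine natural language processing pipeline (e.g. Vlachidis et al., 2010) is of course far more sophisticated, and the pattern below only catches references written in the common spaced style:

import re

# Two grid letters followed by spaced easting and northing digit groups.
NGR = re.compile(r"\b([HJNOST][A-HJ-Z])\s*(\d{3,5})\s+(\d{3,5})\b")

text = ("Excavation at the villa (centred SU 1234 5678) revealed ditches "
        "continuing north-east towards SP 2310 4205.")
for square, easting, northing in NGR.findall(text):
    print(square, easting, northing)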

Workflows and visualization


In some sense, however, the application of specific techniques is not the defining aspect of a geospatial
big data analytical approach: partly this is because it remains an evolving area and new techniques will
inevitably support and supplant current ones, but partly it is also because the key to working successfully
with large and complex datasets rests in a focus on defining sensible and productive workflows. In par-
ticular, this means taking a pragmatic approach to data analysis by seeking, defining, and following a set
of procedures that allows one to productively create new understandings. As a simple example, if replica-
bility is a minimal concern and an analysis is only likely to be undertaken a small number of times, one
can perform specific operations using different software depending on which software package is most
simply put to work on the specific task. However, if replicability is more important and/or an analysis
might be expected to be undertaken many times, then coding the procedure in a single software package
would probably be more prudent.
Creative visualization of the results of geospatial big data analytics is also vital, as large and complex
datasets inevitably become difficult to comprehensibly visualize using conventional methods: visualiza-
tion in this sense is distinct from more realistic ways of representing our data, but is rather the task of
transforming information into human-­perceptible form (Llobera, 2011, p. 195). The aim, thus, is to ease
understanding of data through enhancement of visual perception/cognition (Llobera, 2011, p. 197).
The electronic revolution is making the visual medium ever more dominant as a medium of thought, with a resultant move towards more experiential modes of apprehending and away from conventional narrative explanation (cf. Carusi et al., 2015). In particular, the sciences have always struggled
to present their material in printed form as most scientific processes involve temporality (Gooding, 2008,
p. 46), just like archaeology. As archaeology also moves towards more data-­driven and exploratory modes
of working, we also need to find creative ways to visualize the results of our work: clearly this is not new,
but it is increasingly vital (see Eve & Graham, this volume; Gupta & Devillers, 2017). This requires better
access to training, in order to embed information visualization skills within the archaeological commu-
nity (Llobera, 2011, p. 218).

Case study
The English Landscapes and Identities project (EngLaId) ran from 2011 to 2016 at the University of
Oxford. It sought to construct syntheses using archaeological information for all of England from the
Middle Bronze Age (c.1500 BC) to the Domesday Book (AD 1086). Fundamentally, it was structured
around the reuse of legacy data collected from over eighty regional and national organisations. The
main project database of Bronze Age to early medieval archaeological sites (including single findspots
and records of uncertain date) contained over 900,000 records, of which around 800,000 were usable
in a spatial analysis context (with the others either lacking any spatial information or being errone-
ously extracted data relating to incorrect time periods). With a core day-­to-­day project team of four
Figure 22.2 Comparison of different spatial bins used by the EngLaId (The English Landscapes and Identities)
project at the same scale.

postdoctoral researchers (plus PhD students, administrative support, etc.), it was clearly not possible to
attempt any level of analysis or cleaning of all of this data on a manual record-­by-­record basis. As such,
although more detailed studies were conducted on case study areas, at a national level all synthesis and
spatial analysis had to involve automated processing methods. Although the main project database was
only around 3.5GB in terms of storage usage, the database definitely passed the “too complex for manual
processing” test for ‘human’ big data.
The major issue in terms of cleaning this data involved the identification of double-­counted (or more)
records from multiple datasets. This was particularly problematic due to the combination of national
and regional level archaeological records, with many entities being included in more than one of these
source databases. The only reliable way to completely solve this problem, due to the lack of any common
identifiers recorded in most of the datasets, would have been to compare the records one-­by-­one against
all other nearby records on a map, a task made even more difficult due to spatial imprecision for some
of the entities. This would have been an impossible task within the time allowed and with the number
of people employed upon the project. As such, although the comparison was undertaken for small areas
of the country, the overall national synthesis was achieved using the spatial binning method discussed
briefly above.
To allow maximum flexibility in both spatial visualization and spatial analysis, three different sets of
spatial bins were constructed and the data collated using them on a presence/absence basis for 120 differ-
ent types of archaeological site and 22 different broad categories of archaeological find (Figure 22.2). The
coarsest bins were a set of regular hexagons where any one vertex was 5km from its second nearest neigh-
bour. For England, this resulted in 6,598 cells, which allowed for very rapid display and filtering of data.
However, these bins were clearly too coarse for most visualization (other than images designed to appear
at a very small size on a page, e.g. Figure 22.3) and for all analytical purposes. The second set of bins was a
set of hexagons with a 3km vertex to second nearest vertex resolution. This resulted in 17,922 cells, which
Figure 22.3 Artefacts recorded by the Portable Antiquities Scheme (PAS) displayed using 100 year time-­slices
from 1500 BC to AD 999. The data has been binned into 5 km hexagons and probabilities of each artefact
falling into each time-­slice in each hexagon calculated and then summed.

were of sufficiently fine spatial resolution for most of our national level mapping purposes, particularly
for images presented at around 20cm × 20cm size (Figure 22.4). However, again, these bins were probably
too coarse to attempt any robust statistical analyses. The final set of bins used for the project was 1×1km
squares, slightly offset from the 1,000m divisions of the Ordnance Survey (OS) National Grid to remove
quadruple counting (due to the procedure applied) of records at the origin point of OS kilometre grid
Figure 22.4 Map produced using 3 km hexagons showing early medieval evidence for field systems.

squares (a situation reflecting lack of spatial precision in many cases, rather than records that actually fell
on grid cell origin points). This resulted in 136,767 cells, which were less useful for visualization purposes
(due to their small size compared to the scale of England and due to the perceptual issues associated with
square tessellations noted above), but which allowed reasonably robust statistical comparison of EngLaId
data against a great number of other variables of potentially explanatory nature. We found on the whole
that step changes in variables at the boundary lines between regional datasets made any problematic data
readily apparent (Figure 22.4 shows the border between Cornwall and Devon in southwest England very
clearly due to categorisation differences between the datasets maintained by the two local authorities)
(Cooper & Green, 2016, p. 294). The binned EngLaId data can be explored online (EngLaId team, 2016)
with the different sizes of spatial bin being displayed at different spatial scales/zoom levels.
Following the end of the project, further analyses have been attempted using data collated using hectare (100 × 100 m square) spatial bins. This resulted in 13,478,926 cells for all of England. Processing these data as a vector dataset is very intensive on conventional computer hardware and becomes very time-consuming. This can be mitigated by moving from a vector to a raster data model, but this makes cross-referencing between the different variables more difficult. Attempting to apply the Exploratory Regression tool mentioned earlier to this full dataset is essentially impossible on conventional computer hardware, as it only works with vector data and the vector dataset is too large to process at all (even when converted to points). As such, sub-sampling is necessary to produce any results, which can be done by performing the analysis just on the bins containing archaeological material and a random sub-sample of the bins containing no archaeological material (of roughly equivalent number to those containing archaeology): this reduces the number of cells to around 700,000 from over 13 million. However, by sub-sampling in this way, we lose some of the explanatory potential of the dataset taken as a whole: clearly, the ideal for big data analytics should be to attempt analysis on the fullest possible datasets (Gattiglia, 2015, p. 114).
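
Drawing such a balanced sub-sample is straightforward once the binned data are in tabular form. The sketch below assumes a pandas DataFrame with a hypothetical record_count column giving the number of archaeological records per hectare bin; the names are illustrative rather than the project's actual code:

    import pandas as pd

    def balanced_subsample(bins: pd.DataFrame, seed: int = 42) -> pd.DataFrame:
        # Keep every bin containing archaeology, plus an equally sized
        # random sample of the bins containing none.
        occupied = bins[bins["record_count"] > 0]
        empty = bins[bins["record_count"] == 0]
        return pd.concat([occupied, empty.sample(n=len(occupied), random_state=seed)])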
One of the source datasets utilised by the EngLaId project, and one particularly susceptible to a geospatial big data approach, was the Portable Antiquities Scheme (PAS) database (British Museum, 2013–2018). This consists (at the time of writing) of over 800,000 records relating to over 1.3 million archaeological objects reported (mostly) by members of the public. The EngLaId team approached the PAS data in a number of different ways (see Cooper & Green, 2017 for more detail), but one relevant example is shown in Figure 22.3. The production of this set of maps involved: (a) calculating the percentage probability of each record falling within a series of time-slices based upon its assigned start and end dates; (b) multiplying those probabilities by the quantity of objects represented by each record; (c) calculating which hexagon each record falls within spatially; (d) summing each value for each time-slice for each 5 km hexagon; (e) attaching the summed probabilities to the spatial dataset for the hexagons; and (f) producing maps. All of this can be achieved relatively simply using scripts written in Python (or another programming language), possibly alongside tools provided in widely available GIS packages (for the spatial parts of the procedure). This is just one example of how relatively simple but computationally intensive analyses can produce interesting new results that would have taken many weeks of work to produce using conventional methods.
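
By way of illustration, the temporal core of steps (a), (b), (d) and (e) might be sketched as follows; the column names (hex_id, start_date, end_date, quantity) are hypothetical stand-ins for the PAS fields, and the point-in-hexagon step (c) is assumed to have been performed already in a GIS:

    import pandas as pd

    # 100-year time-slices from 1500 BC to AD 999 (negative years = BC)
    SLICES = [(start, start + 99) for start in range(-1500, 1000, 100)]

    def aoristic_sums(records: pd.DataFrame) -> pd.DataFrame:
        # Spread each record's object count across the time-slices that
        # overlap its dating range, in proportion to the overlap, then
        # sum the resulting weights per hexagon and per slice.
        rows = []
        for rec in records.itertuples():
            span = rec.end_date - rec.start_date + 1
            for lo, hi in SLICES:
                overlap = min(hi, rec.end_date) - max(lo, rec.start_date) + 1
                if overlap > 0:
                    rows.append((rec.hex_id, lo, overlap / span * rec.quantity))
        probs = pd.DataFrame(rows, columns=["hex_id", "slice_start", "weight"])
        return probs.groupby(["hex_id", "slice_start"])["weight"].sum().unstack(fill_value=0)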

Conclusion
It could be argued that, in the more human-focussed disciplines such as archaeology, big data analytics is less about the quantity of data and less about specific analytical techniques, and more about taking a particular perspective when undertaking our analyses. That perspective can: involve gathering the maximum possible material to approach our questions (“big data as all data”, Gattiglia, 2015); be about understanding the different characters of our datasets (Cooper & Green, 2016; Huggett, 2012); be about building exploratory models that avoid building in too many pre-existing assumptions; and be about accepting data-driven workflows that do not generally start from a pre-defined and overly restrictive hypothesis or set of hypotheses. Many of the techniques described in this chapter could be applied to very large or complex datasets, and new techniques will undoubtedly come along in future that will also be of use. Also, computer hardware continues to become more powerful per unit of money spent, aiding the processing of large datasets, although associated data curation costs do not necessarily become cheaper over time (Austin & Mitcham, 2007, p. 12).
However, the key point here is that discussion of software and hardware solutions is not the alpha and the omega of discourse in computational archaeology: more important to the future of the discipline is discussion about the nature of our data, its definition, representation, and manipulation (Llobera, 2011, p. 219). Kitchin wrote (2014, p. 2):

The challenge of analysing big data is coping with abundance, exhaustivity and variety, timeliness
and dynamism, messiness and uncertainty, high relationality, and the fact that much of what has
been generated has no specific question in mind or is a by-product of another activity.

We have to embrace all of the data available to us, despite concerns about subjectivity (after all, almost all archaeological data is the result of interpretation on some level). We can aid in this by better understanding the character and histories of our datasets through thorough documentation (i.e. metadata creation) and by developing ways of working with data that allow us to include more data in our models (Cooper & Green, 2016, pp. 296–297), perhaps by representing data quality using fuzzy membership criteria, such as treating temporal information probabilistically (e.g. Green, 2011; Crema, 2012; see Fusco & de Runz, this volume). Gattiglia (2015, p. 118) asked:

Is archaeology ready to move towards data-led research, and to accept predictive and probabilistic techniques?

The answer, hopefully, is now yes.

Acknowledgements
The case study discussed in this chapter was carried out as part of the European Research Council funded
English Landscape and Identities project (Grant Number 269797). The project team’s thanks go to all of
the bodies that supplied data to us, most notably Historic England, the Portable Antiquities Scheme, and
England’s Historic Environment Record offices. At the time of writing, the EngLaId data can be explored
at: http://englaid.arch.ox.ac.uk

References
Allen, M., Brindle, T., Smith, A., Richards, J. D., Evans, T., Holbrook, N., . . . Blick, N. (2015). The rural settlement of Roman Britain: An online resource. Archaeology Data Service. https://doi.org/10.5284/1030449
Atici, L., Kansa, S. W., Lev-Tov, J., & Kansa, E. C. (2013). Other people’s data: A demonstration of the imperative of publishing primary data. Journal of Archaeological Method and Theory, 20(4), 663–681. https://doi.org/10.1007/s10816-012-9132-9
Austin, T., & Mitcham, J. (2007). Preservation and management strategies for exceptionally large data formats: “Big Data”. Retrieved from https://archaeologydataservice.ac.uk/research/bigData.xhtml
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The semantic web. Scientific American, 284(5), 34–43.
Binding, C., May, K., & Tudhope, D. (2008). Semantic interoperability in archaeological datasets: Data mapping and extraction via the CIDOC CRM. In B. Christensen-Dalsgaard, D. Castelli, B. Ammitzbøll Jurik, & J. Lippincott (Eds.), Research and advanced technology for digital libraries (Vol. 5173, pp. 280–290). Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-540-87599-4_30
Birch, C. P. D., Oom, S. P., & Beecham, J. A. (2007). Rectangular and hexagonal grids used for observation, experiment and simulation in ecology. Ecological Modelling, 206(3–4), 347–359. https://doi.org/10.1016/j.ecolmodel.2007.03.041
Bowker, G. C., & Star, S. L. (1999). Sorting things out: Classification and its consequences. Cambridge, MA: MIT Press.
boyd, D., & Crawford, K. (2012). Critical questions for Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878
British Museum. (2013–2018). Portable antiquities scheme. Retrieved from https://finds.org.uk/
Carusi, A., Hoel, A. S., Webmoor, T., & Woolgar, S. (Eds.). (2015). Visualization in the age of computerization. New York: Routledge, Taylor & Francis Group.
Cooper, A., & Green, C. (2016). Embracing the complexities of “Big Data” in archaeology: The case of the English Landscape and Identities project. Journal of Archaeological Method and Theory, 23, 271–304. https://doi.org/10.1007/s10816-015-9240-4
Cooper, A., & Green, C. (2017). Big questions for large, complex datasets: Approaching time and space using composite object assemblages. Internet Archaeology, 45(1). https://doi.org/10.11141/ia.45.1
Crema, E. R. (2012). Modelling temporal uncertainty in archaeological analysis. Journal of Archaeological Method and Theory, 19(3), 440–461. https://doi.org/10.1007/s10816-011-9122-3
Dam, C., Austin, T., & Kenny, J. (2010). Breaking down national barriers: ARENA – a portal to European heritage information. In F. Niccolucci & S. Hermon (Eds.), Beyond the artifact: Digital interpretation of the past (pp. 94–98). Budapest: Archaeolingua.
Doerr, M. (2003). The CIDOC conceptual reference module: An ontological approach to semantic interoperability of metadata. AI Magazine, 24(3), 75–92. https://doi.org/10.1609/aimag.v24i3.1720
Elliott, T., Heath, S., & Muccigrosso, J. (Eds.). (2014). Current practice in linked open data for the ancient world. ISAW Papers, 7. https://doi.org/2333.1/gxd256w7
EngLaId team. (2016). Portal to the past. Retrieved from https://englaid.arch.ox.ac.uk
Gattiglia, G. (2015). Think big about data: Archaeology and the Big Data challenge. Archäologische Informationen, 38(1), 113–124.
Gooding, D. C. (2008). Envisioning explanation: The art in science. In B. Frischer & A. Dakouri-Hild (Eds.), Beyond illustration: 2D and 3D digital technologies as tools for discovery in archaeology (pp. 45–74). Ann Arbor, MI: MPublishing [originally Oxford: Archaeopress]. Retrieved from http://hdl.handle.net/2027/heb.90045.0001.001
Green, C. (2011). Winding Dali’s clock: The construction of a fuzzy temporal-GIS for archaeology (BAR International Series 2234). Oxford: Archaeopress.
Green, C. (2013). Archaeology in broad strokes: Collating data for England from 1500 BC to AD 1086. In A. Chrysanthi, D. Wheatley, I. Romanowska, C. Papadopoulos, P. Murrieta-Flores, T. Sly, & G. Earl (Eds.), Archaeology in the digital era: Papers from the 40th annual conference of computer applications and quantitative methods in archaeology (CAA), Southampton, 26–29 March 2012 (Vol. 1, pp. 307–312). Amsterdam: Amsterdam University Press.
Groot, M. (2017). A primer in financial data management. London: Academic Press, an imprint of Elsevier.
Gupta, N., & Devillers, R. (2017). Geographic visualization in archaeology. Journal of Archaeological Method and Theory, 24(3), 852–885. https://doi.org/10.1007/s10816-016-9298-7
Harris, J. (2013). The eighth law of data quality. Retrieved from http://blogs.sas.com/content/datamanagement/2013/06/19/the-eighth-law-of-data-quality/
Hey, T., Tansley, S., & Tolle, K. (Eds.). (2009). The fourth paradigm: Data-intensive scientific discovery. Redmond, WA: Microsoft Research.
Historic England. (2011). Historic England NRHE excavation index. Retrieved from http://archaeologydataservice.ac.uk/archives/view/304/
Huggett, J. (2012). Lost in information? Ways of knowing and modes of representation in e-archaeology. World Archaeology, 44(4), 538–552. https://doi.org/10.1080/00438243.2012.736274
Kintigh, K. W. (2006). The promise and challenge of archaeological data integration. American Antiquity, 71(3), 567–578. https://doi.org/10.1017/S0002731600039810
Kintigh, K. W., Altschul, J. H., Beaudry, M. C., Drennan, R. D., Kinzig, A. P., Kohler, T. A., . . . Zeder, M. A. (2014). Grand challenges for archaeology. Proceedings of the National Academy of Sciences, 111(3), 879–880. https://doi.org/10.1073/pnas.1324000111
Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, 1(1), 1–12. https://doi.org/10.1177/2053951714528481
Levy, T. E. (2014). Editorial. Near Eastern Archaeology, 77(3). Retrieved from www.jstor.org/stable/10.5615/neareastarch.77.issue-3
Llobera, M. (2011). Archaeological visualization: Towards an Archaeological Information Science (AISc). Journal of Archaeological Method and Theory, 18(3), 193–223. https://doi.org/10.1007/s10816-010-9098-4
McCoy, M. D. (2017). Geospatial Big Data and archaeology: Prospects and problems too great to ignore. Journal of Archaeological Science, 84, 74–94. https://doi.org/10.1016/j.jas.2017.06.003
Meghini, C., Niccolucci, F., Felicetti, A., Ronzino, P., Nurra, F., Papatheodorou, C., . . . Hollander, H. (2017). ARIADNE: A research infrastructure for archaeology. Journal on Computing and Cultural Heritage, 10(3), 1–27. https://doi.org/10.1145/3064527
Niccolucci, F. (2017). Documenting archaeological science with CIDOC CRM. International Journal on Digital Libraries, 18(3), 223–231. https://doi.org/10.1007/s00799-016-0199-x
Opitz, R., & Limp, W. F. (2015). Recent developments in High-Density Survey and Measurement (HDSM) for archaeology: Implications for practice and theory. Annual Review of Anthropology, 44(1), 347–364. https://doi.org/10.1146/annurev-anthro-102214-013845
Rayne, L., Bradbury, J., Mattingly, D., Philip, G., Bewley, R., & Wilson, A. (2017). From above and on the ground: Geospatial methods for recording endangered archaeology in the Middle East and North Africa. Geosciences, 7(4), 100. https://doi.org/10.3390/geosciences7040100
Rosenshein, L., Scott, L., & Pratt, M. (2011). Exploratory regression: A tool for modeling complex phenomena. Retrieved from www.esri.com/news/arcuser/0111/files/exploratory.pdf
Smith, A., Allen, M., Brindle, T., & Fulford, M. (2016). The rural settlement of Roman Britain: New visions of the countryside of Roman Britain (Britannia Monograph 29). London: The Roman Society.
Smits, P., & Friis-Christensen, A. (2007). Resource discovery in a European spatial data infrastructure. IEEE Transactions on Knowledge and Data Engineering, 19(1), 85–95. https://doi.org/10.1109/TKDE.2007.250587
Snow, D. R., Gahegan, M., Giles, C. L., Hirth, K. G., Milner, G. R., Mitra, P., & Wang, J. Z. (2006). Cybertools and archaeology. Science, 311(5763), 958–959. https://doi.org/10.1126/science.1121556
Vlachidis, A., Binding, C., Tudhope, D., & May, K. (2010). Excavating grey literature: A case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources. Aslib Proceedings, 62(4/5), 466–475. https://doi.org/10.1108/00012531011074708
Wesson, C. B., & Cottier, J. W. (2014). Big sites, big questions, Big Data, big problems: Scales of investigation and changing perceptions of archaeological practice in the southeastern United States. Bulletin of the History of Archaeology, 24, 16. https://doi.org/10.5334/bha.2416
Yue, P., Gong, J., Di, L., He, L., & Wei, Y. (2011). Integrating semantic web technologies and geospatial catalog services for geospatial information discovery and processing in cyberinfrastructure. GeoInformatica, 15(2), 273–303. https://doi.org/10.1007/s10707-009-0096-1
23
The analytical role of 3D
realistic computer graphics
Nicoló Dell’Unto

Introduction
In the last decade, the introduction and use of high-resolution, textured 3D surface models (3D realistic surface models) in archaeology has been widely presented and discussed in the literature (Callieri et al., 2011; De Reu et al., 2013; Campana, 2014; Opitz & Limp, 2015; Dell’Unto et al., 2016). Recently, these applications have become widely used and have proven to be very efficient in supporting archaeological interpretation.
Three-dimensional realistic models display aspects of the original materials that are usually difficult (or impossible) to represent with more traditional approaches, and allow the construction of 3D spatial simulations for visualising the original position of objects and contexts in great detail. The introduction of these approaches within archaeological practice augments our comprehension of the site, making more apparent the complex relations that exist among the many fragmented bits of information retrieved as the result of a field investigation (Dell’Unto, Landeschi, Apel & Poggi, 2017). Despite the broad applications of 3D realistic models in archaeology, this chapter will focus on the particular use of these models to support archaeological field practice; specifically, it will provide an overview of how these applications and their associated data can be employed to support spatial analysis and interpretation.

A brief history
Three-dimensional models have always been considered an innovative tool in cultural heritage research, and since their introduction, the possibilities offered by these technologies have brought forth new perspectives in archaeology (Reilly, 1989; Reilly, 1991; Forte & Silotti, 1997). In the 1990s, Colin Renfrew discussed how 3D models forced archaeologists into more logical explanations, thereby providing researchers with the opportunity to experiment with different interpretation methods (Renfrew, 1997). However, at that time, several obstacles, including the high costs of the technology and the complexity of merging and using those types of information within the framework of more traditional investigation practice, prevented the spread of these techniques throughout the archaeological community. The creation and use of 3D models required the mastery of specific skills which were not traditionally part of any archaeological training, and thus, despite their relevance, their introduction and diffusion was hampered by many obstacles.
In the last few years, the technological development of passive and active sensors, such as laser scanning or image-based 3D modelling techniques, and their diffusion at relatively low cost has allowed for the exponential spread of realistic 3D surface models, providing archaeologists with the opportunity to start using these instruments and techniques in support of field investigations. This phenomenon was probably due to a number of key events: (a) the development and diffusion of low-cost techniques and instruments, (b) the initiation of a methodological and theoretical discussion concerning the use of the third and fourth dimensions in support of archaeological interpretation, and (c) the accomplishment of significant experimentation where the introduction of 3D acquisition technology proved to be crucial in interpreting the site. Despite their obvious advantages, the introduction of these methods in the field raised concerns among practitioners, who warned archaeologists of the risks of losing intellectual engagement with the material being recorded (Giuliani, 2008). Despite their apparent realism and accuracy, 3D data are still the product of a complex interpretation process (Jeffrey, 2015; Garstki, 2016), and for this reason, their affordances are strongly dependent on the research goals posed before their creation (Dell’Unto, 2016).
Since their first application in archaeology, 3D modelling techniques were meant to be used within the framework of a geodatabase, and in relation to different types of spatial information (Ducke, Score, & Reeves, 2011). Thus, they would provide researchers with the opportunity to (a) perform complex operations of data visualisation and editing, (b) identify 3D patterns as a result of specific queries, (c) import and handle new types of information, such as 3D volumes, and (d) enable the visualisation of information through different types of devices, such as immersive or stereoscopic systems. Despite the lack of a single system which includes all these functions, a number of experiments have focused on using 3D visual technologies and 3D realistic modelling techniques to create 3D palimpsests for performing different kinds of spatial analysis (Bevan et al., 2014; Magnani & Schroder, 2015; Garstki, Arnold, & Murray, 2015; Opitz, 2014; Callieri et al., 2011; Forte, Dell’Unto, Issavi, Onsurez, & Lercari, 2012; Dellepiane, Dell’Unto, Callieri, Lindgren, & Scopigno, 2013; Dell’Unto, 2014; Roosevelt, Cobb, Moss, Olson, & Ünlüsoy, 2015). The results of these experiments have proven the capability of 3D realistic surface models, combined with 3D visualisation systems, to generate new ways of engaging with archaeological information. These findings provide the scientific community with the opportunity to initiate a discussion about the impact that these approaches will have on spatial analysis and, more specifically, about how the combination of 3D models and different types of archaeological data can be used to identify new patterns.

Visualizing 3D realistic models in archaeological field practice


As previously introduced, three-dimensional realistic surface models used in support of archaeological practice are strictly dependent on the data management systems that are adopted for their visualization. For this reason, when discussing the impact of 3D realistic models on archaeological investigation and spatial analysis, it is important to briefly introduce the different platforms currently employed and review how these have been customized to host and use 3D models in support of archaeological field practice. To date, two main types of platforms have been employed for the visualization of 3D realistic surface models: game engines and 3D GIS.
The former have been used in different archaeological field contexts to visualize, analyse and publish the results from archaeological excavations, functioning in some cases as the primary interface used to access the data retrieved on site (Opitz & Johnson, 2016; von Schwerin, Richards-Rissetto, Remondino, Agugiaro, & Girardi, 2013). Visualization systems such as Unity 3D or Unreal Engine are capable of hosting full-scale environments connected with complex database systems (Figure 23.1), thereby providing archaeologists and specialists with the opportunity to ‘walk across’ their simulations and review the information previously collected, directly in 3D space (Forte, Dell’Unto, Jonsson, & Lercari, 2015; Lercari, Shiferaw, Forte, & Kopper, 2017).

Figure 23.1 A 3D model of the house of Caecilius Iucundus visualized through Unity 3D. A colour version of this figure can be found in the plates section.
Source: Picture: Kennet Ruona
Due to their ability to be used in immersive simulation environments, through which it is possible to gain a more natural interaction with the scene, game engine systems are considered very promising. By using information from archaeo-botanical or geological studies, game engines have been used for simulating and visualizing complex past ecosystems (Huyzendveld et al., 2012) and for the study of the use of space in antiquity through the employment of virtual characters (Der Manuelian, 2013). However, despite their increased application in scientific environments, in order to be used to support the field investigation process, these systems need to be transformed into “sandboxes” where data can be easily imported, queried, analysed, manipulated, and visualized in spatial relation to other datasets and/or users in real time. These systems are very powerful, and their future use and diffusion in support of field archaeological practice is strongly dependent on the implementation of tools for spatial analysis.
Three-dimensional geographic information systems (3D GIS) have also been employed for managing 3D realistic models in support of archaeological practice. Traditional 2D GIS are considered to be the most influential visualization tools for managing and analysing archaeological data, and in many countries, they represent the standard for archaeological documentation (Allen, Green, & Zubrow, 1990; Lock & Stancic, 1995; Wheatley & Gillings, 2002; Chapman, 2006; Conolly & Lake, 2006). Three-dimensional GIS allow 3D surface models to be merged with more traditional datasets, making the employment of this visualization instrument a more logical choice for archaeologists.

Although these platforms cannot necessarily be considered a novelty, the recent development and integration into these systems of tools oriented towards 3D visualization (the editing, querying and analysis of high-resolution point clouds, such as Light Detection and Ranging (LiDAR) data, 3D surface models or 3D volumes) has revealed new scenarios for their use in archaeological field practice. 3D GIS allow for the management of data resulting from different excavation approaches and allow users to interact with their data in completely novel ways (Figure 23.2). Currently, these systems have the capability to host and display the results of multiple types of analysis in three dimensions (surface analysis, 3D spatial distributions, 3D visual scapes, etc.), thus providing the basis for opening discussions concerning best practice with regard to fieldwork recording strategies.
Most of the current archaeological field projects developed using these new versions of 3D GIS have employed ESRI (Environmental Systems Research Institute) ArcGIS. This allows 3D realistic surface models to be directly imported and visualized in spatial relationship with more traditional datasets. ESRI’s ArcGIS takes advantage of fast rendering speeds to explore, in a 3D environment, the information detected and recorded in the field (ESRI, 2010).

Figure 23.2 3D GIS for the documentation of Insula V 1 at Pompeii. Vertical and horizontal drawings are made directly in the system, using the 3D surface model of the house of Caecilius Iucundus as a 3D reference.

More recently, ESRI introduced ArcGIS Pro, which provides advanced 3D editing tools and allows the importing, managing and further analysing of 3D models at higher resolution. The real-time implementation of 3D realistic surface models within a 3D GIS used in support of an excavation affords new ways of interacting with the dataset collected in the field. The adoption of these systems during excavation as the primary tool for documentation is excellent for exploring 3D information in a non-linear way, making the excavation process more contextual and reflexive, and providing archaeologists with a more complete overview of the complex information retrieved on site (Berggren et al., 2015; Landeschi, Nilsson, & Dell’Unto, 2016; Dell’Unto et al., 2017).
Despite the recent introduction of new 3D web-based GIS tools (Jensen, 2018; Galeazzi & Rissetto, 2018; Scopigno, Callieri, Dellepiane, Ponchio, & Potenziani, 2017), these systems still represent the most widespread solution for testing new methodological and theoretical approaches in the field. An interesting example of this can be found in the study of Stora Förvar, Sweden: a prehistoric cave excavated at the end of the 19th century in arbitrary spit layers. Despite the important information retrieved as a result of the field investigation, due to the excavation method adopted at that time it was not possible to reconstruct the stratigraphic sequence of the site. However, by using a 3D GIS to simulate in three dimensions the relationship between the volumes of the spit layers and elements of the artefact assemblages retrieved during different excavations, as shown in Figure 23.3, it was possible to partially redefine the stratigraphic sequence of the site, opening up new interpretative scenarios (Landeschi, 2018; Landeschi et al., 2016; Landeschi et al., 2018).

Figure 23.3 A virtual simulation of Stora Förvar. The volumes of spit layers excavated during the 19th century were reconstructed by combining the original drawings and the 3D surface model of the cave made by laser scanning technology.
Source: 3D acquisition: Stefan Lindgren; 3D visualization: Giacomo Landeschi and Victor Lundström

Method
Since the first introduction of 3D GIS in archaeology, a number of projects have invested resources in experimenting with the implementation of 3D realistic surface models in support of field investigations. Research projects like Gabii Goes Digital (Opitz & Johnson, 2016), Çatalhöyük (Taylor et al., 2018), Uppåkra (Callieri et al., 2011), 3D Digging (Forte, 2014) and the Kaymakçı Archaeological Project (Roosevelt et al., 2015) were among the first to start testing the limits and potential of 3D data in relation to more traditional archaeological field recording systems. If, at the beginning, the discussion was mainly focused on aspects such as the sustainability and management of these types of information in support of site documentation (Forte et al., 2012; Roosevelt et al., 2015), the diffusion and systematic use of this approach in the field has allowed the focus to shift towards the discovery of new archaeological results and their implications for theory and methodology. The question which characterizes this second phase is as follows: how crucial is the use of these technologies for the definition of research questions and for the identification of new archaeological information? If logistical aspects (such as the implementation of these instruments and techniques in current field investigation activities) can be easily addressed and discussed, an understanding of the impact of these new methods at a more methodological and theoretical level will not be immediate, and instead requires a deeper analysis of the effects of these recording and visualisation systems on the ways archaeologists interpret the past.
The creation of archaeological documentation is not only a mechanical procedure for recording the geometrical characteristics of archaeological materials; it is also a process through which archaeologists gain a deeper understanding of the contexts and features being recorded (Giuliani, 2008). On the one hand, 3D recording techniques have been shown to be faster and to produce very accurate and realistic representations of the materials retrieved on site (Scopigno et al., 2017). On the other hand, the idea that the use of 3D realistic surface models can simply substitute for conventional recording could create a dangerous imbalance between the process of data documentation and intellectual engagement with the material being recorded (Powlesland, 2016). As a consequence, in order to make 3D realistic surface models significant elements in the interpretation process, it is important to generate data that reflect and emphasize the observations accrued as a result of an active engagement with the material retrieved on site.
For this reason, when introducing 3D realistic surface models as part of field documentation, it is necessary to do the following:

1 Assess the limits and potential of this type of data in relation to its use during archaeological interpretation.
2 Design a data management system capable of establishing robust spatial relationships between 3D realistic models and the rest of the documentation produced on site.
3 Carefully define the metadata which will be used to describe the 3D realistic models (an illustrative example follows this list).
4 Re-define the site logistics in order to ensure that these data are created only after accurate analysis of the materials retrieved in the field; despite their characteristics, 3D surface models do not have the capacity to reproduce all aspects of the original materials.
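
What such metadata should contain will vary from project to project; purely as an illustration (these field names are hypothetical, not a published standard), a minimal record for a single model might look like this:

    # A hypothetical minimal metadata record for one 3D surface model.
    model_metadata = {
        "model_id": "trench06_context114_v1",
        "acquisition_date": "2015-07-21",
        "technique": "image-based 3D modelling (structure from motion)",
        "n_images": 84,
        "ground_control_points": 6,
        "crs": "EPSG:3006",          # e.g. SWEREF99 TM for a Swedish site
        "context_ids": [114, 115],   # stratigraphic contexts represented
        "operator": "field team A",
        "purpose": "pre-removal record of context 114",
        "known_gaps": "north section occluded by shoring",
    }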

A methodological approach which takes into account the limits and potential of these types of data, in relation to the different spatial datasets produced as a result of field analysis, has the potential to significantly affect the impact that this documentation approach will have on the final interpretation of the site. Integrating 3D surface models within the framework of field investigation activities is not an easy task. More traditional documentation approaches do not include or manage such datasets, and for this reason, 3D models produced (and used) during the excavation are often employed mainly to extract two-dimensional geometric representations of sections and maps of the site.
Considering the potential of these data for supporting field interpretation and spatial analysis, in recent years the Lund University Digital Archaeology Laboratory (DARKLab) has focused on developing field strategies for the use of 3D data in the context of archaeological field investigation. This activity aimed to advance understanding of how the employment of a 3D system for recording and visualizing ongoing field activities impacted the archaeological understanding of space. Since the method was developed, a number of experiments on different case studies have been initiated, and various approaches tested. Among the instruments and techniques for producing 3D surface models, image-based 3D modelling techniques proved to be the most efficient tools to employ in support of archaeological field investigation (Callieri et al., 2011; Dellepiane et al., 2013).
The method relies on the integration and use of 3D realistic surface models in a 3D GIS for the
documentation and interpretation of contexts and features detected in the field. In particular, the process
consists of the following steps:

1 A digital camera is used to capture images of the context from multiple perspectives; then, using image-based modelling techniques (Agisoft PhotoScan), the pictures are processed and a 3D textured surface model created.
2 Once generated, the 3D model is georeferenced using ground control points, imported into the geodatabase and visualized (in the field) in a 3D GIS platform (ESRI ArcScene or ArcGIS Pro); a sketch of the transform underlying this georeferencing step appears after this list.
3 Employing a tablet PC to work directly in the trench, 3D polylines are used to draw contexts and features using the 3D surface model (and the 3D GIS) as a geometrical reference.
4 A relational database (connected to the 3D polylines) is employed to record textual information (Figures 23.2 and 23.4).
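
The georeferencing in step 2 amounts to estimating a similarity transform (uniform scale, rotation and translation) that maps model-space ground control points onto their surveyed world coordinates. A minimal least-squares sketch using NumPy is given below; it shows the general Kabsch/Umeyama mathematics, not the specific routine used by PhotoScan or ArcGIS:

    import numpy as np

    def fit_similarity(model_pts, world_pts):
        # Least-squares similarity transform (s, R, t) such that
        # world ~= s * R @ model + t, from matched control points
        # given as (n, 3) arrays (Kabsch/Umeyama alignment).
        mu_m, mu_w = model_pts.mean(axis=0), world_pts.mean(axis=0)
        M, W = model_pts - mu_m, world_pts - mu_w
        U, S, Vt = np.linalg.svd(W.T @ M)            # cross-covariance SVD
        d = np.sign(np.linalg.det(U @ Vt))           # guard against reflection
        D = np.diag([1.0, 1.0, d])
        R = U @ D @ Vt
        s = (S * np.diag(D)).sum() / (M ** 2).sum()  # optimal uniform scale
        t = mu_w - s * R @ mu_m
        return s, R, t

    # Every vertex of the surface model is then transformed with:
    # vertices_world = s * vertices_model @ R.T + t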

The possibility of simultaneously exploring features and contexts both in the real world (looking, touching, changing perspective) and in the 3D GIS (where the contexts can be visualized in real time and in spatial relation to materials and features removed in previous steps of the investigation) allows for the simulation of different 3D spatial scenarios and the exploration, in real time and within the field investigation, of different hypotheses.
To be successful, this approach requires the definition of acquisition strategies which maintain a good balance between the number of pictures taken and processed on site and a proper 3D representation of the contexts as exposed in the field. Due to the unpredictable nature of archaeological investigation, the field acquisition approach adopted for recording is a delicate step, and it is usually defined after a discussion among the field archaeologists operating on site. Each acquisition should aim to properly represent the material identified in the field so as to support archaeological interpretation and spatial analysis. To be effective in the field, 3D surface models must be used in synergy with different types of datasets. Their creation and use in support of ongoing field investigations allow for a number of simulations that are impossible to achieve with more traditional documentation/visualization methods. Once imported into the 3D GIS, the models can be used for (1) visualizing, exploring and analysing (in 3D and in spatial relation) contexts and materials exposed and removed at different steps of the investigation (Lingle et al., 2015), (2) detecting new archaeological information by identifying 3D patterns as the result of specific spatial queries (Wilhelmson & Dell’Unto, 2015), or (3) analysing the site morphology in order to detect new archaeological contexts and features.
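
As a trivial illustration of what a spatial query in three dimensions can mean, the sketch below selects artefact coordinates by true 3D distance from a feature point, something a purely planimetric (2D) query cannot express; the arrays and the radius are hypothetical:

    import numpy as np

    def within_3d_radius(artefact_xyz, feature_xyz, radius):
        # Boolean mask of artefact points lying within a true 3D distance
        # of a feature point, taking depth/elevation into account.
        diffs = np.asarray(artefact_xyz) - np.asarray(feature_xyz)
        return np.linalg.norm(diffs, axis=1) <= radius

    # e.g. finds within 0.5 m (in x, y and z) of a hearth centre point:
    # mask = within_3d_radius(find_xyz, hearth_xyz, 0.5)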

Figure 23.4 The different steps undertaken in the field to record, visualize and interpret contexts and features
detected during field investigation.

Case studies

Çatalhöyük, Turkey
The Çatalhöyük Research Project (www.catalhoyuk.com/) represents an important reference for presenting the early integration and use of 3D realistic surface models in support of archaeological field investigation. The project has a long history of engagement with digital 3D technology (Tringham & Stevanović, 2012; Berggren et al., 2015; Taylor et al., 2018), and since the end of the 1990s, the use of 3D models has been integrated into this research activity as a tool which ‘allows for experimentation with different ways of experiencing the site’ (Hodder, 2000, p. 8).
With the diffusion of 3D recording instruments, a number of activities were started on site to assess the limitations and the potential of these data in supporting field practices. After the initiation of experiments developed within the framework of the 3D Digging project (Forte et al., 2012), three-dimensional recording techniques were adopted, customized and further developed by different teams of specialists operating on site (Knüsel, Haddow, Sadvari, Dell’Unto, & Forte, 2013; Lercari & Lingle, 2016). After a period of transition and experimentation, the team of field archaeologists was introduced to the data generated, which were used in combination with GIS platforms and tablets, and since 2013, 3D realistic surface models have been systematically employed by excavation teams to record spaces, buildings and features at different steps of the investigation. To integrate the documentation system adopted on site, a number of protocols and workflows to support field investigation were developed (Taylor et al., 2018). Once in the GIS, 3D realistic surface models allowed archaeologists to interact with the information in a completely different way, helping them to gain a more accurate and holistic overview of the contexts and materials retrieved on site (Dell’Unto, 2016). The opportunity to generate and use 3D realistic surface models in the field proved, on some occasions, to be crucial to understanding the complex relationships that were revealed between contexts and structures (Carpentier, 2015). Specifically, during the excavation season of 2015, a painted plaster head was retrieved in the North Area within Building 132 (Figures 23.5 and 23.6).

Figure 23.5 3D model of the plaster head (21666) after conservation. The model was generated using Agisoft PhotoScan Pro version 1.2.6, with acquisition campaign and processing done by Nicoló Dell’Unto. A colour version of this figure can be found in the plates section.
Due to its instability, the plaster head was removed before it could be studied in spatial relation to the rest of the structure. However, by using the 3D archive produced since 2013, it was possible to recognise the plaster head while still in situ (Lingle et al., 2015). The realistic 3D surface model that had been previously stored allowed for identification of the plaster head in its original position, which was located at the junction of several walls and spaces in the southwest corner of the building. At the same junction, a crawl space carved into a wall was also located. Unfortunately, due to a layer of plaster which covered the structures at the time when the 3D model was created, it was not possible to understand the spatial relationship between the plaster head and the different parts of the building. For this reason, a new 3D version of the plaster head was made after the work of the conservation lab and was virtually replaced in its original location using the 3D realistic surface models as geometric and spatial references, as shown in Figure 23.6.
The 3D model of the painted plaster head created after conservation was geo-referenced and imported into the project geodatabase. Then, using a 3D GIS platform, it was visualized in three dimensions to simulate the spatial relationship between the different contexts retrieved at different times during the excavation process. This spatial 3D simulation allowed for the re-examination of the plaster head in relation to the rest of the building, showing that the feature faced the crawl-hole, rather than the interior of the building itself (Lingle et al., 2015).
The use of this methodology allows for the selection and visualization, in a three-dimensional, geo-referenced space, of the different contexts exposed and removed during the different phases of the excavation. It allowed the plaster head to be contextualised within the construction sequence of the building (Lingle et al., 2015), proving that this methodology is capable of bridging the work of the several specialists working on site and that it can obtain results otherwise impossible to achieve. The Çatalhöyük experience has shown how, in order to be useful to archaeological field research, 3D realistic surface models need to be acquired regularly and with a specific strategy in mind.

Figure 23.6 (a) 3D realistic surface model of Building 131, which displays the plaster head while still in situ. (b) Spatial simulation of the 3D plaster head after conservation, re-integrated in its original position within the 3D GIS.
Source: Acquisition campaign and 3D models by Jason Quinlan, Marta Perlinska and Nicoló Dell’Unto

Kämpinge, Sweden
Another case study where 3D realistic models were used in support of field investigations from their beginning is the archaeological site of Kämpinge, Sweden. Since 2014, the site has been investigated by the Institute of Archaeology and Ancient History at Lund University; it is one of a group of Middle and Late Mesolithic coastal sites in the Öresund region dating from 8500–6000 cal BP, which belong to the Kongemose and Ertebølle cultures (Brinch Petersen, 2015). Due to the site’s complexity, from the first excavation season 3D realistic surface models were employed to record all the contexts and features retrieved. The documentation method employed was based on an extensive use of image-based 3D reconstruction techniques, combined with a Real-Time Kinematic (RTK) Global Positioning System (GPS), to spatially document contexts and materials retrieved during the field investigation campaign. Once generated, the 3D models were georeferenced, imported and made available within a 3D GIS platform, to be used by the excavators in situ as 3D geometrical references for drawing the contexts in 3D directly in the trench before their removal (Dell’Unto et al., 2017). The geodatabase developed during the excavation was designed to support single context recording; all the information concerning the context descriptions was recorded electronically and at the trowel’s edge, as shown in Figure 23.7.

Figure 23.7 3D documentation constructed for the archaeological field investigation of Kämpinge, Sweden. (a, b, c) Trench N09–04, 3D recorded and visualized inside the 3D GIS used on site. (d) 3D models of Trench 06 made during the excavation seasons of 2014 and 2015.

The opportunity to visually simulate the excavation process in three dimensions allowed for a better understanding of the actions performed by different teams, thus providing a better and more holistic overview of the relationships between the contexts already removed by the excavation process (Dell’Unto et al., 2017). Over the years, the system proved to be capable of reflecting and emphasizing observations which accrued as a result of an active engagement with the contexts and materials retrieved on site, triggering an interpretation process among different recording agents which occurred at the same time in the real and in the virtual world. Among the many results obtained by using this approach in the field, the most interesting was the way the system affected the excavation strategy: increasing, not reducing, the levels of intellectual engagement with the material being investigated. During the excavation, archaeologists carefully reviewed in three dimensions aspects of the trenches from previous years by combining together in a 3D space the contexts and artefacts that were retrieved at different times by different teams.
The advantages of this approach were visible from the first excavation season, when six artefacts identified as a flint flake, a flint axe core, a piece of worked amber, a fragmented disc in black slate, a flint “Trollsten” and a retouched flint flake were documented in 3D whilst still in situ (Apel, Leffler, Landeschi, & Dell’Unto, 2015). The 3D models were georeferenced, imported, visualized and stored in the 3D geodatabase as part of the documentation strategy. During the second excavation season, a larger trench was opened in the same location by a different team to further investigate the contexts identified in the previous year. This team employed a tablet PC to visualize the 3D contexts and artefacts retrieved the year before. The ability to use the 3D information stored in the geodatabase for testing multiple 3D simulations allowed the archaeologists to gain a clear spatial understanding of the relationship between the sequence of contexts detected the year before and the new materials. Specifically, the use of this methodology allowed for the review of the exact position of the artefacts previously excavated, and for a spatial comparison with the contexts and artefacts undergoing active excavation. The possibility of simulating in three dimensions the spatial relationships of those contexts, as if they were exposed at the same time, allowed for a deeper level of understanding of the complex relations retrieved across the years (Dell’Unto et al., 2017).

Conclusion
Before the diffusion of 3D acquisition instruments and techniques, 3D models were rarely used by field archaeologists. The very few 3D reconstructions that were presented in the literature at the end of the 1990s were created mainly by private companies whose development focused on public outreach (Frischer, 2008).
The production of 3D realistic models for use in archaeological investigation is no longer an obstacle. Advances in 3D acquisition techniques and computational resources today allow the acquisition and processing of realistic 3D models as an integral part of the archaeological excavation process. More problematic is the lack of data visualization systems (see also Eve & Graham, this volume) that are customized for enhancing the real potential of this information, as well as the absence of routines aimed at instructing archaeologists on how and when to use these systems to support their interpretations.
Three-dimensional realistic surface models have the capacity to describe a substantial amount of information, which can also be linked to a very large body of different data, with both spatial and non-spatial information. To date, despite their resolution and accuracy, the large number of isolated 3D models has had a very limited impact on wider site interpretation. The establishment of data management systems capable of contextualizing such data will eventually augment the use of 3D realistic models within the framework of more complex spatial queries, increasing their impact on the current interpretation process.

The introduction of such complex systems and data while the interpretation process is ongoing is not an easy task. Therefore, an important question to be considered is how and when we should engage with these systems and data in the field. What needs to be taken into account when creating this information, and what kind of affordances do these data need to have in order to be useful for interpreting the past?
Despite the results so far achieved within different field projects, 3D realistic surface models have only just begun to be integrated into archaeological field practice. To date, those applications have provided archaeologists with the opportunity to experiment with investigative approaches capable of discovering information that was previously impossible to detect. Due to their characteristics, 3D realistic surface models have proven to be easy to integrate and use in more traditional documentation practices, functioning as a robust palimpsest for supporting on-site discussion and highlighting information that was previously difficult, or even impossible, to detect. This approach allows for the management and analysis of archaeological data in three dimensions, making it possible to simulate in the field the multi-temporal actions performed during the excavation process, gluing together the fragmented data retrieved during different years.
The introduction of 3D realistic surface models, together with different types of 3D data, 3D volumes and point clouds, will bring important changes to the way archaeologists collect and use information retrieved as the result of field activities, affecting not just the final outcome of the interpretation process but also the way the process is constructed. Moreover, the increased production of 3D surface models will lead to the creation of large 3D archives of archaeological data, which will further affect the way information is transmitted and used among the community of practitioners.

Acknowledgements
The work described in this paper was generously supported by the Birgit och Sven Håkan Ohlssons Foundation and DARKLab, Laboratoriet för Digital Arkeologi, Lund University, Sweden. The author wishes to thank the Humanities Laboratory, Lund University, for the opportunity to use instruments and facilities to perform parts of the experiments described in the text. I wish to thank James Taylor and Giacomo Landeschi for endless discussions on this topic, the editors of this volume for the great feedback on this text, and the Çatalhöyük Research Project, the 3D Digging Project and the Kämpinge Project for the opportunity to develop these experiments and participate in productive discussions.

References
Allen, K. M. S., Green, S. W., & Zubrow, E. B. W. (1990). Interpreting space: GIS and archaeology. London: Taylor and
Francis.
Apel, J., Leffler, J., Landeschi, G., & Dell’Unto, N. (2015). Stenålderslokalen vid Kämpinge 24:2, Räng Socken (RAÄ
4:1). Skåne – Säsongen 2015 Excavation report. Lunds Universitet. Lund, Sweden.
Berggren, Å., Dell’Unto, N., Forte, M., Haddow, S., Hodder, I., Issavi, J., . . . Taylor, J. (2015). Revisiting reflexive
archaeology at Çatalhöyük: Integrating digital and 3D technologies at the trowel’s edge. Antiquity, 89, 433–448.
Bevan, A., Li, X., Torres, M., Green, S., Xia, Y., Zhao, K., & Rehren, T. (2014). Computer vision, archaeological clas-
sification and China’s terracotta warriors. Journal of Archaeological Science, 49, 249–254.
Brinch Petersen, E. (2015). Diversity of mesolithic vedbaek. In Acta Archaeol (Vol. 86:1). Oxford: Wiley.
Callieri, M., Dell’Unto, N., Dellepiane, M., Scopigno, R., Soderberg, B., & Larsson, L. (2011). Documentation and
interpretation of an archeological excavation: An experience with dense stereo reconstruction tools. In M. Del-
lepiane, S. Serna, H. Rushmeier, L. Van Gol, & F. Nicolucci (Eds.), VAST the 11th international symposium on virtual
reality archaeology and cultural heritage (pp. 33–40). Prato: Eurographics.
The analytical role of 3D graphics 457

Campana, S. (2014). 3D modeling in archaeology and cultural heritage-­theory and best practice. In S. Campana &
F. Remondino (Eds.), 3D surveying and modeling in archaeology and cultural heritage theory and best practices (pp. 7–13).
Oxford: BAR International Series.
Carpentier, F. (2015). “Buildings 6, 24 and 17” (Çatalhöyük 2015 Archive Report, report by the Çatalhöyük Research
Project 44–8). Retrieved from Catalhoyuk Research Project www.catalhoyuk.com/research/archive_reports
Chapman, H. (2006). Landscape archaeology and GIS. Stroud: Tempus.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology (Cambridge manuals in archaeology). Cam-
bridge, UK: Cambridge University Press.
Dellepiane, M., Dell’Unto, N., Callieri, M., Lindgren, S., & Scopigno, R. (2013). Archaeological excavation monitor-
ing using dense stereo matching techniques. Journal of Cultural Heritage, Elsevier, 14(3), 201–210.
Dell’Unto, N. (2014). The use of 3D models for intra-­site investigation in archaeology. In S. Campana & F. Remon-
dino (Eds.), 3D surveying and modeling in archaeology and cultural heritage theory and best practices (pp. 151–158).
Oxford: BAR International Series.
Dell’Unto, N. (2016). Using 3D GIS platforms to analyse and interpret the past. In M. Forte & S. Campana (Eds.),
Digital methods and remote sensing in archaeology: Archaeology in the age of sensing (pp. 305–322). Cham, Switzerland:
Springer.
Dell’Unto, N., Landeschi, G., Apel, J., & Poggi, G. (2017). 4D recording at the trowel’s edge: Using three-­dimensional
simulation platforms to support field interpretation. Journal of Archaeological Science: Reports, Elsevier, 12, 632–645.
Dell’Unto, N., Landeschi, G., Leander Touati, A. M., Dellepiane, M., Callieri, M., & Ferdani, D. (2016). Experienc-
ing ancient buildings from a 3D GIS perspective: A case drawn from the Swedish Pompeii Project. Journal of
Archaeological Method and Theory, Springer, 23(1), 73–94.
De Reu, J., Plets, G., Verhoeven, G., De Smedt, P., Bats, M., Cherretté, B., . . . De Clercq, W. (2013). Towards a
three-­dimensional cost effective registration of the archaeological heritage. Journal of Archaeological Science, 40,
1108–1121.
Der Manuelian, P. (2013). Giza 3D: Digital archaeology and scholarly access to the giza pyramids: The giza project at
Harvard University. In A. Addison, G. Guidi, L. De Luca, & S. Pescarin (Eds.), Proceedings of digital heritage 2013.
Digital Heritage International Congress (pp. 727–734). Marseille: IEEE – Institute of Electrical and Electronics
Engineers Inc.
Ducke, B., Score, D., & Reeves, J. (2011). Multiview 3D reconstruction of the archaeological site at Weymouth from
image series. Computers & Graphics, 35, 375–382.
ESRI. (2010). What’s new in ArcGIS 3D Analyst 10, February 2010. Resource document. Retrieved from http://help.
arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00qp0000000z000000.htm. Accessed 20 March 2018.
Forte, M. (2014). 3D archaeology: New perspectives and challenges: The example of Çatalhöyük. Journal of Eastern
Mediterranean Archaeology and Heritage Studies, 1, 1–29.
Forte, M., Dell’Unto, N., Issavi, J., Onsurez, L., & Lercari, N. (2012). 3D archaeology at Çatalhöyük. International
Journal of Heritage in the Digital Era, 1, 351–377.
Forte, M., Dell’Unto, N., Jonsson, K., & Lercari, N. (2015). Interpretation process at Çatalhöyük using 3D. In I.
Hodder, & A. Marciniak (Eds.), Assembling Çatalhöyük (Vol. 1). Themes in Contemporary Archaeology, Vol. 1.
Leeds: Maney.
Forte, M., & Silotti, A. (1997). Virtual archaeology. London: Thames and Hudson Ltd.
Frischer, B. (2008). From digital illustration to digital heuristics. In B. Frischer & A. Dakouri-­Hild (Eds.), Beyond
illustration: 2D and 3D digital technologies as tools for discovery in archaeology. Oxford: Archaeopress and BAR Inter-
national Series.
Galeazzi, F., & Rissetto, H. (2018). Editorial introduction: Web-­based archaeology and collaborative research. Journal
of Field Archaeology, 43(sup1), S1–S8. doi:10.1080/00934690.2018.1512701
Garstki, K., Arnold, B., & Murray, M. (2015). Reconstituting community: 3D visualization and early Iron Age social
organization in the Heuneburg mortuary landscape. Journal of Archaeological Science, 54, 23–30.
Garstki, K. (2016). Virtual representation: The production of 3D digital artifacts. Journal of Archaeological Method and
Theory, 24, 726–750.
Giuliani, C. F. (2008). Prefazione. In M. Bianchini (Ed.), Manuale di rilievo e di documentazione digitale in archeologia
(pp. 9–12). Roma: Aracne editrice.
458 Nicoló Dell’Unto

Hodder, I. (2000). Developing a reflexive method in archaeology. In I. Hodder (Ed.), Towards reflexive method in
archaeology: The example at Çatalhöyük (pp. 3–14). Cambridge: McDonald Institute for Archaeological Research.
Huyzendveld, A. H., Di Ioia, M., Ferdani, D., Palombini, A., Sanna, V., Zanni, S., & Pietroni, E. (2012). The virtual
museum of the tiber valley. In A. Grande & V. Bendicho (Eds.), Proceedings of III Congreso International de Arquelo-
gia e Informatica Grafica, Patrimonio e Innovation [proceedings of the III international congress of archaeology and computer
graphics, heritage and innovation] (pp. 97–101). Sevillia, Spain: Virtual Archaeology Review.
Jeffrey, S. (2015). Challenging heritage visualization: Beauty, aura and democratisation. Open Archaeology, 1, 144–152.
Jensen, P. (2018). Semantically enhanced 3D: A web-­based platform for spatial integration of excavation documenta-
tion at Alken Enge, Denmark. Journal of Field Archaeology, S43, 1–14.
Knüsel, C. J., Haddow, S. D., Sadvari, J. D., Dell’Unto, N., & Forte, M. (2013). Bioarchaeology in 3D: Three-­dimensional
modeling of human burials at Neolithic Çatalhöyük. Poster presented at 82nd meeting of the American Association of
Physical Anthropologists in Knoxville, TN, 9–13 April. https://doi.org/10.13140/RG.2.2.18447.69287
Landeschi, G. (2018). Rethinking GIS, three-dimensionality and space perception in archaeology. World Archaeology.
https://doi.org/10.1080/00438243.2018.1463171
Landeschi, G., Apel, J., Lundström, V., Storå, J., Lindgren, S., & Dell'Unto, N. (2018). Re-enacting the sequence:
Combined digital methods to study a prehistoric cave. Archaeological and Anthropological Sciences. https://doi.org/10.1007/s12520-018-0724-5
Landeschi, G., Nilsson, B., & Dell’Unto, N. (2016). Assessing the damage of an archaeological site: New contribu-
tions from the combination of Image-­based 3D modelling techniques and GIS. Journal of Archaeological Science:
Reports, 10, 431–440.
Lercari, N., & Lingle, A. M. (2016). Çatalhöyük digital preservation project (Çatalhöyük 2016 Archive Report, report by
the Çatalhöyük Research Project). Retrieved from Catalhoyuk Research Project www.catalhoyuk.com/research/
archive_reports
Lercari, N., Shiferaw, E., Forte, M., & Kopper, R. (2017). Immersive visualization and curation of archaeological
heritage data: Çatalhöyük and the Dig@IT App. Journal of Archaeological Method and Theory, 25(2), 368–392.
Lingle, A., Dell'Unto, N., Der, L., Doyle, S., Killackey, K., Klimowicz, A., & Tung, B. (2015). Painted plaster head (Çat-
alhöyük 2015 Archive Report, report by the Çatalhöyük Research Project 44–8). Retrieved from Catalhoyuk
Research Project www.catalhoyuk.com/research/archive_reports
Lock, G., & Stancic, Z. (1995). The impact of GIS on archaeology: A European perspective. New York: Taylor and Francis.
Magnani, M., & Schroder, W. (2015). New approaches to modeling the volume of earthen archaeological features:
A case-­study from the Hopewell culture mounds. Journal of Archaeological Science, 64, 12–21.
Opitz, R. (2014). Three dimensional field recording in archaeology: An example from Gabii. In B. R. Olson &
W. R. Caraher (Eds.), 3D imaging in Mediterranean archaeology (pp. 64–73). North Dakota: The Digital Press, the
University of North Dakota.
Opitz, R., & Johnson, T. D. (2016). Interpretation at the controller’s edge: Designing graphical users interfaces for
the digital publication of the excavations at Gabii (Italy). Open Archaeology, 2, 1–17.
Opitz, R., & Limp, W. F. (2015). Recent developments in High-­Density Survey and Measurement (HDSM) for
archaeology: Implications for practice and theory. Annual Review of Anthropology, 44(1), 347–364.
Powlesland, D. (2016). 3Di enhancing the record, extending the returns, 3D imaging from free range photography
and its application during the excavation. In H. Kamermans, et al. (Eds.), The three dimensions of archaeology proceed-
ings of the XVII UISPP world congress (1–7 September 2014, Burgos, Spain) volume 7/sessions A4b and A12 (pp. 13–32).
Oxford: Archaeopress.
Reilly, P. (1989). Data visualization in archaeology. IBM Systems Journal, 28(4), 569–579.
Reilly, P. (1991). Towards a virtual archaeology (pp. 133–139). CAA 18. Oxford: Archaeopress.
Renfrew, C. (1997). Foreword. In M. Forte & A. Silotti (Eds.), Virtual archaeology. London: Thames and Hudson Ltd.
Roosevelt, C., Cobb, P., Moss, E., Olson, B., & Ünlüsoy, S. (2015). Excavation is destruction digitization: Advances in
archaeological practice. Journal of Field Archaeology, 40(3), 263–380.
Scopigno, R., Callieri, M., Dellepiane, M., Ponchio, F., & Potenziani, M. (2017). Delivering and using 3D model on
the web: Are we ready? Virtual Archaeology Review, 8(17), 1–9.
Taylor, J., Issavi, J., Berggren, Å., Lukas, D., Mazuccato, C., Tung, B., & Dell’Unto, N. (2018). “The rise of the
machine”: The impact of digital tablet recording in the field at Çatalhöyük. Internet Archaeology, 47.
Tringham, R., & Stevanović, M. (2012). Last house on the hill: BACH area reports from Çatalhöyük, Turkey. Monumenta
Archaeologica, Vol. 27. Los Angeles: Cotsen Institute of Archaeology (UCLA) Press.
Wheatley, D., & Gillings, M. (2002). Spatial technology and archaeology: The archaeological applications of GIS. London:
Taylor and Francis.
Wilhelmson, H., & Dell’Unto, N. (2015). Virtual taphonomy: A new method integrating excavation and post-­
processing of human remains. American Journal of Physical Anthropology, 157(2), 305–321.
von Schwerin, J., Richards-­Rissetto, H., Remondino, F., Agugiaro, G., & Girardi, G. (2013). The MayaArch3D project:
A 3D WebGIS for analyzing ancient architecture and landscapes. Literary and Linguistic Computing, 28(4).
24
Spatial data visualisation and beyond
Stuart Eve and Shawn Graham

Introduction

Deformed visions
The attempt to appreciate the sensory worlds of others, distant in time and place necessitates an unlearning:
that we subject to scrutiny our sensory education, of which the prejudice towards vision is only one part.
(Gosden, 2001, p. 166)

I want to propose a theory and practice of a Deformed Humanities. A humanities born of broken, twisted
things. And what is broken and twisted is also beautiful, and a bearer of knowledge. The Deformed Humani-
ties is an origami crane – a piece of paper contorted into an object of startling insight and beauty.
(Sample, 2012)

Vast computational power promises us that rainbow’s-­end we’ve been chasing: the ability to experience,
visualise and explore the past as it was. Even if we couch that desire in caveats, still, the desire remains.
There is nothing wrong with this desire; what is wrong is to pretend that it does not exist.
We have to consider that our digital sense – that extended cognition that overlays and permeates space
(knowing what friends are up to miles away because of constant social media updates; the ability to be
guided through traffic congestion via constantly updated maps; the sense of loss that occurs when there
is no wi-­fi signal) is part of the sensorium that archaeologists must now contend with. Let us then begin
with this newest sense, and consider the ways it can intersect with physical space especially when that sense
is dependent on these ephemeral, ghostly, haunted objects that ‘send [our] social relations off down a new
path, not through any intention on the part of the object, but through its effects on the sets of social
relations attached to various forms of sensory activity’ (Gosden, 2001, p. 165).
For instance – many Wikipedia articles contain geographic metadata. They are articles about a par-
ticular place. What tool would we reach for to understand this geographic coverage? A map, of course,
replete with dots or other icons. But Wikipedia exists in its own digital space(s), social and informatic,
spaces that overlie real world space. Several years ago we built ‘Historical Friction’ (Graham & Eve, 2013),
a web-­toy app extended from Ed Summers’ Ici (Summers, 2016). Summers’ app took the geolocation
from a user’s device and returned the list of Wikipedia articles geocoded to nearby places. ‘Historical
Friction’ by contrast vocalised that list with several computerised voices from text-­to-­speech synthesizers.
The denser a locale, the greater the cacophony of computers yelling at the listener. The web-­toy was not
a pleasant experience. It depended on the user pulling out the ear-­buds, taking off the headphones, and
seeing the place with new eyes in the revelatory silence.
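The mechanics are simple enough to sketch. The fragment below is a minimal desktop sketch in Python, not the code of either app: the coordinates, search radius and article limit are placeholders, and the pyttsx3 speech engine merely stands in for the speech synthesis the web-toy used. It queries Wikipedia's public geosearch API for articles near a location and reads the titles aloud; the denser the locale, the longer and more cacophonous the reading.

import requests
import pyttsx3  # offline text-to-speech; any speech synthesizer would serve

def nearby_articles(lat, lon, radius_m=1000, limit=50):
    # Titles of Wikipedia articles geocoded within radius_m of (lat, lon).
    params = {
        "action": "query", "list": "geosearch", "format": "json",
        "gscoord": f"{lat}|{lon}", "gsradius": radius_m, "gslimit": limit,
    }
    response = requests.get("https://en.wikipedia.org/w/api.php", params=params)
    return [hit["title"] for hit in response.json()["query"]["geosearch"]]

engine = pyttsx3.init()
for title in nearby_articles(51.501, -0.142):  # placeholder coordinates
    engine.say(title)  # a dense locale piles voice upon voice
engine.runAndWait()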
Digital data is there for us to reach out and experience. It is not safely confined to a computer in the
lab. It permeates space. Experiencing it can be like Sample’s origami crane. In this chapter we gesture
towards ways we might usefully deform archaeological spatial data, thinking especially about sound and
vision.

Archaeological vision
The opening passage of Stephanie Moser’s exploration of the birth of archaeological visualisation states:

It is no surprise that archaeology – a discipline that is centered on the study of material culture –
relies heavily on a large suite of visual products to record, interpret, and present its findings to
professional and public audiences.
(Moser, 2012, p. 292)

She goes on to define visualisation as follows: “On the one hand, it results from the products that result
from graphically representing archaeological materials and on the other it refers to the process of interpre-
tation embodied in this visual translation” (Moser, 2012, p. 295). This is also true of spatial visualisations.
When one thinks of spatial data visualisation the ‘map’ is immediately brought to mind. Cartography
and map-­making have been at the centre of how we visualise the world for millennia (see Andrienko &
Andrienko, 2006; Slocum et al., 2008; Kraak & Ormeling, 2013; Tyner, 2014; Gillings, Hacıgüzeller, &
Lock, 2019a for overviews). As archaeologists we draw plans, sections and put countless dots on maps (the
meditative nature of manually drawing such plans has recently been celebrated by Caraher, 2015, as slow
archaeology). We explore and record an archaeological site horizontally and vertically using complicated
(but also familiar) notation such as the hachure or stippling. As well as using this abstract symbology we
produce more ‘accurate’ products like photographs of our trenches and the landscapes in which we are
working. We convert the electrical impulses from our geophysical equipment to colours and hues to help
us visualise the resistance of the soil to electricity. We capture signals from satellites and convert them to
a precise location on the planet which we then represent by a dot or the node in a line on a map. This
volume itself is replete with precisely this kind of visualisation.
These techniques are all very familiar to the archaeologist and each one has a vast amount of litera-
ture that can be examined, questioned and challenged. There is no space within this short chapter to
do justice to a detailed exploration of each of these methods; however, it is fair to say that the majority
of spatial visualisations created by archaeologists are currently created using Geographic Information
Systems (GIS). These visualisations tend to be presented as 2D plans or maps effectively recreating the
drawn record, albeit with clearer symbology and layout. Perhaps as a result of this digital proxy for the
hand-­drawn record, visualisation of space using GIS has traditionally been seen as a by-­product of a deeper
spatial analysis, as Ebert puts it, the “read-­only mode of GIS” (2004, p. 320). This view has been recently
challenged by Gupta and Devillers, who argue that “visualisation encourages the use of our cognitive
abilities (rather than equations and algorithms) to process information and generate new knowledge”
(2017, p. 855). The degree to which our orthodox visualisation techniques nurture and encourage such
an engagement is currently moot.
Archaeological spatial visualisation also has to take account of the temporal dimension. For example,
we present results from surveys that took place at specific times, representing artefacts that were
deposited sometimes thousands of years apart. Any spatial visualisation that we create must necessar-
ily deal with both spatial and temporal uncertainty (see Fusco & de Runz this volume). Unfortunately,
current GIS software “typically enables navigation of the spatial and thematic dimensions, but it does
not offer effective exploration of the temporal dimension” (Gupta & Devillers, 2017, p. 876). The
overwhelming majority of archaeological spatial visualisation is performed through the medium of the
cartographic product, be that a set of time-­series diagrams, an interactive 2D or 3D GIS interface with
a ‘time-­slider’ to explore the temporal aspect or a simple map showing points, lines and polygons. This
often means that the data has to be simplified to fit the requirements of the available tools, rather than
encouraging an exploration of different forms of visualisation.
As Gillings, Hacıgüzeller and Lock state, “there is nothing wrong with maps that are argumentative,
discordant, disruptive, playful, provocative or simply beautiful . . . . if novel connections and relations can
only be built [through these methods] then that is how it will have to be” (2019b, pp. 11–12). Beyond
the traditional map or plan, other forms of spatial visualisation exist that enable us to approach archaeo-
logical data in different ways. These include the novel presentation of statistical analyses, such as Martin
Sterry’s work (2018) which uses the Hue-­Saturation-­Value (HSV) colour wheel to visualise results of
multi-­dimensional correspondence analysis of pottery use in Roman Britain. Recent advances in web-­
based technology have allowed annotated interactive 3D virtual reality visualisations of LiDAR and other
data to be presented through online portals such as SketchFab (see https://sketchfab.com/markwalters).
It is now even possible to 3D print scale models of landscapes or artefacts and ‘visualise’ them haptically
(Neumüller, Reichinger, Rist, & Kern, 2014; Di Franco, Camporesi, Galeazzi, & Kallmann, 2015).
Perhaps unsurprisingly, an analysis of the available literature on visualisation suggests that archae-
ologists are only the tip of the spatial visualisation iceberg (see for instance MacEachren et al., 1998;
Slocum et al., 2001; Brewer, MacEachren, Abdo, Gundrum, & Otto, 2000; Crampton, 2002; Howard &
MacEachren, 1996, whose work, from a network-­theoretic point of view, ties the scholarship of geo-
graphic visualisation together).
If we perform the same quick computational reading of the citation knowledge graph (a network
analytic reading of the results of a Google Scholar search for ‘archaeological + data + visualization’, so
as to see data visualization beyond archaeological GIS), Figure 24.1, and look for the articles that tie the
network together (taking that as a signal that the ideas contained therein bridge scholarship), we find a
very strong focus on Virtual Reality work (Acevedo, Vote, Laidlaw, & Joukowsky, 2001; Vote, Acevedo,
Laidlaw, & Joukowsky, 2002; Allen et al., 2004; Van Dam, Laidlaw, & Simpson, 2002; Forte, Dell’Unto,
Issavi, Onsurez, & Lercari, 2012). If we are not doing GIS and if we are not drawing plans or plotting
dots, we are building virtual reality (VR); our debt to archaeological photography and the ‘visual’ aspect
of visualisation seems clear.
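'Tying the network together' has a precise network-analytic reading: betweenness centrality, the frequency with which a node sits on the shortest paths between other nodes, which is not the same measure as citation count. A minimal sketch using the networkx library, with an illustrative edge list rather than the actual Google Scholar data:

import networkx as nx

# Illustrative citation graph: nodes are papers, edges are citation links.
G = nx.Graph([
    ("MacEachren 1998", "Slocum 2001"), ("Slocum 2001", "Crampton 2002"),
    ("MacEachren 1998", "Howard 1996"), ("Howard 1996", "Acevedo 2001"),
    ("Acevedo 2001", "Vote 2002"), ("Vote 2002", "Forte 2012"),
])

# High betweenness marks the works that bridge otherwise disparate clumps,
# which is not the same thing as being the most often cited.
for paper, score in sorted(nx.betweenness_centrality(G).items(),
                           key=lambda kv: -kv[1]):
    print(f"{paper}: {score:.2f}")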
Hamilakis (2014, p. 22) has argued that photography emerged as the medium of capitalism in the
19th century, in that photographs themselves became a kind of currency, a new form of visual
economy (citing Sekula, 1981; Poole, 1997). This autonomous and disembodied sense of vision was
quickly adopted in archaeology, making archaeology a “device of modernity” (Hamilakis, 2014, p. 9).
Archaeology’s privileging of the visual therefore is also complicit in the ontological admixture that
Hamilakis describes between aesthetics and politics, in that both circle around what is permitted to be
sensed, experienced, and appreciated, and by whom: consensus versus dissensus (2014, p. 415). The tools
and techniques of computational approaches to archaeology merely replicate this consensus. And yet,
archaeology is about that full-­bodied sensuous engagement with the things and environments of the world,
at the trowel’s edge, from which we craft the past. This tension, Hamilakis tells us, is the wedge with
Figure 24.1 A citation network of results returned from a Google Scholar search of ‘geographic + visual-
ization’, using Ed Summers’ python package ‘Etudier’. “Google Scholar aims to rank documents the way
researchers do, weighting the full text of each document, where it was published, who it was written by, as
well as how often and how recently it has been cited in other scholarly literature.” (https://scholar.google.com/intl/en/
scholar/about.html). The results give us a sense of the most important works by virtue of
these citation patterns. Thus, MacEachren, Boscoe, Haug, & Pickle (1998); Slocum et al. (2001); Brewer,
MacEachren, Abdo, Gundrum, & Otto (2000); Crampton (2002); Howard & MacEachren (1996) are most
functionally important in tying scholarship together. This is not the same thing as being the most often cited
work. Rather, these are the works whose ideas bridge otherwise disparate clumps; they are most central. A
colour version of this figure can be found in the plates section.
which we might insert a more fully sensorial engagement in archaeology (2014, p. 9). He reminds us that
“there is no perception which is not full of memories”, going on to assert that “ . . . it is my conviction that all academic
writing should become evocative, merging scholarly discourses with mnemonic and autobiographical
accounts” (2014, pp. 9–10). For Hamilakis, the merging of different ways to sense the world (Ingoldian
knots, perhaps, of lives lived, Ingold, 2015) means that all sensorial experience is synesthesia (Hamilakis,
2014, pp. 410–411).
Hamilakis also argues that “The human individual, especially as perceived and enacted in Western
capitalist modernity, is not the most appropriate unit of analysis for an archaeology of the senses. This
is not only because, as anthropological accounts have shown, human persons can be conceptualized and
embodied in diverse ways . . . . More important[ly] such an analytical category is inappropriate because
sensorial experience is activated at the moment of a transcorporeal encounter; this is an encounter among
human bodies, between human bodies and the bodies of other beings, and between human bodies and
objects, things, and environments” (Hamilakis, 2014, p. 411). This echoes Ingold in his discussion of the
“life of lines” (2015) where he argues not for assemblages, but for correspondences. The concept of the
‘assemblage’ is ‘too static’ because it does not allow for the frictions or tensions that bind things together.
Lines do – for they knot and twist and respond to one another. Meaning is not built from blocks
juxtaposed, but from movement along a line, where it bunches up encountering other lines. As we shall
see later, if we cannot use the experienced senses of a human individual of today as a direct proxy for past
senses, perhaps we can instead use our present senses to create and experience new things about the past.
Hamilakis only deals with digital media briefly, regarding them as merely another prosthesis for
thought. Yet digital media are themselves active and have a kind of agency (a way to effect change in the world)
in a way other classes of materials genuinely do not. Moreover, digital media bring another actor into the
mix, for digital work is a correspondence between user, machine, and programmer. Digital synaesthesia
emerges from this knotting. To work in a digital medium, to work with computational tools and semi-­
autonomous software agents requires the performance of tacit knowledge and experience. We respond to
the machine and it in turn responds to us. We may call it a ‘black box’, which only serves to show that the
result is a deformance (portmanteau of ‘deform’ and ‘performance’), a making strange and an estrange-
ment from the sand and dirt and flies of the excavation. But if we recognize that computation is a kind of
knotted performance, then we should recognize also that computation returns an emotional connection
to this data, to remind us that data is always a proxy for human lives lived. And so it is not without ethical
consequences. The decisions we take in a digital medium, given the nature of computation (whose fun-
damental action is to copy), get multiplied in their effects. Copying implies connection, a tangled web of
articulations. Hence, the choice of representation (whether visual or aural), or form (or indeed, whom to
cite!), when there is a choice to be made (as there always is), is a force multiplier. Computation entangles
us, knots us, in networks/meshworks/filigrees of time and space. Computation expands our senses and
at the same time our entanglements with the world (c.f. Hodder, 2012).
As archaeologists we are still very much at the edge of exploring the potential of the full sensorium
(see Mlekuz, 2004; Frieman & Gillings, 2007; Eve, 2014; Primeau & Witt, 2017) and especially so when
attempting to ‘visualise’ the past and the results of our spatial analyses (for example, see work by Mur-
doch & Davies, 2017, on whether or not VR reconstructions could be spiritually affective). As the Inter-
net of Things (Xia, Yang, Wang, & Vinel, 2012; Kopetz, 2011) becomes a reality, our concept of what a
computer is has also become more complicated. Our laptops, our smartphones, our GPS devices and even
our toasters (Engadget, 2018) are connected to the internet at all times and beginning to blur the bound-
aries between the real world that we inhabit, and the virtual world that we visit via our devices. Currently,
however, a paradigm shift is occurring within the computer science sector towards ‘spatial computing’
(Shekhar, Feiner, & Aref, 2015). Spatial computing recognises this digital kinesthesia, encompassing “the
ideas, solutions, tools, technologies, and systems that transform our lives by creating a new understand-
ing of locations – how we know, communicate, and visualise our relationship to locations and how we
navigate through them” (ibid., 72). Historically archaeologists have only embraced some aspects of spatial
computing, most notably geographic information systems (Conolly & Lake, 2006) and spatial statistics
(Wheatley & Gillings, 2000). These technologies and methods are now as familiar to the archaeologist
as the trowel – but spatial computing needs to meet the challenges and embrace the opportunities of
constantly emerging and evolving technologies. This includes the sheer quantity of data being collected
(see Green, this volume; McCoy, 2017; Cooper & Green, 2016, for discussions of [geospatial] big data in
archaeology), but also the evolving concept of space as represented within the computing environment.
Traditional GIS deals with points, lines, polygons and rasters in a very abstracted way, yet there is now
a “ . . . need for new algorithms, as well as cooperation between users and the cloud, full 3D position
and orientation (pose) estimation of people and devices, and registration of physical and virtual things”
(Shekhar et al., 2015, p. 77). We are developing the technology to capture human bodies and archaeo-
logical objects with full degrees of freedom and can represent them in virtual space (Eve, 2018a). As we
will go on to demonstrate, we can now take our GIS objects or the results of our statistical analyses and
present and explore them in the real locations of real reality, rather than just on the screen of a computer.
The familiar 2D or 2.5D representations of the printed map or illustration can become a real 3D world
overlaid on the actual environment, one that we can engage with and embody.
The ‘embodied GIS’ was first introduced by Stuart Eve (Eve, 2012, 2014, 2017) to formalise the use of
Augmented Reality (AR) technology within archaeology. Augmented reality is a form of mixed reality
that takes digital data and blends it with the real world. Augmented reality “ . . . allows a user to work in
a real world environment while visually receiving additional computer-­generated or modelled informa-
tion to support the task at hand” (Schnabel, Wang, Seichter, & Kvan, 2007, p. 4). George Papagiannakis
and colleagues produced one of the best-­known cultural heritage AR applications, centred on the site of
Pompeii (Papagiannakis et al., 2004, 2005; Papagiannakis & Magnenat-Thalmann, 2007). Using a special
see-­through video headset along with dynamic modelling of the real and virtual world, Papagiannakis
and his team were able to insert virtual characters into various real buildings within Pompeii and guide
the visitors through a narrative as they walked through the site. A recent example of the use of AR in
archaeology was a ‘Pokémon Go’ meet up in the city of Chester orchestrated by Big Heritage and Nian-
tic Labs in 2017. Users of the Pokémon Go app were guided around the historical sites in the hope of
hunting virtual creatures (Pokémon) while learning more about the history of the city (Zeroghan, 2017).
Both of these examples overlay digital data on physical spaces, but in the context of our discussions of
Hamilakis’ work, it is worth remembering when using AR

[T]he introduction of the virtual elements should be kept to a minimum and, in contrast, the land-
scape itself should provide the bulk of the experience – the way in which steep slopes tire you; the
shelter gained from standing in the lee of a hill; the smells of the flowers; the sound of the birdsong;
and the views and perspectives that open and close as you explore the landscape.
(Eve, 2017, para. 3.3)

These are powerful modalities to explore. Yet they depend on proprietary software and hardware,
clunky to handle and awkward in the field. The embodied GIS and our entangled digital kinesthetic
sense can (and should) involve haptic full-­body engagements (TeslaSuit, 2018), olfactory stimulation
(Eve, 2018b), gustatory stimulation (Iwata et al., 2004) or even direct electrical stimulation of nerve cells
across the body (Delazio et al., 2018). However, without picking the low-hanging fruit of the visual, at
present one of the easiest and most accessible ways of evoking this digital kinaesthesia, and exploring and
Figure 24.2 Citation analysis using Summers’ Etudier package, from a Google Scholar Search for ‘data + soni-
fication’. Colours are works that have similar patterns of citation; size are central works that tie scholarship
together. This is not the same thing as ‘most cited’. On this reading, one should begin with Madhyastha and
Reed (1995); Wilson and Lodha (1996); Zhao, Plaisant, Shneiderman, and Duraiswami (2004); De Campo
(2007); Zhao, Plaisant, Shneiderman, and Lazar (2008). A colour version of this figure can be found in the
plates section.

presenting data is through the creative manipulation of aural data points across and within spaces as we
demonstrate in our method and case studies.
As long ago as 1994, John Krygier was arguing for the use of sound and ‘auralisation’ to represent
geographic data, pointing to even earlier work in the 1950s (Pollack & Ficks, 1954) on the use of sound
to represent multivariate data. A citation network analysis shows that Krygier’s work (Figure 24.2) has
not penetrated to any great degree into archaeology, and so we re-­introduce ideas of sonification into
this space. In particular, he points to the use of sound coupled with animation, to indicate uncertainty:

Maps tend to be ‘totalising’ creatures: variations in uncertainty and quality are smoothed over to
create an orderly, homogeneous graphic. On one hand, this is why maps are so useful, and it is
obvious that maps enable us to deal with our uncertain and messy world by making it look more
certain and tidy. Yet it seems important that some sense of the uncertainty or quality of the repre-
sented data be available . . . The purpose of maps, remember, is to impose order, not to accurately
represent chaos. Further, there is only so much visual headroom on a display: using visual variables
to display uncertainty may have the effect of limiting the display of other data variables.
(Krygier, 1994, p. 161)
This concern with uncertainty fits well with the ‘fuzziness’ that a digital synesthesia would promote, and
the kinds of ‘deformance’ or ‘brokenness’ that digital humanities theoreticians like Mark Sample (2012)
argue for. We turn then to sonification as a method, and to case studies showing simple ways in which some of this
brokenness can be returned to our archaeological geographies.

Method
There is a deep history and literature on archaeoacoustics and soundscapes that tries to capture the sound
of a place as it was (see, for instance, Wall, 2018, on the creation of St. Paul’s or Jeff Veitch’s work on
ancient Ostia, 2017). But we are attempting to sonify spatial datasets – to visualise them with sound, in
situ. This is not so much a recreation of the sounds of the past, but instead a way of exploring our data
about the past. For example, where we might look at a graphical representation of a scatter of flints over
a field, using the visual devices to distance ourselves from the abstract notion of flint counts – we can
instead move through that field wearing headphones, retrieving our location from GPS, and hear the
changes in the data, hear the hotspots (and perhaps more importantly notice the absences of sound) as
we walk. The resulting aural experience is a literal ‘deformance’ that makes us hear modern layers of the
past in a new way.
As Graham (2016) outlines:

Sonification is the practice of mapping aspects of the data to produce sound signals. In general, a
technique can be called ‘sonification’ if it meets certain conditions. These include reproducibil-
ity (the same data can be transformed the same ways by other researchers and produce the same
results) and what might be called intelligibility – that the ‘objective’ elements of the original data
are reflected systematically in the resulting sound.

Last and Usyskin (2015) have undertaken a number of experiments to test how humans react to soni-
fication of datasets and what kinds of tasks this method can achieve. Their results show that even listeners
with no formal training in music can perceive useful distinctions in the data. These distinctions included
common data visualisation tasks such as classification and clustering.
Because music is sequential and has a duration, Last and Usyskin argue that time-series data – itself
sequential, evolving over time – is particularly well-suited to sonification (2015, p. 424). In
many aspects of sonification, ‘parameter mapping’ is used to match a certain data series to various auditory
dimensions (in our flint example, the amount of flint present in a location might be matched to the pitch
of the sound – the higher the pitch the greater the concentration of flint). Rasterised GIS datasets, by their
very definition, are continuous surfaces of data, and every point of space has a value. Therefore, when
we move through the space represented by that raster, physically walking over the field of flint scatters, it
can be considered similar to panning the mouse pointer over the raster of flint concentrations. The data
is continuous and so sonification of that data is quite appropriate. We journey through the space, at the
same time as journeying through the soundscape created by and from that data.
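As a concrete illustration of parameter mapping, the sketch below converts the flint count beneath the walker's feet into a MIDI pitch. Everything specific in it is an assumption made for illustration: the file name, the ESRI ASCII grid format (hence the six skipped header rows) and the three-octave note range.

import numpy as np

# Hypothetical raster of flint counts per cell, stored as an ESRI ASCII grid.
flints = np.loadtxt("flint_counts.asc", skiprows=6)

def value_to_midi(value, vmin, vmax, low_note=48, high_note=84):
    # Linearly rescale a data value onto a MIDI note number.
    t = (value - vmin) / (vmax - vmin)
    return int(round(low_note + t * (high_note - low_note)))

def pitch_at(row, col):
    # The pitch heard at a raster cell: the higher the count, the higher the note.
    return value_to_midi(flints[row, col], flints.min(), flints.max())

# Sampling pitch_at() along the walker's GPS track, re-projected into raster
# row/column space, yields the evolving soundscape of the field.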
There is also an effect where our expectations of what the sound ‘is’ or ‘represents’ cause us to literally
hear sounds that are not there. A typical example involves flattening all of the instruments and voices in
a pop-­song into a midi file, and then playing that midi file as a piano solo. If one is already familiar with
the song, one can hear the ‘voice’ singing. If not, the sound is unintelligible noise. This effect is sometimes
called an ‘auditory hallucination’(c.f. Koebler, 2015). This example shows how in any representation of
data we can hear/see what is not, strictly speaking, there. We fill the holes with our own expectations.
The sonification of the flint example is subject to the same spatial resolution issues as a more traditional
visualisation: the resulting soundscape will change if we use a 5 m pixel resolution (picking up the smaller
variations in the data) or a 25 m pixel resolution (only playing the broader trends). The same is true of
any visualisation; it is just perhaps more apparent as we consider sound. Thus, as with all methods of
visualisation, we need to be critically self-­aware, and foreground that reflection as part of our analysis.
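The resolution effect is easy to demonstrate computationally: block-averaging the grid before sonifying it smooths away the smaller variations. A sketch reusing the flints array from the previous fragment, with a factor of 5 standing in for the aggregation of 5 m cells to 25 m:

def block_mean(grid, factor):
    # Resample a 2D array by averaging factor x factor blocks of cells.
    h = (grid.shape[0] // factor) * factor
    w = (grid.shape[1] // factor) * factor
    trimmed = grid[:h, :w]
    return trimmed.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

coarse = block_mean(flints, 5)
# The same walk sonified from `coarse` now plays only the broader trends.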

Case studies

Sonification out loud


We will now present three case studies, each a recent example of sonifying archaeological spatial
data. Each case study has a set of 3D points as its underlying dataset, but each presents the data in a dif-
ferent way – and can be experienced either in situ or via a desktop computer.
Recalling Sample’s origami crane – part of the art of origami is to delight in the care and meditation
that the process affords. The act of sonification does not always produce pleasing or necessarily imme-
diately intelligible sound. In which case, we need to devote attention to process, to blind alleys, to dead
ends. That is, we argue for the ‘failure as epistemology’ developed by Croxall and Warnick (2017). The
way that things break, the ways our digital tools do not really achieve what we wanted or expected, reveal
in their fault lines truths about our ideas about the world, the data, and the past. Surfacing the process of
digital work is as important as the finished products we make.

York municipal cemetery


As part of the 2014 Heritage Jam organised at the University of York, UK (Laino, 2014) we decided to
explore how we could use sound to affect and inform visitors to the 19th–20th century municipal cem-
etery of York. The resulting application, entitled Voices/Recognition, was “designed to augment one’s
interaction with York Cemetery, its spaces and visible features, by giving a voice to the invisible features
that represent the primary reason for the cemetery’s existence: accommodation of the bodies buried
underground” (Eve, Hoffman, Morgan, Pantos, & Kinchin-­Smith, 2014).
The prototype application is delivered via the speakers or headphones of the user’s smartphone. It
reads the user’s location from the GPS sensor in the smartphone as they walk around the cemetery and
then compares that with an underlying spatial database. If the user is in close proximity to a grave that
has additional data related to it, a sound file is played (the volume of which is determined by the user’s
distance from the grave itself). The data underlying the application is built from a simple GIS database of
the burial register including grave locations along with the names of the people buried. As the application
was a prototype, instead of a fully finished product, the grave details were not complete and instead the
sound files were created as various whispering voices that were triggered using Apple’s Core Location
libraries. The use of sonification to explore the grave data raised a number of previously unconsidered
questions about the experience of graveyards. For example, while a lot of the graves have markers, there
are also a large number of unmarked burial pits – pauper’s graves – each containing many skeletons
piled into one pit. These pits tend to be beneath the pathways between the grave markers and
in the open spaces, and (being unmarked) are not considered by visitors to the cemetery. As we had no
idea how many bodies were interred in each pit, we represented them by a cacophony of different voices
telling random stories.
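The core geospatial logic of the prototype can be paraphrased in a few lines. What follows is a simplified, platform-neutral sketch (the fielded app used Apple's Core Location on the phone); the 20 m audible radius and the linear fall-off of volume with distance are illustrative choices.

from math import asin, cos, radians, sin, sqrt

def distance_m(lat1, lon1, lat2, lon2):
    # Great-circle (haversine) distance in metres between two WGS84 points.
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 6371000 * 2 * asin(sqrt(a))

AUDIBLE_M = 20.0  # illustrative trigger radius

def grave_volumes(user_lat, user_lon, graves):
    # Map each in-range grave to a playback volume that falls off with distance.
    volumes = {}
    for grave_id, (lat, lon) in graves.items():
        d = distance_m(user_lat, user_lon, lat, lon)
        if d < AUDIBLE_M:
            volumes[grave_id] = 1.0 - d / AUDIBLE_M  # 1.0 at the grave, 0.0 at the edge
    return volumes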

The areas of the cemetery that are visually empty are suddenly transformed into areas containing a
vast number of voices of the dead. There is a common belief that it is bad luck or disrespectful to
walk over somebody’s grave, therefore the ‘empty’ paths that were previously seen as a ‘safe’ places
to walk, suddenly become areas that are superstitiously liminal.
(Eve, 2017, para. 4.2)

The experiment also raised issues about power and control in the cemetery and how that is reflected by
the placement of the graves. In contrast to the cacophony produced by the pauper pits, when you move
closer to a larger, expensive grave monument the cacophony is reduced to just one or two voices – as
the expensive graves have been placed to stand apart from the other graves. The voices of the rich and
powerful are heard as clearly in death as they were in life. We would argue that this social stratification
and also the affective nature of using sounds and voices to represent the pauper graves would not be so
obvious if we were looking at a simple visualisation on a screen or printed on a map.

Listening to Watling Street


As part of the 2015 Heritage Jam, Graham was inspired by the work of the ‘Data-Driven DJ’, Brian Foo, and
his piece ‘Two Trains – Sonification of Income Inequality on the NYC Subway’. In this piece, Foo takes
the US Census data on median wealth along the stops of the subway, and uses this data to generate a piece
of music. The length of the piece is scaled against the length of the subway line. The song is generated by
running an auction for sound samples at each point along the line. In general, the higher the income, the
more sound samples that can be selected and played for the duration until the next subway station is reached.
Each station has a ‘budget’, which is set from the US Census data for average monthly wage at that station;
each instrument has a ‘price’. The poorer the district, the softer, less complex, the music. Foo’s code is open
source (Foo, 2018), and well documented and so we can see exactly how the song is generated.
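One plausible reading of that auction – a sketch of the selection logic only, with made-up prices and budgets, where Foo's actual scripts also handle sample scheduling and mixing – is a greedy purchase of instruments, cheapest first, until the station's budget runs out:

def choose_instruments(budget, instruments):
    # Greedily 'buy' instruments, cheapest first, until the budget is spent.
    playing, spent = [], 0
    for name, price in sorted(instruments.items(), key=lambda kv: kv[1]):
        if spent + price <= budget:
            playing.append(name)
            spent += price
    return playing

instruments = {"bass": 10, "drums": 20, "piano": 35, "strings": 60}  # illustrative prices
for station, income in [("A", 30), ("B", 75), ("C", 130)]:           # illustrative budgets
    print(station, choose_instruments(income, instruments))

Wealthier stations can afford more layers; substituting coin counts per town for census income gives the Watling Street version described below.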
The vision of space in the Roman world, as a sequence of places-­that-­come-­next as depicted on mile-
stones and in written itineraries, is readily amenable to Foo’s vision for hearing inequality along a subway
line. In the case of ‘Listening to Watling Street’, the data comes from the Inscriptions of Roman Britain web
site – counts of coins. We take each town in the Antonine Itinerary along Watling Street, and find the
relevant number of coins. Then, we set the ‘price’ for each instrument such that towns with more coins
obtain a greater tonal variety. Graham experimented with various combinations of instrument clips,
aiming for a tonal composition that would be appropriate for a kind of Roman procession (see Favro &
Johanson, 2010).
As we listen to this song, we hear crescendos and diminuendos that reflect a kind of place-­based shout-
ing: here are the places that are advertising their Romanness, that have an expectation to be heard (Roman
inscriptions quite literally speak to the reader); as Western listeners, we have also learned to interpret such
musical dynamics as implying movement (emotional, physical) or importance. The same itinerary can
then be repeated using different base data – coins from the Portable Antiquities Scheme database, for
instance – to generate a new tonal poem that speaks to the economic world, and, perhaps the insecurity
of that world (for why else would one bury coins?).
Foo draws his musical samples from music written by New York artists, music that ‘captures the throb-
bing vibrancy of New York and the movement of its citizens’. In ‘Listening to Watling Street’ (Graham,
2015) we too are interested in movement, but using the base samples Foo provides (albeit a small
set of them) perhaps unwittingly makes an aural comparison to New York. In the first sketches of ‘Listening
to Watling Street’ we slowed down the beats-­per-­minute to reflect a kind of marching cadence, to subtly
introduce the idea of the marching Roman army. In the second version (which was submitted to the
Heritage Jam), the tempo was sped up and more instrumentation was used to capture the frenetic motion
of the Roman trader. Both versions are true, for a given value of ‘truth’.
Ottawa love stories


Tim Ingold directs us to consider the lived life of lines in the landscape (2015). These lines, which humans
extend outwards from our experiences, entangle with the lines of other humans, other beings, other
things. One way Ingold directs us to think about these lines, their knottings, and their co-­respondences
is to think of them in terms of sound. A vibrating line – a string under tension – makes a noise in the
world. If we considered movement through space as a similar kind of noise-­making, what would our
traces sound like?
Cassandra Marsillo, a student in Carleton University’s Public History MA program, has thought
about these issues and provides us with another case study. Working with digitised historical newspa-
pers, she was struck by the way the obituaries paid particular attention to spaces and places of these
lived lives. She identified a particular genre of these obituaries where a husband or wife died shortly
after the death of their spouse, ‘of a broken heart’. The emotional impact of the places mentioned in
these obituaries seemed clear. She wanted to take the digital representation of the meaningful affect of
these spaces into the physical locations. It seemed, however, invasive: the dead had given no permission
to have their lives represented this way. Marsillo decided to work with the living, and their memories
of emotional spaces. Thus the ‘Ottawa Love Stories’ project was born (Marsillo, 2018). Marsillo asks her
respondents, ‘where were/are the places that are important in the shared life of you and your partner?’.
The resulting maps inscribe these personal histories as lines on the map within the boundaries of the
city of Ottawa.
The map quickly becomes a tangle of knots; but the knots extend also in time. With time comes dura-
tion, and with duration comes sound. Marsillo uses simple techniques of parameter mapping to map the
changing latitude and longitude and ‘amplitude’ (intensity of the emotion tied to the location) against
the 88-­key keyboard. That is to say, she takes a set of values and performs a mathematical transformation
against them to scale their relative value within a couple of octaves on the piano. Given that all of these
stories are taking place against the map of Ottawa, particular locations appear again and again in these
stories. As the songlines are played, those locations form a kind of sonic architecture against which the
other notes sound. Unexpected congruences and harmonies emerge, dissipate; lives lived, lines traced.
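Marsillo's transformation is a classic rescale-to-pitch: a value (latitude, longitude or 'amplitude') is normalised and mapped into a narrow window of the keyboard. In the sketch below, the two-octave window above middle C and the sample coordinates are assumptions made for illustration.

def to_piano_key(value, vmin, vmax, base_note=60, span=24):
    # Rescale a value into a two-octave window of MIDI notes (60 = middle C).
    t = (value - vmin) / (vmax - vmin)
    return base_note + round(t * span)

story_lats = [45.384, 45.421, 45.429, 45.384]  # one couple's places, revisited
notes = [to_piano_key(v, min(story_lats), max(story_lats)) for v in story_lats]
# Shared or revisited places rescale to the same notes, so they recur as the
# 'sonic architecture' against which the rest of each story sounds.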
Each story then takes place inside the same sonic space, and across them all certain chords keep happen-
ing. Why these chords? Why these places? A sonification of simple point data draws our attention to
an Ingoldian conception of lines in the landscape. For readers of this volume, these baselines (bass lines),
could be accentuated with other kinds of archaeological data. The archaeological data become the grace
notes of a song as a way of approximating an affective approach to the sense history of the place. We
cannot recover emic sensations of the past, but we can create new sonic experiences of the past that could
redirect our attention.

Conclusion
Within this chapter, we have shown that the visualisation of spatial data is not just limited to dots on a
map, or hachures on an archaeological plan – instead we demonstrate that opening up archaeological data
to be experienced through other sensory modalities might open our understandings of the past in new
ways. The traditional methods of visualising our data have much merit and should not be discarded: they
are familiar, and because of that familiarity they are easy to understand and also often easy to produce
using modern software. But we would argue that we are now at the point in the development of spatial
computing where we can explore our data in parallel using different interfaces and different sensory
modalities.
We have used examples of the sonification of data as one way into accessing these different modalities.
Whilst the software and hardware to sonify data are still not mainstream, they are presently developed enough
to enable researchers to begin to use them (much more so than, for instance, olfactory or gustatory inter-
faces). Not all data is suitable for sonification, in the same way that not all data is suitable for visualisation
in a scatter chart or a raster surface. Nevertheless we have shown that sonification can become another
vector for knowledge mobilisation. Just as in a stylised visual map, it is not a passive representation of
the archaeological data, but a performance of the data that gestures beyond itself, to conjure up other
associations, meanings, and emotions.
As available technology and methods progress we are going to be able to move beyond the simple
map or distribution chart and begin to experience our data with our bodies, with multiple senses. We
are going to be able to experience our datasets in situ, as we walk through an archaeological site or land-
scape – and we are not going to just see the patterns change; we are going to hear, feel, taste and smell
them. Spatial data visualisation is no longer visualisation at all: it is an embodied experience that uses
multiple sensory modalities to represent the same underlying datasets, each modality telling its own story
and revealing its own unique patterns.

References
Acevedo, D., Vote, E., Laidlaw, D. H., & Joukowsky, M. S. (2001). Archaeological data visualization in VR: Analysis of
lamp finds at the Great Temple of Petra, a case study. Proceedings of the conference on Visualization’01 (pp. 493–496).
IEEE Computer Society.
Allen, P., Feiner, S., Troccoli, A., Benko, H., Ishak, E., & Smith, B. (2004). Seeing into the past: Creating a 3D modeling
pipeline for archaeological visualization. 3D Data Processing, Visualization and Transmission, 2004. 3DPVT 2004.
Proceedings. 2nd International Symposium on (pp. 751–758). IEEE.
Andrienko, N., & Andrienko, G. (2006). Exploratory analysis of spatial and temporal data: A systematic approach. New
York: Springer Science & Business Media.
Brewer, I., MacEachren, A. M., Abdo, H., Gundrum, J., & Otto, G. (2000). Collaborative geographic visualization: Enabling
shared understanding of environmental processes. Information Visualization, 2000. InfoVis 2000. IEEE Symposium
on (pp. 137–141). IEEE.
Caraher, W. (2015). Slow archaeology. North Dakota Quarterly, 80(2), 43–52.
Conolly, J., & Lake, M. (2006). Geographical information systems in archaeology. Cambridge: Cambridge University Press.
Cooper, A., & Green, C. (2016). Embracing the complexities of “Big Data” in archaeology: The case of the
English landscape and identities project. Journal of Archaeological Method and Theory, 23(1), 271–304. https://doi.org/10.1007/s10816-015-9240-4
Crampton, J. W. (2002). Interactivity types in geographic visualization. Cartography and Geographic Information Science,
29(2), 85–98.
Croxall, B., & Warnick, Q. (2017). Failure. In Digital pedagogy in the humanities. MLA Commons. Retrieved April 29,
2018, from https://digitalpedagogy.mla.hcommons.org/keywords/failure/
De Campo, A. (2007). Toward a data sonification design space map. Atlanta: Georgia Institute of Technology.
Delazio, A., Nakagaki, K., Klatzky, R. L., Hudson, S. E., Lehman, J. F., & Sample, A. P. (2018). Force jacket:
Pneumatically-­actuated jacket for embodied haptic experiences. In Proceedings of the 2018 CHI Confer-
ence on Human Factors in Computing Systems (pp. 320:1–320:12). New York, NY: ACM. https://doi.org/10.1145/3173574.3173894
Di Franco, P. D. G., Camporesi, C., Galeazzi, F., & Kallmann, M. (2015). 3D printing and immersive visualization for
improved perception of ancient artifacts. Presence, 24(3), 243–264.
Ebert, D. (2004). Applications of archaeological GIS. Canadian Journal of Archaeology/Journal Canadien d'Archéologie,
319–341.
Engadget. (2018). The world now has a smart toaster. Retrieved April 26, 2018, from www.engadget.com/2017/01/04/
griffin-­connects-­your-­toast-­to-­your-­phone/
Eve, S. (2012). Augmenting phenomenology: Using augmented reality to aid archaeological phenomenology in the
landscape. Journal of Archaeological Method and Theory, 19(4), 582–600. https://doi.org/10.1007/s10816-012-9142-7
Eve, S. (2014). Dead men’s eyes: Embodied GIS, mixed reality and landscape archaeology. BAR British Series 600. Oxford:
Archaeopress.
Eve, S. (2017). The embodied GIS: Using mixed reality to explore multi-­sensory archaeological landscapes. Internet
Archaeology, (44). https://doi.org/10.11141/ia.44.3
Eve, S. (2018a). Losing our senses, an exploration of 3D object scanning. Open Archaeology, 4(1), 114–122. https://
doi.org/10.1515/opar-­2018-­0007
Eve, S. (2018b). A dead man’s nose: Using smell to explore the battle of waterloo. In D. Medway, K. McLean,
C. Perkins, & G. Warnaby (Eds.), Designing with smell: The practices, techniques and challenges of olfactory creation (In
Press). London: Routledge.
Eve, S., Hoffman, K., Morgan, C., Pantos, A., & Kinchin-­Smith, S. (2014). Voices recognition paradata document. Retrieved
February 3, 2016, from www.heritagejam.org/s/VoicesRecognitionParadata.pdf
Favro, D., & Johanson, C. (2010). Death in motion: Funeral processions in the Roman forum. Journal of the Society
of Architectural Historians, 69(1), 12–37.
Foo, B. (2018). music-lab-scripts: Scripts for generating music. Python. Retrieved from https://github.com/beefoo/music-
lab-­scripts (Original work published 2014).
Forte, M., Dell’Unto, N., Issavi, J., Onsurez, L., & Lercari, N. (2012). 3D archaeology at Çatalhöyük. International
Journal of Heritage in the Digital Era, 1(3), 351–378.
Frieman, C., & Gillings, M. (2007). Seeing is perceiving? World Archaeology, 39(1), 4. https://doi.org/10.1080/00438240601133816
Gillings, M., Hacıgüzeller, P., & Lock, G. (Eds.). (2019a). Re-­mapping archaeology: Critical perspectives, alternative mappings.
New York, NY: Routledge.
Gillings, M., Hacıgüzeller, P., & Lock, G. (2019b). On maps and mapping. In M. Gillings, P. Hacıgüzeller, & G. Lock
(Eds.), Re-­mapping archaeology: Critical perspectives, alternative mappings (pp. 1–16). New York, NY: Routledge.
Gosden, C. (2001). Making sense: Archaeology and aesthetics. World Archaeology, 33(2), 163–167.
Graham, S. (2015). Listening to Watling Street. Retrieved from www.heritagejam.org/2015exhibitionentries/2015/9/18/
listening-to-watling-street-dr-shawn-graham
Graham, S. (2016). The sound of data (a gentle introduction to sonification for historians). Programming Historian.
Retrieved from https://programminghistorian.org/lessons/sonification
Graham, S., & Eve, S. (2013). Historical friction. Retrieved February 3, 2016, from https://github.com/shawngraham/
historicalfriction
Gupta, N., & Devillers, R. (2017). Geographic visualization in archaeology. Journal of Archaeological Method and
Theory, 24, 852–885.
Hamilakis, Y. (2014). Archaeology and the senses: Human experience, memory, and affect. Cambridge: Cambridge Uni-
versity Press.
Hodder, I. (2012). Entangled: An archaeology of the relationships between humans and things. New Jersey: John Wiley &
Sons.
Howard, D., & MacEachren, A. M. (1996). Interface design for geographic visualization: Tools for representing reli-
ability. Cartography and Geographic Information Systems, 23(2), 59–77.
Ingold, T. (2015). The life of lines. Abingdon, UK: Routledge.
Iwata, H., Yano, H., Uemura, T., & Moriya, T. (2004, March). Food simulator: A haptic interface for biting. In IEEE
virtual reality 2004 (pp. 51–57). IEEE.
Koebler, J. (2015, December 18). The strange acoustic phenomenon behind these wacked-­out versions of pop songs.
Retrieved April 29, 2018, from https://motherboard.vice.com/en_us/article/kb7agw/the-strange-acoustic-
phenomenon-behind-these-wacked-out-versions-of-pop-songs
Kopetz, H. (2011). Internet of things. In Real-­time systems (pp. 307–323). New York: Springer.
Kraak, M.-­J., & Ormeling, F. J. (2013). Cartography:Visualization of spatial data. Abingdon, UK: Routledge.
Krygier, J. B. (1994). Chapter 8: Sound and geographic visualization. In A. M. Maceachren & D. R. F. Taylor
(Eds.), Modern cartography series (Vol. 2, pp. 149–166). Academic Press. https://doi.org/10.1016/B978-0-
08-042415-6.50015-6
Laino, F. (2014). 2014 entries. Retrieved February 3, 2016, from www.heritagejam.org/2014-­entries/
Last, M., & Usyskin, A. (2015). Listen to the sound of data. In Multimedia data mining and analytics (pp. 419–446).
New York: Springer.
MacEachren, A. M., Boscoe, F. P., Haug, D., & Pickle, L. W. (1998). Geographic visualization: Designing manipulable
maps for exploring temporally varying georeferenced statistics. In Information Visualization, 1998. Proceedings. IEEE
Symposium on (pp. 87–94). IEEE.
Madhyastha, T. M., & Reed, D. A. (1995). Data sonification: Do you see what I hear? IEEE Software, 12(2), 45–56.
Marsillo, C. (2018). Ottawa love stories. Retrieved from https://ottlovestories.wordpress.com/
McCoy, M. D. (2017). Geospatial Big Data and archaeology: Prospects and problems too great to ignore. Journal of
Archaeological Science, 84, 74–94.
Mlekuz, D. (2004, November 11). Listening to landscapes: Modelling past soundscapes in GIS. Retrieved
November 16, 2010, from http://intarch.ac.uk/journal/issue16/mlekuz_index.html
Moser, S. (2012). Early artifact illustration and the birth of the archaeological image. Archaeological Theory Today,
292–322.
Murdoch, M., & Davies, J. (2017). Spiritual and affective responses to a physical church and corresponding virtual
model. Cyberpsychology, Behavior, and Social Networking, 20(11), 702–708. https://doi.org/10.1089/cyber.2017.0249
Neumüller, M., Reichinger, A., Rist, F., & Kern, C. (2014). 3D printing for cultural heritage: Preservation, accessibil-
ity, research and education. In 3D research challenges in cultural heritage (pp. 119–134). Berlin, Heidelberg: Springer.
https://doi.org/10.1007/978-3-662-44630-0_9
Papagiannakis, G., & Magnenat-­Thalmann, N. (2007). Mobile augmented heritage: Enabling human life in ancient
Pompeii. International Journal of Architectural Computing, 5(2), 396–415.
Papagiannakis, G., Schertenleib, S., O’Kennedy, B., Arevalo-Poizat, M., Magnenat-Thalmann, N., Stoddart, A., &
Thalmann, D. (2005). Mixing virtual and real scenes in the site of ancient Pompeii. Computer Animation and
Virtual Worlds, 16(1), 11–24.
Papagiannakis, G., Schertenleib, S., Ponder, M., Arévalo, M., Magnenat-­Thalmann, N., & Thalmann, D. (2004).
Real-­time virtual humans in AR sites. Proceedings of IEE Visual Media Production 2004 (pp. 273–276). Stevenage,
Hertfordshire: IEE.
Pollack, I., & Ficks, L. (1954). Information of elementary multidimensional auditory displays. The Journal of the
Acoustical Society of America, 26(2), 155–158.
Poole, D. (1997). Vision, race, and modernity: A visual economy of the Andean image world. Princeton, NJ: Princeton
University Press.
Primeau, K. E., & Witt, D. E. (2017). Soundscapes in the past: Investigating sound at the landscape level. Journal of
Archaeological Science: Reports, 19, 875–885.
Sample, M. (2012). Notes towards a deformed humanities. Retrieved April 29, 2018, from www.samplereality.
com/2012/05/02/notes-­towards-­a-­deformed-­humanities/
Schnabel, M. A., Wang, X., Seichter, H., & Kvan, T. (2007). From virtuality to reality and back. In Proceedings of the
IASDR 2007 conference. Hong Kong: The Hong Kong Polytechnic University.
Sekula, A. (1981). The traffic in photographs. Art Journal, 41(1), 15–25. https://doi.org/10.1080/00043249.1981.10792441
Shekhar, S., Feiner, S. K., & Aref, W. G. (2015). Spatial computing. Communications of the ACM, 59(1), 72–81. https://doi.org/10.1145/2756547
Slocum, T. A., Blok, C., Jiang, B., Koussoulakou, A., Montello, D. R., Fuhrmann, S., & Hedley, N. R. (2001). Cognitive and usability issues in geovisualization. Cartography and Geographic Information Science, 28(1), 61–75. https://doi.org/10.1559/152304001782173998
Slocum, T. A., McMaster, R. B., Kessler, F. C., & Howard, H. H. (2008). Thematic cartography and geographic visualization. New Jersey: Prentice Hall.
Sterry, M. (2018). Multivariate and spatial visualisation of archaeological assemblages. Internet Archaeology, 50. https://doi.org/10.11141/ia.50.15
Summers, E. (2016). ici: Edit Wikipedia pages near you [JavaScript software]. Retrieved from https://github.com/edsu/ici (Original work published 2013).
Teslasuit. (2018). Teslasuit: Full body haptic suit. Retrieved April 27, 2018, from https://teslasuit.io/
Tyner, J. A. (2014). Principles of map design. New York: Guilford Publications.
Van Dam, A., Laidlaw, D. H., & Simpson, R. M. (2002). Experiments in immersive virtual reality for scientific visu-
alization. Computers & Graphics, 26(4), 535–555.
Veitch, J. (2017). Soundscape of the street: Architectural acoustics in Ostia. In E. Betts (Ed.), Senses of the empire: Multisensory approaches to Roman culture. Abingdon-on-Thames: Taylor & Francis.
Vote, E., Acevedo, F. D., Laidlaw, D. H., & Joukowsky, M. S. (2002). Discovering Petra: Archaeological analysis in VR. IEEE Computer Graphics and Applications, 22(5), 38–50.
Wall, J. (2018). Recovering lost acoustic spaces: St. Paul's Cathedral and Paul's Churchyard in 1622. Retrieved April 27, 2018, from www.digitalstudies.org/articles/10.16995/dscn.58/
Walters, M. (2018). Mark Walters on Sketchfab. Retrieved April 27, 2018, from https://sketchfab.com/markwalters
Wheatley, D., & Gillings, M. (2000). Vision, perception and GIS: Developing enriched approaches to the study of
archaeological visibility. In G. Lock (Ed.), Beyond the map (pp. 1–27). Amsterdam: IOS Press.
Wilson, C. M., & Lodha, S. K. (1996). Listen: A data sonification toolkit. Atlanta: Georgia Institute of Technology.
Xia, F., Yang, L. T., Wang, L., & Vinel, A. (2012). Internet of things. International Journal of Communication Systems,
25(9), 1101.
Zeroghan. (2017, July 28). Interview with Dean Paton, Big Heritage: Pokemon GO at the Chester Heritage Festival. Retrieved April 27, 2018, from https://pokemongohub.net/post/interview/interview-dean-paton-big-heritage-pokemon-go-chester-heritage-festival/
Zhao, H., Plaisant, C., Shneiderman, B., & Duraiswami, R. (2004). Sonification of geo-referenced data for auditory information seeking: Design principle and pilot study. Atlanta: ICAD.
Zhao, H., Plaisant, C., Shneiderman, B., & Lazar, J. (2008). Data sonification for users with visual impairment: A case study with georeferenced data. ACM Transactions on Computer-Human Interaction (TOCHI), 15(1), 4.
Index

Note: page numbers in italics indicate figures and page numbers in bold indicate tables on the corresponding
pages.

ABC of the British Iron Age 7
access analyses 276
accumulated cost surface (ACS) 340–346, 341–346
adaptive sampling 47–48
adjacency matrix 274, 277
Advanced Research Infrastructure for Archaeological Dataset Networking in Europe (ARIADNE) 21
Advanced Very High Resolution Radiometer (AVHRR) 368
Agent-Based and Individual-Based Modelling: A Practical Introduction 247
agent-based modelling (ABM): accounting for multiple patterns 261–262; agent prediction/learning in 255; agents in 254–255; attributes, states, and behaviour in 254; case study on 263, 263–267, 265–266, 266; characteristics of 248–250, 249–250; collectives in 255–256; computer hardware in 257; detailed design considerations in 253–256; dissemination and re-use in 262–263; environment in 253–254; experimentation and analysis in 259; geometry in 253; goals in 254; history of, in archaeology 248; impact of chance events on 259; implementation and verification in 257–258; input data in 253–254; integrating GIS and 258; introduction to 247–251, 249–250; method of 251–263; parameter changes and 259–261, 260; problem definition and system bounding in 251–253; rules in 254–255; software platforms in 257–258; structural changes and 261; treatment of time in 256; updating in 253; uses of, in archaeology 250–251; verification in 258
Agrippa Road, Germany 347–351, 347–351, 348, 350–351
Alcaina-Mateos, J. 93
Alçiçek, M. C. 94
Aldenderfer, M. S. 247
Almyriotiki geophysical data 399–402
Almyros 2 geophysical data 403–405
Altshul, J. H. 20, 24
ambiguity 170–171
analysis of variance (ANOVA) 51
Analytical Archaeology 8–9
Anatomically Modern Humans (AMH) project 135, 142–151, 143, 144, 145, 146, 147–149, 150
Ancient History of Wiltshire, The 5
anisotropic kriging 124, 125
Annaghmare, Northern Ireland 197–201, 199–201
Antikythera case study 70–74, 72–73, 74
Arcaute, E. 79, 80, 89
archaeological data see data
archaeological location monitoring (ALM) 216, 228
archaeological theory 8
archaeological vision 461–467, 463, 466
Archaeology Data Service (ADS) 21
Archaeology of Time 408
archival databases 23
ArcMap 10.3 27
Asch, D. L. 216
assignment, spatial approaches to: Annaghmare, Northern Ireland, case study on 197–201, 199–201; conclusions on 206–207; Duggleby Howe, Yorkshire Wolds, case study 201–206, 203, 205, 206; introduction to 192–193; local or non-local 194; methodology in 195–197; oxygen 193–194; strontium 193
Athanassas, C. D. 94
Atici, L. 20
Atkinson, P. M. 93, 126
Atlas of Hillforts in Britain and Ireland project 85–89, 86–88
Attwell, M. R. 213
Atwater, C. 6
Aubrey, J. 4
augmented reality (AR) 465
Autocorrelated Errors Model 141
autocorrelation, spatial 138–142, 149–150
Axtell, R. L. 265
Baddeley, A. J. 63
Baden, W. W. 247, 250
Bailey, G. N. 408
Balla, A. 171
Ballyhenry rath, County Antrim, Northern Ireland 103–106, 104–112
Banerjee, R. 171
Banning, E. B. 55
Barceló, J. A. 114
Barrett, J. C. 408
Barrientos, G. 94
Barthelemy, M. 278
Baxter, M. 171
Bayesian Conditional Autoregressive Model 141
Bayesian statistics 195–197, 232
Beale, C. M. 141, 142
Beaudry, M. C. 424
Beck, H. 296
Beekman, C. S. 247, 255
Bellavia, G. 334
Berrocal, A. 114
Bertoncello, F. 242
beta skeletons 280–281, 281
Bevan, A. 161–163, 213, 215–216, 228, 242–243
Bianchi, E. 160
Bicho, N. 135, 142
big data 169; case study on 436–440, 437–439; conclusions on 440–441; defined 431–432; exploratory statistics 435; ‘human’ 432–434; introduction to 430–431, 431; method for using 434–436; ‘scientific’ 432; semantic technologies and linked geospatial 435–436; spatial binning method for 434–435; workflows and visualization 436
Billman, B. R. 212
Binford, L. R. 8, 41, 43
bivariate K function 67
block kriging 125
Bocquet-Appel, J. P. 94
Bogaard, A. 161
Bradley, R. 408
Branting, S. 367
Braudel, F. 408
Breshears, D. D. 118
Britannia 4
Broadbent, S. R. 77
Brookes, S. 80
Brown, T. 80, 82, 83, 85, 91
Buran Kaya III see Anatomically Modern Humans (AMH)
Camden, W. 4
Canadian Archaeological Radiocarbon Database (CARD) 156
Canadian Tri-Agency Statement for Principles on Digital Data Management 21
Cannon, J. 367
Canosa-Betés, J. 352, 353
Cape Cod survey 55
Caraher, W. 461
Carleton, W. C. 235, 236
Carrer, F. 93
cartographic time 409
Cascalheira, J. 135, 142
Castle, C. J. E. 258
Castrorao Barba, A. 242
Çatalhöyük, Turkey 418, 451–454, 452–453
categorical data 213, 217–218, 221, 222
Catella, L. 94
Catts, W. 20
causality 136
Cavalli-Sforza, L. L. 142
CCD (Charge-Coupled Device) 362
Cegielski, W. 247, 250
Central Place Foraging 236
Central Place Theory 9
Chacoan tower kivas, Southwestern US 325–328, 326
CHaMP Topo Processing Tool 32–33, 33
Chapman, J. 213
Chaput, M. A. 163
Chessel, D. 148
Chi, G. 151
Childe, V. G. 6–7
Childs, C. 367
Chorley, R. 9
CIDOC (International Council for Documentation) Conceptual Reference Model (CRM) 422–423
Cimler, R. 171
City Clustering Algorithm (CCA) 78–79, 78–80
Clark, G. 408
Clark, P. J. 65
Clarke, D. 8–9
cluster plots 80–81, 81
cluster sampling 47
CMOS (Complementary Metal Oxide Semiconductor) 362
coefficient of determination 140
Coins of Allectus 106–114, 113–114
Colt Hoare, R. 5
Common-Mid-Point (CMP) technique 392–393, 393–395
complexity science 247
conditional simulation 102–103; model-fitting and 69
Conolly, J. 93, 100, 162, 235
contextual knowledge 22
continuous data: one-sample comparisons for 214–215, 219, 221, 223; two-sample comparisons for 214, 218–219, 222–223
Contreras, D. 254
convenience sampling 48–49
Conyers, L. B. 376
Cooper, A. 20, 23, 421
CORONA spy satellite 367–369, 368–370
correlation: case study on 142–151, 143, 144, 145, 146, 147–149, 150; conclusions on 151; introduction to 135–136; method of 136–139; Pearson’s correlation coefficient 136–139; scatter plots and question of causality in 136
cost functions spatial analysis see least-cost path (LCP); least-cost site catchment (LCSC)
Costopoulos, A. 247, 248, 249, 250
Cousins, S. A. O. 254
Crema, E. R. 160, 161, 163, 213, 261
Crescioli, M. 171
Crooks, A. T. 258
Croxall, B. 468
cultural ecology 7
Cultural Resource Management (CRM) 21, 24, 26, 231–232; CIDOC (International Council for Documentation) 422–423
culture networks, spatial material 276–277
Daly, P. T. 408, 414, 422
D’Andrea, A. 171
Danielisová, A. 171
Dark Object Subtraction (DOS) 365
data: Big (see big data); case studies on 26–34, 28–33; categorical 213, 217–218, 221, 222; conclusions on 35–37; continuous 214–215, 218–219, 221, 222–223; fuzzy (see fuzzy sets); geophysical (see geophysical data); introduction to 17–21, 19, 20; iterative cleaning of 26–30, 28–30; methods of collection of 21–26; one-sample comparisons for continuous 214–215, 219, 221, 223; open and collaborative research for 25–26; processing field 30–33, 31–33; quality of 19–20, 20; reliability of 17–18; Roman transport system spatial network analysis 283; satellite (see satellite data); scripted workflows of 25; sources and types of errors in collection and compilation of 18, 19; two-sample comparisons for continuous 214, 218–219, 222–223; uncertainty in 18–19, 70, 170–171, 185–186; versioning of 25, 34
Database Management Systems (DBMS) 409
databases, archaeological 21; archival 23; integrated 23
Davies, I. 254
Davis, E. H. 6, 8
Dean 380
De Dufour, K. 20
De Guio, A. 80, 91
De Laet, V. 367
Delaunay triangulation 282
Deleuze, G. 10
Demars, P. Y. 94
Density Based Spatial Clustering of Applications with Noise (DBSCAN) 81–82, 89, 91
Dent, C. L. 118
Depthmap 308
De Runz, C. 176, 177
Desjardin, E. 176, 177
detection effects 51
DeVillers, R. 461
Dibble, H. L. 20
Diet Breadth 236
diffusion 77
the Digital Archaeological Record (tDAR) 21
digital elevation models (DEM) 13, 93, 233, 314, 316, 319–321, 319–321, 334–335; scaling up of 322–323
digital terrain models (DTMs) 316
Dijkstra, E. W. 345
distance-weighted betweenness centrality 283–285, 284, 285
distance-weighting interpolation 120–122, 121
Domesday Book percolation analysis 80–85, 83–84, 89–90
Doneus, M. 353
Doran, J. E. 8, 250
Duffy, P. 55
Duggleby Howe, Yorkshire Wolds 201–206, 203, 205, 206
Dunnell, R. C. 20
dynamic mapping 417, 417–418
ecological niche modelling 232
edge effects 126
edge list 274, 277
Edinborough, K. 261
electrical resistivity tomography (ERT) 383–384, 384
electromagnetic induction (EMI) data 382, 392
electromagnetic radiation 360–362, 361–362
Ellison, A. 9
embodied GIS 465
English Landscapes and Identities project 436–440, 437–439
errors, sampling 50–51
Evans, F. C. 65
Evans, T. N. L. 20, 23
Evans, T. S. 282
Eve, S. 160, 465, 468–469
event-based modeling 415, 416
Event-Based Spatio Temporal Data Model (ESTDM) 411
exploitation territories 213
exploratory statistics 435
F-distribution 219
Fera, M. 353
Ferber, J. 247
FGISSAR (Fuzzy Geographic Information System for Spatial Analysis in aRchaeology) 175–188, 176–185, 187–188
field data processing 30–33, 31–33
first-order effect 158
first-order spatial intensity 61, 62
Fish, P. R. 221
Fish, S. K. 221
Fisher, P. F. 170, 320, 322
Fletcher, M. 213
Flory, P. J. 77
Fort, J. 130, 131, 132, 132, 142
Fourier series 380, 381, 383
Fox, C. 7, 85
Frachetti, M. 408
Franklin, J. 160
F-ratio 219
Frisch, H. L. 77
F-statistic 140, 151
F-test 223
Fulminante, F. 282
functional uncertainty 185
Fusco, J. 178
fuzzy sets: case studies on 175–188; conclusions on 188–189; FGISSAR (Fuzzy Geographic Information System for Spatial Analysis in aRchaeology) 175–188, 176–185, 187–188; introduction to 169–171; method of 172–174, 172–175
Gabaix, X. 78, 79, 80
Gabriel graphs 280–281, 281, 287–289, 288, 289, 291
Gadsby, D. 36
Gaffney, V. 215, 334
Gajewski, K. 163
Gangodagamage, C. 138
Ganskopp, D. 352
Gardner, R. H. 118
Garmy, P. 352
Garrad, C. 32
Gattiglia, G. 432, 441
Geissenklösterle see Anatomically Modern Humans (AMH)
generalised least squares 141
Geographical Information Science (GISci) 410
geographically weighted regression (GWR) 161–162
geographic information systems (GIS) 1, 11, 18, 27, 156, 231, 409; 3D (see three-dimensional models); agent-based modelling (ABM) and 258; archaeological vision and 461–462; embodied 465; GRASS 80, 83, 89–90; local spatial analysis and 157–158; purposive selection and 50; in regional environmental relationships analysis 215–217, 221, 222; sampling and 50; temporal 410, 411–412, 419–421, 421; visibility analysis based on (see visibility analysis, GIS-based)
geophysical data: case study on 396–398, 397–405; conclusions on 398, 406; creation of maps from 383; electrical resistivity tomography (ERT) 383–384, 384; electromagnetic induction measurements 382, 392; ground penetrating radar 385–389, 386–389; interpretation of 396; introduction to 376; magnetic potential field measurements 379–382, 381; main convolution 377–379, 380; method of using 377–396; micro-gravity measurements 390–391, 391; multi-sensor magnetic measurements 382–383; preprocessing techniques 377; processing of shallow depth 377–383, 378, 381; seismic measurements 392–393, 393–395
georeferencing 365
Georgia Coast model 236–242, 238–241
geostatistics: Ballyhenry rath, County Antrim, Northern Ireland 103–106, 104–112; basis of 95–96; case studies in 103–114; Coins of Allectus 106–114, 109–115; conclusions on 114–116; conditional simulation in 102–103; introduction to 93–94; method in 94–103; scale and spatial structure in 94–95; simple kriging (SK) in 100–102; variogram in 96–99, 96–100
Getis, A. 160
Getis’ local Gi* 161
Gietl, R. 353
Gilbert, N. 247
Gillings, M. 20, 93, 333, 462
GISSAR (Geographic Information System for Spatial Analysis in aRchaeology) 175
GitHub 34, 36
Givoni, B. 338
Gkiasta, M. 162
Global Navigation Satellite System positioning 30
Global Positioning Systems (GPS) 21, 94; field data processing 31–32, 31–32; Real-Time Kinematic (RTK) 454, 454–455
global spatial network analysis 278–290, 279
GlobalXplorer Project 370–371
Goldman, R. 338
Gonçalves, C. 135, 142
Google Scholar 462, 463
Gosden, C. 408, 460
Graham, S. 467, 469
Graph theory 323
GRASS GIS 80, 83, 89–90
Green, C. 20, 23, 419–420, 421
Grewe, K. 348
grey literature 23
Grimm, V. 247, 258, 261, 262
Groenhuijzen, M. R. 353
ground penetrating radar (GPR) 385–389, 386–389
Guattari, F. 10
Guichard, F. 249
Güimil-Fariña, A. 352
Guiot, J. 254
Gupta, N. 461
Hacıgüzeller, P. 462
Hage, P. 281
Hagen, J. 348
Hägerstrand, T. 414
Haggett, P. 9
Halls, P. J. 414
Hamilakis, Y. 462, 464
Hammer, E. 367
Hammersley, J. M. 77
Hanson, J. 296, 298; see also space syntax methodology
Harary, F. 281
Harbaugh, J. W. 94
Harris, J. 433
Harris, T. M. 215
Harriss, J. 9
Hatzichristos, T. 171
Hatzinikolaou, E. G. 171
Hawkes, C. 7
Heap, A. D. 118
Heilen, M. P. 20, 24
Herbin, M. 176, 177
Hermon, S. 171
Herrick, J. E. 118
Herzog, I. 281, 282, 338–339, 353, 376
Hesse, A. 376
Higgs, E. S. 213
hillfort clusters in Britain and Ireland 85–89, 86–88
Hillier, B. 296, 298; see also space syntax methodology
Hitchings, P. 54
Hobgood, R. 325
Hodder, I. 9, 17, 216
Hodson, F. 8
Hogg, A. H. A. 85
Hole, B. L. 58
Hublin, J.-J. 249
Hunt, E. D. 215
Iannone, G. 235
imprecision 170
induced spatial dependency 158
Ingold, T. 408, 424, 470
inherent spatial dependency 158
inhomogeneous Poisson model 160
integrated databases 23
integration of ABM and GIS 258
intensity, spatial 60–63, 62, 224; compared against environmental categories 215, 219–220, 224
interpolation: case studies on 130–132, 131–132; conclusions on 132–133; distance-weighted 120–122, 121; introduction to 118–119; inverse distance weighted (IDW) 121–122; kriging 123–126, 124–125; method of 120–130, 121, 123–125, 127, 128, 129; model comparisons 127, 127–130, 128, 129; sample intensity, measurement scale and edge effects 126; thin plate spline (TPS) 122–123, 123
Introduction to American Archaeology 7
inverse distance weight (IDW) interpolation 121–122; model comparisons 127, 127–130, 128, 129
Isern, N. 132
isotope tracers 192–194; see also assignment, spatial approaches to
iterative data cleaning 26–30, 28–30
Jahjah, M. 367
Janssen, M. A. 265, 267
Jerardino, A. 150
Johnson, I. 417
Jol, H. M. 376
Jones, B. 20
Jordan Rift Valley survey 52, 52–53
Judge, J. W. 243
Kachel, A. F. 249
Kalayci, T. 368
Kämpinge, Sweden 454, 454–455
Kansa, E. C. 20, 21, 26
Kansa, S. W. 20
Kantner, J. 325
Kellogg, D. C. 214
Kendall’s tau 136
Kerig, T. 261, 344
kernel density surface (KDE) estimation 61, 63, 120, 163, 353
King, J. A. 424
Kirman, A. 254
Kitchin, R. 433, 441
K-nearest neighbours 282, 287
Kohler, T. A. 247, 250, 252, 267
Kolmogorov-Smirnov test 218–219, 223, 225
Krems-Hundssteig see Anatomically Modern Humans (AMH)
Krige, D. 123
kriging 122, 123–126, 124–125; anisotropic 124, 125; block 125; simple (SK) 100–102; with a trend model (KT) 102, 106–107, 109–115
Krygier, J. 466
Kuhn, S. L. 249
Kvamme, K. L. 214, 216, 217, 219
Kwan, M.-P. 414
Lake, M. 93, 161
Lake, M. W. 247, 249, 250, 255
Lancelotti, C. 93
Langran, G. 410–411, 414–415, 415, 416
Larsen, S. O. 367
Last, M. 467
Lauricella, A. 367
Lavorel, S. 254
Lay, M. G. 339
least-cost path (LCP) 333–334; accumulated cost surface (ACS) in 340–346, 341–346; case study on 347–351, 347–351, 348, 350–351; conclusions on 352–353; estimating movement costs in 336–339, 337, 338–339, 340; overview of 336
least-cost site catchment (LCSC): accumulated cost surface (ACS) in 340–346, 341–346; case study on 347–351, 347–351, 348, 350–351; conclusions on 352–353; estimating movement costs in 336–339, 337, 338–339, 340; introduction to 333–335, 335; overview of 336
Lee, G.-H. 55
Lee, J. 334
Lee, S.-I. 138
Lefebvre, H. 3
Leland, J. 4
Leone, M. 408
Lessard, Y. A. 379
Levene’s test 219, 223
Lev-Tov, J. 20
Li, J. 118
Li, X. 213
Light Detection and Ranging (LiDAR) 36, 236
Lin, H. 138
linear regression: case study on 142–151, 143, 144, 145, 146, 147–149, 150; coefficient of determination and root mean square deviation 140; conclusions on 151; geographically weighted 161–162; introduction to 135–136; with ordinary least squares method 139–142; regression line and residuals 139; spatial autocorrelation and 140–142
line-of-sight (LOS) 314; enriching basic 321–322; scaling up and 323; viewshed algorithms and 316–318, 317–318
linguistic variables 170
Lloyd, C. D. 93, 126
local indicators of spatial association (LISA) 160–161, 179–181
Locally Adaptive Model of Archaeological Potential (LAMAP) approach 235–236
local Moran statistics 179
local point pattern analysis 158–160, 159, 161
local spatial network analysis 278–290, 279
Lock, G. 353, 462
Lock, G. R. 215, 408, 414, 422
Longacre, W. A. 219
Long House Valley ABM 248–250, 249–250, 263, 263–267, 265–266, 266
loose coupling of ABM and GIS 258
Loots, L. 322
López-Bultó, O. 114
Löwenborg, D. 3
Lowerre, A. G. 83
Lucas, G. 408, 424
Lynch, J. 352
Machálek, T. 171
Maddison, S. 82
Madsen, J. 221
Maier, U. 161
Makse, H. A. 78, 79, 80
Mann-Whitney test 218, 223
Mantzourani, E. 171
marked point patterns 67–69, 68
Markov Chain Monte Carlo simulation 69
Márquez-Pérez, J. 336
Marsillo, C. 470
Marwick, B. 262
Maschner, H. D. G. 214
maximum likelihood estimation 197
McCoy, M. D. 20, 27–30, 432
McGlade, J. 247
McManamon, F. P. 55
McPherron, S. P. 20
Mendel, J. M. 175
mental archetypes 235–236
Menze, B. H. 367
micro-gravity measurements 390–391, 391
Miksicek, C. 221
Miller, A. P. 414
Miller, J. R. 118
Mills, B. J. 277
minimum spanning tree 281
Mithen, S. J. 247, 248, 251, 262
Mlekuz, D. 333
Mn/Model 234–235
Models in Archaeology 9
Models in Geography 9
Modifiable Areal Unit Problem (MAUP) 94
Modis, K. 94
Monte Carlo simulation 69, 232
Montelius, O. 6, 7
Moran’s I spatial autocorrelation coefficient 94, 138, 147, 150
Moran’s local I 161
Moser, S. 461
multistage sample designs 50
multivariate approaches in regional environmental relationships analysis 216–217, 220–221, 225, 225–227, 226, 227
Mumford, G. 367
Muñoz, F. 114
Murphy, R. F. 212
Nagle, C. L. 20, 24
National Institute of Geographical and Forestry Information (IGN-F) 33
natural groupings 80
Nearest Neighbour Index 158
Negre, J. 114
Negre Pérez, J. 93
networks, spatial: access analyses in 276; building 277; case study of 282–289; conclusions on 291–292; defined 273–275, 274; Delaunay triangulation of 282; introduction to 273–277; K-nearest neighbours and maximum distance in 282, 287; local and global analysis measures of 278–290, 279; method of studying 277–282; minimum spanning tree 281; models of 280; overview of research into 275; planar and non-planar 277, 278; relative neighbourhood networks, beta skeletons and Gabriel graphs of 280–281, 281, 287–289, 288, 289, 291; roads, rivers, oceans, traversal and transportation in 275; spatial material culture networks in 276–277; visibility networks in 276
New Archaeology 10, 11
Newcomb, R. E. 85
Newtonian view of time 409
Niccolucci, F. 171
Nielsen, S. E. 156
node-link-diagram 274, 277
non-planar networks 277, 278
non-probability sampling 48
non-spatial ordinary least squares 141
non-stationarity and local spatial analysis: case study on 162–165, 164; conclusions on 165; geographically weighted regression 161–162; introduction to 155–156, 157; local indicators of spatial association (LISA) 160–161; local point pattern analysis 158–160, 159, 161; method of 157–162
normal distribution 196–197
Normalized Difference Vegetation Index (NDVI) 368
Nuninger, L. 242
object lifespan 418–419, 419, 420
object-oriented (O-O) database model 422
ODD (Overview, Design concepts, and Details) protocol 262
Oliva, F. 94
Olševičová, K. 171
one-sample comparisons for continuous data 214–215, 219, 221, 223
Open ABM 262
Open Context 21
Open Digital Archaeology Textbook and Environment 36
OpenRefine 27–30, 28–30
Open Science in archaeology 18, 25–26
Optimal Foraging theory 236
Opitz, R. 36
ORBIS 36; see also Roman transport system spatial network analysis
ordinary least squares (OLS) methods 135–136, 151, 435; linear regression with 139–142; spatial autocorrelation and 140–142
Orton, C. 9, 17, 160, 216
oxygen 193–194
pair correlation function (PCF) 66, 66–67, 158
Paliou, E. 243
Palmisano, A. 213
Pandolf, K. B. 338
Papagiannakis, G. 465
Parcak, S. 367
Parcero-Oubiña, C. 352
pattern oriented modelling (POM) 261–262
Paulissen, E. 367
Pearson’s correlation coefficient 136–137, 213; significance testing for 137–139
Pelagios Commons 36
percolation 77
percolation analysis: Atlas of Hillforts in Britain and Ireland project 85–89, 86–88; case studies in 82–90; City Clustering Algorithm (CCA) in 78–79, 78–80; conclusions on 90–91; Density Based Spatial Clustering of Applications with Noise (DBSCAN) in 81–82; Domesday vills 80–85, 83–84, 89–90; implementation details 89–90; introduction to 77–78; method of 78–79, 78–82, 81; rural settlement in England 80, 82, 83–85, 84
Personality of Britain 7
Peters, D. P. C. 118
Petropoulos, G. P. 171
Peuquet, D. J. 411
Phillips, P. 8
Piantoni, F. 176, 177
Pike, A. W. G. 171
place writing 5
planar networks 277, 278
Plewe, B. 170
Plog, F. 213
point pattern analysis: case study in 70–74, 72–73, 74; conditional simulation and model-fitting in 69; introduction to 60; local 158–160, 159, 161; marked point patterns and relative risk and 67–69, 68; method of 60–70; spatial intensity and 60–63, 62; spatial interactions and 63–67, 64, 65, 66; uncertain and incomplete data in 70
Poisson model 159, 159–160
Pompeii, Italy 53
population and sampling frame 42–43
Portable Antiquities Scheme (PAS) database 440
Posluschny, A. G. 334
post-processualism 10
Pouncett, J. 353
predictive modelling 216; case studies on 234–243; conclusions on 243; effects of pre-existing settlement on location choice and 242–243; Georgia Coast model 236–242, 238–241; introduction to 231–232; LAMAP approach 235–236; method of 232–234; Mn/Model 234–235; problems and pitfalls of 233–234
Prehistoric Settlement Patterns in the Virú Valley, Peru 212
Premo, L. S. 161, 249, 250, 251
principal components analysis (PCA) 130, 367; in regional environmental relationships analysis 216–217, 221, 226, 226–227, 227
Probability Proportional to Size (PPS) sampling 46, 46–47
prospect theories 236
Pujol, T. 142
purposive selection in sampling 48–50
Pylos palace, Western Messenia 305–308, 306
quality assurance 20
quality standard 20
Quantifying Archaeology 213
Rackham, O. 83
Railsback, S. F. 247, 258, 261, 262
random function (RF) model 95–96
random sampling: simple 44, 44; stratified 52, 52–53
Rational Polynomial Coefficients (RPCs) 365
Real-Time Kinematic (RTK) Global Positioning System (GPS) 454, 454–455
Real-Time Kinematic systems 30
reasoning, processes of 13
record, archaeological 21
Reese River Survey 56–57
regional environmental relationships analysis: archaeological location modelling in 216; case studies in 221–227; categorical data in 213, 217–218, 221, 222; comparing site intensity against environmental categories in 215, 219–220, 224, 224; comparisons between two or more site types 214, 223; conclusions on 227–228; core methods in 213–217; GIS modifications in 215–217, 221, 222; introduction to 212–213; issues in 217; method of 217–221; multivariate approaches in 216–217, 220–221, 225, 225–227, 226, 227; one-sample comparisons for continuous data 214–215, 219, 221, 223; redefining “environmental background” for 224–225; site catchment analysis in 213–214; two-sample comparisons for continuous data 214, 218–219, 222–223
regionalized variables 95–96
relative neighbourhood networks 280–281, 281, 287–289, 288, 289, 291
remotely sensed imagery 21
remote sensing, satellite 360–362
representational spaces 3, 12–13
residuals, calculation of 195
Reynolds, A. 80
Rihll, T. E. 242, 280
Ripley’s K function 158, 159, 160, 353
Riris, P. 160
Rivers, R. J. 282
Roberts, B. K. 80, 83, 85
Robinson, J. M. 93
Rogers, J. 247, 250
Rogers, S. R. 338, 352
Rogerson, P. 138, 142
Roman transport system spatial network analysis: conclusions on 289; data in 283; distance from Rome in 285–286, 286; distance weighted betweenness centrality in 283–285, 284, 285; Gabriel graph and relative neighbourhood network in 287–289, 288, 289, 291; K-nearest neighbour networks in 287; maximum distance networks in 287, 290; network models in 286–287; spatial network visualisation in 283
Rootenberg, S. 41
root mean square deviation (RMSD) 140, 142, 149
root-mean-square (RMS) error 128–130, 129
Rozenfeld, H. D. 78, 79, 80
R-squared 140
rural settlement in England percolation analysis 80, 82, 83–85, 84
Rybski, D. 78, 79, 80
Sabloff, J. A. 3
Salisbury, R. B. 130
Sample, M. 460
sample size 43–44; determining appropriate 55, 56
sampling: adaptive 47–48; case studies on 52, 52–57, 54, 56; cluster 47; conclusions on 57–58; defining meaningful units in 53–54, 54; detection effects 51; determining appropriate sample size for 55, 56; evaluating results and errors of 50–51; GIS and 50; introduction to 41–42; method for 42–51; multistage sample designs for 50; non-probability 48; population and sampling frame for 42–43; Probability Proportional to Size (PPS) 46, 46–47; purpose and history of 41–42; purposive selection and optimal searching for 48–50; sample size and sampling fraction in 43–44; sequential 47–48; simple random 44, 44; stratified 44, 45, 52, 52–53; systematic 44, 44–45; testing specific hypotheses and 56–57
sampling fraction 43–44
sampling frame 42–43
Sarris, A. 392
satellite data: case study on 367–369, 368–370; conclusions on 370–371; generic workflow for analysis of 363, 363–367; introduction to 359–360; method of using 360–367, 361–363; principles of satellite remote sensing and electromagnetic radiation and 360–362, 361–362; sensor design for 362; sensor resolutions and 362–363
scale and spatial structure 94–95
scatter plots 136
Schlichtherle, H. 161
Schloss Friedeburg, Saxony-Anhalt, Germany 298–300, 299; nodes and edges in 301–303, 302; qualities and indicators in 303–305
Schneider, J. 347
Scollar, I. 376, 380
scripted workflows 25
Sebastian, L. 243
Secco, G. 80, 91
second-order effect 158
second-order neighbourhood analysis 160
second-order patterning 64–65, 66
seismic measurements 392–393, 393–395
semantic technologies 435–436
sequential Gaussian simulation (SGS) 103
sequential sampling 47–48
sequent snapshots 414–415, 415
settlement archaeology 212
settlement patterns 212; location choice and pre-existing 242–243
shallow depth geophysical data 377–383, 378, 381
Shanks, M. 408
Shennan, S. 162, 163, 213, 261
Shermer, S. J. 214
Silva, F. 135
Simon, F.-X. 392
simple kriging (SK) 100–102
simple random sampling 44, 44
Simultaneous Autoregressive Model (SAR) 141–142
Siolas, A. 171
site catchment analysis (SCA) 9, 213–214
slow archaeology 461
Smith, J. 36
Smithwick, E. A. H. 118
Society for American Archaeology 36
Solberg, R. 367
sonification: archaeological vision and 461–467, 463, 466; case studies on 468–470; conclusions on 470–471; introduction to 460–467, 463, 466; method for 467–468
Soule, R. 338
soundscape 467–468
Southwestern Archaeological Research Group (SARG) 213
space: archaeology and 1–3; concepts of 12; representational 3, 12–13
space and time: case studies on 416–421; concepts of spatiotemporality and 408–409; conceptual computer modeling of 410–411, 411; conclusions on 421–424, 423; dynamic mapping of 417, 417–418; event-based modeling 415, 416; implementation of models of 411–412; introduction to 408–411, 411; object lifespan approach to 418–419, 419, 420; spatiotemporal model of 412–415, 413, 415–416; temporal-GIS and 410; towards an archaeological temporal GIS of 419–421, 421
space syntax methodology: basic principles of 298–300, 299; case study on 305–308, 306; conclusions on 308–309; introduction to 296–298, 297; nodes and edges in 301–303, 302; qualities and indicators in 303–305
space-time cube 412–414, 413
space-time path (STP) 414
spatial analysis 8–11, 12–13
Spatial Analysis in Archaeology 9, 17
Spatial Archaeology 9
spatial archaeology, concept of 3–8
spatial binning method 434–435
spatial expansion method 162
spatial intensity 60–63, 62
spatial interactions 63–67, 64, 65, 66
spatial material culture networks 276–277
spatial narrative 8–11
Spatial regression models for the social sciences 151
spatial sampling see sampling
spatial structure in archaeology see geostatistics
spatial thinking 12–13
Spearman’s rank correlation coefficient 136
Squier, E. G. 6, 8
Srivastava, P. K. 171
Stančič, Z. 215, 334
Stanley, E. H. 118
Stark, M. T. 219
state historic preservation offices (SHPOs) 22–23
stationarity 155–156
Steele, J. 162
Stephan, E. 161
Sterry, M. 462
Steward, J. H. 7, 56, 212
Stockmayer, W. H. 77
Stolar, J. 156
stratified sampling 44, 45, 52, 52–53
strontium 193
Strupler, N. 34
Stucky, D. 334
Study of Archaeology, A 8
Stukeley, W. 5
Styring, A. 161
Suarez, R. 254
summed probability distribution of radiocarbon dates (SPDRD) 162–165, 164
Sweet, R. 5
systematic sampling 44, 44–45
Systems Theory 8
Tabbagh, A. 376, 392
Taylor, J. S. 418
Taylor, W. 8
t-conorms 173
technological determinism 13
Teltser, P. 20
Temporal-GIS 410, 411–412, 419–421, 421
Terradas, X. 114
Terrell, J. E. 408
Thematic Mapper (TM) multispectral sensors 360
Theodorakopoulou, K. 94
Thiesson, J. 392
thin plate spline (TPS) interpolation 122–123, 123
Thomas, D. H. 56, 237
Thomas, J. 408
Thomsen, C. 6
three-dimensional models: brief history of 444–445; case studies on 451–455, 452–454; Çatalhöyük, Turkey 451–455, 452–454; conclusions on 455–456; introduction to 444–449, 446–448; Kämpinge, Sweden 454, 454–455; method for 449–450, 451; visualized in archaeological field practice 445–449, 446–448
Tiffany, J. A. 214
tight coupling of ABM and GIS 258
Tilley, C. 11, 408
time see space and time
Time in Geographic Information Systems 410
TimeMap 417, 417–418
time-slicing 414–415, 415
t-norms 173
Tobler, W. 336, 349
triangulated irregular network (TIN) 121, 314
triangulation networks 121, 282, 314
Trier, D. 367
Trigger, B. G. 212
t-tests 218, 222–224
Turner, A. 308
Turner, M. G. 118
two-sample comparisons for continuous data 214, 218–219, 222–223
Type I errors 137, 141, 151
Type II errors 137
Ulivieri, C. 367
Ullah, I. 55, 215
uncertainty in data 18–19, 70, 170–171, 185–186; functional 185
Ur, J. A. 367
Urban, D. L. 118
urban cityscape of Tours 419, 420
U.S. National Academies 12–13
Usyskin, A. 467
van der Leeuw, S. E. 247, 252
Vander Linden, M. 132
variables, regionalized 95–96
variograms 96–99, 96–100
Varoudis, T. 308
Vavra, M. 352
Vawser, A. 36
Velestino Mati geophysical data 396, 397–398
Vercruysse, R. 20
Veregin, H. 170
Verhagen, P. 242, 353
versioning, data 25, 34
Very High Resolution (VHR) Earth observation satellites 360
Vescelius, G. S. 41
viewshed algorithms and LOS 316–318, 317–318; applications of 323–325, 324; Chacoan tower kivas, Southwestern US 325–328, 326; enriching 321–322
virtual reality (VR) 462
visibility analysis, GIS-based: applications of 323–325, 324; case study on 325–328, 326; conclusions on 328; enriching basic LOS and binary viewshed determinations in 321–322; introduction to 313–314; issues relating to elevation model in 319–321, 319–321; LOS and viewshed algorithms in 316–318, 317–318; method of 314–325, 315, 317–321, 324; scaling up 322–323; 2 or 3D 318
visibility networks 276
vision, archaeological 461–467, 463, 466
visual graph analysis (VGA) 303, 304
Vita-Finzi, C. 213
Volk, C. J. 32
Wadi Quseiba survey, Jordan 54, 54
Wadi Ziqlab survey, Jordan 55, 56
Waelkens, M. 367
Wallace-Hadrill, A. 53
Warnick, Q. 468
Warren, R. E. 216
Wells, J. 22–23
Welsch, R. 408
Westervelt, J. D. 258
Weymouth, J. W. 379
Wheatley, D. 20, 93, 215, 333
Wheaton, J. M. 32
Whitehead, K. 32
Whitley, T. G. 236
Wilkinson, T. C. 34
Wilkinson, T. J. 368
Willey, G. R. 3, 7, 8, 212
Williams, L. 56–57
Wilshusen, R. H. 20
Wilson, A. 417
Wilson, A. G. 242, 243, 280
Wood, B. 353
Wood, Z. 353
workflows, scripted 25
Worsaae, J. 6
Wrathmell, S. 80, 83, 85
Xue, J. Z. 249
Yamada, I. 126
Yamin, R. 424
Yulmetova, M. 32
Zadeh, L. A. 170, 172
Zhou, X. 138
Zhu, J. 151
Zubrow, E. B. W. 93, 94
