David Milne 1-S2.0-S0966692317300984-Main
ABSTRACT
This paper considers the implications of so-called ‘big data’ for the analysis, modelling and planning of transport
systems. The primary conceptual focus is on the needs of the practical context of medium-term planning and
decision-making, from which perspective the paper seeks to achieve three goals: (i) to try to identify what is truly
‘special’ about big data; (ii) to provoke debate on the future relationship between transport planning and big
data; and (iii) to try to identify promising themes for research and application. Differences in the information
that can be derived from the data compared to more traditional surveys are discussed, and the respects in which
they may impact on the role of models in supporting transport planning and decision-making are identified. It is
argued that, over time, changes to the nature of data may lead to significant differences in both modelling
approaches and in the expectations placed upon them. Furthermore, it is suggested that the potential widespread
availability of data to commercial actors and travellers will affect the performance of the transport systems
themselves, which might be expected to have knock-on effects for planning functions. We conclude by proposing
a series of research challenges that we believe need to be addressed and warn against adaptations based on
minimising change from the status quo.
1. Introduction

In recent years there have been enormous technological advances in the capture and storage of data, affecting our potential to monitor both human behaviour and the physical world, and providing possibilities to track and triangulate diverse data sets. A report by the OECD (2013) identified the following types of new data on ‘the human condition’:

• Data from government transactions (e.g. tax, social security)
• Data related to official registration/licensing
• Commercial transactions by individuals and organisations
• Internet data from search and social networking activities
• Tracking data
• Image data (e.g. aerial/satellite images, land-based video)

All of these are potentially relevant to planning transport systems since they either provide insights into the location, timing and frequency of activities that generate travel (such as employment, shopping or social engagement), or they provide direct evidence of the volume, concentration and direction of person movements or vehicular movements. These opportunities have been explored to varying degrees. Saadi et al. (2016) used social security data to infer trip patterns. Chatterton et al. (2015) used annual vehicle test data to understand social variations in vehicle use. De Montjoye et al. (2015) used credit card data to reconstruct individual movements. Smart card data has been widely used by many researchers (e.g. Pelletier et al., 2011; Tao et al., 2014; Tamblay et al., 2016), with a range of applications in public transport planning. Location-based, social media check-in data from services such as Foursquare and Twitter has been used by a range of researchers (e.g. Hasan and Ukkusuri, 2014; Liu et al., 2014; Yang et al., 2014; Abdulazim et al., 2015; Hu and Jin, 2017) to attempt to estimate travel activity patterns, while Jestico et al. (2016) have investigated the potential of the activity tracking app STRAVA for measuring cycling volumes. Social media data have also been used to examine unexpected situations and special events (Pender et al., 2014; Pereira et al., 2015; Gu et al., 2016). Automatic vehicle identification has been used to understand complex travel activity patterns (Ozbay and Ercelebi, 2005; Siripirote et al., 2014). Perhaps most widely exploited for understanding travel/mobility patterns has been mobile phone data (e.g. Calabrese et al., 2013; Blondel et al., 2015; Widhalm et al., 2015) and GPS data (e.g. Frignani et al., 2010; Lin and Hsu, 2014; Montini et al., 2014; Tang et al., 2015; Gong et al., 2016). Returning to the original OECD list of data sources we might add other sources of transportation data not directly related to the ‘human condition’, such
⁎ Corresponding author.
E-mail address: [email protected] (D. Milne).
https://doi.org/10.1016/j.jtrangeo.2017.11.004
Received 9 February 2017; Received in revised form 7 October 2017; Accepted 6 November 2017
Available online 19 April 2018
0966-6923/ © 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/BY/4.0/).
D. Milne, D. Watling Journal of Transport Geography 76 (2019) 235–244
As authors we approach this topic from a background in transport modelling and have a particular interest in the opportunities, threats and permissive impacts that these emerging data sources may have for travel, and for our ability to understand and predict it in support of planning and decision-making. The focus of our paper is on medium-term transport planning and not on providing real-time system or personalised information, analytics and control, which have been covered elsewhere and which lead to a very different set of considerations (e.g. Zheng et al., 2013; Kitchin, 2014a, 2014b; Mori et al., 2015; Oh et al., 2015).

The paper will also avoid focussing on a single type of big data and its detailed applications, such as mobile phone records which have been covered quite extensively elsewhere (Bohte and Maat, 2009; Calabrese et al., 2014; Steenbruggen et al., 2015; Rojas IV et al., 2016). Instead, the paper will consider any kind of ‘big data’ (in the senses we shall define) and will discuss: how it differs from more traditional transport data sources; what those differences mean for the information it can provide; how different information has an impact on the analytical techniques that make use of it (from the perspectives of both analysing past data and devising predictive models), and what the implications of all this are for the practice of transport planning. Our ultimate aim is to try to identify what we believe are key questions and areas for debate, which might suggest foci for research regarding the potential impact of big data on the future of transport planning and policy assessment. Our starting point is a simple conceptual understanding of the relationship between data about the transport system and the practical discipline of transport planning. Our proposition is that, in the transport sector, analytical approaches, most of which could be referred to as models, have traditionally played a vital role in linking observable phenomena to decision-making. Therefore, as the volume and nature of observations change, it is necessary to consider how that may impact on models that are intrinsic to prevailing planning and decision-making processes.

The paper we present is deliberately a mix between a literature review and a think-piece about the implications of such new data opportunities for the theory and practice of transport geography. The content is divided into three major sections. First (in Section 2) we consider how the term ‘big data’ might be usefully defined in a transport planning context, paying particular regard to what is distinctive

2. Defining ‘big data’ from a transport planning perspective

There exist many possible general characterisations of big data. Some of those included in previous literature are provided in Table 1, and are explained further below.

The simplest characterisation of big data, typically acknowledged in all definitions, relates to a greatly increased volume of information compared to past expectations, and the fact that this can sometimes lead to practical difficulties (or complexity) for those who wish to analyse it using existing methods and tools. Velocity, on the other hand, refers to an increased temporal frequency of observation. References to variety, variability and veracity are all acknowledgements of the diversity likely to be inherent within non-purpose-oriented data that may mean it does not conform to convenient structures and definitions, or that it may lack accuracy. Other possible features identified include the suggestion that big data may provide complete (exhaustive) coverage of a system or population, and that it may be fine grained in resolution and detail (a quality potentially related to velocity). The final area on which some definitions focus is the potential ability of big data sources to be used in combination to provide novel insights through the relational qualities of what is recorded (e.g. the ability to link observations of different activities via the common fields of time and space) and the associated flexible nature of the information. This leads to the suggestion that big data could result in the end of theory because it implies a shift towards insight and understanding being driven by empirical evidence rather than starting from theoretical constructs.

Although thought provoking, the lack of agreement between these suggested combinations of features does not provide a sufficiently clear application-oriented definition for our purposes. From a transport planning perspective, the features described in Sections 2.1–2.7 (we propose) provide a more useful characterisation of emerging data sources that fit the ‘big data’ concept within our chosen context. We are not proposing that these features are necessary conditions that should all be met in order for data to be considered ‘big’ in a transport planning sense. Indeed, our aim is to move away from hard and fast definitions, because we suspect they are not helpful (especially as technologies change over time), and to focus instead on features that have the potential to change how people think about data and how they use it.
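
As a purely illustrative sketch of the relational quality mentioned in this section (the ability to link observations of different activities via the common fields of time and space), the following Python fragment pairs two hypothetical record sets, public transport tap-ins and card transactions, by zone and hour. The function names, zone identifiers and records are assumptions invented for illustration and are not drawn from any of the studies cited; real data sets would need careful treatment of spatial units, clock alignment and privacy.

```python
from collections import Counter
from datetime import datetime


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def link_by_zone_and_hour(tap_ins, transactions):
    """Pair per-cell counts from two sources using (zone, hour) as the key.

    Both inputs are iterables of (zone_id, timestamp) tuples. Only cells
    observed in both sources are returned, as (tap_in_count, sale_count).
    """
    taps = Counter((zone, hour_bucket(ts)) for zone, ts in tap_ins)
    sales = Counter((zone, hour_bucket(ts)) for zone, ts in transactions)
    return {key: (taps[key], sales[key]) for key in taps.keys() & sales.keys()}


# Hypothetical records, invented purely for illustration.
tap_ins = [
    ("A", datetime(2017, 1, 9, 8, 5)),
    ("A", datetime(2017, 1, 9, 8, 40)),
    ("B", datetime(2017, 1, 9, 9, 15)),
]
transactions = [
    ("A", datetime(2017, 1, 9, 8, 55)),
    ("C", datetime(2017, 1, 9, 8, 10)),
]

linked = link_by_zone_and_hour(tap_ins, transactions)
```

Restricting the result to cells present in both sources is a simplification chosen here; keeping unmatched cells (an outer join) may be preferable where the absence of one activity is itself informative.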
2.1. Continuous monitoring

‘Continuous monitoring’, as we shall define it, is perhaps one of the most important features in terms of the new opportunities it affords for analysis and insight. In our view, it refers to situations where data on attributes such as personal and vehicle mobility possess one or more of the following facets: (i) a high temporal sampling rate (i.e. observations are made so frequently that they are almost/seemingly continuous); (ii) monitoring for which there are no gaps (i.e. as opposed to a high temporal sampling rate covering 08:00–09:00 each working day but with no observations at other times); and (iii) data that are being recorded indefinitely (i.e. with no defined end-time of observation). There are many implications of such facets. For example, gapless historical monitoring means that it is possible to select a sample of past data for analysis as often as is useful, and potentially to go back in time as far as desired (to the start of monitoring, assuming data is retained indefinitely). Over time this creates a historical database that is always growing. Transport-related examples of this type of data might include data from mobile devices, such as cell phone records and Bluetooth data (Widhalm et al., 2015; Crawford et al., 2017b), electronic ticketing data for public transport (Tamblay et al., 2016), data from fixed sensor, GPS and automatic vehicle identification (AVI) technology such as loop detectors, automatic traffic counters, Trafficmaster and Automatic Number Plate Recognition (ANPR) cameras (Chow, 2016), and location-based social network data such as Foursquare and Strava (Hu and Jin, 2017; Jestico et al., 2016). Even if there are only resources to analyse a limited (but large) sample of observations for any particular study, this sample need not be time-constrained a priori (e.g. all observations in the last year), but might be sampled in some other way that allows temporal changes to be better understood (e.g. automatic traffic counts for a city covering all Januaries in the last ten years). Thus, such data differs from traditional transport-related surveys because there are no beginning and ending points imposing limited temporal windows during which information is available.

2.2. Data may not be owned by the data analyser

Data ownership and use are complex and controversial issues in situations where information related to people's activities and movements may be passively collected, is continuously monitored, could potentially be used to identify individuals, and is retained indefinitely (Mayer-Schönberger, 2010; Cate and Mayer-Schönberger, 2013; Mayer-Schönberger and Cukier, 2013). Traditional surveys of travel activity, based on active data collection, work on the assumption that any organisation which commissions surveys has ownership rights over the information gathered and is also responsible for ensuring that the personal rights of people surveyed are not infringed. As part of this, it is typically the case that data would only be passed to third parties in a form where it could not, for example, be used to identify individuals. In addition, the purposes for which a third party could use such data might be prescribed to exclude issues that could work against the interests of the data owners or fall beyond the consent considered to have been provided by participants during the survey process. This creates a challenge for the use of non-purpose-oriented data for transport planning purposes, as by definition it is likely that data will need to be transferred away from the original owners and used for purposes that were not originally envisaged. For example, it is not unusual for privately owned public transport operators to be unwilling to release ticketing data to transport planners on the grounds that it is commercially sensitive; likewise, mobile phone companies are often only willing to release movement trajectories in an aggregated form to disguise individual identities. For location-based social networks it may not always be clear whether data ownership rights lie with the host organisation or the individual who decided to post their information, and the existence of privacy settings may allow individuals to be selective about which information is available. However, researchers in this area have noted that privacy functions which provide a combination of anonymization, user consent and open access (e.g. the so-called ‘Foursquare-Twitter bridge’) potentially provide a convenient solution to data ownership problems, albeit at the cost of a possible significant reduction in sample size (Hu and Jin, 2017).

2.3. Data collection may not have been designed for the purpose

In addition to issues of ownership rights, privacy and costs associated with acquiring data for planning purposes, the reuse of information that was not designed for the purpose provides both challenges and opportunities.

Obvious problems related to data reuse are that the information available may not contain all the desired attributes or may not be structured in ways that are easily compatible with established methodologies. In the context of data relating to spatial mobility and travel activity, an example of this is that movement trajectories based on cell phones or Bluetooth devices are unlikely to contain information related to the purposes of journeys or the modes used, both of which would be collected as standard in traditional travel surveys. This has meant that new techniques have needed to be developed to estimate those elements (Abdulazim et al., 2013; Bhaskar and Chung, 2013; Bwambale et al., 2017; Crawford et al., 2017c). A further issue is that the samples included in external datasets may not be fully or proportionally representative of the populations and activities being considered during spatial and transport planning. For, despite Kitchin's (2013) assertion that big data may have an exhaustive quality, in the current era engagement with mobile technology tends to be skewed towards particular types of (typically younger) people (Yang et al., 2014; Sun and Li, 2015) and, for location-based social media data, technology usage levels vary considerably between individuals and between different times of day, and are skewed towards particular types of activity (Sun and Li, 2015; Hu and Jin, 2017).

Opportunities provided by non-purpose-oriented data at the current time might include larger volumes of information and greater levels of spatial and temporal detail than have previously been available from traditional surveys. In many situations, however, these advantages are likely to be temporary, as advances in technological developments over time, and associated reductions in cost, should enable transport planning organisations themselves to conduct purpose-oriented surveys with similar features. In contrast, the opportunity that is unique to big data is the potential to identify patterns that are unlikely to have been observed through traditional methods of investigation. This is an area which is yet to have a major impact on research within spatial and transport planning. As an illustration of the potential which transport studies may aspire to achieve, we may look to the field of medical research. In this field, Mayer-Schönberger (2016) reports work in which large volumes of routine monitoring data (e.g. heart-rate, pulse, breathing) were used to identify elevated risk of infection in premature babies at a much earlier stage than traditional methods (based on identifying known symptoms) were able to do. A key feature of this study was that it provided sound evidence of a correlation in an area that traditional methods would potentially not have explored, and thereby opened up the possibility of corrective interventions before the causal links were fully understood. In the area of human mobility, there has traditionally been great focus on quantitative predictions of travel activity and on the valuations of benefits resulting from planning interventions. However, these predictions and valuations frequently struggle to capture the complexities of (and temporal changes in) human populations and their behaviour, suggesting that the potential for new insights based on more diverse mobility-related datasets would appear to be immense.

2.4. Data may be acquired from outside transport as currently studied

A natural extension of the opportunities provided by non-purpose-
oriented data, discussed above, is the potential to access data beyond the traditional scope of studying spatial mobility and travel activity. The emergence of big data provides the opportunity to find new explanatory variables that are either beyond the currently-received wisdom, potentially requiring understandings and approaches from outside the transport field, or that were not previously quantifiable. In particular it allows for the possibility of wider exploration of data-mining approaches that may reveal new insights about patterns of behaviour and their causes.

For example, involvement of clinical health specialists in analysing information from wearable health monitors might be used to link data about individuals' physical activity and fitness levels to better understand the health impacts of different transport scenarios. Owen et al. (2012) used data from such monitors to explore the relationship between method of travel and mean level of physical activity for children on their journey to school. Similarly, Oliver et al. (2010) considered the potential of GIS, GPS and accelerometry data for studying transport-related physical activity. Subsequently, Carlson et al. (2015) used a similar range of data sources to examine the relation between neighborhood ‘walkability’ and levels of active travel. In time these approaches may be used to challenge existing theories about travel behaviour, which typically ignore links to physical activity, and hypothesise new ones.

Separately, both De Montjoye et al. (2015) and Lenormand et al. (2015) have used credit card transaction data to investigate spatial and temporal mobility and its relationships to spending patterns. While this type of analysis is always likely to lead to debates about ethical considerations and what can be done to overcome them (Sánchez et al., 2015; De Montjoye and Pentland, 2015), research of this nature has the potential to provide new insights into the ways in which the activities that mobility supports shape and influence travel behaviour.

2.5. There may be an ability to link multiple contemporaneous data sources

The fact that some data sources may be continuously monitored provides the potential for temporal overlap (and thus temporal analysis) of different data sources that may previously have been considered to have only a circumstantial relation. The data may, at one extreme, be related at the level of individuals (e.g. Philips et al., 2017) or may be on a higher level of aggregation (e.g. spatial, demographic). Building on the example of spatially structured data on financial transactions discussed in Section 2.4, such information might be analysed temporally alongside data relating to movements of people and of different types of vehicles and, potentially, even matched with data about weather conditions to lead to better understandings about seasonal variations in spatial activity and travel behaviour.

While this is currently a relatively new research area, a good example can be found in Pereira et al. (2015). This study combined internet data regarding special events with electronic public transport ticket tap-in/tap-out data (for a case study of Singapore), to develop a predictive model of passenger arrivals at event venue locations. They succeeded in improving the quality of transport predictions under special event scenarios, which they claim could lead to a greater ability to plan for and manage such situations in the future.

2.6. Data of sufficient scale to apply statistical inference techniques

In the transport discipline there has traditionally been a paucity of both research and applications using statistical techniques to understand behaviour. One reason has been the previous difficulty in obtaining reasonable amounts of repeated data in a similar environment (e.g. data about origin to destination movements), due to the costs and disruption involved in collecting it. Obtaining sufficient data on the travel behaviour of a city population, when the city and phenomena within it are changing in an uncontrolled way, is a problem to which big data may provide a solution. For example, Crawford et al. (2017a) recently analysed long periods of recorded traffic flows in order to identify systematic sources of variation, such as day-of-week and seasonal effects.

2.7. Synthesis and relationship of big data to traditional information sources

Rather than attempting to define big data in the ways that other authors have done, in this section so far we have presented a series of features which we believe many emerging data sources may possess, and have discussed the main issues they appear to raise for research about spatial mobility and travel activity. Our central argument is that ‘bigness’ in terms of the number of data elements, while potentially significant from a practical logistical perspective, may be largely irrelevant from a conceptual viewpoint for distinguishing what is special about big data. We believe this view is justified because the spatial and transport planning context we are considering does not rely greatly on speed of analysis. This might be contrasted with an application in which, say, an automated vehicle was being directed through machine vision, during which there is a need to process very large amounts of image data very quickly to ensure safe operation. By contrast, we are primarily interested in understanding and planning for longer term trends. In fact, we think our discussion has wider applicability: even in a case where the objectives include some form of dynamic management of the transport system in response to real-time information, it is likely that the most effective strategies will involve solutions that are at least partly based on experience of similar past events, rather than those that focus solely on a rapid large-scale analysis of the present.

However, it is important to acknowledge that the features we have identified may not be exclusive to big data. Even within more traditional data sources there are likely to be elements of continuous monitoring (e.g. automated traffic counts, public transport ticket receipts), situations where data is not owned by planning authorities (e.g. private car park arrival and duration data), reuse of data collected for other purposes (e.g. use of vehicle licensing data to estimate traffic-related emissions) and combined use of multiple data sources including data from outside the transport sector (e.g. the use of demographic data from the census and business directory data, together, to estimate spatial patterns of travel demand).

In addition, it is true that not all traditional survey data can truly be considered ‘small’. A population census, while only carried out periodically, may collect a wide range of data from every citizen. Likewise, other more focussed data sources related to, for example, land and property purchase, vehicle ownership, or road traffic accidents may be expected to represent a somewhat complete coverage of the phenomenon measured. There has been longstanding use of this sort of ‘medium sized’ digital data, in conjunction with digital spatial mapping, leading to a considerable body of research that provides analytical insights about human mobility within GIS environments. An approach that has proved particularly useful for investigating travel activity over time has been exploratory spatial data analysis (ESDA) (Buliung and Kanaroglou, 2004), which uses large-scale traditional survey data in combination with an object oriented analysis and design (OOAD) methodology to produce spatial patterns and correlations. While not really fitting the characterisations of big data that we have proposed, this type of work might be considered to have some ‘big’ qualities, and so is particularly interesting as a comparator against which to discuss emerging data.

In particular, we have in mind the ability of such a methodology to link large (albeit traditionally surveyed) datasets of travel activity and other information, such as demographic, socio-economic and business-related survey data. This research approach continues to be popular in the study of mobility and to have considerable value. For example, Rybarczyk and Wu (2010) used an ESDA approach in conjunction with multi-criteria analysis to propose better ways for planning urban cycling facilities in Milwaukee City, while da Silva et al. (2014) used ESDA approaches in conjunction with population census data and information about road infrastructure to investigate the definition of
urban regions in Brazil. An explicit aim of the latter study was to develop methods for use in developing countries where more detailed information (including, by implication, big data) may not be available. More recently, Buckwalter (2017) has used socioeconomic census data in conjunction with ESDA to investigate mode choice for journeys to work in Pittsburgh, while Loidl et al. (2016) used an exploratory spatial and temporal analysis to identify patterns of bicycle accidents in Salzburg.

There are some significant similarities between the ESDA body of research and some of the transport-related work using big data, particularly regarding studies that focus primarily on investigating spatial patterns of activity. However, there are also some key differences concerned with the ways in which research based on big data tends to focus more on high levels of resolution, especially temporal resolution, which is typically missing from traditional datasets. Against that, a critical advantage of medium-sized digital data from traditional surveys is the wealth of explanatory variables that tend to be available, rather than potentially needing to be inferred, meaning that GIS-based ESDA analysis may be more powerful for investigating differences between people and for suggesting more sophisticated aspects of human behaviour. Overall, a big data revolution should certainly not decrease the need for this sort of research, though over time it may impact on the types of data that are available to analyse, if it causes traditional surveys to go out of fashion.

Considering the comparative advantages of different types of data naturally leads to observations about both data collection processes and the nature of information generated. It is reasonably well established that traditional surveys of travel demand and behaviour are not only time consuming and expensive, they can also prove to be both disruptive for the transport system and intrusive for individuals. Yang et al. (2014) identify all these concerns related to the use of household and roadside interview surveys for deriving matrices of origin to destination movements. They argue that this constrains the feasible volume of data to a limited sample at a fixed point in time that is subject to potential sampling bias. Similarly, Abdulazim et al. (2013) consider traditional travel diary surveys to be “non-respondent friendly” because they require participants to put in significant effort to recall and record their activities. They argue that this may affect data accuracy and that it results in the approach being inappropriate for use beyond a few days, meaning that there is little scope for capturing variations. By contrast, approaches which automate data collection through mobile devices offer the potential for the process to be almost invisible, facilitating study over longer periods and aiding data accuracy through both increased sampling and removal of reliance on human responses. Nevertheless a variety of logistical, technical, cost and sampling concerns remain regarding GPS, mobile phone and Bluetooth data. This has led some to view the most promising direction to be open access, location-based, social network data, made available through platforms such as Twitter (Yang et al., 2014).

With regard to the information generated by different data sources, Hu and Jin (2017) have carried out a particularly thorough audit of the pertinent characteristics. They identify low levels of spatial and temporal resolution as the primary drawback of traditional surveys, while

information to replace traditional surveys within existing types of analysis. As a result, the literature contains few examples of big data being used to provide novel insights about mobility in ways that have not previously been considered. Mayer-Schönberger (2016) presents a different conceptual perspective in which the role of big data is seen as “reshaping the scientific method” towards inductive approaches at the expense of deductivism, implying a new study environment that makes use of a broad range of data to reveal the potentially unexpected rather than focusing on the narrower scope of information that fits existing theories and methods. This corresponds to Anderson's (2008) “end of theory” definition, though Mayer-Schönberger is clear that a big data revolution should not require us to “abandon the search for causes”, and that the process of discovery has always been iterative.

3. Using big data for analysis and modelling of transport systems

The nature of the information provided by the big data sources discussed in this paper is likely to be rather different to that from data derived via traditional population and transport-related surveys. This will have implications for data-driven analysis and modelling activities that support and help justify transport planning and policy decisions.

3.1. Traditional transport planning data and its drawbacks

The data that has traditionally informed transport planning has primarily been provided by manual surveys of people and their travel behaviour, for estimating travel demand, by manual mapping, service level and landscape surveys, for estimating transport supply, and by a mixture of manual and automated surveys of movements and transactions, for calibrating flows. In addition to the possibility that useful insights into the performance of transport systems might come from new types of analysis of previously unconsidered external data sources, the main expectation regarding the role of big data in transport planning is that it will replace much of the information that has previously been collected manually. Much of the focus is on digital data that is already being collected, some of it outside the transport sector and for commercial purposes, such data arising as part of the increasing use of computer-based systems for managing human activities and transactions.

Reasons for replacing manual data sources are not simply related to perceptions of the benefits big data might bring and include a number of ‘push’ factors, such as the issues of cost, time, intrusion and disruption already discussed in Section 2.7. Manual data collection has always been both expensive and time consuming, which has often restricted the volume of data that could be obtained to levels below statistically defensible samples. In addition, some types of manual survey (such as on-street origin to destination surveys of traffic movements) are disruptive, making them politically unpopular and, potentially, prone to errors due to the influence on behaviours the data collection may induce (e.g. drivers re-routing to avoid disruptions caused by roadside interview surveys).

3.2. Features of big data in a transport planning context
low levels of sampling bias and the potential to collect data for a variety
of explanatory variables for travel (e.g. mode, journey purpose and The fact that big data offers opportunities to resolve problems with
social demographics) are presented as the main advantages. By con- traditional data, such as those discussed in Section 3.1, does not ne-
trast, they judge most of the big data approaches to offer higher levels cessarily mean that information will be better in all respects. Differ-
of resolution, but with some inevitable sampling bias and a frequent ences in the nature of information that might be expected as a result of a
need to infer explanatory characteristics. Yang et al. (2014) argue that move away from manual data collection include:
location-based social network data have some “unique advantages” that
may provide potential to overcome the shortcomings of other sources. • origins and focus of data;
These advantages are related to additional activity-related information • volume of data collected;
that is attached to check-in locations (providing some explanations for • range and differentiation within data;
journeys) and growing levels of penetration (reducing sampling bias). • sampling of data observations;
The limitation of most work to date that uses big data in a transport • nature of errors and omissions.
planning context is that it focuses almost entirely on the ability of the
D. Milne, D. Watling Journal of Transport Geography 76 (2019) 235–244
The origins and focus of data might be expected to affect the information provided in respect of its
ability to describe phenomena of interest for transport planning. Data acquired from third parties
and collected for other purposes may have been defined in ways that limit detail (e.g. constraints
on spatial resolution to protect privacy) or in ways that reduce the scope of the information (e.g.
by having no link to individuals, so that it is impossible to measure the repeatability of activity
and mobility patterns from day to day).

It is normally assumed that big data will provide a significant increase in the volume of
information available, but that expectation may present challenges particularly in the case of
continuous monitoring. Whereas the traditional emphasis in analysis and modelling of transport
systems has focussed on attempting to represent long-run average conditions from relatively small
amounts of data, it might be expected to change towards attempting to identify short-run stability
within far more comprehensive datasets that include variations by hour, by day, by season and
related to specific events. It also seems plausible that increasing volumes of data will be
accompanied by rising expectations of what the information can be used for. In addition, it may
result in demands from decision-makers that models of transport systems should cater for variations
to a much greater extent than has been the case before.

By contrast, the range of data collected may actually reduce with a shift away from manual surveys.
Probably the most common focus of work to replace manual data with big data in the transport sector
is the use of information produced by commercial mobile devices (e.g. smartphone and Bluetooth
movement trajectories), in order to replace manual surveys for estimating travel demand (Toole et
al., 2015). This involves the acquisition of data collected independent of a transport planning
context which provides potentially very detailed information about the movements of people in space
and time. We have already touched on the nature of the information generated by the new data sources
in Section 2.7. In this regard, although the volume of data might be expected to be much greater
than from (say) traditional roadside origin to destination surveys, the range of information
directly available from passive sources is likely to be significantly reduced, with no ability to
differentiate features that add meaning to our understanding of travel such as mode, vehicle type,
vehicle occupancy, journey purpose and various demographic features. In addition, the spatial range
of the information may be compromised by difficulties in identifying the precise start and end
points of journeys, which can only be inferred from movement patterns. Some significant progress has
been made over a period of time towards developing new analytical approaches to address these issues
(Bohte and Maat, 2009; Diao et al., 2015; Çolak et al., 2015), but it is acknowledged that
significant issues remain especially for widespread practical application, including within more
detailed settings (Rojas IV et al., 2016).

It is also to be expected that, in many digital data scenarios, sampling will not be random and the
implications of that, both for individuals and their activities, will need to be dealt with. It
seems very likely that, in the current era at least, observations based on transaction and tracking
data will be skewed towards the most economically active, most technologically equipped and,
potentially, younger members of society. For example, in studies using mobile phone data it has been
acknowledged that there are problems associated with variations across the population in levels of
phone ownership and use (Rojas IV et al., 2016).

Finally, the nature of errors and omissions should be expected to differ significantly between
digital and manual datasets. In traditional manual transport surveys, errors are most likely to
occur due to incorrect recording of observations, a problem that may be difficult to quantify and
control for without duplication of effort (Watling et al., 2012). Omissions, on the other hand, are
typically the result of constraints on time and resources or of contextual problems related to the
feasibility of manual surveys. Automatically recorded digital datasets might be expected to
eliminate recording errors if the data collection procedures are well designed, but they may lack an
ability to question apparently illogical observations (which, for example, a human interviewer
carrying out a roadside origin to destination survey can do). They are also likely to be much more
prone to errors that might be introduced during the handling and transfer of large volumes of raw
data into a usable format for analysis. Omissions are most likely to relate to technological failure
and can lead to a complete loss of usable observations for the duration of the problem, though that
may be compensated for by the greater volumes of data available overall. However, in studies using
GPS and mobile phone data, loss of signal has sometimes been found to be a significant constraint on
data quality (Rojas IV et al., 2016).

3.3. Implications for planning and modelling transport systems

Current trends suggest that it is inevitable that automatically recorded, digital data will come
into mainstream use both for academic study and for the practical planning of transport systems.
However, alongside this trend, what also seems likely is that inputs from more traditional ‘small
data’ sources will still be necessary, in order to make up for the lack in many ‘big data’ sources
of demographic information and other unobserved elements related to individuals and activities, all
of which add important meaning, context and motivation to the information. At the same time, it is
possible that features of digital data may lead to a greater focus on generic and transferable
understanding of travel, across different contexts, scenarios, cities, and over time (see, for
example, the plethora of works inspired by complexity science on the search for ‘universal laws’,
such as the city scaling laws studied by Cebrat and Sobczyński, 2016). Such a greater pooling of
information from different situations would challenge the traditional focus of transport planning on
particular case studies, and the understanding of phenomena for specific locations and times.

For the discipline of modelling transport systems, the greatest likelihood would appear to be that
models will become more empirically-based as a result of such a data revolution. On the other hand,
it may be the case that more data will facilitate a greater number of opportunities to test theories
against real-world evidence. Modelling idealisations, such as equilibrium, economic-man, gravity and
value-of-time, may begin to be seen as less important. That may, in turn, add fuel to debates about
economic evaluation and decision-making. It may as a result open up new ways of understanding
impacts of infrastructure changes, beyond conceptually limited calculations of travel time savings
for existing patterns of journeys over fixed and relatively short time horizons. Indeed, new
opportunities to examine longitudinal effects could result in more focus on transient properties,
periods of change and drivers of change. In time it should become possible to understand the
influence of more factors within transport systems, including longer-term issues and factors that
have previously been very difficult to quantify, such as the effect of political cycles if data
spans several parliamentary periods. Overall, it would be no surprise to see modelling become more
data-driven with an influx of pattern-matching approaches, potentially at the expense of more
subjective approximations. It also seems likely that there would be increasing cross-uses of data,
even for conventional modelling (such as use of engineering data to calibrate behavioural models and
vice versa) as part of previously ignored relationships between variables being identified. Related
to this, greater use of techniques such as machine learning and data analytics would be expected, to
gain new insights into critical elements.

However, changes are unlikely to be limited to analytical and modelling practices, with transport
systems themselves likely to evolve in response to new information. This is something that models
and the policies they are used to justify will need to account for. Certainly, it seems inevitable
that providers of transport systems (e.g. public transport companies and authorities responsible for
road network management) will increasingly use real-time information as part of their operations,
possibly from competing information providers. This will
affect the real ways in which we travel. For example, information that makes predictions of incident
impacts will make travellers more aware of unreliability and, thus, it should be expected that they
may increasingly factor it into their typical behaviour, and this behavioural adaptation is, in
turn, something we will then need to understand for modelling and planning. Alongside the use of
data by providers, as individuals receive information that is more personally-tailored they will
find it easier to make choices to satisfy their requirements. However, the greater the number of
information providers, the more difficult it may become to coordinate that information and control
policies based upon it.

A policy-related research area that builds on the ideas of data technology in transport is ‘Mobility
as a Service’ (MaaS), which relies on the understanding that digital information can be used to
coordinate inter-modal transport alternatives for individuals. The aim of MaaS is to provide them
with door-to-door service options for journeys that would previously have involved a series of
different information sources and, potentially, financial transactions (Ambrosino et al., 2016). The
basic dimensions of the MaaS concept are still being developed, but a common assumption of all of
them is the use of personal mobile devices in real time for travel information and associated
transactions. The original underlying idea behind MaaS (Heikkilä, 2014) was to reduce reliance on
the private car in urban areas by matching the door-to-door service it provides with other modes.
There was also, potentially, an implicit aim to increase the coordinating power of public agencies
with responsibility for transport, towards achieving greater integration of services and payment
across all the different providers. However, as the MaaS idea is being disseminated through
different environments, multiple interpretations are currently emerging, including the possibility
that a technology-based private provider of transport, such as Uber, might become a major driving
force of integration in some situations. These institutional and political choices may have a
profound impact on the effects of data on the transport system, as well as on its availability and
use for planning purposes.

4. The opportunity potential of big data for transport planning

4.1. Opportunities with big data

A major feature of big data that has played a significant part in its adoption in other fields is
that it allows analysis at a more ‘raw’ level, free of assumptions sometimes made in converting raw
data to a manageable form (e.g. ‘mechanisms’ to convert inductive loop data to vehicle counts).
Continuous monitoring allows the study of new kinds of variation (time-of-day, day-to-day,
time-of-year, scenario-specific) to correlate with data on events/weather, and to monitor unexpected
events or disasters (e.g. the bridge collapse studied by Zhu et al., 2010; or the earthquake/tsunami
studied by Hara and Kuwahara, 2015).

More widespread monitoring may also allow finer disaggregation of effects and more opportunity to
study small and/or disadvantaged groups. As we have mentioned earlier, there are concerns about the
representativeness of some of the new data sources (e.g. skewed towards younger people, or biased
away from certain groups), but what might at first seem contradictory is that such sources could
still open up possibilities for studying minority groups, even if such groups are under-represented
in the data, due to the sheer scale of the overall data set. As an example, suppose that a minority
group forms 0.1% (1 in 1000) of the population of a large city. Through a traditional travel survey,
1000 individuals are randomly sampled (i.e. an unbiased sample), meaning that on average we will
only observe one individual (and perhaps in a particular case no individual) from the minority
group. Now a new form of passive data provides information on 100,000 individuals, but in this data
set it is known that the minority group is under-represented, and so only makes up 0.05% (1 in 2000)
of the sample. In spite of this, 0.05% of 100,000 means that on average 50 individuals of the
minority group will be observed, which seemingly gives sufficient data for some kind of focus on
that group specifically. It may be that the sampled individuals of that group are in some sense
atypical of all members of it, in which case we would need to take care in making any inference, but
if not atypical we would also have a basis for drawing implications about the minority population of
the city.

Furthermore, big data provides an opportunity to become less reliant on stated preference approaches
because there is more chance of obtaining revealed data of the same individuals in a variety of
contexts, or of obtaining/inferring perceptions of non-chosen options. It also provides potential to
develop transferable behavioural models with more explanatory factors, due to much larger sample
sizes which may be applicable to a wider range of policy contexts, socio-political backdrops and
locales, rather than just marginal changes from the present, as is often the case in current
studies.

There is also the possibility that new symbiotic relationships could emerge between data owners and
planning agencies, under which information could be provided for mutual benefit. A longstanding
example of symbiosis potentially similar to the big data context is that, since the beginning of
flight, aircraft have made observations about the weather for their own safety reasons, but since
World War I they have also provided information to aid wider understanding of meteorological
processes. Automated reports of aircraft observations have been available since 1979 and the
subsequent growth in commercial airline activity means that they now play a vital role in improving
the performance of weather prediction models across the globe (Moninger et al., 2003). This has
provided benefits to meteorological agencies serving wider populations, but it has also clearly been
a benefit to the airline industry too as both aircraft performance and safety have been improved
through better weather predictions. Initially this type of scenario may constitute ‘data reuse’ to
provide additional benefits. However, over time, such data may evolve with the explicit intention
that it can serve multiple purposes.

Changes within the data and modelling spheres are bound to have knock-on impacts for practical
transport planners, in particular related to the expectations to which they are subject.
Intuitively, greater availability of detailed data from continuous monitoring seems likely to lead
to greater expectations of focussed planning for more specific situations than is the norm at
present. For example, planners may be expected to be capable of creating policies that deal with
differences by day of week, season and weather condition.

In parallel, new data and analytical approaches should empower transport planners. The ability to
detect unexpected trends and changes may give planning the opportunity to become more contingent. In
addition, rather than needing to carry out all analysis of the potential impacts of policy ideas in
advance, potentially leading to considerable resource costs and delays before anything is
implemented, continuous monitoring may provide opportunities for more trial and error approaches to
policymaking, with data giving continuous feedback. This may help facilitate policies that encourage
gradual changes based on a series of ‘nudges’ rather than sudden step-changes. The types of data
anticipated will also be highly suited to visualisation, which would fit well with moves towards
more public participation in planning processes. For example, mobile phone data has been used to
produce both temporal density maps (Ahas et al., 2015) and spatio-temporal trajectories (Gao, 2015).

Finally, a major failing of transport planning in the past has been a paucity of ex-post studies to
check how forecast outcomes of the impacts of decisions compare to reality. A key reason for this
has been a lack of sufficient appropriate data (Nicolaisen and Driscoll, 2014) which is often
related to lack of funding, despite significant evidence that such work improves the quality of
predictions (ITF, 2017). One reason ex-post studies may sometimes have been avoided is the threat
they pose to political capital given the inevitable risk that impacts of decisions may not appear as
good as expected. Big data should provide the potential to address this by providing much more
evidence on which such studies could be based. Indeed, sufficient data may be openly available to
allow well informed independent studies to be
conducted. The use of independent organisations to audit transport planning decisions is one of the
central recommendations of the most comprehensive international review to date about ex-post
assessment in the transport sector (ITF, 2017).

4.2. Limitations and difficulties associated with big data

Clearly, there are, and will continue to be, many technical, analytical and computational issues
associated with using the types of data discussed in this paper. Great advances are already being
made in machine learning, data science, data analytics, and the ease with which we may interface
with and use device data. However, a particular question that remains to be resolved is how
analytical methods can be ‘future-proofed’, if there is no guarantee that the same data will
continue to be available indefinitely in the same form. A key risk of non-purpose-oriented data is
that the external providers may undergo a change of priorities in which data and how much data they
collect, and in their willingness to allow it to be used for other purposes (let alone any access
charges). Even if those problems do not arise, technological changes are bound to lead to a
multitude of compatibility issues over time that could reduce the temporal power of data
considerably, unless appropriate common standards can be adopted.

For situations where data is not open access, another key question is how data owners will respond
over time to the understanding that their information has value. It is already clear that mobile
phone operators, and others who acquire data through technology-based operations, are interested in
charging considerable sums to public planning agencies for access to their information as a
substitute for traditional travel activity surveys. It is possible that data owners will go further
and explicitly seek out new markets to sell information for commercial gain. Though in some cases
the retention of value over time may be dependent on the ability of data owners to secure widespread
public engagement in ways that are useful for planning purposes, it nevertheless seems likely that
potential sources of data will increase in number and decrease in cost.

The inherently invasive nature of many data sources, especially those involving continuous
monitoring, also leads to potential for considerable issues related to privacy and its trade-off
with data fidelity. Mobile sensors, such as cellular phones and other portable devices with GPS and
Bluetooth capability, offer potential to provide a wealth of information about human mobility
behaviour in time and space. However, the information poses a risk that individuals could be
identified and their detailed movements tracked, leading many data owners to filter information and
reduce spatial and/or temporal accuracy before allowing it to be reused. This typically reduces its
potential for providing insights, prompting research to identify approaches that can protect privacy
without losing the fine-grained qualities of the original data (Sun et al., 2013).

Beyond basic concerns about individual privacy, the collection and use of significant volumes of
data brings with it potential for significant ethical issues. For example, data may be used for (and
may even lead to) the targeting of policies at different population sub-groups. As a hypothetical
illustration, if information were to become available to demonstrate that certain people have a
particular genetic make-up that causes them to be more pre-disposed to have road traffic accidents,
then that information could be used both to aid technological advances to make them safer and to
discriminate against them in the commercial insurance market. This leads to the question of whether
there should be controls on the types of uses of new data and the findings based upon it. In the
case of insurance, the traditional underlying concept of ‘human solidarity based on ignorance’
(Mayer-Schönberger and Cukier, 2013) may be put at risk if appropriate safeguards are not adopted; a
similar concept of solidarity based on a ‘veil of ignorance’ has already been argued for healthcare
(ter Meulen, 2016).

Finally, there may be environmental implications of assuming unconstrained data opportunities in the
sense that the capacity for data generation, storage and transmission may not always be infinite.
Evidence already exists related to the uptake of streaming and cloud storage/retrieval that the
energy implications are far from insignificant and could potentially grow over time to become a
comparable problem to the energy used for physical movement (Mills, 2013), though there is also
potentially evidence of a trade-off between ICT and physical movement (Gelenbe and Caseau, 2015). It
may be useful to compare this situation with the historic uptake of other technologies, such as the
motor car, where initial perceptions of a wholly positive future without any constraints on capacity
have proved wide of the mark. Certainly, if data capacity does need to be constrained in future that
may be expected to put pressure on all actors to ensure multi-functionality.

5. Concluding remarks

This paper has attempted to open up thinking on what the emergence of new digital big data sources
may mean for transport planning and for the analytical research and modelling that supports it. In
the process it has been able to reach few firm conclusions. However, from our discussion, two points
do emerge strongly.

First, research regarding the potential of big data for transport planning needs to think about more
than how big data can make it easier to pursue existing approaches (e.g. to derive a demand pattern
in the form of an origin to destination matrix, or to correlate trip rates to trip lengths). Rather,
it needs to engage in a fundamental reassessment of what data can tell us about transport systems
that helps us better understand how they function and what we can do to influence them in positive
ways.

Second, the areas of data, predictive models and planning are a triple that must be considered
together. If, for whatever reason, one of them undergoes a significant change, then they must all
adapt. In the case of big data, it is that adaptation process which is likely to prove critical if
we are to unlock the greatest potential for improvements in transport planning and policy-making
that increased volumes of information may allow, while addressing issues that may limit its range
and quality.

From our discussion we believe there are a number of ‘big challenges for big data’ that need to be
addressed in order to understand and gain most benefit from the transitions that lie ahead. These
might be briefly summarised as:

1. developing a clear understanding of how data related to transport systems is likely to change
compared to traditional ‘small data’ sources;
2. tackling limitations to the information that emerging data sources can provide to attempt both to
avoid loss of useful detail and to maximise benefits from new features;
3. opening up transport planning to new opportunities for using data, analytical approaches and
specialist understandings outside the traditional scope of the discipline;
4. identifying and measuring the impacts that new data sources have on real transport systems,
through the private and commercial use of information and its impacts on travel patterns and related
behaviours;
5. re-specifying analytical and predictive modelling approaches in response to the modified data
landscape and the new insights it facilitates; and
6. reconsidering the relationships that data analysis and predictive modelling have with transport
planning, policy formulation and decision making, to try to ensure that interventions are based on
the best understanding and information available.

We have discussed some issues relevant to these challenges, but much more remains to be done.
Perhaps the most important message may be that, in our response to new data opportunities, we should
avoid attempting to build new analytical approaches, models and planning
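The sampling illustration in Section 4.1 (a minority group forming 0.1% of a city's population, observed through a 1000-person survey or a 100,000-person passive data set) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names `expected_count` and `prob_none_observed` are hypothetical, and the probability calculation assumes each sampled individual is an independent draw, which the paper does not state and which is unlikely to hold exactly for passively collected data.

```python
def expected_count(sample_size: int, share: float) -> float:
    """Expected number of sampled individuals who belong to the group."""
    return sample_size * share


def prob_none_observed(sample_size: int, share: float) -> float:
    """Probability that the sample contains no group members at all.

    Assumption (not from the paper): every sampled individual is an
    independent draw with the same probability of belonging to the group.
    """
    return (1.0 - share) ** sample_size


# Traditional survey: 1000 people, group forms 0.1% of those sampled.
# Passive data: 100,000 people, group under-represented at 0.05%.
scenarios = {
    "traditional survey": (1_000, 0.001),
    "passive data": (100_000, 0.0005),
}

for label, (n, share) in scenarios.items():
    print(
        f"{label}: expected group members = {expected_count(n, share):.1f}, "
        f"P(no members observed) = {prob_none_observed(n, share):.3g}"
    )
```

Under these assumptions the survey has roughly a 37% chance of containing nobody from the group, which is the "in a particular case no individual" situation described in the text, whereas for the passive data set that probability is negligible. The calculation says nothing about whether the 50 or so observed individuals are typical of the group, which is the caveat the paper itself raises.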
Liu, Y., Sui, Z., Kang, C., Gao, Y., 2014. Uncovering patterns of inter-urban trip and spatial
interaction from social media check-in data. PLoS One 9 (1), e86026.
Loidl, M., Traun, C., Wallentin, G., 2016. Spatial patterns and temporal dynamics of urban bicycle
crashes: a case study from Salzburg (Austria). J. Transp. Geogr. 52, 38–50.
Maréchal, S., 2016. Modelling the acquisition and use of information sources during travel
disruption. In: Paper presented at the 48th UTSG Annual Conference. University of the West of
England and University of Bristol.
Mayer-Schönberger, V., 2010. Beyond privacy, beyond rights-toward a "systems" theory of information
governance. Calif. Law Rev. 98 (6), 1853–1885.
Mayer-Schönberger, V., 2016. Big data for cardiology: novel discovery? Eur. Heart J. 37, 996–1001.
Mayer-Schönberger, V., Cukier, K., 2013. Big Data: A Revolution that will Transform how we Live,
Work, and Think. John Murray, London.
Mazzocchi, F., 2015. Could big data be the end of theory in science? EMBO Rep. 16, 1250–1255.
http://dx.doi.org/10.15252/embr.201541001.
McCarthy, O.T., Caulfield, B., O'Mahoney, M., 2016. Technology engagement and privacy: a cluster
analysis of reported social network use among transport survey respondents. Transp. Res. C 63,
195–206.
ter Meulen, R., 2016. Solidarity, justice, and recognition of the other. Theor. Med. Bioeth. 37 (6),
517–529.
Mills, M., 2013. The Cloud Begins with Coal: Big Data, Big Networks, Big Infrastructure and Big
Power. (Report produced by the Digital Power Group).
Moninger, W.R., Mamrosh, R.D., Pauley, P.M., 2003. Automated meteorological reports from commercial
aircraft. Bull. Am. Meteorol. Soc. 84, 203–216.
Montini, L., Rieser-Schüssler, N., Horni, A., Axhausen, K., 2014. Trip purpose identification from
GPS tracks. Transp. Res. Rec. 2405, 16–23.
Mori, U., Mendiburu, A., Álvarez, M., Lozano, J.A., 2015. A review of travel time estimation and
forecasting for advanced traveller information systems. Transportmetrica A Transp. Sci. 11, 119–157.
Nicolaisen, S.M., Driscoll, P.A., 2014. Ex-post evaluations of demand forecast accuracy: a
literature review. Transp. Rev. 34 (4), 540–557.
OECD, 2013. New data for understanding the human condition: International perspectives. OECD Global
Science Forum Report, February 2013.
Philips, I., Clarke, G., Watling, D., 2017. A fine grained hybrid spatial microsimulation technique
for generating detailed synthetic individuals from multiple data sources: an application to walking
and cycling. Int. J. Microsimul. 10 (1), 167–200.
Rojas IV, M.B., Sadeghvaziri, E., Jin, X., 2016. Comprehensive review of travel behavior and
mobility pattern studies that used mobile phone data. Transp. Res. Rec. 2563, 71–79.
Rybarczyk, G., Wu, C., 2010. Bicycle facility planning using GIS and multi-criteria decision
analysis. Appl. Geogr. 30, 282–293.
Saadi, I., Boussauw, K., Teller, J., Cools, M., 2016. Trends in regional jobs-housing proximity
based on the minimum commute: the case of Belgium. J. Transp. Geogr. 57, 171–183.
Sánchez, D., Martínez, S., Domingo-Ferrer, J., 2015. Comment on “Unique in the shopping mall: on the
reidentifiability of credit card metadata”. Science 351 (1274-a).
da Silva, A.N.R., Manzato, G.G., Pereira, H.T.S., 2014. Defining functional urban regions in Bahia,
Brazil, using roadway coverage and population density variables. J. Transp. Geogr. 36, 79–88.
Siripirote, T., Sumalee, A., Watling, D.P., Shao, H., 2014. Updating of travel behavior parameters
and estimation of vehicle trip-chain data based on plate scanning. J. Intell. Transp. Syst. 18,
393–409.
Steenbruggen, J., Tranos, E., Nijkamp, P., 2015. Data from mobile phone operators: a tool for
smarter cities? Telecommun. Policy 39, 335–346.
Sun, Y., Li, M., 2015. Investigation of travel and activity patterns using location-based social
network data: a case study of active mobile social media users. ISPRS Int. J. Geo Inform. 4,
1512–1529.
Sun, Z., Zan, B., Can, X., Gruteser, M., 2013. Privacy protection method for fine-grained urban
traffic modeling using mobile sensors. Transp. Res. B 56, 50–69.
Tamblay, S., Galilea, P., Iglesias, P., Raveau, S., Carlos, J., 2016. A zonal inference model based
on observed smart-card transactions for Santiago de Chile. Transp. Res. A Policy Pract. 84, 44–54.
Tang, J., Liu, F., Wang, Y., Wang, H., 2015. Uncovering urban human mobility from large scale taxi
GPS data. Phys. A Stat. Mech. Appl. 438, 140–153.
Tao, S., Rohde, D., Corcoran, J., 2014. Examining the spatial–temporal dynamics of bus passenger
travel behaviour using smart card data and the flow-comap. J. Transp.
Oh, S., Byon, Y.-J., Jang, K., Yeo, H., 2015. Short-term travel-time prediction on highway: Geogr. 41, 21–36.
a review of the data-driven approach. Transp. Rev. 35, 4–32. Toole, J., Çolak, S., Sturt, B., Alexander, L.P., Evsukoff, A., González, M.C., 2015. The
Oliver, M., Badland, H., Mavoa, S., Duncan, M.J., Duncan, S., 2010. Combining GPS, GIS, path most traveled: travel demand estimation using big data resources. Transp. Res. C
and Accelerometry: methodological issues in the assessment of location and intensity 58, 162–177.
of travel behaviors. J. Phys. Act. Health 7, 102–108. Watling, D.P., Milne, D.S., Clark, S., 2012. Network impacts of a road capacity reduction:
Owen, C.G., Nightingale, C.M., Rudnicka, A.R., van Sluijs, E.M.F., Ekelund, U., Cook, empirical analysis and model predictions. Transp. Res. A 46, 167–189.
D.G., Whincup, P.H., 2012. Travel to school and physical activity levels in 9–10 year- Widhalm, P., Yang, Y., Ulm, M., Athavale, S., González, M.C., 2015. Discovering urban
old UK children of different ethnic origin; child heart and health study in England activity patterns in cell phone data. Transportation 42, 597–623.
(CHASE). PLoS One 7 (2), e30932. Yang, F., Jin, P.J., Cheng, Y., Zhang, J., Ran, B., 2014. Origin-destination estimation for
Ozbay, S., Ercelebi, E., 2005. Automatic vehicle identification by plate recognition. World non-commuting trips using location-based social networking data. Int. J. Sustain.
Acad. Sci. Eng. Technol. 9, 222–225. Transp. 9 (8), 551–564.
Pelletier, M.-P., Trépanier, M., Morency, C., 2011. Smart card data use in public transit: A Zheng, Y., Liu, F., Hsieh, H.-P., 2013. U-Air: when urban air quality inference meets big
literature review. Transp. Res. C Emerg. Technol. 19, 557–568. data. In: Proceedings of the 19th ACM SIGKDD international conference on
Pender, B., Currie, G., Delbosc, A., Shiwakoti, N., 2014. Social media use during un- Knowledge discovery and data mining, New York, http://dx.doi.org/10.1145/
planned transit network disruptions: a review of literature. Transp. Rev. 34, 501–521. 2487575.2488188.
Pereira, F.C., Rodrigues, F., Ben-Akiva, M., 2015. Using data from the web to predict Zhu, S., Levinson, D., Liu, H.X., Harder, K., 2010. The traffic and behavioral effects of the
public transport arrivals under special events scenarios. J. Intell. Transp. Syst. 19 (3), I-35W Mississippi River bridge collapse. Transp. Res. A Policy Pract. 44, 771–784.
273–288.