A Comprehensive Survey on the Ambulance Routing and Location Problems
ABSTRACT
In this research, an extensive literature review was performed on the recent developments of the
ambulance routing problem (ARP) and ambulance location problem (ALP). These are, respectively,
modifications of the vehicle routing problem (VRP) and maximum covering problem (MCP), with
changes to the objective functions and constraints. Although the problems are alike, a key distinction is that emergency
medical service (EMS) systems are considered critical, and their optimization has become all the
more important as a result. Similar to their parent problems, these are NP-hard and must resort
to approximations if the search space is too large. Much of the current work has simply been on
modifying existing systems through simulation to achieve a more acceptable result. There have been
attempts towards using meta-heuristics, though practical experimentation is lacking when compared
to VRP or MCP. The contributions of this work are a comprehensive survey of current methodologies,
summarized models, and suggested future improvements.
1 Introduction
"Emergency Medical Service" (EMS) systems are complex and fraught with problems requiring resolution. In these
scenarios, two important questions are where to place the services and how to route them. Specifically these two
problems are generalizations of the "Vehicle Routing Problem" (VRP) [1] and "Maximum Coverage Problem" (MCP)
[2], known respectively as the "Ambulance Routing Problem" (ARP) [3] and "Ambulance Location Problem" (ALP) [4].
Like the problems which they are based on, the two are NP-hard and subject to a number of constraints and potential
decisions. VRP and MCP have both been explored extensively in past research and a number of approaches have
already been surveyed [5, 6, 7]. Both can be viewed in terms of a graph problem G = (V, E), where V = {0, 1, ..., n}
is the node set and E the route set between facilities or destinations [8, 9].
In the VRP, the goal is to establish the optimal route (or set of routes) by which a fleet of vehicles can reach a set of
clients or customers. Formally, there is a fixed set of vehicles with a centralized distribution depot. All vehicles must
participate in deliveries to clients, with each client stating a certain demand or requirement. The goal is to organize the
vehicles in a way that minimizes the cost of routing [1]. There are many variations of the problem, although almost all
have common objectives of minimizing distance or cost in an optimal time window, while coordinating to customers
and a depot (see Figure 1). Specific variations are based on limiting the vehicle capacity, implementing time windows
for deliveries, allowing for multiple routes by a single vehicle, allowing for multiple trips, or providing additional
services [10, 11, 12, 13, 14, 15, 16]. A diagram of the different common VRPs and their connections can be seen in
Figure 1. Depending on the size and complexity the VRP can be modelled and solved with mathematical programming,
A PREPRINT - JANUARY 16, 2020
though the NP-hard nature of the problem limits the use. It is much more effective to utilize metaheuristic approaches
to gain a near optimal solution [17]. Some of these solutions include "Genetic Algorithms" (GA), "Tabu Search",
"Nearest Neighbor Search" (NNS), "Simulated Annealing", "Ant Colony Optimization" (ACO), and "Particle Swarm
Optimization" (PSO) [18, 19, 20, 21, 22, 23].
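As a small illustration of the simplest heuristic named above, the following sketch builds a single depot-to-depot route with a nearest neighbor rule. It is only a minimal example under stated assumptions: one vehicle, no capacity or time-window limits, Euclidean distances, and made-up coordinates. The function names are hypothetical and do not come from the surveyed works.

```python
import math

def nearest_neighbor_route(depot, customers):
    """Build one route that starts and ends at the depot, always moving
    to the closest customer that has not been visited yet."""
    route = [depot]
    remaining = list(customers)
    current = depot
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(current, c))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    route.append(depot)  # the vehicle returns to the depot
    return route

def route_length(route):
    """Total Euclidean length of a route given as a list of points."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

# Illustrative coordinates only (not taken from any cited data set).
depot = (0.0, 0.0)
customers = [(4.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
route = nearest_neighbor_route(depot, customers)
length = route_length(route)
```

The rule is greedy and gives no optimality guarantee, which is why it is usually used to seed the metaheuristics listed above rather than as a final answer.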
Figure 1: An expression of the basic vehicle routing problem with a single depot.
The MCP is a more specific instance of the "Set Coverage Problem", as one must choose at most k sets which will
cover a maximum number of elements [2]. Similar to VRP, there are variations, the most documented extensions
being associated with weighted elements and limitations on cost. For the purposes of this study, a similar
corresponding example is the facility location problem [24]. There are at most k facilities that must provide coverage to
a certain number of customers (see Figure 3). The customers within the range of the facility are assigned to it and are
considered covered by the location. Each facility must maximize its reach and provide the greatest coverage, while
adhering to the objective. Actual delivery to these customers becomes a secondary problem, which in itself may be
solved as a VRP. Regardless, this can also be formulated in terms of integer linear programming (ILP), and smaller
cases can be solved exactly with algorithms like branch-and-bound [25]. In the case of larger examples, acceptable
solutions can be obtained by local-search, GA, or PSO [26, 27, 28].
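The coverage idea can be made concrete with the classic greedy heuristic for the MCP, which repeatedly picks the set that covers the most elements not yet covered. The sketch below is an illustration only: the facility names and customer identifiers are invented, and the greedy rule is a standard approximation for this problem rather than a method attributed to any cited work.

```python
def greedy_max_coverage(sets, k):
    """Choose at most k sets, each time taking the one that adds the
    largest number of elements that are not covered yet."""
    covered = set()
    chosen = []
    for _ in range(k):
        best = max(sets, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds new coverage
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Hypothetical facilities and the customers each one can reach.
facilities = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5},
    "C": {5, 6},
    "D": {1, 6},
}
chosen, covered = greedy_max_coverage(facilities, k=2)
```

In this toy instance the two chosen facilities happen to cover every customer. In general the greedy rule is only guaranteed to reach a fraction of at least 1 - 1/e of the optimal coverage.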
As mentioned, both ARP and ALP are generalizations of the previously described problems. The ARP’s goal is to
determine the most effective routes for ambulances for emergency requests in disaster response situations or preplanned
missions [30]. In this problem there is an ambulance set and a patient set. There is a cost associated with routing a
patient with an ambulance and the goal is to minimize the total of all the assignments. For the ALP, it is primarily about
determining the most effective location of ambulances for serving a population. It can be applied to either the facilities
themselves or if there are a greater number of facilities than vehicles, such as in an air ambulance system, the placement
of the vehicles among the facilities. Similar to the MCP, there is a set of facilities and a set of pickup points (patients)
[4]. Based on this setup, the process must maximize the coverage provided by the ambulance facilities to all patient
pickup locations. This is only one example of a definition, as this problem has multiple variations. Additionally,
there is potential overlap between the two cases where they are interrelated and solved simultaneously. In this
situation the locations of ambulances are either based on the optimal routes or the optimal coverage is determined prior
to routing [31]. Both problems can be modelled with integer programming and solved either exactly or approximately
with a metaheuristic. The appropriate choice depends greatly on the problem size, which is in turn based primarily on the area
that the ambulances cover. A city-wide EMS system will generally be varied and may be easily solved, while
an air ambulance system may require increased consideration [32]. Timing is also a far greater factor, as EMS is a
critical system requiring fast response and decision making. For this reason exact solutions become difficult to justify,
emphasizing a greater need for approximate metaheuristic variations.
The vast majority of ARP and ALP are static, where either the locations never change or the planned missions are
predetermined [33]. Solutions to these problems are often generated with actual mission data and resolved for a specific
period of time. However, a live EMS system generally considers real-time rescheduling, as disasters can occur at any
point. In this case a dynamic solution must be developed for unexpected events or potential changes in the medical
requirements of a patient [34]. To be clear, this does not suggest that all services must be dynamic and in some cases it
may be completely unnecessary. A service may plan a day of missions without the expectation of change; although
such a case is very specific and more likely to occur in a patient transfer scenario. This is not a completely new area of
research as past dynamic solutions have been explored in the VRP [35, 36, 37].
The remainder of this survey is arranged as follows. Section 2 discusses the modelling, constraints, and decision
variables required for the ARP and ALP. Section 3 discusses potential solutions to the described problems, with
descriptions of past related works. Section 4 discusses possible future research areas and suggests where further study
should be explored. Finally, section 5 closes the paper providing a conclusion to the discussed topic.
As previously discussed, both the ARP and ALP have multiple forms conditional on the requirements and structure of
the EMS system. Given that there are multiple versions of the problem, there are several ways to model the scenarios.
All can be structured with integer-programming and solved subject to a set of constraints. While it is impossible to
discuss all possible models, there are a few generalizations that can be made to each. This research only discusses the
models which are most commonly used for resolving the ARP and ALP. This does not restrict the possibility of using
other models, although the ones discussed are the basis for the majority seen currently. It is important to note that there
are quite a few other variants and further research can be directed towards applying other VRP or MCP models.
Ambulance routing is highly connected to the location problem. The majority of ARP cannot actually be seen as
full-scale VRP since the routes are usually a single pickup and drop-off for each scenario. While a general VRP usually
has an entire route planned, an ambulance will usually return to the starting base following a drop-off. Exceptions exist
in services such as air ambulance systems [30]; however, the majority do not follow this trend. As such, solving an
ALP is sometimes more useful than solving an ARP. To be clear, solving an ALP will help to determine the optimal
routing points and may cause the routing problem to be negligible. That being said, this is not an absolute and there are
cases where ambulances may end up at new bases or perform intermediate stops based on scheduling. This is especially
typical in air ambulance systems where vehicles are limited, patient transfers are common, and the coverage area is
much larger [30]. The following is a simplified generalization of the ARP:
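\begin{align}
\min \quad & \sum_{i \in I} \sum_{j \in P(J)} c_{ij} x_{ij} \tag{1} \\
\text{subject to} \quad & \sum_{i \in I} \sum_{j \in P(J)} x_{ij} = 1, \quad \forall\, r \tag{2} \\
& \sum_{j \in P(J)} x_{ij} \le 1, \quad \forall\, i \tag{3} \\
& x_{ij} \in \{0, 1\}, \quad \forall\, i, j \tag{4}
\end{align}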
The objective of the above model is to minimize the cost of routing all requests while utilizing the available ambulances.
In this scenario J is the set of requests and I is the set of ambulances. For this case there exists the possibility that
multiple requests will be serviced by a single vehicle; therefore, the model considers P(J) as the powerset of J. An
individual request is represented by r, while a single subset in P(J) is denoted by j. Additionally, cij
is the cost of serving request set j with ambulance i, while xij is a binary variable for determining whether request set j is
being served by ambulance i. Equation 2 ensures that every request is satisfied and covered. The second constraint,
represented by Equation 3, states that each ambulance services at most a single set of requests. The final requirement
guaranteed by Equation 4 simply enforces that xij be limited to a binary decision. It should be noted that this scenario
can be further constrained and implies that requests are predetermined. This is not unusual in the case of patient
transfers, although it is limiting in a dynamic or diverging environment.
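For very small instances the simplified ARP can be solved by plain enumeration, which helps to show what the constraints mean in practice. The sketch below is a minimal illustration and not the formulation of any cited work. It assumes that Equation 2 is read as "each request r belongs to exactly one selected subset", so assigning every request to one ambulance automatically gives each ambulance at most one subset (Equation 3). The cost function, the detour penalty, and all names are hypothetical.

```python
from itertools import product

def solve_arp(ambulances, requests, cost):
    """Enumerate every way of giving each request to one ambulance and
    return the cheapest plan. cost(a, subset) is the cost c_ij of
    serving the request subset with ambulance a."""
    best_cost, best_plan = float("inf"), None
    for assignment in product(ambulances, repeat=len(requests)):
        plan = {}
        for r, a in zip(requests, assignment):
            plan.setdefault(a, []).append(r)  # subset served by ambulance a
        total = sum(cost(a, tuple(rs)) for a, rs in plan.items())
        if total < best_cost:
            best_cost, best_plan = total, plan
    return best_cost, best_plan

# Hypothetical travel costs per (ambulance, request) pair.
base = {
    "a1": {"r1": 2, "r2": 9, "r3": 4},
    "a2": {"r1": 8, "r2": 3, "r3": 7},
}

def toy_cost(ambulance, subset):
    # Assumed cost model: sum of single costs plus a detour penalty of 2
    # for every additional request handled by the same vehicle.
    return sum(base[ambulance][r] for r in subset) + 2 * (len(subset) - 1)

best_cost, best_plan = solve_arp(["a1", "a2"], ["r1", "r2", "r3"], toy_cost)
```

The search grows as |I| to the power |J|, so this is only workable for a handful of requests, which is consistent with the need for approximate methods on realistic instances.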
\[
a_i \le s_{ik} \le b_i, \quad \forall\, i, k \tag{7}
\]
An MCP is a broad term, while an ALP is more specific and in some ways closer to the "facility location problem" in
its simplest form [39]. Other than this, there are many variations of this problem. A comparison of some of
the ALP models can be found in [33], including the "Maximal Covering Location Problem" (MCLP), "Double Standard
Model" (DSM), "Average Response Time Model" (ARTM), "Maximum Availability Problem" (MAP), "Maximum
Expected Covering Problem" (MECP), and "Expected Response Time Model" (ERTM). While the comparison is not completely
comprehensive, several variations are discussed with their specific requirements. The most basic version of the
problem is nearly identical to the MCP, with the key difference of maximizing the sum based on the weighted demand
points covered by a vehicle.
The base model is relatively simplistic and can be seen as the foundation for most ALPs. That being said there are
extensions that can be made depending on the specifics of the problem being solved. For instance, the DSM extends the
MCLP by attempting to cover each demand location by at least two ambulances within a small radius r. Another
variation of the problem seeks to minimize the response times (example: ARTM), transforming the problem from
a maximization to minimization. These versions utilize travel times tij which must be less than or equal to a target
response time. Each difference modifies the constraints or adds further fixed variables, such as how busy the ambulances
are at a certain period. Regardless, all models solving the ALP seek to optimize the EMS system’s coverage capability.
Other than the MCLP, current research has concentrated on the DSM, MECP, ARTM, and ERTM models. Each one
has been formulated in Equations 10 through 35, displaying very similar characteristics to the simple MCLP.
In this optimization model (expressed in Equations 8 through 12), I represents the set of disaster or demand locations,
while J is the set of active or possible bases [30]. The specifics of the sets depend on the circumstances, although they
can be summarized as starting and pickup points. The actual importance of a location is given a weight di , which is
maximized by the objective function (Equation 8). The integer variable xj indicates the number of vehicles which are at
base j and yi is a binary variable stating whether a demand location i is covered by an ambulance. Per Equation 9, a
demand location only counts as covered if at least one base within its range is occupied. Additionally, the total number of
ambulances across all bases can be at most p, as constrained by Equation 10. With regards to Equations 11 and 12, the decision variables in this problem are limited to a
binary value of 0 or 1.
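\begin{align}
\max \quad & \sum_{i \in I} d_i y_i \tag{8} \\
\text{subject to} \quad & \sum_{j \in J_i} x_j \ge y_i, \quad \forall\, i \tag{9} \\
& \sum_{j \in J} x_j \le p, \quad \forall\, j \tag{10}
\end{align}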
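To show how the MCLP objective and its constraints interact, the following sketch solves a toy instance by enumerating every choice of at most p occupied bases. It is a hedged illustration only: coverage depends on whether a base is occupied, so each chosen base is assumed to hold one vehicle, and the demand weights, base names, and coverage sets are invented for the example.

```python
from itertools import combinations

def solve_mclp(demand, covers, p):
    """demand maps demand point i to its weight d_i; covers maps base j
    to the set of demand points within range of j. Returns the best
    weighted coverage and the bases that achieve it using <= p vehicles."""
    best_value, best_bases = -1, ()
    for size in range(1, p + 1):
        for bases in combinations(sorted(covers), size):
            covered = set().union(*(covers[j] for j in bases))  # y_i = 1
            value = sum(demand[i] for i in covered)
            if value > best_value:
                best_value, best_bases = value, bases
    return best_value, best_bases

# Hypothetical instance: four demand points, three candidate bases.
demand = {"i1": 10, "i2": 4, "i3": 7, "i4": 3}
covers = {"b1": {"i1", "i2"}, "b2": {"i2", "i3"}, "b3": {"i3", "i4"}}
best_value, best_bases = solve_mclp(demand, covers, p=2)
```

Here the sets J_i of Equation 9 are stored the other way around, as the demand points reachable from each base, which is equivalent for evaluating coverage.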
The DSM is expressed in Equations 13 through 21, where the objective function seeks to maximize the demand coverage
within a radius by two ambulances [40]. yi1 verifies whether the demand is being covered by at least one ambulance
within a small radius, and yi2 checks if it is covered by two. Equation 15 uses α as a fraction of the demand locations
that must be covered within a small radius. For this model, all demand locations must be covered by an ambulance
(Equation 14) and each must be covered twice (Equation 16). Per Equation 18, a demand location cannot be covered
twice unless it has been covered at least once.
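\begin{align}
\max \quad & \sum_{i \in I} d_i y_i^{2} \tag{13} \\
\text{subject to} \quad & \sum_{j \in J_i^{2}} x_j \ge 1, \quad \forall\, i \tag{14} \\
& \sum_{i \in I} d_i y_i^{1} \ge \alpha \sum_{i \in I} d_i, \quad \forall\, i \tag{15} \\
& \sum_{j \in J_i^{2}} x_j \ge y_i^{1} + y_i^{2}, \quad \forall\, i \tag{16} \\
& \sum_{j \in J} x_j = p, \quad \forall\, j \tag{17}
\end{align}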
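A candidate placement can be checked against the DSM requirements with a few lines of code. The sketch below is a minimal illustration under explicit assumptions: following the prose description, single and double coverage are counted within the small radius r1, full coverage is required within the large radius r2, and the travel times, demand weights, and parameter values are invented for the example. It evaluates a given placement and does not optimize one.

```python
def evaluate_dsm(x, travel, demand, r1, r2, alpha):
    """x maps base j to its number of vehicles, travel[j][i] is the
    travel time from base j to demand point i, demand maps i to d_i.
    Returns (feasible, demand covered twice within r1)."""
    near = {i: sum(x[j] for j in x if travel[j][i] <= r1) for i in demand}
    far = {i: sum(x[j] for j in x if travel[j][i] <= r2) for i in demand}
    total = sum(demand.values())
    once = sum(d for i, d in demand.items() if near[i] >= 1)
    twice = sum(d for i, d in demand.items() if near[i] >= 2)
    # Every point reachable within r2, and a share alpha of the demand
    # reachable within r1.
    feasible = all(far[i] >= 1 for i in demand) and once >= alpha * total
    return feasible, twice

# Hypothetical placement and travel times in minutes.
x = {"b1": 1, "b2": 1}
travel = {
    "b1": {"i1": 5, "i2": 8, "i3": 13},
    "b2": {"i1": 9, "i2": 12, "i3": 6},
}
demand = {"i1": 50, "i2": 30, "i3": 20}
feasible, twice = evaluate_dsm(x, travel, demand, r1=10, r2=14, alpha=0.95)
```

An optimizer would search over x to maximize the second returned value while keeping the first one true.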
Equations 22 through 26 show the formulation of the MECP [41]. For all of the demand locations, this model performs
maximization on the weighted expected coverage. Through this it also considers the probability that an ambulance
is available within a certain response time using a busy fraction represented by q. Whether demand location i is
covered by k vehicles is expressed by the binary variable yik . Equation 23 maintains that an occupied base must be able to cover a
demand location. Equation 24 prevents the number of ambulances from exceeding p, while Equations 25 and 26 limit
the decision variables to natural numbers or binary values.
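\begin{align}
\max \quad & \sum_{i \in I} \sum_{k=1}^{p} d_i (1 - q)^{k-1} y_{ik} \tag{22} \\
\text{subject to} \quad & \sum_{j \in J_i} x_j \ge \sum_{k=1}^{p} y_{ik}, \quad \forall\, i \tag{23} \\
& \sum_{j \in J} x_j \le p, \quad \forall\, j \tag{24} \\
& x_j \in \mathbb{N}, \quad \forall\, j \tag{25} \\
& y_{ik} \in \{0, 1\}, \quad \forall\, i, k \in \{1, \ldots, p\} \tag{26}
\end{align}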
The ARTM differs from the previously discussed variations in that it is a minimization problem that seeks to lower the average
response time relative to the nearest base [42]. Travel time from base j to demand point i is indicated by tij , and is used
within the objective function. A new binary variable called zij indicates whether base j is the nearest to demand point i,
and for this problem a base is limited to a capacity of one ambulance (Equation 31). Equation 28 ensures that exactly one
zij is set for every demand location i.
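\begin{align}
\min \quad & \sum_{j \in J} \sum_{i \in I} d_i t_{ij} z_{ij} \tag{27} \\
\text{subject to} \quad & \sum_{j \in J_i} z_{ij} = 1, \quad \forall\, i \tag{28} \\
& \sum_{j \in J} x_j \le p, \quad \forall\, j \tag{29} \\
& x_j \ge z_{ij}, \quad \forall\, i, j \tag{30} \\
& x_j \in \{0, 1\}, \quad \forall\, j \tag{31} \\
& z_{ij} \in \{0, 1\}, \quad \forall\, i, j \tag{32}
\end{align}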
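The ARTM can be illustrated on a toy instance by enumerating which p bases to occupy and sending each demand point to its nearest occupied base. This is a sketch under stated assumptions, not a reproduction of any cited implementation: every base is allowed to serve every demand point (the restriction to J_i in Equation 28 is ignored), exactly p bases are occupied because opening an extra base never increases the objective, and all numbers are invented.

```python
from itertools import combinations

def solve_artm(demand, travel, p):
    """demand maps demand point i to d_i; travel[j][i] is the travel
    time from base j to i. Returns the smallest weighted response time
    and the p occupied bases that achieve it."""
    best_value, best_bases = float("inf"), ()
    for bases in combinations(sorted(travel), p):
        # z_ij = 1 for the occupied base j that is nearest to i.
        value = sum(d * min(travel[j][i] for j in bases)
                    for i, d in demand.items())
        if value < best_value:
            best_value, best_bases = value, bases
    return best_value, best_bases

# Hypothetical demand weights and travel times in minutes.
demand = {"i1": 3, "i2": 1, "i3": 2}
travel = {
    "b1": {"i1": 4, "i2": 10, "i3": 12},
    "b2": {"i1": 9, "i2": 3, "i3": 8},
    "b3": {"i1": 11, "i2": 7, "i3": 2},
}
best_value, best_bases = solve_artm(demand, travel, p=2)
```

Dividing the returned objective by the total demand gives the average response time per call for the chosen bases.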
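\begin{align}
\min \quad & \sum_{j \in J} \sum_{i \in I} \sum_{k=1}^{p-1} d_i t_{ji} (1 - q) q^{k-1} z_{ijk} + \sum_{j \in J} \sum_{i \in I} d_i t_{ji} q^{p-1} z_{ijp} \tag{33} \\
\text{subject to} \quad & \sum_{j \in J} z_{ijk} = 1, \quad \forall\, i, k \in \{1, \ldots, p\} \tag{34} \\
& x_j \ge \sum_{k=1}^{p} z_{ijk}, \quad \forall\, i, j \tag{35} \\
& \sum_{j \in J} x_j \le p, \quad \forall\, j \tag{36} \\
& x_j \in \mathbb{N}, \quad \forall\, j \tag{37} \\
& z_{ijk} \in \{0, 1\}, \quad \forall\, i, j, k \in \{1, \ldots, p\} \tag{38}
\end{align}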
research for the ALP and ARP can be found in Table 1 and Table 2. The remainder of this section offers solutions for
various possible techniques and then provides current research implementing those methodologies.
Simulation is generally used as a method of capturing the complexity of a system or showing its operations by formulating
it as a mathematical representation. It can be utilized as an optimization technique where the system is modified and
compared against a real existing case [77]. Simulation involves several methodologies, many of which are used to
model complex processes. For instance, "discrete-event simulation" (DES) is used to model a system as a sequence of
events, where each occurs at a certain instant in time and represents a change in the overall system’s state [78]. "Monte
Carlo simulations" are related to DES; however, they utilize random number generators to add a layer of nondeterminism
for modelling stochastic problems. They generate a random sample set of points based on each given input value.
"Hypercube" is a form of sampling that takes this further and uses a multidimensional distribution for the generation of
randomized sample parameters [60]. It is connected to Monte Carlo, although the goal is to more evenly distribute the
points across the possible values. "Markov decision processes" are another example and model decisions in a stochastic
and sequential manner. In this technique there is an assumed finite number of states and actions which can be taken.
The state is randomly changed in response to choices made in the environment, where the goal is to maximize long-term
total reward [79].
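The event-driven view of DES can be sketched with a priority queue of timestamped events. The example below is a deliberately simplified illustration and not a model from the surveyed literature: it assumes identical ambulances, a fixed service time per call, first-come-first-served dispatch, and known call times, so that the output is deterministic. The function name and the event encoding are hypothetical.

```python
import heapq
from collections import deque

FREE, CALL = 0, 1  # FREE sorts first, so a vehicle released at time t can take a call arriving at t

def simulate_dispatch(call_times, service_time, n_ambulances):
    """Return the waiting time of each call, in call order."""
    events = [(t, CALL, k) for k, t in enumerate(call_times)]
    heapq.heapify(events)
    idle = n_ambulances
    waiting = deque()  # calls that have arrived but are not yet served
    waits = {}
    while events:
        now, kind, k = heapq.heappop(events)
        if kind == CALL:
            waiting.append((k, now))
        else:
            idle += 1  # an ambulance finished its call
        # Each event changes the state; dispatch while both are available.
        while idle and waiting:
            call, arrived = waiting.popleft()
            idle -= 1
            waits[call] = now - arrived
            heapq.heappush(events, (now + service_time, FREE, call))
    return [waits[k] for k in range(len(call_times))]

one_vehicle = simulate_dispatch([0, 1, 2, 10], service_time=5, n_ambulances=1)
two_vehicles = simulate_dispatch([0, 1, 2, 10], service_time=5, n_ambulances=2)
```

Replacing the fixed call times and service time with random draws turns the same loop into a Monte Carlo experiment, where the waiting times are averaged over many runs.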
One way of resolving a simulated model is by representing it with "mathematical programming". If the
variables to be described are restricted to integers then it can be formulated as an "integer linear programming" (ILP)
problem. In this case, all the variables are locked to either binary or integer values, and the constraints and objective
functions are linear. A case where some of the decision variables are not restricted also exists, and is referred to as
a "mixed-integer linear program" (MILP) [80]. There are a number of ways to resolve these mathematical formulations,
with optimization solvers being a common choice for achieving exact solutions. Some choices for optimizers include
"CPLEX", "Gurobi", "Fico Xpress", and "MOSEK".
variations so that the whole can be solved. This is especially relevant in a decision making process, where an individual
decision may be broken down into its respective steps.
Maxwell et al. observed an ambulance redeployment scenario to reduce response times through the repositioning of
idle vehicles [44]. The operations of the EMS were simulated using a DES and evaluated in the context of approximate
dynamic programming. The actual redeployment problem was designed as a Markov decision process and revealed
that the simulation performed better than current static policies. In their formulation they determined that the time of a
state could be very close to the subsequent one. As such, utilizing a one-step simulation would have given almost no
information on choosing a particular action. For this reason they utilized micro-simulations, which were independent
for each state and stopped only once a decision was reached. The micro-simulations did have an effect on the final
evaluation of the system, which is shown and separated by the particular policy in Table 4.
Table 4: Runtime of each ADP policy based on the number of micro-simulations.
                         Micro Simulations
ADP Policy               5       10      30      60      10
Decision Time (s)        0.013   0.025   0.077   0.152   0.253
Simulation Time (s)      15      30      89      177     289
Iteration Time (min)     7.7     15      45      87      145
Training Time (h)        3.2     6.3     19      39      60
Ni et al. viewed a real-time problem involving ambulance deployment, where simulation was used to devise a
redeployment strategy in order to minimize response times [45]. A simulation was developed as a stochastic dynamic
program (see Equation 39) where patient calls were serviced based on a "first-come-first-serve" method. In this formulation, X(s)
denotes the set of feasible actions, U is a vector of uniform random variables for capturing random noise,
c(s, x, U) denotes the respective decision cost, and the next state relative to the current state is given by f(s, x, U).
In this model, after pickup, an ambulance would take the patient to a hospital and after hospital transportation
could potentially service another call. Essentially the bounds were obtained and explored from a combination of
methodologies based on comparison in queues. The results of the simulation yielded possible improvements in the
deployment strategies of the ambulances.
Shuib and Zaharudin developed a custom model for maximizing the availability of ambulances [57]. The model in
question is known as "time-based ambulance zoning optimization" (TAZ_OPT), a goal programming approach which
reduced response times. The model’s primary purpose was to use a grid-based system to identify the satellite location for
ambulances, while also determining the specific number allocated to a location. In the model, they used the probability
of an ambulance reaching a disaster location within a target range to determine its coverage and ensure it would only
serve the specific area. Additionally, the model also optimized the future expected coverage based on the demand and
determined the busyness fraction of an active ambulance. The formulation of the goal programming approach can be
seen in Equation 40, whereby P0 and P1 are the first (Equation 41) and second (Equation 42) goals, and d_0^- and d_1^+ are
the over- or under-attainment of those respective goals. Here di is the proportion of the daily demand rate in grid i, Pij
is the probability of reaching grid i in the target response time from location j, Yij is a binary decision for whether an
ambulance at location j is the nearest location to i, m is the total number of grids, and c is the maximum number of
vehicles that can be put towards grid j.
\begin{align}
& \min \left( P_0 d_0^{-} + P_1 d_1^{+} \right) \tag{40} \\
& P_0 = \max \sum_{i}^{n} d_i P_{ij} Y_{ij}, \quad \forall\, j = 1, 2, \ldots, m \tag{41}
\end{align}
Lahijanian et al. considered a double coverage concept, where the demand was covered by a minimum of two vehicles
[58]. Figure 5 shows a representation of this concept; where demand points are represented by circles and current
vehicle locations are represented by triangles. In this scenario, demand points must be covered twice within r1, while
secondary necessity is held within r2. Unlike prior works, they also considered the uncertainty of travel times between
patient and ambulance locations with "triangular fuzzy numbers". To resolve the model, they took a goal programming
approach and obtained a solution using "GAMS".
Khodaparasti and Maleki presented a new combined dynamic EMS model that sought to locate vehicles and stations for
disaster situations [59]. In this model, they considered uncertainty within the parameters and implemented a dynamic
structure utilizing fuzzy numbers for inputs and outputs. They obtained dynamic locations and were also able to
analyze the efficiency of the locations at the same time with data envelopment analysis. The advantage of this structure
was the ability to make short-term decisions, rather than considering long-term periods only. The objective
function is unique and expressed by Equation 43, where it minimizes the cost of locating facilities. For this problem I
is the set of demand points, J is the set of candidate ambulance stations, T is the set of time periods, St is the set of
possible scenarios during period t, fjt is the operation cost of station j for period t, hist is the amount of demand, dijst is
the distance from a demand point to a potential site, λ is the minimum satisfaction degree of the membership function,
and the binary decision variables are xjt (station) and yjkst (ambulance). In this function the first set of terms is related
to station operation costs, the second is the cost of an assignment, and the last is the efficiencies of the stations.
\[
\min \sum_{t \in T} \sum_{j \in J} f_{jt} x_{jt} + \sum_{i \in I} \sum_{j \in J} \sum_{t \in T} \sum_{s \in S_t} \sum_{k=1}^{p} h_{ist} d_{ijst} y_{jkst} + (1 - \lambda) \tag{43}
\]
Morohosi and Furuta analyzed the patterns in dispatching real ambulances and compared this with a simulated output
[60]. Then based on this they attempted to utilize a simple methodology for estimating the improvement from an
optimal solution location for Tokyo. For this purpose, they developed a hypercube simulation model which generated
randomized parameter values and incorporated time nonhomogeneity. A key feature identified in their simulation was
the introduction of an ambulance priority list for a given demand position, speeding the process up. Additionally, they
determined probability p for each state transition, where a dispatch occurs for the ambulance i nearest to the patient’s
location. Equation 44 expresses this where I is the set of ambulances, J is the set of demand points, λj (t) is the specific
amount of calls which happen at j per unit of time, µ represents the average service time of a vehicle, and Ski is a state
transition at time k for ambulance i. The simulation held when analyzed against actual data and was used to solve an
MECP. They found that the objective function, in this case, was over-estimating the coverage.
\[
p = \frac{\mu \sum_{i \in I} S_{ki}}{\mu \sum_{i \in I} S_{ki} + \sum_{j \in J} \lambda_j(t)} \tag{44}
\]
Lee et al. addressed a hybrid air and ground ambulance problem related to maximizing service coverage by simultaneously
locating trauma centers and helicopter ambulances [61]. They confronted the issue associated with estimating the
busy fraction of available helicopters, which was required to develop an accurate model and maximize the objective
function of successfully transporting patients. Their modified busy fraction calculation is denoted by Equation 45
where Hij is the set of heliport bases, Lh is the service time operated by all helicopters at a particular base, and K is
the set of helicopters. A constant of 24 hours was assumed for helicopter operations. Additionally, they considered a
location problem where the goal was to maximize the expected number of patients that could be transported within a
60-minute window from the incident’s occurrence. In this scenario, they utilized a combination of DES and integer
programming to iteratively update the busy fraction; improving the solution compared to other methodologies of the
same type.
\[
p_{ij,k} = \frac{\sum_{h \in H_{ij}} L_h}{24k}, \quad \forall\, (i, j) : H_{ij} \neq \emptyset \tag{45}
\]
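The arithmetic behind Equation 45 is simple enough to show directly. The helper below is a hypothetical sketch that takes the daily service hours L_h of the bases in H_ij and the number of helicopters k, and divides the total by the 24k helicopter-hours available per day. The function name, argument names, and numbers are illustrative assumptions.

```python
def busy_fraction(service_hours, n_helicopters, hours_per_day=24):
    """Busy fraction as total service time divided by the available
    helicopter-hours. Only defined when at least one base is given."""
    if not service_hours:
        raise ValueError("the set of heliport bases must not be empty")
    if n_helicopters <= 0:
        raise ValueError("at least one helicopter is required")
    return sum(service_hours) / (hours_per_day * n_helicopters)

# Two bases with 6 h and 3 h of daily service time, two helicopters.
fraction = busy_fraction([6.0, 3.0], n_helicopters=2)
```

In an iterative scheme of the kind described above, the service hours would come from the latest simulation run and the resulting fraction would feed the next optimization step.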
"Ambulance diversion" is a method of relieving congestion whereby ambulances bypass a location for another. Ramirez-
Nafarrate et al. investigated the effect caused by this diversion strategy, analyzing average waiting times [83]. A
description of patient flow can be seen in Figure 6, which is essentially separated by how a patient arrives. For their
simulation they assumed a "non-homogeneous Poisson process" for arrivals and separated patients based on severity.
Different policies were analyzed and included: diversion when all the beds were occupied, policies obtained utilizing
Markov decision process formulation, and not allowing diversion at all. The MDP formulation strategy greatly improved
the average waiting times when compared to non-diversion and simple policies. Capacity is still something requiring
further work as it affected both the fraction of time spent on diversion and the average waiting time. Essentially,
small changes in capacity significantly impacted the performance. They suggest that with this consideration one could
determine the optimal capacity for a limited budget.
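Arrivals from a non-homogeneous Poisson process are commonly generated with the thinning method, in which candidate arrivals are drawn at a constant upper-bound rate and each is kept with probability equal to the ratio of the true rate to that bound. The sketch below is an illustration of that general technique only. The daily rate pattern, the bound of 6 arrivals per hour, and the seed are assumptions made for the example and are not taken from the cited study.

```python
import math
import random

def sample_arrivals(rate, rate_max, horizon, rng):
    """Thinning: propose arrivals at the constant rate rate_max and keep
    a proposal at time t with probability rate(t) / rate_max."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_max)  # gap to the next candidate
        if t >= horizon:
            return arrivals
        if rng.random() <= rate(t) / rate_max:
            arrivals.append(t)

def hourly_rate(t):
    # Hypothetical daily pattern in arrivals per hour: lowest (2) at
    # 06:00 and highest (6) at 18:00.
    return 4.0 + 2.0 * math.sin(2.0 * math.pi * (t - 12.0) / 24.0)

rng = random.Random(42)  # fixed seed so the run is repeatable
arrivals = sample_arrivals(hourly_rate, rate_max=6.0, horizon=24.0, rng=rng)
```

The bound rate_max must be at least as large as the rate at every time in the horizon, otherwise the acceptance probability would exceed one and the sample would be biased.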
Dibene et al. modeled the problem to determine if the current placement of emergency services was optimal in Tijuana,
Mexico. They relied on past data of emergency calls and then resolved a modified version of the DSM using integer
linear programming [40]. The model considered potential base locations, call demand and priority, demand scenarios,
demand location, and average travel times. To more closely relate to real world policy, they based their variables off the
United States EMS Act. The small radius time standard r1 is set to 10 minutes, the large radius time standard r2 is set
to 14 minutes, α, which is the fraction of demand covered within r1, is 0.95, the number of ambulances available p is
11, and the maximum number of ambulances per base j is 2. They determined that the services should be moved to cover the
demand and improve coverage.
Røislien et al. performed a study utilizing past population and incident data to determine the best position of
air ambulance services in Norway. They formulated and solved an MCLP, whereby the
number and location of bases were explored with a constrained time threshold [62]. The model determined the optimal
allocation of facilities to maximize the number of inhabitants and incidents that could be covered within a certain time
window. The experiment was solved with CPLEX, with results suggesting that population density alone was not an
appropriate basis for determining placement and that placement must instead focus on incident areas. A summary of the results is
seen in Table 5, where "greenfield analysis" (assuming no existing bases) and small base adjustment scenarios were
compared.
Table 5: Coverage comparison utilizing either population density or municipal incident data for existing base structure
and a greenfield scenario [62].
Data        Time Threshold  Bases  Population Coverage  Incident Coverage  Base Scenario      Population Coverage  Incident Coverage
Population  45              5      93.32                78.31              Use existing       96.90                91.86
Population  45              6      96.29                83.55              Relocate one base  98.40                93.44
Population  45              10     100.00               100.00             Add one base       98.40                93.44
Incidents   45              7      96.18                94.73              Use existing       96.90                91.86
Incidents   45              8      98.22                97.91              Relocate one base  97.90                96.35
Incidents   45              10     100.00               100.00             Add one base       97.90                96.35
Population  30              9      91.97                69.30              Use existing       84.70                72.13
Population  30              12     96.36                81.88              Relocate one base  87.93                71.50
Population  30              22     100.00               100.00             Add one base       88.98                75.35
Incidents   30              14     93.81                92.04              Use existing       84.70                72.13
Incidents   30              16     94.76                96.27              Relocate one base  86.00                74.77
Incidents   30              22     100.00               100.00             Add one base       85.68                76.62
Van Den Berg et al. made an argument that to ensure rapid responses, the location of bases and distribution of vehicles
among them must be optimized [63]. They modeled and solved an MECP for urban and rural regions in Norway using
a fixed number of bases and variable ambulances. Analysis was performed in a greenfield scenario and also utilized the
currently implemented base structure. The busy fraction for the problem was varied between four values:
0 (single coverage), 0.15 (night), 0.35 (evening), and 0.50 (day). They determined that four out of the five
bases were already within an optimal location, although coverage could still be improved up to 97.51% from the current
93.46% (see Table 6).
Lanzarone et al. analyzed a two-fold problem involving both ambulance locations and dispatching [42]. An ordered list
of ambulances was constructed for dispatching to each zone, and if none from the list was available then the
closest available one was dispatched. They utilized four "recursive optimization-simulation approaches" (ROSA), based
on DES, to determine the probability that an ambulance would be busy upon a call. In this scenario a vehicle was
either idle or busy, being able to be assigned to a call only if it was idle. Similar to the ERTM, this determination was
made to better minimize the response time in calls by being able to accurately dispatch to the scene. A comparison
A PREPRINT - JANUARY 16, 2020
Table 6: Coverage for differing base values for the current structure and greenfield scenario.
Number of Bases  Current Coverage (%)  Greenfield Coverage (%)
1                45.14                 46.96
2                68.99                 78.24
3                82.14                 90.26
4                91.89                 95.84
5                93.46                 97.51
6                -                     98.36
7                -                     98.55
15               -                     99.40
between the models determined that the more sophisticated models offered a more significant result. The modelling of this
problem utilized a heavily modified objective function (see Equation 46) based on the ERTM. It considers vehicles not
included in the dispatching list and applies the penalty $T_p$ if no vehicles are available. As such, an extended list of
length $|K|$ was provided for each zone $i$. It continues after the base list, is ordered from closest to farthest, and is
defined as $\tilde{y}_{ij}^{z}$. The remaining portion of the problem, $(1 - q)q^{z-1}$, accounts for the probability of a
vehicle in position $z$ responding to the call.
$$\sum_{i \in I} \sum_{z \in Z} \sum_{j \in J} (1 - q)\, q^{z-1}\, d_i\, t_{ji}\, \tilde{y}_{ij}^{z} + \sum_{i \in I} d_i\, q^{|K|}\, T_p \tag{46}$$
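The weights in Equation 46 can be read as a probability distribution over which vehicle on the list responds. The following minimal sketch illustrates this under the assumption of a common busy fraction q for every vehicle; the function name, the value q = 0.35, and the list length of 3 are illustrative choices and are not taken from [42].

```python
def response_probabilities(q, list_length):
    """Return the probability that the z-th listed vehicle responds
    (z = 1..list_length) and the probability that no vehicle is available."""
    # The z-th vehicle responds when the z-1 vehicles before it are busy
    # (probability q each) and it is idle (probability 1 - q).
    probs = [(1 - q) * q ** (z - 1) for z in range(1, list_length + 1)]
    # Every vehicle on the list is busy: this case triggers the penalty term.
    none_available = q ** list_length
    return probs, none_available


probs, none_available = response_probabilities(q=0.35, list_length=3)
print(probs, none_available)
```

The listed probabilities and the "none available" probability sum to one, which is why the objective splits into a travel-time term and a penalty term.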
Ingolfsson et al. described an ALP optimization model, minimizing the number of vehicles required and maximizing
coverage [64]. They considered different service levels, where a level was measured as the fraction of incidents reached
within a particular time frame. Uncertainty was considered by calculating the response time as the sum of a random
delay and a random travel time, on the premise that modeling this randomness would produce a more realistic model.
The design could be solved for large-scale cities by general-purpose optimizers. The expected
coverage was assessed through an approximate hypercube model and optimally solved with the "Premium Solver
Platform". The research concluded that the inclusion of this type of variability impacted the model
significantly and must be considered in actual policy.
Boujemaa et al. took the approach of using stochastic programming and considered cost minimization for transportation,
allocation, and even station development [65]. The model was first approximated with an integer linear
program, and a Monte Carlo optimization method called "Sample Average Approximation" (SAA) was then used to
solve random instances of ambulance location and allocation. The results showed a slightly higher
cost for the stochastic model compared to the deterministic one. However, the stochastic model could help avoid exposing
patients to the greater risk that arises when capacity-related constraints are not satisfied, as seen in the deterministic variation.
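As a rough illustration of the sample average approximation idea, the sketch below replaces an expected cost with its average over a finite set of demand scenarios and then optimizes that average. It is a toy single-station problem, not the model of [65]; the function name, the cost parameters, and the scenario values are hypothetical, and in practice the scenarios would be drawn by Monte Carlo sampling.

```python
def saa_best_fleet_size(demand_scenarios, unit_cost, shortage_penalty, max_fleet):
    """Pick the fleet size minimizing cost averaged over sampled scenarios."""

    def sample_average_cost(x):
        # Average unmet demand across scenarios approximates the expectation.
        shortage = sum(max(d - x, 0) for d in demand_scenarios) / len(demand_scenarios)
        return unit_cost * x + shortage_penalty * shortage

    best = min(range(max_fleet + 1), key=sample_average_cost)
    return best, sample_average_cost(best)


# Four hypothetical demand scenarios (simultaneous calls in a period).
print(saa_best_fleet_size([2, 3, 3, 5], unit_cost=1, shortage_penalty=3, max_fleet=6))
# → (3, 4.5)
```

A deterministic model that plans only for the average demand would ignore the high-demand scenario, which is the kind of capacity shortfall the stochastic formulation guards against.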
Tavakkoli-Moghaddam et al. considered a pre-disaster situation with the idea of locating temporary emergency stations
and ambulance routes [46]. The disaster phases are divided based on different points in time, and this study
specifically emphasized preparedness. The model was a "bi-objective" variation where the optimal quantity and costs
were determined in conjunction with a minimized total response time. Patients were grouped according to priority
and then optimal relief stations and routes were calculated. Two objective functions were developed with the first,
expressed by Equation 47, minimizing the service completion time among "red code" (serious injuries) and "green
code" (slight injuries) patients. Equation 48 describes the second objective as minimizing the setup cost for temporary
emergency stations. $w_g$ and $w_r$ are respectively the priorities given to green and red code patients, while $E_g$ and $E_r$ are
the latest service completion times. $s_h$ is the cost to establish a hospital in region $h$ and $u_h$ is a binary variable for
actually establishing a facility. A study was performed that considered a disaster and its catastrophic results upon densely
populated regions. Their model was validated with an ε-constraint method, whereby one of the objective functions was
optimized by using the other as a constraint.
$$\min\; w_g E_g + w_r E_r \tag{47}$$
$$\min \sum_{h} s_h u_h \tag{48}$$
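The ε-constraint idea can be illustrated with a small enumeration: one objective (completion time) is minimized while the other (setup cost) is bounded by ε, and sweeping ε traces out trade-off solutions. This is only a sketch over a hypothetical list of pre-evaluated candidate plans; the plan names and values are invented for illustration and do not come from [46].

```python
def epsilon_constraint(candidates, epsilons):
    """For each cost bound eps, return the plan with the smallest completion
    time among plans whose setup cost does not exceed eps."""
    front = []
    for eps in epsilons:
        feasible = {name: tc for name, tc in candidates.items() if tc[1] <= eps}
        if feasible:
            best = min(feasible, key=lambda name: feasible[name][0])
            front.append((eps, best))
    return front


# Hypothetical plans: name -> (service completion time, setup cost).
plans = {"A": (10, 5), "B": (8, 7), "C": (6, 12), "D": (9, 9)}
print(epsilon_constraint(plans, [5, 7, 12]))
# → [(5, 'A'), (7, 'B'), (12, 'C')]
```

Plan D never appears because plan B is both faster and cheaper, so it is dominated for every value of ε.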
Zhang and Zeng considered the location and possible relocation of ambulances using a two-stage robust optimization model
combined with mixed-integer programming to capture the sequential decision-making process [66]. For
this, an approximation extension was designed based on the "column-and-constraint generation" (C&CG) algorithm,
producing an optimal or near-optimal solution when used with real-world data. In this research, they additionally
considered the uncertainty associated with unavailability and relocation.
Mouhcine et al. proposed a distributed solution using ACO to determine the optimal paths for emergency vehicles [3].
The solution sought to minimize the travel time while constrained by obstacles such as traffic, speed limits, available
vehicles, and facility positioning. The system generated a dynamic path, determining the shortest paths based on a set
of intelligent agents. As seen in Figure 7, the routing agent (RA) sends out worker agents (WA) to determine optimal
ambulance routes. Following each iteration, another path is determined and the best ones are selected. Mouhcine et al.
suggested that further research needs to be done on the solution, as convergence to near-optimal is not guaranteed.
Javidaneh et al. attempted to resolve an ambulance routing problem by implementing an approach using ACO [47].
The problem assumed that the number and location of facilities, vehicles, and patients were known, as well as the
capacity of both facilities and vehicles. The goal of the solution was to route ambulances in the least amount of
time possible, while constrained by route lengths and hospital capacities. For this solution the algorithm took as input an
adjacency matrix representation where the graph displayed the location of people requiring assistance, the capacity
of the hospitals and ambulances, and the number of people at each facility. The output of this model was simply the
routes for each respective ambulance satisfying the objective function. Results of the algorithm were comparable to
real-world results, displaying the usefulness of ACO in this type of optimization problem.
3.3.1 GA Description
"Genetic algorithms" (GA) are part of a group of stochastic optimization techniques, where a function is minimized
or maximized with randomness introduced into the process [86]. They are based on Darwin’s theory of evolution and
essentially simulate natural selection, repeating the process until, ideally, a near-optimal solution is found.
Though the steps may be altered, in general they are selection, crossover, and mutation. As seen in Algorithm 2, a GA starts
with a randomly initialized population and then assesses the fitness (objective function value) to choose points (called
“parents”) contributing to the next generation (lines 1 to 3). Two parents combine, and the new “children” form the
next generation in the system (line 5). Lastly, random changes known as mutations are added to the population to
add diversity to succeeding generations and stop the optimization from ending up in a local minimum (line 6). At the end
of a generation the fitness is evaluated and the steps are repeated (line 7). A limitation of a GA is that design
choices must be carefully made, as certain choices may lead to unfavorable results (a local optimum) or even an incorrect
answer [87]. Additionally, GAs are sensitive to the initial population, making later convergence a possible issue.
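The selection, crossover, and mutation steps can be sketched in a few lines. The following is a minimal illustration on a toy base-selection problem and is not a transcription of Algorithm 2; the tournament selection, one-point crossover, bit-flip mutation, parameter values, and the `COVERS` data are all assumptions chosen for brevity.

```python
import random

# Hypothetical data: the demand zones covered by each candidate base site.
COVERS = [{0, 1}, {1, 2}, {2, 3}, {3, 4}, {0, 4}, {5}]
N_ZONES = 6


def cost(bits):
    """Penalize uncovered zones heavily, then prefer fewer open bases."""
    covered = set()
    for site, is_open in enumerate(bits):
        if is_open:
            covered |= COVERS[site]
    return 10 * (N_ZONES - len(covered)) + sum(bits)


def genetic_algorithm(fitness, n_genes, pop_size=30, generations=60,
                      mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Random initial population of bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    history = [fitness(best)]          # best-so-far fitness per generation

    def select():
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randint(1, n_genes - 1)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        generation_best = min(population, key=fitness)
        if fitness(generation_best) < fitness(best):
            best = generation_best
        history.append(fitness(best))
    return best, history


best, history = genetic_algorithm(cost, n_genes=len(COVERS))
print(best, cost(best))
```

Tracking the best individual separately from the population is a design choice made here so that a good solution is never lost to mutation; changing the seed or the operators changes the search trajectory, which reflects the sensitivity to design choices noted above.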
3.3.2 GA Solutions
Iannoni et al. embedded a spatially distributed queuing (hypercube) model into hybridized GAs to optimize the
operations of ambulances [68]. The method considered the location of ambulance facilities along a highway, as well
as how to district the responses. The model was adapted to analyze either a "single dispatch" (model 1) or "single
and double dispatch" (model 2). Essentially, the model could either optimize the coverage areas to reduce response
times or resolve workload imbalances. It was validated with DES, and it was found in real scenarios that policies could be
improved by facility relocation. Step 1 was to optimize the location of ambulance bases through a location GA.
Following this, step 2 started from the solution generated by step 1 and improved it through the modified districting
GA/hypercube model. A summary of the results is in Table 8, where $T(x)$ is the primary minimization goal as the
mean region-wide travel time, $\sigma_p(x)$ is the imbalance of ambulance workloads expressed as a standard deviation, and
$P_{t>10}(x)$ is the fraction of calls not serviced within 10 minutes.
Chuang and Lin confronted an ALP based on MECP and DSM [41]. They proposed a combination model known as
the "maximum expected covering location problem with double standard" (MEXCLP-DS) to solve a probabilistic
situation and provide the coverage that neither of the previous models provides individually. A formulation of this combination is
represented in Equations 50 through 53, and uses a similar set of variables to the MECP and DSM discussed in Section
2. Different variables include $h_j$ as the probability of a call occurring at node $j$ and $l$ as the total number of ambulances.
The objective function shown in Equation 50 maximizes the expected number of demands that can be covered. Equation
51 ensures that a proportion of the demand is covered based on α, Equation 52 states that the total number of vehicles
across all nodes must be at least the number of facilities covering a node, and Equation 53 prevents the number of
vehicles across all nodes from being greater than the total number of vehicles. The models were resolved with a GA using
a combination of directed and stochastic searches, resulting in 100% of the demand being covered with an
8-minute standard arrival time.
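$$\max \sum_{i=1}^{l} \sum_{j=1}^{m} d_i (1 - q)\, q^{l-1}\, y_{ij}\, h_j \tag{50}$$
$$\text{subject to} \quad \sum_{i=1}^{l} \sum_{j=1}^{m} d_i\, y_{ij}\, h_j \ge \alpha \sum_{i=1}^{l} \sum_{j=1}^{m} d_i\, h_j \tag{51}$$
$$\sum_{j=1}^{m} x_j \ge \sum_{i}^{l} \sum_{j}^{m} y_{ij} \tag{52}$$
$$\sum_{j=1}^{m} x_j \le l \tag{53}$$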
Tlili et al. discussed an ARP for emergency medical services and utilized real-world data [48]. They modeled the
problem as what they referred to as a simple and open ARP, where the goal was to serve a greater number of patients
while utilizing the same amount of resources and minimizing the total travel distance. In terms of formulation, the problem
used a traditional VRP-based design but also considered a total budget constraint. This is shown in Equation
54, which stipulates that the travel cost cannot exceed the total budget $C_{max}^{k}$ for each of the $p$ ambulances. To resolve this, a GA
with a customized recombination operator was implemented. Small instances were solved
efficiently.
$$\sum_{i \in I} \sum_{j \in J} x_{ijk}\, c_{ij} \le C_{max}^{k}, \quad \forall\; 1 \le k \le p \tag{54}$$
Sasaki et al. argued that with changing demographics, ambulance response times are decreasing steadily
[69]. They used predictive techniques to determine future EMS cases and then compared current and future optimal
locations. The model was resolved with a modified grouping GA to determine potential sets of ambulance locations.
Without a GA, a brute-force method would have needed to consider $4.415 \times 10^{35684}$ permutations. The
results concluded that response times could be decreased by about a minute if the updated locations are considered.
Fogue et al. concentrated on non-emergency patient transport services for patients that did not require urgent care yet
still required hospital transportation [49]. For this problem, a human operator usually developed a suboptimal solution,
which they aimed to overcome with a novel algorithm referred to as the non-urgent transport routing algorithm (NURA).
This method utilized a GA as part of its core to explore the solution space and generate detailed routes for ambulance
services. A scheduling algorithm then generated the specific plan for each mission. The results of the algorithm
appeared to outperform human experts in similar conditions and reduced the average waiting time (see Table 9).
Pacheco et al. addressed a static ALP for a fleet in Tijuana, Mexico [70]. In the research, they formulated the model as a
multi-source Weber problem to ensure it was continuous and included EMS time thresholds. In this problem
the aim is to locate $m$ facilities and allocate the demand to those chosen. The objective, as shown in Equation 55, was
to reduce the sum of the weighted distances from facilities to demand points. $K$ is the set of demand points, $w_k$ is the
associated weight value, $d_k$ is the distance from $k$ to a facility, $(x, y)$ are location coordinates, $p_k$ is a particular point
where an emergency occurs, and $v_{kj}$ is a binary variable allowing for a selection between the travel time of $d(p_k, (x_j, y_j))$
and the time tolerance threshold. For the solution, a GA was implemented and compared against the current placement,
with results indicating an improvement over existing selections for the expected travel time $t_k$ between two locations (see Table 10).
$$\max \sum_{k \in K} \sum_{j=1}^{m} w_k\, d(p_k, (x_j, y_j))\, v_{kj} \tag{55}$$
$$\text{subject to} \quad \sum_{j=1}^{m} v_{kj} = 1, \quad \forall\, k \tag{56}$$
Benabdouallah and Bojji modeled the problem as a dynamic double standard model (DDSM) to obtain the best
distribution of ambulances to bases [71]. They then compared the resulting coverage optimized by a GA and an ACO
approach. The goal of these models was to minimize the lateness of ambulances to disaster locations, and they were based on
random instances within a two-period day. The GA approach achieved a better-minimized fitness when compared to the
exact solution extracted by CPLEX, with the latter not being able to solve beyond a certain threshold (see Table 11).
Ramirez-Nafarrate et al. proposed a simulation to explore ambulance diversion (AD), where an overcrowded emergency
department requested an ambulance to bypass their location [50]. Their research was based on DES with the goal
of analyzing the effect of AD on patient flow within the care system. The simulation was utilized to determine the
parameters required for the diversion policies and reduce the expected travel time. Additionally, the research proposed a
GA for designing and implementing new policies, the results of which displayed significant improvement over current
patient flow methods.
"Local search" forms the basis for several optimization techniques and in its simplest form is itself a process for
solving computationally intensive problems. A local search-based algorithm moves between multiple possible solutions
within a search space, making small localized changes as it progresses. It will only stop once it reaches an appropriate
near-optimal solution or another specific stopping condition (such as time) occurs. A key issue with local search in its
basic form is that it commonly only attains a local optimum [88]. As such, multiple algorithms have
adapted and extended this approach.
Variations of local search manipulate a set of data within a certain space known as a neighbourhood. Three common
neighbourhood-based techniques are "large neighborhood search" (LNS), "variable neighborhood search" (VNS), and
"nearest neighbor search" (NNS). LNS gradually improves an initial solution by modifying an incumbent answer
over an exceptionally large area [89]. The purpose of this is to minimize the potential for getting locked into local
minima, as the improvements are much greater. VNS updates a current solution by exploring distant neighbourhoods
and adjusting if there is a more optimal value [90]. In this case multiple neighbourhoods are explored, and the process repeats
until an optimum is discovered. NNS is not just one algorithm, but a group of algorithms that optimize by determining points within
a certain set that are more similar or can be adjusted based on another defined criterion [91].
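To make the contrast between basic local search and VNS concrete, the sketch below runs both on a small one-dimensional landscape. It is a simplified illustration rather than the formulation of [90]; the landscape values, the integer search space, and the parameters `k_max`, `rounds`, and `seed` are assumptions.

```python
import random


def local_search(f, x, lo, hi):
    """Hill climbing over the +/-1 neighbourhood until no neighbour improves."""
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = min(neighbours, key=f)
        if f(best) >= f(x):
            return x                     # x is a local optimum
        x = best


def vns(f, start, lo, hi, k_max=10, rounds=20, seed=0):
    rng = random.Random(seed)
    x = local_search(f, start, lo, hi)
    for _ in range(rounds):
        k = 1
        while k <= k_max:
            # Shaking: jump to a random point in the k-th, wider neighbourhood.
            shaken = min(hi, max(lo, x + rng.randint(-k, k)))
            candidate = local_search(f, shaken, lo, hi)
            if f(candidate) < f(x):
                x, k = candidate, 1      # improvement: return to the smallest neighbourhood
            else:
                k += 1                   # no improvement: widen the neighbourhood
    return x


# Hypothetical cost landscape with a local minimum at index 2 and the global one at index 7.
values = [9, 7, 5, 6, 8, 4, 2, 1, 3]
f = values.__getitem__
print(local_search(f, 0, 0, 8), vns(f, 0, 0, 8))
```

Plain local search started at index 0 stops at the first local minimum, whereas the shaking step gives VNS a chance to land in the basin of a better one; because only improvements are accepted, VNS never returns a worse solution than its starting local optimum.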
"Tabu Search" further extends the idea of the neighbourhood-based algorithms by excluding recently explored areas
within a search space and possibly allowing moves that would not improve the objective. A list of previously visited
answers is held within a Tabu list, whose goal is to prevent a search from becoming locked into a local optimum. In
general it only records recent moves and will not allow a solution which has been explored within a particular period.
This restriction can optionally be ignored if the move would otherwise improve the fitness. The list usually clears certain answers
after a period of time, although the size of the list can vary depending on the problem [92]. The pseudo-code of the
Tabu search steps is shown in Algorithm 3. Each solution is represented by a state, where a state is the organization of
variables into a specific solution. From lines 1 to 3, an initial solution state is generated and saved to the initialized
empty Tabu list. Then from lines 4 to 10 new neighbourhood solutions are explored and the solutions are updated if
they have not been added to the current list. Once a new improved state is discovered it is placed inside of the Tabu list
at line 8 and the cycle continues until the user-defined stopping condition is satisfied.
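A compact sketch of these ideas is given below. It is one common variant and not a line-by-line rendering of Algorithm 3: every visited state (not only improved ones) is placed on a fixed-length Tabu list, and a tabu move is allowed only if it beats the best solution found so far. The landscape, the neighbourhood function, and the `tenure` value are hypothetical.

```python
from collections import deque


def tabu_search(f, start, neighbours, tenure=5, max_iters=100):
    current = best = start
    tabu = deque([start], maxlen=tenure)   # oldest entries drop off automatically
    for _ in range(max_iters):
        # Keep non-tabu neighbours, plus tabu ones that beat the best so far
        # (the optional override described in the text).
        candidates = [n for n in neighbours(current)
                      if n not in tabu or f(n) < f(best)]
        if not candidates:
            break
        # Move to the best candidate even if it is worse than the current state.
        current = min(candidates, key=f)
        tabu.append(current)
        if f(current) < f(best):
            best = current
    return best


# Hypothetical cost landscape: local minimum at index 2, global minimum at index 7.
values = [9, 7, 5, 6, 8, 4, 2, 1, 3]
f = values.__getitem__
neighbours = lambda i: [n for n in (i - 1, i + 1) if 0 <= n < len(values)]
print(tabu_search(f, 0, neighbours))
# → 7
```

Because the state at index 2 becomes tabu as soon as it is visited, the search is forced uphill through indices 3 and 4 and then descends to the global minimum, which plain hill climbing from the same start would not reach.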
Schmid and Doerner developed a "multi-period dynamic" variation of the ALP, allowing for the repositioning of vehicles
and based on the DSM [43]. The model was structured with MILP and the optimum was solved for all time points simultaneously,
while also considering time-dependent data. A modification to the standard DSM objective function can be seen in
Equation 59. For this modification, a DSM has to be simultaneously considered for every instance $t \in T$. The set
of demand locations is represented by $V$ and the set of vehicle locations is $W$. Since relocation is possible, $\beta$ is a
penalty for performing this task. To achieve near-optimal solutions, Schmid and Doerner resolved the model with
VNS, randomizing starting points, performing a shaking phase to explore the solution space, and improving further with
local search. On average the algorithm was able to find a comparable solution to the CPLEX optimizer in a drastically
decreased time.
$$\max \sum_{t \in T} \left( \sum_{i \in V} d_i\, x_i^{2,t} - \beta \sum_{i,j \in W} r_{ij}^{t} \right) \tag{59}$$
Dağlayan and Karakaya proposed a solution for effectively scheduling ambulances following a disaster and designed it
as a "capacitated vehicle routing problem" (CVRP) [52]. The research aimed to minimize the number of routes and
reduce average travel time to hospitals. For this task, they developed a GA and evaluated it against an NNS heuristic.
The algorithms were compared across three types of scenarios: light damage (1 to ambulance capacity), medium
damage (1 to 8), and heavy damage (ambulance capacity to 8). Following simulation-based experiments on common
CVRP benchmarks, the results showed that the GA outperformed the NNS method, achieving shorter tour distances (see
Figure 13). A potential downside of this method is that it was limited to only a single facility and ambulance,
though the authors acknowledged that further study needs to be completed regarding this.
Table 13: Improvement of the GA over NNS for known CVRP Benchmarks.
Benchmark        Average Tour Length Improvement of GA over NNS (%)
ulysses-n22-k4   21.88
bays-n29-k5      12.77
A-n38-k5         9.65
A-n40-k5         9.63
E-n51-k6         9.05
P-n55-k10        8.85
P-n60-k10        7.27
B-n64-k9         6.12
E-n76-k9         6.08
P-n101-k4        5.62
Gendreau et al. designed and modeled an ALP after the double coverage variation [72]. The objective was to maximize
the coverage using two ambulances, constrained by actual requirements imposed by EMS service laws. Real and
randomly generated data points were used in conjunction with a Tabu Search heuristic, approaching near-optimal
results in a reasonable computing time compared to the CPLEX optimizer. While the algorithmic solution approaches
near-optimal, the calculation time is only marginally better than that of CPLEX. The computational resources available could
have been a factor, as this paper is now older and the method would most likely achieve better computing times on a modern CPU.
Erdogan et al. developed two models for maximizing the coverage by scheduling ambulance crews [73]. The first
sought to maximize the aggregate expected coverage, which was the ratio between the expected number of calls covered
and the total number of calls. This is represented in Equation 60, where $\delta_{ij}$ is the number of expected calls covered at hour $i$
by adding vehicle $j$, $y_{ij}$ is a binary variable indicating whether the total number of ambulance crews during
hour $i$ is at least $j$, $e_i$ is the average number of calls for hour $i$, and $h$ is the number of hours in the planning horizon.
The second was a "lexicographic biobjective" model (Equation 61) where the goal was to maximize both the minimum
expected coverage over every hour and the aggregate expected coverage. In this model, $w$ is equal to the
minimum expected coverage over every hour. The models were uniquely optimized with a parallel variation of the Tabu
Search, outperforming similar previous approaches.
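$$\max \frac{\sum_{i=1}^{h} \sum_{j=1} \delta_{ij}\, y_{ij}}{\sum_{i=1}^{h} e_i} \tag{60}$$
$$\max \left( w,\; \frac{\sum_{i=1}^{h} \sum_{j=1} \delta_{ij}\, y_{ij}}{\sum_{i=1}^{h} e_i} \right) \tag{61}$$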
Oberscheider and Hirsch aimed to ensure efficient transport for non-emergency patients utilizing real patient data from
Lower Austria’s Red Cross [53]. The authors stated that the work contrasted with prior variations, as they considered non-static
service times that depended on the combination of patients, their transport mode, the vehicle type, and the pickup or
delivery locations. Their model is based on the "static multi-depot heterogeneous dial-a-ride problem" (MD-H-DARP),
and to solve it they first generated all combinations of patient transports given a set of constraints. A "set partitioning"
step was then applied to these combinations and an initial solution was generated. They then passed these
combinations into a Tabu Search and further optimized the routing. The strategy was determined to be an improvement
over manual scheduling.
Repoussis et al. presented a MILP model to provide operation guidance, routing, and scheduling for mass casualty
incidents [54]. The aim was to allocate minimal resources while reducing the total flow and response times. To resolve
this problem, they developed a "hybrid multi-start local search" method which involved a MIP-based construction
heuristic followed by an iterated Tabu Search. The algorithm initially uses a "greedy randomized scheme" to find
and fix a portion of the assignment, and then the reduced problem is optimally solved. These initial solutions form
high-quality upper bounds and are the inputs into the iterated Tabu Search algorithm for further improvement. This
process is repeated until the condition for termination is met.
the position and Equation 63 for updating the velocity. In these equations, $pbest$ is the local best for a particle, $gbest$ is
the global best based on the fitness evaluation, $w$ is the inertia weight for exploring the space, $r_1$ and $r_2$ are random binary
numbers, $c$ is a weight factor, $v$ is the velocity of a particle, and $s$ is a particle’s position.
$$s_i^{k+1} = s_i^{k} + v_i^{k+1} \tag{62}$$
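The following is a minimal sketch of how the position update in Equation 62 can be combined with a velocity update. The velocity rule used here is the widely used canonical form with inertia weight w, acceleration factors c1 and c2, and random factors r1 and r2 drawn uniformly from [0, 1); these are assumptions and not necessarily the exact form or settings of the velocity equation referenced above. The test function and all parameter values are likewise illustrative.

```python
import random


def pso(f, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Random initial positions s and zero initial velocities v.
    s = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in s]                  # best position seen by each particle
    gbest = min(pbest, key=f)[:]               # best position seen by the swarm
    start_value = f(gbest)
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Assumed canonical velocity update.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - s[i][d])
                           + c2 * r2 * (gbest[d] - s[i][d]))
                # Position update, as in Equation 62.
                s[i][d] += v[i][d]
            if f(s[i]) < f(pbest[i]):
                pbest[i] = s[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest, start_value


sphere = lambda x: sum(t * t for t in x)       # toy objective with minimum at the origin
best, start_value = pso(sphere, dim=2)
print(best, sphere(best), start_value)
```

Since gbest is replaced only when a strictly better position is found, the returned solution is never worse than the best particle in the initial swarm.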
3.7 Clustering
The results, as shown in Table 16, showed that the Density-Based algorithm reduced the average distance by up to 50%,
proving the most effective of all the selections.
$$v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_i \tag{64}$$
$$v_i = \frac{1}{\sum w_i} \sum_{j=1}^{c_i} x_i\, w_i \tag{65}$$
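Equations 64 and 65 can be read as an unweighted and a weighted mean of the points assigned to a cluster. The sketch below follows that reading; treating the summand as the j-th member of the cluster, the two-dimensional points, and the function names are assumptions made for illustration.

```python
def centroid(points):
    """Unweighted cluster centre: the mean of the member points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def weighted_centroid(points, weights):
    """Weighted cluster centre: members with larger weights pull the centre more."""
    total = sum(weights)
    return tuple(sum(p[d] * w for p, w in zip(points, weights)) / total
                 for d in range(len(points[0])))


# Hypothetical incident coordinates and weights (e.g. call volumes).
incidents = [(0, 0), (2, 0), (4, 6)]
print(centroid(incidents))                       # → (2.0, 2.0)
print(weighted_centroid(incidents, [1, 1, 2]))   # → (2.5, 3.0)
```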
Li et al. confronted a location-based problem by utilizing real traffic information to minimize the average traveling
time to an incident [76]. They incorporated times based on actual GPS data and resolved the model with a local
search heuristic known as "partitioning around medoids" (PAM). A medoid is an object within a cluster that has a
minimal dissimilarity to others around it. Algorithm 5 presents this solution where the input is a road network $G$, a
set of emergency requests $R$, a travel time matrix $M$, and an initial set of $k$ ambulance stations $F_{ini}$. The output of
the algorithm is a set of $k$ ambulance stations. In it, each current medoid $p$ was replaced by every vertex $p'$, and the
average travel time of the replacement was estimated (lines 4 to 12). Following the replacement, the one with the largest
reduction is carried forward (line 14). Li et al. further improved Algorithm 5 by selecting $k$ initial stations, then
pruning unpromising vertex replacements. This was performed until the travel time could not be further reduced. The
results of this solution minimized travel time by 29.9% compared to the original locations that were in use.
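The swap procedure can be sketched as follows. This is a simplified reading of the description above and not a transcription of Algorithm 5 or of the pruning improvements in [76]; the road network is reduced to a small hypothetical travel time matrix, and the function names are assumptions.

```python
def average_travel_time(stations, requests, M):
    """Mean travel time from the nearest station to each request vertex."""
    return sum(min(M[s][r] for s in stations) for r in requests) / len(requests)


def pam(vertices, requests, M, initial):
    stations = list(initial)
    best = average_travel_time(stations, requests, M)
    improved = True
    while improved:
        improved = False
        best_swap = None
        # Try replacing each current medoid p with every other vertex p_new.
        for idx, p in enumerate(stations):
            for p_new in vertices:
                if p_new in stations:
                    continue
                trial = stations[:idx] + [p_new] + stations[idx + 1:]
                t = average_travel_time(trial, requests, M)
                if t < best:             # remember the swap with the largest reduction
                    best, best_swap = t, trial
        if best_swap is not None:
            stations = best_swap
            improved = True
    return sorted(stations), best


# Hypothetical network: six vertices on a line, travel time = distance along the line.
M = [[abs(i - j) for j in range(6)] for i in range(6)]
requests = [0, 0, 1, 4, 5, 5]
stations, avg_time = pam(range(6), requests, M, initial=[2, 3])
print(stations, avg_time)
# stations → [0, 5]
```

Starting from two stations in the middle of the line, the swaps move them towards the two ends where the requests concentrate, and the loop stops once no single replacement lowers the average travel time.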
4 Future Research
This area is not nearly as well explored as the VRP or MCP. While there are a number of current methodologies being
tested, the research appears concentrated on simulation and well-known methods. Additionally, a significant amount of
the currently released work has not been explored beyond its initial conference publication. Significant real-world data is
available through organizations like Ornge, yet their implemented research is locked at long-running ILP solutions [30].
In this case there have been contributions through studies like that by Pond et al. [32]. This example is still preliminary
and, similar to other related literature, has room for improvement. The following summarizes some of the future research
possibilities:
As already suggested throughout this survey, there is significant research in static variations of these problems. While
this research is valuable, most active systems must consider dynamic changes in the actual implementation.
Static problems are more acceptable in the case of ALP, yet are difficult for ARP as diversion of vehicles or unexpected
incidents are a regular occurrence. This opens the possibility for further research into dynamic scheduling and possibly
even location shifting for vehicles. Some research has suggested further building on the advantage of the parallelism
seen in evolutionary algorithms, as these types of techniques function better in a dynamically shifting environment [96].
These methods can be modified to become self-adaptive, whereby a set of configurations with solution parameters is
encoded into the individual solutions of the dynamic problem. Another possibility is modifying the search space from
discrete to continuous [97]. The theory is that this transfer allows the methods used to better mimic manual requests,
and it is noted that prediction-based methods have already been applied to continuous dynamic problems. In the case of
the VRP, current research has shown an improvement in accuracy when compared against variants using a discrete
encoding.
There are multiple references to both VRP and MCP, since they correspond very closely to ARP and ALP. In
general there are only small modifications made to the objective functions and the introduction of additional constraints.
This allows for the transference of existing algorithms from one to the other and makes it likely that research can be completed
using these techniques. These methods have already been proven in this domain, and with minor alterations there
is little reason not to apply them to these instances. Additionally, there is a lack of hybrid methodologies in the current
research. This type of optimization either uses multiple algorithms simultaneously for solving a problem or combines
multiple algorithms to utilize the advantages of each [98]. Admittedly, there is a difficulty with
successfully executing this, as hybridizing usually requires a substantial understanding of the respective algorithms.
Even with detailed knowledge, there is still a level of randomization and trial and error. Many algorithms may be
incompatible with each other, while simpler methods like local search may be quite easy to join to more complex
versions. This has already been applied successfully to problems like the VRP [99]. However, even in this case the
research is still very preliminary at best and limited to evolutionary approaches.
A somewhat surprising missing link in the list of techniques is the set of methods from the domain of machine learning. In
general, neural network-based methods are not appropriate for all optimization problems, although the lack of research
in this area does offer a great deal of opportunity. One method seeing some gradual introduction is reinforcement
learning. Other methods in the same domain, like supervised learning, require a set of labelled training data to be utilized
successfully. This may be difficult to achieve in certain cases, as the exact solutions needed for training may be unknown
and difficult to obtain. Reinforcement learning resolves this issue by eliminating the requirement of a labelled set.
This area uses a state-action pair, where the goal is to train a model to take actions which maximize a reward. This can
almost be seen as a sort of game, as the system is trained to take the best possible move-set in order to be successful
[100]. This is not necessarily a new area; however, recent research has built upon this concept with the introduction of
deep reinforcement learning [101]. Using deep learning techniques like convolutional neural networks and combining
them with algorithms like Q-learning, extremely difficult problems can be solved. Two issues with this method are
the time required for training and the formulation of the states to actions. Machine learning models are notorious
for taking a significant amount of time to train, and with changing requirements this may make it difficult to maintain
an acceptable model. In addition, the input and output will need to be encoded properly in order to
implement this technique. A great deal of understanding of the problem space will be needed, and the data must be
designed in a way that ensures the results are meaningful. Research in this area is not without a starting point, and there
has been some work completed related to the VRP and other combinatorial optimization problems [102, 103].
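As a small illustration of the state-action idea, the sketch below shows a single tabular Q-learning update, the basic rule that deep variants approximate with a neural network. The state names, action names, reward, learning rate alpha, and discount factor gamma are hypothetical and are not taken from [100] or [101].

```python
def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Move Q[state][action] towards reward + discounted best future value."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q[state][action]


# Hypothetical two-state table: values are current estimates of long-run reward.
Q = {
    "call_waiting": {"dispatch_nearest": 0.0, "hold": 0.0},
    "en_route": {"dispatch_nearest": 1.0, "hold": 2.0},
}
new_value = q_update(Q, "call_waiting", "dispatch_nearest",
                     reward=1.0, next_state="en_route")
print(round(new_value, 6))
# → 1.4
```

No labelled optimal decision is needed for this update; the estimate improves only from the observed reward and the current value of the next state, which is the property that distinguishes it from supervised learning.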
5 Conclusion
The design of EMS systems is incredibly complicated, yet very important given the critical nature of their environment.
Both location and routing problems have their own respective issues and vary greatly in design. Primary focus has been
towards conferences, which restricts more advanced findings and limits a great deal of this research to preliminary steps.
Additionally, both of these problems are NP-hard and utilize similar meta-heuristics for their solutions. As seen in
Table 1 and Table 2, much of the work has been completed on static variations of the problem with heavy emphasis on
simulation or biological algorithms. In the case of the former, there is a severe limitation, as simulation does
not necessarily lead to the best possible solution. Its primary purpose is to emulate an existing or theoretical system,
making true optimization a problem and generalization not a guarantee. This is not to suggest simulation is useless;
on the contrary, it may show faults within an existing system. Rather than being utilized on its own, it should be viewed as an
initial step before actual algorithms are introduced for improvement. Overall, there is still significant research required
to ensure optimal EMS systems, with current literature providing groundwork for advancement and comparison.
References
[1] Paolo Toth and Daniele Vigo. The vehicle routing problem. SIAM, 2002.
[2] Alexander A. Ageev and Maxim I. Sviridenko. Approximation algorithms for maximum coverage and max
cut with given sizes of parts. In Gérard Cornuéjols, Rainer E. Burkard, and Gerhard J. Woeginger, editors,
Integer Programming and Combinatorial Optimization, pages 17–30, Berlin, Heidelberg, 1999. Springer Berlin
Heidelberg.
[3] E. Mouhcine, Y. Karouani, K. Mansouri, and Y. Mohamed. Toward a distributed strategy for emergency
ambulance routing problem. pages 1–4. IEEE, April 2018.
[4] Luce Brotcorne, Gilbert Laporte, and Frédéric Semet. Ambulance location and relocation models. European
Journal of Operational Research, 147:451–463, 06 2003.
[5] Suresh Nanda Kumar and Ramasamy Panneerselvam. A survey on the vehicle routing problem and its variants.
Intelligent Information Management, 04, 01 2012.
[6] M. Bhuvaneswari, Sumathy Eswaran, and S.P. Rajagopalan. A survey of vehicle routing problem and its solutions
using bio-inspired algorithms. International Journal of Pure and Applied Mathematics, 118:259–264, 01 2018.
[7] Ramasamy Panneerselvam and Ramasamy Panneerselvam. Literature review of covering problem in operations
management. International Journal of Services, Economics and Management, 2:267–285, 01 2010.
[8] Bruce Golden, Saahitya Raghavan, and Edward Wasil. The vehicle routing problem: Latest advances and new
challenges, volume 43. 01 2008.
[9] Hanaa A. E. Essa, Yasser M. Abd El-Latif, Salwa M. Ali, and Soheir M. Khamis. A new approximation algorithm
for k-set cover problem. Arabian Journal for Science and Engineering, 41(3):935–940, Mar 2016.
[10] C.-Y Liong, I. Wan, and Khairuddin Omar. Vehicle routing problem: Models and solutions. Journal of Quality
Measurement and Analysis, 4:205–218, 01 2008.
[11] Tomislav Erdelić and Tonci Caric. A survey on the electric vehicle routing problem: Variants and solution
approaches. Journal of Advanced Transportation, 2019:1–48, 05 2019.
[12] Sam Thangiah, Jean-Yves Potvin, and Tong Sun. Heuristic approaches to vehicle routing with backhauls and
time windows. Computers & Operations Research, 23:1043–1057, 03 1998.
[13] Nasser A. El-Sherbeny. Vehicle routing with time windows: An overview of exact, heuristic and metaheuristic
methods. Journal of King Saud University - Science, 22(3):123–131, 2010.
[14] Jing Fan. The vehicle routing problem with simultaneous pickup and delivery based on customer satisfaction.
Procedia Engineering, 15:5284–5289, 2011. CEIS 2011.
[15] Feiyue Li, Bruce Golden, and Edward Wasil. The open vehicle routing problem: Algorithms, large-scale test
problems, and computational results. Computers & Operations Research, 34(10):2918–2930, 2007.
[16] José Brandão and A Mercer. The multi-trip vehicle routing problem. Journal of the Operational Research
Society, 49:799–805, 08 1998.
[17] Tackling the complexity of designing multiproduct multistage batch plants with parallel lines: the application
of a cooperative optimization approach. In Anton Friedl, Jiří J. Klemeš, Stefan Radl, Petar S. Varbanov, and
Thomas Wallek, editors, 28th European Symposium on Computer Aided Process Engineering, volume 43 of
Computer Aided Chemical Engineering, pages 979–984. Elsevier, 2018.
[18] Abdul Kadar, Abdul Kadar Muhammad Masum, Md Faruque, Mohammad Shahjalal, Md Iqbal, and Iqbal Sarker.
Solving the vehicle routing problem using genetic algorithm. International Journal of Advanced Computer
Science and Applications, 2, 08 2011.
[19] S.C. Ho and D. Haugland. A tabu search heuristic for the vehicle routing problem with time windows and split
deliveries. Computers & Operations Research, 31(12):1947–1964, 2004.
[20] A.A. Sarwono, T.J. Ai, and S.S. Wigati. Combination of nearest neighbor and heuristics algorithms for sequential
two dimensional loading capacitated vehicle routing problem. IOP Conference Series: Materials Science and
Engineering, 166:012029, Jan 2017.
[21] Lijun Wei, Zhenzhen Zhang, Defu Zhang, and Stephen C.H. Leung. A simulated annealing algorithm for the
capacitated vehicle routing problem with two-dimensional loading constraints. European Journal of Operational
Research, 265(3):843–859, 2018.
[22] John E. Bell and Patrick R. McMullen. Ant colony optimization techniques for the vehicle routing problem.
Advanced Engineering Informatics, 18(1):41–48, 2004.
[23] H. Shen, Y. Zhu, T. Liu, and L. Jin. Particle swarm optimization in solving vehicle routing problem. In 2009
Second International Conference on Intelligent Computation Technology and Automation, volume 1, pages
287–291, Oct 2009.
[24] Darshan Chauhan, Avinash Unnikrishnan, and Miguel Figliozzi. Maximum coverage capacitated facility location
problem with range constrained drones. Transportation Research Part C: Emerging Technologies, 99:1–18,
2019.
[25] Alberto Caprara, Paolo Toth, and Matteo Fischetti. Algorithms for the set covering problem. Annals of Operations
Research, 98(1):353–371, Dec 2000.
[26] Asaf Levin. Approximating the unweighted k-set cover problem: Greedy meets local search. In Thomas Erlebach
and Christos Kaklamanis, editors, Approximation and Online Algorithms, pages 290–301, Berlin, Heidelberg,
2007. Springer Berlin Heidelberg.
[27] Soumen Atta, Priya Ranjan Sinha Mahapatra, and Anirban Mukhopadhyay. Solving maximal covering location
problem using genetic algorithm with local refinement. Soft Computing, 22(12):3891–3906, Jun 2018.
[28] S. Balaji and N. Revathi. A new approach for solving set covering problem using jumping particle swarm
optimization method. 15(3):503–517, September 2016.
[29] Ralf Schleiffer, Jens Wollenweber, Hans-Jürgen Sebastian, Florian Golm, and Natasha Kapoustina. Application
of genetic algorithms for the design of large-scale reverse logistic networks in Europe’s automotive industry.
volume 37, 01 2004.
[30] Timothy A. Carnes, Shane G. Henderson, David B. Shmoys, Mahvareh Ahghari, and Russell D. MacDonald.
Mathematical programming guides air-ambulance routing at Ornge. INFORMS Journal on Applied Analytics,
43(3):232–239, 2013.
[31] Geoffrey N. Berlin and Jon C. Liebman. Mathematical analysis of emergency ambulance location.
Socio-Economic Planning Sciences, 8(6):323–328, 1974.
[32] Geoffrey T. Pond and Greg McQuat. Optimizing fleet staging of air ambulances in the province of Ontario. In
David Fagan, Carlos Martín-Vide, Michael O’Neill, and Miguel A. Vega-Rodríguez, editors, Theory and Practice
of Natural Computing, pages 215–224, Cham, 2018. Springer International Publishing.
[33] P. L. van den Berg, J. T. van Essen, and E. J. Harderwijk. Comparison of static ambulance location models. In
2016 3rd International Conference on Logistics Operations Management (GOL), pages 1–10, May 2016.
[34] Verena Schmid. Solving the dynamic ambulance relocation and dispatching problem using approximate dynamic
programming. European Journal of Operational Research, 219(3):611–621, 2011. Feature Clusters.
[35] Gianpaolo Ghiani, Francesca Guerriero, Gilbert Laporte, and Roberto Musmanno. Real-time vehicle routing:
Solution concepts, algorithms and parallel computing strategies. European Journal of Operational Research,
151(1):1–11, 2003.
[36] Victor Pillac, Michel Gendreau, Christelle Guéret, and Andrés L. Medaglia. A review of dynamic vehicle routing
problems. European Journal of Operational Research, 225(1):1–11, 2013.
[37] R. Montemanni, L. M. Gambardella, A. E. Rizzoli, and A. V. Donati. Ant colony system for a dynamic vehicle
routing problem. Journal of Combinatorial Optimization, 10(4):327–343, Dec 2005.
[38] Zuzana Borčinova. Two models of the capacitated vehicle routing problem. Croatian Operational Research
Review, 8:463–469, 12 2017.
[39] Xueping Li, Zhaoxiao Zhao, and Tami Wyatt. Covering models and optimization techniques for emergency
response facility location and planning: A review. Mathematical Methods of Operations Research, 74:281–310,
12 2011.
[40] Juan Carlos Dibene, Yazmin Maldonado, Carlos Vera, Mauricio de Oliveira, Leonardo Trujillo, and Oliver
Schütze. Optimizing the location of ambulances in Tijuana, Mexico. Computers in Biology and Medicine,
80:107–115, 2017.
[41] Chun-Ling Chuang and Rong-Ho Lin. A maximum expected covering model for an ambulance location problem.
Journal of the Chinese Institute of Industrial Engineers, 24(6):468–474, 2007.
[42] Ettore Lanzarone, Enrico Galluccio, Valérie Bélanger, Vittorio Nicoletta, and Angel Ruiz. A recursive
optimization-simulation approach for the ambulance location and dispatching problem. In Proceedings of
the 2018 Winter Simulation Conference, WSC ’18, pages 2530–2541, Piscataway, NJ, USA, 2018. IEEE Press.
[43] Verena Schmid and Karl F. Doerner. Ambulance location and relocation problems with time-dependent travel
times. European Journal of Operational Research, 207(3):1293–1303, 2010.
[44] M. S. Maxwell, S. G. Henderson, and H. Topaloglu. Ambulance redeployment: An approximate dynamic
programming approach. pages 1850–1860. IEEE, Dec 2009.
[45] Eric Cao Ni, Susan R. Hunter, Shane G. Henderson, and Huseyin Topaloglu. Exploring bounds on ambulance
deployment policy performance. In Proceedings of the Winter Simulation Conference, WSC ’12, pages
45:1–45:12. Winter Simulation Conference, 2012.
[46] R. Tavakkoli-Moghaddam, P. Memari, and E. Talebi. A bi-objective location-allocation problem of temporary
emergency stations and ambulance routing in a disaster situation. pages 1–4. IEEE, April 2018.
[47] Ali Javidaneh, Mahnaz Ataee, and Ali Alesheikh. Ambulance routing with ant colony optimization. 09 2019.
[48] Takwa Tlili, Sofiene Abidi, and Saoussen Krichen. A mathematical model for efficient emergency transportation
in a disaster situation. The American Journal of Emergency Medicine, 36(9):1585–1590, 2018.
[49] Manuel Fogue, Julio A. Sanguesa, Fernando Naranjo, Jesus Gallardo, Piedad Garrido, and Francisco J. Martinez.
Non-emergency patient transport services planning through genetic algorithms. Expert Syst. Appl.,
61(C):262–271, November 2016.
[50] Adrian Ramirez-Nafarrate, John W. Fowler, and Teresa Wu. Design of centralized ambulance diversion policies
using simulation-optimization. In Proceedings of the Winter Simulation Conference, WSC ’11, pages 1251–1262.
Winter Simulation Conference, 2011.
[51] Luca Talarico, Frank Meisel, and Kenneth Sörensen. Ambulance routing for disaster response with patient
groups. Computers & Operations Research, 56:120–133, 2014.
[52] Hazan Dağlayan and Murat Karakaya. An optimized ambulance dispatching solution for rescuing injures after
disaster. Universal Journal of Engineering Science, Sep 2016.
[53] Marco Oberscheider and Patrick Hirsch. Analysis of the impact of different service levels on the workload
of an ambulance service provider. BMC Health Services Research, 16:487, 09 2016.
[54] Panagiotis P. Repoussis, Dimitris C. Paraskevopoulos, Alkiviadis Vazacopoulos, and Nathaniel Hupert.
Optimizing emergency preparedness and resource utilization in mass-casualty incidents. European Journal of
Operational Research, 255(2):531–544, 2016.
[55] Takwa Tlili, Marwa Harzi, and Saoussen Krichen. Swarm-based approach for solving the ambulance routing
problem. Procedia Comput. Sci., 112(C):350–357, September 2017.
[56] C. R. Kamireddy, Bingisateesh, and B. N. Keshavamurthy. Efficient routing of 108 ambulances using clustering
techniques. In 2016 IEEE International Conference on Computational Intelligence and Computing Research
(ICCIC), pages 1–6. IEEE, Dec 2016.
[57] A. Shuib and Z. A. Zaharudin. Taz_opt: A goal programming model for ambulance location and allocation.
pages 945–950. IEEE, Dec 2011.
[58] B. Lahijanian, M. H. F. Zarandi, and F. V. Farahani. Double coverage ambulance location modeling using fuzzy
traveling time. pages 1–6. IEEE, Oct 2016.
[59] S. Khodaparasti and H. R. Maleki. A new combined dynamic location model for emergency medical services in
fuzzy environment. pages 1–6. IEEE, Aug 2013.
[60] Hozumi Morohosi and Takehiro Furuta. Hypercube simulation analysis for a large-scale ambulance service
system. In Proceedings of the Winter Simulation Conference, WSC ’12, pages 108:1–108:8. Winter Simulation
Conference, 2012.
[61] Taesik Lee, Soo-Haeng Cho, Hoon Jang, and John G. Turner. A simulation-based iterative method for a trauma
center: Air ambulance location problem. In Proceedings of the Winter Simulation Conference, WSC ’12, pages
85:1–85:12. Winter Simulation Conference, 2012.
[62] Jo Røislien, Pieter L van den Berg, Thomas Lindner, E Zakariassen, O Uleberg, Karen Aardal, and J. Theresia
van Essen. Comparing population and incident data for optimal air ambulance base locations in Norway. 2018.
[63] Pieter L. van den Berg, Peter Fiskerstrand, Karen Aardal, Jørgen Einerkjaer, Trond Thoresen, and Jo Røislien.
Improving ambulance coverage in a mixed urban-rural region in Norway using mathematical modeling. PLOS
ONE, 14(4):1–14, 04 2019.
[64] Armann Ingolfsson, Susan Budge, and Erhan Erkut. Optimal ambulance location with random delays and travel
times. Health care management science, 11:262–74, 10 2008.
[65] R. Boujemaa, S. Hammami, and A. Jebali. A stochastic programming model for ambulance location allocation
problem in the Tunisian context. pages 1–6. IEEE, Oct 2013.
[66] R. Zhang and B. Zeng. Ambulance deployment with relocation through robust optimization. IEEE Transactions
on Automation Science and Engineering, 16(1):138–147, Jan 2019.
[67] M. Benabdouallah, C. Bojji, and O. E. Yaakoubi. Deployment and redeployment of ambulances using a heuristic
method and an ant colony optimization — case study. pages 1–4. IEEE, Nov 2016.
[68] Ana Paula Iannoni, Reinaldo Morabito, and Cem Saydam. An optimization approach for ambulance location and
the districting of the response segments on highways. European Journal of Operational Research, 195(2):528–542,
2009.
[69] Satoshi Sasaki, Alexis Comber, Hiroshi Suzuki, and Chris Brunsdon. Using genetic algorithms to optimise current
and future health planning - the example of ambulance locations. International journal of health geographics,
9:4, 01 2010.
[70] S. M. Pacheco, O. Schütze, C. Vera, L. Trujillo, and Y. Maldonado. Solving the ambulance location problem in
Tijuana-Mexico using a continuous location model. pages 2631–2638. IEEE, May 2015.
[71] Meryam Benabdouallah and Chakib Bojji. Comparison between GA and ACO for emergency coverage problem
in a smart healthcare environment. In Proceedings of the 2017 International Conference on Smart Digital
Environment, ICSDE ’17, pages 48–55, New York, NY, USA, 2017. ACM.
[72] Michel Gendreau, Gilbert Laporte, and Frédéric Semet. Solving an ambulance location model by tabu search.
Location Science, 5(2):75–88, 1997.
[73] Güneş Erdoğan, Erhan Erkut, Armann Ingolfsson, and Gilbert Laporte. Scheduling ambulance crews for
maximum coverage. Journal of the Operational Research Society, 61:543–550, 04 2008.
[74] Karl Doerner, Walter Gutjahr, Richard Hartl, Michaela Karall, and Marc Reimann. Heuristic solution of an
extended double-coverage ambulance location problem for Austria. CEJOR. Central European Journal of
Operations Research, 13, 01 2005.
[75] W.A.L.W.M. Hatta, Cheng Siong Lim, Amar Faiz Zainal Abidin, Mohd Azizan, and Soo Siang Teoh.
Solving maximal covering location with particle swarm optimization. International Journal of Engineering and
Technology, 5:3301–3306, 08 2013.
[76] Yuhong Li, Yu Zheng, Shenggong Ji, Wenjun Wang, Leong Hou U, and Zhiguo Gong. Location selection for
ambulance stations: A data-driven approach. In Proceedings of the 23rd SIGSPATIAL International Conference
on Advances in Geographic Information Systems, SIGSPATIAL ’15, pages 85:1–85:4, New York, NY, USA,
2015. ACM.
[77] Michel Bierlaire. Simulation and optimization: A short review. Transportation Research Part C: Emerging
Technologies, 55:4–13, 2015. Engineering and Applied Sciences Optimization (OPT-i) - Professor Matthew G.
Karlaftis Memorial Issue.
[78] J.R. Seay and F. You. 4 - biomass supply, demand, and markets. In Jens Bo Holm-Nielsen and Ehiaze Augustine
Ehimen, editors, Biomass Supply Chains for Bioenergy and Biorefining, pages 85–100. Woodhead Publishing,
2016.
[79] M.L. Littman. Markov decision processes. In Neil J. Smelser and Paul B. Baltes, editors, International
Encyclopedia of the Social & Behavioral Sciences, pages 9240–9242. Pergamon, Oxford, 2001.
[80] William A. Poe and Saeid Mokhatab. Chapter 4 - process optimization. In William A. Poe and Saeid Mokhatab,
editors, Modeling, Control, and Optimization of Natural Gas Processing Plants, pages 173–213. Gulf
Professional Publishing, Boston, 2017.
[81] Mehrdad Tamiz, Dylan Jones, and Carlos Romero. Goal programming for decision making: An overview of the
current state-of-the-art. European Journal of Operational Research, 111(3):569–581, 1998.
[82] Jean-Michel Réveillac. 4 - dynamic programming. In Jean-Michel Réveillac, editor, Optimization Tools for
Logistics, pages 55–75. Elsevier, 2015.
[83] Adrian Ramirez-Nafarrate, A. Baykal Hafizoglu, Esma S. Gel, and John W. Fowler. Comparison of ambulance
diversion policies via simulation. In Proceedings of the Winter Simulation Conference, WSC ’12, pages
86:1–86:12. Winter Simulation Conference, 2012.
[84] Bülent Çatay. Ant Colony Optimization and Its Application to the Vehicle Routing Problem with Pickups and
Deliveries, pages 219–244. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009.
[85] Holger H. Hoos and Thomas Stützle. 2 - SLS methods. In Holger H. Hoos and Thomas Stützle, editors, Stochastic
Local Search, The Morgan Kaufmann Series in Artificial Intelligence, pages 61–112. Morgan Kaufmann, San
Francisco, 2005.
[86] Barrie M. Baker and M.A. Ayechew. A genetic algorithm for the vehicle routing problem. Computers &
Operations Research, 30(5):787–800, 2003.
[87] R. Leardi. 1.20 - genetic algorithms. In Steven D. Brown, Romà Tauler, and Beata Walczak, editors,
Comprehensive Chemometrics, pages 631–653. Elsevier, Oxford, 2009.
[88] Birger Funke, Tore Grünert, and Stefan Irnich. Local search for vehicle routing and scheduling problems: Review
and conceptual integration. Journal of Heuristics, 11(4):267–306, Jul 2005.
[89] David Pisinger and Stefan Ropke. Large Neighborhood Search, pages 399–419. 09 2010.
[90] Pierre Hansen, Nenad Mladenović, and José A. Moreno Pérez. Variable neighbourhood search: methods and
applications. Annals of Operations Research, 175(1):367–407, Mar 2010.
[91] Nitin Bhatia and Vandana. Survey of nearest neighbor techniques. International Journal of Computer Science
and Information Security, 8, 07 2010.
[92] Stefan Edelkamp and Stefan Schrödl. Chapter 14 - selective search. In Stefan Edelkamp and Stefan Schrödl,
editors, Heuristic Search, pages 633–669. Morgan Kaufmann, San Francisco, 2012.
[93] Clara Marina Martínez and Dongpu Cao. 2 - integrated energy management for electrified vehicles. In
Clara Marina Martínez and Dongpu Cao, editors, iHorizon-Enabled Energy Management for Electrified Vehicles,
pages 15–75. Butterworth-Heinemann, 2019.
[94] Rui Xu and D. Wunsch. Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3):645–678,
May 2005.
[95] J. Zhang. An efficient density-based clustering algorithm for the capacitated vehicle routing problem. In 2017
International Conference on Computer Network, Electronic and Automation (ICCNEA), pages 465–469, Sep.
2017.
[96] Nasser R. Sabar, Ashish Bhaskar, Edward Chung, Ayad Turky, and Andy Song. A self-adaptive evolutionary
algorithm for dynamic vehicle routing problems with traffic congestion. Swarm and Evolutionary Computation,
44:1018–1027, 2019.
[97] Michał Okulewicz and Jacek Mańdziuk. A metaheuristic approach to solve dynamic vehicle routing problem in
continuous search space. Swarm and Evolutionary Computation, 48:44–61, 2019.
[98] Xin-She Yang. Chapter 15 - other algorithms and hybrid algorithms. In Xin-She Yang, editor, Nature-Inspired
Optimization Algorithms, pages 213–226. Elsevier, Oxford, 2014.
[99] Dragan Simić and Svetlana Simić. Hybrid artificial intelligence approaches on vehicle routing problem in
logistics distribution. In Emilio Corchado, Václav Snášel, Ajith Abraham, Michał Woźniak, Manuel Graña, and
Sung-Bae Cho, editors, Hybrid Artificial Intelligent Systems, pages 208–220, Berlin, Heidelberg, 2012. Springer
Berlin Heidelberg.
[100] Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement learning: A survey. J. Artif.
Int. Res., 4(1):237–285, May 1996.
[101] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and
Martin A. Riedmiller. Playing Atari with deep reinforcement learning. ArXiv, abs/1312.5602, 2013.
[102] Mohammadreza Nazari, Afshin Oroojlooy jadid, Lawrence Snyder, and Martin Takáč. Deep reinforcement
learning for solving the vehicle routing problem. 02 2018.
[103] Hanjun Dai, Elias B. Khalil, Yuyu Zhang, Bistra Dilkina, and Le Song. Learning combinatorial optimization
algorithms over graphs. In Proceedings of the 31st International Conference on Neural Information Processing
Systems, NIPS’17, pages 6351–6361, USA, 2017. Curran Associates Inc.