Siméon Capy et al. / International Journal of Automotive Engineering
Vol.14, No.1(2023)
Review Article 20234115

Pedestrians and Cyclists’ Intention Estimation for the Purpose of Autonomous Driving
- A Systematic Review -

Siméon Capy 1) Gentiane Venture 1) Pongsathorn Raksincharoensak 1)


1) Tokyo University of Agriculture and Technology, Graduate School of Engineering
2-24-16 Naka-cho, Koganei, Tokyo, 184-8588, Japan (E-mail: [email protected])

Received on September 28th 2022

ABSTRACT: This article provides a systematic review of research articles on pedestrians and cyclists’ intention recognition to be integrated into autonomous vehicles, especially for decision making and motion planning. We first describe why the intention recognition of pedestrians and cyclists is suitable and necessary for autonomous vehicles and why they cannot rely only on traffic regulations. Then, we summarise, amongst others, the methodology and sensors used by eighteen peer-reviewed research articles published in relevant conferences and journals. We performed a systematic review of articles from the last 10 years from the following databases: IEEE Xplore, Science Direct, ACM digital library, Springer Link, MDPI and Web of Science. We observe that most of the collected articles rely on several sensors, predominantly including video. They mostly try to obtain the probability of crossing or the trajectory of the pedestrian/cyclist, mostly using a Recurrent Neural Network. In addition to their algorithmic contribution, 4 studies also provide a dataset. We conclude this article by discussing the remaining open challenges.

KEY WORDS: Active Safety, Autonomous Vehicles, Vulnerable Road Users, Motion Prediction and Intention Prediction

Table 1: Traffic law differences for P&Cs

1. Introduction

Although successful Autonomous Vehicles (AVs) have already been developed for motorways (1), urban areas are more challenging. Even if the speed limit is lower, the traffic flow is intermittent, with many stops and shorter braking distances between vehicles. In addition, intersections are structurally complicated, and traffic through them is not as smooth as on motorways: yield/stop signs and traffic lights require negotiation with other traffic participants. Finally, one of the biggest challenges is the uncertainty added by vulnerable road users, i.e., pedestrians and cyclists (P&Cs).
Indeed, the behaviour of P&Cs is harder to predict, as they tend to follow the rules less strictly, which forces AVs to estimate the actions of P&Cs as a human driver would (2)–(4). Another point is the difference between countries regarding the traffic code and culture-related behaviour. Finally, Fig. 1 shows the importance of taking countermeasures to reduce the number of P&C deaths, especially in Japan. Even if the rate is lower in France, 72% of deadly accidents involving pedestrians happen in cities, where intention estimation for AVs could reveal its full potential.
In this paper, we first identify the reasons why prediction is required, especially in urban environments, and compare 3 countries: Japan, Germany and France. Then, we present the methodology used to fetch the different studies, before presenting them. We finally discuss the open challenges.

2. The Needs of Prediction

One key point while driving is to consider the differences between countries, whether in law or in culture. AVs have to adapt to them to drive more safely. Even if some international treaties exist to standardise the rules, e.g., the Vienna Convention on Road Signs and Signals, not all countries have ratified it, notably the USA and Japan. Apart from the driving side, signs and crossroad design can

©2023 Society of Automotive Engineers of Japan, Inc. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike license.
Copyright © 2015 Society of Automotive Engineers of Japan, Inc. All rights reserved

vary; in addition, the way the law punishes some misbehaviour of P&Cs could have an impact on them, see Table 1. For example, Hell et al. showed that the presence of a Pedestrian Flashing Green light (PFG) has an impact on the walking speed of Japanese pedestrians, contrary to their German counterparts (5).

The AVs will have to deal with different behaviours, but also interpret some laws. Even if in Japan it is technically forbidden to ride bicycles on sidewalks (sidewalks with no signs indicating a space shared with pedestrians), the police do not take action and let people do it; most Japanese cyclists ride on sidewalks, even if an on-carriageway cycle lane exists.

Some specific urban planning can be found in cities, such as the 30 km/h zone (see footnote 1) that aims to facilitate soft mobility and the safety of pedestrians by reducing the death rate in case of accident (12). In those zones, road markings and signs tend to disappear, as do traffic lights. Pedestrians can cross where they want, and bicycles can ride the wrong way. Hence, the prediction of the actions of P&Cs by AVs is essential because one cannot rely on established traffic structures. In addition, some European countries have Encounter zones (zone de rencontre in French or Begegnungszone in German). In those zones, the speed limit is 20 km/h, pedestrians have priority over vehicles, and they can walk anywhere on the road.

Finally, with the development of cycling, especially in Europe, some innovations appeared such as the Turn-Right for Cyclist that allows a cyclist to turn right at a red traffic light if the way is free

(a) Japan, data from the National Police Agency

(b) Metropolitan France, data from the Observatoire national interministériel de la sécurité routière and the
Institut national de la statistique et des études économiques

Figure 1: Deaths in traffic accidents per 100,000 persons, by type of user

1 20 mph zone in the UK or USA


(like a give way sign), see Fig. 2. In the USA every road user can do so, but in Europe or Japan, it is not allowed to turn right if the traffic light is red.

(a) Germany (b) France
Figure 2: Sign for the Turn-Right for Cyclist. It can be replaced by a bicycle-shaped flashing orange light

2.1. Pedestrian behaviour

Studies of pedestrian behaviour are important for Road Traffic Safety (RTS), as pedestrians are among the most vulnerable road users, and what has been learnt can be used for AV algorithms. The unpredictable behaviour of pedestrians is responsible for most accidents involving them in the US; for example, two-thirds of pedestrians did not look around before crossing the road (2). The authors also mentioned that over two-thirds of pedestrian fatalities occur outside of intersections. This finding reinforces the need for AVs to predict pedestrian movements and not rely only on the road design.

Hell et al. (5) compared the behaviour of pedestrians in Japan, Germany and France. For example, French pedestrians are more likely to cross at red (55%) than Japanese ones (1.5%). The Japanese collectivist culture tends to reduce violations: if the number of waiting pedestrians increases, the number of illegal crossings drops by 70% in Japan, and by only 37% in France (13)(14). The mimicry effect amongst pedestrians is stronger in France than in Japan (15). The latter studies (15)(14) also show a difference according to the gender of pedestrians: men are more likely to take risks than women.

2.2. Cyclist behaviour

According to Billot-Grasset et al. (16), many accidents involving cyclists occur because of a lack of attention on the part of the cyclist. The infrastructure can also be responsible for their poor visibility. In addition, faulty maintenance of the bicycle (e.g., the brakes) is responsible for some accidents. Finally, alcohol consumption and excessive speed can also be responsible for accidents.

Even if specific layouts can increase the safety of cyclists, such as segregated cycle lanes or greenways, their intersections with roads are a major safety concern. As demonstrated by Anderson et al., even the presence of flashing signals does not necessarily make drivers slow down (17). Moreover, most cyclists do not push the button to announce their intention to cross, as it is inconvenient for them to do so. Again, as with pedestrians, AVs cannot rely only on road rules to interact with cyclists and need to perceive and interpret their intention.

2.3. Application to Next Generation of Advanced Driver Assistance Systems

Driving on urban roads involves potential risks, such as when passing a blind area or an area of poor visibility, where there is a risk of collision with pedestrians who may dart out. Although autonomous emergency braking systems (AEBS) for pedestrians are currently available on the market, the current systems cannot avoid collisions in all cases, especially when a pedestrian darts out from a blind area or suddenly moves into the driving corridor of the vehicle with a short time margin to collision, or when a pedestrian suddenly changes his or her walking direction and enters the driving corridor of the vehicle.

In such risky traffic situations, driving school instructors, referred to as expert drivers in this paper, mitigate the risk of collision by risk-predictive driving, such as reducing speed and taking sufficient lateral distance in advance when passing pedestrians who may dart out into the driving corridor. If a pedestrian actually darts out in a critical case, emergency braking can then successfully avoid the collision thanks to the risk-predictive driving. Therefore, an assistance system that helps drivers perform risk-predictive driving in risky traffic situations is a promising way of improving road safety.

To develop the abovementioned risk-predictive driver assistance system, a reasonable and interpretable risk assessment method is essential, as well as motion planning based on the risk assessment results. Reasonable risk assessment strongly relates to how the future position of pedestrians or cyclists can be predicted. As examples of active safety research in this area, Raksincharoensak proposed an integrated lateral and longitudinal motion planning and control for overtaking a pedestrian ahead, considering the future motion of the pedestrian under the assumption that the pedestrian may change direction and velocity and jaywalk into the driving corridor (18). Inoue et al. proposed a risk potential optimization method considering multiple objects in urban road environments (19). Yamaguchi et al. proposed a simulation model to analyse the intention of pedestrians crossing the road, which can be integrated into the design of autonomous driving control (20). Wang et al. studied interactive dynamic speed planning to cope with the sudden crossing of pedestrians in parking areas using spatiotemporal interaction, describing the pedestrian crossing risk assessment. These systems potentially reduce the frequency of unnecessary harsh emergency braking in vehicle control (21).

Figure 3: Examples of motion planning and control considering pedestrian motion prediction proposed by Raksincharoensak (18)

In designing the autonomous collision avoidance system, it is important to know the representative characteristics of pedestrian and cyclist motions to set the activation strategy for the system. A past study by Raksincharoensak et al. analysed pedestrian and cyclist motions from video images collected from drive recorders in taxis in Japan (22). Figure 4 shows the pedestrian and cyclist darting-out speed and their appearance timing. These key parameters are strongly relevant to the design of the activation timing of active safety systems in intelligent vehicles.
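
The link between appearance timing and activation timing can be illustrated with a simple stopping-distance calculation. The Python sketch below is not taken from the cited studies: it assumes a constant deceleration, a fixed system latency and a pedestrian who stays in the driving corridor, and the function names and numerical values are hypothetical. It gives the smallest time margin to collision, at the moment a pedestrian appears, for which emergency braking alone can still stop the vehicle.

```python
import math


def stopping_distance(speed_mps, decel_mps2, latency_s):
    """Distance covered during the system latency plus constant-deceleration braking."""
    return speed_mps * latency_s + speed_mps ** 2 / (2.0 * decel_mps2)


def min_time_margin(speed_mps, decel_mps2, latency_s):
    """Smallest time margin to collision (gap / speed) that still allows a full stop."""
    return stopping_distance(speed_mps, decel_mps2, latency_s) / speed_mps


def max_approach_speed(gap_m, decel_mps2, latency_s):
    """Largest speed v satisfying v * latency + v^2 / (2 * decel) <= gap.

    This is the positive root of v^2 + 2*a*t*v - 2*a*gap = 0.
    """
    a, t = decel_mps2, latency_s
    return -a * t + math.sqrt((a * t) ** 2 + 2.0 * a * gap_m)


# Hypothetical values: 10 m/s (36 km/h), 5 m/s^2 braking, 0.5 s latency.
d = stopping_distance(10.0, 5.0, 0.5)
margin = min_time_margin(10.0, 5.0, 0.5)
v_max = max_approach_speed(15.0, 5.0, 0.5)
print(d, margin, v_max)
```

Under these assumed values the vehicle needs 15 m to stop, i.e. a 1.5 s time margin. A pedestrian appearing with a shorter margin cannot be avoided by braking alone, which is why reducing speed in advance, as expert drivers do, matters.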


Figure 4: Pedestrian motion analysis from real-world driving data

3. Methodology

We followed the guidelines of systematic reviews described in (23,24,25).

3.1. Research question

The main focus of this article is to analyse the algorithms and sensors used in recent research to perform P&C intention estimation, for the purpose of autonomous driving.

3.2. Search process

The main concepts used to compose the search string are intention estimation, movement prediction, pedestrian and cyclist. We therefore define the search string as (‘Pedestrian’ OR ‘Cyclist’) AND (‘intention estimation’ OR ‘movement prediction’). We use this string to search for potential articles in relevant digital databases (IEEE Xplore, Science Direct, ACM digital library, Springer Link, MDPI and Web of Science). The systematic search includes publications from the last 10 years, until February 2022. The search returned 197 papers.

3.3. Inclusion and exclusion criteria

We considered the following inclusion criterion: papers that propose a solution to estimate cyclist and pedestrian intention by relying on at least one vehicle sensor. We have also included 4 articles that the search did not find and that experts recommended (26,27,28,29).
We then applied the following exclusion criteria: (i) the article is not in English; (ii) the paper is not accessible; (iii) the study only performs detection of the P&C.

3.4. Selection of papers

The summary of the methodology is presented in Fig. 5. A total of 197 articles were found by the search engines. We then applied the following steps. Step 1: we removed the off-topic articles from the list after reading the title and abstract. The list was reduced to 41 papers. Step 2: we applied the inclusion criterion after reading the articles, which left 22 papers. Some articles were not selected because the solutions used external cameras (such as UAVs or traffic CCTVs) or pedestrian-embedded sensors. We can also mention Utriainen et al.’s paper (3), which does not propose a technical solution but provides guidelines to achieve the intention estimation for cyclists. Step 3: we eliminated 6 papers: one because it was written in Spanish, one because it only performed pedestrian detection, and the 4 others because they were not accessible. The final list counts 20 articles, including the 4 added articles, see Table 2.

Figure 5: Process of the search strategy

4. Pedestrians/Cyclist Intention Estimation

4.1. The keys of the intention estimation

One of the main tasks AVs have to achieve to deal with pedestrians is to compute the Crossing/Not Crossing probability (P[C/NC]). As mentioned in the previous section, crossing can happen either in defined areas such as crosswalks or anywhere, due to the unpredictability of pedestrian behaviour. Kotseruba et al. (51) pointed out that, in the PIE dataset, even if the majority of crossings occur at crosswalks, 30% do not, reinforcing the need for intention prediction. The C/NC task can be divided into four essential parts, as described by Völz et al. (31):
i. P&C detection,
ii. Tracking,
iii. Pose estimation,
iv. Intention classification.

For cyclists, the most common task is trajectory prediction. Indeed, cyclists can be roadway users as well and will not cross the road like a pedestrian. Anticipating their movements is also essential, for instance by detecting arm signals (26). However, arm signals can vary amongst countries, and some cyclists forget to use them. Only one of the selected studies on cyclists does not perform trajectory estimation but detects the position of the arm instead. Trajectory estimation can also be an alternative to the C/NC task for pedestrians; among the selected studies, 8 do it that way, and 10 use the C/NC output.

Some cues are essential to achieve a good estimation, and most of them are similar for pedestrians and cyclists. Kotseruba et al. (51) pointed out that the orientation towards the road is a key point for intention estimation. The gaze and the head orientation are key as well, and Hemeren et al. (53) mention them for cyclists too. For the latter, speed is also a valuable cue.

As a mirror study, Utriainen et al. (3) described the features an AV should have to encounter cyclists safely. Even if most of them are applicable to conventional cars (CVs) as well, feature 3 is interesting: “the [AV] should indicate its intentions to cyclists in a clear and a consistent manner”. Some cues are applicable to both AVs and CVs, such as decreasing the speed or using the turn signal; eye contact and hand gestures are also valuable but not feasible, at first sight, with AVs. Hence, in their prototype, Alvarez et al. (34) used a screen with fake eyes on their car to give an


Table 2: Comparison of the studies to estimate pedestrians/cyclists' intention. The detail of the acronyms is given after the conclusion. P: pedestrian; C: cyclist; V: vehicle. When the word "and" is used to describe the method, it means the authors developed several algorithms to compare them; the sign + is used to describe the several components of the same algorithm.

Study | P/C/V | Sensor | Method | Output | Dataset | Remarks
----- | ----- | ------ | ------ | ------ | ------- | -------
Rasouli et al. (2019) (30) | P | Video, VS | RNN encoders-decoders (LSTM) | Trajectory est., P[C/NC] | PIE | Creation of the dataset
Völz et al. (2016) (31) | P | Lidar | CNN (LSTM) and DNN | P[C/NC] | self | Focusing on crosswalk
Wu et al. (2021) (32) | P | Video | Extend TPB with MLP + RNN | P[C/NC] | PIE |
Hashimoto et al. (2015) (33) | P | Video, TLS | DBN | P[C/NC] | self | Focusing on crosswalk; TLS is set manually
Alvarez et al. (2020) (34) | P | 3D video, VS | Stacked RNN (GRU) | P[C/NC] | PIE | Tested with real vehicle
Sun et al. (2019) (35) | P | Video | CGAN (LSTM encoders-decoders) | Future position estimation | Daimler (36), Context (37) |
Stolz et al. (2018) (38) | C | Radar | Hough transformation and RANSAC | DoM | self |
Girase et al. (2021) (39) | P/C/V | Video, Lidar | CVAE + Scene graph + GRU | Trajectory est.; Intention | LOKI | Creation of the dataset
Kataoka et al. (2018) (40) | P | Video | Convnet + DeCAF + SVM | P[C/NC] | NTSEL, NDRDB | Creation of the datasets
Muscholl et al. (2021) (41) | P | Video | DBN | P[C/NC] | OpenDS-CTS2 | Focus of the interaction between 2 pedestrians across the street; Creation of the dataset
Poibrenski et al. (2020) (42) | P | Video | CVAE + RNN | Trajectory est. | JAAD (43), ETH, UCY |
Kalatian et al. (2022) (44) | P | Video, Lidar | RNN (Aux-LSTM) | Trajectory est. | NuScenes (45), Lyft (46), Waymo (47), PIE | Tested with virtual environment as well
Kooij et al. (2019) (48) | P/C | 3D video | DBN (SLDS) | Trajectory est. | Context |
Ferguson et al. (2015) (49) | P | Lidar | Changepoint-DPGP | Trajectory est. | self | Tested with rover
Wu et al. (2019) (50) | P | Video, Lidar, IMU, VS | DBN + DSF | Trajectory est., P[C/NC] | self |
Kotseruba et al. (2020) (51) | P | Video, VS | RNN encoder-decoder (GRU) | Future position estimation | PIE |
Fang et al. (2019) (26) | P/C | Video | Faster RCNN (P), Mask R-CNN (C) + RF | P[C/NC] (P), [Arm pos] (C) | JAAD (P), CASR (C) | Creation of CASR
Saleh et al. (2018) (27) | C | Video | RNN (U- and B-LSTM) | Trajectory est. | CTD (52) |
Schulz et al. (2015) (28) | P | 3D video, VS | LDCRF | P[C/NC] | Daimler |
Møgelmose et al. (2015) (29) | P | Video, GPS | Munkres + IPM | Trajectory est. | self | Use the GPS to know if P. enters hazardous zone

indication to pedestrians. However, Nunez et al. also showed that cyclists do not behave differently towards AVs and CVs, but rely first on other cues, such as speed (54). The algorithm should therefore not focus too strongly on the gaze.

Five studies explicitly use the head orientation of the P&C to determine their intention (32,41,44,48,28). Five additional studies check the body posture/orientation (30,34,38,50,26). We cannot see any correlation with the output of each methodology: studies looking for either P[C/NC] or trajectory estimation can use this data. On the other hand, when the input does not come from a video sensor but from a lidar, the orientation is harder to get and is therefore not used. The only study using radar, by Stolz et al. (38), can access the direction since it targets cyclists and determines the DoM.

4.2. The sensors used

Video is the most common input, with 85% of the studies using it, but fewer than half (8) rely solely on it. The others complement it with a lidar (3), the vehicle speed (5) or GPS (1). In addition, the solution of Wu et al. (50) is the most complete, using the data of 4 different sensors. Finally, three solutions rely solely on remote sensing methods (lidar or radar).

The wide usage of video can be explained, on the one hand, by its accessibility: the digital camera has become an affordable and sufficiently small device over the last three decades. On the other hand, much research has already been done on image recognition, and not only in the field of autonomous driving. For instance, in 2002, Zhao et al. proposed a real-time head orientation estimation method (55); some years later, other studies addressed human detection (56,57). In 2017, the OpenPose algorithm, which is now widely used for human pose estimation (HPE), was introduced (58). All of these methods use RGB images as input, as does most of the literature.

A limitation of video appears in dark conditions, especially at night: pedestrians might be less visible to the car. This can also happen during the day if the contrast between the pedestrian and the background is low.
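
The sensor counts quoted in this section can be re-derived from the Sensor column of Table 2. The Python sketch below transcribes that column by hand; counting 3D video as video and using short study keys as labels are choices made here for illustration, not part of the reviewed studies. It recomputes the share of video-based studies and the sensors that complement video.

```python
from collections import Counter

# Sensor column of Table 2, transcribed by hand; "video" also covers 3D video.
SENSORS = {
    "Rasouli 2019": {"video", "vs"},
    "Volz 2016": {"lidar"},
    "Wu 2021": {"video"},
    "Hashimoto 2015": {"video", "tls"},
    "Alvarez 2020": {"video", "vs"},
    "Sun 2019": {"video"},
    "Stolz 2018": {"radar"},
    "Girase 2021": {"video", "lidar"},
    "Kataoka 2018": {"video"},
    "Muscholl 2021": {"video"},
    "Poibrenski 2020": {"video"},
    "Kalatian 2022": {"video", "lidar"},
    "Kooij 2019": {"video"},
    "Ferguson 2015": {"lidar"},
    "Wu 2019": {"video", "lidar", "imu", "vs"},
    "Kotseruba 2020": {"video", "vs"},
    "Fang 2019": {"video"},
    "Saleh 2018": {"video"},
    "Schulz 2015": {"video", "vs"},
    "Mogelmose 2015": {"video", "gps"},
}


def summarise(sensors):
    """Count video usage and the sensors that complement video."""
    with_video = [s for s in sensors.values() if "video" in s]
    return {
        "video_share": len(with_video) / len(sensors),
        "video_only": sum(1 for s in with_video if s == {"video"}),
        # Studies whose sensors are all remote sensing (lidar and/or radar).
        "remote_only": sum(1 for s in sensors.values() if s <= {"lidar", "radar"}),
        "complements": Counter(x for s in with_video for x in s if x != "video"),
    }


summary = summarise(SENSORS)
print(summary)
```

With this transcription, 17 of the 20 studies use video, 8 use it alone, and lidar, vehicle speed and GPS complement it in 3, 5 and 1 studies respectively, which matches the figures given in the text.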


Nonetheless, some research using different sensors exists. Fürst et al., for example, combined RGB images with a lidar to improve the HPE, especially regarding depth (59). Even if most HPE methods rely on images, some do not: Li et al. developed an algorithm that achieves HPE using only a radar (60). Among the twenty reviewed studies, only Völz et al. (31) and Ferguson et al. (49), which rely on lidar to get the trajectory, and Stolz et al. (38), which uses radar to capture the direction of movement of a cyclist, do not use a camera.

Finally, none of the studies uses data from the infrastructure, such as traffic light state, traffic conditions or data from other cars. Hashimoto et al. use the state of the traffic light, but this information is set manually in the dataset (33). The limited deployment of such technology may explain why no study uses it: it is generally used only for experimental purposes, such as in the experiment of PSA and Vinci (1), where the car uses data from the motorway’s toll gate.

4.3. The methods

One of the first steps of all those methods is to identify the P&C, as described in the last section, with human detection or HPE algorithms. We can see 3 patterns here, with 5 studies using HPE, 9 studies simply using detection (to get the position or bounding box) and 5 studies that get the trajectory. In addition, Kataoka et al. use optical flow to detect pedestrians (40).

About half of the studied methods use an RNN as the main method to achieve intention recognition. As mentioned by Kalatian et al. (44), the high temporal and spatial correlation of pedestrian patterns makes the RNN a good candidate. Indeed, RNNs are well suited to trajectories and dynamical systems (61). The network is mostly implemented with LSTM, with some variation in the application, since the network needs to remember previous states to deal with the trajectory.

The other common method is to use a DBN. DBNs are an extension of Bayesian networks made for dynamic systems. They can determine the probability of a specific event happening according to its evolution through time. That makes them suitable for obtaining the probability of a crossing event according to the evolution of the pedestrian through time.

For example, Fang et al. (26) used a Faster RCNN with a random forest (RF) algorithm to detect whether the pedestrian wants to cross or not. They use monocular vision and body posture recognition to extract angle and distance features from the skeleton, and they send them to the RF classifier. They found that using more time frames improves the accuracy of the classification (by between 2 and 8%). They can predict the crossing of a standing pedestrian 8 frames after the movement has started (~250 ms). Finally, they found that only 3-keypoint angles matter for the prediction (between shoulders and legs, and between shoulders and waist).

We can also point out the article of Völz et al. (31), which uses lidar data to feed their algorithms. They compared two approaches, an RCNN and a DNN, to a baseline algorithm, an SVM. Both of their solutions outperformed it; however, they pointed out that the RCNN has some difficulty capturing significant movement changes.

4.4. Datasets

The most commonly used dataset is PIE, created by Rasouli et al. (30). They decided to create it because the previously existing datasets, such as JAAD, lacked information about the intention of the pedestrian, which is necessary to train the algorithms. Hence, 5 studies use PIE, and the other common approach is for a study to use its own dataset (5 studies). However, the LOKI dataset (39) now seems to be the most complete, having LiDAR data in addition to video. The intention is also available for the vehicle and not only for the pedestrians. Since this dataset was only published in October 2021, no other study uses it yet.

5. Discussion and Open Challenges

The detection of pedestrians and their avoidance will be a critical safety issue for future AVs. People will expect AVs to be at least as good as human drivers, or even better, at understanding the intention of P&Cs. Jing et al. (62) reviewed the acceptance parameters of AVs. Safety is the most common one: people generally feel that AVs are less safe than CVs, and trust towards AVs is correlated to the perceived risk. Interestingly, they also pointed out that passengers perceive a higher risk with AVs than with CVs, while the opposite holds for pedestrians.

To judge the efficiency of the algorithms, they should be tested in real conditions. However, only one study, by Alvarez et al. (34), experimented with a real vehicle, and not in real conditions (the car drove only on the campus). The next challenge will be to test the solutions in real cities. We can also consider the intention estimation algorithm as an Advanced Driver Assistance System (ADAS). In that case, the efficiency of the system is less critical because the human driver remains in control. ADASs that automatically stop the vehicle to avoid collision with P&Cs already exist (63). One can imagine another layer that alerts the driver when the system detects the crossing intention of a pedestrian. It would also be a way to test the algorithms in real conditions and improve them before using them with AVs. Furthermore, Jing et al. also pointed out that drivers are more inclined to trust AVs if they have experienced ADASs in the past.

Linked to the lack of testing, the algorithms only output a probability or a trajectory, and none of the studies describes a strategy beyond that. Should the car stop when the probability reaches a threshold? Should it decrease its speed? Should it try to change its trajectory? Or should it use the horn or flash the lights? When the algorithms are used as an ADAS, those questions are answered by the driver, but they will also need to be handled by AVs.

We can also notice that Wu et al. (32) combine the results of two algorithms (“strategies”) to obtain the probability of crossing, whereas the other collected studies propose a single algorithm (some of them developed several algorithms, but these work separately). Using redundancy is a way to reduce false negatives by having different points of view, and to make the system more robust. Comparing the efficiency of combined algorithms versus single ones is another open challenge. This solution might be limited by the computational resources available.

As seen in section 4.2, none of the reviewed solutions uses data from the infrastructure (V2I). However, in the future, so-called smart roads are expected to share many data (64), such as traffic, temporary work zones, weather conditions, accident positions, TLS and other rules. Those data will be more precise and reliable than third-party application data (like Google Maps or Waze) for traffic, accidents and weather, or than video recognition for signs and TLS. In addition, to detect when a P&C is crossing the road, the AV will also be able to rely on data provided by the road (65). Finally, in the context of smart cities, vehicles could share data with each other (V2V), so that each vehicle can use the point of view of nearby vehicles. Hence, even if the pedestrian is not visible (because of another vehicle, a tree, a building...), the car will be able to adapt.

6. Conclusion

Recent research and development of autonomous vehicle functions is shifting from highway to urban road driving scenarios.


Figure 6: Example of urban driving condition with P&Cs requiring intention estimation. (a) Driving scenes that need motion prediction; (b) P&Cs darting-out scenes.

Advanced driver assistance system functions are being intensively developed in order to enhance the safety and convenience of mobility. For cities and AVs to thrive in symbiosis, AVs must cope with vulnerable road users, i.e., the P&Cs, and in particular infer their intentions. Indeed, those users can have less predictable behaviours that are not based on traffic laws. Most importantly, even at low speed, an accident can be fatal (66). That is why AV researchers are trying to go further than just detecting P&Cs, by also estimating their intentions to cope with their unpredictable behaviours.

Most of the collected studies rely on several sensors. Video is the prevalent one, even though it has some limitations, especially at night. The common way to handle the evolution of the P&Cs through time is to use RNN algorithms to obtain either the probability of crossing or an estimated trajectory. To train and test the algorithms, datasets are either reused or created. Four studies created their own dataset and made it available as an additional contribution to their work.

Finally, some open challenges remain. One concerns the robustness of the algorithms and their operation in real conditions; none of the collected studies addressed this. The perceived risk of AVs must be low for people to trust them. Another concerns the development of smart roads and cities; one can expect intention estimation algorithms to become more robust with shared data. The last is the use of several algorithms in parallel, which could help to decrease the false-positive rate.

Abbreviations

List of abbreviations used in this paper:

ADAS Advanced Driver Assistance System
CGAN Conditional Generative Adversarial Nets
CVAE Conditional Variational AutoEncoder
DBN Dynamic Bayesian Network
DeCAF Deep Convolutional Activation Feature
DoM Direction of Movement
DPGP Gaussian process mixture model
DSF Driving Safety Field
GRU Gated Recurrent Unit
IMU Inertial Measurement Unit
IPM Inverse Perspective Mapping
LDCRF Latent-Dynamic Conditional Random Fields
LSTM Long Short-Term Memory
  B-LSTM Bidirectional LSTM
  U-LSTM Unidirectional LSTM
MLP Multi-Layer Perceptron
NN Neural Network
  CNN Convolutional NN
  DNN Dense NN
  RNN Recurrent NN
RANSAC RANdom SAmple Consensus
RF Random Forest
SLDS Switching Linear Dynamical System
SVM Support Vector Machine
TLS Traffic Light State
TPB Theory of Planned Behaviour
V2I Vehicle to Infrastructure
V2V Vehicle to Vehicle
VS Vehicle Speed

References

(1) Le véhicule autonome franchit une nouvelle étape grâce à la collaboration entre le Groupe PSA et VINCI Autoroutes (July 2019). URL: https://www.media.stellantis.com/fr-fr/psa-archive/press/le-vehicule-autonome-franchit-une-nouvelle-etape-grace-a-la-collaboration-entre-le-groupe-psa-et-vinci-autoroutes.
(2) Shuchisnigdha Deb, Lesley Strawderman, Janice DuBien, Brian Smith, Daniel W. Carruth, and Teena M. Garrison: “Evaluating pedestrian behavior at crosswalks: Validation of a pedestrian behavior questionnaire for the U.S. population”. Accident Analysis & Prevention 106 (2017), pp. 191–201. ISSN: 0001-4575.
(3) Roni Utriainen and Markus Pöllänen: “How automated vehicles should operate to avoid fatal crashes with cyclists?” Accident Analysis & Prevention 154 (2021), p. 106097.
(4) Nan Jiang, Mi Shi, Yilong Xiao, Kan Shi, and Barry Watson: “Factors affecting pedestrian crossing behaviors at signalized crosswalks in urban areas in Beijing and Singapore”. In: ICTIS 2011: Multimodal Approach to Sustained
Transportation System Development: Information, Technology, Implementation. 2011, pp. 1090–1097.
(5) Lorena Hell, Janis Sprenger, Matthias Klusch, Yoshiyuki Kobayashi, and Christian Müller: “Pedestrian Behavior in Japan and Germany: A Review”.
(6) “Road Traffic Act, articles 7, 63-4, 119 and 121”. Japanese Law Translation. Accessed: 2022-02-02. URL: http://www.japaneselawtranslation.go.jp/law/detail/?vm=04&id=2962&re=02.
(7) “Aktueller Bußgeldkatalog für Fußgänger 2022 – Was kosten Verstöße zu Fuß?” VFR Verlag für Rechtsjournalismus GmbH. Accessed: 2022-02-01. URL: https://www.bussgeldkatalog.org/fussgaenger/.
(8) “Code de la route, articles R412-38 and R412-43”. Journal officiel de la République française. Accessed: 2022-02-01. URL: https://www.legifrance.gouv.fr/.
(9) “Code pénal, article 131-13”. Journal officiel de la République française. Accessed: 2022-02-01. URL: https://www.legifrance.gouv.fr/.
(10) “Verkehrsregeln fürs Fahrrad”. VFR Verlag für Rechtsjournalismus GmbH. Accessed: 2022-02-03. URL: https://www.bussgeldkatalog.org/verkehrsregeln-fahrrad/.
(11) “Règles de circulation pour les cyclistes”. Ministère de l’Intérieur. Accessed: 2022-02-02. URL: https://www.securite-routiere.gouv.fr/reglementation-liee-aux-modes-de-deplacements/velo/regles-de-circulation-pour-les-cyclistes.
(12) Hajime Seya, Kazuki Yoshida, and Satoru Inoue: “Verification of Zone-30-policy effect on accident reduction using propensity score matching method for multiple treatments”. Case Studies on Transport Policy 9.2 (2021), pp. 693–702. ISSN: 2213-624X.
(13) Marie Pelé et al.: “Cultural influence of social information use in pedestrian road-crossing behaviours”. Royal Society open science 4.2 (2017), p. 160739.
(14) Marie Pelé, Jean-Louis Deneubourg, and Cédric Sueur: “Decision-making processes underlying pedestrian behaviors at signalized crossings: part 2. Do pedestrians show cultural herding behavior?” Safety 5.4 (2019), p. 82.
(15) Marie Pelé et al.: “Influence de la culture sur le comportement social de traversée des piétons : de Strasbourg (France) à Nagoya (Japon)”. In: Jan. 2015.
(16) Alice Billot-Grasset, Emmanuelle Amoros, and Martine Hours: “How cyclist behavior affects bicycle accident configurations?” Transportation research part F: traffic psychology and behaviour 41 (2016), pp. 261–276.
(17) Christopher E Anderson, Amanda Zimmerman, Skylar Lewis, John Marmion, and Jeanette Gustat: “Patterns of cyclist and pedestrian street crossing behavior and safety on an urban greenway”. International journal of environmental research and public health 16.2 (2019), p. 201.
(18) Pongsathorn Raksincharoensak: “Motion Planning and Control Based on Risk Field for Risk Predictive Driving Assist System Design”, Proceedings of 15th International Symposium on Advanced Vehicle Control (AVEC’22), Japan, No. Tu1C-02, pp. 1–4.
(19) Shintaro Inoue, Naoki Muto, Toshiki Kinoshita, Minami Sato, Kazuyuki Fujita: “Risk Predictive Path Planning Considering Multiple Targets by Using Risk Potential Optimization Theory”, Proceedings of 15th International Symposium on Advanced Vehicle Control (AVEC’22), Japan, No. Tu1C-03, pp. 1–4.
(20) Takuma Yamaguchi, Hayato Kuroda, Hiroyuki Okuda, Tatsuya Suzuki, Kentaro Haraguchi, Ryo Wakisaka and Kazunori Ban: “Modelling and Analysis for Interactive Crossing Decision of Pedestrian at Non-Signalized Intersection”, Transactions of JSAE, Vol. 52, No. 6, 2021, No. 20214932, pp. 1360–1367.
(21) Yongsheng Wang, Jinxin Liu, Fachao Jiang, Yugong Luo: “Interactive Dynamic Speed Planning Based on Pedestrian Crossing Risk Assessment for Automated Valet Parking”, Proceedings of 15th International Symposium on Advanced Vehicle Control (AVEC’22), Japan, No. We2C-01, pp. 1–6.
(22) Pongsathorn Raksincharoensak, Katsumi Moro and Masao Nagai: “Reconstruction of Pedestrian/Cyclist Crash-Relevant Scenario and Assessment of Collision Avoidance System Using Driving Simulator”, Proceedings of 11th International Symposium on Advanced Vehicle Control (AVEC’12), Korea, APSS7-4, pp. 1–6.
(23) David Budgen and Pearl Brereton: “Performing systematic literature reviews in software engineering”. In: Proceedings of the 28th international conference on Software engineering. 2006, pp. 1051–1052.
(24) Barbara Kitchenham: “Procedures for Performing Systematic Reviews”. Keele, UK, Keele Univ. 33 (Aug. 2004).
(25) Kai Petersen, Sairam Vakkalanka, and Ludwik Kuzniarz: “Guidelines for conducting systematic mapping studies in software engineering: An update”. Information and Software Technology 64 (2015), pp. 1–18. ISSN: 0950-5849.
(26) Zhijie Fang and Antonio M López: “Intention recognition of pedestrians and cyclists by 2d pose estimation”. IEEE Transactions on Intelligent Transportation Systems 21.11 (2019), pp. 4773–4783.
(27) Khaled Saleh, Mohammed Hossny, and Saeid Nahavandi: “Cyclist Trajectory Prediction Using Bidirectional Recurrent Neural Networks”. In: AI 2018: Advances in Artificial Intelligence. Ed. by Tanja Mitrovic, Bing Xue, and Xiaodong Li. Cham: Springer International Publishing, 2018, pp. 284–295. ISBN: 978-3-030-03991-2.
(28) Andreas Th Schulz and Rainer Stiefelhagen: “Pedestrian intention recognition using latent-dynamic conditional random fields”. In: 2015 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2015, pp. 622–627.
(29) Andreas Møgelmose, Mohan M Trivedi, and Thomas B Moeslund: “Trajectory analysis and prediction for improved pedestrian safety: Integrated framework and evaluations”. In: 2015 IEEE intelligent vehicles symposium (IV). IEEE. 2015, pp. 330–335.
(30) Amir Rasouli, Iuliia Kotseruba, Toni Kunic, and John K Tsotsos: “Pie: A large-scale dataset and models for pedestrian intention estimation and trajectory prediction”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019, pp. 6262–6271.
(31) Benjamin Völz, Karsten Behrendt, Holger Mielenz, Igor Gilitschenski, Roland Siegwart, and Juan Nieto: “A data-driven approach for pedestrian intention estimation”. In: 2016 ieee 19th international conference on intelligent transportation systems (itsc). IEEE. 2016, pp. 2607–2612.
(32) Haoran Wu, Sifa Zheng, Qing Xu, and Jianqiang Wang: “Applying the Extended Theory of Planned Behavior to Pedestrian Intention Estimation”. In: 2021 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2021, pp. 1509–1514.
(33) Yoriyoshi Hashimoto, Yanlei Gu, Li-Ta Hsu, and Shunsuke Kamijo: “Probability estimation for pedestrian crossing intention at signalized crosswalks”. In: 2015 IEEE International Conference on Vehicular Electronics and Safety (ICVES). IEEE. 2015, pp. 114–119.
(34) Walter Morales Alvarez, Francisco Miguel Moreno, Oscar Sipele, Nikita Smirnov, and Cristina Olaverri-Monreal: “Autonomous driving: Framework for pedestrian intention estimation in a real world scenario”. In: 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2020, pp. 39–44.
(35) Yasheng Sun, Tao He, Jie Hu, Haiqing Huang, and Biao Chen: “Intent-Aware Conditional Generative Adversarial Network for Pedestrian Path Prediction”. In: 2019 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA). IEEE. 2019, pp. 155–160.
(36) Nicolas Schneider and Dariu M. Gavrila: “Pedestrian Path Prediction with Recursive Bayesian Filters: A Comparative Study”. In: Pattern Recognition. Ed. by Joachim Weickert, Matthias Hein, and Bernt Schiele. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 174–183. ISBN: 978-3-642-40602-7.
(37) Julian Francisco Pieter Kooij, Nicolas Schneider, Fabian Flohr, and Dariu M. Gavrila: “Context-Based Pedestrian Path Prediction”. In: Computer Vision – ECCV 2014. Ed. by David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars. Cham: Springer International Publishing, 2014, pp. 618–633. ISBN: 978-3-319-10599-4.
(38) Martin Stolz, Mingkang Li, Zhaofei Feng, Martin Kunert, and Wolfgang Menzel: “Direction of movement estimation of cyclists with a high-resolution automotive radar”. In: 2018 IEEE MTT-S International Conference on Microwaves for Intelligent Mobility (ICMIM). IEEE. 2018, pp. 1–4.
(39) Harshayu Girase et al.: “LOKI: Long Term and Key Intentions for Trajectory Prediction”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021, pp. 9803–9812.
(40) Hirokatsu Kataoka, Yutaka Satoh, Yoshimitsu Aoki, Shoko Oikawa, and Yasuhiro Matsui: “Temporal and fine-grained pedestrian action recognition on driving recorder database”. Sensors 18.2 (2018), p. 627.
(41) Nora Muscholl, Matthias Klusch, Patrick Gebhard, and Tanja Schneeberger: “EMIDAS: explainable social interaction-based pedestrian intention detection across street”. In: Proceedings of the 36th Annual ACM Symposium on Applied Computing. 2021, pp. 107–115.
(42) Atanas Poibrenski, Matthias Klusch, Igor Vozniak, and Christian Müller: “M2p3: multimodal multi-pedestrian path prediction by self-driving cars with egocentric vision”. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing. 2020, pp. 190–197.
(43) Iuliia Kotseruba, Amir Rasouli, and John K Tsotsos: “Joint attention in autonomous driving (JAAD)”. arXiv preprint arXiv:1609.04741 (2016).
(44) Arash Kalatian and Bilal Farooq: “A context-aware pedestrian trajectory prediction framework for automated vehicles”. Transportation research part C: emerging technologies 134 (2022), p. 103453.
(45) Holger Caesar et al.: “nuscenes: A multimodal dataset for autonomous driving”. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020, pp. 11621–11631.
(46) R Kesten et al.: Lyft level 5 perception dataset 2020. 2019.
(47) Pei Sun et al.: “Scalability in perception for autonomous driving: Waymo open dataset”. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020, pp. 2446–2454.
(48) Julian FP Kooij, Fabian Flohr, Ewoud AI Pool, and Dariu M Gavrila: “Context-based path prediction for targets with switching dynamics”. International Journal of Computer Vision 127.3 (2019), pp. 239–262.
(49) Sarah Ferguson, Brandon Luders, Robert C Grande, and Jonathan P How: “Real-time predictive modeling and robust avoidance of pedestrians with uncertain, changing intentions”. In: Algorithmic Foundations of Robotics XI. Springer, 2015, pp. 161–177.
(50) Renfei Wu et al.: “Modified driving safety field based on trajectory prediction model for pedestrian–vehicle collision”. Sustainability 11.22 (2019), p. 6254.
(51) Iuliia Kotseruba, Amir Rasouli, and John K Tsotsos: “Do they want to cross? understanding pedestrian intention for behavior prediction”. In: 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2020, pp. 1688–1693.
(52) Ewoud AI Pool, Julian FP Kooij, and Dariu M Gavrila: “Using road topology to improve cyclist path prediction”. In: 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE. 2017, pp. 289–296.
(53) Paul E Hemeren, Mikael Johannesson, Mikael Lebram, Fredrik Eriksson, Kristoffer Ekman, and Peter Veto: “The use of visual cues to determine the intent of cyclists in traffic”. In: 2014 IEEE International Inter-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA). IEEE. 2014, pp. 47–51.
(54) Juan Pablo Nuñez Velasco, Anouk de Vries, Haneen Farah, Bart van Arem, and Marjan P Hagenzieker: “Cyclists’ crossing intentions when interacting with automated vehicles: A virtual reality study”. Information 12.1 (2021), p. 7.
(55) Liang Zhao, G. Pingali, and I. Carlbom: “Real-time head orientation estimation using neural networks”. In: Proceedings. International Conference on Image Processing. Vol. 1. 2002, pp. I–I.
(56) N. Dalal and B. Triggs: “Histograms of oriented gradients for human detection”. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). Vol. 1. 2005, 886–893 vol. 1.
(57) William Robson Schwartz, Aniruddha Kembhavi, David Harwood, and Larry S. Davis: “Human detection using partial least squares analysis”. In: 2009 IEEE 12th International Conference on Computer Vision. 2009, pp. 24–31.
(58) Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh: “Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). July 2017.
(59) Michael Fürst, Shriya T. P. Gupta, René Schuster, Oliver Wasenmüller, and Didier Stricker: “HPERL: 3D Human Pose Estimation from RGB and LiDAR”. In: 2020 25th International Conference on Pattern Recognition (ICPR). 2021, pp. 7321–7327.
(60) Guangzheng Li, Ze Zhang, Hanmei Yang, Jin Pan, Dayin Chen, and Jin Zhang: “Capturing Human Pose Using mmWave Radar”. In: 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). 2020, pp. 1–6.
(61) Samir B. Unadkat, Mãlina M. Ciocoiu, Larry R. Medsker, David G. Hagner, Mohamad H. Hassoun, and Paul B. Watta: “RECURRENT NEURAL NETWORKS, Design and Applications”. In: ed. by LR Medsker and LC Jain. CRC Press, 2001. Chap. 1, 10.
(62) Peng Jing, Gang Xu, Yuexia Chen, Yuji Shi, and Fengping Zhan: “The Determinants behind the Acceptance of Autonomous Vehicles: A Systematic Review”. Sustainability 12.5 (2020). ISSN: 2071-1050.
(63) Erik Coelingh, Andreas Eidehall, and Mattias Bengtsson: “Collision Warning with Full Auto Brake and Pedestrian Detection - a practical example of Automatic Emergency Braking”. In: 13th International IEEE Conference on Intelligent Transportation Systems. 2010, pp. 155–160.
(64) Andrea Pompigna and Raffaele Mauro: “Smart roads: A state of the art of highways innovations in the Smart Age”. Engineering Science and Technology, an International Journal 25 (2022), p. 100986. ISSN: 2215-0986.
(65) Salvatore Trubia, Alessandro Severino, Salvatore Curto, Fabio Arena, and Giovanni Pau: “Smart Roads: An Overview of What Future Mobility Will Look Like”. Infrastructures 5.12 (2020). ISSN: 2412-3811.
(66) Erik Rosén, Helena Stigson, and Ulrich Sander: “Literature review of pedestrian fatality risk as a function of car impact speed”. Accident Analysis & Prevention 43.1 (2011), pp. 25–33. ISSN: 0001-4575. URL: https://www.sciencedirect.com/science/article/pii/S0001457510001077.