TL;DR
This paper presents a POMDP-based pedestrian collision avoidance system that enhances autonomous driving safety by reducing unnecessary braking in occluded scenarios, integrating with existing AEB systems.
Contribution
It introduces a novel POMDP formulation for pedestrian detection under occlusions and proposes an integrated approach with AEB for improved safety and efficiency.
Findings
Reduces unnecessary emergency braking in occlusion scenarios.
Provides a robust collision avoidance policy under uncertainty.
Offers a rigorous evaluation methodology for autonomous braking systems.
Abstract
Safe autonomous driving in urban areas requires robust algorithms to avoid collisions with other traffic participants with limited perception ability. Current deployed approaches relying on Autonomous Emergency Braking (AEB) systems are often overly conservative. In this work, we formulate the problem as a partially observable Markov decision process (POMDP), to derive a policy robust to uncertainty in the pedestrian location. We investigate how to integrate such a policy with an AEB system that operates only when a collision is unavoidable. In addition, we propose a rigorous evaluation methodology on a set of well defined scenarios. We show that combining the two approaches provides a robust autonomous braking system that reduces unnecessary braking caused by using the AEB system on its own.
| Parameter | Value |
|---|---|
| Simulation time step | |
| Pedestrian position tracking standard deviation | |
| Pedestrian velocity tracking standard deviation | |
| Object tracking delay | |
| Brake delay | |
| POMDP planner | |
| Belief update frequency | |
| Decision frequency | |
| Pedestrian maximum speed | |
| Autonomous Emergency Braking System | |
| Threshold collision probability | |
| Threshold collision risk | |

| | CPAF | CPAN-25 | CPAN-75 | CPCN |
|---|---|---|---|---|
| Ego velocity [km/h] | 10–60 | 10–60 | 10–60 | 10–60 |
| Ped velocity [km/h] | 8 | 5 | 5 | 5 |
| Occlusion | No | No | No | Yes |
| Impact point [%] | 0–50 | 0–50 | 0–50 | 0–50 |

| | AEB | POMDP | POMDP + AEB |
|---|---|---|---|
| Collisions | | | |
| Emergency Brakes | | | |
| [] | | | |
Pedestrian Collision Avoidance System for Scenarios with Occlusions
Markus Schratter¹, Maxime Bouton², Mykel J. Kochenderfer², Daniel Watzenig¹
¹ Markus Schratter and Daniel Watzenig are with the Virtual Vehicle Research Center, Graz 8010, Austria, {markus.schratter,daniel.watzenig}@v2c2.at. Daniel Watzenig is also with the Institute of Automation and Control, Graz University of Technology, Graz 8010, Austria.
² Maxime Bouton and Mykel J. Kochenderfer are with the Department of Aeronautics and Astronautics, Stanford University, Stanford CA 94305, USA, {boutonm,mykel}@stanford.edu.
I INTRODUCTION
Autonomous vehicles must navigate safely through urban environments where parked cars and other physical obstacles occlude other road users. In this work, we focus on avoiding collisions with pedestrians crossing from behind an occluded area on the side of the road. Some systems rely on autonomous emergency braking (AEB) to prevent collisions. They attempt to predict the trajectory of the pedestrian and compare a metric such as the time to collision (TTC) against a threshold to decide when to brake [1]. Although comparing TTC to a threshold to trigger braking can be effective [2], it tends to be overly conservative because of the uncertainty in the sensors and environment, which creates a high risk of triggering strong braking unnecessarily.
To provide robustness to uncertainty in the sensors and environment, previous work proposed modeling similar scenarios with occluded cars and pedestrians as partially observable Markov decision processes (POMDPs) [3, 4, 5, 1]. Their experiments showed that POMDPs provide an effective framework for modeling uncertainty in the sensors and environment, but they assumed a different acceleration range than AEB systems. The resulting POMDP policies were designed for comfortable maneuvers and would not be able to deliver extreme deceleration when needed. Other techniques to handle planning in occluded areas rely on set-based approaches [6, 7, 8]. Such methods are often well suited to achieving robust prediction and computing a safe driving velocity. However, they do not offer a principled framework for combining planning and partial observability.
This paper demonstrates the benefit of augmenting a POMDP policy with an AEB system that can use the full braking power of the vehicle. We formulate the problem as a POMDP to derive a policy that is robust to uncertainty in the pedestrian state and that accounts for pedestrians hidden behind an occlusion. The POMDP planner is designed for comfortable maneuvers in a moderate acceleration range and is responsible for taking into account uncertainty due to occlusions. The policy adapts the velocity of the vehicle when the side of the road is occluded. To handle rare critical situations where a pedestrian appears from behind an occluded area while the vehicle is at high speed, the AEB system intervenes with strong braking when a collision is unavoidable. In situations with poor visibility, the AEB system is not able to avoid or mitigate collisions on its own. The POMDP planner enables the system to anticipate this uncertainty and to prevent an emergency stop. By combining the two systems, our algorithm is able to maintain a reasonable driving speed in occluded areas without increasing the accident rate compared to relying on the AEB system on its own. Safety is not compromised because the AEB system can take control at any time.
Finally, we propose a rigorous evaluation methodology on a set of well-defined scenarios from the EuroNCAP test protocol (fig. 1). Previous work on evaluating autonomous braking systems at unsignalized crosswalks relied on data-driven models of the pedestrian [9]. Chen et al. argue that the EuroNCAP scenarios would allow an overly conservative system to be validated [9]. To avoid this issue, we augmented the suite of scenarios with a situation involving occluded areas at the side of the road with no pedestrian crossing. An efficient system is expected to drive at a reasonable speed in these situations.
II Problem Formulation
This section outlines how to model our problem as a partially observable Markov decision process and solve for an approximately optimal solution.
II-A Background
Sequential decision making problems under uncertainty can be modeled as partially observable Markov decision processes (POMDPs). A POMDP is a mathematical framework defined by the tuple $(\mathcal{S}, \mathcal{A}, \mathcal{O}, T, O, R, \gamma)$, where $\mathcal{S}$ is a state space, $\mathcal{A}$ an action space, $\mathcal{O}$ an observation space, $T$ a transition model, $O$ an observation model, $R$ a reward function, and $\gamma$ a discount factor. From a state $s$, the agent takes an action $a$ and the state evolves to a state $s'$ with probability $T(s' \mid s, a)$. In a POMDP, the agent has uncertain knowledge about the state of the environment. Therefore, the agent maintains a belief $b$, a probability distribution over states that summarizes its knowledge. The belief can be updated after taking an action $a$ and receiving an observation $o$ about the current state using the following equation, where $T$ is the transition function and $O$ the observation model:

$$b'(s') \propto O(o \mid s', a) \sum_{s \in \mathcal{S}} T(s' \mid s, a)\, b(s)$$
In this work we used a discrete Bayesian updater, which updates the discretized belief with a measured continuous observation.
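A discrete Bayesian updater of this kind can be sketched as follows. This is a minimal illustration of a standard discrete Bayes filter, not the authors' implementation; the function names, the 3-cell grid, the stand-still transition matrix, and the noise level are all assumptions made for the example.

```python
import numpy as np


def gaussian_likelihood(observation, state_values, sigma):
    """Unnormalized normal density of one continuous measurement at every grid state."""
    return np.exp(-0.5 * ((observation - state_values) / sigma) ** 2)


def update_belief(belief, transition, likelihood):
    """One step of a discrete Bayes filter.

    belief:     shape (n,), probabilities over the discretized states.
    transition: shape (n, n), transition[s, s_next] = P(s_next | s, a) for the action taken.
    likelihood: shape (n,), likelihood of the received observation at each s_next.
    """
    predicted = belief @ transition          # sum over s of P(s_next | s, a) * b(s)
    unnormalized = likelihood * predicted
    total = unnormalized.sum()
    if total == 0.0:
        # The observation is impossible under the model; keep the prediction.
        return predicted
    return unnormalized / total


# Hypothetical example: pedestrian lateral position on a 3-cell grid.
grid = np.array([0.0, 1.0, 2.0])             # cell centers in meters (assumed)
prior = np.full(3, 1.0 / 3.0)                # uniform prior belief
stay = np.eye(3)                             # toy transition: pedestrian stands still
posterior = update_belief(prior, stay, gaussian_likelihood(2.0, grid, sigma=0.5))
```

The continuous measurement only enters through the likelihood evaluated at each grid cell, which is how a discretized belief can be combined with a continuous observation.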
The solution of a POMDP is an optimal policy $\pi^*$, which maximizes the expected discounted sum of immediate rewards from any given belief. The optimal policy can be extracted from the optimal utility function $U^*$. In general, computing the exact optimal utility function for a POMDP is intractable, so solvers rely on approximation techniques instead. Two families of approaches are used to approximate the optimal utility function: offline and online methods [10]. In this paper, we use the offline QMDP method [11] to compute an approximately optimal policy. QMDP solves the problem under the assumption that the state becomes fully observable after one time step. With this assumption, value iteration can compute the optimal state-action utility function of the underlying fully observable problem.
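The QMDP idea can be illustrated with a small sketch: solve the fully observable problem with value iteration, then pick the action that maximizes the belief-weighted state-action utility. The two-state toy problem and every number in it are hypothetical and only serve to make the example runnable.

```python
import numpy as np


def value_iteration(T, R, gamma, n_iter=200):
    """Solve the fully observable MDP.

    T: shape (A, S, S), T[a, s, s_next] = P(s_next | s, a).
    R: shape (S, A), immediate reward.
    Returns Q with shape (S, A).
    """
    n_actions, n_states, _ = T.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iter):
        U = Q.max(axis=1)                                   # greedy state utility
        Q = R + gamma * np.einsum("asn,n->sa", T, U)        # Bellman backup
    return Q


def qmdp_action(belief, Q):
    """QMDP action: maximize the belief-weighted state-action utility."""
    return int(np.argmax(belief @ Q))


# Toy problem (all numbers hypothetical): state 0 = safe, state 1 = collision.
# Action 0 = keep speed (leads to collision), action 1 = brake (stays safe).
T = np.array([
    [[0.0, 1.0], [0.0, 1.0]],   # keep speed
    [[1.0, 0.0], [0.0, 1.0]],   # brake
])
R = np.array([
    [0.0, -0.1],                # safe state: braking has a small cost
    [-1.0, -1.0],               # collision state: penalty at every step
])
Q = value_iteration(T, R, gamma=0.9)
best = qmdp_action(np.array([1.0, 0.0]), Q)
```

Because the utility table is computed offline, the online cost reduces to one belief update and one matrix-vector product per decision.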
II-B Scenario modeling
The road is represented in the Frenet frame. By applying an appropriate coordinate transform, our planner can be applied directly to different road configurations [12]. For simplicity, we illustrate our approach on a straight road segment.
II-B1 Action space
The POMDP planner controls the acceleration profile in the longitudinal direction and can position the vehicle inside the driving lane in the lateral direction along the given path. A finite set of strategic maneuvers is defined for the lateral control: no acceleration, or an acceleration to the left or right side, $\{0, 1, -1\}\,\mathrm{m/s^2}$. Strategic maneuvers for the longitudinal control, such as accelerating, maintaining constant speed, and braking with different strengths, are represented by a finite set of acceleration and deceleration actions: $\{1, 0, -1, -2, -4\}\,\mathrm{m/s^2}$.
II-B2 State space
The state space represents all the variables taken into account for solving the problem. It encodes information on the ego vehicle and the pedestrian. To handle complex street courses, the road is represented in the Frenet frame. The ego vehicle is represented in the state space with its longitudinal velocity and its lateral position inside the lane (up to 1 m). The position of the pedestrian is represented in the longitudinal direction and in the lateral direction (up to 5 m). The longitudinal range is the result of the required distance to stop based on the defined maximum velocity, the maximum deceleration of the system, and a longitudinal safety gap. In addition, we consider the velocity and orientation of the pedestrian. Figure 2 illustrates the state representation with one pedestrian crossing from the left side as an example. All the variables in the state space are discretized.
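As a rough illustration of how such a longitudinal range could be derived, assume constant deceleration and neglect perception and brake delays. The symbols $v_{\max}$ (maximum velocity), $a_{\max}$ (magnitude of the maximum deceleration), and $d_{\text{gap}}$ (longitudinal safety gap) are names introduced here for the sketch, not notation taken from the paper:

$$s_{\text{range}} = \frac{v_{\max}^{2}}{2\,a_{\max}} + d_{\text{gap}}$$

The first term is the stopping distance of a point mass braking from $v_{\max}$ to rest, so the range grows quadratically with the maximum velocity.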
II-B3 Transition model
The transition model of the ego vehicle depends on the current action and state of the ego vehicle and consists of a point mass model. For the transition of the pedestrian, we use a simple reachability model [13], which depends on the current pedestrian state and calculates future positions of the pedestrian based on a set of possible acceleration values. It is assumed that the pedestrian can take any of those accelerations with uniform probability. The velocity of the pedestrian is bounded by a maximum pedestrian speed (see table I).
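The two transition models can be sketched in one dimension as follows. This is a simplified illustration under stated assumptions: motion is reduced to a single axis, the pedestrian orientation is ignored, and the function names and numeric values are hypothetical.

```python
def ego_step(position, velocity, acceleration, dt):
    """Point-mass update along the lane; the velocity is clipped at zero (no reversing)."""
    new_velocity = max(0.0, velocity + acceleration * dt)
    new_position = position + 0.5 * (velocity + new_velocity) * dt
    return new_position, new_velocity


def pedestrian_successors(position, velocity, accelerations, dt, v_max):
    """Enumerate next pedestrian states, one per candidate acceleration.

    Each acceleration is equally likely; the speed is kept within [0, v_max].
    Returns a list of (next_position, next_velocity, probability).
    """
    probability = 1.0 / len(accelerations)
    successors = []
    for a in accelerations:
        next_velocity = min(max(velocity + a * dt, 0.0), v_max)
        next_position = position + 0.5 * (velocity + next_velocity) * dt
        successors.append((next_position, next_velocity, probability))
    return successors


ego = ego_step(position=0.0, velocity=10.0, acceleration=-4.0, dt=0.5)
ped = pedestrian_successors(0.0, 1.0, accelerations=[-1.0, 0.0, 1.0], dt=1.0, v_max=1.5)
```

In a discretized planner, each successor would additionally be mapped onto the state grid before the probabilities are accumulated.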
II-B4 Observation model
The observation model characterizes what the ego vehicle can sense about the state space. We can reasonably assume that the position and velocity of the ego vehicle are nearly perfectly observable. The observation space is similar to the state space. The observation model can be described as follows:
- An object in a non-occluded area will always be detected.
- An occluded object behind an obstacle will not be detected.
- If an object is detected, the measured quantities, such as the position, the velocity, and the orientation of the pedestrian, follow a normal distribution around the true state. The parameters of the distribution depend on the perception system model.
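The three rules above can be sketched as a sampling function. This is an illustrative sketch only: the state layout `(lon, lat, speed)`, the omission of the orientation, and the noise values are assumptions, and the occlusion test is reduced to a boolean flag supplied by the caller.

```python
import random


def observe_pedestrian(true_state, occluded, sigma_pos, sigma_vel, rng):
    """Return a noisy (lon, lat, speed) measurement, or None if the pedestrian is hidden.

    true_state: (lon, lat, speed) of the pedestrian in the Frenet frame.
    occluded:   True if an obstacle blocks the line of sight.
    """
    if occluded:
        return None                      # occluded pedestrians are never detected
    lon, lat, speed = true_state
    return (
        rng.gauss(lon, sigma_pos),
        rng.gauss(lat, sigma_pos),
        rng.gauss(speed, sigma_vel),
    )


rng = random.Random(0)                   # seeded for reproducibility
hidden = observe_pedestrian((20.0, -3.0, 1.4), True, 0.5, 0.5, rng)
visible = observe_pedestrian((20.0, -3.0, 1.4), False, 0.5, 0.5, rng)
```

A missed detection (`None`) is itself informative: in the belief update it lowers the probability of all states in which the pedestrian would have been visible.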
II-B5 Reward model
The reward model defines the objective of the POMDP planner. The ego vehicle receives a penalty for colliding with a pedestrian. We define an additional reward signal to keep the velocity sufficiently high and to stay in the center of the lane. If the ego vehicle drives at the desired velocity and stays in the center of the lane, no reward is received. Otherwise, the reward decreases linearly with the velocity difference and the lateral offset to the lane center. Longitudinal and lateral actions incur a penalty to avoid too many interventions. The resulting behavior of the POMDP planner can be modified by choosing different values for penalties and rewards. These values can be tuned through simulation on defined scenarios to balance collision avoidance and efficiency, as described in section IV.
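A reward function with this structure might look like the sketch below. The weight names and values are hypothetical placeholders chosen for illustration; the paper tunes its own values through simulation.

```python
def reward(collision, velocity, lateral_offset, lon_action, lat_action,
           desired_velocity, w):
    """Reward for one step; `w` holds hypothetical weights (all non-negative)."""
    r = 0.0
    if collision:
        r -= w["collision"]
    r -= w["velocity"] * abs(desired_velocity - velocity)   # linear in the velocity gap
    r -= w["lateral"] * abs(lateral_offset)                 # linear in the lane offset
    if lon_action != 0.0:
        r -= w["lon_action"]                                # discourage interventions
    if lat_action != 0.0:
        r -= w["lat_action"]
    return r


weights = {"collision": 100.0, "velocity": 1.0, "lateral": 1.0,
           "lon_action": 0.5, "lat_action": 0.5}
nominal = reward(False, 10.0, 0.0, 0.0, 0.0, desired_velocity=10.0, w=weights)
slow = reward(False, 8.0, 0.0, -1.0, 0.0, desired_velocity=10.0, w=weights)
crash = reward(True, 8.0, 0.0, -1.0, 0.0, desired_velocity=10.0, w=weights)
```

The ratio between the collision penalty and the velocity weight is what trades safety against efficiency, which is why these values need tuning.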
II-C Solving the optimal policy for multiple road users
The POMDP model described in the previous section handles only one pedestrian. To extend the capabilities of the resulting policy, we use a utility decomposition method [14]. Every pedestrian is considered independently, and the global utility function is approximated as the minimum belief-action utility over the individual pedestrians:

$$U^*(b, a) \approx \min_{i} U(b_i, a)$$

where $b_i$ is the belief over the state of pedestrian $i$ and $U$ is the utility function obtained from solving the POMDP considering a single pedestrian. Taking the minimum results in taking the action associated with the most critical pedestrian. Figure 3 shows an example of a policy obtained by solving the POMDP model. The color shows the longitudinal action given by the policy for a given longitudinal and lateral distance in the Frenet frame. Because the state space is multi-dimensional, we fixed the ego velocity and the pedestrian velocity and orientation to visualize the policy for every longitudinal and lateral position in the state space. The Frenet frame is relative to the vehicle. A decreasing longitudinal distance means that the pedestrian is closer to the car, and a lateral position of −2 m means the pedestrian is on the right side. We can observe that at a higher velocity, braking must start earlier as the pedestrian gets closer. Moreover, if the pedestrian is further left, the braking intervention happens later.
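The decomposition can be sketched by evaluating a single-pedestrian utility table once per pedestrian and fusing the results with a minimum. The table `Q`, the two-state layout, and the beliefs are hypothetical, and the belief-weighted sum follows the QMDP approximation rather than any code from the paper.

```python
import numpy as np


def fused_action(beliefs, Q):
    """Pick the action maximizing the minimum per-pedestrian utility.

    beliefs: list of arrays of shape (n_states,), one belief per pedestrian.
    Q:       array of shape (n_states, n_actions) for a single pedestrian.
    Returns (action index, fused utility per action).
    """
    per_pedestrian = np.stack([b @ Q for b in beliefs])   # (n_pedestrians, n_actions)
    fused = per_pedestrian.min(axis=0)                    # most critical pedestrian wins
    return int(np.argmax(fused)), fused


# Hypothetical table: states = (far, near), actions = (keep speed, brake).
Q = np.array([
    [0.0, -1.0],     # pedestrian far: braking only costs comfort
    [-10.0, -2.0],   # pedestrian near: keeping speed is dangerous
])
beliefs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
action, fused = fused_action(beliefs, Q)
```

The cost of this fusion grows linearly with the number of pedestrians, since the single-pedestrian table is reused for each of them.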
III Autonomous Emergency Braking System
The Autonomous Emergency Braking (AEB) system works in combination with the POMDP planner. It uses the driving trajectory generated by the POMDP planner to calculate the risk of a collision. If a collision is unavoidable, the AEB system triggers an emergency stop, which has the highest priority and overrules the POMDP planner.
The AEB system runs at a higher update rate to detect critical situations as fast as possible, especially when a pedestrian appears from behind an occluded area. Like the POMDP planner, the system uses the Frenet frame to reduce the problem to a straight road. The algorithm is described in pseudocode in algorithm 1. The inputs to the AEB system are the pedestrian's position and velocity in the Frenet frame as well as the ego vehicle trajectory given by the higher-level planner.
In the first step, the time-to-brake (TTB) is calculated based on the ego vehicle trajectory. Then a prediction model gives a distribution over possible future states for the pedestrian [15]. This distribution, together with information on the future ego vehicle state, is used to compute a probability of collision, defined as the estimated fraction of future pedestrian states overlapping future ego vehicle states. The red circle in fig. 4 at 2.5 s represents the distribution of possible future states given by the prediction model. The performance of the algorithm is directly related to the quality of the prediction.
If the probability of collision is above some threshold, we carry out an additional check using the following risk metric:
[TABLE]
where TTC is the time to collision. If the risk is higher than a defined threshold, an emergency stop is triggered. The implementation of the Autonomous Emergency Braking system is available at [16].
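The two-stage check can be sketched as follows. This is an illustration under assumptions, not the released implementation: the pedestrian prediction is represented by sampled positions, the ego vehicle by an axis-aligned box, and the risk value is passed in as a plain number because its formula is not reproduced here.

```python
def collision_probability(pedestrian_samples, ego_box):
    """Fraction of predicted pedestrian positions that fall inside the ego footprint.

    pedestrian_samples: list of (lon, lat) positions predicted for one future time.
    ego_box:            (lon_min, lon_max, lat_min, lat_max) of the ego vehicle then.
    """
    lon_min, lon_max, lat_min, lat_max = ego_box
    hits = sum(
        1 for lon, lat in pedestrian_samples
        if lon_min <= lon <= lon_max and lat_min <= lat <= lat_max
    )
    return hits / len(pedestrian_samples)


def aeb_should_brake(p_collision, risk, p_threshold, risk_threshold):
    """Trigger an emergency stop only if both thresholds are exceeded."""
    return p_collision > p_threshold and risk > risk_threshold


samples = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.5, 0.5)]   # hypothetical prediction
p = collision_probability(samples, ego_box=(0.0, 1.0, 0.0, 1.0))
brake = aeb_should_brake(p, risk=0.9, p_threshold=0.5, risk_threshold=0.8)
```

Checking the cheaper collision probability first means the risk metric only has to be evaluated for pedestrians that are already likely to overlap the ego trajectory.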
IV Experiments
We compare three different approaches to get an overview of the advantages and disadvantages of the different systems:
- Autonomous Emergency Braking
- POMDP planner
- POMDP planner with AEB system
The parameters used for the evaluations are specified in table I. To evaluate the performance of the different implementations, we compare them using scenarios from the EuroNCAP test protocol for vulnerable road users. The aim of the EuroNCAP test protocol is to cover as many accidents as possible. The scenarios in the test protocol are all critical and result in a collision if no action is taken. About 75 % of all pedestrian accidents are covered by these crossing scenarios [17]. A detailed description of the scenarios can be found in [18]. In the existing version of the test protocol, every scenario has only one defined collision point. To cover a wider variation of collisions along the front of the vehicle (collision grid), we increased the number of collision points for every defined scenario, see Table II.
The EuroNCAP test protocol defines different velocities for the ego vehicle. To simplify the analysis, we assume a single fixed ego velocity. Additionally, we added three scenarios to analyze the robustness of the different approaches. In these scenarios, no intervention is required to prevent a collision. If the AEB system triggers a full brake, it is a false positive. For two of the scenarios, a pedestrian is crossing the road from the right side; in one scenario the pedestrian is to the right and in the other to the left at the passing point. With the third scenario, we evaluate efficiency in occluded areas. The CPCN scenario (with an obstacle on the right side) is used with no pedestrian crossing the road. With this scenario we can detect an overly conservative algorithm that would reduce the speed too drastically in the presence of an occluded area. Figure 4 shows examples of the EuroNCAP CPCN scenario with an obstacle on the side where the POMDP planner with the AEB system is active.
IV-A Evaluation metric
Multiple metrics are available to evaluate performance [19, 9]. We calculate the mean velocity, the mean acceleration, the mean collision velocity, the number of collisions, and the number of emergency brake interventions over all of the EuroNCAP scenarios.
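These metrics could be aggregated from per-scenario logs along the following lines. The log format and key names are hypothetical, and pooling all samples across runs (rather than averaging per run first) is an assumption of this sketch.

```python
from statistics import mean


def summarize(runs):
    """Aggregate per-scenario logs into the metrics listed above.

    Each run is a dict with keys (all names hypothetical):
      "velocities", "accelerations": lists of samples over the run,
      "collision_velocity": impact speed, or None if no collision occurred,
      "emergency_brake": True if an emergency stop was triggered.
    """
    impacts = [r["collision_velocity"] for r in runs
               if r["collision_velocity"] is not None]
    return {
        "mean_velocity": mean(v for r in runs for v in r["velocities"]),
        "mean_acceleration": mean(a for r in runs for a in r["accelerations"]),
        "mean_collision_velocity": mean(impacts) if impacts else None,
        "collisions": len(impacts),
        "emergency_brakes": sum(1 for r in runs if r["emergency_brake"]),
    }


runs = [
    {"velocities": [10.0, 8.0], "accelerations": [0.0, -2.0],
     "collision_velocity": None, "emergency_brake": False},
    {"velocities": [10.0, 4.0], "accelerations": [0.0, -6.0],
     "collision_velocity": 4.0, "emergency_brake": True},
]
summary = summarize(runs)
```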
IV-B Tuning of the reward function
The behavior of the POMDP planner is influenced by the reward function. The reward parameters must be tuned to fulfil the safety and efficiency requirements. Determining good parameters for the reward function can be challenging. We ran a parameter search and evaluated the resulting policies with the defined scenarios. The objective is to compare the collision rate, the number of emergency braking interventions, and the mean velocity of the ego vehicle. The following parameters are tuned:
- Penalty for longitudinal action (throttle/brake)
- Velocity reward to keep the velocity close to the desired velocity
- Probability of pedestrian appearance (which is a parameter of our transition model)
Figure 5 shows results for different reward functions. We measured the mean velocity, the number of collisions, and the number of emergency braking interventions for the resulting policies. The most critical cases are scenarios with an obstacle on the side, because the velocity must be reduced to avoid a collision. Note that the probability of a pedestrian appearing from behind an obstacle has a significant influence on the number of collisions and emergency braking interventions, and on the mean velocity, which decreases as this probability increases.
V Results
Before comparing the results for the different approaches, we analyze different reward configurations. Figure 6 shows different settings of reward parameters for the POMDP planner with and without the AEB system. In this experiment, the reward for lane keeping and the penalty for a longitudinal action are fixed and the probability of a pedestrian appearance varies. The top plot shows the relation between collisions and mean velocity. A collision-free configuration is possible with both approaches. Combining the POMDP planner with the AEB system results in a higher mean velocity due to the capability of the AEB system to request a stronger brake intervention. The bottom plot shows the number of emergency braking interventions, which decreases when the probability of pedestrian appearance is higher. As the number of interventions decreases, the system behaves more conservatively when passing occluded areas. Based on the results from Figure 6, we chose the reward parameters that lead to no collisions and the highest mean velocity.
Figure 7 shows the results for the EuroNCAP scenarios without occlusions. The velocity profile of the AEB system is shown on the top, and the velocity profile of the POMDP planner at the bottom. There is no difference between the POMDP planner with and without AEB because there are no occluded areas. The two False Positive scenarios are scenarios where the pedestrian is to the left and to the right of the ego vehicle at the passing point, respectively. In both cases, the AEB system does not trigger. The POMDP planner behaves differently, reducing the velocity, especially when the pedestrian is directly in front of the vehicle, as shown in one of the False Positive scenarios. For all three collision scenarios, the POMDP planner slows the vehicle and allows the pedestrian to cross. Afterwards, the ego vehicle accelerates to reach the desired velocity.
Figure 8 shows the velocity profiles for the scenario with an occlusion. In the top plot, no pedestrian crosses the road. The AEB system does not decelerate. The two POMDP planners reduce the velocity because of the occluded area. The POMDP planner alone needs to drive slower than the POMDP planner with the AEB system. The reason for this is illustrated by the bottom plot, where a pedestrian crosses the road. The POMDP planner with the AEB system is able to drive faster, but an emergency braking intervention is needed to avoid a collision. The POMDP planner alone decelerates in front of the occluded area. Driving at this reduced velocity allows it to avoid a collision with the occluded pedestrian. In this case, the AEB system on its own is not capable of avoiding the collision, because it does not slow down before the occlusion. When driving at high speed, the time to react is not sufficient to stop the vehicle. Figure 8 also shows the velocity profile for a POMDP planner with a deactivated AEB system, which we refer to as not adapted. In this case, the velocity before the obstacle is too high and the deceleration is not sufficient, which illustrates the benefit of the underlying AEB system.
Table III summarizes the performance of the three approaches. The AEB system is not able to avoid all collisions, but the two POMDP planners avoid all collisions. The implementation combining the POMDP planner and the AEB system is able to pass obstacles faster. The mean velocity is higher, but four emergency braking interventions are triggered.
VI Conclusion
This paper discussed a POMDP approach for a pedestrian collision avoidance system that is capable of handling scenarios with sensor occlusions. The system is able to handle multiple pedestrians while maintaining computational scalability. In addition, an Autonomous Emergency Braking system was implemented to extend the capability in critical situations and to increase the driving velocity in non-critical situations, even in occluded areas. We used scenarios from the EuroNCAP test protocol for vulnerable road users to evaluate our approach. The experiments showed that different behaviors can be obtained, ranging from a conservative behavior without any emergency brake interventions to a behavior where emergency brakes are always needed to avoid collisions. In the latter case, the vehicle passes obstacles on the side of the road at a higher speed.
The investigation showed that defining appropriate parameters for the reward function of the POMDP planner is challenging. The POMDP planner is designed to control the lateral behavior of the vehicle, and it would be interesting to investigate this capability in more depth. The implementation of the POMDP planner in combination with the Autonomous Emergency Braking System is publicly available [20].
ACKNOWLEDGMENT
This project received funding from the Electronic Component Systems for European Leadership Joint Undertaking under grant agreement No 737469. This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation program and Germany, Austria, Spain, Italy, Latvia, Belgium, Netherlands, Sweden, Finland, Lithuania, Czech Republic, Romania, Norway. In Austria the project was also funded by the program “IKT der Zukunft” and the Austrian Federal Ministry for Transport, Innovation and Technology (bmvit). The publication was written at VIRTUAL VEHICLE Research Center in Graz and partially funded by the COMET K2 – Competence Centers for Excellent Technologies Programme of the Federal Ministry for Transport, Innovation and Technology (bmvit), the Federal Ministry for Digital, Business and Enterprise (bmdw), the Austrian Research Promotion Agency (FFG), the Province of Styria and the Styrian Business Promotion Agency (SFG).
REFERENCES
- [1] Benjamin Volz et al. “Inferring Pedestrian Motions at Urban Crosswalks.” In IEEE Transactions on Intelligent Transportation Systems, 2018, pp. 1–12. DOI: 10.1109/TITS.2018.2827956
- [2] Michiel M. Minderhoud and Piet H. L. Bovy. “Extended time-to-collision measures for road traffic safety assessment.” In Accident Analysis & Prevention 33.1, Elsevier, 2001, pp. 89–97.
- [3] Maxime Bouton, Alireza Nakhaei, Kikuo Fujimura and Mykel J. Kochenderfer. “Scalable Decision Making with Sensor Occlusions for Autonomous Driving.” In IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 2076–2081. DOI: 10.1109/ICRA.2018.8460914
- [4] Sarah M. Thornton et al. “Value Sensitive Design for Autonomous Vehicle Motion Planning.” In IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 1157–1162. DOI: 10.1109/IVS.2018.8500441
- [5] Sebastian Brechtel, Tobias Gindele and Rüdiger Dillmann. “Probabilistic decision-making under uncertainty for autonomous driving using continuous POMDPs.” In IEEE International Conference on Intelligent Transportation Systems (ITSC), 2014, pp. 392–399. DOI: 10.1109/ITSC.2014.6957722
- [6] Markus Koschi, Christian Pek, Mona Beikirch and Matthias Althoff. “Set-Based Prediction of Pedestrians in Urban Environments Considering Formalized Traffic Rules.” In IEEE International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 2704–2711. DOI: 10.1109/ITSC.2018.8569434
- [7] Piotr Franciszek Orzechowski, Annika Meyer and Martin Lauer. “Tackling Occlusions & Limited Sensor Range with Set-based Safety Verification.” In IEEE International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 1729–1736. DOI: 10.1109/ITSC.2018.8569332
- [8] Silvia Magdici and Matthias Althoff. “Fail-safe motion planning of autonomous vehicles.” In IEEE International Conference on Intelligent Transportation Systems (ITSC), 2016, pp. 452–458. DOI: 10.1109/ITSC.2016.7795594
