How to Learn from Risk: Explicit Risk-Utility Reinforcement Learning for Efficient and Safe Driving Strategies

Lukas M. Schmidt; Sebastian Rietsch; Axel Plinge; Bjoern M. Eskofier; Christopher Mutschler

arXiv:2203.08409·cs.LG·August 3, 2022

TL;DR

This paper introduces SafeDQN, a reinforcement learning method that produces safe, interpretable, and efficient driving strategies by explicitly modeling risk and utility separately, enhancing transparency and safety in autonomous driving.

Contribution

SafeDQN is a novel RL approach that explicitly separates risk and utility modeling, enabling interpretable and safe autonomous driving policies.
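To make the idea of separate risk and utility estimates concrete, here is a minimal sketch of how two per-action estimates could be combined when choosing an action. It is an illustration only, not the paper's algorithm: the linear rule `utility - risk_weight * risk`, the function `select_action`, the action names, and all numbers are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def select_action(
    utility: Sequence[float],
    risk: Sequence[float],
    risk_weight: float,
) -> Tuple[int, List[float]]:
    """Return the index of the best action and the per-action scores.

    Assumption (not taken from the paper): each action is scored as
    utility minus risk_weight times risk, and the highest score wins.
    """
    if len(utility) == 0 or len(utility) != len(risk):
        raise ValueError("utility and risk must be non-empty and equally long")
    scores = [u - risk_weight * r for u, r in zip(utility, risk)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical per-action estimates for a single driving state.
actions = ["overtake", "keep_lane", "brake"]
utility = [1.0, 0.8, 0.2]  # made-up expected utility per action
risk = [0.9, 0.1, 0.0]     # made-up expected risk per action

for weight in (0.0, 1.0, 10.0):
    best, _ = select_action(utility, risk, weight)
    print(f"risk_weight={weight}: {actions[best]}")
# Chosen actions: overtake (0.0), keep_lane (1.0), brake (10.0)
```

Because the two estimates stay separate, the contribution of risk and of utility to each choice can be inspected directly, and raising the hypothetical `risk_weight` shifts the choice toward more cautious actions.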

Findings

1. SafeDQN achieves interpretable and safe driving policies in various scenarios.

2. State-of-the-art saliency techniques help assess risk and utility.

3. SafeDQN balances safety, interpretability, and efficiency effectively.

Abstract

Autonomous driving has the potential to revolutionize mobility and is hence an active area of research. In practice, the behavior of autonomous vehicles must be acceptable, i.e., efficient, safe, and interpretable. While vanilla reinforcement learning (RL) finds performant behavioral strategies, they are often unsafe and uninterpretable. Safe RL approaches introduce safety, but they mostly remain uninterpretable because the learned behavior is jointly optimized for safety and performance without modeling the two separately. Interpretable machine learning is rarely applied to RL. This paper proposes SafeDQN, which makes the behavior of autonomous vehicles safe and interpretable while remaining efficient. SafeDQN offers an understandable, semantic trade-off between the expected risk and the utility of actions while being algorithmically transparent. We show that…
