Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning
Stefan Pranger, Hana Chockler, Martin Tappler, and Bettina Könighofer

TL;DR
This paper introduces an importance-driven testing framework for deep reinforcement learning policies that prioritizes critical states to efficiently identify safety violations and provide formal safety guarantees.
Contribution
The paper presents a novel model-based method for ranking state importance in RL, enabling targeted testing and formal safety verification with reduced effort.
Findings
Efficiently discovers unsafe behaviors with low testing effort.
Provides formal safety guarantees, as lower and upper bounds on expected outcomes, across all modeled states in the state space.
Divides state space into safe and unsafe regions upon convergence.
Abstract
In many Deep Reinforcement Learning (RL) problems, decisions in a trained policy vary in significance for the expected safety and performance of the policy. Since RL policies are very complex, testing efforts should concentrate on states in which the agent's decisions have the highest impact on the expected outcome. In this paper, we propose a novel model-based method to rigorously compute a ranking of state importance across the entire state space. We then focus our testing efforts on the highest-ranked states. In this paper, we focus on testing for safety. However, the proposed methods can be easily adapted to test for performance. In each iteration, our testing framework computes optimistic and pessimistic safety estimates. These estimates provide lower and upper bounds on the expected outcomes of the policy execution across all modeled states in the state space. Our approach divides…
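The interplay of optimistic and pessimistic estimates with an importance ranking can be illustrated on a toy model. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: the five-state MDP, the names `violation_bounds` and `importance`, and the importance score (the spread of pessimistic action values in a state) are all hypothetical choices made for illustration. States whose policy decision has been tested are fixed to that decision; untested states take the best action for the optimistic bound and the worst action for the pessimistic bound.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# transitions[state][action] -> list of (probability, next_state)
# A state with no actions is terminal. (Hypothetical encoding.)
MDP = Dict[int, Dict[str, List[Tuple[float, int]]]]


def violation_bounds(mdp: MDP, unsafe: Set[int], tested: Dict[int, str],
                     sweeps: int = 100):
    """Return (optimistic, pessimistic) probabilities of reaching an unsafe state.

    Tested states follow the observed policy action; untested states take the
    min (optimistic) or max (pessimistic) over all actions.
    """
    def solve(choose: Callable[[Iterable[float]], float]) -> Dict[int, float]:
        # Value iteration from below for reachability probabilities.
        v = {s: 1.0 if s in unsafe else 0.0 for s in mdp}
        for _ in range(sweeps):
            for s, actions in mdp.items():
                if s in unsafe or not actions:
                    continue  # absorbing states keep their value
                q = {a: sum(p * v[t] for p, t in succ)
                     for a, succ in actions.items()}
                v[s] = q[tested[s]] if s in tested else choose(q.values())
        return v

    return solve(min), solve(max)


def importance(mdp: MDP, unsafe: Set[int], tested: Dict[int, str]) -> Dict[int, float]:
    """Assumed score: spread of pessimistic action values in each untested state."""
    _, pessimistic = violation_bounds(mdp, unsafe, tested)
    rank = {}
    for s, actions in mdp.items():
        if s in tested or s in unsafe or not actions:
            continue
        q = [sum(p * pessimistic[t] for p, t in succ) for succ in actions.values()]
        rank[s] = max(q) - min(q)
    return rank


# Toy model: state 3 is a safe terminal state, state 4 is unsafe.
toy_mdp: MDP = {
    0: {"left": [(1.0, 1)], "right": [(1.0, 2)]},
    1: {"go": [(0.9, 3), (0.1, 4)]},
    2: {"careful": [(1.0, 3)], "risky": [(0.5, 3), (0.5, 4)]},
    3: {},
    4: {},
}
unsafe_states = {4}

ranking = importance(toy_mdp, unsafe_states, tested={})
next_state_to_test = max(ranking, key=ranking.get)
print("test next:", next_state_to_test, "ranking:", ranking)
```

In this sketch, each tested decision is added to `tested` and the bounds are recomputed, so the gap between the optimistic and pessimistic estimates can only narrow as testing proceeds. Once the two estimates coincide for a state, that state's safety is settled regardless of the remaining untested decisions.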
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Focus
