Assessing and Accelerating Coverage in Deep Reinforcement Learning
Arpan Kusari

TL;DR
This paper introduces a new measure, Approximate Pseudo-Coverage (APC), to evaluate state-space coverage in deep reinforcement learning (DRL), and uses Rapidly-Exploring Random Tree (RRT) exploration to accelerate coverage, demonstrated on standard control tasks.
Contribution
It proposes what the author describes as, to the best of their knowledge, the first assessment method for coverage in DRL, based on APC, and combines it with RRT-based exploration to improve coverage in high-dimensional state spaces.
Findings
APC effectively measures coverage gaps in DRL.
RRT-based exploration accelerates coverage in standard tasks.
Coverage gaps left by relying on randomness can lead to drastic real-world failures, so assessing coverage supports safer and more reliable DRL applications.
Abstract
Current deep reinforcement learning (DRL) algorithms utilize randomness in simulation environments to assume complete coverage of the state space. However, particularly in high dimensions, relying on randomness may lead to gaps in the coverage of the trained DRL neural network model, which in turn may lead to drastic and often fatal real-world situations. To the best of the author's knowledge, the assessment of coverage for DRL is lacking in current research literature. Therefore, in this paper, a novel measure, Approximate Pseudo-Coverage (APC), is proposed for assessing the coverage in DRL applications. We propose to calculate APC by projecting the high-dimensional state space onto a lower-dimensional manifold and quantifying the occupied space. Furthermore, we utilize an exploration-exploitation strategy for coverage maximization using Rapidly-Exploring Random Tree (RRT). The efficacy…
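The idea of projecting visited states to a lower dimension and quantifying the occupied space can be illustrated with a minimal sketch. This is not the paper's APC algorithm: the linear PCA projection, the uniform grid, the function name `approximate_pseudo_coverage`, and the `n_components` and `bins` parameters are all assumptions chosen for illustration. The score here is the fraction of grid cells, within the bounding box of the projected samples, that contain at least one visited state.

```python
import numpy as np


def approximate_pseudo_coverage(states, n_components=2, bins=20):
    """Illustrative coverage score in (0, 1]; NOT the paper's exact APC.

    Assumptions: a linear (PCA) projection stands in for the
    lower-dimensional manifold, and a uniform grid over the bounding
    box of the projected samples stands in for "occupied space".
    """
    states = np.asarray(states, dtype=float)
    centered = states - states.mean(axis=0)

    # PCA via SVD: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T

    # Count how many samples fall into each cell of the grid.
    counts, _ = np.histogramdd(projected, bins=bins)

    # Fraction of cells that were visited at least once.
    return np.count_nonzero(counts) / counts.size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    visited = rng.uniform(-1.0, 1.0, size=(5000, 6))  # hypothetical 6-D states
    print(approximate_pseudo_coverage(visited, bins=10))
```

Because the grid spans only the bounding box of the observed samples, this score measures how evenly the visited region is filled, not how much of the full reachable state space was visited. Comparing against known state bounds would require passing an explicit range to the histogram.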
Taxonomy
Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Adversarial Robustness in Machine Learning
