Keeping Minimal Experience to Achieve Efficient Interpretable Policy Distillation
Xiao Liu, Shuyang Liu, Wenbin Li, Shangdong Yang, Yang Gao

TL;DR
This paper introduces BCMER, an interpretable policy distillation method that retains only critical boundary experiences, significantly reducing experience storage while maintaining policy performance in reinforcement learning.
Contribution
The paper proposes BCMER, a novel framework that identifies and preserves essential boundary experiences for efficient and interpretable policy distillation in reinforcement learning.
Findings
Reduces experience storage to as low as 1.4% of the naive experience set.
Maintains high policy performance with minimal experience.
Effective in regimes where experience storage is limited.
Abstract
Although deep reinforcement learning has become a universal solution for complex control tasks, its real-world applicability is still limited because its policies lack security guarantees. To address this problem, we propose Boundary Characterization via the Minimum Experience Retention (BCMER), an end-to-end Interpretable Policy Distillation (IPD) framework. Unlike previous IPD approaches, BCMER distinguishes the importance of experiences and keeps a minimal but critical experience pool with almost no loss of policy similarity. Specifically, the proposed BCMER contains two basic steps. Firstly, we propose a novel multidimensional hyperspheres intersection (MHI) approach to divide experience points into boundary points and internal points, and reserve the crucial boundary points. Secondly, we develop a nearest-neighbor-based model to generate robust and interpretable decision rules…
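The two steps described in the abstract can be illustrated with a toy sketch. The abstract does not specify how MHI works, so the filter below is an assumed stand-in, not the paper's method: it treats an experience as a boundary point when any of its k nearest neighbours carries a different action. The names `keep_boundary` and `knn_action`, the parameter `k`, and the one-dimensional data are all hypothetical and chosen only for illustration.

```python
import math

def knn_action(pool, state):
    """Return the action of the stored experience closest to `state` (1-NN rule)."""
    _, action = min(pool, key=lambda exp: math.dist(exp[0], state))
    return action

def keep_boundary(pool, k=2):
    """Keep experiences that have a differently-labelled point among their k nearest
    neighbours. Assumed stand-in heuristic, NOT the paper's MHI procedure."""
    kept = []
    for i, (state, action) in enumerate(pool):
        others = [exp for j, exp in enumerate(pool) if j != i]
        neighbours = sorted(others, key=lambda exp: math.dist(exp[0], state))[:k]
        if any(n_action != action for _, n_action in neighbours):
            kept.append((state, action))
    return kept

# Toy teacher policy on a 1-D state space: action 0 below 5, action 1 otherwise.
pool = [((float(x),), 0 if x < 5 else 1) for x in range(10)]
kept = keep_boundary(pool, k=2)
print(f"retained {len(kept)} of {len(pool)} experiences")
```

In this toy setting only the experiences on either side of the decision boundary survive, and a 1-NN rule over the reduced pool gives the same action as a 1-NN rule over the full pool for queries away from exact ties. Each decision can be explained by pointing at the single stored experience that produced it.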
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
