MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation
Aaron M. Roth, Jing Liang, Ram Sriram, Elham Tabassi, and Dinesh Manocha

TL;DR
MSVIPER distills reinforcement learning policies into compact decision trees for robot navigation, giving interpretable policies that can be improved without retraining and that reduce freezing and oscillation in indoor and outdoor scenes.
Contribution
The paper introduces MSVIPER, a new policy distillation approach that produces compact decision trees from RL policies, with techniques for policy improvement without retraining.
Findings
Up to 95% reduction in freezing and oscillation behaviors.
Decision trees accurately mimic expert RL policies.
Enhanced outdoor navigation on complex terrains.
Abstract
We present Multiple Scenario Verifiable Reinforcement Learning via Policy Extraction (MSVIPER), a new method for policy distillation to decision trees for improved robot navigation. MSVIPER learns an "expert" policy using any Reinforcement Learning (RL) technique involving learning a state-action mapping and then uses imitation learning to learn a decision-tree policy from it. We demonstrate that MSVIPER results in efficient decision trees and can accurately mimic the behavior of the expert policy. Moreover, we present efficient policy distillation and tree-modification techniques that take advantage of the decision tree structure to allow improvements to a policy without retraining. We use our approach to improve the performance of RL-based robot navigation algorithms for indoor and outdoor scenes. We demonstrate the benefits in terms of reduced freezing and oscillation behaviors (by…
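The pipeline the abstract describes (an expert policy, imitation learning into a decision tree, then direct edits to the tree) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional toy task, the hand-written `expert_policy` standing in for a trained RL policy, the small Gini-based tree learner, and the helper names `rollout`, `distill`, and `relabel_leaf` are all assumptions made for illustration. The loop is a plain DAgger-style aggregation over several start scenarios, and it omits any sample weighting or other refinements the paper may use.

```python
import copy
from collections import Counter

ACTIONS = (-1.0, 0.0, 1.0)  # action index 0: move left, 1: stay, 2: move right

def expert_policy(state):
    # Hypothetical stand-in for a trained RL expert (assumption, not from the paper).
    offset = state[0]  # goal position minus robot position
    if abs(offset) <= 0.5:
        return 1
    return 2 if offset > 0 else 0

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_tree(data, depth):
    # data: list of (state, action) pairs; returns a nested-dict decision tree.
    labels = [a for _, a in data]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == 0 or len(set(labels)) == 1:
        return {"leaf": majority}
    best = None
    for f in range(len(data[0][0])):
        values = sorted({s[f] for s, _ in data})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [d for d in data if d[0][f] <= t]
            right = [d for d in data if d[0][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([a for _, a in left])
                     + len(right) * gini([a for _, a in right])) / len(data)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return {"leaf": majority}
    _, f, t, left, right = best
    return {"feature": f, "threshold": t,
            "left": fit_tree(left, depth - 1), "right": fit_tree(right, depth - 1)}

def predict(tree, state):
    while "leaf" not in tree:
        go_left = state[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]

def rollout(policy, start_offset, steps=12):
    # Toy dynamics: each action shifts the robot by 0.5 toward or away from the goal.
    visited, offset = [], start_offset
    for _ in range(steps):
        state = (offset,)
        visited.append(state)
        offset -= 0.5 * ACTIONS[policy(state)]
    return visited

def distill(scenarios, iterations=3, depth=3):
    # DAgger-style loop: roll out the current policy in every scenario,
    # label the visited states with the expert's action, then refit the tree.
    dataset, tree = [], None
    for _ in range(iterations):
        policy = expert_policy if tree is None else (lambda s, t=tree: predict(t, s))
        for start in scenarios:
            dataset += [(s, expert_policy(s)) for s in rollout(policy, start)]
        tree = fit_tree(dataset, depth)
    return tree

def relabel_leaf(tree, state, new_action):
    # Edit the action of the leaf reached by `state`, without any retraining.
    edited = copy.deepcopy(tree)
    node = edited
    while "leaf" not in node:
        go_left = state[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    node["leaf"] = new_action
    return edited

if __name__ == "__main__":
    tree = distill([-4.0, -1.5, 2.0, 5.0])  # four start offsets act as scenarios
    print(tree)
    print(relabel_leaf(tree, (0.0,), 2))
```

Because the distilled policy is an explicit tree of thresholds and leaf actions, a change such as `relabel_leaf` touches only the targeted leaf and leaves every other decision path as it was. That locality is the property the abstract relies on when it speaks of improving a policy without retraining; the specific modifications used in the paper are not reproduced here.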
Taxonomy
Topics: Reinforcement Learning in Robotics
