Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Technical Report)
Florent Delgrange, Ann Nowé, Guillermo A. Pérez

TL;DR
This paper develops a method to simplify and verify reinforcement learning policies in complex environments by creating a discrete latent model with formal guarantees, enabling safer policy deployment.
Contribution
It introduces a novel approach combining variational autoencoders and bisimulation bounds to produce simplified, verifiable policies with formal guarantees in RL settings.
Findings
Derived new bisimulation bounds between the unknown environment and a learned discrete latent model of it.
Successfully trained a variational autoencoder with formal bisimulation guarantees.
Produced distilled policies with provable correctness in latent models.
Abstract
We consider the challenge of policy simplification and verification in the context of policies learned through reinforcement learning (RL) in continuous environments. In well-behaved settings, RL algorithms have convergence guarantees in the limit. While these guarantees are valuable, they are insufficient for safety-critical applications. Furthermore, they are lost when applying advanced techniques such as deep-RL. To recover guarantees when applying advanced RL algorithms to more complex environments with (i) reachability, (ii) safety-constrained reachability, or (iii) discounted-reward objectives, we build upon the DeepMDP framework introduced by Gelada et al. to derive new bisimulation bounds between the unknown environment and a learned discrete latent model of it. Our bisimulation bounds enable the application of formal methods for Markov decision processes. Finally, we show how…
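To make concrete what "the application of formal methods for Markov decision processes" can look like once a discrete model is available, here is a minimal sketch of a safety-constrained reachability check on a small finite MDP under a fixed policy. It is not the authors' implementation: the four-state model, the action names, the policy, and the function `reach_probability` are all hypothetical and chosen only for illustration. The value iteration used here is a standard technique for reachability probabilities and stands in for whatever model checker one would use in practice.

```python
def reach_probability(transitions, policy, targets, unsafe=frozenset(),
                      tol=1e-10, max_iters=10_000):
    """Probability of reaching `targets` while avoiding `unsafe` under a fixed policy.

    transitions[s][a] is a dict {next_state: probability}.
    Iterating from 0 converges to the least fixed point, which is the
    reachability probability in a finite Markov chain.
    """
    states = list(transitions)
    value = {s: 1.0 if s in targets else 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if s in targets or s in unsafe:
                continue  # absorbing for this objective: value stays 1 or 0
            successors = transitions[s][policy[s]]
            new = sum(p * value[t] for t, p in successors.items())
            delta = max(delta, abs(new - value[s]))
            value[s] = new
        if delta < tol:
            break
    return value


# Hypothetical latent MDP: state 2 is the goal, state 3 is unsafe.
transitions = {
    0: {"a": {0: 0.5, 1: 0.5}, "b": {2: 0.1, 3: 0.9}},
    1: {"a": {2: 0.8, 3: 0.2}, "b": {1: 1.0}},
    2: {"a": {2: 1.0}, "b": {2: 1.0}},
    3: {"a": {3: 1.0}, "b": {3: 1.0}},
}
policy = {0: "a", 1: "a", 2: "a", 3: "a"}  # hypothetical distilled policy

values = reach_probability(transitions, policy, targets={2}, unsafe={3})
print(values[0])
```

In the setting the abstract describes, a quantity of this kind would be computed on the learned latent model, and the bisimulation bounds are what relate it back to the original environment. The sketch above does not compute or check those bounds.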
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
