Lipschitz Continuity in Model-based Reinforcement Learning
Kavosh Asadi, Dipendra Misra, Michael L. Littman

TL;DR
This paper investigates Lipschitz continuous models in model-based reinforcement learning. It provides theoretical bounds on multi-step prediction error and on value-function estimation error, and demonstrates empirical benefits of controlling the Lipschitz constant of neural-network models.
Contribution
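It introduces a novel bound on the multi-step prediction error of Lipschitz models, quantified with the Wasserstein metric, proves an error bound for the resulting value-function estimate, and shows empirically the benefits of controlling the Lipschitz constant of neural-network models.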
Findings
Lipschitz models admit a bound on multi-step prediction error, quantified with the Wasserstein metric.
The value function estimated from a Lipschitz model is itself Lipschitz, and its estimation error is bounded.
Controlling the Lipschitz constant of neural-network models yields empirical benefits.
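To make "controlling the Lipschitz constant" concrete, here is a minimal sketch of one common way to do it. It is not necessarily the procedure used in the paper. It relies on a standard fact: for a feed-forward network with 1-Lipschitz activations such as ReLU, the product of the layers' spectral norms is an upper bound on the network's Lipschitz constant under the Euclidean norm. The function names (`lipschitz_upper_bound`, `rescale_to_bound`, `mlp`) and the choice to split the budget evenly across layers are assumptions made for illustration.

```python
import numpy as np

def lipschitz_upper_bound(weights):
    """Product of spectral norms: an upper bound on the L2 Lipschitz
    constant of an MLP whose activations are 1-Lipschitz (e.g. ReLU)."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))

def rescale_to_bound(weights, max_constant):
    """Shrink layers (never enlarge them) so that the product of spectral
    norms is at most max_constant. The budget is split evenly across
    layers, which is an illustrative choice, not the paper's."""
    per_layer = max_constant ** (1.0 / len(weights))
    scaled = []
    for W in weights:
        norm = np.linalg.norm(W, ord=2)  # largest singular value of W
        factor = min(1.0, per_layer / norm) if norm > 0 else 1.0
        scaled.append(W * factor)
    return scaled

def mlp(x, weights):
    """Bias-free ReLU network; biases would not change the Lipschitz constant."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)
    return weights[-1] @ h
```

In a training loop, a step like `rescale_to_bound` would typically be applied after each gradient update, so that the learned transition model never exceeds the chosen constant. The bound is generally loose: the true Lipschitz constant can be much smaller than the product of spectral norms.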
Abstract
We examine the impact of learning Lipschitz continuous models in the context of model-based reinforcement learning. We provide a novel bound on multi-step prediction error of Lipschitz models where we quantify the error using the Wasserstein metric. We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz. We conclude with empirical results that show the benefits of controlling the Lipschitz constant of neural-network models.
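For readers unfamiliar with the two notions the abstract relies on, the standard textbook definitions are sketched below. The notation ($K$, $W$, $\Lambda$, $d_1$, $d_2$) is assumed here for illustration and may differ from the paper's.

```latex
% Lipschitz constant of f : (S_1, d_1) \to (S_2, d_2)
K_{d_1,d_2}(f) := \sup_{s_1 \neq s_2}
    \frac{d_2\bigl(f(s_1), f(s_2)\bigr)}{d_1(s_1, s_2)}

% Wasserstein metric between distributions \mu_1, \mu_2 on (S, d),
% where \Lambda is the set of joint distributions on S \times S
% whose marginals are \mu_1 and \mu_2
W(\mu_1, \mu_2) := \inf_{j \in \Lambda}
    \int\!\!\int j(s_1, s_2)\, d(s_1, s_2)\, \mathrm{d}s_2\, \mathrm{d}s_1
```

A function is called Lipschitz when $K_{d_1,d_2}(f)$ is finite. The Wasserstein metric depends on the underlying distance $d$, which is what lets a prediction-error bound be stated in terms of the model's Lipschitz constant.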
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
