Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function

Ruijie Zheng; Xiyao Wang; Huazhe Xu; Furong Huang

arXiv:2302.01244·cs.LG·February 3, 2023

TL;DR

This paper argues that regularizing the Lipschitz continuity of the value function can remove the need for probabilistic dynamics model ensembles in model-based reinforcement learning, leading to more efficient algorithms.

Contribution

The paper introduces practical mechanisms to regularize the value function's Lipschitz condition, showing that a single model with this regularization can outperform ensemble methods.

Findings

01. Regularizing the value function's Lipschitz continuity reduces the gap between the Bellman operators induced by the true and the learned dynamics.

02. A single model with Lipschitz regularization outperforms ensemble models in experiments.

03. Theoretical analysis supports the effectiveness of Lipschitz regularization in model-based RL.
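The first finding can be checked numerically on a toy value function. The sketch below is an illustration only and is not the paper's training procedure: it assumes a small ReLU network as the value function, bounds its Lipschitz constant by the product of the layers' spectral norms, and compares the value gap between a "true" and a "model-predicted" next state before and after rescaling the weights. All names (`value`, `lipschitz_upper_bound`, `rescale_to_bound`) and the noise level are hypothetical.

```python
import numpy as np

def value(s, W1, w2):
    """Toy value function V(s) = w2 . relu(W1 @ s); returns a scalar."""
    return float(w2 @ np.maximum(W1 @ s, 0.0))

def lipschitz_upper_bound(W1, w2):
    """Product of layer 2-norms; a valid bound because ReLU is 1-Lipschitz."""
    return float(np.linalg.norm(W1, 2) * np.linalg.norm(w2, 2))

def rescale_to_bound(W, max_norm):
    """Shrink W so that its 2-norm (spectral norm for matrices) is at most max_norm."""
    norm = np.linalg.norm(W, 2)
    return W if norm <= max_norm else W * (max_norm / norm)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 4))   # hidden layer weights (assumed sizes)
w2 = rng.normal(size=16)        # output layer weights

s_true = rng.normal(size=4)                  # next state under the true dynamics
s_model = s_true + 0.1 * rng.normal(size=4)  # next state from an imperfect model
model_error = float(np.linalg.norm(s_true - s_model))

candidates = {
    "unregularized": (W1, w2),
    "rescaled": (rescale_to_bound(W1, 1.0), rescale_to_bound(w2, 1.0)),
}

results = {}
for label, (A, b) in candidates.items():
    gap = abs(value(s_true, A, b) - value(s_model, A, b))
    bound = lipschitz_upper_bound(A, b) * model_error
    results[label] = (gap, bound)
    print(f"{label}: value gap = {gap:.4f}, Lipschitz bound = {bound:.4f}")
```

In both cases the value gap stays below the Lipschitz bound times the model error, and rescaling the weights tightens that bound. Weight rescaling is used here only because it is easy to verify by hand; it also shrinks the values themselves, so it should not be read as a recommended regularizer.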

Abstract

Probabilistic dynamics model ensemble is widely used in existing model-based reinforcement learning methods as it outperforms a single dynamics model in both asymptotic performance and sample efficiency. In this paper, we provide both practical and theoretical insights on the empirical success of the probabilistic dynamics model ensemble through the lens of Lipschitz continuity. We find that, for a value function, the stronger the Lipschitz condition is, the smaller the gap between the true dynamics- and learned dynamics-induced Bellman operators is, thus enabling the converged value function to be closer to the optimal value function. Hence, we hypothesize that the key functionality of the probabilistic dynamics model ensemble is to regularize the Lipschitz condition of the value function using generated samples. To test this hypothesis, we devise two practical robust training…
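A simplified worked bound shows why a smaller Lipschitz constant shrinks the gap between the two Bellman operators. This is a sketch under assumptions that are not taken from the paper: deterministic true dynamics $f$ and learned dynamics $\hat{f}$, a shared reward $r$, discount $\gamma$, and a value function $V$ that is $L$-Lipschitz in the state. The paper's own statement may differ, for instance by covering stochastic dynamics.

```latex
\begin{align*}
(\mathcal{T} V)(s,a) &= r(s,a) + \gamma\, V\big(f(s,a)\big), \\
(\hat{\mathcal{T}} V)(s,a) &= r(s,a) + \gamma\, V\big(\hat{f}(s,a)\big), \\
\big|(\mathcal{T} V)(s,a) - (\hat{\mathcal{T}} V)(s,a)\big|
  &= \gamma\, \big|V\big(f(s,a)\big) - V\big(\hat{f}(s,a)\big)\big| \\
  &\le \gamma\, L\, \big\|f(s,a) - \hat{f}(s,a)\big\|.
\end{align*}
```

Under these assumptions, for a fixed model error $\|f - \hat{f}\|$, the operator gap scales linearly with $L$, which is the sense in which a "stronger" Lipschitz condition (a smaller $L$) helps.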

Videos

Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Smart Grid Security and Resilience

Methods: Test