Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling

Huayu Chen; Cheng Lu; Chengyang Ying; Hang Su; Jun Zhu

arXiv:2209.14548 · cs.LG · March 1, 2023 · 6 cites

TL;DR

This paper proposes a generative approach to offline reinforcement learning that decouples the learned policy into an expressive generative behavior model and an action evaluation model. The decoupling avoids learning an explicitly parameterized policy and lets diverse behaviors be modeled with existing generative methods such as diffusion.

Contribution

It decouples the learned policy into a generative behavior model and an action evaluation model, so that no explicitly parameterized policy with a closed-form expression has to be learned and the behavior model can capture diverse actions.

Findings

01. Achieves competitive or superior performance on D4RL benchmarks.

02. Successfully learns from heterogeneous datasets with multiple strategies.

03. Outperforms previous methods in complex tasks like AntMaze.

Abstract

In offline reinforcement learning, weighted regression is a common method to ensure the learned policy stays close to the behavior policy and to prevent selecting out-of-sample actions. In this work, we show that due to the limited distributional expressivity of policy models, previous methods might still select unseen actions during training, which deviates from their initial motivation. To address this problem, we adopt a generative approach by decoupling the learned policy into two parts: an expressive generative behavior model and an action evaluation model. The key insight is that such decoupling avoids learning an explicitly parameterized policy model with a closed-form expression. Directly learning the behavior policy allows us to leverage existing advances in generative modeling, such as diffusion-based methods, to model diverse behaviors. As for action evaluation, we combine…
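The decoupling described in the abstract can be illustrated with a minimal sketch: candidate actions come from a behavior model, a separate evaluation model scores them, and the final action is picked among the candidates, so it never leaves the support of what the behavior model generates. Everything named here is an assumption made for illustration: `sample_behavior_actions` is a toy two-mode sampler standing in for a trained generative model, `evaluate_action` is a toy score function, and the softmax resampling with a `temperature` parameter is one plausible selection rule, not necessarily the one the paper uses (the abstract is truncated before that detail).

```python
import math
import random


def sample_behavior_actions(state, num_candidates, rng):
    """Hypothetical stand-in for a generative behavior model.

    Draws 1-D actions from a two-mode mixture to mimic a dataset that
    contains two distinct strategies. A real model would be trained on data.
    """
    actions = []
    for _ in range(num_candidates):
        center = -1.0 if rng.random() < 0.5 else 1.0
        actions.append(rng.gauss(center, 0.1))
    return actions


def evaluate_action(state, action):
    """Hypothetical action evaluation model: a toy score that peaks at action = 1."""
    return -(action - 1.0) ** 2


def select_action(state, num_candidates=32, temperature=0.1, seed=0):
    """Sample candidates from the behavior model, then resample by score.

    Returns the chosen action together with the candidate set, so callers
    can check that the choice is always one of the sampled actions.
    """
    rng = random.Random(seed)
    candidates = sample_behavior_actions(state, num_candidates, rng)
    scores = [evaluate_action(state, a) for a in candidates]
    # Subtract the max score before exponentiating for numerical stability.
    max_score = max(scores)
    weights = [math.exp((s - max_score) / temperature) for s in scores]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen, candidates


if __name__ == "__main__":
    action, candidates = select_action(state=None)
    print("chosen action:", action)
    print("chosen from sampled candidates:", action in candidates)
```

Because the policy is only defined implicitly through sampling and scoring, there is no closed-form policy network to train in this sketch; lowering `temperature` makes the selection greedier with respect to the toy score.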

Code & Models

Repositories

chendrag/sfbc (PyTorch, official)

Videos

Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics