Model-based Offline Policy Optimization with Adversarial Network

Junming Yang; Xingguo Chen; Shengyuan Wang; Bolei Zhang

arXiv:2309.02157·cs.LG·September 6, 2023

TL;DR

This paper introduces MOAN, a model-based offline RL framework that uses adversarial learning to improve the transition model's generalization and uncertainty estimation and to reduce over-conservatism, outperforming state-of-the-art offline RL baselines on benchmarks.

Contribution

Proposes MOAN, which employs adversarial learning for better generalization and uncertainty quantification in offline RL transition models, addressing over-conservatism and unreliable estimates.

Findings

01 Outperforms state-of-the-art offline RL baselines

02 Generates diverse in-distribution samples

03 Provides more accurate uncertainty quantification

Abstract

Model-based offline reinforcement learning (RL), which builds a supervised transition model with a logging dataset to avoid costly interactions with the online environment, has been a promising approach for offline policy optimization. As the discrepancy between the logging data and the online environment may result in a distributional shift problem, many prior works have studied how to build robust transition models conservatively and estimate the model uncertainty accurately. However, over-conservatism can limit the exploration of the agent, and the uncertainty estimates may be unreliable. In this work, we propose a novel Model-based Offline policy optimization framework with Adversarial Network (MOAN). The key idea is to use adversarial learning to build a transition model with better generalization, where an adversary is introduced to distinguish between in-distribution and…

Code & Models

Repositories

junming-yang/moan (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Fuel Cells and Related Materials