Doubly Mild Generalization for Offline Reinforcement Learning

Yixiu Mao; Qi Wang; Yun Qu; Yuhang Jiang; Xiangyang Ji

arXiv:2411.07934·cs.LG·November 14, 2024

TL;DR

This paper introduces Doubly Mild Generalization (DMG), an offline RL approach that exploits mild generalization beyond the dataset to improve performance while limiting value overestimation and the propagation of generalization errors.

Contribution

The paper proposes DMG, a method that combines mild action generalization with mild generalization propagation, supported by theoretical guarantees and state-of-the-art empirical results.

Findings

01. DMG outperforms existing methods on Gym-MuJoCo and AntMaze tasks.

02. Under certain conditions, DMG is guaranteed to perform better than the in-sample optimal policy.

03. DMG transitions seamlessly from offline to online learning, with strong fine-tuning results.

Abstract

Offline Reinforcement Learning (RL) suffers from extrapolation error and value overestimation. From a generalization perspective, this issue can be attributed to the over-generalization of value functions or policies towards out-of-distribution (OOD) actions. Significant efforts have been devoted to mitigating such generalization, and recent in-sample learning approaches have further succeeded in entirely eschewing it. Nevertheless, we show that mild generalization beyond the dataset can be trusted and leveraged to improve performance under certain conditions. To appropriately exploit generalization in offline RL, we propose Doubly Mild Generalization (DMG), comprising (i) mild action generalization and (ii) mild generalization propagation. The former refers to selecting actions in a close neighborhood of the dataset to maximize the Q values. Even so, the potential erroneous…
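
The idea of mild action generalization stated in the abstract, maximizing Q values over actions in a close neighborhood of the dataset, can be illustrated with a toy sketch. This is not the authors' implementation (the official code is in the repository listed under Code & Models). The function names, the one-dimensional action space, the random-perturbation search, the `radius` parameter, and the toy Q function are all assumptions made for illustration only.

```python
import random


def in_sample_max(q_fn, state, dataset_actions):
    """Maximize Q over dataset actions only (no generalization)."""
    best = max(dataset_actions, key=lambda a: q_fn(state, a))
    return best, q_fn(state, best)


def mild_action_max(q_fn, state, dataset_actions, radius=0.1,
                    n_samples=32, seed=0):
    """Maximize Q over dataset actions plus random perturbations that
    stay within `radius` of some dataset action (hypothetical helper)."""
    rng = random.Random(seed)
    candidates = list(dataset_actions)  # always keep the in-sample actions
    for a in dataset_actions:
        for _ in range(n_samples):
            candidates.append(a + rng.uniform(-radius, radius))
    best = max(candidates, key=lambda a: q_fn(state, a))
    return best, q_fn(state, best)


def toy_q(state, action):
    """Toy Q function (assumption): peaks at action = 0.5."""
    return -(action - 0.5) ** 2


dataset_actions = [0.0, 0.3, 0.42]  # actions observed for one state
a_in, q_in = in_sample_max(toy_q, None, dataset_actions)
a_mild, q_mild = mild_action_max(toy_q, None, dataset_actions, radius=0.1)

print("in-sample best action:", a_in, "Q:", q_in)
print("mildly generalized action:", a_mild, "Q:", q_mild)
```

Because the candidate set always contains the dataset actions, the mildly generalized maximum can never fall below the in-sample maximum under the same Q function. Whether the higher value can be trusted depends on how accurate Q is near the data, which is the question the paper addresses.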

Code & Models

Repositories

maoyixiu/dmg (PyTorch, official)

Videos

Doubly Mild Generalization for Offline Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control