Off-policy Reinforcement Learning with Model-based Exploration Augmentation
Likun Wang, Xiangteng Zhang, Yinuo Wang, Guojian Zhan, Wenxuan Wang, Haoyu Gao, Jingliang Duan, Shengbo Eben Li

TL;DR
This paper introduces MoGE, a model-based exploration augmentation method that synthesizes critical states and transitions to improve exploration efficiency in reinforcement learning, demonstrating significant performance gains.
Contribution
MoGE is a novel, modular approach combining diffusion-based state generation and transition modeling to enhance exploration in off-policy RL algorithms.
Findings
Improves sample efficiency in complex control tasks.
Achieves higher performance compared to baseline methods.
Effectively generates under-explored states and transitions.
Abstract
Exploration is fundamental to reinforcement learning (RL), as it determines how effectively an agent discovers and exploits the underlying structure of its environment to achieve optimal performance. Existing exploration methods generally fall into two categories: active exploration and passive exploration. The former introduces stochasticity into the policy but struggles in high-dimensional environments, while the latter adaptively prioritizes transitions in the replay buffer to enhance exploration, yet remains constrained by limited sample diversity. To address the limitation in passive exploration, we propose Modelic Generative Exploration (MoGE), which augments exploration through the generation of under-explored critical states and synthesis of dynamics-consistent experiences through transition models. MoGE is composed of two components: (1) a diffusion-based generator that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
