Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

Yiqin Yang; Quanwei Wang; Chenghao Li; Hao Hu; Chengjie Wu; Yuhua Jiang; Dianyu Zhong; Ziyou Zhang; Qianchuan Zhao; Chongjie Zhang; Xu Bo

arXiv:2502.18955 · cs.LG · February 27, 2025

TL;DR

This paper introduces ReDOR, a method for selecting minimal yet effective datasets in offline reinforcement learning, improving performance and efficiency by framing subset selection as a submodular optimization problem.

Contribution

ReDOR reformulates dataset selection as a gradient-approximation optimization problem and tackles it with submodular optimization and orthogonal matching pursuit (OMP), giving an efficient way to select data subsets for offline RL.

Findings

01. ReDOR improves RL performance with smaller datasets.

02. ReDOR reduces computational complexity in dataset selection.

03. ReDOR identifies minimal data subsets necessary for solving tasks.

Abstract

Offline reinforcement learning (RL) represents a significant shift in RL research, allowing agents to learn from pre-collected datasets without further interaction with the environment. A key, yet underexplored, challenge in offline RL is selecting an optimal subset of the offline dataset that enhances both algorithm performance and training efficiency. Reducing dataset size can also reveal the minimal data requirements necessary for solving similar problems. In response to this challenge, we introduce ReDOR (Reduced Datasets for Offline RL), a method that frames dataset selection as a gradient approximation optimization problem. We demonstrate that the widely used actor-critic framework in RL can be reformulated as a submodular optimization objective, enabling efficient subset selection. To achieve this, we adapt orthogonal matching pursuit (OMP), incorporating several novel…
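
To make the gradient-approximation idea concrete, here is a minimal sketch of plain orthogonal matching pursuit used for subset selection. It is not the ReDOR algorithm: the abstract is cut off before the authors' modifications are described, so this shows only the generic textbook procedure. The assumed setup is that each candidate sample contributes one gradient vector (a row of `grads`) and that the goal is to pick `k` rows whose weighted sum approximates a target gradient, for example the full-dataset gradient. The names `omp_select`, `grads`, `target`, and `k` are hypothetical.

```python
import numpy as np


def omp_select(grads, target, k):
    """Generic OMP: greedily pick up to k rows of `grads` whose weighted
    sum approximates `target`. Returns (indices, weights, residual)."""
    n = grads.shape[0]
    chosen = np.zeros(n, dtype=bool)   # marks samples already selected
    selected = []                      # selected indices, in pick order
    weights = np.zeros(0)
    residual = target.astype(float).copy()
    for _ in range(min(k, n)):
        # Score each sample by how strongly its gradient aligns with
        # the part of the target that is still unexplained.
        scores = np.abs(grads @ residual)
        scores[chosen] = -np.inf       # never pick the same sample twice
        idx = int(np.argmax(scores))
        chosen[idx] = True
        selected.append(idx)
        # Re-fit the weights of all selected samples by least squares.
        basis = grads[selected].T      # shape (d, number selected)
        weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
        residual = target - basis @ weights
    return selected, weights, residual


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(size=(200, 16))   # stand-in per-sample gradients
    target = grads.sum(axis=0)           # stand-in full-dataset gradient
    idx, w, res = omp_select(grads, target, k=8)
    print("selected:", idx)
    print("residual norm:", np.linalg.norm(res))
```

Because the weights are re-fit over all selected samples at every step, the residual norm cannot increase as `k` grows, which is what makes the greedy loop a reasonable stand-in for subset selection in this sketch.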

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control