Offline Reinforcement Learning with Imbalanced Datasets

Li Jiang; Sijie Cheng; Jielin Qiu; Haoran Xu; Wai Kin Chan; Zhao Ding

arXiv:2307.02752 · cs.LG · May 22, 2024 · 2 cites

TL;DR

This paper investigates the challenges of imbalanced datasets in offline reinforcement learning, revealing limitations of existing methods and proposing a retrieval-augmented approach that improves policy learning in skewed data distributions.

Contribution

The paper introduces a novel retrieval-augmented offline RL method to address dataset imbalance, enhancing policy extraction where traditional methods struggle.

Findings

01. The retrieval-augmented method outperforms baselines on imbalanced datasets

02. State coverage in imbalanced offline RL datasets follows a power-law distribution

03. Distributional-constraint methods such as CQL are less effective on imbalanced data
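To make the power-law finding concrete, the following is a minimal sketch of what skewed state coverage can look like. It is not the paper's dataset construction: the Zipf-style weighting, the exponent `alpha`, and the helper names `power_law_weights`, `sample_state_visits`, and `head_share` are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def power_law_weights(num_states, alpha=1.5):
    """Normalised Zipf-style weights: the state of rank k gets mass ~ k**(-alpha).

    Both the functional form and alpha are illustrative assumptions.
    """
    raw = [(rank + 1) ** -alpha for rank in range(num_states)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_state_visits(num_states, num_samples, alpha=1.5, seed=0):
    """Draw state indices from the power-law weights and count the visits."""
    rng = random.Random(seed)
    probs = power_law_weights(num_states, alpha)
    visits = rng.choices(range(num_states), weights=probs, k=num_samples)
    return Counter(visits)


def head_share(counts, head_size):
    """Fraction of all samples that fall in the `head_size` most visited states."""
    total = sum(counts.values())
    head = sum(n for _, n in counts.most_common(head_size))
    return head / total


if __name__ == "__main__":
    counts = sample_state_visits(num_states=100, num_samples=10_000)
    print("share of samples in the 10 most visited states:", head_share(counts, 10))
    print("states never visited:", 100 - len(counts))
```

Under these assumptions a small head of states absorbs most of the samples while the tail is seen rarely or not at all, which is the kind of imbalance the findings describe.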

Abstract

The prevalent use of benchmarks in current offline reinforcement learning (RL) research has led to a neglect of the imbalance of real-world dataset distributions in the development of models. The real-world offline RL dataset is often imbalanced over the state space due to the challenge of exploration or safety considerations. In this paper, we specify properties of imbalanced datasets in offline RL, where the state coverage follows a power law distribution characterized by skewed policies. Theoretically and empirically, we show that typical offline RL methods based on distributional constraints, such as conservative Q-learning (CQL), are ineffective in extracting policies under the imbalanced dataset. Inspired by natural intelligence, we propose a novel offline RL method that utilizes the augmentation of CQL with a retrieval process to recall past related experiences, effectively…
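For readers unfamiliar with CQL, here is a rough sketch of its conservative regulariser in the commonly used discrete-action form: the log-sum-exp of Q over actions minus the Q-value of the action stored in the dataset. This is the baseline regulariser only, not the retrieval-augmented method the abstract proposes. The tabular `q_table`, the toy batch, and the helper names are assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_gap(q_values, data_action):
    """Per-sample conservative gap: logsumexp_a Q(s, a) - Q(s, a_data)."""
    return logsumexp(q_values) - q_values[data_action]


def cql_penalty(q_table, batch):
    """Plain average of the gap over a batch of (state, action) pairs.

    q_table maps a state id to a list of Q-values, one per discrete action
    (a hypothetical tabular stand-in for a Q-network).
    """
    gaps = [cql_gap(q_table[s], a) for s, a in batch]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    q_table = {0: [1.0, 0.5], 1: [0.0, 2.0]}
    # Toy imbalanced batch: state 0 appears nine times, state 1 once.
    batch = [(0, 0)] * 9 + [(1, 0)]
    print("mean CQL penalty:", cql_penalty(q_table, batch))
```

Because the penalty is a plain average over the batch, each state's contribution scales with how often it appears, so rarely visited states carry little weight in a skewed dataset. The sketch only illustrates that averaging; it makes no claim about the paper's theoretical analysis.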

Taxonomy

Topics: Imbalanced Data Classification Techniques · Stock Market Forecasting Methods

Methods: Q-Learning