Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning

Changhong Wang; Xudong Yu; Chenjia Bai; Qiaosheng Zhang; Zhen Wang

arXiv:2405.07223·cs.LG·May 14, 2024

TL;DR

This paper proposes an ensemble successor representation approach for efficient task generalization in offline-to-online reinforcement learning, enabling rapid adaptation to new tasks using offline data.

Contribution

It introduces a novel ensemble successor representation method that improves task generalization and online adaptation in offline-to-online RL settings.

Findings

01 Outperforms existing methods in generalizing to unseen tasks

02 Enables faster online adaptation with offline data

03 Robust to datasets with different coverage

Abstract

In Reinforcement Learning (RL), training a policy from scratch with online experience can be inefficient because exploration is difficult. Offline RL has recently emerged as a promising solution: it provides an initial policy learned from offline data, which can then be refined through online interaction. However, existing approaches primarily perform offline and online learning on the same task and do not consider task generalization in offline-to-online adaptation. In real-world applications, it is common to have an offline dataset from only one specific task while aiming for fast online adaptation to several tasks. To address this problem, our work builds upon the investigation of successor representations for task generalization in online RL and extends the framework to incorporate offline-to-online learning. We demonstrate that the conventional paradigm using successor features…
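For readers unfamiliar with the successor-feature idea the abstract refers to, the following is a minimal sketch of the general technique, not the paper's algorithm. It assumes rewards are linear in some features, so that Q(s, a) = psi(s, a) · w, where psi is the successor feature and w is a task-specific weight vector. The ensemble averaging, the array shapes, and every name here (`q_values`, `ensemble_q_values`, `psi_ensemble`, `w_new_task`) are illustrative assumptions, and the random arrays stand in for learned quantities.

```python
import numpy as np


def q_values(psi, w):
    """Q(s, a) = psi(s, a) . w for each action; psi has shape (n_actions, d)."""
    return psi @ w


def ensemble_q_values(psi_ensemble, w):
    """Mean and std of Q over ensemble members.

    psi_ensemble has shape (n_members, n_actions, d); w has shape (d,).
    """
    q = psi_ensemble @ w  # shape (n_members, n_actions)
    return q.mean(axis=0), q.std(axis=0)


# Hypothetical stand-ins: 5 ensemble members, 3 actions, 4 feature dimensions.
rng = np.random.default_rng(0)
psi_ensemble = rng.normal(size=(5, 3, 4))

# Switching tasks only changes w; the successor features are reused as-is.
w_new_task = np.array([1.0, 0.0, -0.5, 2.0])

q_mean, q_std = ensemble_q_values(psi_ensemble, w_new_task)
greedy_action = int(np.argmax(q_mean))
```

The point of the decomposition is that a new task, under the linear-reward assumption, requires only a new `w_new_task`, while the spread `q_std` gives one possible measure of disagreement among ensemble members.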

Taxonomy

Topics: Reinforcement Learning in Robotics