Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning

Hongyu Zang; Xin Li; Leiji Zhang; Yang Liu; Baigui Sun; Riashat Islam; Remi Tachet des Combes; Romain Laroche

arXiv:2310.17139·cs.LG·October 27, 2023·1 cites

TL;DR

This paper investigates why bisimulation-based state representations underperform in offline RL, identifies key issues like missing transitions and reward scaling, and proposes solutions that improve performance on benchmark tasks.

Contribution

The paper analyzes why bisimulation-based representations fail in offline RL, applies the expectile operator to representation learning, uses reward scaling to bound the bisimulation measurement, and demonstrates improved results on benchmark datasets.

Findings

01 Bisimulation methods struggle with missing transitions in offline data.

02 Reward scaling is crucial to prevent feature collapse in representations.

03 Applying the expectile operator and reward scaling improves performance on benchmarks.

Abstract

While bisimulation-based approaches hold promise for learning robust state representations for Reinforcement Learning (RL) tasks, their efficacy in offline RL tasks has not been up to par. In some instances, they have even significantly underperformed alternative methods. We aim to understand why bisimulation methods succeed in online settings, but falter in offline tasks. Our analysis reveals that missing transitions in the dataset are particularly harmful to the bisimulation principle, leading to ineffective estimation. We also shed light on the critical role of reward scaling in bounding the scale of bisimulation measurements and of the value error they induce. Based on these findings, we propose to apply the expectile operator for representation learning to our offline RL setting, which helps to prevent overfitting to incomplete data. Meanwhile, by introducing an…
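
For readers unfamiliar with the expectile operator named in the abstract, the following is a minimal sketch of generic expectile regression in NumPy. It is not the authors' objective, which this page does not give; the function names `expectile_loss` and `expectile` and the default `tau=0.7` are assumptions chosen for illustration. The loss weights positive residuals by `tau` and negative residuals by `1 - tau`, so `tau = 0.5` recovers the ordinary mean and larger `tau` moves the estimate toward the upper end of the observed samples without leaving their range.

```python
import numpy as np


def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss: mean(|tau - 1[diff < 0]| * diff**2).

    `diff` is (target - prediction). Hypothetical helper for illustration.
    """
    diff = np.asarray(diff, dtype=float)
    # Negative residuals get weight (1 - tau); non-negative ones get tau.
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))


def expectile(samples, tau=0.7, iters=100):
    """tau-expectile of `samples`, i.e. the minimizer of the loss above.

    Uses the first-order condition sum_i w_i * (x_i - m) = 0 as a
    fixed-point iteration on m.
    """
    x = np.asarray(samples, dtype=float)
    m = float(x.mean())
    for _ in range(iters):
        w = np.where(x > m, tau, 1.0 - tau)
        m = float(np.sum(w * x) / np.sum(w))
    return m


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 10.0]
    print(expectile(data, tau=0.5))  # equals the sample mean
    print(expectile(data, tau=0.9))  # pulled toward the largest sample
```

Because the estimate is a weighted average of values that actually appear in the data, it stays inside the range of observed samples, which is one informal way to read the abstract's claim that the operator helps avoid overfitting to incomplete data.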

Videos

Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Engineering Research · Adversarial Robustness in Machine Learning