BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning

Haohong Lin; Wenhao Ding; Jian Chen; Laixi Shi; Jiacheng Zhu; Bo Li; Ding Zhao

arXiv:2407.10967 · cs.LG · March 4, 2025

TL;DR

BECAUSE introduces a causal representation approach in offline model-based reinforcement learning to address distribution shift and objective mismatch, leading to improved robustness and sample efficiency across diverse tasks.

Contribution

The paper proposes BECAUSE, a novel causal representation method that reduces objective mismatch in offline MBRL, with theoretical guarantees and superior empirical performance.

Findings

01 BECAUSE outperforms existing offline RL algorithms on 18 diverse tasks.

02 It remains robust when fewer samples are available and when more confounders are present.

03 Theoretical analysis establishes its error bound and sample efficiency.

Abstract

Offline model-based reinforcement learning (MBRL) enhances data efficiency by utilizing pre-collected datasets to learn models and policies, especially in scenarios where exploration is costly or infeasible. Nevertheless, its performance often suffers from the objective mismatch between model and policy learning, resulting in inferior performance despite accurate model predictions. This paper first identifies that the primary source of this mismatch is the underlying confounders present in offline data for MBRL. Subsequently, we introduce BilinEar CAUSal rEpresentation (BECAUSE), an algorithm that captures causal representations of both states and actions to reduce the influence of distribution shift, thus mitigating the objective mismatch problem. Comprehensive evaluations on 18 tasks that vary in data quality and environment context demonstrate…

Videos

BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics