Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows

Minjae Cho, Jonathan P. How, and Chuangchuang Sun

arXiv:2405.03892 · cs.LG · May 8, 2024 · 1 cites

Open Access

TL;DR

This paper introduces MOOD-CRL, a causal inference-based offline reinforcement learning method using causal normalizing flows to improve out-of-distribution adaptation and policy performance beyond the dataset support.

Contribution

It develops Causal Normalizing Flows for data augmentation and counterfactual reasoning, enabling offline RL to better handle distributional shifts and out-of-distribution scenarios.
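
Counterfactual reasoning with an invertible flow is commonly described as a three-step recipe: abduction (invert the flow to recover the noise behind an observation), action (intervene on a variable), and prediction (push the same noise forward again). The sketch below illustrates that recipe on a toy two-variable model. It is not the MOOD-CRL implementation: the structural equation, the parameters `W`, `B`, `SCALE`, and all function names are hypothetical, and the parameters are set by hand where a real causal normalizing flow would learn them from data.

```python
# Toy causal model (hypothetical): x1 causes x2.
#   x1 = u1
#   x2 = W * x1 + B + SCALE * u2
# The map u -> x is triangular and invertible, like a single affine flow layer.
W, B, SCALE = 2.0, 0.5, 0.1  # hand-set here; a real flow would learn these


def forward(u):
    """Map exogenous noise (u1, u2) to observed variables (x1, x2)."""
    u1, u2 = u
    x1 = u1
    x2 = W * x1 + B + SCALE * u2
    return (x1, x2)


def inverse(x):
    """Abduction: recover the noise (u1, u2) that produced observation x."""
    x1, x2 = x
    u1 = x1
    u2 = (x2 - W * x1 - B) / SCALE
    return (u1, u2)


def counterfactual(x_obs, x1_new):
    """Estimate what x2 would have been had x1 been x1_new."""
    _, u2 = inverse(x_obs)            # 1. abduction
    x1 = x1_new                       # 2. action: do(x1 = x1_new)
    x2 = W * x1 + B + SCALE * u2      # 3. prediction with the same noise u2
    return (x1, x2)


x_obs = forward((1.0, -0.3))
x_cf = counterfactual(x_obs, x1_new=3.0)
print("observed:", x_obs)
print("counterfactual:", x_cf)
```

Because the noise term `u2` is held fixed, the counterfactual sample differs from the observation only through the intervened variable, which is what makes such samples candidates for data augmentation beyond the dataset's support.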

Findings

01. Outperforms existing offline RL methods significantly.

02. Demonstrates effective counterfactual reasoning capabilities.

03. Enhances OOD adaptation in offline policy training.

Abstract

Despite the notable successes of Reinforcement Learning (RL), the prevalent use of an online learning paradigm prevents its widespread adoption, especially in hazardous or costly scenarios. Offline RL has emerged as an alternative solution, learning from pre-collected static datasets. However, offline learning introduces a new challenge known as distributional shift, which degrades performance when the policy is evaluated on scenarios that are Out-Of-Distribution (OOD) with respect to the training dataset. Most existing offline RL methods resolve this issue by regularizing policy learning within the information supported by the given dataset. However, such regularization overlooks the potential for high-reward regions that may exist beyond the dataset. This motivates exploring novel offline learning techniques that can make improvements beyond the data support without compromising policy performance,…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques · Reinforcement Learning in Robotics

Methods: Causal inference