Causal prompting model-based offline reinforcement learning

Xuehui Yu; Yi Guan; Rujia Shen; Xin Li; Chen Tang; Jingchi Jiang

arXiv:2406.01065 · cs.LG · June 4, 2024

TL;DR

This paper introduces CPRL, a model-based offline reinforcement learning framework that handles noisy, diverse datasets and generalizes across tasks by using causal prompts and skill reuse. It is demonstrated on real-world medical data.

Contribution

The paper proposes the CPRL framework, which combines Hip-BCPD for modeling environmental dynamics with a skill-reuse strategy, to improve the robustness and generalization of offline RL for online systems.

Findings

01. Outperforms existing algorithms in noisy, out-of-distribution environments

02. Effectively models environmental dynamics with Hip-BCPD

03. Enables multi-task learning through skill reuse

Abstract

Model-based offline Reinforcement Learning (RL) allows agents to fully utilise pre-collected datasets without requiring additional or unethical explorations. However, applying model-based offline RL to online systems presents challenges, primarily due to the highly suboptimal (noise-filled) and diverse nature of datasets generated by online systems. To tackle these issues, we introduce the Causal Prompting Reinforcement Learning (CPRL) framework, designed for highly suboptimal and resource-constrained online scenarios. The initial phase of CPRL involves the introduction of the Hidden-Parameter Block Causal Prompting Dynamic (Hip-BCPD) to model environmental dynamics. This approach utilises invariant causal prompts and aligns hidden parameters to generalise to new and diverse online users. In the subsequent phase, a single policy is trained to address multiple tasks through the…
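The hidden-parameter idea mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's Hip-BCPD model and contains no causal prompting. It only assumes a scalar transition function with a shared, user-invariant part plus one additive per-user hidden parameter `theta`, and shows how `theta` for a new user could be estimated from a few logged transitions. All names and coefficients (`shared_dynamics`, `estimate_hidden_parameter`, `0.9`, `0.5`) are hypothetical.

```python
import random


def shared_dynamics(state, action):
    """Hypothetical user-invariant part of the transition (assumed known here)."""
    return 0.9 * state + 0.5 * action


def step(state, action, theta, noise=0.0):
    """Toy transition: shared dynamics plus an additive per-user hidden parameter."""
    return shared_dynamics(state, action) + theta + noise


def estimate_hidden_parameter(transitions):
    """Least-squares estimate of theta from (state, action, next_state) tuples.

    With an additive theta, the least-squares solution is the mean residual
    between observed next states and the shared-dynamics prediction.
    """
    residuals = [ns - shared_dynamics(s, a) for s, a, ns in transitions]
    return sum(residuals) / len(residuals)


def collect_transitions(theta, n, rng, noise_std=0.05):
    """Simulate a noisy offline log for one user with hidden parameter theta."""
    transitions = []
    state = 0.0
    for _ in range(n):
        action = rng.uniform(-1.0, 1.0)
        next_state = step(state, action, theta, rng.gauss(0.0, noise_std))
        transitions.append((state, action, next_state))
        state = next_state
    return transitions


rng = random.Random(0)          # fixed seed so the run is reproducible
true_theta = 0.3                # hidden parameter of a hypothetical new user
data = collect_transitions(true_theta, 200, rng)
theta_hat = estimate_hidden_parameter(data)
print(f"estimated theta: {theta_hat:.3f}")
```

In this sketch the shared dynamics are fixed and only `theta` is adapted per user. A learned model would fit both parts from data.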
