Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning

Ruhan Wang; Yu Yang; Zhishuai Liu; Dongruo Zhou; Pan Xu

arXiv:2410.23450 · cs.LG · March 3, 2026

TL;DR

This paper introduces the Return Augmented (REAG) method to improve Decision Transformer-based offline reinforcement learning in scenarios with dynamics shift, demonstrating consistent performance gains on benchmark datasets.

Contribution

The paper proposes the REAG approach, which aligns the return distributions of the source and target domains in return-conditioned supervised learning (RCSL), and supports it with theoretical guarantees and practical implementations.
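To make "aligning return distributions" concrete, here is a minimal sketch that shifts and rescales source-domain returns so their mean and standard deviation match those of the target-domain returns. This page does not say how REAG performs the alignment, so the affine moment matching and the name `align_returns` are assumptions chosen for illustration only, not the paper's method.

```python
from statistics import mean, pstdev
from typing import List


def align_returns(source: List[float], target: List[float]) -> List[float]:
    """Affinely map source returns onto the target's mean and std.

    Hypothetical helper: simple moment matching, assumed for illustration.
    """
    mu_s, sd_s = mean(source), pstdev(source)
    mu_t, sd_t = mean(target), pstdev(target)
    if sd_s == 0.0:
        # All source returns are identical; only the mean can be matched.
        return [mu_t for _ in source]
    scale = sd_t / sd_s
    return [mu_t + scale * (g - mu_s) for g in source]


# Toy returns from a data-rich source domain and a data-poor target domain.
source_returns = [0.0, 2.0, 4.0]
target_returns = [10.0, 20.0, 30.0]
aligned = align_returns(source_returns, target_returns)
print(aligned)
```

After the mapping, the augmented source returns share the target's first two moments, so a return-conditioned policy trained on both datasets sees return values on a common scale.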

Findings

01 REAG improves DT-based RL performance in off-dynamics settings.

02 Theoretical analysis shows that the policy's suboptimality guarantee is preserved under REAG.

03 Experiments on D4RL datasets validate the effectiveness of the REAG methods.

Abstract

We study offline off-dynamics reinforcement learning (RL), which uses data from an easily accessible source domain to enhance policy learning in a target domain with limited data. Our approach centers on return-conditioned supervised learning (RCSL), focusing in particular on Decision Transformer (DT)-type frameworks, which predict actions conditioned on a desired return and the complete trajectory history. Previous works address the dynamics shift problem by augmenting the reward in source-domain trajectories to match the optimal trajectory in the target domain. However, this strategy cannot be applied directly in RCSL owing to (1) the unique form of the RCSL policy class, which explicitly depends on the return, and (2) the absence of a straightforward representation of the optimal trajectory distribution. We propose the Return Augmented (REAG) method for DT-type…
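As a small illustration of the "desired return" that an RCSL policy conditions on, the sketch below computes the return-to-go at every step of a trajectory, which is the quantity DT-type models are typically fed alongside states and actions. The function name `returns_to_go` and the toy reward sequence are assumptions for illustration, not taken from the paper.

```python
from typing import List


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return-to-go at step t: the (discounted) sum of rewards from t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the last step backwards so each entry is computed once.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


# Toy trajectory rewards (hypothetical values).
print(returns_to_go([1.0, 0.0, 2.0]))  # [3.0, 2.0, 2.0]
```

Because the policy takes this return as an input, changing the rewards or returns of source-domain data changes what the policy is conditioned on, which is why the abstract notes that the policy class explicitly depends on the return.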

Taxonomy

Topics: Reinforcement Learning in Robotics · Traffic control and management

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Adam · Multi-Head Attention · Softmax