Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering

Zhongjian Qiao; Rui Yang; Jiafei Lyu; Chenjia Bai; Xiu Li; Siyang Gao; Shuang Qiu

arXiv:2512.02435 · cs.LG · March 23, 2026

Open Access

TL;DR

This paper introduces DVDF, a cross-domain offline RL method that filters source domain data by both dynamics alignment and value alignment. The combined filter improves policy learning, especially when target domain data is limited, and yields empirical gains over baselines.

Contribution

The paper shows that dynamics alignment alone is insufficient and that value alignment is also needed, and it proposes a unified data filtering framework for cross-domain offline RL.

Findings

01

DVDF outperforms baselines across various tasks.

02

Incorporating value alignment improves policy performance.

03

DVDF remains effective even with very limited target data.

Abstract

Cross-domain offline reinforcement learning (RL) aims to train a well-performing agent in the target environment, leveraging both a limited target domain dataset and a source domain dataset with (possibly) sufficient data coverage. Due to the underlying dynamics misalignment between source and target domains, naively merging the two datasets may incur inferior performance. Recent advances address this issue by selectively leveraging source domain samples whose dynamics align well with the target domain. However, our work demonstrates that dynamics alignment alone is insufficient, by examining the limitations of prior frameworks and deriving a new target domain sub-optimality bound for the policy learned on the source domain. More importantly, our theory underscores an additional need for value alignment, i.e., selecting high-quality, high-value samples from the source domain, a…
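The filtering idea described in the abstract, keeping source samples that are both dynamics-aligned and high-value, can be illustrated with a minimal sketch. This is not the paper's DVDF algorithm: the function name `filter_source_samples`, the rank-based scoring, and the `keep_frac` and `weight` parameters are all hypothetical choices made for illustration. It assumes each source transition already has a dynamics-gap score (lower means closer to target dynamics) and a value estimate (higher is better), however those are obtained.

```python
import numpy as np

def filter_source_samples(dynamics_gap, value_estimate, keep_frac=0.25, weight=0.5):
    """Return sorted indices of source samples to keep (illustrative only).

    dynamics_gap   : per-sample dynamics mismatch, lower = better aligned (assumed given)
    value_estimate : per-sample value estimate, higher = better (assumed given)
    keep_frac      : fraction of source samples to retain (hypothetical knob)
    weight         : trade-off in [0, 1]; 1.0 = dynamics only, 0.0 = value only
    """
    dynamics_gap = np.asarray(dynamics_gap, dtype=float)
    value_estimate = np.asarray(value_estimate, dtype=float)
    if dynamics_gap.shape != value_estimate.shape:
        raise ValueError("score arrays must have the same shape")
    n = dynamics_gap.size

    def rank01(x):
        # Map each entry to its rank, scaled to [0, 1], so the two
        # criteria are comparable regardless of their original units.
        ranks = np.argsort(np.argsort(x))
        return ranks / max(n - 1, 1)

    # Small dynamics gap and large value both raise the combined score.
    score = weight * (1.0 - rank01(dynamics_gap)) + (1.0 - weight) * rank01(value_estimate)

    k = max(1, int(np.ceil(keep_frac * n)))
    keep = np.argsort(-score)[:k]
    return np.sort(keep)

if __name__ == "__main__":
    gap = [0.1, 0.9, 0.2, 0.8]      # toy dynamics-gap scores
    val = [5.0, 4.0, 1.0, 0.5]      # toy value estimates
    print(filter_source_samples(gap, val, keep_frac=0.5))
```

Setting `weight=1.0` reduces the sketch to dynamics-only filtering, which is the setting the abstract argues is insufficient on its own; lowering `weight` lets high-value samples compete for the retained slots.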

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Neurogenetic and Muscular Disorders Research