Bridging Dynamics Gaps via Diffusion Schrödinger Bridge for Cross-Domain Reinforcement Learning
Hanping Zhang, Yuhong Guo

TL;DR
This paper introduces BDGxRL, a novel framework that uses Diffusion Schrödinger Bridge to align source and target dynamics for cross-domain reinforcement learning, enabling effective policy transfer without target environment interaction.
Contribution
We propose BDGxRL, which leverages Diffusion Schrödinger Bridge (DSB) and reward modulation to enable target-oriented policy learning solely within the source domain under dynamics shifts.
Findings
BDGxRL outperforms state-of-the-art methods on MuJoCo benchmarks.
It demonstrates strong adaptability under transition dynamics shifts.
The framework effectively aligns source and target dynamics without target environment access.
Abstract
Cross-domain reinforcement learning (RL) aims to learn transferable policies under dynamics shifts between source and target domains. A key challenge lies in the lack of target-domain environment interaction and reward supervision, which prevents direct policy learning. To address this challenge, we propose Bridging Dynamics Gaps for Cross-Domain Reinforcement Learning (BDGxRL), a novel framework that leverages Diffusion Schrödinger Bridge (DSB) to align source transitions with target-domain dynamics encoded in offline demonstrations. Moreover, we introduce a reward modulation mechanism that estimates rewards from state transitions and applies this estimate to DSB-aligned samples, ensuring consistency between rewards and target-domain dynamics. BDGxRL performs target-oriented policy learning entirely within the source domain, without access to the target environment or its rewards. Experiments on…
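The reward modulation idea in the abstract (estimate rewards from state transitions, then re-evaluate them on aligned transitions) can be illustrated with a toy sketch. This is not the paper's implementation: the linear least-squares reward model, the synthetic data, and `align_next_state` (a hypothetical stand-in for the learned DSB map that simply halves each state change) are all assumptions made for illustration.

```python
import numpy as np


def transition_features(states, next_states):
    """Stack (s, s', 1) so the reward model depends only on the transition."""
    ones = np.ones((len(states), 1))
    return np.hstack([states, next_states, ones])


def fit_transition_reward(states, next_states, rewards):
    """Assumed reward model: least-squares fit of r(s, s') on source data."""
    feats = transition_features(states, next_states)
    weights, *_ = np.linalg.lstsq(feats, rewards, rcond=None)
    return weights


def predict_reward(weights, states, next_states):
    """Evaluate the fitted transition-based reward on given transitions."""
    return transition_features(states, next_states) @ weights


def align_next_state(states, next_states):
    """Hypothetical stand-in for the DSB alignment step (NOT a real DSB).

    Pretends the target dynamics move half as far as the source dynamics.
    """
    return states + 0.5 * (next_states - states)


# Synthetic source-domain transitions (assumed data, for illustration only).
rng = np.random.default_rng(0)
states = rng.normal(size=(256, 3))
next_states = states + rng.normal(size=(256, 3))
rewards = next_states[:, 0] - states[:, 0]  # reward = progress along dim 0

# 1) Learn a reward estimator from source transitions.
weights = fit_transition_reward(states, next_states, rewards)

# 2) Map source next-states toward (assumed) target dynamics.
aligned_next = align_next_state(states, next_states)

# 3) Re-estimate rewards on the aligned transitions.
modulated_rewards = predict_reward(weights, states, aligned_next)
```

In this toy setup the aligned transitions cover half the distance, so the re-estimated rewards come out at half the source rewards; the point is only that the reward follows the aligned transition rather than the original source one.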
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Neural Networks and Reservoir Computing
