DmC: Nearest Neighbor Guidance Diffusion Model for Offline Cross-domain Reinforcement Learning
Linh Le Pham Van, Minh Hoang Nguyen, Duc Kieu, Hung Le, Hung The Tran, Sunil Gupta

TL;DR
This paper introduces DmC, a novel approach for cross-domain offline reinforcement learning that uses k-NN estimation and a diffusion model to generate target-aligned samples, improving performance with limited target data.
Contribution
The paper proposes DmC, a framework that combines k-NN-based domain proximity estimation with a diffusion model to generate target-aligned samples, addressing dataset imbalance and partial domain overlap.
Findings
DmC outperforms existing methods in MuJoCo environments.
Using k-NN mitigates overfitting in domain gap estimation.
Generated samples improve policy learning with limited target data.
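Illustration (not from the paper)
The k-NN idea in the findings above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' estimator: it assumes each transition is flattened into a fixed-length vector, uses Euclidean distance, and maps the mean distance to the k nearest target samples through exp(-d) to get a score in (0, 1]. The function name knn_proximity and the scoring rule are hypothetical choices made for this example.

```python
import numpy as np


def knn_proximity(source, target, k=3):
    """Score each source row by how close it is to its k nearest target rows.

    Hypothetical helper for illustration only; the paper's exact
    proximity measure may differ.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    if not 1 <= k <= len(target):
        raise ValueError("k must be between 1 and the number of target samples")

    # Pairwise Euclidean distances, shape (n_source, n_target).
    diff = source[:, None, :] - target[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep the k smallest distances in each row (order within them is irrelevant).
    k_smallest = np.partition(dists, k - 1, axis=1)[:, :k]

    # Assumed scoring rule: a smaller mean distance gives a score closer to 1.
    return np.exp(-k_smallest.mean(axis=1))


if __name__ == "__main__":
    target = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
    source = np.array([[0.0, 0.0], [5.0, 5.0]])
    print(knn_proximity(source, target, k=2))
```

Because the score is computed directly from distances to the small target dataset, no parameters are fitted, which is one plausible reading of why a k-NN estimator avoids the overfitting that neural domain-gap estimators show under dataset imbalance. In a pipeline like the one described, such scores could serve as a guidance signal or as sample weights; how DmC uses them exactly is not specified in this summary.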
Abstract
Cross-domain offline reinforcement learning (RL) seeks to enhance sample efficiency in offline RL by utilizing additional offline source datasets. A key challenge is to identify and utilize source samples that are most relevant to the target domain. Existing approaches address this challenge by measuring domain gaps through domain classifiers, target transition dynamics modeling, or mutual information estimation using contrastive loss. However, these methods often require large target datasets, which is impractical in many real-world scenarios. In this work, we address cross-domain offline RL under a limited target data setting, identifying two primary challenges: (1) Dataset imbalance, which is caused by large source and small target datasets and leads to overfitting in neural network-based domain gap estimators, resulting in uninformative measurements; and (2) Partial domain overlap,…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Face recognition and analysis
