TL;DR
DADP learns domain adaptive policies by separating static domain information from varying dynamical properties. It combines unsupervised disentanglement (Lagged Context Dynamical Prediction) with domain-aware diffusion injection to improve zero-shot generalization to unseen transition dynamics.
Contribution
The paper proposes DADP, a method that employs unsupervised disentanglement and diffusion injection to improve domain adaptation in control policies.
Findings
DADP outperforms prior methods on locomotion and manipulation benchmarks.
Lagged Context Dynamical Prediction effectively disentangles static domain features.
DADP demonstrates strong zero-shot generalization across unseen domains.
Abstract
Learning domain adaptive policies that can generalize to unseen transition dynamics remains a fundamental challenge in learning-based control. Substantial progress has been made through domain representation learning, which captures domain-specific information and thus enables domain-aware decision making. We analyze the process of learning domain representations through dynamical prediction and find that selecting contexts adjacent to the current step causes the learned representations to entangle static domain information with varying dynamical properties. Such a mixture can confuse the conditioned policy, thereby constraining zero-shot adaptation. To tackle this challenge, we propose DADP (Domain Adaptive Diffusion Policy), which achieves robust adaptation through unsupervised disentanglement and domain-aware diffusion injection. First, we introduce Lagged Context Dynamical Prediction, a…
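The contrast between adjacent and lagged contexts can be illustrated with a minimal index-selection sketch. Everything here is a hypothetical illustration rather than the authors' implementation: the function names, the `length` and `lag` parameters, and the choice to end the context window `lag` steps before the predicted step are all assumptions, since the abstract excerpt does not specify how the lag is defined.

```python
from typing import List


def adjacent_context(t: int, length: int) -> List[int]:
    """Indices of the `length` steps immediately before step t.

    This mirrors the "contexts adjacent to the current step" setting
    that the abstract identifies as a source of entanglement.
    """
    start = max(0, t - length)
    return list(range(start, t))


def lagged_context(t: int, length: int, lag: int) -> List[int]:
    """Indices of a `length`-step window ending `lag` steps before step t.

    Assumption: the lag is a fixed gap between the context window and
    the step being predicted. With lag=0 this reduces to adjacent_context.
    """
    end = max(0, t - lag)
    start = max(0, end - length)
    return list(range(start, end))


if __name__ == "__main__":
    print(adjacent_context(10, 3))       # steps right before t=10
    print(lagged_context(10, 3, lag=4))  # same-length window, shifted back by 4
```

Under this assumption, a context encoder fed the lagged window shares no recent state with the prediction target, so only information that stays constant across the trajectory (the static domain factors) remains useful for prediction.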
