Skill-Aware Diffusion for Generalizable Robotic Manipulation
Aoshen Huang, Jiaming Chen, Jiyu Cheng, Ran Song, Wei Pan, Wei Zhang

TL;DR
This paper introduces Skill-Aware Diffusion (SADiff), a method that explicitly incorporates skill-level information into a diffusion model so that robotic manipulation generalizes better across diverse environments.
Contribution
The paper proposes SADiff, which learns skill-specific representations and uses them to condition a skill-constrained diffusion model that generates object-centric motion flow. It also introduces IsaacSkill, a new dataset for evaluation and transfer.
Findings
SADiff outperforms existing methods in simulation and real-world tasks.
Skill-aware encoding improves task generalization.
The IsaacSkill dataset enables comprehensive evaluation.
Abstract
Robust generalization in robotic manipulation is crucial for robots to adapt flexibly to diverse environments. Existing methods usually improve generalization by scaling data and networks, but model tasks independently and overlook skill-level information. Observing that tasks within the same skill share similar motion patterns, we propose Skill-Aware Diffusion (SADiff), which explicitly incorporates skill-level information to improve generalization. SADiff learns skill-specific representations through a skill-aware encoding module with learnable skill tokens, and conditions a skill-constrained diffusion model to generate object-centric motion flow. A skill-retrieval transformation strategy further exploits skill-specific trajectory priors to refine the mapping from 2D motion flow to executable 3D actions. Furthermore, we introduce IsaacSkill, a high-fidelity dataset containing…
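The conditioning idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a toy, pure-Python example that assumes one learnable token per skill, concatenates that token with observation features to form the condition, and applies a standard DDPM-style noising step and noise-prediction loss to a flattened 2D motion flow. All sizes, the linear beta schedule, the linear stand-in denoiser `W`, and the function names are hypothetical.

```python
import math
import random

random.seed(0)

# All sizes below are hypothetical and chosen only for illustration.
NUM_SKILLS, TOKEN_DIM, OBS_DIM, FLOW_DIM = 4, 8, 16, 10


def randn(n):
    """Return n samples from a standard normal distribution."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# One token per skill; in a real model these would be trained parameters.
skill_tokens = [randn(TOKEN_DIM) for _ in range(NUM_SKILLS)]


def build_condition(obs_feat, skill_id):
    """Concatenate observation features with the token of the given skill."""
    return list(obs_feat) + skill_tokens[skill_id]


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    alpha_bar, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def add_noise(flow, t, alpha_bar):
    """DDPM forward step: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps."""
    eps = randn(len(flow))
    a = alpha_bar[t]
    noisy = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(flow, eps)]
    return noisy, eps


# Linear stand-in for the denoising network (an assumption, not the paper's model).
IN_DIM = FLOW_DIM + OBS_DIM + TOKEN_DIM + 1
W = [[random.gauss(0.0, 0.1) for _ in range(IN_DIM)] for _ in range(FLOW_DIM)]


def predict_noise(noisy_flow, cond, t, num_steps):
    """Predict the added noise from the noisy flow, the condition, and the timestep."""
    inp = noisy_flow + cond + [t / num_steps]
    return [sum(w * x for w, x in zip(row, inp)) for row in W]


def noise_loss(pred, eps):
    """Mean squared error between predicted and true noise."""
    return sum((p - e) ** 2 for p, e in zip(pred, eps)) / len(eps)


num_steps = 100
alpha_bar = make_alpha_bar(num_steps)
obs_feat = randn(OBS_DIM)
flow = randn(FLOW_DIM)  # flattened (dx, dy) offsets for 5 hypothetical keypoints
cond = build_condition(obs_feat, skill_id=2)
noisy, eps = add_noise(flow, 50, alpha_bar)
loss = noise_loss(predict_noise(noisy, cond, 50, num_steps), eps)
print(f"condition size: {len(cond)}, loss: {loss:.3f}")
```

Because the skill token is part of the condition, two tasks that share a skill share that part of the input to the denoiser, which is one simple way to express the observation that tasks within the same skill have similar motion patterns. The paper's skill-aware encoding module and skill-retrieval transformation are not modeled here.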
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis
