H$^3$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
Yiyang Lu, Yufeng Tian, Zhecheng Yuan, Xianbang Wang, Pu Hua, Zhengrong Xue, Huazhe Xu

TL;DR
H$^3$DP introduces a triply-hierarchical diffusion framework that enhances visuomotor learning by explicitly integrating multi-level visual features with action generation, leading to significant improvements in robotic manipulation tasks.
Contribution
The paper proposes a novel triply-hierarchical diffusion policy that explicitly models hierarchical visual features and their coupling with action generation in visuomotor learning.
Findings
Achieves a +27.5% average relative improvement over baselines across 44 simulation tasks.
Outperforms baselines in 4 challenging real-world bimanual manipulation tasks.
Effectively integrates multi-scale visual features with diffusion-based action generation.
Abstract
Visuomotor policy learning has witnessed substantial progress in robotic manipulation, with recent approaches predominantly relying on generative models to model the action distribution. However, these methods often overlook the critical coupling between visual perception and action prediction. In this work, we introduce Triply-Hierarchical Diffusion Policy (H$^3$DP), a novel visuomotor learning framework that explicitly incorporates hierarchical structures to strengthen the integration between visual features and action generation. H$^3$DP contains 3 levels of hierarchy: (1) depth-aware input layering that organizes RGB-D observations based on depth information; (2) multi-scale visual representations that encode semantic features at varying levels of granularity; and (3) a hierarchically conditioned diffusion process that aligns the…
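As a rough illustration of the first level of the hierarchy, depth-aware input layering, the sketch below splits an RGB-D observation into depth-ordered layers. It is one plausible reading of the abstract, not the authors' implementation: the function name `layer_by_depth`, the equal-width depth bins, and the zero-filled background are all assumptions made for this example.

```python
def layer_by_depth(rgb, depth, num_layers):
    """Split an RGB-D image into depth-ordered layers (near to far).

    rgb:   H x W nested list of (r, g, b) tuples
    depth: H x W nested list of depth values
    Returns num_layers images; pixels outside a layer's depth bin are zeroed.
    Equal-width binning is an assumption, not taken from the paper.
    """
    flat = [d for row in depth for d in row]
    d_min, d_max = min(flat), max(flat)
    # Guard against a constant-depth image (zero-width range).
    width = (d_max - d_min) / num_layers or 1.0
    blank = (0, 0, 0)
    layers = [[[blank for _ in row] for row in rgb] for _ in range(num_layers)]
    for i, row in enumerate(depth):
        for j, d in enumerate(row):
            # Clamp so the farthest pixel falls in the last bin.
            k = min(int((d - d_min) / width), num_layers - 1)
            layers[k][i][j] = rgb[i][j]
    return layers


# Toy 2x2 observation: the top row is near, the bottom row is far.
rgb = [[(10, 10, 10), (20, 20, 20)], [(30, 30, 30), (40, 40, 40)]]
depth = [[0.0, 1.0], [2.0, 4.0]]
near, far = layer_by_depth(rgb, depth, 2)
```

In this toy case `near` keeps only the top-row pixels and `far` only the bottom-row pixels. Each layer could then be passed to a visual encoder separately, which is the kind of structure the abstract describes.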
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Applications
Methods: Diffusion
