SE(3)-Equivariant Diffusion Policy in Spherical Fourier Space
Xupeng Zhu, Fan Wang, Robin Walters, Jane Shi

TL;DR
This paper introduces Spherical Diffusion Policy (SDP), an SE(3)-equivariant diffusion policy formulated in spherical Fourier space. It improves how manipulation policies generalize across 3D transformations of the scene and outperforms baselines on 20 simulation tasks.
Contribution
The paper presents an SE(3)-equivariant diffusion policy that operates in spherical Fourier space and uses novel spherical FiLM layers for conditioning, enabling robust generalization in 3D manipulation tasks.
Findings
Significant performance improvements over baselines in 20 simulation tasks.
Effective generalization across transformed 3D scenes.
Successful application to physical robot tasks with different embodiments.
Abstract
Diffusion Policies are effective at learning closed-loop manipulation policies from human demonstrations but generalize poorly to novel arrangements of objects in 3D space, hurting real-world performance. To address this issue, we propose Spherical Diffusion Policy (SDP), an SE(3) equivariant diffusion policy that adapts trajectories according to 3D transformations of the scene. Such equivariance is achieved by embedding the states, actions, and the denoising process in spherical Fourier space. Additionally, we employ novel spherical FiLM layers to condition the action denoising process equivariantly on the scene embeddings. Lastly, we propose a spherical denoising temporal U-net that achieves spatiotemporal equivariance with computational efficiency. In the end, SDP is end-to-end SE(3) equivariant, allowing robust generalization across transformed 3D scenes. SDP demonstrates a large…
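To make the equivariance property in the abstract concrete, here is a minimal sketch, not the paper's implementation. It covers only the rotation part of SE(3) and only 3D vector (degree-1) features, and the function `film_vector` and its weights `w_scale` and `w_shift` are hypothetical names introduced for illustration. The idea: a FiLM-style layer stays rotation-equivariant if its scale is computed from rotation-invariant quantities (here, a norm) and its shift is itself a vector that rotates with the scene.

```python
import math

def rotation_z(theta):
    """Rotation matrix about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(theta):
    """Rotation matrix about the x axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(R, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def film_vector(action_vec, scene_vec, w_scale=0.5, w_shift=0.3):
    """Toy FiLM-style modulation of a vector feature (hypothetical, for illustration).

    The scale gamma depends only on the norm of the scene feature, which is
    unchanged by rotation. The shift is a multiple of the scene vector, which
    rotates together with the scene. Hence f(R a, R s) = R f(a, s).
    """
    gamma = 1.0 + w_scale * norm(scene_vec)  # rotation-invariant scalar
    return [gamma * a + w_shift * s for a, s in zip(action_vec, scene_vec)]

# Numerical check: rotating the inputs equals rotating the output.
R = matmul(rotation_z(0.7), rotation_x(-1.2))
a = [0.2, -0.5, 1.0]   # stand-in for an action feature
s = [1.0, 2.0, -0.3]   # stand-in for a scene feature

lhs = film_vector(matvec(R, a), matvec(R, s))  # transform, then apply layer
rhs = matvec(R, film_vector(a, s))             # apply layer, then transform
max_err = max(abs(x - y) for x, y in zip(lhs, rhs))
print("max equivariance error:", max_err)
```

The printed error should sit at floating-point round-off. A layer whose scale depended on an individual coordinate of `scene_vec` rather than its norm would fail this check, which is the kind of failure an equivariant design rules out by construction.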
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis
