Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation
Qinglun Zhang, Shen Cheng, Tian Dan, Haoqiang Fan, Guanghui Liu, Shuaicheng Liu

TL;DR
E3Flow is a spherical-harmonic-based framework for robot manipulation that combines rectified flow with multi-modal equivariant learning, reporting a 3.12% higher success rate than Spherical Diffusion Policy and 7x faster inference.
Contribution
The paper introduces E3Flow, an SO(3)-equivariant visuomotor flow policy framework that unifies efficient rectified flow with multi-modal visual fusion (point clouds and images) using spherical harmonic representations.
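To make the equivariance property concrete, the following is a minimal sketch, not the paper's architecture. It checks numerically that a toy point-cloud feature satisfies f(R·P) = R·f(P) for a rotation R. The feature is a weighted sum of points with rotation-invariant weights, so it transforms like an ordinary 3D vector (the degree l = 1 case of a spherical harmonic feature). All function names and the sample points are hypothetical and chosen only for illustration.

```python
import math

def rot_z(a):
    # Rotation by angle a (radians) about the z-axis.
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    # Rotation by angle a (radians) about the x-axis.
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    # Product of two 3x3 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate(m, v):
    # Apply a 3x3 matrix to a 3-vector.
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def toy_vector_feature(points):
    # Hypothetical feature: sum of points weighted by exp(-norm).
    # The weights depend only on norms, which rotations preserve,
    # so the output rotates together with the input.
    out = [0.0, 0.0, 0.0]
    for p in points:
        w = math.exp(-math.sqrt(sum(c * c for c in p)))
        out = [o + w * c for o, c in zip(out, p)]
    return out

def equivariance_error(points, rotation):
    # Largest componentwise gap between f(R P) and R f(P).
    lhs = toy_vector_feature([rotate(rotation, p) for p in points])
    rhs = rotate(rotation, toy_vector_feature(points))
    return max(abs(a - b) for a, b in zip(lhs, rhs))

points = [[0.3, -1.2, 0.5], [1.0, 0.4, -0.7], [-0.6, 0.2, 0.9]]
rotation = matmul(rot_z(0.7), rot_x(-1.1))
error = equivariance_error(points, rotation)
print(error)  # expected to be at floating-point rounding level
```

A policy built from maps with this property does not need to see rotated copies of each demonstration, which is the usual argument for the data efficiency of equivariant methods.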
Findings
Achieves a 3.12% higher success rate than Spherical Diffusion Policy
Delivers 7x faster inference
Validated on 8 MimicGen simulation tasks and 4 real-world tasks
Abstract
While existing equivariant methods enhance data efficiency, they suffer from high computational intensity, reliance on single-modality inputs, and instability when combined with fast-sampling methods. In this work, we propose E3Flow, a novel framework that addresses these critical limitations of equivariant diffusion policies, successfully unifying efficient rectified flow with stable, multi-modal equivariant learning for the first time. Our framework is built upon spherical harmonic representations to ensure rigorous SO(3) equivariance. We introduce a novel invariant Feature Enhancement Module (FEM) that dynamically fuses hybrid visual modalities (point clouds and images), injecting rich visual cues into the spherical harmonic features. We evaluate E3Flow on 8 manipulation tasks from MimicGen and further conduct 4 real-world experiments to validate…
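As a rough illustration of why rectified flow permits fast sampling, the sketch below integrates the ODE dx/dt = v(x, t) from t = 0 to t = 1 with a few Euler steps. It is a sketch under stated assumptions, not E3Flow's sampler: the velocity field here is a hand-written stand-in that points straight at a fixed target, whereas a real policy would use a learned, observation-conditioned network. The names `toy_velocity` and `euler_sample` are hypothetical.

```python
def toy_velocity(x, t, target):
    # Assumed stand-in for a learned velocity network: the straight-line
    # field that moves x to `target` by time t = 1.
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]

def euler_sample(x0, target, num_steps):
    # Euler integration of dx/dt = v(x, t) on [0, 1] with a uniform step.
    # The velocity is only evaluated at t = k / num_steps < 1, so the
    # division in toy_velocity never hits zero.
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

noise = [0.8, -1.5, 0.2]    # starting sample (noise), fixed for the demo
action = [0.1, 0.4, -0.3]   # hypothetical target action
one_step = euler_sample(noise, action, num_steps=1)
four_steps = euler_sample(noise, action, num_steps=4)
print(one_step, four_steps)
```

Because the trajectories of this toy field are straight lines, one Euler step already lands on the target up to rounding. Rectified flow trains the velocity field toward straight paths for the same reason: fewer integration steps are then needed at inference time.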
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Robot Manipulation and Learning · Reinforcement Learning in Robotics
