Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation

Qinglun Zhang; Shen Cheng; Tian Dan; Haoqiang Fan; Guanghui Liu; Shuaicheng Liu

arXiv:2603.23227 · cs.RO · March 25, 2026

TL;DR

E3Flow is a spherical harmonic-based policy framework for robot manipulation that combines rectified flow with multi-modal equivariant learning, reporting a 3.12% higher success rate than Spherical Diffusion Policy and 7x faster inference.

Contribution

The paper introduces E3Flow, an SO(3)-equivariant visuomotor flow policy framework that unifies efficient rectified flow with multi-modal visual fusion (point clouds and images) on top of spherical harmonic representations.

Findings

01

Achieves a 3.12% higher success rate than Spherical Diffusion Policy

02

Delivers 7x faster inference

03

Validated on 8 simulated manipulation tasks from MimicGen and 4 real-world tasks

Abstract

While existing equivariant methods enhance data efficiency, they suffer from high computational intensity, reliance on single-modality inputs, and instability when combined with fast-sampling methods. In this work, we propose E3Flow, a novel framework that addresses these critical limitations of equivariant diffusion policies, unifying efficient rectified flow with stable, multi-modal equivariant learning for the first time. Our framework is built upon spherical harmonic representations to ensure rigorous SO(3) equivariance. We introduce a novel invariant Feature Enhancement Module (FEM) that dynamically fuses hybrid visual modalities (point clouds and images), injecting rich visual cues into the spherical harmonic features. We evaluate E3Flow on 8 manipulation tasks from MimicGen and further conduct 4 real-world experiments to validate…
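The two ideas the abstract combines, few-step rectified-flow sampling and SO(3) equivariance, can be illustrated with a toy sketch. This is not the authors' implementation: `toy_velocity` is a hypothetical stand-in for a learned velocity network, `sample_action` and `rotation_matrix` are illustrative helpers, and ordinary 3-vectors are used in place of higher-degree spherical harmonic features. The sketch integrates a straight-line velocity field with a handful of Euler steps and then checks that rotating the inputs rotates the sampled action by the same amount.

```python
import numpy as np

def rotation_matrix(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def toy_velocity(action, t, observation):
    """Hypothetical velocity field (assumption, not a trained network).

    It points straight from the current sample toward a target defined by
    the observation. It uses only vector differences and scalar scaling,
    so it commutes with any rotation applied to both inputs.
    """
    return (observation - action) / (1.0 - t)

def sample_action(noise, observation, num_steps=4):
    """Euler integration of da/dt = v(a, t | o) from t = 0 to t = 1."""
    action = np.array(noise, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        action = action + dt * toy_velocity(action, t, observation)
    return action

rng = np.random.default_rng(0)
noise = rng.standard_normal(3)             # initial sample at t = 0
observation = np.array([0.3, -0.1, 0.5])   # toy 3D conditioning vector
R = rotation_matrix([1.0, 2.0, 3.0], 0.7)

out = sample_action(noise, observation)
out_rotated = sample_action(R @ noise, R @ observation)

# Equivariance: sampling from rotated inputs equals rotating the sample.
equivariance_error = np.linalg.norm(out_rotated - R @ out)
print(f"equivariance error: {equivariance_error:.2e}")
```

Because the toy field follows straight paths, four Euler steps already reach the target, which is the intuition behind the small step counts that rectified flow allows. A learned policy would replace `toy_velocity` with an equivariant network, and the same check could serve as a unit test for it.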

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Robot Manipulation and Learning · Reinforcement Learning in Robotics