EquAct: An SE(3)-Equivariant Multi-Task Transformer for Open-Loop Robotic Manipulation

Xupeng Zhu; Yu Qi; Yizhe Zhu; Robin Walters; Robert Platt

arXiv:2505.21351 · cs.RO · May 28, 2025

TL;DR

This paper introduces EquAct, an SE(3)-equivariant multi-task transformer that enforces geometric consistency in language-conditioned robotic manipulation and reports improved generalization and performance in simulation and on real-world tasks.

Contribution

EquAct is the first SE(3)-equivariant transformer for multi-task manipulation, combining a point cloud U-net with spherical Fourier features and invariant FiLM layers for language conditioning.
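
As a rough illustration of why FiLM-style conditioning can coexist with rotation equivariance, the sketch below applies a per-channel scale and shift to rotation-invariant scalar channels and a scale only to vector channels, then checks that the operation commutes with a rotation. This is a toy under stated assumptions, not the paper's implementation: the function `film`, the split into scalar and vector channels, and the values `gamma` and `beta` (standing in for quantities a language encoder would produce) are all hypothetical.

```python
import math

def rot_z(theta):
    """Rotation matrix about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    """Rotation matrix about the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate(R, v):
    """Apply rotation matrix R to a 3-vector v."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def film(scalars, vectors, gamma, beta):
    """Hypothetical FiLM-style layer (an assumption, not the paper's layer).

    scalars: C rotation-invariant features
    vectors: C 3-vectors that rotate with the scene
    gamma, beta: C per-channel scales and shifts, assumed to come from language
    """
    # Scale and shift are both safe on invariant channels.
    new_scalars = [g * s + b for g, s, b in zip(gamma, scalars, beta)]
    # Vector channels are only scaled: a fixed shift would pick a preferred direction.
    new_vectors = [tuple(g * x for x in v) for g, v in zip(gamma, vectors)]
    return new_scalars, new_vectors

def close(u, v, tol=1e-9):
    """Component-wise comparison of two 3-vectors."""
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(u, v))

R = matmul(rot_z(0.7), rot_x(-1.2))
scalars = [0.5, -1.0]
vectors = [(1.0, 2.0, 3.0), (-0.5, 0.0, 4.0)]
gamma, beta = [2.0, -0.3], [0.1, 0.4]

# Rotate the input, then condition.
s_a, v_a = film(scalars, [rotate(R, v) for v in vectors], gamma, beta)
# Condition, then rotate the output.
s_b, v_b = film(scalars, vectors, gamma, beta)
v_b_rot = [rotate(R, v) for v in v_b]

film_commutes = s_a == s_b and all(close(u, v) for u, v in zip(v_a, v_b_rot))
print(film_commutes)
```

The design point is that the conditioning signal itself carries no orientation, so it can modulate features without disturbing how they transform under rotation.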

Findings

1. EquAct achieves state-of-the-art results on 18 RLBench tasks.

2. EquAct generalizes well under SE(3) and SE(2) scene perturbations.

3. EquAct performs effectively on physical robotic tasks.

Abstract

Transformer architectures can effectively learn language-conditioned, multi-task 3D open-loop manipulation policies from demonstrations by jointly processing natural language instructions and 3D observations. However, although both the robot policy and language instructions inherently encode rich 3D geometric structures, standard transformers lack built-in guarantees of geometric consistency, often resulting in unpredictable behavior under SE(3) transformations of the scene. In this paper, we leverage SE(3) equivariance as a key structural property shared by both policy and language, and propose EquAct, a novel SE(3)-equivariant multi-task transformer. EquAct is theoretically guaranteed to be SE(3) equivariant and consists of two key components: (1) an efficient SE(3)-equivariant point cloud-based U-net with spherical Fourier features for policy reasoning, and (2) SE(3)-invariant…
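
For readers unfamiliar with the term, the standard notion of SE(3) equivariance for a policy can be written as below. The symbols are notation introduced here, not taken from the paper: \pi is the policy, o a point-cloud observation, \ell the language instruction, and (R_a, t_a) a predicted gripper pose. Treating \ell as unchanged by the transformation is an assumption made for this sketch.

```latex
\pi(g \cdot o,\ \ell) = g \cdot \pi(o,\ \ell)
\qquad \text{for all } g = (R, t) \in SE(3),
\]
\[
g \cdot p = R\,p + t \ \ \text{for each point } p \in o,
\qquad
g \cdot (R_a, t_a) = (R\,R_a,\ R\,t_a + t).
```

In words: rigidly moving the whole scene should move the predicted action by the same rigid transformation.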

Taxonomy

Topics: Robot Manipulation and Learning · Teleoperation and Haptic Systems · Advanced Control Systems Optimization