EquiContact: A Hierarchical SE(3) Vision-to-Force Equivariant Policy for Spatially Generalizable Contact-rich Tasks

Joohwan Seo, Arvind Kruthiventy, Soomi Lee, Megan Teng, Seoyeon Choi, Xiang Zhang, Jongeun Choi, and Roberto Horowitz

arXiv:2507.10961 · cs.RO · February 2, 2026

Open Access

TL;DR

EquiContact introduces a hierarchical, SE(3)-equivariant vision-to-force policy that enables spatially generalizable contact-rich manipulation, demonstrated on real-world tasks with high success rates from limited demonstrations.

Contribution

The paper proposes EquiContact, a novel hierarchical policy combining a diffusion-based vision planner and a compliant visuomotor controller, achieving spatial generalization in contact-rich tasks.

Findings

01 High success rate on peg-in-hole, screwing, and wiping tasks.

02 Robust generalization to unseen spatial configurations.

03 SE(3)-equivariance from perception to control.

Abstract

This paper presents a framework for learning vision-based robotic policies for contact-rich manipulation tasks that generalize spatially across task configurations. We focus on achieving robust spatial generalization of the policy for the peg-in-hole (PiH) task trained from a small number of demonstrations. We propose EquiContact, a hierarchical policy composed of a high-level vision planner (Diffusion Equivariant Descriptor Field, Diff-EDF) and a novel low-level compliant visuomotor policy (Geometric Compliant ACT, G-CompACT). G-CompACT operates using only localized observations (geometrically consistent error vectors (GCEV), force-torque readings, and wrist-mounted RGB images) and produces actions defined in the end-effector frame. Through these design choices, we show that the entire EquiContact pipeline is SE(3)-equivariant, from perception to force control. We also outline three…
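
The abstract attributes equivariance to two design choices: localized observations and actions defined in the end-effector frame. The sketch below illustrates why that combination can work, under stated assumptions. It is not the authors' implementation. The function names (`rot`, `pose`, `local_error`, `toy_policy`) are hypothetical, the error vector is one plausible body-frame pose error rather than the paper's exact GCEV definition, and the proportional rule stands in for the learned G-CompACT policy. The point it shows: if the observation is unchanged when the whole scene is moved by a rigid transform g, then a body-frame action mapped back to the world frame rotates with g.

```python
import numpy as np

def rot(axis, angle):
    """Rotation matrix from an axis and angle (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * K @ K

def pose(R, p):
    """Build a 4x4 homogeneous transform from rotation R and position p."""
    g = np.eye(4)
    g[:3, :3] = R
    g[:3, 3] = p
    return g

def local_error(g_ee, g_goal):
    """Pose error of the goal, expressed in the end-effector frame.

    Assumed form (illustrative, not taken from the paper):
    translation R^T (p_goal - p) and the vee map of the
    skew-symmetric matrix R^T R_goal - R_goal^T R.
    """
    R, p = g_ee[:3, :3], g_ee[:3, 3]
    Rd, pd = g_goal[:3, :3], g_goal[:3, 3]
    e_p = R.T @ (pd - p)
    E = R.T @ Rd - Rd.T @ R
    e_R = np.array([E[2, 1], E[0, 2], E[1, 0]])
    return np.concatenate([e_p, e_R])

def toy_policy(obs, gain=0.5):
    """Hypothetical stand-in for a learned policy: body-frame twist command."""
    return gain * obs

# A scene: end-effector pose and goal (e.g. hole) pose in the world frame.
g_ee = pose(rot([0, 0, 1], 0.3), [0.5, 0.1, 0.4])
g_goal = pose(rot([1, 0, 0], -0.2), [0.6, 0.0, 0.2])

# Move the whole scene by an arbitrary rigid transform g.
g = pose(rot([1, 2, 3], 0.7), [0.4, -0.2, 1.0])
g_ee_moved, g_goal_moved = g @ g_ee, g @ g_goal

# Invariance: the localized observation does not change.
obs = local_error(g_ee, g_goal)
obs_moved = local_error(g_ee_moved, g_goal_moved)
print("observation invariant:", np.allclose(obs, obs_moved))

# Equivariance: the world-frame linear velocity rotates with the scene.
v_world = g_ee[:3, :3] @ toy_policy(obs)[:3]
v_world_moved = g_ee_moved[:3, :3] @ toy_policy(obs_moved)[:3]
print("action equivariant:", np.allclose(v_world_moved, g[:3, :3] @ v_world))
```

Only the pose-error channel is modeled here. Force-torque readings and wrist-camera images are also measured in the end-effector frame, so the same invariance argument plausibly applies to them, but the sketch does not test that.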

Taxonomy

Topics: Robot Manipulation and Learning · Tactile and Sensory Interactions · EEG and Brain-Computer Interfaces