ReTac-ACT: A State-Gated Vision-Tactile Fusion Transformer for Precision Assembly
Minchi Ruan, LiangQing Zhou, Hongtong Li, Zongtao Wang, ZhaoMing Lu, Jianwei Zhang, Bin Fang

TL;DR
ReTac-ACT is a vision-tactile fusion transformer for robotic precision assembly. It weights visual and tactile feedback dynamically, relying more on touch when the end-effector and workpiece occlude the camera view.
Contribution
The paper introduces ReTac-ACT, a vision-tactile imitation learning policy that combines three mechanisms: bidirectional cross-attention between visual and tactile features, a proprioception-conditioned gating network, and a tactile reconstruction objective.
Findings
Achieves a 90% peg-in-hole success rate on the NIST Assembly Task Board M1 benchmark.
Maintains 80% success at 0.1 mm clearance.
Outperforms vision-only and generalist baselines.
Abstract
Precision assembly requires sub-millimeter corrections in contact-rich "last-millimeter" regions where visual feedback fails due to occlusion from the end-effector and workpiece. We present ReTac-ACT (Reconstruction-enhanced Tactile ACT), a vision-tactile imitation learning policy that addresses this challenge through three synergistic mechanisms: (i) bidirectional cross-attention enabling reciprocal visuo-tactile feature enhancement before fusion, (ii) a proprioception-conditioned gating network that dynamically elevates tactile reliance when visual occlusion occurs, and (iii) a tactile reconstruction objective enforcing learning of manipulation-relevant contact information rather than generic visual textures. Evaluated on the standardized NIST Assembly Task Board M1 benchmark, ReTac-ACT achieves 90% peg-in-hole success, substantially outperforming vision-only and generalist baseline…
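The gating idea in mechanism (ii) can be illustrated with a minimal sketch. It is not the paper's implementation: the single linear layer, the sigmoid gate, the convex-combination fusion, and every name and number below (`tactile_gate`, `gated_fusion`, the weights, the one-dimensional "distance" state) are assumptions chosen only to show how a gate computed from proprioception could shift reliance from vision to touch.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tactile_gate(proprio: list[float], weights: list[float], bias: float) -> float:
    """Hypothetical gate: one linear layer over the proprioceptive state,
    followed by a sigmoid. Returns the tactile weight g in (0, 1)."""
    if len(proprio) != len(weights):
        raise ValueError("proprio and weights must have the same length")
    logit = sum(w * p for w, p in zip(weights, proprio)) + bias
    return sigmoid(logit)


def gated_fusion(visual: list[float], tactile: list[float], g: float) -> list[float]:
    """Convex combination of two feature vectors: g weights touch, 1 - g weights vision."""
    if len(visual) != len(tactile):
        raise ValueError("visual and tactile features must have the same length")
    return [g * t + (1.0 - g) * v for v, t in zip(visual, tactile)]


# Toy setup (all values are illustrative assumptions, not learned parameters).
# The one-dimensional state stands in for the peg-to-hole distance in mm:
# a negative weight makes the gate open as the distance shrinks.
weights, bias = [-2.0], 2.0
visual = [1.0, 0.0]   # placeholder visual feature
tactile = [0.0, 1.0]  # placeholder tactile feature

g_far = tactile_gate([5.0], weights, bias)   # far from the hole: logit = -8
g_near = tactile_gate([0.0], weights, bias)  # at the hole: logit = 2

fused_far = gated_fusion(visual, tactile, g_far)
fused_near = gated_fusion(visual, tactile, g_near)

print(f"far:  g = {g_far:.4f}, fused = {fused_far}")
print(f"near: g = {g_near:.4f}, fused = {fused_near}")
```

In this toy setup the fused feature is dominated by vision while the peg is far from the hole and by touch once it arrives, which is the qualitative behavior the abstract attributes to the gating network. A learned gate would take the full proprioceptive state and be trained jointly with the rest of the policy.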
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Sensor and Energy Harvesting Materials · Soft Robotics and Applications
