SI-Diff: A Framework for Learning Search and High-Precision Insertion with a Force-Domain Diffusion Policy
Yibo Liu, Stanko Oparnica, Simon Shewchun-Jakaitis, Guoyi Fu, Jie Wang, Jun Yang, Anand Jagannathan, Tony Hong-Yau Lo

TL;DR
SI-Diff is a unified framework that learns both search and high-precision insertion in contact-rich assembly tasks using a force-domain diffusion policy, improving tolerance to misalignments and enabling zero-shot transfer.
Contribution
SI-Diff introduces a mode-conditioning mechanism and a search teacher policy that unify search and insertion within a single model, enhancing robustness and transferability.
Findings
Extends x-y misalignment tolerance from 2 mm to 5 mm.
Demonstrates strong zero-shot transfer to unseen shapes.
Outperforms the state-of-the-art TacDiffusion baseline.
Abstract
Contact-rich assembly is fundamental in robotics but poses significant challenges due to uncertainties in relative poses, such as misalignments and small clearances in peg-in-hole tasks. Existing approaches typically address search and high-precision insertion separately, because these tasks involve distinct action patterns. However, supporting both tasks within a single model, without switching models or weights, is desirable for intelligent assembly systems. In this work, we propose SI-Diff, a framework that learns both search and high-precision insertion through a force-domain diffusion policy. To this end, we introduce a new mode-conditioning mechanism that enables the policy to capture distinct action behaviors under a single framework. Moreover, we develop a new search teacher policy that can generate diverse trajectories. By training on successful and efficient demonstrations…
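The idea of mode conditioning can be illustrated with a minimal sketch. The abstract does not say how the mode is encoded or what the policy observes, so everything below is an assumption made for illustration: the `Mode` enum, the `build_condition` helper, the one-hot encoding, and the three-element force reading are all hypothetical. The sketch only shows how a single policy could receive the same observation with a different mode flag, so that one set of weights can produce either search or insertion behavior.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical task modes; the names are illustrative, not from the paper."""
    SEARCH = 0
    INSERT = 1


def one_hot(mode):
    """Encode a mode as a one-hot list, in the order the modes are declared."""
    return [1.0 if m is mode else 0.0 for m in Mode]


def build_condition(force_obs, mode):
    """Concatenate observation features with the mode flag.

    Assumption: the conditioning input is a flat vector. The actual policy
    may inject the mode differently (for example, via a learned embedding).
    """
    return list(force_obs) + one_hot(mode)


# Same (made-up) force reading, two different modes.
force_obs = [0.1, -0.2, 3.0]
cond_search = build_condition(force_obs, Mode.SEARCH)
cond_insert = build_condition(force_obs, Mode.INSERT)
```

The two conditioning vectors differ only in their last two entries, which is what lets a single denoising network be steered toward one action pattern or the other without swapping models or weights.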
