TwinTrack: Bridging Vision and Contact Physics for Real-Time Tracking of Unknown Objects in Contact-Rich Scenes
Wen Yang, Zhixian Xie, Yiting Wang, Abhijit Tadepalli, Heni Ben Amor, Shan Lin, Wanxin Jin

TL;DR
TwinTrack is a physics-aware perception system that combines vision and contact physics to achieve robust, real-time 6-DoF pose tracking of unknown objects in contact-rich scenes, outperforming existing methods in challenging scenarios.
Contribution
The paper introduces TwinTrack, a system that integrates Real2Sim and Sim2Real for contact-physics-aware, real-time object tracking in contact-rich scenes, where prior vision-only approaches often fail under heavy occlusion and motion blur.
Findings
Achieves over 20 Hz tracking speed in contact-rich scenarios.
Outperforms baseline methods in robustness and accuracy.
Effectively estimates physical properties like mass and friction.
Abstract
Real-time tracking of previously unseen, highly dynamic objects in contact-rich scenes, such as during dexterous in-hand manipulation, remains a major challenge. Pure vision-based approaches often fail under heavy occlusions due to frequent contact interactions and motion blur caused by abrupt impacts. We propose TwinTrack, a physics-aware perception system that enables robust, real-time 6-DoF pose tracking of unknown dynamic objects in contact-rich scenes by leveraging contact physics cues. At its core, TwinTrack integrates Real2Sim and Sim2Real. Real2Sim combines vision and contact physics to jointly estimate object geometry and physical properties: an initial reconstruction is obtained from vision, then refined by learning a geometry residual and simultaneously estimating physical parameters (e.g., mass, inertia, and friction) based on contact dynamics consistency. Sim2Real achieves…
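To make "estimating physical parameters based on contact dynamics consistency" concrete, here is a minimal toy sketch in Python. It is not the TwinTrack method: it assumes a 1-D block sliding on a flat surface under Coulomb friction, and it recovers only a friction coefficient, by picking the candidate whose simulated trajectory best matches noisy observed positions. The function names (`simulate_slide`, `estimate_friction`), the grid search, and all numeric values are hypothetical choices made for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def simulate_slide(mu, v0, dt, steps):
    """Simulate a block sliding on a flat surface under Coulomb friction.

    Toy stand-in for a contact simulator: returns the block position
    after each of `steps` time steps of length `dt`.
    """
    x, v = 0.0, v0
    xs = np.empty(steps)
    for k in range(steps):
        # Friction decelerates the block; velocity is clamped at zero
        # so the block stays at rest once it has stopped.
        v = max(v - mu * G * dt, 0.0)
        x += v * dt
        xs[k] = x
    return xs


def estimate_friction(observed, v0, dt, candidates):
    """Return the candidate friction coefficient whose simulated
    trajectory has the smallest mean squared error to `observed`."""
    losses = [
        np.mean((simulate_slide(mu, v0, dt, len(observed)) - observed) ** 2)
        for mu in candidates
    ]
    return float(candidates[int(np.argmin(losses))])


# Synthetic "observations": a simulated slide plus small measurement noise.
rng = np.random.default_rng(0)
true_mu = 0.3
observed = simulate_slide(true_mu, v0=1.0, dt=0.01, steps=100)
observed = observed + rng.normal(0.0, 1e-3, size=observed.shape)

# Candidate coefficients from 0.05 to 1.0 in steps of 0.01 (assumed range).
candidates = np.linspace(0.05, 1.0, 96)
mu_hat = estimate_friction(observed, v0=1.0, dt=0.01, candidates=candidates)
```

The same consistency idea extends to more parameters (mass, inertia) and to gradient-based optimization in place of grid search, but those choices are beyond what this sketch shows.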
