ViTaSCOPE: Visuo-tactile Implicit Representation for In-hand Pose and Extrinsic Contact Estimation
Jayjun Lee, Nima Fazeli

TL;DR
ViTaSCOPE introduces a neural implicit representation that fuses visual and tactile data to accurately estimate in-hand object poses and contact points, enhancing dexterous manipulation capabilities.
Contribution
It presents a novel visuo-tactile neural implicit model that combines signed distance and shear fields for precise object and contact localization, with zero-shot sim-to-real transfer.
Findings
Accurately localizes objects and extrinsic contacts in both simulation and the real world.
Enables seamless reasoning over visuo-tactile cues.
Demonstrates improved dexterous manipulation performance.
Abstract
Mastering dexterous, contact-rich object manipulation demands precise estimation of both in-hand object poses and external contact locations, tasks that are particularly challenging due to partial and noisy observations. We present ViTaSCOPE: Visuo-Tactile Simultaneous Contact and Object Pose Estimation, an object-centric neural implicit representation that fuses vision and high-resolution tactile feedback. By representing objects as signed distance fields and distributed tactile feedback as neural shear fields, ViTaSCOPE accurately localizes objects and registers extrinsic contacts onto their 3D geometry as contact fields. Our method enables seamless reasoning over complementary visuo-tactile cues by leveraging simulation for scalable training and zero-shot transfers to the real world by bridging the sim-to-real gap. We evaluate our method through comprehensive simulated and…
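To illustrate the general idea of localizing an object with a signed distance field, here is a minimal sketch. It is not the authors' method: ViTaSCOPE learns a neural implicit representation and estimates full poses, whereas this toy assumes an analytic sphere SDF, estimates translation only, and uses plain gradient descent. All names (`sphere_sdf`, `estimate_translation`, `make_partial_observations`) and all numeric values are hypothetical.

```python
import math

def sphere_sdf(x, radius):
    """Signed distance from point x (object frame) to a sphere at the origin."""
    return math.sqrt(sum(c * c for c in x)) - radius

def estimate_translation(points, radius, init, lr=0.5, steps=500):
    """Fit a translation t so observed surface points satisfy sdf(p - t) ~ 0.

    Minimizes the mean of 0.5 * sdf(p - t)^2 by gradient descent.
    Assumption: an analytic sphere stands in for a learned SDF.
    """
    t = list(init)
    n = len(points)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for p in points:
            x = [p[i] - t[i] for i in range(3)]
            norm = math.sqrt(sum(c * c for c in x))
            if norm < 1e-12:
                continue  # SDF gradient is undefined at the sphere center
            d = norm - radius
            # d/dt of 0.5 * d^2 is d * (-x / norm)
            for i in range(3):
                grad[i] += -d * x[i] / norm
        t = [t[i] - lr * grad[i] / n for i in range(3)]
    return t

def make_partial_observations(center, radius, n_theta=6, n_phi=12):
    """Synthetic surface points on the upper hemisphere only (a partial view)."""
    pts = []
    for a in range(1, n_theta + 1):
        theta = 0.5 * math.pi * a / (n_theta + 1)  # polar angle in (0, pi/2)
        for b in range(n_phi):
            phi = 2.0 * math.pi * b / n_phi
            pts.append((
                center[0] + radius * math.sin(theta) * math.cos(phi),
                center[1] + radius * math.sin(theta) * math.sin(phi),
                center[2] + radius * math.cos(theta),
            ))
    return pts

# Hypothetical setup: a 5 cm sphere held at an unknown offset.
true_t = (0.10, -0.05, 0.20)
points = make_partial_observations(true_t, radius=0.05)
est = estimate_translation(points, radius=0.05, init=(0.0, 0.0, 0.0))
print("estimated translation:", [round(v, 4) for v in est])
```

The synthetic points are noise-free and cover only one hemisphere, which loosely mirrors the partial observations the abstract mentions. A real system would have to handle noise, rotation, and a learned shape model.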
Taxonomy
Topics: Tactile and Sensory Interactions · Motor Control and Adaptation · Hand Gesture Recognition Systems
