VIRDO: Visio-tactile Implicit Representations of Deformable Objects
Youngsun Wi, Pete Florence, Andy Zeng, Nima Fazeli

TL;DR
VIRDO introduces a novel visio-tactile implicit representation for deformable objects that enables high-fidelity reconstruction, generalization to new contacts, and state estimation using visual and tactile data.
Contribution
This work presents VIRDO, a multi-modal implicit representation that integrates visual and tactile data for deformable object modeling in robotics.
Findings
High-fidelity cross-modal reconstructions achieved
Effective generalization to unseen contact formations
State estimation with partial visio-tactile feedback demonstrated
Abstract
Deformable object manipulation requires computationally efficient representations that are compatible with robotic sensing modalities. In this paper, we present VIRDO: an implicit, multi-modal, and continuous representation for deformable-elastic objects. VIRDO operates directly on visual (point cloud) and tactile (reaction forces) modalities and learns rich latent embeddings of contact locations and forces to predict object deformations subject to external contacts. Here, we demonstrate VIRDO's ability to: i) produce high-fidelity cross-modal reconstructions with dense unsupervised correspondences, ii) generalize to unseen contact formations, and iii) perform state estimation with partial visio-tactile feedback.
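To make the idea of an implicit representation that deforms under contact more concrete, here is a minimal toy sketch. It assumes the shape is encoded as a signed distance function (negative inside, zero on the surface) and that contact location and force drive a deformation field that maps query points back to the undeformed shape. The sphere, the Gaussian falloff, and the names `nominal_sdf`, `deformation_offset`, and `deformed_sdf` are all hypothetical stand-ins for illustration; in the paper these roles are played by learned networks and latent embeddings, which this sketch does not reproduce.

```python
import math

def nominal_sdf(p, radius=1.0):
    """Signed distance from point p to an undeformed sphere at the origin.

    Stand-in for a learned implicit shape; negative means inside.
    """
    return math.sqrt(sum(c * c for c in p)) - radius

def deformation_offset(p, contact, force, width=0.5):
    """Toy deformation field (an assumption, not the paper's model).

    Returns the offset that maps a query point in the deformed space back
    to the nominal shape. Material near the contact is displaced along the
    force, so the inverse map moves the query point against the force.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(p, contact))
    weight = math.exp(-d2 / (2.0 * width ** 2))  # influence decays away from contact
    return tuple(-weight * f for f in force)

def deformed_sdf(p, contact, force):
    """Signed distance of the deformed object, evaluated at query point p."""
    offset = deformation_offset(p, contact, force)
    q = tuple(a + b for a, b in zip(p, offset))  # pull query back to nominal space
    return nominal_sdf(q)

if __name__ == "__main__":
    contact = (1.0, 0.0, 0.0)   # hypothetical contact location on the surface
    force = (-0.2, 0.0, 0.0)    # hypothetical inward push
    for point in [(1.0, 0.0, 0.0), (0.8, 0.0, 0.0), (-1.0, 0.0, 0.0)]:
        print(point, round(deformed_sdf(point, contact, force), 4))
```

Because the representation is a function of a continuous query point, it can be evaluated at any resolution, and conditioning the deformation on contact location and force is what lets a single model cover many contact formations.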
Taxonomy
Topics: Robot Manipulation and Learning · Tactile and Sensory Interactions · Advanced Sensor and Energy Harvesting Materials
