VIRDO++: Real-World, Visuo-tactile Dynamics and Perception of Deformable Objects
Youngsun Wi, Andy Zeng, Pete Florence, Nima Fazeli

TL;DR
VIRDO++ is a novel multimodal neural approach that integrates vision and touch for real-world deformable object manipulation, enabling accurate state estimation and dynamics prediction without privileged contact data.
Contribution
The paper introduces VIRDO++, a new formulation and algorithm for visuo-tactile state estimation and dynamics prediction that generalizes to unseen objects and contact scenarios.
Findings
High-fidelity visuo-tactile state estimation demonstrated
Effective dynamics prediction for deformable objects
Generalization to unseen objects and contact formations
Abstract
Deformable object manipulation can benefit from representations that seamlessly integrate vision and touch while handling occlusions. In this work, we present a novel approach for, and real-world demonstration of, multimodal visuo-tactile state estimation and dynamics prediction for deformable objects. Our approach, VIRDO++, builds on recent progress in multimodal neural implicit representations for deformable object state estimation [1] via a new formulation for deformation dynamics and a complementary state-estimation algorithm that (i) maintains a belief over deformations, and (ii) enables practical real-world application by removing the need for privileged contact information. In the context of two real-world robotic tasks, we show: (i) high-fidelity cross-modal state estimation and prediction of deformable objects from partial visuo-tactile feedback, and (ii) generalization to…
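To make the idea of "maintaining a belief over deformations" concrete, here is a minimal sketch of a generic particle filter over a one-dimensional latent deformation code. It is an illustration only and not the paper's algorithm: the scalar latent, the additive dynamics in `predict`, the linear `decode` function, and all noise parameters are assumptions introduced for this example, standing in for learned models that the abstract does not specify.

```python
import math
import random


def predict(particles, action, noise_std, rng):
    # Hypothetical dynamics: the action shifts the latent deformation code,
    # with Gaussian noise modelling uncertainty in the transition.
    return [z + action + rng.gauss(0.0, noise_std) for z in particles]


def decode(z):
    # Hypothetical stand-in for a learned decoder that maps a latent
    # deformation code to an expected sensor reading.
    return 2.0 * z


def update(particles, observation, obs_std):
    # Weight each particle by a Gaussian likelihood of the observation.
    weights = [
        math.exp(-0.5 * ((observation - decode(z)) / obs_std) ** 2)
        for z in particles
    ]
    total = sum(weights)
    if total == 0.0:
        # Fall back to a uniform belief if every likelihood underflows.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]


def resample(particles, weights, rng):
    # Draw a new particle set in proportion to the weights.
    return rng.choices(particles, weights=weights, k=len(particles))


def belief_mean(particles, weights):
    # Weighted mean of the belief, used as the point estimate.
    return sum(w * z for z, w in zip(particles, weights))


def run_filter(actions, observations, n_particles=500, seed=0):
    rng = random.Random(seed)
    # Start from a broad, uninformed belief over the latent code.
    particles = [rng.uniform(-1.0, 1.0) for _ in range(n_particles)]
    estimate = 0.0
    for action, observation in zip(actions, observations):
        particles = predict(particles, action, 0.05, rng)
        weights = update(particles, observation, 0.1)
        estimate = belief_mean(particles, weights)
        particles = resample(particles, weights, rng)
    return estimate


if __name__ == "__main__":
    # Synthetic example: the true latent starts at 0.3 and each action adds 0.1.
    actions = [0.1, 0.1, 0.1]
    observations = [0.8, 1.0, 1.2]  # decode(0.4), decode(0.5), decode(0.6)
    print(run_filter(actions, observations))
```

In this toy setting the belief should concentrate near the true final latent value of 0.6. A real system would replace the scalar latent with a learned code and the hand-written functions with trained networks.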
Taxonomy
Topics: Tactile and Sensory Interactions · Robot Manipulation and Learning · Motor Control and Adaptation
