Visuo-Tactile based Predictive Cross Modal Perception for Object Exploration in Robotics
Anirvan Dutta, Etienne Burdet, Mohsen Kaboli

TL;DR
This paper presents a visuo-tactile predictive framework enabling robots to efficiently explore and infer physical properties of unknown objects through interactive pushing and cross-modal perception, improving autonomous understanding in unstructured environments.
Contribution
A novel visuo-tactile cross-modal perception framework that uses initial visual shape data to improve property estimation and predictive capabilities in robotic exploration.
Findings
The framework achieves superior performance in real-robot experiments.
Initial visual priors enhance property estimation efficiency.
Predictive cross-modal perception improves object exploration accuracy.
Abstract
Autonomously exploring the unknown physical properties of novel objects such as stiffness, mass, center of mass, friction coefficient, and shape is crucial for autonomous robotic systems operating continuously in unstructured environments. We introduce a novel visuo-tactile based predictive cross-modal perception framework where initial visual observations (shape) aid in obtaining an initial prior over the object properties (mass). The initial prior improves the efficiency of object property estimation, in which the properties are inferred autonomously via interactive non-prehensile pushing and a dual filtering approach. The inferred properties are then used to efficiently enhance the predictive capability of the cross-modal function by using a human-inspired 'surprise' formulation. We evaluated our proposed framework in a real-robot scenario, demonstrating superior performance.
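To make the idea of a visual prior refined by interaction concrete, the following is a minimal sketch and not the authors' implementation. It swaps the paper's dual filtering approach for a plain scalar Gaussian (Kalman-style) update, and it treats 'surprise' as the negative log-likelihood of a measurement, which is an assumption made here for illustration. The function names, the density values, and the push-derived mass measurements are all hypothetical.

```python
import math

def visual_mass_prior(volume_m3, density_mean=500.0, density_std=300.0):
    """Gaussian prior over mass (kg) from a visually estimated volume.

    The density mean and spread (kg/m^3) are assumed values, not from the paper.
    """
    return volume_m3 * density_mean, (volume_m3 * density_std) ** 2

def kalman_update(mean, var, measurement, meas_var):
    """Scalar Bayesian update of a Gaussian belief with one noisy measurement."""
    gain = var / (var + meas_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

def surprise(mean, var, measurement, meas_var):
    """Negative log-likelihood of a measurement under the predictive Gaussian.

    Used here as a stand-in for the paper's 'surprise' formulation (assumption).
    """
    s = var + meas_var
    return 0.5 * (math.log(2.0 * math.pi * s) + (measurement - mean) ** 2 / s)

if __name__ == "__main__":
    # Hypothetical mass measurements (kg) derived from successive pushes.
    measurements = [0.82, 0.78, 0.85, 0.80]
    meas_var = 0.05 ** 2

    informed = visual_mass_prior(volume_m3=0.0016)  # prior from visual shape
    vague = (5.0, 25.0)                             # uninformative prior

    for z in measurements:
        print(f"surprise informed={surprise(*informed, z, meas_var):.2f} "
              f"vague={surprise(*vague, z, meas_var):.2f}")
        informed = kalman_update(*informed, z, meas_var)
        vague = kalman_update(*vague, z, meas_var)

    print("informed:", informed)
    print("vague:   ", vague)
```

Under these assumptions, a belief seeded from the visual shape starts close to the measurements, while the vague prior needs the tactile data to pull it into range. A measurement with a high surprise value could be used to decide which interactions are worth learning from.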
Taxonomy
Topics: Visual Attention and Saliency Detection · Tactile and Sensory Interactions · Interactive and Immersive Displays
