Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
Xinyue Zhu, Binghao Huang, Yunzhu Li

TL;DR
This paper introduces a portable visuo-tactile gripper and a cross-modal learning framework that integrates visual and tactile data, enabling more precise and robust robotic manipulation in real-world settings.
Contribution
The work presents a novel lightweight tactile-enabled gripper and a cross-modal representation learning method that improves manipulation performance in diverse environments.
Findings
Enhanced accuracy in fine-grained tasks
Robustness to external disturbances
Effective multimodal policy learning
Abstract
Handheld grippers are increasingly used to collect human demonstrations due to their ease of deployment and versatility. However, most existing designs lack tactile sensing, despite the critical role of tactile feedback in precise manipulation. We present a portable, lightweight gripper with integrated tactile sensors that enables synchronized collection of visual and tactile data in diverse, real-world, and in-the-wild settings. Building on this hardware, we propose a cross-modal representation learning framework that integrates visual and tactile signals while preserving their distinct characteristics. The learning procedure allows the emergence of interpretable representations that consistently focus on contacting regions relevant for physical interactions. When used for downstream manipulation tasks, these representations enable more efficient and effective policy learning,…
Taxonomy
Topics: Advanced Sensor and Energy Harvesting Materials · Robot Manipulation and Learning · Soft Robotics and Applications
