RA-Touch: Retrieval-Augmented Touch Understanding with Enriched Visual Data
Yoorhim Cho, Hongyeob Kim, Semin Kim, Youjia Zhang, Yunseok Choi, Sungeun Hong

TL;DR
RA-Touch is a retrieval-augmented framework that enhances visuo-tactile perception by leveraging tactile-focused descriptions in visual data, enabling better understanding of tactile properties without direct tactile data collection.
Contribution
The paper introduces RA-Touch, a novel retrieval-augmented approach that incorporates tactile semantics into visual data to improve tactile property understanding.
Findings
Outperforms prior methods on the TVL benchmark.
Effectively utilizes tactile-aware external descriptions.
Demonstrates the potential of retrieval-based visual reuse for tactile understanding.
Abstract
Visuo-tactile perception aims to understand an object's tactile properties, such as texture, softness, and rigidity. However, the field remains underexplored because collecting tactile data is costly and labor-intensive. We observe that visually distinct objects can exhibit similar surface textures or material properties. For example, a leather sofa and a leather jacket have different appearances but share similar tactile properties. This implies that tactile understanding can be guided by material cues in visual data, even without direct tactile supervision. In this paper, we introduce RA-Touch, a retrieval-augmented framework that improves visuo-tactile perception by leveraging visual data enriched with tactile semantics. We carefully recaption a large-scale visual dataset with tactile-focused descriptions, enabling the model to access tactile semantics typically absent from…
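The retrieval idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes that each recaptioned visual sample is stored as an embedding vector paired with its tactile-focused description, and that retrieval is a cosine-similarity top-k lookup. The names `cosine`, `retrieve_top_k`, and `bank`, the three-dimensional vectors, and the captions are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_top_k(query, bank, k=2):
    """Return the k (score, caption) pairs most similar to the query embedding."""
    scored = [(cosine(query, emb), caption) for emb, caption in bank]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical bank of (embedding, tactile-focused caption) pairs.
# Real embeddings would come from an encoder; these are made-up toy vectors.
bank = [
    ([0.9, 0.1, 0.0], "smooth, supple leather surface"),
    ([0.1, 0.9, 0.0], "rough, coarse woven fabric"),
    ([0.0, 0.1, 0.9], "hard, cold polished metal"),
]

# A query embedding that lies close to the leather entry.
query = [0.8, 0.2, 0.0]
top = retrieve_top_k(query, bank, k=2)
for score, caption in top:
    print(f"{score:.3f}  {caption}")
```

In this toy setup the leather caption ranks first because its vector points in nearly the same direction as the query, which mirrors the leather sofa and leather jacket example: different appearance, similar material cue. How the actual framework encodes queries and integrates the retrieved descriptions is not stated in the truncated abstract.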
Taxonomy
Topics: Tactile and Sensory Interactions · Interactive and Immersive Displays · Gaze Tracking and Assistive Technology
