HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals
Guimin Hu, Daniel Hershcovich, Hasti Seifi

TL;DR
This paper introduces HapticCap, a large multimodal dataset of vibration signals paired with user descriptions, and proposes a haptic-caption retrieval task, approached with contrastive learning, for matching textual descriptions to haptic signals.
Contribution
The paper presents the first fully human-annotated haptic-captioned dataset and a novel retrieval task to enhance modeling of vibration signals with textual descriptions.
Findings
HapticCap contains 92,070 haptic-text pairs.
Language and audio models improve retrieval performance.
Training separately on each description category (sensory, emotional, associative) yields better results.
Abstract
Haptic signals, from smartphone vibrations to virtual reality touch feedback, can effectively convey information and enhance realism, but designing signals that resonate meaningfully with users is challenging. To facilitate this, we introduce a multimodal dataset and task, of matching user descriptions to vibration haptic signals, and highlight two primary challenges: (1) lack of large haptic vibration datasets annotated with textual descriptions as collecting haptic descriptions is time-consuming, and (2) limited capability of existing tasks and models to describe vibration signals in text. To advance this area, we create HapticCap, the first fully human-annotated haptic-captioned dataset, containing 92,070 haptic-text pairs for user descriptions of sensory, emotional, and associative attributes of vibrations. Based on HapticCap, we propose the haptic-caption retrieval task and present…
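To make the retrieval task concrete, the following is a minimal sketch of contrastive haptic-caption retrieval. It is not the authors' implementation: the function names, the temperature value, and the assumption of exactly one description per signal in a batch are all hypothetical, and the embeddings are taken as given (in practice they would come from a signal encoder and a text encoder). The sketch shows a symmetric InfoNCE-style loss over a batch of paired embeddings and a recall@k metric for ranking signals against a text query.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(haptic_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss (assumed form, not taken from the paper).

    Row i of haptic_emb and row i of text_emb are assumed to be a matching
    pair; every other row in the batch serves as a negative.
    """
    h = l2_normalize(haptic_emb)
    t = l2_normalize(text_emb)
    logits = h @ t.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(h))
    loss_h2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # signal -> text
    loss_t2h = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> signal
    return 0.5 * (loss_h2t + loss_t2h)

def recall_at_k(query_emb, gallery_emb, k=1):
    """Fraction of queries whose true match (same row index) ranks in the top k."""
    sims = l2_normalize(query_emb) @ l2_normalize(gallery_emb).T
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = (topk == np.arange(len(sims))[:, None]).any(axis=1)
    return float(hits.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text = rng.normal(size=(8, 16))                    # stand-in text embeddings
    haptic = text + 0.01 * rng.normal(size=(8, 16))    # nearly aligned signal embeddings
    print("loss:", symmetric_contrastive_loss(haptic, text))
    print("recall@1 (text -> haptic):", recall_at_k(text, haptic, k=1))
```

With real data, a signal often has several descriptions, so the one-to-one pairing assumed here would need to be relaxed, for example by treating every description of the same signal as a positive.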
Taxonomy
Topics: Safety Warnings and Signage · Tactile and Sensory Interactions
