CLTP: Contrastive Language-Tactile Pre-training for 3D Contact Geometry Understanding

Wenxuan Ma; Xiaoge Cao; Yixiang Zhang; Chaofan Zhang; Shaobo Yang; Peng Hao; Bin Fang; Yinghao Cai; Shaowei Cui; Shuo Wang

arXiv:2505.08194 · cs.RO · May 14, 2025

TL;DR

CLTP introduces a novel framework that aligns tactile 3D point clouds with natural language to improve contact state understanding in robotic manipulation, enabling zero-shot classification and tactile-language interactions.

Contribution

CLTP is the first framework to align tactile and language representations from the contact-state perspective. It is trained on a dataset of 50k+ tactile 3D point cloud-language pairs and builds on a pre-aligned feature space, targeting contact-rich manipulation tasks.

Findings

01 Outperforms existing methods in zero-shot 3D classification

02 Achieves high accuracy in contact state classification

03 Enables effective tactile language model interactions
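
To make the zero-shot classification finding concrete, the sketch below shows how classification typically works once two modalities share an embedding space: the tactile embedding is compared against the text embeddings of candidate descriptions, and the closest one wins. The function name `zero_shot_classify`, the example labels, and the toy 3-dimensional embeddings are all hypothetical and chosen for illustration; the paper's actual encoders and prompts are not described on this page.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def zero_shot_classify(tactile_emb, text_embs, labels):
    """Pick the label whose text embedding is most similar to the tactile embedding.

    tactile_emb: shape (d,), embedding of one tactile point cloud (assumed given).
    text_embs:   shape (k, d), one embedding per candidate description.
    labels:      list of k label strings.
    """
    t = l2_normalize(np.asarray(tactile_emb, dtype=float))
    c = l2_normalize(np.asarray(text_embs, dtype=float))
    sims = c @ t  # cosine similarity to each candidate, shape (k,)
    return labels[int(np.argmax(sims))], sims

# Toy example with made-up embeddings (not outputs of any real encoder).
labels = ["edge contact", "flat surface contact", "point contact"]
text_embs = np.eye(3)
tactile_emb = np.array([0.1, 0.9, 0.2])
predicted, sims = zero_shot_classify(tactile_emb, text_embs, labels)
print(predicted)
```

No task-specific classifier is trained here; adding a new class only requires embedding a new description.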

Abstract

Recent advancements in integrating tactile sensing with vision-language models (VLMs) have demonstrated remarkable potential for robotic multimodal perception. However, existing tactile descriptions remain limited to superficial attributes like texture, neglecting critical contact states essential for robotic manipulation. To bridge this gap, we propose CLTP, an intuitive and effective language tactile pretraining framework that aligns tactile 3D point clouds with natural language in various contact scenarios, thus enabling contact-state-aware tactile language understanding for contact-rich manipulation tasks. We first collect a novel dataset of 50k+ tactile 3D point cloud-language pairs, where descriptions explicitly capture multidimensional contact states (e.g., contact location, shape, and force) from the tactile sensor's perspective. CLTP leverages a pre-aligned and frozen…
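
The abstract excerpt is cut off before it states the training objective, so the following is only a generic illustration of contrastive alignment between two modalities, in the style popularized by CLIP. It is not the authors' loss. The function `symmetric_contrastive_loss`, the temperature value of 0.07, and the random toy embeddings are assumptions made for this sketch.

```python
import numpy as np

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(tactile_embs, text_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of tactile_embs and row i of text_embs are assumed to be a matched
    pair; every other row in the batch serves as a negative.
    """
    t = tactile_embs / np.linalg.norm(tactile_embs, axis=1, keepdims=True)
    s = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = (t @ s.T) / temperature  # (n, n) scaled cosine similarities
    idx = np.arange(len(logits))
    loss_t2s = -log_softmax(logits, axis=1)[idx, idx].mean()  # tactile -> text
    loss_s2t = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> tactile
    return 0.5 * (loss_t2s + loss_s2t)

# Toy check with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
aligned_loss = symmetric_contrastive_loss(text.copy(), text)
shuffled_loss = symmetric_contrastive_loss(text[::-1].copy(), text)
print(aligned_loss < shuffled_loss)
```

The loss is low when each tactile embedding is closest to its own description and high when pairs are mismatched, which is the property that pulls matched tactile and language representations together during training.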
