FG-CLTP: Fine-Grained Contrastive Language Tactile Pretraining for Robotic Manipulation

Wenxuan Ma; Chaofan Zhang; Yinghao Cai; Guocai Yao; Shaowei Cui; Shuo Wang

arXiv:2603.10871·cs.RO·March 12, 2026

TL;DR

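This paper introduces FG-CLTP, a contrastive language-tactile pretraining framework for robotic manipulation. It pairs a dataset of over 100k tactile 3D point cloud-language pairs with discretized numerical tokenization, so that quantitative contact states such as force magnitude, contact geometry, and principal axis orientation are represented explicitly rather than through qualitative descriptors alone.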

Contribution

The paper presents a new dataset, a discretized numerical tokenization method, and a 3D tactile-language-action architecture for enhanced robotic manipulation.
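
To make the idea of discretized numerical tokenization concrete, here is a minimal sketch of one way a continuous contact quantity could be mapped to a discrete text token. The value ranges, bin counts, token format, and function names (`quantize`, `to_token`, `describe_contact`) are illustrative assumptions and are not taken from the paper.

```python
def quantize(value, lo, hi, num_bins):
    """Map a continuous value to an integer bin index in [0, num_bins - 1].

    Values outside [lo, hi] are clipped to the nearest bound.
    """
    if hi <= lo or num_bins < 1:
        raise ValueError("need hi > lo and num_bins >= 1")
    clipped = min(max(value, lo), hi)
    idx = int((clipped - lo) / (hi - lo) * num_bins)
    # value == hi would land in bin num_bins, so fold it into the last bin
    return min(idx, num_bins - 1)


def to_token(name, value, lo, hi, num_bins):
    """Render a quantized value as a hypothetical token such as '<force_5>'."""
    return f"<{name}_{quantize(value, lo, hi, num_bins)}>"


def describe_contact(force_n, angle_deg):
    """Build a caption that embeds quantized metrics as tokens.

    Assumed ranges: force in [0, 10] N with 20 bins,
    principal axis angle in [0, 180] degrees with 18 bins.
    """
    force_tok = to_token("force", force_n, 0.0, 10.0, 20)
    angle_tok = to_token("angle", angle_deg, 0.0, 180.0, 18)
    return f"contact with force {force_tok} and principal axis {angle_tok}"


if __name__ == "__main__":
    print(describe_contact(2.6, 95.0))
```

Binning trades resolution for a finite vocabulary: with the assumed 20 bins over 10 N, each force token covers a 0.5 N interval, so finer control would need more bins or a narrower range.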

Findings

01. Achieved 95.9% classification accuracy

02. Reduced regression MAE by 52.6%

03. Minimal sim-to-real gap of 3.5%

Abstract

Recent advancements in integrating tactile sensing into vision-language-action (VLA) models have demonstrated transformative potential for robotic perception. However, existing tactile representations predominantly rely on qualitative descriptors (e.g., texture), neglecting quantitative contact states such as force magnitude, contact geometry, and principal axis orientation, which are indispensable for fine-grained manipulation. To bridge this gap, we propose FG-CLTP, a fine-grained contrastive language tactile pretraining framework. We first introduce a novel dataset comprising over 100k tactile 3D point cloud-language pairs that explicitly capture multidimensional contact states from the sensor's perspective. We then implement a discretized numerical tokenization mechanism to achieve quantitative-semantic alignment, effectively injecting explicit physical metrics into the multimodal…
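
The abstract does not spell out the contrastive objective, so the following is only a sketch of a generic CLIP-style symmetric contrastive (InfoNCE) loss between paired tactile and text embeddings. The choice of this particular loss, the temperature of 0.07, and all function names are assumptions made for illustration, not details confirmed by the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(tactile_emb, text_emb, temperature=0.07):
    """Average of tactile-to-text and text-to-tactile InfoNCE losses.

    tactile_emb[i] and text_emb[i] are assumed to be a matched pair.
    """
    tactile = [l2_normalize(v) for v in tactile_emb]
    text = [l2_normalize(v) for v in text_emb]
    logits = [
        [sum(a * b for a, b in zip(t, s)) / temperature for s in text]
        for t in tactile
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_transposed)
    )


if __name__ == "__main__":
    aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                         [[1.0, 0.0], [0.0, 1.0]])
    shuffled = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                          [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < shuffled)
```

Under this kind of objective, captions that differ only in their numeric tokens act as negatives for each other, which is one plausible way that quantitative differences in contact state could be pushed apart in the embedding space.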

Taxonomy

Topics: Advanced Sensor and Energy Harvesting Materials · Robot Manipulation and Learning · Tactile and Sensory Interactions