SToLa: Self-Adaptive Touch-Language Framework with Tactile Commonsense Reasoning in Open-Ended Scenarios
Ning Cheng, Jinan Xu, Jialing Chen, Bin Fang, Wenjuan Han

TL;DR
This paper introduces SToLa, a self-adaptive framework that enhances tactile and language reasoning in open-ended scenarios by using Mixture of Experts and a new comprehensive tactile commonsense dataset.
Contribution
SToLa is the first framework to dynamically unify tactile and language modalities using Mixture of Experts for improved reasoning in open-ended physical scenarios.
Findings
SToLa achieves competitive results on the PhysiCLeAR benchmark.
The Mixture of Experts architecture effectively manages multimodal data.
The new dataset enables more diverse and complex tactile reasoning evaluation.
Abstract
This paper explores the challenges of integrating tactile sensing into intelligent systems for multimodal reasoning, particularly in enabling commonsense reasoning about the open-ended physical world. We identify two key challenges: modality discrepancy, where existing large touch-language models often treat touch as a mere sub-modality of language, and open-ended tactile data scarcity, where current datasets lack the diversity, open-endedness, and complexity needed for reasoning. To overcome these challenges, we introduce SToLa, a Self-Adaptive Touch-Language framework. SToLa utilizes Mixture of Experts (MoE) to dynamically process, unify, and manage tactile and language modalities, capturing their unique characteristics. Crucially, we also present a comprehensive tactile commonsense reasoning dataset and benchmark featuring free-form questions and responses, 8 physical properties, 4…
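For readers unfamiliar with Mixture of Experts, the general idea of routing each token to a few experts can be sketched as follows. This is a generic top-k gating illustration, not SToLa's actual architecture: the abstract only states that MoE is used. The class name `ToyMoE`, the linear router, the linear experts, and all sizes are assumptions made for the example.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class ToyMoE:
    """Hypothetical top-k MoE layer with randomly initialised linear experts."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one score per expert, computed from the token embedding.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a dim x dim linear map (an assumption for brevity).
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, token):
        """Return (expert_index, weight) pairs for the top-k experts."""
        logits = matvec(self.router, token)
        top = sorted(range(len(logits)), key=lambda i: logits[i],
                     reverse=True)[:self.top_k]
        # Renormalise the gate weights over the selected experts only.
        weights = softmax([logits[i] for i in top])
        return list(zip(top, weights))

    def forward(self, token):
        """Weighted sum of the selected experts' outputs."""
        out = [0.0] * len(token)
        for idx, w in self.route(token):
            y = matvec(self.experts[idx], token)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out


moe = ToyMoE(dim=4, num_experts=4, top_k=2)
token = [1.0, 0.0, 0.5, -0.5]  # stand-in for a tactile or language embedding
routes = moe.route(token)
output = moe.forward(token)
```

In a sketch like this, tactile and language tokens share one layer, and the router decides per token which experts handle it, so different experts are free to specialise by modality rather than having touch forced through language-only parameters.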
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Sensor and Energy Harvesting Materials · Tactile and Sensory Interactions
