How does fine-tuning improve sensorimotor representations in large language models?
Minghua Wu, Javier Conde, Pedro Reviriego, Marc Brysbaert

TL;DR
This paper investigates how task-specific fine-tuning can align large language models' internal representations with human sensorimotor experiences, showing that fine-tuning can improve embodiment but is sensitive to task format.
Contribution
The paper demonstrates that fine-tuning can steer LLM representations toward more embodied patterns, and it identifies where the gains transfer (across languages and related sensorimotor dimensions) and where they do not (across task formats).
Findings
Fine-tuning improves sensorimotor alignment in LLMs.
Embodiment gains generalize across languages and related sensorimotor dimensions.
Sensorimotor improvements are sensitive to learning objectives and do not transfer across task formats.
Abstract
Large Language Models (LLMs) exhibit a significant "embodiment gap", where their text-based representations fail to align with human sensorimotor experiences. This study systematically investigates whether and how task-specific fine-tuning can bridge this gap. Utilizing Representational Similarity Analysis (RSA) and dimension-specific correlation metrics, we demonstrate that the internal representations of LLMs can be steered toward more embodied, grounded patterns through fine-tuning. Furthermore, the results show that while sensorimotor improvements generalize robustly across languages and related sensory-motor dimensions, they are highly sensitive to the learning objective, failing to transfer across two disparate task formats.
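The abstract names Representational Similarity Analysis (RSA) and dimension-specific correlations as its evaluation tools but does not spell out the procedure. The sketch below shows one common way such a comparison can be set up; it is not the authors' implementation. The dissimilarity measure (1 minus Pearson correlation), the Spearman-style rank comparison, the function names, and the random stand-in data are all assumptions made for illustration.

```python
import numpy as np


def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson r between item rows."""
    return 1.0 - np.corrcoef(features)


def upper_triangle(matrix):
    """Flatten the off-diagonal upper triangle (each item pair counted once)."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]


def ranks(values):
    """Ordinal ranks. Ties are broken by position, which is adequate for
    continuous values; a tie-averaging rank would be needed for discrete ratings."""
    return np.argsort(np.argsort(values)).astype(float)


def rank_correlation(x, y):
    """Spearman-style correlation: Pearson r computed on ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])


def rsa_score(model_features, human_features):
    """Compare the pairwise dissimilarity structure of two item-by-feature matrices."""
    return rank_correlation(
        upper_triangle(rdm(model_features)),
        upper_triangle(rdm(human_features)),
    )


def dimension_correlations(model_ratings, human_ratings):
    """One rank correlation per rating dimension (column)."""
    return [
        rank_correlation(model_ratings[:, d], human_ratings[:, d])
        for d in range(human_ratings.shape[1])
    ]


# Hypothetical stand-in data: 20 words rated on 6 sensorimotor dimensions.
rng = np.random.default_rng(0)
human = rng.random((20, 6))
aligned_model = human + 0.05 * rng.standard_normal(human.shape)  # close to human ratings
unrelated_model = rng.random((20, 6))                            # no relation to human ratings

print("RSA, aligned model:  ", rsa_score(aligned_model, human))
print("RSA, unrelated model:", rsa_score(unrelated_model, human))
print("Per-dimension:       ", dimension_correlations(aligned_model, human))
```

In this framing, an "embodiment gain" from fine-tuning would show up as a higher RSA score or higher per-dimension correlations for the fine-tuned model than for the base model against the same human ratings.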
Taxonomy
Topics: Action Observation and Synchronization · Neurobiology of Language and Bilingualism · Embodied and Extended Cognition
