A Roadmap for Embodied and Social Grounding in LLMs
Sara Incao, Carlo Mazzola, Giulia Belgiovine, Alessandra Sciutti

TL;DR
This paper proposes a comprehensive roadmap for grounding Large Language Models in embodied, social, and temporal experiences to enhance their understanding and interaction with the physical and social world.
Contribution
It introduces a novel framework emphasizing active bodily systems, structured temporal experiences, and social skills for improved LLM grounding in robotics.
Findings
Highlights the importance of bodily experience for language understanding
Proposes integrating social skills for shared experiences
Suggests temporal structuring for coherent interactions
Abstract
The fusion of Large Language Models (LLMs) and robotic systems has led to a transformative paradigm in the robotic field, offering unparalleled capabilities not only in the communication domain but also in skills like multimodal input handling, high-level reasoning, and plan generation. The grounding of LLMs' knowledge into the empirical world has been considered a crucial pathway to exploit the efficiency of LLMs in robotics. Nevertheless, connecting LLMs' representations to the external world with multimodal approaches or with robots' bodies is not enough to let them understand the meaning of the language they are manipulating. Taking inspiration from humans, this work draws attention to three necessary elements for an agent to grasp and experience the world. The roadmap for LLM grounding is envisaged in an active bodily system as the reference point for experiencing the environment,…
Taxonomy
Topics: Semantic Web and Ontologies · Topic Modeling · Multi-Agent Systems and Negotiation
Methods: Softmax · Attention Is All You Need
