Language Bootstrapping: Learning Word Meanings From Perception-Action Association
Giampiero Salvi, Luis Montesano, Alexandre Bernardino, José Santos-Victor

TL;DR
This paper presents a method for robots to learn word meanings through perception-action associations during manipulation tasks, enabling language grounding without grammatical analysis.
Contribution
The study extends an affordance network model to incorporate spoken words, allowing a robot to associate verbal descriptions with its own actions and perceptions.
Findings
The robot forms useful word-to-meaning associations without relying on grammatical structure.
The approach is robust to speech recognition errors.
Word associations improve task instruction and contextual understanding.
Abstract
We address the problem of bootstrapping language acquisition for an artificial system similarly to what is observed in experiments with human infants. Our method works by associating meanings to words in manipulation tasks, as a robot interacts with objects and listens to verbal descriptions of the interactions. The model is based on an affordance network, i.e., a mapping between robot actions, robot perceptions, and the perceived effects of these actions upon objects. We extend the affordance model to incorporate spoken words, which allows us to ground the verbal symbols to the execution of actions and the perception of the environment. The model takes verbal descriptions of a task as the input and uses temporal co-occurrence to create links between speech utterances and the involved objects, actions, and effects. We show that the robot is able to form useful word-to-meaning associations,…
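As a rough illustration of how temporal co-occurrence alone can link words to meanings, the sketch below counts how often each word is heard in episodes where a given action or object symbol is present. This is not the authors' model, which extends an affordance network; it is a simplified counting scheme under assumed inputs. The function names, the `action=...` and `shape=...` symbol strings, and the toy episodes are all hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(episodes):
    """Score each (word, symbol) pair as P(word | symbol) - P(word).

    `episodes` is a list of (symbols, words) pairs, one per interaction.
    A positive score means the word is heard more often when the symbol
    is present than it is overall. No grammar is used, only bags of words.
    """
    n = len(episodes)
    word_counts, symbol_counts, joint_counts = Counter(), Counter(), Counter()
    for symbols, words in episodes:
        symbols, words = set(symbols), set(words)
        word_counts.update(words)
        symbol_counts.update(symbols)
        joint_counts.update(product(words, symbols))
    return {
        (word, symbol): joint / symbol_counts[symbol] - word_counts[word] / n
        for (word, symbol), joint in joint_counts.items()
    }


def best_meaning(scores, word):
    """Return the symbol with the highest score for `word`."""
    candidates = {s: v for (w, s), v in scores.items() if w == word}
    return max(candidates, key=candidates.get)


# Hypothetical toy data: each episode pairs what the robot did and saw
# with the words of a verbal description heard at the same time.
episodes = [
    ({"action=grasp", "shape=ball"}, "the robot grasps the ball".split()),
    ({"action=tap", "shape=ball"}, "the robot taps the ball".split()),
    ({"action=grasp", "shape=box"}, "the robot grasps the box".split()),
    ({"action=tap", "shape=box"}, "the robot taps the box".split()),
]
scores = cooccurrence_scores(episodes)
print(best_meaning(scores, "grasps"))
print(best_meaning(scores, "ball"))
```

In this toy setting, words that appear in every description, such as "the", score zero against every symbol, so they carry no meaning without any grammatical analysis being needed.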
