Towards Understanding Language through Perception in Situated Human-Robot Interaction: From Word Grounding to Grammar Induction
Amir Aly, Tadahiro Taniguchi

TL;DR
This paper explores how robots can understand human language by grounding words in visual perception and inducing grammatical structures, enabling better comprehension of instructions during interaction.
Contribution
It introduces methods for grounding parts of speech through perception and inducing CCG grammar, advancing robot language understanding capabilities.
Findings
Grounding parts of speech via visual perception.
Inducing CCG grammar for phrase understanding.
Improved robot comprehension of human instructions.
Abstract
Robots are widely collaborating with human users in different tasks that require high-level cognitive functions to make them able to discover the surrounding environment. A difficult challenge that we briefly highlight in this short paper is inferring the latent grammatical structure of language, which includes grounding parts of speech (e.g., verbs, nouns, adjectives, and prepositions) through visual perception, and induction of Combinatory Categorial Grammar (CCG) for phrases. This paves the way towards grounding phrases so as to make a robot able to understand human instructions appropriately during interaction.
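To make the CCG part of the abstract concrete, the following is a minimal sketch of how CCG categories combine once each word has a category. It is not the authors' induction method: the lexicon, the category assignments, and the names `Atom`, `Functor`, `combine`, and `parse` are all hypothetical, and only forward and backward application are implemented.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S, NP, or N."""
    name: str


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg looks right, result\\arg looks left."""
    result: "Category"
    slash: str  # "/" seeks its argument to the right, "\\" to the left
    arg: "Category"


Category = Union[Atom, Functor]

S, NP, N = Atom("S"), Atom("NP"), Atom("N")

# Hypothetical lexicon (an assumption for illustration, not taken from the paper).
LEXICON = {
    "grasp": Functor(S, "/", NP),  # imperative verb taking an object NP
    "the": Functor(NP, "/", N),    # determiner
    "red": Functor(N, "/", N),     # adjective
    "box": N,                      # noun
}


def combine(left: Category, right: Category) -> Optional[Category]:
    # Forward application: X/Y followed by Y yields X.
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    # Backward application: Y followed by X\Y yields X.
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


def parse(words, lexicon):
    """CKY-style chart parse; returns every category that spans all words."""
    n = len(words)
    chart = {(i, i + 1): {lexicon[w]} for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        result = combine(left, right)
                        if result is not None:
                            cell.add(result)
            chart[(i, j)] = cell
    return chart[(0, n)]


categories = parse("grasp the red box".split(), LEXICON)
```

Under these assumed categories, the instruction "grasp the red box" reduces to a single sentence category S, while a scrambled word order such as "box the" yields no category at all. Grammar induction, as described in the abstract, would have to infer such word-to-category assignments rather than take them as given.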
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Language and cultural evolution
