A Joint Model of Language and Perception for Grounded Attribute Learning
Cynthia Matuszek (University of Washington), Nicholas FitzGerald (University of Washington), Luke Zettlemoyer (University of Washington), Liefeng Bo (University of Washington), Dieter Fox (University of Washington)

TL;DR
This paper introduces an approach that jointly learns language and perception models to interpret natural language descriptions of objects in physical scenes, with the goal of letting untrained users interact with robots more easily.
Contribution
It presents a language model based on a probabilistic categorial grammar, learned jointly with perception attribute classifiers (for example, for object color and shape), for grounded attribute induction in physical environments.
Findings
Accurate interpretation of object descriptions in scenes
Effective induction of latent concepts from language and perception
Improved task performance in physical workspace understanding
Abstract
As robots become more ubiquitous and capable, it becomes ever more important to enable untrained users to easily interact with them. Recently, this has led to study of the language grounding problem, where the goal is to extract representations of the meanings of natural language tied to perception and actuation in the physical world. In this paper, we present an approach for joint learning of language and perception models for grounded attribute induction. Our perception model includes attribute classifiers, for example to detect object color and shape, and the language model is based on a probabilistic categorial grammar that enables the construction of rich, compositional meaning representations. The approach is evaluated on the task of interpreting sentences that describe sets of objects in a physical workspace. We demonstrate accurate task performance and effective latent-variable…
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Human Pose and Action Recognition
