Neural Variational Learning for Grounded Language Acquisition

Nisha Pillai; Cynthia Matuszek; Francis Ferraro

arXiv:2107.14593 · cs.CL · August 2, 2021

TL;DR

This paper introduces a neural generative approach for grounded language learning that links language to visual percepts without predefined categories, enabling effective multilingual and low-resource language grounding.

Contribution

The paper presents a unified generative model that learns a shared semantic/visual embedding for grounded language acquisition without relying on pre-defined visual categories.

Findings

01. Effective in low-resource settings

02. Generalizes across multilingual datasets

03. Outperforms non-neural methods in language grounding

Abstract

We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning of language about a wide range of real-world objects. We evaluate the efficacy of this learning by predicting the semantics of objects and comparing the performance with neural and non-neural inputs. We show that this generative approach exhibits promising results in language grounding without pre-specifying visual categories under low-resource settings. Our experiments demonstrate that this approach is generalizable to multilingual, highly varied datasets.
