What is Learned in Visually Grounded Neural Syntax Acquisition

Noriyuki Kojima; Hadar Averbuch-Elor; Alexander M. Rush; Yoav Artzi

arXiv:2005.01678 · cs.CL · May 20, 2020 · 1 citation

TL;DR

This paper analyzes the Visually Grounded Neural Syntax Learner (Shi et al., 2019) and finds that significantly less expressive versions of the model perform as well or better, and that a simple lexical cue, noun concreteness, drives its predictions more than complex syntactic reasoning does.

Contribution

By constructing simplified versions of the model, it shows that significantly less expressive variants produce similar predictions and perform as well or better, and that noun concreteness, rather than complex syntactic reasoning, plays the main role in the model's predictions.

Findings

01 Significantly less expressive versions of the model perform as well as, or better than, the full model.

02 Noun concreteness plays the main role in the model's predictions.

03 Complex syntactic reasoning matters less to the model's predictions than this simple lexical signal.

Abstract

Visual features are a promising signal for learning bootstrap textual models. However, blackbox learning models make it difficult to isolate the specific contribution of visual components. In this analysis, we consider the case study of the Visually Grounded Neural Syntax Learner (Shi et al., 2019), a recent approach for learning syntax from a visual training signal. By constructing simplified versions of the model, we isolate the core factors that yield the model's strong performance. Contrary to what the model might be capable of learning, we find significantly less expressive versions produce similar predictions and perform just as well, or even better. We also find that a simple lexical signal of noun concreteness plays the main role in the model's predictions as opposed to more complex syntactic reasoning.

Code & Models

Repositories

lil-lab/vgnsl_analysis
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling