Dependency Induction Through the Lens of Visual Perception

Ruisi Su; Shruti Rijhwani; Hao Zhu; Junxian He; Xinyu Wang; Yonatan Bisk; Graham Neubig

arXiv:2109.09790 · cs.CL · September 22, 2021

TL;DR

This paper introduces an unsupervised model that uses visual and lexical cues to improve grammar induction, significantly enhancing dependency and constituency parsing performance over text-only models.

Contribution

It proposes a novel joint learning approach for constituency and dependency grammars leveraging word concreteness and visual cues, advancing visually grounded syntax models.

Findings

1. Concreteness improves dependency grammar learning, increasing DAS by over 50%.

2. Visual semantic role labels enhance constituency parsing accuracy.

3. The model outperforms existing visually grounded models with smaller grammars.

Abstract

Most previous work on grammar induction focuses on learning phrasal or dependency structure purely from text. However, because the signal provided by text alone is limited, recently introduced visually grounded syntax models make use of multimodal information leading to improved performance in constituency grammar induction. However, as compared to dependency grammars, constituency grammars do not provide a straightforward way to incorporate visual information without enforcing language-specific heuristics. In this paper, we propose an unsupervised grammar induction model that leverages word concreteness and a structural vision-based heuristic to jointly learn constituency-structure and dependency-structure grammars. Our experiments find that concreteness is a strong indicator for learning dependency grammars, improving the direct attachment score (DAS) by over 50% as compared to…
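
The abstract reports results in terms of DAS but the excerpt here does not define it. As a rough illustration only, the sketch below assumes DAS is the usual unlabeled attachment metric: the fraction of tokens whose predicted head matches the gold head. The function name `attachment_score`, the head-index encoding (1-based positions, 0 for the root), and the toy sentence are all assumptions made for this example and are not taken from the paper or its repository.

```python
from typing import Sequence


def attachment_score(
    gold_heads: Sequence[Sequence[int]],
    pred_heads: Sequence[Sequence[int]],
) -> float:
    """Fraction of tokens whose predicted head equals the gold head.

    Each inner sequence holds one head index per token of a sentence
    (assumed encoding: 1-based token positions, 0 for the root).
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted corpora differ in sentence count")

    correct = 0
    total = 0
    for gold, pred in zip(gold_heads, pred_heads):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences differ in length")
        # Count tokens attached to the same head in both analyses.
        correct += sum(1 for g, p in zip(gold, pred) if g == p)
        total += len(gold)

    # An empty corpus has no attachments to score.
    return correct / total if total else 0.0


# Toy three-token sentence (hypothetical): the parser gets two heads right.
gold = [[2, 0, 2]]
pred = [[2, 0, 1]]
score = attachment_score(gold, pred)
```

Under this reading, a claim such as "improving DAS by over 50%" compares two such scores on the same evaluation set; whether the paper's figure is a relative or an absolute gain cannot be determined from the truncated abstract.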

Code & Models

Repositories

ruisi-su/concrete_dep
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Speech and dialogue systems