Learning Language Structures through Grounding
Freda Shi

TL;DR
This dissertation explores learning language structures through grounding in other data sources, reporting improvements in syntactic parsing, semantic program induction, and cross-lingual alignment by using visual data, program execution results, and multilingual data.
Contribution
It introduces methods for grounding language structures in visual data, program execution results, and cross-lingual data, advancing syntactic parsing, semantic understanding, and multilingual alignment.
Findings
Visual grounding improves syntactic parsing quality.
A new evaluation metric is introduced for speech parsing without text.
State-of-the-art cross-lingual word alignment is achieved.
Abstract
Language is highly structured, with syntactic and semantic structures, to some extent, agreed upon by speakers of the same language. With implicit or explicit awareness of such structures, humans can learn and use language efficiently and generalize to sentences that contain unseen words. Motivated by human language learning, in this dissertation, we consider a family of machine learning tasks that aim to learn language structures through grounding. We seek distant supervision from other data sources (i.e., grounds), including but not limited to other modalities (e.g., vision), execution results of programs, and other languages. We demonstrate the potential of this task formulation and advocate for its adoption through three schemes. In Part I, we consider learning syntactic parses through visual grounding. We propose the task of visually grounded grammar induction, present the first…
Taxonomy
Topics: EFL/ESL Teaching and Learning · Multilingual Education and Policy · Second Language Learning and Teaching
