Improved Latent Tree Induction with Distant Supervision via Span Constraints

Zhiyang Xu; Andrew Drozdov; Jay Yoon Lee; Tim O'Gorman; Subendhu Rongali; Dylan Finkbeiner; Shilpa Suresh; Mohit Iyyer; Andrew McCallum

arXiv:2109.05112·cs.CL·November 3, 2021

TL;DR

This paper uses a relatively small number of span constraints, obtained through distant supervision, to substantially improve the unsupervised constituency parses produced by DIORA, a step toward making such parsers useful in practice.

Contribution

The authors propose a novel approach leveraging span constraints from distant supervision to enhance latent tree induction in unsupervised parsing systems.

Findings

01. Span constraints improve WSJ parsing by over 5 F1 points.

02. Method extends effectively to biomedical text parsing.

03. Minimal span constraints can be derived from simple sources like Wikipedia.
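
For readers unfamiliar with the metric behind the first finding, the sketch below computes unlabeled bracketing F1 between a predicted and a gold set of constituent spans. It is a minimal illustration only: the function name `bracket_f1` and the span representation are assumptions made here, and the paper's exact evaluation protocol (for example, how trivial spans or punctuation are handled) is not described on this page.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive (assumed convention)


def bracket_f1(predicted: Iterable[Span], gold: Iterable[Span]) -> float:
    """Unlabeled bracketing F1 for one sentence, in the range [0, 1]."""
    pred, ref = set(predicted), set(gold)
    overlap = len(pred & ref)
    # With no matching brackets, precision and recall are both zero.
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans for a five-token sentence.
predicted_spans = [(0, 2), (2, 5), (0, 5)]
gold_spans = [(0, 2), (3, 5), (0, 5)]
score = bracket_f1(predicted_spans, gold_spans)
```

Scores are usually reported multiplied by 100, so a gain of "5 F1 points" corresponds to a difference of 0.05 on this scale.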

Abstract

For over thirty years, researchers have developed and analyzed methods for latent tree induction as an approach for unsupervised syntactic parsing. Nonetheless, modern systems still do not perform well enough compared to their supervised counterparts to have any practical use as structural annotation of text. In this work, we present a technique that uses distant supervision in the form of span constraints (i.e. phrase bracketing) to improve performance in unsupervised constituency parsing. Using a relatively small number of span constraints we can substantially improve the output from DIORA, an already competitive unsupervised parsing system. Compared with full parse tree annotation, span constraints can be acquired with minimal effort, such as with a lexicon derived from Wikipedia, to find exact text matches. Our experiments show span constraints based on entities improves…
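
The idea of obtaining span constraints from exact lexicon matches can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the names `find_span_constraints` and `crosses`, the toy lexicon, and the `max_len` cutoff are all hypothetical, and the way constraints enter DIORA's training objective is not shown.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive (assumed convention)


def find_span_constraints(
    tokens: List[str],
    lexicon: Set[Tuple[str, ...]],
    max_len: int = 5,
) -> List[Span]:
    """Return multi-token spans whose tokens exactly match a lexicon entry."""
    spans = []
    n = len(tokens)
    for start in range(n):
        # Only spans of two or more tokens are informative as brackets.
        for end in range(start + 2, min(n, start + max_len) + 1):
            if tuple(tokens[start:end]) in lexicon:
                spans.append((start, end))
    return spans


def crosses(candidate: Span, constraint: Span) -> bool:
    """True if the two spans overlap without one containing the other."""
    a, b = candidate
    c, d = constraint
    return (a < c < b < d) or (c < a < d < b)


# Hypothetical lexicon, e.g. entity names harvested from Wikipedia titles.
lexicon = {("New", "York", "City"), ("New", "York")}
tokens = "She moved to New York City last year".split()
constraints = find_span_constraints(tokens, lexicon)
```

A tree respects a constraint when none of its constituents crosses the constrained span; here a constituent covering "to New" would cross the bracket around "New York City".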

Code & Models

Repositories

iesl/distantly-supervised-diora
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification