Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence

Letian Peng; Zuchao Li; Hai Zhao

arXiv:2110.15931 · cs.CL · November 1, 2021

Open Access

TL;DR

This paper introduces an unsupervised, training-free method for full constituency parsing using Neighboring Distribution Divergence, achieving state-of-the-art results in unlabeled parsing and strong baselines for labeled parsing.

Contribution

The paper proposes a novel unsupervised labeling procedure based on NDD that requires no training, and it reports state-of-the-art unlabeled F1 scores for unsupervised constituency parsing.

Findings

01 Achieves state-of-the-art unlabeled F1 scores.

02 Provides strong baselines for labeled F1.

03 Demonstrates effectiveness of NDD in constituency parsing.
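
The unlabeled and labeled F1 scores mentioned above are bracket-level scores. As a minimal sketch, assuming constituents are represented as `(start, end)` spans and labeled constituents as `(start, end, label)` tuples (this representation and the helper name `bracket_f1` are illustrative, not taken from the paper), the two scores could be computed like this:

```python
def bracket_f1(pred, gold):
    """F1 over two collections of constituents.

    Items are hashable tuples: (start, end) for unlabeled scoring,
    (start, end, label) for labeled scoring. Hypothetical helper.
    """
    pred, gold = set(pred), set(gold)
    matched = len(pred & gold)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


def strip_labels(constituents):
    """Drop the label so only the span boundaries are compared."""
    return {(start, end) for start, end, _ in constituents}


# Toy example with made-up spans over a five-token sentence.
gold = {(0, 5, "S"), (0, 2, "NP"), (2, 5, "VP"), (3, 5, "NP")}
pred = {(0, 5, "S"), (0, 2, "NP"), (2, 5, "NP"), (1, 5, "VP")}

unlabeled_f1 = bracket_f1(strip_labels(pred), strip_labels(gold))
labeled_f1 = bracket_f1(pred, gold)
```

In this toy case three of four predicted spans match the gold spans, but only two also carry the right label, so the labeled score is lower than the unlabeled one. Published evaluations often apply further conventions (for example, discarding trivial spans or punctuation) that this sketch does not model.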

Abstract

Unsupervised constituency parsing has been widely explored but remains far from solved. Conventional unsupervised constituency parsers can capture only the unlabeled structure of sentences. Towards unsupervised full constituency parsing, we propose an unsupervised, training-free labeling procedure that exploits a property of a recently introduced metric, Neighboring Distribution Divergence (NDD), which evaluates the semantic similarity between sentences before and after edits. For implementation, we develop NDD into Dual POS-NDD (DP-NDD) and build "molds" to detect constituents and their labels in sentences. We show that DP-NDD not only labels constituents precisely but also induces more accurate unlabeled constituency trees than all previous unsupervised methods, and does so with simpler rules. With two frameworks for labeled constituency tree inference, we set both the new…
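
The abstract does not give the NDD formula, so the following is only a rough sketch of the general idea of comparing predicted distributions at neighboring positions before and after an edit. It assumes the score is a weighted sum of KL divergences over the unedited neighboring positions; the function names, the weighting scheme, and the toy distributions are all hypothetical, and in practice the distributions would come from a pre-trained language model rather than being hard-coded.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def neighboring_divergence(before, after, weights=None):
    """Assumed form of an NDD-style score (not the paper's exact definition).

    before / after: one predicted distribution per shared neighboring
    position, computed on the sentence before and after the edit.
    weights: optional per-position weights (assumption: uniform if omitted).
    """
    if len(before) != len(after):
        raise ValueError("before and after must cover the same positions")
    if weights is None:
        weights = [1.0] * len(before)
    return sum(w * kl_divergence(p, q) for w, p, q in zip(weights, before, after))


# Toy distributions over a three-word vocabulary at two neighboring positions.
before = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
after_mild_edit = [[0.68, 0.22, 0.10], [0.12, 0.78, 0.10]]
after_harsh_edit = [[0.2, 0.3, 0.5], [0.5, 0.2, 0.3]]

mild = neighboring_divergence(before, after_mild_edit)
harsh = neighboring_divergence(before, after_harsh_edit)
```

Under this assumed form, an edit that barely shifts the neighboring predictions yields a small score and an edit that disrupts them yields a large one, which is the kind of before/after signal the abstract describes.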


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications