Unsupervised Full Constituency Parsing with Neighboring Distribution Divergence
Letian Peng, Zuchao Li, Hai Zhao

TL;DR
This paper introduces an unsupervised, training-free method for full constituency parsing using Neighboring Distribution Divergence, achieving state-of-the-art results in unlabeled parsing and strong baselines for labeled parsing.
Contribution
It proposes a novel unsupervised labeling procedure based on NDD, enabling accurate constituency parsing without training, and sets new benchmarks for unlabeled F1 scores.
Findings
Achieves state-of-the-art unlabeled F1 scores.
Provides strong baselines for labeled F1.
Demonstrates effectiveness of NDD in constituency parsing.
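For readers unfamiliar with the metrics named above, here is a minimal sketch of how unlabeled and labeled bracketing F1 are commonly computed from constituent spans. The span encoding, the helper names, and the toy trees are assumptions made for illustration. The paper's exact evaluation protocol (for example, the handling of trivial spans or punctuation) is not described on this page.

```python
from typing import Set, Tuple

# Assumed encoding: (start, end, label), token offsets with end exclusive.
LabeledSpan = Tuple[int, int, str]


def f1(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall over two sets of brackets."""
    matched = len(predicted & gold)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


def unlabeled(spans: Set[LabeledSpan]) -> Set[Tuple[int, int]]:
    """Drop labels so only the bracket boundaries are compared."""
    return {(start, end) for start, end, _ in spans}


# Toy five-token sentence; both trees are invented for this example.
gold = {(0, 5, "S"), (0, 2, "NP"), (2, 5, "VP"), (3, 5, "NP")}
pred = {(0, 5, "S"), (0, 2, "NP"), (2, 5, "NP"), (2, 4, "VP")}

unlabeled_f1 = f1(unlabeled(pred), unlabeled(gold))  # 3 of 4 brackets match
labeled_f1 = f1(pred, gold)                          # 2 of 4 also match on label
print(f"unlabeled F1 = {unlabeled_f1:.2f}, labeled F1 = {labeled_f1:.2f}")
```

The span (2, 5) counts toward unlabeled F1 but not labeled F1 because its predicted label differs from the gold label, which is why labeled scores are never higher than unlabeled ones for the same trees.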
Abstract
Unsupervised constituency parsing has been widely explored but is still far from solved. Conventional unsupervised constituency parsers are only able to capture the unlabeled structure of sentences. Towards unsupervised full constituency parsing, we propose an unsupervised and training-free labeling procedure that exploits a property of a recently introduced metric, Neighboring Distribution Divergence (NDD), which evaluates the semantic similarity between sentences before and after edits. For implementation, we develop NDD into Dual POS-NDD (DP-NDD) and build "molds" to detect constituents and their labels in sentences. We show that DP-NDD not only labels constituents precisely but also induces more accurate unlabeled constituency trees than all previous unsupervised methods, with simpler rules. With two frameworks for labeled constituency tree inference, we set both the new…
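The abstract does not give the NDD formula, so the following is only a loose sketch of the idea its name suggests: compare the distributions a language model predicts for the unedited neighboring tokens before and after an edit, and sum a divergence over them. The choice of KL divergence, its direction, the lack of any distance weighting, and the `toy_predict` stand-in for a real masked language model are all assumptions, not the authors' definition.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Distribution = Dict[str, float]
# Hypothetical interface: (tokens, position) -> predicted distribution.
Predictor = Callable[[Sequence[str], int], Distribution]


def kl_divergence(p: Distribution, q: Distribution, eps: float = 1e-12) -> float:
    """KL(p || q), flooring q at eps to avoid division by zero."""
    return sum(
        pv * math.log(pv / max(q.get(token, 0.0), eps))
        for token, pv in p.items()
        if pv > 0
    )


def neighbor_divergence(
    before: Sequence[str],
    after: Sequence[str],
    neighbors: List[Tuple[int, int]],
    predict: Predictor,
) -> float:
    """Sum the divergence over unedited positions.

    `neighbors` pairs each unedited token's index in `before`
    with its index in `after`.
    """
    return sum(
        kl_divergence(predict(after, j), predict(before, i))
        for i, j in neighbors
    )


def toy_predict(tokens: Sequence[str], index: int) -> Distribution:
    """Invented stand-in for a masked LM: looks only at the left neighbor."""
    left = tokens[index - 1] if index > 0 else "<s>"
    if left == "the":
        return {"noun": 0.8, "verb": 0.1, "other": 0.1}
    return {"noun": 0.3, "verb": 0.4, "other": 0.3}


sentence = ["the", "cat", "sleeps"]
# Substituting "cat" -> "dog" leaves the toy predictions unchanged.
substitution = neighbor_divergence(
    sentence, ["the", "dog", "sleeps"], [(0, 0), (2, 2)], toy_predict
)
# Deleting "the" changes what the toy model predicts at "cat".
deletion = neighbor_divergence(
    sentence, ["cat", "sleeps"], [(1, 0), (2, 1)], toy_predict
)
print(substitution, deletion)
```

Under these assumptions, a low score means the edit barely disturbs the model's expectations for the surrounding tokens. How DP-NDD and the "molds" turn such scores into constituents and labels is not specified in the text available here.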
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
