A Conditional Splitting Framework for Efficient Constituency Parsing
Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li

TL;DR
This paper presents a versatile seq2seq parsing framework that models constituency parsing as conditional splitting. The framework enables efficient top-down decoding and applies to both syntactic and discourse parsing, with competitive results.
Contribution
It introduces a novel conditional splitting approach that simplifies parsing, supports efficient inference, and covers both syntactic and discourse parsing, the latter without requiring discourse segmentation as a prior step.
Findings
Achieves competitive syntactic parsing results both with and without pre-trained models.
Outperforms the state of the art in discourse parsing.
Supports efficient top-down decoding that is linear in the number of nodes.
Abstract
We introduce a generic seq2seq parsing framework that casts constituency parsing problems (syntactic and discourse parsing) into a series of conditional splitting decisions. Our parsing model estimates the conditional probability distribution of possible splitting points in a given text span and supports efficient top-down decoding, which is linear in the number of nodes. The conditional splitting formulation, together with efficient beam search inference, facilitates structural consistency without relying on expensive structured inference. Crucially, for discourse analysis we show that in our formulation, discourse segmentation can be framed as a special case of parsing, which allows us to perform discourse parsing without requiring segmentation as a prerequisite. Experiments show that our model achieves good results on the standard syntactic parsing tasks under settings with/without…
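The top-down splitting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `greedy_split_parse` and `toy_score` are hypothetical, the scorer is a stand-in for the learned conditional distribution over split points, and greedy selection replaces the beam search the paper uses. Spans are assumed to be half-open token ranges and the tree is assumed to be binary.

```python
from typing import Callable, List, Tuple

# A split decision (i, k, j) divides the half-open span [i, j) into [i, k) and [k, j).
Split = Tuple[int, int, int]


def greedy_split_parse(
    n: int, split_score: Callable[[int, int, int], float]
) -> List[Split]:
    """Build a binary tree over n leaves by repeatedly splitting spans top-down.

    split_score(i, k, j) is a hypothetical stand-in for the model's score of
    splitting span [i, j) at position k. Greedy argmax is used here in place
    of beam search (an assumption made to keep the sketch short).
    """
    decisions: List[Split] = []
    stack = [(0, n)]  # spans still waiting to be split
    while stack:
        i, j = stack.pop()
        if j - i < 2:
            continue  # a single leaf (or empty span) cannot be split further
        # Pick the highest-scoring split point strictly inside the span.
        k = max(range(i + 1, j), key=lambda m: split_score(i, m, j))
        decisions.append((i, k, j))
        # Push the right child first so the left child is expanded next.
        stack.append((k, j))
        stack.append((i, k))
    return decisions


def toy_score(i: int, k: int, j: int) -> float:
    """Illustrative scorer only: prefers split points near the span midpoint."""
    return -abs((i + j) / 2 - k)
```

Calling `greedy_split_parse(4, toy_score)` makes exactly three split decisions, one per internal node of a binary tree over four leaves, which is the sense in which the number of decisions grows linearly with the number of nodes. In this sketch each decision still scans every candidate split point in its span; how the actual model computes the split distribution is not specified in the text above.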
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Sequence to Sequence
