TL;DR
This paper casts discontinuous constituent parsing as sequence labeling by encoding tree discontinuities as nearly ordered permutations of the input sequence. The resulting models are fast and accurate despite their architectural simplicity.
Contribution
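It shows that existing reductions of constituent parsing to sequence labeling cannot represent discontinuities, fills this gap with an encoding of discontinuous trees as nearly ordered permutations of the input sequence, and studies whether these representations are learnable.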
Findings
Models are fast and accurate with the proposed encoding
Discontinuous representations are learnable with simple architectures
The approach outperforms previous methods on relevant benchmarks
Abstract
This paper reduces discontinuous parsing to sequence labeling. It first shows that existing reductions for constituent parsing as labeling do not support discontinuities. Second, it fills this gap and proposes to encode tree discontinuities as nearly ordered permutations of the input sequence. Third, it studies whether such discontinuous representations are learnable. The experiments show that despite the architectural simplicity, under the right representation, the models are fast and accurate.
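To make the permutation idea concrete, here is a minimal sketch of how a discontinuous tree can be turned into a reordering of the input and then into one label per token. It is an illustration under stated assumptions, not the paper's implementation: the nested-tuple tree format, the example sentence, and the function names (`continuous_order`, `encode_positions`, `decode_positions`) are all hypothetical, and the label scheme shown (each token's target position) is only one simple way to write a permutation as a label sequence.

```python
from typing import List

# Hypothetical tree format: a leaf is an int (token index in the input),
# an internal node is a tuple (label, [children]).
# Example sentence (illustrative): 0=What 1=should 2=I 3=do 4=?
# The VP covers "What ... do", so its yield {0, 3} is discontinuous.
TREE = ("SQ", [("VP", [0, 3]), 1, 2, 4])


def leftmost(tree) -> int:
    """Smallest token index covered by a subtree."""
    if isinstance(tree, int):
        return tree
    return min(leftmost(child) for child in tree[1])


def continuous_order(tree) -> List[int]:
    """Token order in which every constituent becomes a contiguous span."""
    if isinstance(tree, int):
        return [tree]
    order: List[int] = []
    # Visit children by their leftmost token, keeping each subtree together.
    for child in sorted(tree[1], key=leftmost):
        order.extend(continuous_order(child))
    return order


def encode_positions(order: List[int]) -> List[int]:
    """One label per input token: its position in the reordered sequence."""
    labels = [0] * len(order)
    for new_pos, token in enumerate(order):
        labels[token] = new_pos
    return labels


def decode_positions(labels: List[int]) -> List[int]:
    """Invert encode_positions: recover the reordered token sequence."""
    order = [0] * len(labels)
    for token, new_pos in enumerate(labels):
        order[new_pos] = token
    return order


order = continuous_order(TREE)
labels = encode_positions(order)
offsets = [label - i for i, label in enumerate(labels)]  # relative variant
```

In this example only one token moves far from its original slot, which is what "nearly ordered" suggests: most labels stay close to the identity permutation. Once the tokens are reordered, the tree is continuous, so a labeling scheme for continuous trees could in principle be applied to the permuted sequence.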
