Discontinuous Constituent Parsing as Sequence Labeling

David Vilares; Carlos Gómez-Rodríguez

arXiv:2010.00633·cs.CL·October 5, 2020

TL;DR

This paper casts discontinuous constituent parsing as sequence labeling by encoding tree discontinuities as nearly ordered permutations of the input sequence. The resulting models are fast and accurate despite their architectural simplicity.

Contribution

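It introduces an encoding of discontinuous trees as permutations of the input sequence and shows that these representations are learnable. This fills a gap left by existing reductions of constituent parsing to sequence labeling, which do not support discontinuities.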

Findings

01  Models are fast and accurate with the proposed encoding

02  Discontinuous representations are learnable with simple architectures

03  The approach outperforms previous methods on relevant benchmarks

Abstract

This paper reduces discontinuous parsing to sequence labeling. It first shows that existing reductions for constituent parsing as labeling do not support discontinuities. Second, it fills this gap and proposes to encode tree discontinuities as nearly ordered permutations of the input sequence. Third, it studies whether such discontinuous representations are learnable. The experiments show that despite the architectural simplicity, under the right representation, the models are fast and accurate.
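
To make the idea of "nearly ordered permutations" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy tree format (a leaf is a word index, an internal node is a `(label, children)` tuple) and uses a per-word relative offset as the label, chosen here only for illustration; the paper's actual label schemes are not described in this summary. The names `continuous_order`, `offset_labels`, and `count_inversions` are hypothetical.

```python
from typing import List, Tuple, Union

# Assumed toy format: a leaf is a word index, a node is (label, children).
Tree = Union[int, Tuple[str, list]]


def leftmost(tree: Tree) -> int:
    """Smallest word index covered by this subtree."""
    if isinstance(tree, int):
        return tree
    return min(leftmost(child) for child in tree[1])


def continuous_order(tree: Tree) -> List[int]:
    """Word order in which every constituent becomes a contiguous span.

    Children are visited by their leftmost word, so a tree without
    discontinuities yields the identity permutation.
    """
    if isinstance(tree, int):
        return [tree]
    order: List[int] = []
    for child in sorted(tree[1], key=leftmost):
        order.extend(continuous_order(child))
    return order


def offset_labels(order: List[int]) -> List[int]:
    """One label per word: how far it moves in the reordered sentence."""
    position = {word: pos for pos, word in enumerate(order)}
    return [position[i] - i for i in range(len(order))]


def count_inversions(order: List[int]) -> int:
    """Number of out-of-order pairs; small values mean 'nearly ordered'."""
    n = len(order)
    return sum(1 for i in range(n) for j in range(i + 1, n) if order[i] > order[j])


# Toy discontinuous tree over 5 words: VP covers words 0, 2, 3 and skips word 1.
tree = ("S", [("VP", [0, ("V", [2, 3])]), ("NP", [1]), 4])

order = continuous_order(tree)      # [0, 2, 3, 1, 4]
labels = offset_labels(order)       # [0, 2, -1, -1, 0]
inversions = count_inversions(order)  # 2
print(order, labels, inversions)
```

In this toy case only one word moves far from its original position, which is why the permutation has few inversions. A tagger would predict one such label per word, and the reordered sentence could then be handed to any continuous constituent-parsing-as-labeling scheme.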

Code & Models

Repositories

aghie/disco2labels (PyTorch, official)
