TL;DR
This paper introduces a fast, GPU-optimized neural CRF-based constituency parser that achieves state-of-the-art accuracy and high parsing speed, leveraging a novel batching strategy and boundary-based scoring architecture.
Contribution
It presents a novel GPU-efficient batching method for CRF loss computation and a boundary representation scoring architecture for improved parsing accuracy.
Findings
Achieves state-of-the-art results on PTB, CTB5.1, and CTB7 datasets.
Parses over 1,000 sentences per second with high accuracy.
Introduces a two-stage bracketing-then-labeling approach for efficiency.
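The two-stage idea in the last finding can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes fencepost span indices, a hypothetical `span_scores` matrix of unlabeled bracket scores, and a hypothetical `label_scores` tensor with one score per label for each span. Stage one runs CKY over the unlabeled scores only; stage two picks the best label for each span that survived.

```python
import numpy as np

def cky_bracket(span_scores):
    """Stage 1: spans of the highest-scoring unlabeled binary tree.

    span_scores[i, j] is an (assumed) score for the span between
    fenceposts i and j, with 0 <= i < j <= n.
    """
    n = span_scores.shape[0] - 1
    best = np.zeros((n + 1, n + 1))
    split = np.zeros((n + 1, n + 1), dtype=int)
    for i in range(n):
        best[i, i + 1] = span_scores[i, i + 1]
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            ks = np.arange(i + 1, j)            # candidate split points
            totals = best[i, ks] + best[ks, j]  # best left + best right
            arg = int(np.argmax(totals))
            split[i, j] = ks[arg]
            best[i, j] = span_scores[i, j] + totals[arg]
    # Follow the stored split points back from the root span.
    spans, stack = [], [(0, n)]
    while stack:
        i, j = stack.pop()
        spans.append((i, j))
        if j - i > 1:
            k = int(split[i, j])
            stack.extend([(k, j), (i, k)])
    return spans

def label_spans(spans, label_scores, labels):
    """Stage 2: choose the highest-scoring label for each bracketed span."""
    return [(i, j, labels[int(np.argmax(label_scores[i, j]))]) for i, j in spans]
```

Because the chart in stage one carries no label dimension, its cost does not grow with the size of the label set; labels are only scored for the spans of the chosen tree.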
Abstract
Estimating probability distribution is one of the core issues in the NLP field. However, in both deep learning (DL) and pre-DL eras, unlike the vast applications of linear-chain CRF in sequence labeling tasks, very few works have applied tree-structure CRF to constituency parsing, mainly due to the complexity and inefficiency of the inside-outside algorithm. This work presents a fast and accurate neural CRF constituency parser. The key idea is to batchify the inside algorithm for loss computation by direct large tensor operations on GPU, and meanwhile avoid the outside algorithm for gradient computation via efficient back-propagation. We also propose a simple two-stage bracketing-then-labeling parsing approach to improve efficiency further. To improve the parsing performance, inspired by recent progress in dependency parsing, we introduce a new scoring architecture based on boundary…
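To make the "batchify the inside algorithm" idea concrete, here is a minimal NumPy sketch of the forward pass. It is an illustration under stated assumptions rather than the paper's code: it assumes a span-factored model (a tree's score is the sum of its span scores), fencepost indexing, finite scores, and a hypothetical padded input `span_scores` of shape `(B, n+1, n+1)`. All spans of the same width, across all sentences in the batch, are updated in one array operation.

```python
import numpy as np

def logsumexp(x, axis):
    """Numerically stable log-sum-exp along one axis (finite inputs assumed)."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def batched_inside(span_scores, lengths):
    """Log-partition over binary trees for each sentence in a batch.

    span_scores: (B, n+1, n+1), entry [b, i, j] scores span (i, j), i < j.
    lengths:     (B,) integer sentence lengths, each <= n.
    """
    B, N, _ = span_scores.shape
    n = N - 1
    alpha = np.full((B, N, N), -np.inf)
    starts = np.arange(n)
    # Width-1 spans: a single word has exactly one analysis.
    alpha[:, starts, starts + 1] = span_scores[:, starts, starts + 1]
    for width in range(2, n + 1):
        i = np.arange(n - width + 1)                    # left boundaries
        j = i + width                                   # right boundaries
        k = i[:, None] + np.arange(1, width)[None, :]   # split points per span
        left = alpha[:, i[:, None], k]                  # (B, spans, width-1)
        right = alpha[:, k, j[:, None]]                 # (B, spans, width-1)
        alpha[:, i, j] = span_scores[:, i, j] + logsumexp(left + right, axis=2)
    # Each sentence reads its own root cell, so padding cells are ignored.
    return alpha[np.arange(B), 0, lengths]
```

With all scores set to zero, the result for a sentence of length n is the log of the number of binary trees over n words, which makes a convenient sanity check. In an automatic-differentiation framework, the same loop written with GPU tensors would yield gradients of the log-partition by back-propagation, which is how the abstract describes avoiding an explicit outside pass.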
Taxonomy
Methods: Linear Layer · WordPiece · Dense Connections · Linear Warmup With Linear Decay · Layer Normalization · Attention Is All You Need · Multi-Head Attention · Adam · Residual Connection
