Neural Combinatory Constituency Parsing
Zhousi Chen, Longtu Zhang, Aizhan Imankulova, and Mamoru Komachi

TL;DR
This paper introduces two fast neural models for constituency parsing that decompose the process into classification and vector composition, achieving high accuracy and efficiency on multiple languages.
Contribution
The paper presents novel binary and multi-branching neural models with theoretical and empirical efficiency improvements for constituency parsing.
Findings
Binary model achieves 92.54 F1 on Penn Treebank
Models with XLNet attain near state-of-the-art accuracy
Language-specific branching tendency and headedness are observed during training and inference on the Penn, Chinese, and Keyaki (Japanese) treebanks
Abstract
We propose two fast neural combinatory models for constituency parsing: binary and multi-branching. Our models decompose the bottom-up parsing process into 1) classification of tags, labels, and binary orientations or chunks and 2) vector composition based on the computed orientations or chunks. These models have theoretical sub-quadratic complexity and empirical linear complexity. The binary model achieves an F1 score of 92.54 on Penn Treebank at a speed of 1327.2 sentences per second. Both models with XLNet provide near state-of-the-art accuracies for English. Syntactic branching tendency and headedness of a language are observed during the training and inference processes for Penn Treebank, Chinese Treebank, and Keyaki Treebank (Japanese).
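The two-step decomposition in the abstract (classify orientations, then compose vectors according to them) can be illustrated with a minimal sketch of one bottom-up step for the binary case. Everything here is an assumption made for illustration: the orientations are passed in rather than predicted by a classifier, the names `combine_layer` and `compose` are hypothetical, the element-wise average is a placeholder rather than the paper's composition function, and the rule of merging two neighbours only when they point at each other is one plausible reading of "composition based on the computed orientations".

```python
from typing import List

Vector = List[float]

RIGHT, LEFT = "R", "L"  # hypothetical orientation labels


def compose(a: Vector, b: Vector) -> Vector:
    """Placeholder composition: element-wise average of two child vectors."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def combine_layer(vectors: List[Vector], orientations: List[str]) -> List[Vector]:
    """One bottom-up step: merge adjacent pairs whose orientations face each other.

    A node at position i is merged with its right neighbour only when it is
    oriented RIGHT and the neighbour is oriented LEFT. All other nodes are
    passed through unchanged to the next layer.
    """
    assert len(vectors) == len(orientations)
    out: List[Vector] = []
    i = 0
    while i < len(vectors):
        agrees = (
            i + 1 < len(vectors)
            and orientations[i] == RIGHT
            and orientations[i + 1] == LEFT
        )
        if agrees:
            out.append(compose(vectors[i], vectors[i + 1]))
            i += 2  # both children are consumed by the new parent
        else:
            out.append(vectors[i])
            i += 1
    return out


# Three toy one-dimensional "word" vectors and assumed orientations.
layer1 = combine_layer([[1.0], [3.0], [5.0]], [RIGHT, LEFT, LEFT])
layer2 = combine_layer(layer1, [RIGHT, LEFT])
```

In this toy run the first two nodes merge in the first step and the remaining pair merges in the second, leaving a single root vector. A real parser would re-run the classifier on each new layer to obtain fresh orientations, tags, and labels; here they are supplied by hand.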
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Dropout · Layer Normalization · Byte Pair Encoding · Adam
