Parsing as Pretraining

David Vilares; Michalina Strzyz; Anders Søgaard; Carlos Gómez-Rodríguez

arXiv:2002.01685·cs.CL·February 6, 2020·6 cites

Open Access · 2 Repos

TL;DR

This paper shows that pretraining architectures can be used directly for full constituent and dependency parsing of English, with no decoding step. It casts parsing as sequence tagging, surpasses existing sequence tagging parsers, and compares how sensitive different word vectors are to syntax.

Contribution

It performs full parsing with only pretrained encoders and a single feed-forward layer that maps each word vector to a label, with no separate decoding step.
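As a rough illustration of that setup, the sketch below applies one linear layer to each word vector and picks the best-scoring label for every word independently. It is a minimal toy under stated assumptions, not the authors' implementation: the function names `linear` and `tag`, the 2-dimensional vectors, the hand-set weights, and the label strings are all hypothetical. In the frozen setting the summary describes, only the layer's weights and bias would be trained while the encoder producing the vectors stays fixed.

```python
from typing import List, Sequence


def linear(x: Sequence[float],
           weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """One feed-forward layer: one score per label, computed as W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def tag(word_vectors: Sequence[Sequence[float]],
        weights: Sequence[Sequence[float]],
        bias: Sequence[float],
        label_names: Sequence[str]) -> List[str]:
    """Choose the highest-scoring label for each word on its own (no decoder)."""
    tags = []
    for vec in word_vectors:
        scores = linear(vec, weights, bias)
        best = max(range(len(scores)), key=scores.__getitem__)
        tags.append(label_names[best])
    return tags


# Hypothetical toy values: three labels, 2-dimensional "word vectors".
LABELS = ["+1_nsubj", "+0_root", "-1_obj"]
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B = [0.0, 0.0, 0.0]
VECTORS = [[2.0, 0.5], [0.1, 1.5], [-1.0, -2.0]]

predicted = tag(VECTORS, W, B, LABELS)
print(predicted)
```

Because each word is labeled independently, the output is a plain tag sequence; any tree structure has to be carried by the label inventory itself.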

Findings

01 Surpasses existing sequence tagging parsers on PTB with 93.5% F1

02 Achieves 78.8% LAS on end-to-end EN-EWT UD

03 Analyzes syntax-sensitivity of different word vectors

Abstract

Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this problem and does full parsing (on English) relying only on pretraining architectures -- and no decoding. We first cast constituent and dependency parsing as sequence tagging. We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree. This is used to: (i) see how far we can reach on syntax modelling with just pretrained encoders, and (ii) shed some light on the syntax-sensitivity of different word vectors (by freezing the weights of the pretraining network during training). For evaluation, we use bracketing F1-score and LAS, and analyze in-depth differences…
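To make "labels that encode a linearized tree" concrete, here is a minimal sketch for the dependency case. It assumes a simple relative-offset encoding, where each word's label records the signed distance to its head plus the relation name; this is an illustrative choice and not necessarily the encoding the paper uses. The function names `encode` and `decode` and the example sentence are hypothetical.

```python
from typing import List, Tuple


def encode(heads: List[int], rels: List[str]) -> List[str]:
    """Turn a dependency tree into one label per word.

    heads are 1-indexed word positions, with 0 marking the root.
    Each label is '<signed offset to head>_<relation>'; the root gets offset 0,
    which is unambiguous because no word can be its own head.
    """
    labels = []
    for i, (head, rel) in enumerate(zip(heads, rels), start=1):
        offset = 0 if head == 0 else head - i
        labels.append(f"{offset:+d}_{rel}")
    return labels


def decode(labels: List[str]) -> Tuple[List[int], List[str]]:
    """Recover heads and relations from the per-word labels."""
    heads, rels = [], []
    for i, label in enumerate(labels, start=1):
        offset_str, rel = label.split("_", 1)
        offset = int(offset_str)
        heads.append(0 if offset == 0 else i + offset)
        rels.append(rel)
    return heads, rels


# "She reads books": 'reads' is the root, the other two words attach to it.
heads = [2, 0, 2]
rels = ["nsubj", "root", "obj"]

labels = encode(heads, rels)
print(labels)
print(decode(labels))
```

With a scheme like this, a tagger that outputs one label per word yields a full tree without a parsing algorithm on top. A tagger can still predict labels that do not form a valid tree, so a practical system would need some handling for that case.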

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis