Universal Dependency Parsing from Scratch

Peng Qi; Timothy Dozat; Yuhao Zhang; Christopher D. Manning

arXiv:1901.10457·cs.CL·January 30, 2019·1 cites

TL;DR

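This paper presents Stanford's fully neural pipeline for Universal Dependencies parsing from raw text, covering every stage from tokenization to dependency parsing. The system was very competitive on big treebanks and, after a bug fix, would have ranked in the top three on all official metrics. The paper also analyzes individual model components through ablation studies.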

Contribution

Introduces a complete neural pipeline that takes raw text as input and performs all tasks required by the CoNLL 2018 UD Shared Task: tokenization, sentence segmentation, POS tagging, and dependency parsing.

Findings

01

After an unfortunate bug was fixed, the corrected system would have placed 2nd, 1st, and 3rd on the official LAS, MLAS, and BLEX metrics, respectively.

02

The corrected system would have outperformed all submitted systems on the low-resource treebank categories, on all metrics and by a large margin.

03

Extensive ablation studies show the effectiveness of the individual model components.

Abstract

This paper describes Stanford's system at the CoNLL 2018 UD Shared Task. We introduce a complete neural pipeline system that takes raw text as input, and performs all tasks required by the shared task, ranging from tokenization and sentence segmentation, to POS tagging and dependency parsing. Our single system submission achieved very competitive performance on big treebanks. Moreover, after fixing an unfortunate bug, our corrected system would have placed the 2nd, 1st, and 3rd on the official evaluation metrics LAS, MLAS, and BLEX, and would have outperformed all submission systems on low-resource treebank categories on all metrics by a large margin. We further show the effectiveness of different model components through extensive ablation studies.
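
For readers unfamiliar with LAS (labeled attachment score), the following is a minimal sketch of the idea, not the shared task's official scorer. It assumes the predicted and gold sentences share the same tokenization, so each score reduces to a simple fraction; the official evaluation has to align system tokens with gold tokens because systems start from raw text. The function name `attachment_scores` and the toy sentence are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Each token is (head_index, dependency_relation); head 0 denotes the root.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sentence, assuming identical tokenization."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted token counts differ")
    if not gold:
        return 0.0, 0.0
    # UAS counts tokens whose predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS additionally requires the relation label to match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, labeled_hits / n


# Hypothetical three-token sentence: "She reads books"
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]  # right head, wrong label on "books"

uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")  # UAS=1.00 LAS=0.67
```

In this toy case all three heads are correct but one label is wrong, so UAS is 3/3 while LAS is 2/3. MLAS and BLEX build on the same attachment idea with additional requirements and are not sketched here.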

Code & Models

Repositories

stanfordnlp/stanfordnlp
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications