Towards Better UD Parsing: Deep Contextualized Word Embeddings, Ensemble, and Treebank Concatenation

Wanxiang Che; Yijia Liu; Yuxuan Wang; Bo Zheng; Ting Liu

arXiv:1807.03121 · cs.CL · July 31, 2018 · 105 cites

TL;DR

This paper presents the HIT-SCIR multilingual dependency parser, which combines deep contextualized word embeddings, ensembling, and treebank concatenation, and which ranked first by LAS (75.84%) in the CoNLL 2018 shared task.

Contribution

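It extends Stanford's winning system from the CoNLL 2017 shared task by incorporating deep contextualized word embeddings into both the part-of-speech tagger and the parser, ensembling parsers trained with different initializations, and exploring ways of concatenating treebanks.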

Findings

01. Ranked first in the shared task with an LAS of 75.84%

02. Deep contextualized word embeddings, used in both the tagger and the parser, improved accuracy on the development data

03. Ensembling and treebank concatenation contributed further gains

Abstract

This paper describes our system (HIT-SCIR) submitted to the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. We base our submission on Stanford's winning system for the CoNLL 2017 shared task and make two effective extensions: 1) incorporating deep contextualized word embeddings into both the part of speech tagger and parser; 2) ensembling parsers trained with different initialization. We also explore different ways of concatenating treebanks for further improvements. Experimental results on the development data show the effectiveness of our methods. In the final evaluation, our system was ranked first according to LAS (75.84%) and outperformed the other systems by a large margin.
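The ensembling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each parser produces, for every token, a score for each candidate head, and that the ensemble averages the per-token softmax distributions before picking a head. The function names and the toy scores are hypothetical, and the greedy argmax used here is a simplification that does not guarantee a well-formed tree; it is not presented as the authors' implementation.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_heads(per_model_scores):
    """Average head distributions across models and pick the best head.

    per_model_scores[k][i][j] is the score model k assigns to candidate
    head j for token i + 1 (candidate 0 is the artificial ROOT).
    Returns one predicted head index per token.
    """
    n_models = len(per_model_scores)
    n_tokens = len(per_model_scores[0])
    heads = []
    for i in range(n_tokens):
        # One softmax distribution per model for this token.
        probs = [softmax(model[i]) for model in per_model_scores]
        n_candidates = len(probs[0])
        avg = [sum(p[j] for p in probs) / n_models for j in range(n_candidates)]
        # Greedy choice; a real parser would decode a valid tree instead.
        heads.append(max(range(n_candidates), key=avg.__getitem__))
    return heads


# Toy scores from two hypothetical parsers for a two-token sentence.
model_a = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
model_b = [[0.0, 1.0, 0.0], [0.0, 3.0, 0.0]]
predicted = ensemble_heads([model_a, model_b])
print(predicted)
```

In the toy input the two models disagree on the first token (one prefers ROOT, the other prefers token 1); averaging lets the more confident model decide, which is the intuition behind ensembling parsers that differ only in their random initialization.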

Code & Models

Repositories

HIT-SCIR/ELMoForManyLangs (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification