82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models

Aaron Smith; Bernd Bohnet; Miryam de Lhoneux; Joakim Nivre; Yan Shao; Sara Stymne

arXiv:1809.02237 · cs.CL · September 10, 2018

TL;DR

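This paper presents the Uppsala system for the CoNLL 2018 Shared Task on universal dependency parsing. Its parsing models are trained on multiple treebanks for one language or closely related languages, which greatly reduces the number of models. The system obtained the best overall scores for word segmentation, universal POS tagging, and morphological features.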

Contribution

Instead of training a single parsing model for each treebank, the authors train each model on multiple treebanks from one language or closely related languages, greatly reducing the number of models.

Findings

01

Ranked 7th of 27 teams on the LAS and MLAS metrics in the official test run.

02

Obtained the best overall scores for word segmentation, universal POS tagging, and morphological features.

03

Trained parsing models on multiple treebanks per language or group of closely related languages, greatly reducing the number of models compared with one model per treebank.
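
For readers unfamiliar with the metric, LAS (labeled attachment score) is the share of words that receive both the correct head and the correct dependency label. The following is a minimal Python sketch, assuming gold tokenization so that predicted and gold words align one-to-one. The shared task's evaluation also has to handle segmentation mismatches, which this sketch does not. The function name and toy data are illustrative and not taken from the paper.

```python
from typing import List, Tuple

# One word's analysis: (index of its head, dependency label).
# Head index 0 denotes the artificial root.
Arc = Tuple[int, str]


def labeled_attachment_score(gold: List[Arc], pred: List[Arc]) -> float:
    """Fraction of words whose head AND label both match the gold tree.

    Simplifying assumption: both sequences cover the same words in the
    same order (no segmentation differences).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must align one-to-one")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)


# Toy three-word sentence: the third word gets the right head
# but the wrong label, so it does not count as correct.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]
las = labeled_attachment_score(gold, pred)
```

On this toy input two of the three words are fully correct, so the score is 2/3.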

Abstract

We present the Uppsala system for the CoNLL 2018 Shared Task on universal dependency parsing. Our system is a pipeline consisting of three components: the first performs joint word and sentence segmentation; the second predicts part-of-speech tags and morphological features; the third predicts dependency trees from words and tags. Instead of training a single parsing model for each treebank, we trained models with multiple treebanks for one language or closely related languages, greatly reducing the number of models. On the official test run, we ranked 7th of 27 teams for the LAS and MLAS metrics. Our system obtained the best scores overall for word segmentation, universal POS tagging, and morphological features.
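
The model-count reduction described in the abstract can be pictured as a many-to-one mapping from treebanks to parsing models. The sketch below is an illustration only: the treebank identifiers, the group names, and the helper `group_treebanks` are hypothetical placeholders and do not reproduce the groupings actually used by the authors.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical assignment of treebanks to models (placeholder names).
# Treebanks of one language, or of closely related languages, share a model.
TREEBANK_TO_MODEL: Dict[str, str] = {
    "langA-news": "langA+langB",
    "langA-web": "langA+langB",
    "langB-news": "langA+langB",  # langB assumed closely related to langA
    "langC-news": "langC",
    "langD-wiki": "langD",
}


def group_treebanks(assignment: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert the treebank -> model mapping into model -> list of treebanks."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for treebank, model in sorted(assignment.items()):
        groups[model].append(treebank)
    return dict(groups)


models = group_treebanks(TREEBANK_TO_MODEL)
n_treebanks = len(TREEBANK_TO_MODEL)
n_models = len(models)
```

In this toy setup five treebanks are covered by three models. The paper's title reports the corresponding real figures: 82 treebanks covered by 34 models.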
