Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection
Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman

TL;DR
Universal Dependencies v2 introduces updated guidelines and expands a multilingual treebank collection, enabling consistent syntactic annotation across 90 languages within a dependency-based framework.
Contribution
This paper presents the new UD v2 guidelines and provides an overview of the extensive multilingual treebank collection now available.
Findings
Treebanks are now available for 90 languages
Major updates to annotation guidelines from UD v1 to UD v2
Facilitates cross-linguistic syntactic analysis
Abstract
Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. The annotation consists in a linguistically motivated word segmentation; a morphological layer comprising lemmas, universal part-of-speech tags, and standardized morphological features; and a syntactic layer focusing on syntactic relations between predicates, arguments and modifiers. In this paper, we describe version 2 of the guidelines (UD v2), discuss the major changes from UD v1 to UD v2, and give an overview of the currently available treebanks for 90 languages.
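The annotation layers named in the abstract can be made concrete with a small sketch. UD treebanks are distributed in the tab-separated CoNLL-U format, with ten columns per word (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The Python below is an illustration only, not tooling from the paper: the names Token, parse_feats, and parse_sentence are hypothetical, and the two-word sample is hand-written with a simplified feature set rather than taken from any released treebank.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One syntactic word, keeping only the layers named in the abstract."""
    id: int        # position of the word in the sentence (segmentation layer)
    form: str      # surface form
    lemma: str     # morphological layer: lemma
    upos: str      # morphological layer: universal part-of-speech tag
    feats: dict    # morphological layer: features such as Number=Plur
    head: int      # syntactic layer: ID of the head word, 0 for the root
    deprel: str    # syntactic layer: relation to the head


def parse_feats(raw):
    """Turn 'Number=Plur|Tense=Pres' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_sentence(block):
    """Parse one CoNLL-U sentence block into a list of Token objects."""
    tokens = []
    for line in block.strip().splitlines():
        if line.startswith("#"):
            continue  # sentence-level comment such as '# text = ...'
        cols = line.split("\t")
        # Keep plain word lines only; multiword token ranges ('1-2') and
        # empty nodes ('1.1') have non-integer IDs and are skipped here.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        tokens.append(Token(
            id=int(cols[0]),
            form=cols[1],
            lemma=cols[2],
            upos=cols[3],
            feats=parse_feats(cols[5]),
            head=int(cols[6]),
            deprel=cols[7],
        ))
    return tokens


# Hand-written sample for illustration (assumption: simplified features).
SAMPLE = "\n".join([
    "# text = Dogs bark",
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_",
    "2\tbark\tbark\tVERB\t_\tMood=Ind|Number=Plur|Tense=Pres\t0\troot\t_\t_",
])

if __name__ == "__main__":
    for tok in parse_sentence(SAMPLE):
        print(tok.id, tok.form, tok.lemma, tok.upos, tok.feats, tok.head, tok.deprel)
```

Each Token carries all three layers side by side, which mirrors how a single CoNLL-U line combines segmentation, morphology, and syntax for one word. The sketch drops the XPOS, DEPS, and MISC columns for brevity.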
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
