UDPipe at EvaLatin 2020: Contextualized Embeddings and Treebank Embeddings
Milan Straka, Jana Straková

TL;DR
This paper describes a system based on UDPipe 2.0 that places first in lemmatization and POS tagging in the open modality of EvaLatin 2020, using contextualized embeddings and treebank embeddings.
Contribution
The paper introduces a Latin NLP system that combines contextualized embeddings with treebank embeddings and achieves top-ranking results in the EvaLatin 2020 shared task.
Findings
Open modality: first place in both lemmatization and POS tagging.
Closed modality: best in lemmatization and in the classical subtask of POS tagging; second place in the cross-genre and cross-time settings.
Contextualized embeddings significantly improve results.
Abstract
We present our contribution to the EvaLatin shared task, which is the first evaluation campaign devoted to the evaluation of NLP tools for Latin. We submitted a system based on UDPipe 2.0, one of the winners of the CoNLL 2018 Shared Task, The 2018 Shared Task on Extrinsic Parser Evaluation and SIGMORPHON 2019 Shared Task. Our system places first by a wide margin both in lemmatization and POS tagging in the open modality, where additional supervised data is allowed, in which case we utilize all Universal Dependency Latin treebanks. In the closed modality, where only the EvaLatin training data is allowed, our system achieves the best performance in lemmatization and in classical subtask of POS tagging, while reaching second place in cross-genre and cross-time settings. In the ablation experiments, we also evaluate the influence of BERT and XLM-RoBERTa contextualized embeddings, and the…
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Translation Studies and Practices
Methods: Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections
