Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki's Submission to VarDial 2017

Jörg Tiedemann

arXiv:1708.05719 · cs.CL · August 22, 2017

TL;DR

This paper presents a cross-lingual dependency parsing approach based on annotation projection and treebank translation. It gives good results for all three target languages in the shared task (Slovak, Croatian, and Norwegian), in some cases surpassing fully supervised models trained on the target-language treebank.

Contribution

It applies annotation projection and treebank translation to dependency parsing with resources from closely related languages, improving over the baseline for all three target languages, though only modestly for Croatian.

Findings

01

Slovak parsing benefits from Czech treebank data.

02

Cross-lingual models outperform fully supervised models in some cases.

03

Norwegian parsing improves with Swedish data, whereas Danish contributes surprisingly little.

Abstract

This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores for cross-lingual models even surpass the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish whereas Danish contributes surprisingly little.
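
The abstract compares models by attachment score. As a minimal sketch of how such scores are commonly computed (not the shared task's official evaluation script), the Python below assumes each token is represented as a hypothetical `(head, label)` pair and reports unlabeled and labeled attachment scores (UAS and LAS). The function name `attachment_scores` and the toy sentence are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 conventionally denotes the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions of tokens in [0, 1].

    UAS counts tokens whose predicted head matches the gold head.
    LAS additionally requires the dependency label to match.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sequence")

    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return uas_hits / n, las_hits / n


# Toy three-token sentence (hypothetical data): all heads are correct,
# but the label of the last token is wrong.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.3f}, LAS = {las:.3f}")
```

In this toy case every head is predicted correctly, so UAS is 1.0, while one wrong label out of three tokens brings LAS down to 2/3. Evaluation scripts used in practice may differ in details such as punctuation handling.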
