TL;DR
This paper investigates cross-lingual dependency parsing for Faroese, comparing annotation projection methods from multiple sources and polyglot models, and finds that monolingual source projections combined with multi-treebank training yield the best target language results.
Contribution
It introduces a comparative analysis of projection approaches and multi-treebank modeling for low-resource Faroese dependency parsing, highlighting effective strategies for cross-lingual transfer.
Findings
Polyglot training on the source languages produces an overall trend of better parsing results on the target language.
Best target results come from monolingual source projections combined with multi-treebank training.
Multi-source projection and multi-treebank modelling both help when parsing a low-resource language such as Faroese.
Abstract
Cross-lingual dependency parsing involves transferring syntactic knowledge from one language to another. It is a crucial component for inducing dependency parsers in low-resource scenarios where no training data for a language exists. Using Faroese as the target language, we compare two approaches using annotation projection: first, projecting from multiple monolingual source models; second, projecting from a single polyglot model which is trained on the combination of all source languages. Furthermore, we reproduce multi-source projection (Tyers et al., 2018), in which dependency trees of multiple sources are combined. Finally, we apply multi-treebank modelling to the projected treebanks, in addition to or alternatively to polyglot modelling on the source side. We find that polyglot training on the source languages produces an overall trend of better results on the target language but…
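To make the idea of combining dependency trees from multiple sources concrete, here is a minimal Python sketch. The abstract does not describe the combination procedure, so this uses plain per-token majority voting over projected head indices. That is a simplifying assumption, not the method of Tyers et al. (2018) or of this paper, and unlike a spanning-tree decoder it does not guarantee a well-formed tree. The function name `combine_projected_heads` and the input layout are hypothetical.

```python
from collections import Counter


def combine_projected_heads(projections):
    """Combine head predictions projected from several source languages.

    Assumed (hypothetical) input layout: `projections` is a list with one
    entry per source language. Each entry is a list with one head per target
    token: a 1-based token index, 0 for the root, or None when no head could
    be projected for that token.
    """
    if not projections:
        raise ValueError("need at least one projection")
    n_tokens = len(projections[0])
    if any(len(p) != n_tokens for p in projections):
        raise ValueError("all projections must cover the same tokens")

    combined = []
    for i in range(n_tokens):
        # Count the votes for each candidate head of token i,
        # ignoring sources that projected nothing for this token.
        votes = Counter(p[i] for p in projections if p[i] is not None)
        if not votes:
            combined.append(None)  # no source projected a head
        else:
            # Ties go to the head that was seen first in source order.
            combined.append(votes.most_common(1)[0][0])
    return combined


# Three hypothetical sources voting on a three-token sentence.
heads = combine_projected_heads([[2, 0, 2], [2, 0, 2], [3, 0, 2]])
```

Per-token voting is the simplest possible combiner. A more faithful setup would treat the votes as arc weights and decode a maximum spanning tree, so that the output always has exactly one root and no cycles.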
