Syntactic Transfer to Kyrgyz Using the Treebank Translation Method
Anton Alekseev, Alina Tillabaeva, Gulnara Dzh. Kabaeva, Sergey I. Nikolenko

TL;DR
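This paper introduces a tool, based on the treebank translation method, that transfers syntactic annotations from Turkish to Kyrgyz. Evaluated on the TueCL treebank, it is reported to be more accurate than a monolingual model trained on the Kyrgyz KTMU treebank, which simplifies syntactic corpus development for low-resource Kyrgyz.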
Contribution
It presents a tool for transferring syntactic annotations from Turkish to Kyrgyz and a method for assessing how complex the resulting syntactic trees are to annotate manually. Both are intended to simplify Kyrgyz syntactic corpus creation.
Findings
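Higher syntactic annotation accuracy on the TueCL treebank than a monolingual model trained on the Kyrgyz KTMU treebank
Effective transfer of syntactic annotations from Turkish to Kyrgyz
A new method for assessing the complexity of manually annotating the resulting syntactic trees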
Abstract
The Kyrgyz language, as a low-resource language, requires significant effort to create high-quality syntactic corpora. This study proposes an approach to simplify the development process of a syntactic corpus for Kyrgyz. We present a tool for transferring syntactic annotations from Turkish to Kyrgyz based on a treebank translation method. The effectiveness of the proposed tool was evaluated using the TueCL treebank. The results demonstrate that this approach achieves higher syntactic annotation accuracy compared to a monolingual model trained on the Kyrgyz KTMU treebank. Additionally, the study introduces a method for assessing the complexity of manual annotation for the resulting syntactic trees, contributing to further optimization of the annotation process.
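The abstract does not describe the tool's internals, so the following is only a minimal sketch of the general idea behind annotation transfer through translation, not the authors' implementation. It assumes that each Turkish token is translated into exactly one Kyrgyz token (a one-to-one alignment, which real sentence pairs will often violate) and copies heads and dependency labels across that alignment. The names `Token`, `project_tree`, and `attachment_scores` are hypothetical. Unlabeled and labeled attachment scores are used here as a common stand-in for "syntactic annotation accuracy"; the abstract does not say which metric the paper reports. The sentence pair is toy data chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str
    head: int    # 1-based index of the head token, 0 for the root
    deprel: str  # dependency relation label


def project_tree(source, target_forms, alignment):
    """Copy heads and labels from a source tree onto target tokens.

    `alignment` maps 1-based source indices to 1-based target indices.
    ASSUMPTION: the alignment is one-to-one and covers every token.
    """
    n = len(source)
    expected = list(range(1, n + 1))
    if (len(target_forms) != n
            or sorted(alignment.keys()) != expected
            or sorted(alignment.values()) != expected):
        raise ValueError("this sketch only handles one-to-one alignments")

    projected = [None] * n
    for s_idx, tok in enumerate(source, start=1):
        t_idx = alignment[s_idx]
        # The root stays the root; any other head is remapped via the alignment.
        t_head = 0 if tok.head == 0 else alignment[tok.head]
        projected[t_idx - 1] = Token(target_forms[t_idx - 1], t_head, tok.deprel)
    return projected


def attachment_scores(predicted, gold):
    """Return (UAS, LAS): the share of tokens with a correct head, and
    with a correct head and label, respectively."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("trees must be non-empty and of equal length")
    uas = sum(p.head == g.head for p, g in zip(predicted, gold))
    las = sum(p.head == g.head and p.deprel == g.deprel
              for p, g in zip(predicted, gold))
    return uas / len(gold), las / len(gold)


# Toy example (illustrative data, not taken from TueCL or KTMU).
turkish = [Token("Ben", 3, "nsubj"), Token("kitap", 3, "obj"),
           Token("okudum", 0, "root")]
kyrgyz_forms = ["Мен", "китеп", "окудум"]
kyrgyz_tree = project_tree(turkish, kyrgyz_forms, {1: 1, 2: 2, 3: 3})
```

Because Turkish and Kyrgyz are closely related and share a similar word order, the identity alignment in the toy example is plausible for short sentences. The `ValueError` marks the point where a practical tool would need a strategy for unaligned or many-to-one tokens.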
Taxonomy
Topics: Linguistics and Cultural Studies · Natural Language Processing Techniques
