Cross-Lingual Transfer and Parameter-Efficient Adaptation in the Turkic Language Family: A Theoretical Framework for Low-Resource Language Models
O. Ibrahimzade, K. Tabasaransky

TL;DR
This paper develops a theoretical framework for cross-lingual transfer and efficient adaptation of multilingual language models within the Turkic language family, emphasizing typological similarities and resource disparities.
Contribution
It introduces the Turkic Transfer Coefficient (TTC) to quantify transfer potential and integrates insights from multilingual learning and parameter-efficient fine-tuning techniques.
Findings
The framework explains how typological similarity facilitates transfer.
It models how adaptation performance depends on model capacity and data.
It identifies limits of parameter-efficient adaptation in low-resource settings.
Abstract
Large language models (LLMs) have transformed natural language processing, yet their capabilities remain uneven across languages. Most multilingual models are trained primarily on high-resource languages, leaving many languages with large speaker populations underrepresented in both training data and evaluation benchmarks. This imbalance is particularly visible in the Turkic language family. This paper proposes a theoretical framework for studying cross-lingual transfer and parameter-efficient adaptation of multilingual LLMs within the Turkic language family, focusing on Azerbaijani, Kazakh, Uzbek, Turkmen, and Gagauz. These languages share substantial typological and morphological similarity while differing greatly in available digital resources, making them a natural setting for analyzing multilingual adaptation strategies. We integrate insights from multilingual representation…
