Self-Translate-Train: Enhancing Cross-Lingual Transfer of Large Language Models via Inherent Capability
Ryokan Ri, Shun Kiyono, Sho Takase

TL;DR
This paper introduces Self-Translate-Train, a method where large language models translate training data into target languages and fine-tune on this data, improving cross-lingual transfer especially for low-resource languages.
Contribution
It proposes a novel approach that leverages the model's inherent capabilities to translate and fine-tune on its own generated data, enhancing cross-lingual transfer performance.
Findings
Self-Translate-Train outperforms zero-shot transfer methods.
Models capture useful cross-lingual correspondence even when fine-tuning does not generalize effectively across languages.
The results encourage further research into eliciting the cross-lingual capabilities of LLMs.
Abstract
Zero-shot cross-lingual transfer by fine-tuning multilingual pretrained models shows promise for low-resource languages, but often suffers from misalignment of internal representations between languages. We hypothesize that even when the model cannot generalize across languages effectively in fine-tuning, it still captures cross-lingual correspondence useful for cross-lingual transfer. We explore this hypothesis with Self-Translate-Train, a method that lets large language models (LLMs) translate training data into the target language and then fine-tunes the model on its own generated data. By demonstrating that Self-Translate-Train outperforms zero-shot transfer, we encourage further exploration of better methods to elicit cross-lingual capabilities of LLMs.
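The two steps named in the abstract (the model translates its own training data, then is fine-tuned on the result) can be sketched as below. This is a minimal illustration, not the authors' implementation: the prompt template, the `generate` and `fine_tune` callables, the `input`/`lang` field names, and the `include_source` flag are all hypothetical stand-ins for whatever model API and data format are actually used.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def build_prompt(text: str, target_lang: str) -> str:
    # Hypothetical prompt template; the summary does not give the paper's prompt.
    return (
        f"Translate the following text into {target_lang}.\n\n"
        f"Text: {text}\nTranslation:"
    )


def self_translate(
    examples: List[Example],
    generate: Callable[[str], str],
    target_lang: str,
    field: str = "input",
) -> List[Example]:
    """Translate one text field of each example with the model's own generator."""
    translated = []
    for ex in examples:
        new_ex = dict(ex)  # copy so labels and other fields carry over unchanged
        new_ex[field] = generate(build_prompt(ex[field], target_lang)).strip()
        new_ex["lang"] = target_lang
        translated.append(new_ex)
    return translated


def self_translate_train(
    examples: List[Example],
    generate: Callable[[str], str],
    fine_tune: Callable[[List[Example]], None],
    target_lang: str,
    include_source: bool = False,
) -> List[Example]:
    """Generate target-language data with the model, then fine-tune on it.

    `include_source` is an assumption added for illustration: it mixes the
    original source-language examples back into the fine-tuning set.
    """
    synthetic = self_translate(examples, generate, target_lang)
    train_set = examples + synthetic if include_source else synthetic
    fine_tune(train_set)
    return synthetic


def toy_generate(prompt: str) -> str:
    # Toy stand-in for an LLM call: echoes the source text with a tag.
    text = prompt.split("Text: ", 1)[1].rsplit("\nTranslation:", 1)[0]
    return f"[translated] {text}"
```

In practice `generate` and `fine_tune` would both wrap the same LLM, which is what makes the procedure "self" translation; the toy generator above only exists so the control flow can be run without a model.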
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
