Self-Translate-Train: Enhancing Cross-Lingual Transfer of Large Language Models via Inherent Capability

Ryokan Ri; Shun Kiyono; Sho Takase

arXiv:2407.00454·cs.CL·September 18, 2024

TL;DR

This paper introduces Self-Translate-Train, a method where large language models translate training data into target languages and fine-tune on this data, improving cross-lingual transfer especially for low-resource languages.

Contribution

It proposes a novel approach that leverages the model's inherent capabilities to translate and fine-tune on its own generated data, enhancing cross-lingual transfer performance.

Findings

01

Self-Translate-Train outperforms zero-shot transfer methods.

02

Models capture useful cross-lingual correspondence even without effective generalization.

03

The results encourage further research into eliciting the cross-lingual capabilities of LLMs.

Abstract

Zero-shot cross-lingual transfer by fine-tuning multilingual pretrained models shows promise for low-resource languages, but often suffers from misalignment of internal representations between languages. We hypothesize that even when the model cannot generalize across languages effectively in fine-tuning, it still captures cross-lingual correspondence useful for cross-lingual transfer. We explore this hypothesis with Self-Translate-Train, a method that lets large language models (LLMs) translate training data into the target language and then fine-tunes the model on its own generated data. By demonstrating that Self-Translate-Train outperforms zero-shot transfer, we encourage further exploration of better methods to elicit cross-lingual capabilities of LLMs.
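
The data-construction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `build_self_translated_set`, the `keep_source` flag, the `text`/`label`/`lang` field names, and the stub `toy_translate` are all hypothetical. In practice the `translate` callable would prompt the same LLM that is later fine-tuned, and whether the original source-language examples are mixed in with the translated ones is an assumption exposed here as a parameter.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def build_self_translated_set(
    examples: List[Example],
    translate: Callable[[str, str], str],
    target_lang: str,
    text_field: str = "text",
    keep_source: bool = True,
) -> List[Example]:
    """Build a fine-tuning set from source data plus model-made translations.

    `translate(text, target_lang)` stands in for a call to the LLM itself
    (hypothetical interface). Labels are copied unchanged; only the text
    field is translated.
    """
    out: List[Example] = []
    for ex in examples:
        if keep_source:
            # Assumption: optionally keep the original source-language example.
            out.append({**ex, "lang": "source"})
        translated_text = translate(ex[text_field], target_lang)
        out.append({**ex, text_field: translated_text, "lang": target_lang})
    return out


def toy_translate(text: str, target_lang: str) -> str:
    """Placeholder translator; a real run would generate with the LLM."""
    return f"[{target_lang}] {text}"


train = [{"text": "2 + 3 = ?", "label": "5"}]
augmented = build_self_translated_set(train, toy_translate, "ja")
for row in augmented:
    print(row)
```

The resulting list would then be passed to whatever fine-tuning routine is already in use; the sketch deliberately leaves that step out because the abstract does not specify it.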

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling