A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs

Vaibhav Singh; Amrith Krishna; Karthika NJ; Ganesh Ramakrishnan

arXiv:2406.17377·cs.CL·June 26, 2024·1 cites

TL;DR

This paper explores three methods to improve cross-lingual adaptation of large language models to low-resource languages, demonstrating that additional supervision, language reordering, and continued pre-training can enhance performance.

Contribution

It introduces three novel approaches for cross-lingual transfer in LLMs, specifically targeting low-resource languages like Bengali, Hindi, and Tamil, with empirical validation.

Findings

01

Adding supervisory signals improves transfer performance.

02

Word reordering of the target language helps under in-context learning but less so under fine-tuning.

03

Continued pre-training on one low-resource language aids related languages.

Abstract

Low-resource languages, by their very definition, tend to be underrepresented in the pre-training corpora of Large Language Models. In this work, we investigate three low-resource cross-lingual approaches that enable an LLM to adapt to tasks in previously unseen languages. Llama-2 is an LLM in which Indic languages, among many other language families, contribute less than $0.005\%$ of the total $2$ trillion token pre-training corpus. We experiment with the English-dominated Llama-2 for cross-lingual transfer to three Indic target languages: Bengali, Hindi, and Tamil. We study three approaches for cross-lingual transfer, under in-context learning (ICL) and fine-tuning. One, we find that adding supervisory signals via a dominant language in the LLM leads to improvements under both in-context learning and fine-tuning. Two, adapting the target languages to word reordering may…
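To put the quoted share in absolute terms, the following is a back-of-the-envelope sketch that uses only the two figures stated in the abstract (a 2 trillion token corpus and an Indic share below 0.005%). The function and constant names are illustrative assumptions, not anything from the paper.

```python
# Rough upper bound on Indic-language tokens in Llama-2's pre-training corpus,
# derived only from the two figures quoted in the abstract.
# All names here are hypothetical and chosen for illustration.

TOTAL_TOKENS = 2 * 10**12      # "2 trillion token" pre-training corpus
SHARE_NUMERATOR = 5            # 0.005% written as the fraction 5 / 100_000
SHARE_DENOMINATOR = 100_000


def indic_token_upper_bound(total_tokens: int = TOTAL_TOKENS) -> int:
    """Return the token count corresponding to a 0.005% share of the corpus.

    Integer arithmetic is used so the result is exact.
    """
    return total_tokens * SHARE_NUMERATOR // SHARE_DENOMINATOR


if __name__ == "__main__":
    bound = indic_token_upper_bound()
    print(f"Indic languages account for fewer than {bound:,} tokens")
```

Under these figures, all Indic languages together account for fewer than 100 million of the roughly 2 trillion pre-training tokens, which is why the abstract treats them as effectively unseen by the model.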

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling