Code-Switching Curriculum Learning for Multilingual Transfer in LLMs
Haneul Yoo, Cheonbok Park, Sangdoo Yun, Alice Oh, Hwaran Lee

TL;DR
This paper introduces code-switching curriculum learning (CSCL), a training approach inspired by human language acquisition, to improve cross-lingual transfer in large language models, especially benefiting low-resource languages.
Contribution
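The paper proposes CSCL, a three-stage curriculum (token-level code-switching, then sentence-level code-switching, then monolingual corpora) that mimics the stages of human language learning, and reports improved cross-lingual transfer in LLMs across several languages and models.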
Findings
CSCL improves transfer to Korean, Japanese, and Indonesian.
Both token- and sentence-level code-switching are crucial for effectiveness.
CSCL reduces spurious correlations and enhances safety alignment.
Abstract
Large language models (LLMs) now exhibit near human-level performance in various tasks, but their performance drops drastically after a handful of high-resource languages due to the imbalance in pre-training data. Inspired by the human process of second language acquisition, particularly code-switching (the practice of language alternation in a conversation), we propose code-switching curriculum learning (CSCL) to enhance cross-lingual transfer for LLMs. CSCL mimics the stages of human language learning by progressively training models with a curriculum consisting of 1) token-level code-switching, 2) sentence-level code-switching, and 3) monolingual corpora. Using Qwen 2 as our underlying model, we demonstrate the efficacy of the CSCL in improving language transfer to Korean, achieving significant performance gains compared to monolingual continual…
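The three-stage ordering described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the word-level `lexicon` dictionary, the `ratio` parameter, and the alternate-every-other-sentence rule are all assumptions made for illustration, and the paper may construct its code-switched data quite differently.

```python
import random


def token_level_switch(sentence, lexicon, ratio, rng):
    """Swap words for their target-language entry (hypothetical lexicon).

    Each word found in `lexicon` is replaced with probability `ratio`.
    """
    out = []
    for word in sentence.split():
        target = lexicon.get(word.lower())
        if target is not None and rng.random() < ratio:
            out.append(target)
        else:
            out.append(word)
    return " ".join(out)


def sentence_level_switch(src_sentences, tgt_sentences):
    """Alternate languages sentence by sentence over aligned text.

    Assumption: even positions keep the source sentence, odd positions
    take the aligned target sentence.
    """
    mixed = []
    for i, (src, tgt) in enumerate(zip(src_sentences, tgt_sentences)):
        mixed.append(src if i % 2 == 0 else tgt)
    return mixed


def build_curriculum(src_sentences, tgt_sentences, lexicon, ratio=0.5, seed=0):
    """Return the training stages in curriculum order (illustrative only)."""
    rng = random.Random(seed)
    stage1 = [token_level_switch(s, lexicon, ratio, rng) for s in src_sentences]
    stage2 = sentence_level_switch(src_sentences, tgt_sentences)
    stage3 = list(tgt_sentences)  # monolingual target-language text
    return [
        ("token_level", stage1),
        ("sentence_level", stage2),
        ("monolingual", stage3),
    ]
```

A training loop would then consume the stages in the returned order, for example continuing pre-training on each stage's text before moving to the next. How much data each stage receives is not stated in the text above and is left open here.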
Taxonomy
Topics: Second Language Learning and Teaching
