Think Multilingual, Not Harder: A Data-Efficient Framework for Teaching Reasoning Models to Code-Switch

Eleanor M. Lin; David Jurgens

arXiv:2604.15490 · cs.CL · April 20, 2026

TL;DR

This paper introduces a data-efficient fine-tuning framework to enhance beneficial code-switching behaviors in reasoning large language models, based on analyzing diverse reasoning traces and applying targeted interventions.

Contribution

The paper presents what the authors describe as the first linguistically and behaviorally motivated fine-tuning framework for identifying beneficial code-switched reasoning behaviors and teaching reasoning models to use them.

Findings

1. Fine-tuning increases beneficial code-switching behaviors in models.

2. Analysis reveals diverse code-switching behaviors across models and tasks.

3. Fine-tuning on unrelated tasks like translation can influence reasoning behaviors.

Abstract

Recent developments in reasoning capabilities have enabled large language models to solve increasingly complex mathematical, symbolic, and logical tasks. Interestingly, while reasoning models are often trained to generate monolingual text, these models have also been observed to code-switch (i.e., mix languages). Prior works have either viewed code-switching as an undesirable error, attempted to control code-switching through modifications to input prompts or the output decoding process, or focused on narrow subsets of languages, domains, tasks, and models. We address these gaps by introducing the first linguistically and behaviorally motivated fine-tuning framework for identifying beneficial code-switched reasoning behaviors in large language models and teaching these models to code-switch more effectively for reasoning. First, we create and systematically analyze a dataset of reasoning…
