OLA: Output Language Alignment in Code-Switched LLM Interactions
Juhyun Oh, Haneul Yoo, Faiz Ghifari Haznitrama, Alice Oh

TL;DR
This paper introduces OLA, a benchmark for evaluating how well large language models (LLMs) align their responses with the expected output language in code-switched interactions. It reveals a systematic bias toward non-English responses and proposes fine-tuning as a remedy.
Contribution
The paper presents OLA, a new benchmark for assessing LLMs' output language alignment in code-switched interactions, and demonstrates that fine-tuning on a minimal amount of data improves alignment.
Findings
LLMs often respond in unintended languages during code-switching.
Bias toward non-English responses persists across multiple language pairs.
Fine-tuning on a minimal amount of data significantly improves language alignment.
Abstract
Code-switching, alternating between languages within a conversation, is natural for multilingual users, yet poses fundamental challenges for large language models (LLMs). When a user code-switches in their prompt to an LLM, they typically do not specify the expected language of the LLM response, and thus LLMs must infer the output language from contextual and pragmatic cues. We find that current LLMs systematically fail to align with this expectation, responding in undesired languages even when cues are clear to humans. We introduce OLA, a benchmark to evaluate LLMs' Output Language Alignment in code-switched interactions. OLA focuses on Korean–English code-switching and spans simple intra-sentential mixing to instruction-content mismatches. Even frontier models frequently misinterpret implicit language expectation, exhibiting a bias toward non-English responses. We further show this…
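To make the idea of output language alignment concrete, here is a minimal sketch of how a response could be scored against an expected language for the Korean-English case. It is not the benchmark's actual evaluation procedure: the function names, the script-based heuristic, and the 0.5 threshold are all assumptions made for illustration, and a real evaluation would likely need a proper language identifier to handle responses that themselves mix languages.

```python
def is_hangul(ch: str) -> bool:
    """True if ch is in a Hangul Unicode block (syllables, jamo, compatibility jamo)."""
    cp = ord(ch)
    return (
        0xAC00 <= cp <= 0xD7A3      # Hangul Syllables
        or 0x1100 <= cp <= 0x11FF   # Hangul Jamo
        or 0x3130 <= cp <= 0x318F   # Hangul Compatibility Jamo
    )


def hangul_ratio(text: str) -> float:
    """Fraction of alphabetic characters in text that are Hangul."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(1 for ch in letters if is_hangul(ch)) / len(letters)


def detect_language(text: str, threshold: float = 0.5) -> str:
    """Hypothetical heuristic: label text 'ko' if at least `threshold` of its
    letters are Hangul, else 'en'. Text with no letters falls back to 'en'."""
    return "ko" if hangul_ratio(text) >= threshold else "en"


def alignment_rate(examples: list[tuple[str, str]]) -> float:
    """Share of (response, expected_language) pairs whose detected language
    matches the expected one. Returns 0.0 for an empty list."""
    if not examples:
        return 0.0
    hits = sum(1 for response, expected in examples
               if detect_language(response) == expected)
    return hits / len(examples)


# Illustrative data only, not taken from the benchmark.
examples = [
    ("안녕하세요, 반갑습니다.", "ko"),  # Korean response, Korean expected
    ("Hello there.", "ko"),            # English response, Korean expected
    ("This is fine.", "en"),           # English response, English expected
]
rate = alignment_rate(examples)
```

A character-ratio heuristic like this misjudges heavily mixed responses, which is one reason the threshold is exposed as a parameter rather than fixed.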
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Neurobiology of Language and Bilingualism
