Minimal Pair-Based Evaluation of Code-Switching
Igor Sterner, Simone Teufel

TL;DR
This paper introduces a minimal pair-based evaluation method for assessing how closely large language models match human code-switching behaviour. It finds that larger models more consistently prefer naturally occurring code-switched sentences, especially where closed-class words are involved.
Contribution
It proposes a novel minimal pair-based evaluation methodology for code-switching that covers diverse languages and phenomena, and demonstrates its effectiveness in comparing LLM preferences with those of bilingual humans.
Findings
Larger LLMs prefer naturally occurring CS sentences more consistently than smaller models do.
Bilingual humans consistently prefer the naturally occurring CS sentence in every language pair.
The probability differences that models assign within a pair are largest where closed-class words are involved.
Abstract
There is a lack of an evaluation methodology that estimates the extent to which large language models (LLMs) use code-switching (CS) in the same way as bilinguals. Existing methods do not have wide language coverage, fail to account for the diverse range of CS phenomena, or do not scale. We propose an intervention based on minimal pairs of CS. Each minimal pair contains one naturally occurring CS sentence and one minimally manipulated variant. We collect up to 1,000 such pairs each for 11 language pairs. Our human experiments show that, for every language pair, bilinguals consistently prefer the naturally occurring CS sentence. Meanwhile our experiments with current LLMs show that the larger the model, the more consistently it assigns higher probability to the naturally occurring CS sentence than to the variant. In accordance with theoretical claims, the largest probability differences…
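The comparison described in the abstract, where a model counts as "correct" on a pair if it assigns higher probability to the naturally occurring sentence than to the manipulated variant, can be illustrated with a minimal sketch. This is not the authors' code: the function name `minimal_pair_accuracy`, the placeholder sentence IDs, and the log-probability values are all hypothetical, and in practice `score` would wrap a real language model's sentence log-probability.

```python
from typing import Callable, Iterable, Tuple


def minimal_pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of (natural, variant) pairs where the natural sentence scores higher.

    `score` is assumed to return a sentence-level log-probability
    (higher means the model finds the sentence more likely).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one minimal pair is required")
    wins = sum(1 for natural, variant in pairs if score(natural) > score(variant))
    return wins / len(pairs)


# Hypothetical log-probabilities standing in for a real model's output.
# Keys are placeholder IDs, not sentences from the paper's dataset.
toy_logprob = {
    "natural-1": -10.0, "variant-1": -12.5,  # natural preferred
    "natural-2": -8.0,  "variant-2": -7.5,   # variant preferred
    "natural-3": -15.0, "variant-3": -19.0,  # natural preferred
}
toy_pairs = [
    ("natural-1", "variant-1"),
    ("natural-2", "variant-2"),
    ("natural-3", "variant-3"),
]

accuracy = minimal_pair_accuracy(toy_pairs, toy_logprob.__getitem__)
print(f"preference accuracy: {accuracy:.3f}")
```

With these made-up scores the natural sentence wins in two of three pairs. Ties are counted against the model here; that is a design choice of this sketch, not something the source specifies.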
Taxonomy
Topics: VLSI and Analog Circuit Testing · VLSI and FPGA Design Techniques
