Code-Switching and Syntax: A Large-Scale Experiment

Igor Sterner; Simone Teufel

arXiv:2506.01846·cs.CL·July 29, 2025

TL;DR

A large-scale experiment shows that a system with access only to syntactic information can distinguish between the sentences of code-switching minimal pairs as well as bilingual humans can, and that the syntactic patterns it learns generalise to unseen language pairs.

Contribution

The paper provides a large-scale, multi-language, cross-phenomena experiment that tests the claim that code-switching (CS) patterns can be explained by the syntax of the contributing languages. According to the abstract, no such experiment existed before.

Findings

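01

A system with access only to syntactic information distinguishes between the sentences of a CS minimal pair to the same degree as bilingual humans.

02

The learnt syntactic patterns generalise well to unseen language pairs.

03

The results support the theoretical consensus that CS patterns can be explained by the syntax of the contributing languages.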

Abstract

The theoretical code-switching (CS) literature provides numerous pointwise investigations that aim to explain patterns in CS, i.e. why bilinguals switch language in certain positions in a sentence more often than in others. A resulting consensus is that CS can be explained by the syntax of the contributing languages. There is however no large-scale, multi-language, cross-phenomena experiment that tests this claim. When designing such an experiment, we need to make sure that the system that is predicting where bilinguals tend to switch has access only to syntactic information. We provide such an experiment here. Results show that syntax alone is sufficient for an automatic system to distinguish between sentences in minimal pairs of CS, to the same degree as bilingual humans. Furthermore, the learnt syntactic patterns generalise well to unseen language pairs.
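The abstract frames the evaluation as distinguishing between the two sentences of a CS minimal pair. One common way to score that kind of task is pairwise accuracy: the fraction of pairs in which a scorer prefers the observed sentence over its manipulated counterpart. The sketch below illustrates that metric only. It is not the authors' system: the `POS@LANG` token format, the names `minimal_pair_accuracy`, `toy_score`, and `EXAMPLE_PAIRS`, and the rule inside `toy_score` are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A minimal pair is (observed, manipulated). Each sentence is represented
# only by syntax-like tags in a hypothetical "POS@LANG" format, so the
# scorer never sees the words themselves.
MinimalPair = Tuple[Sequence[str], Sequence[str]]


def minimal_pair_accuracy(
    pairs: Sequence[MinimalPair],
    score: Callable[[Sequence[str]], float],
) -> float:
    """Fraction of pairs where the observed sentence outscores the manipulated one."""
    if not pairs:
        raise ValueError("need at least one minimal pair")
    correct = sum(
        1 for observed, manipulated in pairs if score(observed) > score(manipulated)
    )
    return correct / len(pairs)


def toy_score(tags: Sequence[str]) -> float:
    """Arbitrary stand-in scorer (an assumption, not a finding from the paper).

    It subtracts one point for every language switch that occurs directly
    after a DET tag. A real system would learn its preferences from data.
    """
    penalty = 0
    for left, right in zip(tags, tags[1:]):
        left_pos, left_lang = left.split("@")
        _, right_lang = right.split("@")
        if left_pos == "DET" and left_lang != right_lang:
            penalty += 1
    return -float(penalty)


# Invented example data: the third pair is deliberately one that the toy
# rule gets wrong, to show that the metric counts errors as well.
EXAMPLE_PAIRS: List[MinimalPair] = [
    (["DET@en", "NOUN@en", "VERB@de"], ["DET@en", "NOUN@de", "VERB@de"]),
    (
        ["PRON@de", "VERB@de", "DET@en", "NOUN@en"],
        ["PRON@de", "VERB@de", "DET@de", "NOUN@en"],
    ),
    (["DET@de", "NOUN@en"], ["DET@de", "NOUN@de"]),
]

if __name__ == "__main__":
    print(minimal_pair_accuracy(EXAMPLE_PAIRS, toy_score))
```

Because the metric is computed per pair, the same function can be applied to a system's scores and to human judgements over the same pairs, which is what makes a comparison "to the same degree as bilingual humans" possible in principle.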
