ACL2025

Code-Switching and Syntax: A Large-Scale Experiment

Igor Sterner, Simone Teufel


Abstract

The theoretical code-switching (CS) literature provides numerous pointwise investigations that aim to explain patterns in CS, i.e. why bilinguals switch language in certain positions in a sentence more often than in others. A resulting consensus is that CS can be explained by the syntax of the contributing languages. There is however no large-scale, multi-language, cross-phenomena experiment that tests this claim. When designing such an experiment, we need to make sure that the system that is predicting where bilinguals tend to switch has access only to syntactic information. We provide such an experiment here. Results show that syntax alone is sufficient for an automatic system to distinguish between sentences in minimal pairs of CS, to the same degree as bilingual humans. Furthermore, the learnt syntactic patterns generalise well to unseen language pairs.

* The author is now at the University of Edinburgh.

[Figure: ACS benchmark accuracy (%) plotted against the number of de-en training minimal pairs (128 to 8192), with three series: trained language pair, written CS; unseen language pair, written CS; unseen language pair, spoken CS.]