Adapting Language Balance in Code-Switching Speech
By: Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel
Potential Business Impact:
Helps computers understand mixed-language sentences better.
Despite achieving impressive results on standard benchmarks, large foundation models still struggle with code-switching test cases. When data scarcity cannot serve as the usual justification for poor performance, the cause may lie in the infrequency of code-switched moments, where the embedded second language appears only subtly. Instead of expecting models to learn this infrequency on their own, it may be beneficial to supply the training process with explicit labels. Evaluating performance on code-switching data likewise requires careful localization of the switching points, since recognition errors there are the most consequential and the analysis should emphasize mistakes made at those moments. Building on this observation, we leverage the difference between the embedded and the main language to highlight the code-switching points and thereby emphasize learning at those locations. This simple yet effective differentiable surrogate mitigates context bias during generation, the central challenge in code-switching, and thereby improves the model's robustness. Our experiments on Arabic and Chinese-English code-switching show that the models predict the switching points more accurately, as reflected in reduced substitution errors.
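The abstract does not give the objective in closed form, but the idea it describes, using embedded-language labels to emphasize learning at switching points, can be read as a weighted token-level loss. Below is a minimal PyTorch sketch under that assumption; the per-token `lang_ids` labels, the `switch_weight` hyperparameter, and the function name are illustrative and not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def switch_weighted_ce(logits, targets, lang_ids, main_lang=0,
                       switch_weight=2.0, pad_id=-100):
    """Token-level cross-entropy that up-weights code-switching points.

    logits:   (batch, seq, vocab) decoder outputs
    targets:  (batch, seq) gold token ids (pad_id marks padding)
    lang_ids: (batch, seq) per-token language labels, 0 = main language
    Tokens whose language differs from the main one (i.e. the embedded
    language at a switch point) receive switch_weight instead of 1.0.
    """
    # Per-token losses; F.cross_entropy expects (batch, vocab, seq).
    loss = F.cross_entropy(logits.transpose(1, 2), targets,
                           ignore_index=pad_id, reduction="none")
    # 1.0 for main-language tokens, switch_weight at embedded-language tokens.
    weights = torch.where(lang_ids != main_lang,
                          torch.full_like(loss, switch_weight),
                          torch.ones_like(loss))
    mask = (targets != pad_id).float()
    return (loss * weights * mask).sum() / (weights * mask).sum().clamp(min=1.0)

# Example: batch of 1, sequence of 4 tokens, vocabulary of 10.
logits = torch.randn(1, 4, 10)
targets = torch.tensor([[3, 7, 7, 1]])
lang_ids = torch.tensor([[0, 1, 1, 0]])  # middle tokens are embedded-language
print(switch_weighted_ce(logits, targets, lang_ids).item())
```

Up-weighting only the embedded-language tokens keeps the objective differentiable while biasing gradients toward the rare switching positions, which is one plausible reading of the "differentiable surrogate" the abstract mentions.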
Similar Papers
Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Computation and Language
Helps computers learn many languages by mixing them.
Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models
Computation and Language
Helps computers understand mixed-language conversations.
Strategies of Code-switching in Human-Machine Dialogs
Computation and Language
Chatbot learns to switch languages like people.