Automatic Identification of Motivation for Code-Switching in Speech Transcripts
Ritu Belani, Jeffrey Flanigan

TL;DR
This paper introduces a system that automatically identifies diverse motivations behind code-switching in speech transcripts, achieving 75% accuracy and demonstrating cross-lingual adaptability to Hindi-English.
Contribution
It presents the first system for automatic motivation identification in code-switching, along with a new annotated dataset and cross-lingual transfer capabilities.
Findings
Achieved 75% accuracy in identifying motivations in Spanish-English speech
System can be adapted to Hindi-English with 66% accuracy
Introduced a new dataset with annotated motivations for code-switching
Abstract
Code-switching, or switching between languages, occurs for many reasons and has important linguistic, sociological, and cultural implications. Multilingual speakers code-switch for a variety of purposes, such as expressing emotions, borrowing terms, making jokes, and introducing a new topic. The reason for code-switching may be quite useful for analysis, but is not readily apparent. To remedy this situation, we annotate a new dataset of motivations for code-switching in Spanish-English. We build the first system (to our knowledge) to automatically identify a wide range of motivations for which speakers code-switch in everyday speech, achieving an accuracy of 75% across all motivations. Additionally, we show that the system can be adapted to new language pairs, achieving 66% accuracy on a new language pair (Hindi-English), demonstrating the cross-lingual applicability of our annotation scheme.
Taxonomy
Topics: Digital Communication and Language · Multilingual Education and Policy · Linguistics, Language Diversity, and Identity
