A Survey of Code-switched Speech and Language Processing
Sunayana Sitaram, Khyathi Raghavi Chandu, Sai Krishna Rallabandi, and Alan W Black

TL;DR
This survey reviews computational methods for processing code-switched speech and text, highlighting challenges, available resources, and future research directions in multilingual natural language processing.
Contribution
It provides a comprehensive overview of existing approaches, datasets, and applications for code-switching in speech and NLP, and discusses open problems and future directions.
Findings
Limited resources for code-switched data hinder progress.
Code-switching has been studied across a range of speech and NLP applications, from language processing tools to end-to-end systems.
Future research needs include better models and datasets.
Abstract
Code-switching, the alternation of languages within a conversation or utterance, is a common communicative phenomenon that occurs in multilingual communities across the world. This survey reviews computational approaches for code-switched Speech and Natural Language Processing. We motivate why processing code-switched text and speech is essential for building intelligent agents and systems that interact with users in multilingual communities. As code-switching data and resources are scarce, we list what is available in various code-switched language pairs with the language processing tasks they can be used for. We review code-switching research in various Speech and NLP applications, including language processing tools and end-to-end systems. We conclude with future directions and open problems in the field.
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
