A Survey of Code-switched Speech and Language Processing

Sunayana Sitaram, Khyathi Raghavi Chandu, Sai Krishna Rallabandi, and Alan W Black

arXiv:1904.00784 · cs.CL · July 24, 2020 · 53 cites

TL;DR

This survey reviews computational methods for processing code-switched speech and text, highlighting challenges, available resources, and future research directions in multilingual natural language processing.

Contribution

The survey provides a comprehensive overview of existing approaches, datasets, and applications for code-switching in speech and NLP, and discusses open problems and future directions.

Findings

01

Limited resources for code-switched data hinder progress.

02

Code-switching has been studied across Speech and NLP applications, including language processing tools and end-to-end systems.

03

Future research needs include better models and datasets.

Abstract

Code-switching, the alternation of languages within a conversation or utterance, is a common communicative phenomenon that occurs in multilingual communities across the world. This survey reviews computational approaches for code-switched Speech and Natural Language Processing. We motivate why processing code-switched text and speech is essential for building intelligent agents and systems that interact with users in multilingual communities. As code-switching data and resources are scarce, we list what is available in various code-switched language pairs with the language processing tasks they can be used for. We review code-switching research in various Speech and NLP applications, including language processing tools and end-to-end systems. We conclude with future directions and open problems in the field.

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling