DOTA-ME-CS: Daily Oriented Text Audio-Mandarin English-Code Switching Dataset
Yupei Li, Zifan Wei, Heng Yu, Jiahao Xue, Huichi Zhou, Björn W. Schuller

TL;DR
This paper introduces DOTA-ME-CS, a comprehensive daily-oriented Mandarin-English code-switching speech dataset with AI-enhanced diversity, aiming to advance bilingual speech recognition research.
Contribution
The paper presents a new Mandarin-English code-switching speech dataset (18.54 hours, 9,300 recordings) diversified with AI-augmented data, filling a gap in resources for bilingual ASR research.
Findings
Dataset contains 18.54 hours of audio (9,300 recordings) from 34 participants.
AI techniques (timbre synthesis, speed variation, and noise addition) increase dataset diversity and complexity.
Dataset and code will be publicly available.
Abstract
Code-switching, the alternation between two or more languages within communication, poses great challenges for Automatic Speech Recognition (ASR) systems. Existing models and datasets are limited in their ability to effectively handle these challenges. To address this gap and foster progress in code-switching ASR research, we introduce the DOTA-ME-CS: Daily oriented text audio Mandarin-English code-switching dataset, which consists of 18.54 hours of audio data, including 9,300 recordings from 34 participants. To enhance the dataset's diversity, we apply artificial intelligence (AI) techniques such as AI timbre synthesis, speed variation, and noise addition, thereby increasing the complexity and scalability of the task. The dataset is carefully curated to ensure both diversity and quality, providing a robust resource for researchers addressing the intricacies of bilingual speech…
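Two of the augmentations named in the abstract, speed variation and noise addition, can be illustrated with a minimal sketch. The paper's actual implementation is not described here, so everything below is an assumption: the function names `change_speed` and `add_noise`, the linear-interpolation resampling, the white Gaussian noise model, and the 16 kHz toy signal are all hypothetical stand-ins. AI timbre synthesis is not covered.

```python
import math
import random


def change_speed(samples, factor):
    """Resample by linear interpolation so playback is `factor` times faster.

    Assumption: this naive method also shifts pitch; the dataset authors
    may use a different, pitch-preserving technique.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    out = []
    for i in range(n_out):
        pos = i * factor          # fractional position in the input signal
        lo = int(pos)
        if lo >= len(samples) - 1:
            out.append(samples[-1])
            continue
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[lo + 1])
    return out


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled to a target signal-to-noise ratio (dB)."""
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


# Toy input: one second of a 440 Hz tone at 16 kHz, standing in for a recording.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

faster = change_speed(tone, 1.25)                           # 25% faster, shorter clip
noisy = add_noise(tone, snr_db=20.0, rng=random.Random(0))  # seeded for repeatability
```

Each function returns a new list and leaves its input untouched, so one source recording can yield several augmented variants by varying `factor` and `snr_db`.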
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques
