Multi-Accent Mandarin Dry-Vocal Singing Dataset: Benchmark for Singing Accent Recognition
Zihao Wang, Ruibin Yuan, Ziqi Geng, Hengjia Li, Xingwei Qu, Xinyi Li, Songye Chen, Haoying Fu, Roger B. Dannenberg, Kejun Zhang

TL;DR
This paper introduces MADVSD, a comprehensive Mandarin singing dataset with regional accent annotations, enabling improved singing accent recognition and analysis of dialectal influences in singing.
Contribution
It provides the first large-scale, regionally annotated Mandarin singing dataset with phonetic exercises, facilitating research in singing accent recognition and dialectal influence analysis.
Findings
MADVSD enables effective benchmarking of singing accent recognition models.
Dialectal influences significantly affect singing accent variations.
Vowels play a crucial role in accentual differences in singing.
Abstract
Singing accent research is underexplored compared to speech accent studies, primarily due to the scarcity of suitable datasets. Existing singing datasets often suffer from detail loss, frequently resulting from the vocal-instrumental separation process. Additionally, they often lack regional accent annotations. To address this, we introduce the Multi-Accent Mandarin Dry-Vocal Singing Dataset (MADVSD). MADVSD comprises over 670 hours of dry vocal recordings from 4,206 native Mandarin speakers across nine distinct Chinese regions. In addition to each participant recording audio of three popular songs in their native accent, they also recorded phonetic exercises covering all Mandarin vowels and a full octave range. We validated MADVSD through benchmark experiments in singing accent recognition, demonstrating its utility for evaluating state-of-the-art speech models in singing contexts.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsPhonetics and Phonology Research · Music and Audio Processing · Speech Recognition and Synthesis
