Unit selection synthesis based data augmentation for fixed phrase speaker verification
Houjun Huang, Xu Xiang, Fei Zhao, Shuai Wang, Yanmin Qian

TL;DR
This paper introduces a novel data augmentation method for fixed phrase speaker verification that uses unit selection synthesis to generate target transcript speech from text-independent data, improving system performance.
Contribution
The paper presents a new unit selection synthesis approach for data augmentation in speaker verification, effectively leveraging text-independent data for fixed phrase tasks.
Findings
Significant performance improvement on the AISHELL dataset.
Effective use of phonetic segments from text-independent data.
Enhanced robustness of the speaker verification system.
Abstract
Data augmentation is commonly used to help build a robust speaker verification system, especially in limited-resource cases. However, conventional data augmentation methods usually focus on the diversity of the acoustic environment, leaving lexical variation neglected. For text-dependent speaker verification tasks, it is well known that preparing training data with the target transcript is the most effective way to build a well-performing system; however, collecting such data is time-consuming and expensive. In this work, we propose a unit selection synthesis based data augmentation method to leverage abundant text-independent data resources. In this approach, the text-independent speech of each speaker is first broken up into speech segments, each containing one phone unit. Then segments containing the phones in the target transcript are selected to produce speech with the target…
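The segment-and-select idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes phone-level segments have already been obtained (for example from a forced aligner), represents audio as plain lists of samples, and picks a matching segment at random, since the abstract is truncated before it states any selection criterion. The names `build_inventory` and `synthesize` are hypothetical.

```python
import random
from collections import defaultdict


def build_inventory(segments):
    """Group one speaker's phone-level segments by phone label.

    `segments` is an iterable of (phone_label, samples) pairs, where
    `samples` is a list of audio samples. The alignment that produces
    these pairs is assumed to exist already.
    """
    inventory = defaultdict(list)
    for phone, samples in segments:
        inventory[phone].append(samples)
    return inventory


def synthesize(inventory, target_phones, rng=None):
    """Concatenate one stored segment per phone of the target transcript.

    Returns None if the speaker has no segment for some target phone.
    Random choice among candidates is an assumption of this sketch.
    """
    rng = rng or random.Random()
    if any(not inventory.get(p) for p in target_phones):
        return None
    utterance = []
    for phone in target_phones:
        utterance.extend(rng.choice(inventory[phone]))
    return utterance


# Toy example: four one-phone segments cut from text-independent speech.
segments = [("n", [0.1, 0.2]), ("i", [0.3]), ("h", [0.4]), ("ao", [0.5, 0.6])]
inventory = build_inventory(segments)
augmented = synthesize(inventory, ["n", "i", "h", "ao"], random.Random(0))
```

Calling `synthesize` several times with different random states would yield several synthetic utterances of the fixed phrase per speaker whenever a phone has more than one candidate segment. A practical system would likely also smooth the joins between segments, which this sketch omits.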
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
