Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM
Jiawei Yu, Yuang Li, Xiaosong Qiao, Huan Zhao, Xiaofeng Zhao, Wei Tang, Min Zhang, Hao Yang, Jinsong Su

TL;DR
Hard-Synth introduces a novel data augmentation technique for ASR that uses large language models and zero-shot TTS to generate diverse, challenging speech samples, improving recognition accuracy and reducing bias.
Contribution
It presents a new method combining LLMs and zero-shot TTS for generating hard speech samples without extra text data or predefined styles.
Findings
Achieves relative WER reductions of 6.5%/4.4% on LibriSpeech.
Demonstrates data efficiency and bias reduction in ASR.
Enhances Conformer model performance with diverse synthetic data.
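For readers unfamiliar with the metric, a relative WER reduction is conventionally computed against the baseline model's WER. The formula below is the standard definition, not something quoted from the paper, and the numbers in the worked line are purely hypothetical, chosen only to show how a 6.5% relative figure could arise.

```latex
\[
\text{relative WER reduction}
  = \frac{\mathrm{WER}_{\text{baseline}} - \mathrm{WER}_{\text{augmented}}}
         {\mathrm{WER}_{\text{baseline}}}
\]
% Hypothetical illustration (not reported results):
% a baseline WER of 10.00% falling to 9.35% gives
\[
\frac{10.00 - 9.35}{10.00} = 0.065 = 6.5\%
\]
```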
Abstract
Text-to-speech (TTS) models have been widely adopted to enhance automatic speech recognition (ASR) systems using text-only corpora, thereby reducing the cost of labeling real speech data. Existing research primarily utilizes additional text data and predefined speech styles supported by TTS models. In this paper, we propose Hard-Synth, a novel ASR data augmentation method that leverages large language models (LLMs) and advanced zero-shot TTS. Our approach employs LLMs to generate diverse in-domain text through rewriting, without relying on additional text data. Rather than using predefined speech styles, we introduce a hard prompt selection method with zero-shot TTS to clone speech styles that the ASR model finds challenging to recognize. Experiments demonstrate that Hard-Synth significantly enhances the Conformer model, achieving relative word error rate (WER) reductions of 6.5%/4.4%…
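The abstract does not spell out how hard prompts are chosen, so the following is only a minimal sketch of one plausible reading: score each training utterance by the WER of the current ASR model's transcript and keep the worst-recognized utterances as audio prompts for zero-shot TTS. The function names `word_error_rate` and `select_hard_prompts`, the WER-based ranking, and the `top_k` cutoff are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        # Convention for an empty reference: any hypothesis word is an error.
        return 0.0 if not hyp else 1.0
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def select_hard_prompts(
    utterances: List[Tuple[str, str, str]], top_k: int
) -> List[Tuple[str, float]]:
    """Rank (utt_id, reference, asr_hypothesis) triples by WER, hardest first.

    Assumption: "hard" means "high WER under the current ASR model".
    """
    scored = [
        (utt_id, word_error_rate(reference, hypothesis))
        for utt_id, reference, hypothesis in utterances
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy data: transcripts paired with hypothetical ASR outputs.
examples = [
    ("utt_a", "the cat sat", "the cat sat"),
    ("utt_b", "the cat sat", "a cat sat down"),
    ("utt_c", "hello world", "yellow word"),
]
hard_prompts = select_hard_prompts(examples, top_k=2)
```

In a pipeline of this shape, the audio of the selected utterances would then serve as style prompts for a zero-shot TTS model, which speaks the LLM-rewritten text in those hard-to-recognize voices.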
