Robust Channel Learning for Large-Scale Radio Speaker Verification
Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu

TL;DR
This paper introduces a robust speaker verification framework tailored for radio communications, employing data augmentation, noise handling, and efficient fine-tuning to improve accuracy under challenging channel conditions.
Contribution
It presents a novel Channel Robust Speaker Learning framework with data augmentation, noise modeling, and efficient transfer learning, plus a large-scale radio speech benchmark.
Findings
Enhanced speaker verification accuracy in radio scenarios.
Effective mitigation of bandwidth and noise effects.
Reduced training time with efficient fine-tuning.
Abstract
Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to inherent limitations such as constrained bandwidth and pervasive noise interference. To address this issue, we present a Channel Robust Speaker Learning (CRSL) framework that enhances the robustness of the current speaker verification pipeline, considering data source, data augmentation, and the efficiency of model transfer processes. Our framework introduces an augmentation module that mitigates bandwidth variations in radio speech datasets by manipulating the bandwidth of training inputs. It also addresses unknown noise by introducing noise within the manifold space. Additionally, we propose an efficient fine-tuning method that reduces the…
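The abstract describes bandwidth augmentation only at a high level, so the following is a minimal sketch of the general idea rather than the authors' module. It assumes an ideal FFT-based low-pass filter as a stand-in for a narrowband radio channel. The function names (`band_limit`, `random_bandwidth_augment`) and the candidate cutoff frequencies are hypothetical and do not come from the paper.

```python
import numpy as np


def band_limit(wave, sample_rate, cutoff_hz):
    """Remove spectral content above cutoff_hz with an ideal FFT low-pass.

    This is an illustrative stand-in for a band-limited radio channel,
    not the filter used in the paper.
    """
    spectrum = np.fft.rfft(wave)
    freqs = np.fft.rfftfreq(len(wave), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(wave))


def random_bandwidth_augment(wave, sample_rate,
                             cutoffs_hz=(2000.0, 3400.0, 4000.0), rng=None):
    """Band-limit a training waveform to a randomly chosen cutoff.

    The cutoff values are assumed for illustration only.
    Returns the filtered waveform and the cutoff that was applied.
    """
    rng = np.random.default_rng() if rng is None else rng
    cutoff = float(rng.choice(cutoffs_hz))
    return band_limit(wave, sample_rate, cutoff), cutoff
```

Applying a different random cutoff to each training utterance exposes the model to several effective bandwidths, which is one plausible reading of "manipulating the bandwidth of training inputs".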
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis
