CEC: A Noisy Label Detection Method for Speaker Recognition
Yao Shen, Yingying Gao, Yaqian Hao, Chenguang Hu, Fulin Zhang, Junlan Feng, Shilei Zhang

TL;DR
This paper introduces a novel statistical approach for detecting noisy labels in speaker recognition datasets, improving model robustness by categorizing samples and adjusting training difficulty.
Contribution
The paper proposes CIC and TIC metrics based on Cross-Epoch Counting for effective noisy label detection and sample categorization to enhance speaker recognition robustness.
Findings
Achieves the best speaker verification performance among the compared schemes.
Effectively detects noisy labels in datasets.
Improves robustness by gradually increasing the difficulty of hard samples during training.
Abstract
Noisy labels are inevitable, even in well-annotated datasets. The detection of noisy labels is of significant importance to enhance the robustness of speaker recognition models. In this paper, we propose a novel noisy label detection approach based on two new statistical metrics: Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC). These metrics are calculated through Cross-Epoch Counting (CEC) and correspond to the early and late stages of training, respectively. Additionally, we categorize samples based on their prediction results into three categories: inconsistent samples, hard samples, and easy samples. During training, we gradually increase the difficulty of hard samples to update model parameters, preventing noisy labels from being overfitted. Compared to contrastive schemes, our approach not only achieves the best performance in speaker verification but…
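The counting idea behind the two metrics can be illustrated with a minimal sketch. This summary does not give the paper's exact definitions, so the following is an assumption rather than the authors' implementation: a prediction is treated as "inconsistent" in an epoch when it differs from the given label, TIC is taken as the total number of inconsistent epochs per sample, and CIC as the longest run of consecutive inconsistent epochs. The function name `cross_epoch_counts` and its arguments are hypothetical, and no detection thresholds are included because none are stated here.

```python
from typing import List, Sequence, Tuple


def cross_epoch_counts(
    preds_per_epoch: Sequence[Sequence[int]],
    labels: Sequence[int],
) -> Tuple[List[int], List[int]]:
    """Count label/prediction disagreements across epochs (illustrative only).

    preds_per_epoch[e][i] is the predicted speaker id of sample i at epoch e.
    labels[i] is the (possibly noisy) annotated speaker id of sample i.
    Returns (tic, cic) under the assumed definitions described above.
    """
    num_samples = len(labels)
    tic = [0] * num_samples  # assumed: total inconsistent epochs per sample
    cic = [0] * num_samples  # assumed: longest consecutive inconsistent run
    run = [0] * num_samples  # current consecutive inconsistent run

    for epoch_preds in preds_per_epoch:
        for i, (pred, label) in enumerate(zip(epoch_preds, labels)):
            if pred != label:
                tic[i] += 1
                run[i] += 1
                cic[i] = max(cic[i], run[i])
            else:
                run[i] = 0  # a consistent epoch breaks the run
    return tic, cic


# Toy example: 4 epochs, 3 samples (values are made up for illustration).
toy_preds = [
    [0, 1, 2],
    [0, 2, 1],
    [0, 2, 2],
    [0, 1, 1],
]
toy_labels = [0, 1, 1]
toy_tic, toy_cic = cross_epoch_counts(toy_preds, toy_labels)
```

In this toy run the first sample is never inconsistent, while the second and third are each inconsistent in two epochs but differ in whether those epochs are consecutive. A sample with persistently high counts would be a candidate noisy label under this reading.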
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing
