Naturalness-Aware Curriculum Learning with Dynamic Temperature for Speech Deepfake Detection
Taewoo Kim, Guisik Kim, Choongsang Cho, Young Han Lee

TL;DR
This paper introduces a naturalness-aware curriculum learning framework for speech deepfake detection. It uses speech naturalness cues and dynamic temperature scaling to improve model robustness and generalization, achieving a 23% relative reduction in EER on the ASVspoof 2021 DF dataset.
Contribution
The paper proposes a training framework that incorporates speech naturalness and dynamic temperature scaling, improving deepfake detection without changing the model architecture.
Findings
Achieved a 23% relative reduction in EER on the ASVspoof 2021 DF dataset.
Validated effectiveness of naturalness-aware training through ablation studies.
Improved robustness and generalization of speech deepfake detection models.
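For reference, a relative reduction is conventionally measured against the baseline error rate. The symbols below are illustrative notation, not taken from the paper, and the summary does not state the absolute EER values:

```latex
\[
\text{relative reduction}
  = \frac{\mathrm{EER}_{\text{baseline}} - \mathrm{EER}_{\text{proposed}}}
         {\mathrm{EER}_{\text{baseline}}}
  \approx 0.23
\]
```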
Abstract
Recent advances in speech deepfake detection (SDD) have significantly improved artifact-based detection in spoofed speech. However, most models overlook speech naturalness, a crucial cue for distinguishing bona fide speech from spoofed speech. This study proposes naturalness-aware curriculum learning, a novel training framework that leverages speech naturalness to enhance the robustness and generalization of SDD. This approach measures sample difficulty using both ground-truth labels and mean opinion scores, and adjusts the training schedule to progressively introduce more challenging samples. To further improve generalization, a dynamic temperature scaling method based on speech naturalness is incorporated into the training process. A 23% relative reduction in the EER was achieved in the experiments on the ASVspoof 2021 DF dataset, without modifying the model architecture. Ablation…
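The abstract gives no formulas, so the following is only a rough sketch of how the two ideas could fit together, not the authors' implementation. Every function name, the 1-to-5 MOS range, the linear pacing schedule, and the temperature range are assumptions made for illustration. The sketch assumes that a natural-sounding spoof and an unnatural-sounding bona fide sample are the hard cases.

```python
import numpy as np

MOS_MIN, MOS_MAX = 1.0, 5.0  # assumed MOS scale, not stated in the abstract


def difficulty(labels, mos):
    """Hypothetical difficulty in [0, 1] from labels (1 = bona fide, 0 = spoof) and MOS."""
    labels = np.asarray(labels)
    mos_norm = (np.asarray(mos, dtype=float) - MOS_MIN) / (MOS_MAX - MOS_MIN)
    # Assumption: bona fide speech gets harder as MOS drops,
    # spoofed speech gets harder as MOS rises.
    return np.where(labels == 1, 1.0 - mos_norm, mos_norm)


def curriculum_subset(diff, epoch, num_epochs, start_frac=0.5):
    """Indices of the easiest samples; the kept fraction grows linearly to 1.0."""
    progress = min(1.0, epoch / max(1, num_epochs - 1))
    frac = start_frac + (1.0 - start_frac) * progress
    k = max(1, int(round(frac * len(diff))))
    return np.argsort(diff, kind="stable")[:k]


def scaled_softmax(logits, diff, t_min=1.0, t_max=2.0):
    """Softmax with a per-sample temperature that rises with difficulty (assumed mapping)."""
    temp = t_min + (t_max - t_min) * np.asarray(diff, dtype=float)[:, None]
    z = np.asarray(logits, dtype=float) / temp
    z -= z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)


labels = [1, 1, 0, 0]
mos = [5.0, 3.0, 1.0, 5.0]
diff = difficulty(labels, mos)
early = curriculum_subset(diff, epoch=0, num_epochs=3)
late = curriculum_subset(diff, epoch=2, num_epochs=3)
probs = scaled_softmax([[2.0, 0.0]] * 4, diff)
```

In this toy batch, the first epoch trains only on the two easiest samples and the last epoch uses all four. The hardest sample receives the highest temperature, so its output distribution is the softest.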
Taxonomy
Topics: Human Pose and Action Recognition
