From Sharpness to Better Generalization for Speech Deepfake Detection

Wen Huang; Xuechen Liu; Xin Wang; Junichi Yamagishi; Yanmin Qian

arXiv:2506.11532 · eess.AS · June 16, 2025

TL;DR

This paper explores the role of sharpness as a theoretical measure of generalization in speech deepfake detection, demonstrating that reducing sharpness improves robustness across unseen conditions.

Contribution

The paper proposes sharpness as a theoretical proxy for generalization in speech deepfake detection (SDD) and applies Sharpness-Aware Minimization (SAM) to improve model robustness.

Findings

1. Sharpness increases under domain shifts, indicating higher sensitivity.
2. Reducing sharpness improves performance on unseen test sets.
3. Sharpness correlates significantly with generalization in most settings.
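
To make the third finding concrete, the sketch below shows one way a correlation between per-test-set sharpness and equal error rate (EER) could be checked. It is an illustration only: this summary does not say which correlation statistic or significance test the authors used, so Pearson's r with a permutation test is an assumption, and the numbers are hypothetical placeholders rather than results from the paper.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the observed Pearson r."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = abs(pearson_r(x, y))
    hits = 0
    for _ in range(n_perm):
        # Shuffling y breaks any real pairing between sharpness and EER.
        if abs(pearson_r(x, rng.permutation(y))) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical placeholder values, one pair per test set (NOT from the paper).
sharpness = [0.12, 0.18, 0.25, 0.31, 0.40, 0.47]
eer_percent = [2.1, 2.9, 3.2, 4.8, 5.5, 6.9]

r = pearson_r(sharpness, eer_percent)
p = permutation_p_value(sharpness, eer_percent)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

A rank-based statistic such as Spearman's rho would be a reasonable alternative if the relationship is monotonic but not linear.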

Abstract

Generalization remains a critical challenge in speech deepfake detection (SDD). While various approaches aim to improve robustness, generalization is typically assessed through performance metrics like equal error rate without a theoretical framework to explain model performance. This work investigates sharpness as a theoretical proxy for generalization in SDD. We analyze how sharpness responds to domain shifts and find it increases in unseen conditions, indicating higher model sensitivity. Based on this, we apply Sharpness-Aware Minimization (SAM) to reduce sharpness explicitly, leading to better and more stable performance across diverse unseen test sets. Furthermore, correlation analysis confirms a statistically significant relationship between sharpness and generalization in most test settings. These findings suggest that sharpness can serve as a theoretical indicator for…
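
As a rough illustration of the SAM idea mentioned in the abstract, the sketch below applies the standard two-step SAM update to a toy logistic-regression problem in NumPy. It is not the authors' implementation: the toy data, the model, the helper names (`loss_and_grad`, `sam_step`, `sharpness_estimate`), and the values of `lr` and `rho` are all assumptions chosen for the example, and the sharpness estimate shown is a simple first-order one that may differ from the measure used in the paper.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean binary cross-entropy of a logistic model and its gradient w.r.t. w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12  # guards against log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM update: perturb toward higher loss, then descend with that gradient."""
    _, g = loss_and_grad(w, X, y)
    e = rho * g / (np.linalg.norm(g) + 1e-12)  # first-order worst-case perturbation
    _, g_perturbed = loss_and_grad(w + e, X, y)
    return w - lr * g_perturbed

def sharpness_estimate(w, X, y, rho=0.05):
    """Loss increase after a single normalized gradient-ascent step of radius rho."""
    loss, g = loss_and_grad(w, X, y)
    e = rho * g / (np.linalg.norm(g) + 1e-12)
    loss_perturbed, _ = loss_and_grad(w + e, X, y)
    return loss_perturbed - loss

# Toy, synthetic data (an assumption for illustration; not speech features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = (X @ true_w + 0.3 * rng.normal(size=200) > 0).astype(float)

w = np.zeros(5)
initial_loss, _ = loss_and_grad(w, X, y)
for _ in range(200):
    w = sam_step(w, X, y)
final_loss, _ = loss_and_grad(w, X, y)

print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}")
print(f"sharpness estimate at final weights: {sharpness_estimate(w, X, y):.5f}")
```

Each SAM step costs two gradient evaluations instead of one, which is the usual trade-off for seeking flatter minima. The same `sharpness_estimate` could in principle be evaluated on data from different domains to compare how sharpness shifts between seen and unseen conditions.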

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Sharpness-Aware Minimization