Neural Encoding Detection is Not All You Need for Synthetic Speech Detection
Luca Cuccovillo, Xin Wang, Milica Gerhardt, Patrick Aichroth

TL;DR
This paper reviews current synthetic speech detection methods, highlights the limitations of relying solely on neural encoding detection, and recommends directions for future research.
Contribution
The paper critically analyzes the field's focus on neural encoding detection and offers recommendations for diversifying research approaches in synthetic speech detection.
Findings
Neural encoding detection alone may not be sufficient for robust synthetic speech detection.
Current trends may overemphasize neural encoding detection, risking overcommitment to approaches that may not stand the test of time.
The paper offers recommendations for promising future research directions in the field.
Abstract
This paper reviews the current state and emerging trends in synthetic speech detection. It outlines the main data-driven approaches, discusses the advantages and drawbacks of focusing future research solely on neural encoding detection, and offers recommendations for promising research directions. Unlike works that introduce new detection methods or datasets, this paper aims to guide future state-of-the-art research in the field and to highlight the risk of overcommitting to approaches that may not stand the test of time.
