The Sound of Silence: Efficiency of First Digit Features in Synthetic Audio Detection

Daniele Mari; Federica Latora; Simone Milani

arXiv:2210.02746·cs.SD·October 7, 2022

TL;DR

This paper investigates the role of silent segments in synthetic speech detection and shows that first-digit statistics of MFCC coefficients enable a lightweight, robust detector with accuracy above 90% in most classes of the ASVSpoof dataset.

Contribution

Introduces a computationally lightweight detection method based on first-digit statistics of MFCC coefficients, which stays effective across many forgery algorithms without relying on a large neural detection architecture.

Findings

01

Achieves above 90% accuracy in most classes of the ASVSpoof dataset

02

Effective across multiple synthetic speech algorithms

03

Lightweight and computationally efficient

Abstract

The recent integration of generative neural strategies and audio processing techniques has fostered the spread of speech synthesis and transformation algorithms. This capability proves to be harmful in many legal and informative processes (news, biometric authentication, audio evidence in courts, etc.). Thus, the development of efficient detection algorithms is both crucial and challenging due to the heterogeneity of forgery techniques. This work investigates the discriminative role of silent parts in synthetic speech detection and shows how first-digit statistics extracted from MFCC coefficients can efficiently enable robust detection. The proposed procedure is computationally lightweight and effective on many different algorithms, since it does not rely on a large neural detection architecture, and it obtains an accuracy above 90% in most of the classes of the ASVSpoof…
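To make the idea of "first digit statistics" concrete, here is a minimal Python sketch of how a first-digit histogram could be computed from an array of coefficients. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`first_digit_histogram`, `benford_reference`, `divergence_from_reference`) are hypothetical, the input is any numeric array standing in for MFCC values, and the comparison against Benford's law is an assumed choice of reference distribution that the abstract does not confirm. MFCC extraction and the handling of silent segments are not covered.

```python
import numpy as np


def first_digit_histogram(coeffs):
    """Normalized histogram of the first significant decimal digit (1..9).

    `coeffs` is any array-like of real numbers (a stand-in for MFCC values).
    Zeros and non-finite entries are skipped, since they have no first digit.
    """
    values = np.abs(np.asarray(coeffs, dtype=np.float64)).ravel()
    values = values[np.isfinite(values) & (values > 0)]
    # In scientific notation the first character is the first significant digit.
    digits = np.array([int(f"{v:.15e}"[0]) for v in values], dtype=int)
    counts = np.bincount(digits, minlength=10)[1:10]
    total = counts.sum()
    return counts / total if total > 0 else np.zeros(9)


def benford_reference():
    """Benford's law for base 10: P(d) = log10(1 + 1/d), d = 1..9.

    Assumption: used here only as a commonly cited reference distribution.
    """
    d = np.arange(1, 10)
    return np.log10(1.0 + 1.0 / d)


def divergence_from_reference(hist, reference):
    """Kullback-Leibler divergence KL(hist || reference) over nonzero bins."""
    hist = np.asarray(hist, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    mask = hist > 0
    return float(np.sum(hist[mask] * np.log(hist[mask] / reference[mask])))


if __name__ == "__main__":
    # Synthetic stand-in data; real use would pass MFCC coefficients instead.
    rng = np.random.default_rng(0)
    fake_coeffs = rng.lognormal(mean=0.0, sigma=2.0, size=5000)
    hist = first_digit_histogram(fake_coeffs)
    print("first-digit histogram:", np.round(hist, 3))
    print("divergence:", divergence_from_reference(hist, benford_reference()))
```

In a detector of this kind, the nine histogram bins (and possibly a divergence score) would serve as a compact feature vector for a simple classifier, which is consistent with the abstract's claim that no large neural architecture is required.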

Code & Models

Repositories

dan8991/the-sound-of-silence
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing