SpeechVerifier: Robust Acoustic Fingerprint against Tampering Attacks via Watermarking
Lingfeng Yao, Chenpei Huang, Shengyao Wang, Junpei Xue, Hanqing Guo, Jiang Liu, Xun Chen, Miao Pan

TL;DR
SpeechVerifier is a novel method that uses watermarking and contrastive learning to detect tampering in speech recordings without external references, maintaining robustness against benign modifications.
Contribution
It introduces a self-contained speech verification framework combining multiscale features, contrastive fingerprinting, and watermarking to detect malicious tampering effectively.
Findings
High accuracy in tampering detection
Robustness to compression and resampling
Effective self-contained verification
Abstract
With the surge of social media, maliciously tampered public speeches, especially those from influential figures, have seriously affected social stability and public trust. Existing speech tampering detection methods remain insufficient: they either rely on external reference data or fail to be both sensitive to attacks and robust to benign operations, such as compression and resampling. To tackle these challenges, we introduce SpeechVerifier to proactively verify speech integrity using only the published speech itself, i.e., without requiring any external references. Inspired by audio fingerprinting and watermarking, SpeechVerifier can (i) effectively detect tampering attacks, (ii) remain robust to benign operations, and (iii) verify integrity based only on the published speech. Briefly, SpeechVerifier utilizes multiscale feature extraction to capture speech features across different…
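The self-contained verification idea can be illustrated with a small toy example. This is a sketch under stated assumptions, not the authors' implementation: it assumes one plausible reading of the abstract, in which a fingerprint is computed from the speech before publication, carried inside the audio as a watermark payload, and later compared with a fingerprint recomputed from the received audio. The names `toy_fingerprint`, `bit_error_rate`, `verify`, and the 0.05 threshold are hypothetical. The hand-crafted energy-trend bits stand in for the learned contrastive fingerprint, and watermark embedding and extraction are not modeled at all (`payload_bits` simply plays the role of the recovered payload).

```python
import numpy as np

def toy_fingerprint(signal, frame_sizes=(256, 1024, 4096)):
    """Toy multiscale fingerprint (hypothetical, not the paper's model).

    At each frame size, emit one bit per pair of adjacent frames:
    True if the frame energy rises, False otherwise.
    """
    bits = []
    for size in frame_sizes:
        usable = len(signal) // size * size            # drop the ragged tail
        frames = signal[:usable].reshape(-1, size)
        energy = np.sum(frames ** 2, axis=1)
        bits.append(np.diff(energy) > 0)
    return np.concatenate(bits)

def bit_error_rate(a, b):
    """Fraction of positions where two equal-length bit arrays differ."""
    return float(np.mean(a != b))

def verify(received, payload_bits, threshold=0.05):
    """Accept the audio if its recomputed fingerprint matches the payload.

    payload_bits stands in for bits recovered from a watermark; the
    extraction step itself is assumed and not implemented here.
    """
    return bit_error_rate(toy_fingerprint(received), payload_bits) <= threshold

# Synthetic stand-in for speech: a 250 Hz tone with a rising amplitude ramp.
sr = 16000
n = np.arange(2 * sr)
speech = np.linspace(0.1, 1.0, n.size) * np.sin(2 * np.pi * 250 * n / sr)
payload = toy_fingerprint(speech)       # computed before "publication"

# Benign stand-in: a gain change (a toy substitute for compression/resampling).
quieter = 0.5 * speech

# Tampering stand-in: reverse the order of a half-second stretch.
tampered = speech.copy()
tampered[8192:16384] = speech[8192:16384][::-1]

print("gain change accepted:", verify(quieter, payload))
print("reordered audio accepted:", verify(tampered, payload))
```

The point of the sketch is the decision rule: no external reference is consulted, because the only things compared are the received audio and the bits it is assumed to carry. A real system would need a fingerprint that survives compression and resampling, which this energy-trend toy does not attempt.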
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Music and Audio Processing · Digital Media Forensic Detection
