AudioMarkBench: Benchmarking Robustness of Audio Watermarking
Hongbin Liu, Moyang Guo, Zhengyuan Jiang, Lun Wang, Neil Zhenqiang Gong

TL;DR
AudioMarkBench systematically evaluates the robustness of audio watermarking methods against various perturbations, revealing vulnerabilities and emphasizing the need for more resilient solutions in the context of synthetic speech and AI-generated audio.
Contribution
This work introduces the first comprehensive benchmark dataset and evaluation framework for assessing audio watermarking robustness across multiple perturbations and settings.
Findings
Current audio watermarking methods are vulnerable to both common and adversarial perturbations.
Robustness varies significantly across different attack scenarios.
The benchmark highlights critical areas for improving audio watermarking techniques.
Abstract
The increasing realism of synthetic speech, driven by advancements in text-to-speech models, raises ethical concerns regarding impersonation and disinformation. Audio watermarking offers a promising solution by embedding human-imperceptible watermarks into AI-generated audio. However, the robustness of audio watermarking against common and adversarial perturbations remains understudied. We present AudioMarkBench, the first systematic benchmark for evaluating the robustness of audio watermarking against watermark removal and watermark forgery. AudioMarkBench includes a new dataset created from Common-Voice across languages, biological sexes, and ages, 3 state-of-the-art watermarking methods, and 15 types of perturbations. We benchmark the robustness of these methods against the perturbations in no-box, black-box, and white-box settings. Our findings highlight the vulnerabilities of current…
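To make the idea of "robustness against watermark removal" concrete, here is a minimal sketch of how a false negative rate under a no-box perturbation might be measured. It is not AudioMarkBench's code and does not use any of the three benchmarked methods: the watermark is a toy spread-spectrum scheme, the perturbation is additive Gaussian noise, and every name and parameter (`make_key`, `embed`, `detect_score`, `false_negative_rate`, the strength and threshold values) is a hypothetical chosen for illustration.

```python
import math
import random


def make_key(n, seed):
    """Pseudo-random +/-1 sequence acting as the secret watermark key (toy assumption)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(signal, key, strength):
    """Additive spread-spectrum embedding: add a scaled copy of the key to the host."""
    return [s + strength * k for s, k in zip(signal, key)]


def detect_score(signal, key):
    """Mean correlation with the key; close to `strength` when the watermark is present."""
    return sum(s * k for s, k in zip(signal, key)) / len(signal)


def add_gaussian_noise(signal, snr_db, rng):
    """No-box perturbation: white Gaussian noise at the requested signal-to-noise ratio."""
    power = sum(s * s for s in signal) / len(signal)
    sigma = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, sigma) for s in signal]


def false_negative_rate(snr_db, trials=50, n=4000, strength=0.05,
                        threshold=0.025, seed=0):
    """Fraction of watermarked, perturbed clips that the detector fails to flag."""
    rng = random.Random(seed)
    key = make_key(n, seed=1234)
    misses = 0
    for _ in range(trials):
        # Synthetic host audio: a 220 Hz tone at 16 kHz plus a little background noise.
        host = [0.3 * math.sin(2 * math.pi * 220 * t / 16000) + rng.gauss(0.0, 0.05)
                for t in range(n)]
        attacked = add_gaussian_noise(embed(host, key, strength), snr_db, rng)
        if detect_score(attacked, key) < threshold:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for snr in (40.0, 0.0, -30.0):
        print(f"SNR {snr:6.1f} dB -> FNR {false_negative_rate(snr):.2f}")
```

Sweeping the perturbation strength and plotting the false negative rate against it gives a simple robustness curve. A forgery evaluation would do the converse: perturb unwatermarked audio and count how often the detector wrongly flags it.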
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection · Music and Audio Processing
