GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis

Weizhi Liu; Yue Li; Dongdong Lin; Hui Tian; Haizhou Li

arXiv:2407.10471 · cs.CR · July 18, 2024

TL;DR

GROOT introduces a novel watermarking method for diffusion-model-based audio synthesis, enabling proactive supervision of generated audio with high robustness and minimal impact on audio quality.

Contribution

The paper proposes a paradigm in which watermark generation and audio synthesis occur simultaneously, using parameter-fixed diffusion models equipped with a dedicated encoder. The summary reports improved robustness over existing watermarking methods.

Findings

01. GROOT achieves around 95% watermark extraction accuracy under compound attacks.

02. It outperforms state-of-the-art watermarking techniques in robustness.

03. The method maintains high audio quality with minimal distortion.
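
The findings above quote an extraction accuracy without defining it. A minimal sketch of one common reading, assuming the figure is a bit-wise accuracy (the fraction of embedded watermark bits that the extractor recovers correctly); the function name `bit_accuracy` and the example bit strings are hypothetical and do not come from the paper:

```python
def bit_accuracy(embedded, extracted):
    """Fraction of watermark bits recovered correctly.

    Assumption: accuracy is measured bit by bit over watermarks of
    equal length. This is an illustrative metric, not necessarily
    the exact one used by the GROOT authors.
    """
    if len(embedded) != len(extracted):
        raise ValueError("watermarks must have the same length")
    if not embedded:
        raise ValueError("watermarks must not be empty")
    # Count positions where the extracted bit equals the embedded bit.
    matches = sum(1 for a, b in zip(embedded, extracted) if a == b)
    return matches / len(embedded)


# Hypothetical 20-bit watermark and an extraction with one flipped bit.
embedded = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
extracted = list(embedded)
extracted[3] = 1 - extracted[3]  # simulate a single bit error after an attack

print(bit_accuracy(embedded, extracted))  # 19 of 20 bits match
```

Under this reading, "around 95%" would correspond to roughly one wrong bit in every twenty after the attacks are applied.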

Abstract

Amid the burgeoning development of generative models like diffusion models, the task of differentiating synthesized audio from its natural counterpart grows more daunting. Deepfake detection offers a viable solution to combat this challenge. Yet, this defensive measure unintentionally fuels the continued refinement of generative models. Watermarking emerges as a proactive and sustainable tactic, preemptively regulating the creation and dissemination of synthesized content. Thus, this paper, as a pioneer, proposes the generative robust audio watermarking method (Groot), presenting a paradigm for proactively supervising the synthesized audio and its source diffusion models. In this paradigm, the processes of watermark generation and audio synthesis occur simultaneously, facilitated by parameter-fixed diffusion models equipped with a dedicated encoder. The watermark embedded within the…

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies

Methods: Diffusion