Watermarking Training Data of Music Generation Models
Pascal Epple, Igor Shilov, Bozhidar Stevanoski, Yves-Alexandre de Montjoye

TL;DR
This paper explores using audio watermarking to detect unauthorized training data in music generation models, showing that watermarking can influence model outputs and assessing the robustness of watermarking techniques.
Contribution
It introduces a method to identify watermarked training data in music models and evaluates the impact and robustness of audio watermarking techniques in this context.
Findings
Watermarking can cause noticeable shifts in model outputs.
Imperceptible watermarking techniques are effective in detection.
Robustness of watermarking against removal techniques varies.
Abstract
Generative Artificial Intelligence (Gen-AI) models are increasingly used to produce content across domains, including text, images, and audio. While these models represent a major technical breakthrough, they gain their generative capabilities from being trained on enormous amounts of human-generated content, which often includes copyrighted material. In this work, we investigate whether audio watermarking techniques can be used to detect an unauthorized usage of content to train a music generation model. We compare outputs generated by a model trained on watermarked data to a model trained on non-watermarked data. We study factors that impact the model's generation behaviour: the watermarking technique, the proportion of watermarked samples in the training set, and the robustness of the watermarking technique against the model's tokenizer. Our results show that audio watermarking…
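The comparison described in the abstract, outputs of a model trained on watermarked data versus outputs of a model trained on clean data, can be illustrated with a simple two-sample test. The following is a minimal sketch, not the authors' procedure: it assumes that each generated clip has already been reduced to a single scalar "detector score" (a hypothetical quantity, since the excerpt does not say which metric the paper uses), and it applies a generic one-sided permutation test to the difference in mean scores. All numbers are synthetic placeholders.

```python
import random
from statistics import mean


def permutation_test(scores_a, scores_b, n_perm=2000, seed=0):
    """One-sided permutation test on the difference in means (a minus b).

    scores_a, scores_b: hypothetical per-clip detector scores for outputs of
    the model trained on watermarked data and the model trained on clean data.
    Returns (observed difference, p-value).
    """
    rng = random.Random(seed)
    observed = mean(scores_a) - mean(scores_b)
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign clips to the two groups and recompute the shift.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic placeholder scores, NOT results from the paper.
    rng = random.Random(1)
    scores_wm_model = [rng.gauss(0.6, 0.1) for _ in range(50)]
    scores_clean_model = [rng.gauss(0.5, 0.1) for _ in range(50)]
    shift, p = permutation_test(scores_wm_model, scores_clean_model)
    print(f"mean shift = {shift:.3f}, p = {p:.4f}")
```

A small p-value would indicate that the scores of the first model's outputs are shifted upward relative to the second model's. Whether such a shift appears in practice depends on the factors the abstract lists: the watermarking technique, the proportion of watermarked training samples, and whether the watermark survives the model's tokenizer.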
Taxonomy
Topics: Advanced Steganography and Watermarking Techniques · Chaos-based Image/Signal Encryption · Computer Graphics and Visualization Techniques
