SERUM: Simple, Efficient, Robust, and Unifying Marking for Diffusion-based Image Generation
Jan Kociszewski, Hubert Jastrzębski, Tymoteusz Stępkowski, Filip Manijak, Krzysztof Rojek, Franziska Boenisch, Adam Dziedzic

TL;DR
SERUM introduces a simple, efficient, and robust watermarking method for diffusion model-generated images, enabling reliable detection with minimal impact on image quality and supporting multi-user scenarios.
Contribution
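It presents a watermarking approach that adds a unique watermark noise pattern to the initial diffusion generation noise and trains a lightweight detector to recognize it, offering robustness, efficiency, and multi-user support, and outperforming prior methods in most evaluated scenarios.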
Findings
Achieves the highest true positive rate (TPR) at a 1% false positive rate (FPR) in most scenarios.
Supports fast watermark injection and detection with low detector training overhead.
Maintains negligible impact on image quality.
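The headline metric, TPR at 1% FPR, can be computed from detector scores roughly as follows. This is a minimal sketch, not the paper's evaluation code: the function name `tpr_at_fpr`, the toy scores, and the convention that a higher score means "more likely watermarked" are all assumptions made for illustration.

```python
import math


def tpr_at_fpr(neg_scores, pos_scores, target_fpr=0.01):
    """Return (tpr, fpr, threshold) with the FPR held at or below target_fpr.

    Assumes higher score = "more likely watermarked"; an image is flagged
    when its score is strictly greater than the threshold.
    """
    ranked = sorted(neg_scores, reverse=True)
    # Largest number of clean images we may flag without exceeding target_fpr.
    k = math.floor(target_fpr * len(ranked) + 1e-9)
    # Using the (k+1)-th largest clean score as the threshold flags at most k.
    threshold = ranked[k] if k < len(ranked) else float("-inf")
    tpr = sum(s > threshold for s in pos_scores) / len(pos_scores)
    fpr = sum(s > threshold for s in neg_scores) / len(neg_scores)
    return tpr, fpr, threshold


# Toy scores (hypothetical, not results from the paper).
clean = list(range(100))        # detector scores on non-watermarked images
marked = list(range(50, 150))   # detector scores on watermarked images
tpr, fpr, thr = tpr_at_fpr(clean, marked, target_fpr=0.01)
print(f"TPR={tpr:.2f} at FPR={fpr:.2f} (threshold={thr})")
```

With fewer than `1 / target_fpr` clean scores, the threshold falls back to the maximum clean score and the measured FPR is 0, so tighter FPR targets need proportionally more negative samples.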
Abstract
We propose SERUM: an intriguingly simple yet highly effective method for marking images generated by diffusion models (DMs). We only add a unique watermark noise to the initial diffusion generation noise and train a lightweight detector to identify watermarked images, simplifying and unifying the strengths of prior approaches. SERUM provides robustness against any image augmentations or watermark removal attacks and is extremely efficient, all while maintaining negligible impact on image quality. In contrast to prior approaches, which are often only resilient to limited perturbations and incur significant training, injection, and detection costs, our SERUM achieves remarkable performance, with the highest true positive rate (TPR) at a 1% false positive rate (FPR) in most scenarios, along with fast injection and detection and low detector training overhead. Its decoupled architecture…
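To make the injection step concrete, here is a minimal sketch of mixing a key-specific Gaussian pattern into the initial generation noise. The abstract does not specify how the watermark noise is combined, so the variance-preserving blend, the `strength` parameter, and every function name below are assumptions. The correlation check only confirms that the pattern is present in the latent; it stands in for, and does not reproduce, the trained image-space detector.

```python
import math
import random


def gaussian_pattern(key, n):
    """Fixed standard-normal pattern derived from an integer key (hypothetical)."""
    rng = random.Random(key)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def watermarked_noise(key, n, strength=0.3, seed=None):
    """Blend fresh Gaussian noise with the key's pattern.

    Assumed blend: z = sqrt(1 - a^2) * eps + a * w, which keeps each entry
    at unit variance so the sampler still sees standard-normal-like noise.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    w = gaussian_pattern(key, n)
    rng = random.Random(seed)  # per-image randomness, independent of the key
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    a, b = strength, math.sqrt(1.0 - strength * strength)
    return [b * e + a * p for e, p in zip(eps, w)]


def correlation(x, y):
    """Mean elementwise product; about `strength` for the matching pattern."""
    return sum(p * q for p, q in zip(x, y)) / len(x)


n = 4096  # illustrative latent size, flattened
z = watermarked_noise(key=7, n=n, strength=0.3, seed=123)
print("own key:  ", round(correlation(z, gaussian_pattern(7, n)), 3))
print("other key:", round(correlation(z, gaussian_pattern(8, n)), 3))
```

In a real pipeline the list `z` would be reshaped to the model's latent shape and passed as the starting noise; the multi-user setting then corresponds to one `key` per user under these assumptions.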
Peer Reviews
Decision: ICLR 2026 Poster
- The main idea proposed in the paper is simple but effective. By decoupling the watermark embedding in the initial noise from the detection mechanism, which is a separate, lightweight classifier, the method avoids the bottlenecks of prior works: DDIM inversion, which is slow, and expensive finetuning.
- The authors have presented sufficient evidence of this method's better robustness in Table 1, against 3 other DM-based methods on all standard perturbations. In addition to the method ach
- Multi-user support involves training a unique detector for each user's unique noise pattern. Figure 3 demonstrates that this approach works well for up to 10 users, but this does not guarantee its viability at a large scale. A system with millions of users would require managing millions of individual detector models. As the number of unique noise patterns grows, the probability of 'collisions' where a pattern coincidentally produces a signature detectable by another user's detector may increa
**Originality:** This work introduces a new approach to diffusion model watermarking by injecting a unique Gaussian watermark into the initial noise and employing a lightweight external detector. The technique is distinct in its simplicity, enabling scalable multi-user attribution without modifying the underlying generative model.
**Quality:** The methodology is supported by comprehensive experiments on multiple Stable Diffusion models, benchmarking detection robustness against diverse pertur
1. Although the detector is described as lightweight, scaling to much larger numbers of unique watermarks (the multi-user setting) may face practical bottlenecks. The paper assesses up to 10 users. In a real-world application, there will be hundreds or even thousands of users. I suggest that the authors investigate a large-scale setting further, if possible.
2. The work primarily compares SERUM to recent DM watermarking approaches like Stable Signature, RingID, and GaussMarker, but
Method-wise, the pipeline is simple in design and the description is clear and easy to understand. Experiment-wise, the authors conduct experiments on SD1.4, SD2.0, and SD2.1, with fairly comprehensive evaluation.
1. Robustness to Crop and Scale remains an issue. Retaining 75% of the image in the evaluation is a relatively weak setting. I think it is quite common to retain only 50% of an image in practice. How does performance change under a stronger Crop and Scale attack?
2. The metric TPR at 1% FPR is weak. Scores are all close to 100%, which makes it hard to distinguish performance differences between methods. Although prior work used this evaluation metric, I still recommend controlling FPR at a much lower level, for examp
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Steganography and Watermarking Techniques · Digital Media Forensic Detection
