Attack-Resistant Watermarking for AIGC Image Forensics via Diffusion-based Semantic Deflection
Qingyu Liu, Yitao Zhang, Zhongjie Ba, Chao Shuai, Peng Cheng, Tianhang Zheng, Zhibo Wang

TL;DR
This paper introduces PAI, a diffusion-based watermarking framework that enhances copyright protection, attack detection, and tamper localization for AI-generated images, outperforming existing methods in robustness and semantic accuracy.
Contribution
PAI is a novel, training-free watermarking method that uses a key-conditioned deflection mechanism to improve robustness and enable semantic tamper localization in diffusion-based AIGC images.
Findings
Achieves 98.43% verification accuracy, surpassing SOTA by 37.25%.
Maintains strong tampering localization against advanced AIGC edits.
Remains effective against 12 different attack methods.
Abstract
Protecting the copyright of user-generated AI images is an emerging challenge as AIGC becomes pervasive in creative workflows. Existing watermarking methods (1) remain vulnerable to real-world adversarial threats, often forced to trade off between defenses against spoofing and removal attacks; and (2) cannot support semantic-level tamper localization. We introduce PAI, a training-free inherent watermarking framework for AIGC copyright protection, plug-and-play with diffusion-based AIGC services. PAI simultaneously provides three key functionalities: robust ownership verification, attack detection, and semantic-level tampering localization. Unlike existing inherent watermark methods that only embed watermarks at noise initialization of diffusion models, we design a novel key-conditioned deflection mechanism that subtly steers the denoising trajectory according to the user key. Such…
Peer Reviews
Decision: ICLR 2026 Poster
1. The paper introduces a novel semantic deflection mechanism within diffusion processes, embedding watermarks in the semantic space rather than pixel space, a creative and technically sound idea that improves robustness against diverse attacks.
2. The proposed PAI framework integrates watermark generation, verification, attack detection, and tampering localization into a unified pipeline, demonstrating both conceptual coherence and practical applicability for real-world AIGC scenarios.
3. Ex
1. The experimental validation mainly focuses on standard diffusion-based AIGC models and common attack types. The generalization of PAI to non-diffusion models (e.g., GAN-based generators or real-world edited images) remains unclear.
2. Although the authors claim imperceptibility of watermarks, no formal perceptual metrics or human evaluations are provided to verify that the embedded signals do not degrade visual quality or introduce detectable artifacts.
3. The framework is exclusively desig
1. Novel robustness design: This paper integrates attack simulation directly into training, improving resilience against diffusion and regeneration attacks.
2. Comprehensive experiments: tests under multiple real-world generative attacks show consistent superiority.
3. Good balance of fidelity and robustness: watermark imperceptibility is maintained with strong retrieval accuracy.
4. Well-structured ablation studies: This paper clearly isolates the contributions of dual-domain embedding and AS
1. Limited theoretical analysis: the paper lacks a formal framework or justification for robustness improvements beyond empirical results.
2. The evaluation focuses on diffusion-based attacks. Adversarial or large semantic edits, such as InstructPix2Pix, should be included in the experiments.
3. The computational cost of dual-domain embedding and ASM could hinder deployment in high-throughput systems. Taking 7950.59 ms to watermark an image seems too long.
4. Generalization uncertainty
- The idea of moving inherent watermarking beyond static noise initialization to active, key-conditioned "deflection" of the denoising trajectory is a significant conceptual leap. It creates a much deeper and more complex entanglement between the secret key and the final image semantics.
- The most original and impactful contribution is the verification method. Correctly identifying the 1D scalar metric as the root cause of the removal/spoofing trade-off is a key insight. Replacing it with a hi
The deflection intensity $\gamma=0.1$ and the application of deflection for only the first five steps are presented without extensive ablation. The appendix (A.6.3) only ablates the number of steps (5, 10, 15), showing that 5 is sufficient. However, the intensity $\gamma$ is a critical parameter that presumably balances image quality against robustness. An ablation study on $\gamma$ would provide a more complete picture of the method's properties and trade-offs.
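To make the two hyperparameters in this comment concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a hypothetical additive form of deflection: a unit direction derived from the user key is added to the latent, scaled by $\gamma$, during the first few denoising steps only. The function names, the SHA-256 key derivation, and the toy "denoiser" (a simple shrink toward zero) are illustrative assumptions; only $\gamma=0.1$ and the five-step window come from the review text.

```python
import hashlib
import random

GAMMA = 0.1         # deflection intensity quoted in the review
DEFLECT_STEPS = 5   # deflection applied only during the first five steps


def key_direction(key, dim):
    """Hypothetical: derive a deterministic unit-norm direction from a user key."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]


def toy_denoise_step(x):
    """Stand-in for a real diffusion denoiser: shrink the latent toward zero."""
    return [0.9 * xi for xi in x]


def generate(key, dim=16, total_steps=20, gamma=GAMMA,
             deflect_steps=DEFLECT_STEPS, seed=0):
    """Run the toy denoising loop, nudging early steps along the key direction."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # initial noise latent
    d = key_direction(key, dim)
    for t in range(total_steps):
        x = toy_denoise_step(x)
        if t < deflect_steps:
            # Assumed additive deflection; the paper's actual rule may differ.
            x = [xi + gamma * di for xi, di in zip(x, d)]
    return x
```

In this toy setting, a larger `gamma` pushes the final latent further from the undeflected result, which is the quality-versus-robustness trade-off that an ablation on $\gamma$ would need to measure on real images.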
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis
