Robust Image Self-Recovery against Tampering using Watermark Generation with Pixel Shuffling
Minyoung Kim, Paul Hongsuck Seo

TL;DR
ReImage is a neural watermarking framework that enables robust image self-recovery from tampering by embedding a shuffled version of the image as a watermark, achieving high-quality restoration across various attack scenarios.
Contribution
The paper introduces ReImage, a novel neural watermarking method with pixel shuffling for effective image self-recovery, addressing limitations of existing techniques.
Findings
Achieves state-of-the-art recovery quality in tampering scenarios
Effectively reconstructs original images from manipulated versions
Demonstrates robustness across diverse tampering attacks
Abstract
The rapid growth of Artificial Intelligence-Generated Content (AIGC) raises concerns about the authenticity of digital media. In this context, image self-recovery, reconstructing original content from its manipulated version, offers a practical solution for understanding the attacker's intent and restoring trustworthy data. However, existing methods often fail to accurately recover tampered regions, falling short of the primary goal of self-recovery. To address this challenge, we propose ReImage, a neural watermarking-based self-recovery framework that embeds a shuffled version of the target image into itself as a watermark. We design a generator that produces watermarks optimized for neural watermarking and introduce an image enhancement module to refine the recovered image. We further analyze and resolve key limitations of shuffled watermarking, enabling its effective use in…
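To make the pixel-shuffling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a keyed random permutation stands in for the shuffling step and that tampering simply destroys the watermark values located inside the tampered block; the neural embedding, the watermark generator, and the image enhancement module are all omitted. The names `make_permutation`, `shuffle_pixels`, and `unshuffle_pixels` are hypothetical.

```python
import random

def make_permutation(n, seed):
    """Return a fixed pseudo-random permutation of range(n), keyed by seed (assumed scheme)."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

def shuffle_pixels(flat, perm):
    """Build the shuffled watermark: output position i holds flat[perm[i]]."""
    return [flat[p] for p in perm]

def unshuffle_pixels(shuffled, perm):
    """Invert shuffle_pixels, sending each value back to its original position."""
    out = [None] * len(perm)
    for i, p in enumerate(perm):
        out[p] = shuffled[i]
    return out

H, W = 16, 16
# Toy grayscale image stored row-major as a flat list.
image = [(r * W + c) % 256 for r in range(H) for c in range(W)]
perm = make_permutation(H * W, seed=0)
watermark = shuffle_pixels(image, perm)

# Simplifying assumption: tampering a 4x4 block wipes out the watermark
# values stored at those same positions (marked here as None).
tampered = {r * W + c for r in range(4) for c in range(4)}
damaged = [None if i in tampered else v for i, v in enumerate(watermark)]

# After unshuffling, the lost values land at scattered original positions
# rather than all falling inside the tampered block.
recovered = unshuffle_pixels(damaged, perm)
lost = [i for i, v in enumerate(recovered) if v is None]
lost_inside_block = [i for i in lost if i in tampered]
print(len(lost), "pixels lost,", len(lost_inside_block), "of them inside the tampered block")
```

Without shuffling, the watermark copy of the tampered block would be destroyed together with the block itself. With shuffling, the missing values are dispersed across the image, which is the misalignment effect the abstract relies on; filling in those isolated gaps is left to later stages in the paper.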
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. Novel and Effective Framework: The paper proposes ReImage, a well-designed neural watermarking-based self-recovery method that leverages pixel shuffling to spatially misalign watermark content with the image. This innovation addresses a known issue: recovery fails when tampered and watermarked regions are aligned and when tampered regions are clustered in the secret image. 2. Thorough Design and Ablation Study: The architecture is modular and interpretable, consisting of components lik
1. Insufficient Evaluation under diverse degradations: While robustness to three types of degradations is briefly evaluated (Gaussian noise, JPEG compression, and Poisson noise), the degradation types are limited, and there is no geometric degradation included (Imuge has included cropping in its experimental evaluation). It is well known that geometric degradations, such as cropping, pose significant challenges for watermarking models, suggesting a potential trade-off between robustness and imag
- The paper's primary contribution, reformulating MLLM-based localization from a segmentation-based "stitched" pipeline to an end-to-end autoregressive token generation task, is highly novel and elegant. - The problems are clearly articulated, and the proposed solutions (KSD, SwEI) are well-motivated, technically sound, and directly target the identified weaknesses. The experimental evaluation is comprehensive and rigorous. - The fact that this MLLM-based model surpasses dedicated, SOTA forens
- The SwEI module's success relies on injecting features from SparseViT, which is itself a SOTA forensic expert. This raises a question of how much of the performance is simply a successful distillation of the expert model. - The proposed TamperTok architecture involves running a large MLLM and an expert model (SparseViT) in parallel to extract features, followed by an autoregressive decoding step. This is almost certainly more computationally expensive (in terms of parameters, GFLOPs, and late
- The authors design several modules to improve self-recovery, and the recovered images are better. - The authors provide many experimental results showing the performance in different aspects.
- This paper reads to me more like an engineering paper. Many parts show minor academic innovation: the two-stage pipeline is similar to the compared Imuge, the INN model is similar to W-RAE, and the training details (mixed and fixed JPEG, filtering, etc.) are standard. The proposed pixel shuffling for the watermark and the two modules (IE and WG) are also trivial designs. - Pixel shuffling improved image recovery, but also increases the entropy of the information to be hidden. After reading Section 3.3.
Results show that the recovery quality of the method is higher. In many tests, the results are much better than the existing methods.
1. The validation dataset appears to differ from the one used for the baselines, and how these baseline models were prepared remains unclear. 2. The methodology appears to lack novelty: invertible networks and mask-free generation are also used in Imuge. 3. The reviewer cannot easily benchmark the advantage of this method, either via theoretical or empirical analysis, or via source code (no code or API provided. though it is completely optional, the reviewer canno
Taxonomy
Topics: Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques · Generative Adversarial Networks and Image Synthesis
