JND-Guided Neural Watermarking with Spatial Transformer Decoding for Screen-Capture Robustness
Jiayi Qin, Jingwei Li, Chuan Wu

TL;DR
This paper introduces an end-to-end deep learning framework for screen-capture robust watermarking. It simulates realistic screen-capture distortions during training, uses a perceptual loss to keep the watermark imperceptible, and includes automated decoding modules.
Contribution
The paper presents an end-to-end approach that combines comprehensive distortion simulation, a perceptual loss, and automated watermark localization to withstand complex screen-capture distortions.
Findings
Achieves an average PSNR of 30.94 dB and SSIM of 0.94 on watermarked images.
Successfully embeds 127-bit payloads with high perceptual quality.
Demonstrates robustness against realistic screen-capture distortions.
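To put the reported PSNR in context, the sketch below computes PSNR from its standard definition and inverts it to recover the mean squared error implied by a given value. It assumes 8-bit grayscale images stored as nested lists; the function names `psnr` and `mse_for_psnr` are illustrative and do not come from the paper.

```python
import math


def psnr(host, marked, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (nested lists)."""
    flat_host = [p for row in host for p in row]
    flat_marked = [p for row in marked for p in row]
    if not flat_host or len(flat_host) != len(flat_marked):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_host, flat_marked)) / len(flat_host)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_for_psnr(psnr_db, peak=255.0):
    """Invert the PSNR definition: the MSE that yields a given PSNR."""
    return peak ** 2 / 10 ** (psnr_db / 10.0)


if __name__ == "__main__":
    host = [[100, 100], [100, 100]]
    marked = [[107, 93], [107, 93]]  # every pixel shifted by 7 gray levels
    print(f"PSNR of toy example: {psnr(host, marked):.2f} dB")
    print(f"MSE implied by 30.94 dB: {mse_for_psnr(30.94):.2f}")
```

Under this 8-bit assumption, 30.94 dB corresponds to an MSE of about 52, or a root-mean-square change of roughly 7 gray levels per pixel.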
Abstract
Screen-shooting robust watermarking aims to imperceptibly embed extractable information into host images such that the watermark survives the complex distortion pipeline of screen display and camera recapture. However, achieving high extraction accuracy while maintaining satisfactory visual quality remains an open challenge, primarily because the screen-shooting channel introduces severe and entangled degradations including Moiré patterns, color-gamut shifts, perspective warping, and sensor noise. In this paper, we present an end-to-end deep learning framework that jointly optimizes watermark embedding and extraction for screen-shooting robustness. Our framework incorporates three key innovations: (i) a comprehensive noise simulation layer that faithfully models realistic screen-shooting distortions, notably including a physically motivated Moiré pattern generator, enabling…
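The abstract does not describe how its Moiré generator works. As a rough illustration of the underlying effect only, the sketch below multiplies two sinusoidal gratings, one standing in for the screen pixel grid and one for the camera sensor grid, whose slightly different frequencies and orientations produce a low-frequency beat. Every name and parameter here (`moire_mask`, `apply_moire`, `f_screen`, `f_sensor`, `angle_deg`, `strength`) is a hypothetical choice for this example, not the paper's method.

```python
import math


def moire_mask(height, width, f_screen=0.45, f_sensor=0.47, angle_deg=3.0):
    """Product of two gratings; values lie in [0, 1].

    f_screen and f_sensor are spatial frequencies in cycles per pixel
    (assumed values); the sensor grating is rotated by angle_deg.
    """
    theta = math.radians(angle_deg)
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Screen grating varies along x only.
            screen = 0.5 * (1.0 + math.cos(2.0 * math.pi * f_screen * x))
            # Sensor grating varies along a slightly rotated axis.
            u = x * math.cos(theta) + y * math.sin(theta)
            sensor = 0.5 * (1.0 + math.cos(2.0 * math.pi * f_sensor * u))
            row.append(screen * sensor)
        mask.append(row)
    return mask


def apply_moire(image, mask, strength=0.15):
    """Attenuate each pixel by the mask; strength=0 leaves the image unchanged."""
    return [
        [p * ((1.0 - strength) + strength * m) for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


if __name__ == "__main__":
    image = [[128.0] * 64 for _ in range(64)]
    distorted = apply_moire(image, moire_mask(64, 64))
    print(min(min(r) for r in distorted), max(max(r) for r in distorted))
```

A noise layer used during training would need the same computation expressed in differentiable tensor operations; plain Python lists are used here only to keep the example self-contained.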
