Perceptual MAE for Image Manipulation Localization: A High-level Vision Learner Focusing on Low-level Features
Xiaochen Ma, Jizhe Zhou, Xiong Xu, Zhuohang Jiang, Chi-Man Pun

TL;DR
This paper introduces Perceptual MAE, a novel approach that combines high-level semantic understanding with low-level feature analysis to improve image manipulation localization, outperforming existing methods.
Contribution
The paper reformulates image manipulation localization (IML) as a high-level vision task, enhancing the Masked Autoencoder (MAE) with high-resolution inputs and a perceptual loss to unify low- and high-level features.
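To make the idea of a perceptual loss concrete, here is a minimal sketch in pure Python. It assumes that "perceptual" means comparing feature maps produced by a fixed extractor rather than raw pixels. The `horizontal_edges` extractor, the `weight` parameter, and all function names are hypothetical stand-ins chosen for illustration; they are not taken from the paper, which may use a different feature network and weighting.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def pixel_mse(a: Image, b: Image) -> float:
    """Mean squared error computed directly on the values of two same-sized grids."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def horizontal_edges(img: Image) -> Image:
    """Hypothetical frozen feature extractor: differences between horizontal neighbours."""
    return [[row[i + 1] - row[i] for i in range(len(row) - 1)] for row in img]


def perceptual_mse(a: Image, b: Image,
                   features: Callable[[Image], Image] = horizontal_edges) -> float:
    """Mean squared error between the feature maps of the two images."""
    return pixel_mse(features(a), features(b))


def combined_loss(recon: Image, target: Image, weight: float = 0.5) -> float:
    """Pixel-level loss plus a weighted feature-level term (weight is illustrative)."""
    return pixel_mse(recon, target) + weight * perceptual_mse(recon, target)


target = [[0.0, 0.0, 1.0, 1.0],
          [0.0, 0.0, 1.0, 1.0]]
# Same edge structure, but every pixel is brighter by 0.1.
shifted = [[v + 0.1 for v in row] for row in target]
# Same average brightness per row, but the edge has been removed.
flat = [[0.5, 0.5, 0.5, 0.5],
        [0.5, 0.5, 0.5, 0.5]]

print(pixel_mse(shifted, target), perceptual_mse(shifted, target))
print(pixel_mse(flat, target), perceptual_mse(flat, target))
```

In this toy setup, a uniform brightness shift changes the pixel term but leaves the feature term near zero, while removing the edge is penalized by both terms. That is the general motivation for pairing a pixel-level objective with a feature-level one.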
Findings
Outperforms state-of-the-art methods on five datasets.
Effectively combines low-level and high-level features.
Enhances understanding of object semantics in IML.
Abstract
Nowadays, multimedia forensics faces unprecedented challenges due to the rapid advancement of multimedia generation technology, thereby making Image Manipulation Localization (IML) crucial in the pursuit of truth. The key to IML lies in revealing the artifacts or inconsistencies between the tampered and authentic areas, which are evident under pixel-level features. Consequently, existing studies treat IML as a low-level vision task, focusing on allocating tampered masks by crafting pixel-level features such as image RGB noises, edge signals, or high-frequency features. However, in practice, tampering commonly occurs at the object level, and different classes of objects have varying likelihoods of becoming targets of tampering. Therefore, object semantics are also vital in identifying the tampered areas in addition to pixel-level features. This necessitates IML models to carry out a…
Taxonomy
Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques
Methods: Masked autoencoder
