Perceptual MAE for Image Manipulation Localization: A High-level Vision Learner Focusing on Low-level Features

Xiaochen Ma; Jizhe Zhou; Xiong Xu; Zhuohang Jiang; Chi-Man Pun

arXiv:2310.06525 · cs.CV · October 11, 2023 · 1 citation

Open Access

TL;DR

This paper introduces Perceptual MAE, a novel approach that combines high-level semantic understanding with low-level feature analysis to improve image manipulation localization, outperforming existing methods.

Contribution

The paper reformulates IML as a high-level vision task, extending the Masked Autoencoder (MAE) with high-resolution inputs and a perceptual loss to unify low- and high-level features.
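As a rough illustration of the two ingredients named above, the sketch below shows generic MAE-style random patch masking and a feature-space (perceptual-style) loss. It is not the authors' implementation: the function names, the 1024-pixel image size, the 16-pixel patch size, and the 0.75 mask ratio are all assumptions chosen for the example, and the loss operates on plain lists of numbers standing in for the output of some fixed feature extractor.

```python
import random


def random_patch_mask(image_size, patch_size, mask_ratio, seed=None):
    """Split the patch grid of a square image into visible and masked indices.

    Generic MAE-style masking: a random subset of patches is hidden and the
    model is trained to reconstruct it. All parameter values are hypothetical.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    num_masked = int(num_patches * mask_ratio)

    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices

    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def feature_space_loss(features_pred, features_target):
    """Mean squared error between two equal-length feature vectors.

    A perceptual-style loss compares features from a fixed network rather
    than raw pixels; here the features are simply passed in as lists.
    """
    if len(features_pred) != len(features_target):
        raise ValueError("feature vectors must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(features_pred, features_target)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Assumed example values: 1024x1024 input, 16x16 patches, 75% masked.
    visible, masked = random_patch_mask(1024, 16, 0.75, seed=0)
    print(len(visible), len(masked))
    print(feature_space_loss([1.0, 2.0], [1.0, 4.0]))
```

With these assumed values, a 1024×1024 input yields a 64×64 grid of patches, which is one reason high-resolution inputs raise the cost of a ViT-style encoder relative to the usual 224×224 setting.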

Findings

01. Outperforms state-of-the-art methods on five datasets.

02. Effectively combines low-level and high-level features.

03. Enhances understanding of object semantics in IML.

Abstract

Multimedia forensics faces unprecedented challenges due to the rapid advancement of multimedia generation technology, making Image Manipulation Localization (IML) crucial in the pursuit of truth. The key to IML lies in revealing the artifacts or inconsistencies between tampered and authentic areas, which are evident in pixel-level features. Consequently, existing studies treat IML as a low-level vision task, focusing on localizing tampered regions by crafting pixel-level features such as RGB noise, edge signals, or high-frequency features. However, in practice, tampering commonly occurs at the object level, and different classes of objects have varying likelihoods of becoming tampering targets. Therefore, object semantics, in addition to pixel-level features, are vital for identifying tampered areas. This necessitates IML models to carry out a…

Taxonomy

Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Advanced Steganography and Watermarking Techniques

Methods: Masked autoencoder