Inharmonious Region Localization
Jing Liang, Li Niu, Liqing Zhang

TL;DR
This paper introduces a deep learning framework for localizing inharmonious regions in images, that is, manipulated regions that are incompatible with the background. It fuses multi-scale features in both the encoder and the decoder and uses an attention block to suppress redundant information.
Contribution
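The paper proposes a Bi-directional Feature Integration (BFI) block and a Global-context Guided Decoder (GGD) block to fuse multi-scale features in the encoder and decoder respectively, and places a Mask-guided Dual Attention (MDA) block between the encoder and decoder to suppress redundant information.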
Findings
Achieves competitive performance for inharmonious region localization on the image harmonization dataset.
Demonstrates the effectiveness of multi-scale feature aggregation.
Provides source code for reproducibility at https://github.com/bcmi/DIRL.
Abstract
The advance of image editing techniques allows users to create artistic works, but the manipulated regions may be incompatible with the background. Localizing the inharmonious region is an appealing yet challenging task. Realizing that this task requires effective aggregation of multi-scale contextual information and suppression of redundant information, we design a novel Bi-directional Feature Integration (BFI) block and a Global-context Guided Decoder (GGD) block to fuse multi-scale features in the encoder and decoder respectively. We also employ a Mask-guided Dual Attention (MDA) block between the encoder and decoder to suppress the redundant information. Experiments on the image harmonization dataset demonstrate that our method achieves competitive performance for inharmonious region localization. The source code is available at https://github.com/bcmi/DIRL.
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
