From Passive Perception to Active Memory: A Weakly Supervised Image Manipulation Localization Framework Driven by Coarse-Grained Annotations
Zhiqing Guo, Dongdong Xi, Songlin Li, Gaobo Yang

TL;DR
This paper introduces BoxPromptIML, a weakly-supervised image manipulation localization framework that balances annotation cost and accuracy by using coarse annotations, knowledge distillation, and a memory-inspired feature fusion module.
Contribution
The paper proposes a novel weakly-supervised IML method that combines coarse region annotations, knowledge distillation from SAM, and a memory-inspired feature fusion module to improve localization accuracy and efficiency.
Findings
Outperforms fully-supervised models in accuracy and robustness.
Maintains low annotation cost and high generalization.
Effective on both in-distribution and out-of-distribution datasets.
Abstract
Image manipulation localization (IML) faces a fundamental trade-off between minimizing annotation cost and achieving fine-grained localization accuracy. Existing fully-supervised IML methods depend heavily on dense pixel-level mask annotations, which limits scalability to large datasets or real-world deployment. In contrast, the majority of existing weakly-supervised IML approaches are based on image-level labels, which greatly reduce annotation effort but typically lack precise spatial localization. To address this dilemma, we propose BoxPromptIML, a novel weakly-supervised IML framework that effectively balances annotation cost and localization performance. Specifically, we propose a coarse region annotation strategy, which can generate relatively accurate manipulation masks at lower cost. To improve model efficiency and facilitate deployment, we further design an efficient lightweight…
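As a rough illustration of the two ideas named above (a coarse region annotation and distillation from a teacher into a lightweight student), the following minimal sketch rasterizes a bounding box into a binary mask and scores a student's per-pixel probabilities against it with binary cross-entropy. Everything here is an assumption made for illustration: the abstract does not state how the coarse annotation is turned into a mask or which loss is used, the box mask merely stands in for a teacher-produced mask, and the names `box_to_mask` and `distillation_bce` are hypothetical.

```python
import math


def box_to_mask(height, width, box):
    """Rasterize a box (x0, y0, x1, y1) into a 0/1 mask.

    Assumption: x1 and y1 are exclusive. This is a stand-in for a
    teacher-generated mask, not the paper's annotation pipeline.
    """
    x0, y0, x1, y1 = box
    return [
        [1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
        for y in range(height)
    ]


def distillation_bce(student_probs, teacher_mask, eps=1e-7):
    """Mean per-pixel binary cross-entropy between student and teacher.

    Assumption: BCE is used only as a generic example of a distillation
    objective; the source does not name the loss.
    """
    total = 0.0
    count = 0
    for student_row, teacher_row in zip(student_probs, teacher_mask):
        for p, t in zip(student_row, teacher_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


# Hypothetical 4x4 image with a 2x2 annotated region.
coarse_mask = box_to_mask(4, 4, (1, 1, 3, 3))

# An uninformed student that predicts 0.5 everywhere.
uniform_student = [[0.5] * 4 for _ in range(4)]
uniform_loss = distillation_bce(uniform_student, coarse_mask)

# A student that reproduces the teacher mask exactly.
perfect_student = [[float(v) for v in row] for row in coarse_mask]
perfect_loss = distillation_bce(perfect_student, coarse_mask)
```

In this toy setup the loss is lower the more closely the student matches the mask, which is the general behavior a distillation objective is meant to provide.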
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
