Learning Consistency from High-quality Pseudo-labels for Weakly Supervised Object Localization
Kangbo Sun, Jie Zhu

TL;DR
This paper introduces a two-stage pseudo-labeling approach to weakly supervised object localization: a localization network is first initialized from generated pseudo-labels, then refined by learning consistency from the high-quality ones, improving accuracy on CUB-200-2011, ImageNet-1k, and Tiny-ImageNet.
Contribution
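The paper proposes a mask-based pseudo-label generator to initialize the localization network, and a method that scores pseudo-label confidence from classification discrimination so that the network can be refined through consistency learning on high-quality pseudo-labels.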
Findings
Reports state-of-the-art localization results on the CUB-200-2011, ImageNet-1k, and Tiny-ImageNet datasets.
Shows that learning consistency from high-quality pseudo-labels is effective for weakly supervised localization.
Improves localization accuracy over existing methods on these benchmarks.
Abstract
Pseudo-supervised learning methods have been shown to be effective for weakly supervised object localization tasks. However, the effectiveness depends on the powerful regularization ability of deep neural networks. Based on the assumption that the localization network should have similar location predictions on different versions of the same image, we propose a two-stage approach to learn more consistent localization. In the first stage, we propose a mask-based pseudo-label generator algorithm, and use the pseudo-supervised learning method to initialize an object localization network. In the second stage, we propose a simple and effective method for evaluating the confidence of pseudo-labels based on classification discrimination, and by learning consistency from high-quality pseudo-labels, we further refine the localization network to get better localization performance. Experimental…
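
The consistency assumption in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The summary does not say which image transformations, loss, or confidence score the paper uses, so the horizontal flip, the L1 box distance, the threshold, and the names hflip_box, consistency_loss, and select_confident are all assumptions made for illustration. Boxes are (x1, y1, x2, y2) in normalized image coordinates.

```python
import numpy as np


def hflip_box(box, width=1.0):
    """Map a box (x1, y1, x2, y2) into the horizontally flipped image."""
    x1, y1, x2, y2 = box
    return np.array([width - x2, y1, width - x1, y2], dtype=float)


def consistency_loss(pred_orig, pred_flip, width=1.0):
    """L1 distance between the box predicted on the original image and the
    box predicted on the flipped image after mapping it back.

    Assumption: a flip stands in for "different versions of the same image".
    """
    pred_flip_back = hflip_box(pred_flip, width)
    return float(np.mean(np.abs(np.asarray(pred_orig, dtype=float) - pred_flip_back)))


def select_confident(pseudo_boxes, scores, threshold=0.5):
    """Keep pseudo-labels whose confidence score reaches the threshold.

    Assumption: `scores` is a placeholder for the paper's
    classification-discrimination confidence, which is not specified here.
    """
    scores = np.asarray(scores, dtype=float)
    return [box for box, s in zip(pseudo_boxes, scores) if s >= threshold]


if __name__ == "__main__":
    box = [0.1, 0.2, 0.4, 0.8]
    flipped = hflip_box(box)
    # A perfectly consistent localizer predicts the flipped box on the
    # flipped image, so the loss is (numerically) zero.
    print(consistency_loss(box, flipped))
    # Only pseudo-labels with a high enough score are used for refinement.
    print(select_confident([[0, 0, 1, 1], [0.2, 0.2, 0.5, 0.5]], [0.9, 0.3]))
```

In a training loop along these lines, only the boxes returned by select_confident would serve as targets in the second stage, and the consistency term would be added to the regression loss on those targets.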
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
