Normalization Matters in Weakly Supervised Object Localization
Jeesoo Kim, Junsuk Choe, Sangdoo Yun, Nojun Kwak

TL;DR
This paper highlights the importance of normalization in weakly supervised object localization, reviews existing methods, and proposes a new normalization technique that significantly improves localization accuracy across multiple datasets and architectures.
Contribution
The paper reviews normalization strategies in WSOL and introduces a novel normalization method that enhances CAM-based localization performance.
Findings
Significant performance gains with the proposed normalization across datasets.
Normalization choice depends on dataset properties.
The new method outperforms conventional min-max normalization.
Abstract
Weakly-supervised object localization (WSOL) enables finding an object using a dataset without any localization information. By simply training a classification model using only image-level annotations, the feature map of the model can be utilized as a score map for localization. In spite of many WSOL methods proposing novel strategies, there has not been any de facto standard about how to normalize the class activation map (CAM). Consequently, many WSOL methods have failed to fully exploit their own capacity because of the misuse of a normalization method. In this paper, we review many existing normalization methods and point out that they should be used according to the property of the given dataset. Additionally, we propose a new normalization method which substantially enhances the performance of any CAM-based WSOL methods. Using the proposed normalization method, we provide a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
