On learning to localize objects with minimal supervision
Hyun Oh Song, Ross Girshick, Stefanie Jegelka, Julien Mairal, Zaid Harchaoui, Trevor Darrell

TL;DR
This paper introduces a weakly supervised object localization method that uses only image-level labels. It combines a discriminative submodular cover problem with a smoothed latent SVM and reports a 50% relative improvement in mean average precision over the previous state of the art on PASCAL VOC 2007 detection.
Contribution
It presents an approach that localizes objects using only image-level labels indicating whether an object is present, which reduces annotation cost while improving detection accuracy over prior weakly supervised methods.
Findings
50% relative improvement in mean average precision over the previous state of the art on PASCAL VOC 2007 detection
Combines a discriminative submodular cover problem, which discovers positive object windows, with a smoothed latent SVM
Operates with only image-level labels, reducing annotation effort
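The window-discovery step listed above can be illustrated with a generic greedy cover routine. This is a minimal sketch, not the authors' implementation: the function name `greedy_cover`, the `coverage` mapping from hypothetical candidate windows to the positive images they match, and the `target_fraction` stopping rule are all assumptions made for illustration. The paper's discriminative objective also accounts for negative images, which this sketch omits.

```python
def greedy_cover(coverage, universe, target_fraction=1.0):
    """Greedy cover: repeatedly pick the candidate that covers the most
    still-uncovered items, until the target fraction is covered.

    coverage: dict mapping a candidate name (hypothetical window id) to the
              set of positive-image ids it matches.
    universe: iterable of all positive-image ids to cover.
    """
    universe = set(universe)
    covered = set()
    chosen = []
    goal = target_fraction * len(universe)
    while len(covered) < goal:
        best, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic.
        for name in sorted(coverage):
            if name in chosen:
                continue
            gain = len((coverage[name] & universe) - covered)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no remaining candidate adds coverage
        chosen.append(best)
        covered |= coverage[best] & universe
    return chosen, covered


# Toy data (hypothetical): four candidate windows, five positive images.
toy_coverage = {
    "w_a": {1, 2, 3},
    "w_b": {3, 4},
    "w_c": {4, 5},
    "w_d": {1},
}
chosen, covered = greedy_cover(toy_coverage, universe=range(1, 6))
print(chosen, sorted(covered))
```

Greedy selection is the standard approximation strategy for submodular cover problems, because each pick maximizes the marginal gain in coverage.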
Abstract
Learning to localize objects with minimal supervision is an important problem in computer vision, since large fully annotated datasets are extremely costly to obtain. In this paper, we propose a new method that achieves this goal with only image-level labels of whether the objects are present or not. Our approach combines a discriminative submodular cover problem for automatically discovering a set of positive object windows with a smoothed latent SVM formulation. The latter allows us to leverage efficient quasi-Newton optimization techniques. Our experiments demonstrate that the proposed approach provides a 50% relative improvement in mean average precision over the current state-of-the-art on PASCAL VOC 2007 detection.
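To see why smoothing matters for quasi-Newton optimization, recall that a latent SVM scores an image by a maximum over candidate windows, and a hard maximum is not differentiable everywhere. The sketch below uses log-sum-exp smoothing, which is one common way to smooth a maximum; it is an assumption chosen for illustration and may differ from the smoothing used in the paper. The names `smooth_max`, `smooth_max_weights`, and the parameter `mu` are hypothetical.

```python
import math


def smooth_max(scores, mu=0.1):
    """Log-sum-exp approximation of max(scores); mu > 0 controls smoothness.

    Subtracting the true maximum first keeps the exponentials stable.
    """
    m = max(scores)
    return m + mu * math.log(sum(math.exp((s - m) / mu) for s in scores))


def smooth_max_weights(scores, mu=0.1):
    """Gradient of smooth_max with respect to each score (softmax weights)."""
    m = max(scores)
    exps = [math.exp((s - m) / mu) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical window scores for one image.
window_scores = [0.2, 1.0, 0.7]
approx = smooth_max(window_scores, mu=0.1)
weights = smooth_max_weights(window_scores, mu=0.1)

# The approximation lies between max and max + mu * log(n).
upper = max(window_scores) + 0.1 * math.log(len(window_scores))
print(max(window_scores) <= approx <= upper)
print(weights.index(max(weights)))
```

Because the smoothed maximum has a well-defined gradient everywhere (the softmax weights over windows), the resulting objective can be handed to gradient-based solvers such as L-BFGS. As `mu` shrinks, the approximation approaches the hard maximum.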
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques
