Object Counting and Instance Segmentation with Image-level Supervision
Hisham Cholakkal, Guolei Sun, Fahad Shahbaz Khan, Ling Shao

TL;DR
This paper introduces an image-level supervised method for common object counting and instance segmentation. It estimates an object category density map, uses object count information only up to four, and outperforms existing counting approaches on PASCAL VOC and COCO.
Contribution
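The authors state that, to the best of their knowledge, this is the first work to propose image-level supervised density map estimation for common object counting, using limited object count information (up to four), and to demonstrate its effectiveness in image-level supervised instance segmentation.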
Findings
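Outperforms existing methods on PASCAL VOC and COCO for common object counting.
Improves the state of the art in image-level supervised instance segmentation with a relative gain of 17.8% in average best overlap.
Estimates object category density maps that give both the global count and the spatial distribution of instances, using only image-level supervision with counts up to four.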
Abstract
Common object counting in a natural scene is a challenging problem in computer vision with numerous real-world applications. Existing image-level supervised common object counting approaches only predict the global object count and rely on additional instance-level supervision to also determine object locations. We propose an image-level supervised approach that provides both the global object count and the spatial distribution of object instances by constructing an object category density map. Motivated by psychological studies, we further reduce image-level supervision by using limited object count information (up to four). To the best of our knowledge, we are the first to propose image-level supervised density map estimation for common object counting and to demonstrate its effectiveness in image-level supervised instance segmentation. Comprehensive experiments are performed on the…
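Illustrative sketch
The abstract's two ideas can be shown with a toy example: a density map whose sum gives the global count while its non-zero regions show where instances lie, and a count label that is exact only up to four. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the 8x8 grid, the blob placement, and the (count, is_exact) label format are all hypothetical and chosen only for illustration.

```python
def make_toy_density(height=8, width=8):
    """Build a hypothetical density map with two instances.

    Each instance is a 2x2 blob whose four cells sum to 1.0,
    so the whole map sums to the number of instances.
    """
    density = [[0.0] * width for _ in range(height)]
    for top, left in [(1, 1), (5, 4)]:  # assumed instance locations
        for dr in range(2):
            for dc in range(2):
                density[top + dr][left + dc] = 0.25
    return density


def count_from_density(density):
    """Global object count: sum of the density map over all locations."""
    return sum(sum(row) for row in density)


def reduced_count_label(true_count, max_count=4):
    """Assumed form of the reduced supervision: the exact count is
    available only up to max_count; beyond that, only the fact that
    the count exceeds max_count is known.

    Returns (count, is_exact).
    """
    if true_count <= max_count:
        return true_count, True
    return max_count, False


density = make_toy_density()
print(count_from_density(density))   # global count from the map
print(reduced_count_label(3))        # exact label within the range
print(reduced_count_label(7))        # only "more than four" is known
```

The same map carries both outputs named in the abstract: summing it gives the count, and the positions of its non-zero cells give the spatial distribution.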
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
