Weakly-supervised multi-class object localization using only object counts as labels
Kyle Mills, Isaac Tamblyn

TL;DR
This paper presents a weakly-supervised method that uses extensive deep neural networks (EDNNs) to localize multiple object classes in images from object-count labels alone, with no object annotations or pixel segmentation information.
Contribution
It introduces a new approach leveraging EDNNs for object localization using only count labels, and provides seven new datasets to evaluate this method.
Findings
EDNN achieves over 99% accuracy in counting objects.
The method successfully localizes objects without explicit annotations.
A trained EDNN can count objects in images much larger than those on which it was trained.
Abstract
We demonstrate the use of an extensive deep neural network (EDNN) to localize instances of objects in images. The EDNN is naturally able to accurately perform multi-class counting using only ground truth count values as labels. Without providing any conceptual information, object annotations, or pixel segmentation information, the neural network is able to formulate its own conceptual representation of the items in the image. Using images labelled with only the counts of the objects present, the structure of the extensive deep neural network can be exploited to perform localization of the objects within the visual field. We demonstrate that a trained EDNN can be used to count objects in images much larger than those on which it was trained. In order to demonstrate our technique, we introduce seven new datasets: five progressively harder MNIST digit-counting datasets, and two datasets of…
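The abstract's claim that the network's structure can be exploited for localization may be easier to follow with a toy example. The sketch below assumes, as the word "extensive" suggests, that the total count is a sum of per-tile contributions produced by one shared function; it is not the authors' implementation. The names `split_into_tiles`, `tile_contribution`, and `extensive_count` are hypothetical, the tiles are non-overlapping, and the per-tile "network" is a stand-in that simply counts nonzero pixels, so each object here is a single bright pixel.

```python
import numpy as np

def split_into_tiles(image, focus):
    """Cut a 2-D image into non-overlapping focus x focus tiles.

    Returns an array of shape (rows, cols, focus, focus).
    Assumption: image height and width are multiples of `focus`.
    """
    h, w = image.shape
    if h % focus or w % focus:
        raise ValueError("image size must be a multiple of the tile size")
    rows, cols = h // focus, w // focus
    return image.reshape(rows, focus, cols, focus).swapaxes(1, 2)

def tile_contribution(tile):
    """Stand-in for the shared per-tile network (hypothetical).

    A trained model would go here; this toy version counts nonzero pixels.
    """
    return float(np.count_nonzero(tile))

def extensive_count(image, focus):
    """Return (total count, per-tile contribution map).

    The total is the sum of the per-tile contributions, so the map shows
    which regions of the image account for the count.
    """
    tiles = split_into_tiles(image, focus)
    contrib = np.array([[tile_contribution(t) for t in row] for row in tiles])
    return contrib.sum(), contrib

# Small image with three single-pixel "objects".
small = np.zeros((8, 8))
small[1, 1] = small[5, 6] = small[6, 2] = 1.0
total_small, map_small = extensive_count(small, focus=4)

# The same per-tile function applies unchanged to a larger image.
large = np.zeros((16, 16))
for r, c in [(0, 0), (3, 9), (7, 7), (12, 2), (15, 15)]:
    large[r, c] = 1.0
total_large, map_large = extensive_count(large, focus=4)
```

Because the same function is applied to every tile and the outputs are summed, nothing in the sketch depends on the image size, which is one way to read the claim that counting carries over to larger images. The contribution map gives a coarse, tile-level localization signal even though only total counts would be available as training labels.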
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
