Learning to count with deep object features
Santi Seguí, Oriol Pujol, Jordi Vitrià

TL;DR
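This paper examines the internal representations that a convolutional neural network learns when it is trained to count objects. On a counting problem built from MNIST, the learned features can classify digits even though no digit labels were provided during training. The paper also reports preliminary results on counting pedestrians.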
Contribution
It analyzes the features learned by counting CNNs, shows that they support digit classification without direct digit supervision, and extends the approach to pedestrian counting with preliminary results.
Findings
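The internal representation of a counting network trained on MNIST data can classify digits without direct supervision for that task.
Preliminary results indicate that a deep network can count pedestrians in scenes.
The features learned while counting are informative for object classification.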
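One common way to check a claim like the first finding is a linear probe: freeze the network, extract its features, and fit a simple classifier on top. The sketch below is a generic illustration, not the authors' evaluation protocol, which this summary does not specify. The function name `linear_probe_accuracy` and the least-squares classifier are assumptions, and the features in the usage example are synthetic stand-ins rather than activations from a real counting CNN.

```python
import numpy as np

def linear_probe_accuracy(train_feats, train_labels, test_feats, test_labels, n_classes):
    """Fit a least-squares linear classifier on frozen features.

    Hypothetical helper: the features are treated as fixed inputs, so any
    accuracy above chance reflects information already present in them.
    """
    # One-hot encode the class labels as regression targets.
    targets = np.eye(n_classes)[train_labels]
    # Append a constant column so the classifier has a bias term.
    X = np.hstack([train_feats, np.ones((len(train_feats), 1))])
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    # Predict the class with the largest linear score.
    Xt = np.hstack([test_feats, np.ones((len(test_feats), 1))])
    preds = np.argmax(Xt @ W, axis=1)
    return float(np.mean(preds == test_labels))

# Usage with synthetic stand-in features (NOT real network activations).
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 8)) * 5.0
labels = rng.integers(0, 3, size=300)
feats = centers[labels] + rng.normal(size=(300, 8)) * 0.1
acc = linear_probe_accuracy(feats[:200], labels[:200], feats[200:], labels[200:], n_classes=3)
print(f"probe accuracy on synthetic features: {acc:.2f}")
```

A linear probe is deliberately weak: if it separates the classes, the structure must come from the features rather than from the classifier.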
Abstract
Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural network in order to understand their underlying representation. To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training. We also present preliminary results about a deep network that is able to…
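To make the idea of a counting problem over digit images concrete, here is a minimal sketch of how such a dataset could be assembled. Everything specific in it is an assumption for illustration, not the paper's setup: the grid layout, the limit of five digits per image, the choice to count even digits, and the helper name `make_counting_example`. Random arrays stand in for MNIST tiles so the block runs without downloading data.

```python
import numpy as np

def make_counting_example(digit_images, digit_labels, rng, grid=3,
                          max_digits=5, target_digits=(0, 2, 4, 6, 8)):
    """Build one synthetic counting sample (hypothetical construction).

    Places between 0 and `max_digits` randomly chosen digit tiles into
    distinct cells of a grid x grid canvas. Returns (canvas, count), where
    count is how many placed tiles have a label in `target_digits`.
    Only the count is returned as supervision, never the digit labels.
    """
    tile_h, tile_w = digit_images.shape[1:]
    canvas = np.zeros((grid * tile_h, grid * tile_w), dtype=digit_images.dtype)
    n_digits = rng.integers(0, max_digits + 1)
    # Distinct cells, so tiles never overlap.
    cells = rng.choice(grid * grid, size=n_digits, replace=False)
    picks = rng.integers(0, len(digit_images), size=n_digits)
    for cell, idx in zip(cells, picks):
        r, c = divmod(int(cell), grid)
        canvas[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w] = digit_images[idx]
    count = int(np.sum(np.isin(digit_labels[picks], target_digits)))
    return canvas, count

# Usage with random stand-in tiles in place of real MNIST digits.
rng = np.random.default_rng(0)
fake_digits = rng.random((20, 28, 28)).astype(np.float32)
fake_labels = np.arange(20) % 10
image, target = make_counting_example(fake_digits, fake_labels, rng)
print(image.shape, target)
```

A regression or classification network trained on such (image, count) pairs never sees per-digit labels, which is what makes any digit-discriminative structure in its features notable.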
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition
