TL;DR
This paper introduces an improved Normed-Deformable Convolution (NDConv) for crowd counting, enhancing head feature extraction by more evenly sampling head regions, leading to state-of-the-art results across multiple datasets.
Contribution
The paper proposes a novel NDConv module with a Normed-Deformable loss to improve sampling uniformity and feature completeness in crowd counting models.
Findings
Outperforms state-of-the-art methods on multiple datasets
Achieves lower mean absolute error (MAE) than the compared methods, indicating more accurate counts
Maintains similar computational complexity to deformable convolution
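For readers unfamiliar with the metric, MAE in crowd counting is the average absolute difference between predicted and ground-truth counts per image. The sketch below is a minimal illustration; the function name and the sample numbers are made up for this example and are not results from the paper.

```python
def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true per-image counts."""
    if len(predicted_counts) != len(true_counts) or not true_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(true_counts)


# Illustrative values only (not taken from the paper).
predicted = [102.0, 48.5, 310.0]
actual = [100, 50, 300]
mae = mean_absolute_error(predicted, actual)  # (2 + 1.5 + 10) / 3
```

A lower value means the predicted counts are, on average, closer to the annotated counts.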
Abstract
In recent years, crowd counting has become an important problem in computer vision. In most methods, density maps are generated by convolving ground-truth dot maps, whose dots mark the approximate centers of human heads, with a Gaussian kernel. Because of the fixed geometric structure of CNN kernels and indistinct head-scale information, head features are captured incompletely. Deformable convolution has been proposed to give CNN features scale-adaptive capabilities on heads: by learning coordinate offsets for the sampling points, it can adjust the receptive field. However, the sampling points in deformable convolution do not cover the heads uniformly, resulting in a loss of head information. To handle this non-uniform sampling, an improved Normed-Deformable Convolution (i.e., NDConv) implemented by Normed-Deformable loss…
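The density-map step mentioned in the abstract can be sketched as follows. This is a minimal pure-Python illustration, assuming a fixed kernel size and sigma; the function names, the 5x5 kernel, and the sample head coordinates are assumptions for this example, not the paper's settings.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2 * sigma ** 2))
         for j in range(size)]
        for i in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(points, height, width, size=5, sigma=1.0):
    """Add one normalized Gaussian per annotated head centre (row, col)."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        for i in range(size):
            for j in range(size):
                rr, cc = r + i - half, c + j - half
                # Kernel mass falling outside the image is simply dropped.
                if 0 <= rr < height and 0 <= cc < width:
                    dmap[rr][cc] += kernel[i][j]
    return dmap


# Hypothetical dot annotations: two heads well inside a 16 x 20 image.
heads = [(4, 4), (10, 12)]
dmap = density_map(heads, 16, 20)
count = sum(sum(row) for row in dmap)  # integrates to the number of heads
```

Because each kernel sums to 1, integrating the map recovers the head count when no kernel is clipped by the image border, which is why a counting network can be trained to regress such maps.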
Taxonomy
Methods: Masked autoencoder · Convolution · Deformable Convolution
