Interlayer and Intralayer Scale Aggregation for Scale-invariant Crowd Counting
Mingjie Wang, Hao Cai, Jun Zhou, Minglun Gong

TL;DR
This paper introduces a single-column network for crowd counting that effectively captures multi-scale features and maintains scale-invariance, outperforming multi-column methods in accuracy and transferability.
Contribution
The paper proposes a single-column network (ScSiNet) that combines interlayer multi-scale integration with a novel intralayer scale-invariant transformation (SiT), improving over multi-column approaches in crowd counting tasks.
Findings
Outperforms state-of-the-art methods in accuracy
Demonstrates strong transferability across datasets
Achieves scale-invariance in crowd counting
Abstract
Crowd counting is an important vision task that faces two challenges: continuous scale variation within a given scene, and large density shifts both within and across images. Existing methods typically address these challenges with multi-column structures. However, such an approach does not provide consistent improvement or transferability, owing to its limited ability to capture multi-scale features, its sensitivity to large density shifts, and the difficulty of training multi-branch models. To overcome these limitations, a Single-column Scale-invariant Network (ScSiNet) is presented in this paper, which extracts sophisticated scale-invariant features via the combination of interlayer multi-scale integration and a novel intralayer scale-invariant transformation (SiT). Furthermore, in order to enlarge the diversity of densities, a randomly integrated loss is presented for training our…
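As a rough illustration of what interlayer multi-scale integration can look like in a single-column network, the sketch below resizes feature maps taken from several depths to a common resolution and stacks them along the channel axis. This is a generic fusion pattern, not the ScSiNet architecture, whose details are not given in this summary. The function names, the nearest-neighbour upsampling, and the feature-map shapes are all assumptions made for the example.

```python
import numpy as np

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_interlayer(features):
    """Resize feature maps from several depths to the finest resolution and stack channels.

    `features` is a list of (C_i, H_i, W_i) arrays ordered shallow to deep.
    Assumption: each deeper map is smaller than the first by an integer factor.
    """
    target_h = features[0].shape[1]
    resized = []
    for feat in features:
        factor = target_h // feat.shape[1]
        resized.append(upsample_nearest(feat, factor))
    # Concatenate along the channel axis so later layers see every scale at once.
    return np.concatenate(resized, axis=0)

# Hypothetical feature maps from three depths of one column.
shallow = np.ones((4, 16, 16))
middle = np.ones((8, 8, 8))
deep = np.ones((16, 4, 4))

fused = fuse_interlayer([shallow, middle, deep])
print(fused.shape)  # (28, 16, 16)
```

Because all scales end up in one tensor, a single branch can be trained end to end, which is the practical appeal of single-column designs over multi-branch ones.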
Taxonomy
Topics: Video Surveillance and Tracking Methods · Image Enhancement Techniques · Anomaly Detection Techniques and Applications
