Local Information Matters: A Rethink of Crowd Counting
Tianhang Pan, Xiuyi Jia

TL;DR
This paper introduces LIMM, a crowd counting model emphasizing local information through window partitioning and contrastive learning, achieving state-of-the-art results by focusing on local density distinctions.
Contribution
The paper proposes a novel crowd counting approach that emphasizes local modeling via window partitioning and contrastive learning, improving accuracy without sacrificing performance on large-sized individuals.
Findings
An 8.7% reduction in MAE on the high-density subset of JHU-Crowd++.
Enhanced local density discrimination capability.
State-of-the-art crowd counting performance.
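For readers unfamiliar with the metric, the sketch below shows how mean absolute error (MAE) over per-image counts is usually computed, and how a percentage reduction can be derived from two MAE values. It assumes the 8.7% figure is a relative reduction against a baseline, which this summary does not state explicitly. The function names and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth crowd counts."""
    pred = np.asarray(pred_counts, dtype=float)
    true = np.asarray(true_counts, dtype=float)
    return float(np.mean(np.abs(pred - true)))

def relative_reduction(baseline_mae, new_mae):
    """Fraction by which new_mae improves on baseline_mae (assumed definition)."""
    return (baseline_mae - new_mae) / baseline_mae

# Hypothetical per-image counts, for illustration only.
baseline = mae([110, 480, 1990], [100, 500, 2000])
improved = mae([105, 490, 1995], [100, 500, 2000])
print(f"relative MAE reduction: {relative_reduction(baseline, improved):.1%}")
```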
Abstract
The motivation of this paper originates from rethinking an essential characteristic of crowd counting: individuals (heads of humans) in the crowd counting task typically occupy a very small portion of the image. This characteristic has never been the focus of existing works: they typically use the same backbone as other visual tasks and pursue a large receptive field. This drives us to propose a new model design principle of crowd counting: emphasizing local modeling capability of the model. We follow the principle and design a crowd counting model named Local Information Matters Model (LIMM). The main innovation lies in two strategies: a window partitioning design that applies grid windows to the model input, and a window-wise contrastive learning design to enhance the model's ability to distinguish between local density levels. Moreover, a global attention module is applied to the end…
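The two strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes non-overlapping square windows, density levels obtained by thresholding per-window counts, and a generic supervised-contrastive-style loss in which windows sharing a density level act as positives. The function names, the `thresholds` parameter, and the `temperature` value are all hypothetical.

```python
import numpy as np

def partition_windows(image, window):
    """Split an (H, W, C) image into non-overlapping window x window tiles."""
    h, w, c = image.shape
    if h % window or w % window:
        raise ValueError("image height and width must be multiples of window")
    gh, gw = h // window, w // window
    tiles = image.reshape(gh, window, gw, window, c).transpose(0, 2, 1, 3, 4)
    return tiles.reshape(gh * gw, window, window, c)

def window_density_levels(density_map, window, thresholds):
    """Sum an (H, W) density map per window and bucket the counts into levels."""
    h, w = density_map.shape
    gh, gw = h // window, w // window
    counts = density_map.reshape(gh, window, gw, window).sum(axis=(1, 3)).reshape(-1)
    return counts, np.digitize(counts, thresholds)

def window_contrastive_loss(embeddings, levels, temperature=0.1):
    """Contrastive loss over window embeddings (needs at least two windows).

    Windows with the same density level are treated as positives; this is an
    assumed formulation, not necessarily the one used in the paper.
    """
    z = np.asarray(embeddings, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    levels = np.asarray(levels)
    sim = (z @ z.T) / temperature
    not_self = ~np.eye(sim.shape[0], dtype=bool)
    positives = (levels[:, None] == levels[None, :]) & not_self
    # Log-softmax over all other windows, computed stably.
    masked = np.where(not_self, sim, -np.inf)
    row_max = masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(masked - row_max).sum(axis=1, keepdims=True))
    log_prob = np.where(positives, sim - log_denom, 0.0)
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0
    per_anchor = -log_prob[has_pos].sum(axis=1) / n_pos[has_pos]
    return float(per_anchor.mean())
```

In this sketch the window tiles would be fed to the backbone independently, and the loss would be applied to one embedding per window, so that windows of similar local density are pulled together and windows of different density are pushed apart.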
