Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting
Yuehai Chen, Jing Yang, Dong Zhang, Kun Zhang, Badong Chen, Shaoyi Du

TL;DR
This paper introduces RANet, a novel crowd counting model inspired by human top-down visual perception, utilizing feedback and region-aware blocks to focus on crowd regions and handle scale variations effectively.
Contribution
The paper proposes a feedback network with region-aware blocks that model human perception, improving attention to crowd regions and capturing global contextual information.
Findings
Outperforms state-of-the-art on multiple datasets
Effectively handles scale variation and background noise
Uses a global relevance matrix for better context encoding
Abstract
Background noise and scale variation are common problems that have long been recognized in crowd counting. Humans glance at a crowd image and instantly know the approximate number of people and where they are by attending to the crowd regions and their degree of congestion with a global receptive field. Hence, in this paper, we propose a novel feedback network with a Region-Aware block, called RANet, that models the human top-down visual perception mechanism. First, we introduce a feedback architecture to generate priority maps that provide a prior about candidate crowd regions in input images. This prior enables RANet to pay more attention to crowd regions. Then we design a Region-Aware block that adaptively encodes contextual information into input images through a global receptive field. More specifically, we scan the whole input image and its priority maps in the form of…
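The abstract is cut off before it specifies how the Region-Aware block combines the image with its priority map, so the following is only a minimal sketch of the general idea, not the paper's formulation. It assumes a generic attention-style computation: each position's features are scaled by its priority value, a row-normalized global relevance matrix is built from dot products between all pairs of positions, and each position then receives a relevance-weighted sum of all features. The function names, the use of softmax, and the multiplicative use of the priority map are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def region_aware_context(features, priority):
    """Hypothetical sketch of a global-relevance context step.

    features: list of N feature vectors, one per spatial position.
    priority: list of N scalars in [0, 1]; higher means "more likely crowd".
    Returns (relevance, context), where relevance is N x N and context is N x D.
    """
    n, dim = len(features), len(features[0])
    # Assumption: the prior is injected by scaling each position's features.
    keys = [[p * x for x in f] for f, p in zip(features, priority)]
    # Row i holds the weights position i assigns to every position j,
    # so every position sees the whole image (a global receptive field).
    relevance = [softmax([dot(q, k) for k in keys]) for q in features]
    # Each position's context is a relevance-weighted sum of all features.
    context = [
        [sum(relevance[i][j] * features[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]
    return relevance, context


# Toy input: three positions with 2-D features and a made-up priority map.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_priority = [0.9, 0.1, 0.8]
toy_relevance, toy_context = region_aware_context(toy_features, toy_priority)
```

In this toy setup, a position with a low priority value contributes smaller dot products and therefore receives less weight in every row of the relevance matrix, which is one simple way a prior could steer attention toward candidate crowd regions.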
