CountFormer: Multi-View Crowd Counting Transformer

Hong Mo; Xiong Zhang; Jianchao Tan; Cheng Yang; Qiong Gu; Bo Hang; Wenqi Ren

arXiv:2407.02047·cs.CV·July 3, 2024

TL;DR

CountFormer is a novel 3D multi-view crowd counting transformer that effectively integrates camera parameters and multi-view features to produce accurate scene-level density maps, outperforming existing methods.

Contribution

The paper introduces CountFormer, a concise 3D MVC framework that embeds camera parameters and employs attention-based feature lifting and aggregation for flexible multi-view crowd counting.
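As an illustration of why attention-based aggregation can cope with a varying number of cameras, the sketch below fuses per-view feature vectors for a single voxel using scaled dot-product attention. It is a minimal, assumption-laden example and not the paper's implementation: the names `aggregate_views`, `query`, and `view_features` are hypothetical, and the real model works on learned tensors rather than small Python lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_views(query, view_features):
    """Fuse per-view features for one voxel with scaled dot-product attention.

    query:          list of d floats (hypothetical voxel query vector)
    view_features:  list of per-view vectors, each of length d; the number
                    of views is not fixed, so any camera count works
    Returns (fused_vector, attention_weights).
    """
    d = len(query)
    # One similarity score per view, scaled by sqrt(d).
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(d)
        for feat in view_features
    ]
    weights = softmax(scores)
    # Weighted sum of the view features, dimension by dimension.
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, view_features))
        for i in range(d)
    ]
    return fused, weights


# Usage: the same function handles two views or three views unchanged.
fused2, w2 = aggregate_views([1.0, 0.0], [[2.0, 1.0], [2.0, 1.0]])
fused3, w3 = aggregate_views([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [3.0, 0.0]])
```

Because the softmax normalizes over however many views are present, nothing in the fusion step depends on a fixed camera count, which is the property the contribution statement describes as flexibility.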

Findings

01. Outperforms state-of-the-art methods on multiple datasets.
02. Handles arbitrary dynamic camera layouts effectively.
03. Demonstrates robustness in real-world scenarios.

Abstract

Multi-view counting (MVC) methods have shown their superiority over single-view counterparts, particularly in situations characterized by heavy occlusion and severe perspective distortions. However, hand-crafted heuristic features and identical camera layout requirements in conventional MVC methods limit their applicability and scalability in real-world scenarios. In this work, we propose a concise 3D MVC framework called CountFormer to elevate multi-view image-level features to a scene-level volume representation and estimate the 3D density map based on the volume features. By incorporating a camera encoding strategy, CountFormer successfully embeds camera parameters into the volume query and image-level features, enabling it to handle various camera layouts with significant differences. Furthermore, we introduce a feature lifting module that capitalizes on the attention mechanism to…
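To make the role of camera parameters concrete, the sketch below shows the standard pinhole projection that relates a 3D scene point (for example, a voxel center) to a pixel in one camera view. This is general camera geometry offered as background, not the camera encoding or lifting module from the paper; the function name `project_point` and the example intrinsics are hypothetical.

```python
def project_point(K, R, t, X):
    """Project a 3D world point into pixel coordinates (pinhole model).

    K: 3x3 intrinsic matrix (nested lists)
    R: 3x3 world-to-camera rotation (nested lists)
    t: length-3 world-to-camera translation
    X: length-3 world point, e.g. a voxel center
    Returns (u, v), or None if the point lies behind the camera.
    """
    # World -> camera coordinates: Xc = R @ X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        return None  # not visible from this camera
    # Camera -> homogeneous pixel coordinates: p = K @ Xc
    p = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    # Perspective divide gives the pixel location.
    return p[0] / p[2], p[1] / p[2]


# Usage with made-up parameters: focal length 100 px, principal point (50, 40),
# camera at the world origin looking along +z.
K = [[100.0, 0.0, 50.0], [0.0, 100.0, 40.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
print(project_point(K, R, t, [1.0, 2.0, 10.0]))  # (60.0, 60.0)
```

Each camera has its own `K`, `R`, and `t`, so the same scene point maps to different pixels in different views; a method that conditions on these parameters does not need every scene to share one camera layout.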

Code & Models

Repositories

MandyMo/ECCV_Countformer (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Data Stream Mining Techniques

Methods: Softmax · Attention Is All You Need