TransCrowd: weakly-supervised crowd counting with transformers

Dingkang Liang; Xiwu Chen; Wei Xu; Yu Zhou; Xiang Bai

arXiv:2104.09116 · cs.CV · September 9, 2022 · 25 cites

TL;DR

TransCrowd is a transformer-based, weakly-supervised crowd counting method that uses self-attention to model global context. Trained with only count-level annotations, it outperforms CNN-based weakly-supervised approaches.

Contribution

The paper is the first to apply a pure transformer model to weakly-supervised crowd counting, addressing the limited receptive fields that constrain context modeling in CNNs.
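To make the context-modeling point concrete, here is a minimal pure-Python sketch of self-attention over patch tokens followed by a count regression. It is not the authors' implementation: the identity query/key/value projections, the mean pooling, the linear head, and the names `self_attention` and `predict_count` are all simplifying assumptions chosen for illustration. It only shows that every output token is a weighted mix of all input tokens, so a single layer already sees the whole image.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head self-attention with identity projections (an assumption;
    a real transformer learns separate query, key, and value projections)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Output is a weighted average over ALL tokens: global context.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def predict_count(tokens, head_weights, head_bias):
    """Hypothetical image-to-count head: attend, mean-pool, linear regression."""
    attended = self_attention(tokens)
    d = len(attended[0])
    pooled = [sum(t[j] for t in attended) / len(attended) for j in range(d)]
    return sum(w * x for w, x in zip(head_weights, pooled)) + head_bias

# Three toy patch embeddings (made-up values, not real image features).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
count = predict_count(patches, head_weights=[1.0, 0.5], head_bias=0.0)
```

Changing any single patch embedding changes the attended output of every other patch, which is the property a convolution with a small kernel lacks.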

Findings

01 Outperforms CNN-based weakly-supervised methods

02 Achieves competitive results with fully-supervised methods

03 Demonstrates effectiveness across five benchmark datasets

Abstract

Mainstream crowd counting methods usually use a convolutional neural network (CNN) to regress a density map, which requires point-level annotations. However, annotating each person with a point is an expensive and laborious process. During the testing phase, the point-level annotations are not used to evaluate counting accuracy, which means they are redundant. Hence, it is desirable to develop weakly-supervised counting methods that rely only on count-level annotations, a more economical way of labeling. Current weakly-supervised counting methods adopt a CNN to regress the total count of the crowd in an image-to-count paradigm. However, a limited receptive field for context modeling is an intrinsic limitation of these weakly-supervised CNN-based methods. These methods thus cannot achieve satisfactory performance, with limited applications in the…
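The abstract's remark that evaluation ignores point-level annotations can be illustrated with a short sketch. It assumes the two metrics commonly reported in crowd counting, mean absolute error (MAE) and root mean squared error (often labeled MSE in this literature); the function name `counting_errors` and the sample counts are hypothetical and are not results from the paper.

```python
import math

def counting_errors(predicted, ground_truth):
    """Return (MAE, RMSE) between predicted and true per-image counts.

    Only one scalar count per image is needed; head positions never appear.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [p - g for p, g in zip(predicted, ground_truth)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

# Hypothetical counts for three test images (illustrative values only).
pred = [102.0, 48.0, 310.0]
true = [100.0, 50.0, 300.0]
mae, rmse = counting_errors(pred, true)
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}")  # MAE=4.67  RMSE=6.00
```

Because both metrics consume only per-image totals, a model trained on count-level labels can be scored in the same way as one trained on density maps.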

Code & Models

Repositories

dk-liang/TransCrowd
PyTorch · Official

Taxonomy

Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Fire Detection and Safety Systems

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Layer Normalization · Label Smoothing · Residual Connection · Byte Pair Encoding