Selecting Learnable Training Samples is All DETRs Need in Crowded Pedestrian Detection

Feng Gao; Jiaxu Leng; Gan Ji; Xinbo Gao

arXiv:2305.10801 · cs.CV · May 19, 2023 · 2 citations

Open Access

TL;DR

This paper introduces SSCP (Sample Selection for Crowded Pedestrians), a sample selection method for DETRs in crowded pedestrian detection. It improves performance by selecting learnable training samples and adaptively weighting their losses, at no extra inference cost.

Contribution

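The paper proposes SSCP, which combines a constraint-guided label assignment scheme (CGLA) with a utilizability-aware focal loss (UAFL). Together they improve DETRs by selecting learnable samples and adjusting each sample's loss weight according to its utilizability.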

Findings

1. Reduces the miss rate (MR) on CrowdHuman to 39.7%.
2. Reduces MR on CityPersons to 31.8%.
3. Adds no inference overhead.

Abstract

DEtection TRansformer (DETR) and its variants (DETRs) have achieved impressive performance in general object detection. However, in crowded pedestrian detection, the performance of DETRs is still unsatisfactory because an inappropriate sample selection method results in more false positives. To address this issue, we propose a simple but effective sample selection method for DETRs, Sample Selection for Crowded Pedestrians (SSCP), which consists of the constraint-guided label assignment scheme (CGLA) and the utilizability-aware focal loss (UAFL). Our core idea is to select learnable samples for DETRs and adaptively regulate the loss weights of samples based on their utilizability. Specifically, in CGLA, we propose a new cost function to ensure that only learnable positive training samples are retained and the rest are treated as negative training samples. Further, considering the utilizability of…
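
The abstract describes CGLA only at a high level: a matching cost that keeps learnable positives and turns the rest into negatives. The sketch below illustrates one way such a constraint could work. It is not the paper's cost function. The IoU threshold `min_iou`, the penalty constant `BIG`, the cost terms, and the function names are all hypothetical, and the brute-force search over permutations stands in for the Hungarian matching that DETR-style detectors normally use, so it only suits tiny examples.

```python
from itertools import permutations
from math import inf

BIG = 1e6  # hypothetical penalty marking a pair as "not learnable"


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def constrained_assign(pred_boxes, pred_scores, gt_boxes, min_iou=0.5):
    """One-to-one matching that drops pairs violating the overlap constraint.

    Returns {gt_index: pred_index} for retained positives. Every prediction
    that is not in the result would be trained as a negative sample.
    """
    n_pred, n_gt = len(pred_boxes), len(gt_boxes)
    if n_pred < n_gt:
        raise ValueError("need at least as many predictions as ground truths")

    def cost(p, g):
        overlap = iou(pred_boxes[p], gt_boxes[g])
        if overlap < min_iou:  # assumed constraint: too little overlap
            return BIG
        return -pred_scores[p] - overlap  # assumed cost: score plus overlap

    # Brute-force minimum-cost assignment; exponential, for illustration only.
    best, best_cost = (), inf
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost(p, g) for g, p in enumerate(perm))
        if total < best_cost:
            best, best_cost = perm, total

    # Keep only matched pairs that satisfied the constraint.
    return {g: p for g, p in enumerate(best) if cost(p, g) < BIG}


gts = [(0, 0, 10, 10), (20, 0, 30, 10)]
preds = [(1, 0, 10, 10), (50, 50, 60, 60), (27, 0, 37, 10)]
scores = [0.9, 0.8, 0.7]
print(constrained_assign(preds, scores, gts))
```

With the default threshold, only the first ground truth keeps a positive sample, because no prediction overlaps the second one enough. An unconstrained one-to-one matcher would still force some prediction onto that second ground truth, which is the kind of hard-to-learn positive the abstract argues against.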

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Convolution · Dense Connections · Multi-Head Attention · Adam · Residual Connection · Absolute Position Encodings · Softmax