End-to-end Person Search Sequentially Trained on Aggregated Dataset
Angelique Loesch, Jaonary Rabarisoa, Romaric Audigier

TL;DR
This paper introduces an end-to-end person search model that jointly detects and re-identifies individuals. It reaches state-of-the-art accuracy and, because it is trained sequentially task by task, it can also learn from pedestrian detection datasets that lack identity annotations.
Contribution
A novel multi-task CNN architecture that combines detection and re-ID, allowing sequential training on aggregated datasets for improved robustness and accuracy.
Findings
State-of-the-art person search accuracy achieved.
Aggregating pedestrian detection datasets that lack identity annotations makes the shared feature maps more generic and improves re-ID precision.
Shared feature maps improve cross-dataset re-ID performance.
Abstract
In video surveillance applications, person search is a challenging task that consists of detecting people and extracting features from their silhouettes for re-identification (re-ID) purposes. We propose a new end-to-end model that jointly computes the detection and feature extraction steps through a single deep Convolutional Neural Network architecture. Sharing feature maps between the two tasks to jointly describe people's commonalities and specificities allows faster runtime, which is valuable in real-world applications. In addition to reaching state-of-the-art accuracy, this multi-task model can be sequentially trained task-by-task, which lets it accept a broader range of input dataset types. Indeed, we show that aggregating more pedestrian detection datasets without costly identity annotations makes the shared feature maps more generic, and improves re-ID precision. Moreover, these…
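The point about task-by-task training accepting more dataset types can be illustrated with a small sketch. It is not the authors' code: the `Sample` record, the dataset names, and the two stage names are hypothetical, and it only shows the data-routing idea (detection-only datasets can feed the detection stage, while the re-ID stage needs identity labels), not the network itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Sample:
    """Hypothetical annotation record for one image."""
    image_id: str
    boxes: List[Tuple[int, int, int, int]]   # person boxes as (x, y, w, h)
    identities: Optional[List[int]] = None   # None when no identity labels exist


def usable_for(task: str, sample: Sample) -> bool:
    """Detection needs only boxes; re-ID also needs one identity per box."""
    if task == "detection":
        return len(sample.boxes) > 0
    if task == "reid":
        return (sample.identities is not None
                and len(sample.identities) == len(sample.boxes))
    raise ValueError(f"unknown task: {task}")


def sequential_schedule(datasets: Dict[str, List[Sample]]) -> Dict[str, List[str]]:
    """For each training stage, list the datasets whose samples can all feed it."""
    schedule: Dict[str, List[str]] = {"detection": [], "reid": []}
    for name, samples in datasets.items():
        for task in schedule:
            if samples and all(usable_for(task, s) for s in samples):
                schedule[task].append(name)
    return schedule


# Assumed toy data: one dataset with boxes only, one with boxes and identities.
datasets = {
    "detection_only_set": [Sample("a.jpg", [(10, 20, 50, 120)])],
    "person_search_set": [Sample("b.jpg", [(5, 5, 40, 100)], identities=[7])],
}
schedule = sequential_schedule(datasets)
print(schedule)
```

In this sketch both datasets are eligible for the detection stage, but only the one carrying identity labels is eligible for the re-ID stage, which mirrors the abstract's claim that detection data without identity annotations can still contribute to the shared feature maps.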
