Swap Path Network for Robust Person Search Pre-training
Lucas Jaffe, Avideh Zakhor

TL;DR
This paper introduces the first end-to-end pre-training framework for person search, together with Swap Path Net (SPNet), a model that implements both query-centric and object-centric training objectives and can swap between them. The approach achieves state-of-the-art results, and the query-centric framing is robust to label noise.
Contribution
The paper presents the first end-to-end person search pre-training framework, along with a novel model (SPNet) that can swap between query-centric and object-centric training objectives, improving robustness and performance.
Findings
Achieves 96.4% mAP on CUHK-SYSU
Achieves 61.2% mAP on PRW
Outperforms recent backbone-only pre-training methods
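For readers unfamiliar with the metric, the mAP figures above are mean average precision over queries. The following is a minimal sketch of that computation on ranked match lists, not the paper's evaluation code. The function names and the input format (one list of booleans per query, ordered from best-ranked to worst-ranked gallery detection) are assumptions made for illustration, and the sketch omits the detection-quality checks that person search benchmark protocols additionally apply.

```python
def average_precision(ranked_matches):
    """Average precision for one query.

    ranked_matches: list of booleans ordered by descending similarity,
    True where the ranked gallery detection is the query person.
    (Hypothetical input format, chosen for illustration.)
    """
    num_positives = sum(ranked_matches)
    if num_positives == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision at the rank of each correct match.
            precision_sum += hits / rank
    return precision_sum / num_positives


def mean_average_precision(per_query_rankings):
    """Mean of per-query average precision values."""
    if not per_query_rankings:
        return 0.0
    aps = [average_precision(r) for r in per_query_rankings]
    return sum(aps) / len(aps)
```

A query whose correct matches appear at ranks 1 and 3 has average precision (1/1 + 2/3) / 2, so a model scores higher the earlier it ranks true matches.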
Abstract
In person search, we detect and rank matches to a query person image within a set of gallery scenes. Most person search models make use of a feature extraction backbone, followed by separate heads for detection and re-identification. While pre-training methods for vision backbones are well-established, pre-training additional modules for the person search task has not been previously examined. In this work, we present the first framework for end-to-end person search pre-training. Our framework splits person search into object-centric and query-centric methodologies, and we show that the query-centric framing is robust to label noise, and trainable using only weakly-labeled person bounding boxes. Further, we provide a novel model dubbed Swap Path Net (SPNet) which implements both query-centric and object-centric training objectives, and can swap between the two while using the same…
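The ranking step described at the start of the abstract can be illustrated with a small sketch. It assumes that a detector has already produced person boxes in each gallery scene and that a re-identification head has produced one embedding per box. Cosine similarity is used here as an assumed scoring function, and the names `cosine_similarity` and `rank_gallery` and the tuple layout are hypothetical; none of this is taken from SPNet itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(query_embedding, gallery_detections):
    """Rank detected persons across gallery scenes against one query.

    gallery_detections: list of (scene_id, box, embedding) tuples
    (hypothetical layout, chosen for illustration).
    Returns (score, scene_id, box) tuples, best match first.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), scene_id, box)
        for scene_id, box, embedding in gallery_detections
    ]
    # Sort by similarity score only, highest first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


if __name__ == "__main__":
    query = [1.0, 0.0]
    gallery = [
        ("scene_a", (0, 0, 10, 20), [0.0, 1.0]),
        ("scene_b", (5, 5, 15, 30), [2.0, 0.0]),
        ("scene_c", (1, 1, 4, 9), [1.0, 1.0]),
    ]
    for score, scene_id, box in rank_gallery(query, gallery):
        print(scene_id, box, round(score, 3))
```

In a full system the embeddings would come from the backbone and re-identification head mentioned in the abstract; the sketch only shows how detections from many scenes end up in a single ranked list per query.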
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition
Methods: Sparse Evolutionary Training · Strip Pooling Network
