Generalizable Person Search on Open-world User-Generated Video Content
Junjie Li, Guanshuo Wang, Yichao Yan, Fufu Yu, Qiong Jia, Jie Qin, Shouhong Ding, Xiaokang Yang

TL;DR
This paper proposes a novel framework for person search that enhances out-of-domain generalization by learning domain-invariant features and addressing open-world data noise, enabling effective performance without target domain samples.
Contribution
It introduces a multi-task prototype-based domain-specific batch normalization and a feature decorrelation strategy to improve open-world person search models.
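As a rough illustration of the general idea behind domain-specific batch normalization, the sketch below keeps a separate set of running statistics for each source domain, so that batches from one domain never update the statistics of another. It is a generic, simplified version and not the paper's multi-task prototype-based variant: the class name, the parameters, and the omission of the learnable scale and shift are all assumptions made for illustration, and the sketch leaves open how statistics would be chosen for an unseen target domain.

```python
import math


class DomainSpecificBatchNorm:
    """Hypothetical minimal sketch: one set of BN running statistics per domain.

    The learnable affine parameters (scale and shift) are omitted for brevity.
    """

    def __init__(self, num_domains, num_features, eps=1e-5, momentum=0.1):
        self.eps = eps
        self.momentum = momentum
        # One row of running statistics per source domain.
        self.running_mean = [[0.0] * num_features for _ in range(num_domains)]
        self.running_var = [[1.0] * num_features for _ in range(num_domains)]

    def forward(self, batch, domain, training=True):
        """Normalize `batch` (a list of feature rows) with the statistics of `domain`."""
        n = len(batch)
        num_features = len(batch[0])
        if training:
            # Batch statistics (biased variance, as used for normalization).
            mean = [sum(row[j] for row in batch) / n for j in range(num_features)]
            var = [
                sum((row[j] - mean[j]) ** 2 for row in batch) / n
                for j in range(num_features)
            ]
            # Update only the running statistics that belong to this domain.
            m = self.momentum
            run_mean = self.running_mean[domain]
            run_var = self.running_var[domain]
            for j in range(num_features):
                run_mean[j] = (1 - m) * run_mean[j] + m * mean[j]
                run_var[j] = (1 - m) * run_var[j] + m * var[j]
        else:
            # At evaluation time, reuse the stored statistics of the chosen domain.
            mean = self.running_mean[domain]
            var = self.running_var[domain]
        return [
            [(row[j] - mean[j]) / math.sqrt(var[j] + self.eps) for j in range(num_features)]
            for row in batch
        ]
```

Calling `forward(batch, domain=0)` during training normalizes the batch with its own statistics and updates only the running statistics stored for domain 0; the entries for every other domain stay untouched.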
Findings
Achieves promising results on challenging benchmarks
No human annotation or target domain samples needed
Enhances generalization to arbitrary scenarios
Abstract
Person search is a challenging task that involves detecting and retrieving individuals from a large set of uncropped scene images. Existing person search applications are mostly trained and deployed in same-origin scenarios. However, collecting and annotating training samples for each scene is often difficult because of limited resources and labor costs. Moreover, large-scale intra-domain training data are generally not legally available to ordinary developers because of privacy and public-security regulations. Leveraging easily accessible large-scale User Generated Video Content (i.e., UGC videos) to train person search models can fit the open-world distribution, but such models still suffer a performance gap caused by the domain difference from surveillance scenes. In this work, we explore enhancing the out-of-domain generalization capabilities of person search models, and…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
