# Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search

**Authors:** Benzhi Wang, Yang Yang, Jinlin Wu, Guo-jun Qi, Zhen Lei

arXiv: 2302.12986 · 2024-01-04

## TL;DR

This paper introduces a one-step framework, Self-similarity driven Scale-invariant Learning (SSL), for weakly supervised person search. It targets scale variation by combining a self-similarity prior, a Multi-scale Exemplar Branch, and dynamic multi-label prediction, and reports experiments on the PRW and CUHK-SYSU datasets.

## Contribution

The paper proposes a one-step framework, Self-similarity driven Scale-invariant Learning (SSL), with a Multi-scale Exemplar Branch and dynamic multi-label prediction to handle scale variations in weakly supervised person search.

## Key findings

- Effective in handling scale variations in person search
- Improves detection and matching accuracy on PRW and CUHK-SYSU datasets
- Introduces a novel self-similarity based scale-invariant learning approach

## Abstract

Weakly supervised person search aims to jointly detect and match persons with only bounding box annotations. Existing approaches typically focus on improving the features by exploring relations between persons. However, the scale variation problem, in which a person often appears in images at different scales (resolutions), is a more severe and under-studied obstacle. On the one hand, small-scale images contain less information about a person, which reduces the accuracy of the generated pseudo labels. On the other hand, for the same person, the similarity between cross-scale images is often lower than that between images at the same scale, which makes matching more difficult. In this paper, we address this problem by proposing a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL). Scale invariance can be explored based on the self-similarity prior, under which an image shows the same statistical properties at different scales. To this end, we introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features through hard exemplar mining. To enhance the discriminative power of the features in an unsupervised manner, we introduce a dynamic multi-label prediction which progressively seeks true labels for training. It is adaptable to different types of unlabeled data and complements clustering-based strategies. Experiments on the PRW and CUHK-SYSU datasets demonstrate the effectiveness of our method.
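
The cross-scale similarity gap and the idea of hard exemplar mining mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's Multi-scale Exemplar Branch: the function names `cosine` and `hardest_scale`, the three-dimensional feature vectors, and the scale factors are all hypothetical and chosen only for illustration. The sketch assumes that a "hard" exemplar is the rescaled view of a person whose feature is least similar to the full-resolution anchor.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_scale(anchor, exemplars):
    """Return (scale, similarity) for the exemplar least similar to the anchor.

    `exemplars` maps a scale factor to the feature extracted at that scale.
    Picking the minimum similarity is an assumed definition of "hard" here.
    """
    sims = {scale: cosine(anchor, feat) for scale, feat in exemplars.items()}
    scale = min(sims, key=sims.get)
    return scale, sims[scale]


# Made-up features of one person at three scales (not real model outputs).
anchor = [1.0, 0.0, 0.0]
exemplars = {
    1.0: [0.9, 0.1, 0.0],   # same scale as the anchor
    0.5: [0.6, 0.4, 0.2],   # half resolution
    0.25: [0.3, 0.6, 0.5],  # quarter resolution
}

hard_scale, hard_sim = hardest_scale(anchor, exemplars)
print(f"hardest scale: {hard_scale}, similarity: {hard_sim:.3f}")
```

In this toy data the similarity to the anchor drops as the scale shrinks, so the smallest view is selected as the hard exemplar. A training loop could then emphasize that pair when pulling features of the same person together, though how the paper weights or selects exemplars is described only in its full text.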

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12986/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12986/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/2302.12986/full.md

---
Source: https://tomesphere.com/paper/2302.12986