PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders

Hezhen Hu; Xiaoyi Dong; Jianmin Bao; Dongdong Chen; Lu Yuan; Dong Chen; Houqiang Li

arXiv:2311.04496 · cs.CV · November 9, 2023 · 1 citation

Open Access

TL;DR

PersonMAE introduces a masked autoencoder pre-training framework for Person Re-Identification that enhances occlusion robustness, cross-region invariance, and multi-level awareness, leading to state-of-the-art results across multiple datasets.

Contribution

The paper proposes PersonMAE, a novel masked autoencoder approach specifically designed for Person ReID, incorporating region prediction and occlusion simulation to improve representation quality.

Findings

1. Achieves 79.8% mAP on MSMT17, surpassing the previous state of the art by 8.0 mAP points.

2. Attains 69.5% mAP on OccDuke, outperforming prior methods by 5.3 mAP points.

3. Is effective in both supervised and unsupervised ReID settings.

Abstract

Pre-training is playing an increasingly important role in learning generic feature representations for Person Re-identification (ReID). We argue that a high-quality ReID representation should have three properties, namely, multi-level awareness, occlusion robustness, and cross-region invariance. To this end, we propose a simple yet effective pre-training framework, namely PersonMAE, which incorporates two core designs into masked autoencoders to better serve the task of Person ReID. 1) PersonMAE generates two regions from the given image, with RegionA as the input and RegionB as the prediction target. RegionA is corrupted with block-wise masking to mimic common occlusion in ReID, and its remaining visible parts are fed into the encoder. 2) PersonMAE then aims to predict the whole RegionB at both the pixel level and the semantic feature level. It encourages its pre-trained feature…
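The input-side recipe in the abstract (sample two regions, then block-wise mask RegionA and keep only its visible patches) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the 256x128 image size, the 16-pixel patch size, the crop size, the 50% mask ratio, and the block-size limit are all assumptions chosen for the example, and the encoder, decoder, and prediction losses are omitted.

```python
import random


def sample_region(img_h, img_w, crop_h, crop_w, rng):
    """Pick a random crop box (top, left, height, width) inside the image."""
    top = rng.randint(0, img_h - crop_h)
    left = rng.randint(0, img_w - crop_w)
    return top, left, crop_h, crop_w


def blockwise_mask(grid_h, grid_w, mask_ratio, rng, max_block=4):
    """Return a grid_h x grid_w grid of booleans; True marks a masked patch.

    Random rectangular blocks are added until the target number of patches
    is masked. The final block may be cut short so the count is exact.
    """
    target = int(round(mask_ratio * grid_h * grid_w))
    mask = [[False] * grid_w for _ in range(grid_h)]
    masked = 0
    while masked < target:
        block_h = rng.randint(1, min(max_block, grid_h))
        block_w = rng.randint(1, min(max_block, grid_w))
        top = rng.randint(0, grid_h - block_h)
        left = rng.randint(0, grid_w - block_w)
        for r in range(top, top + block_h):
            for c in range(left, left + block_w):
                if not mask[r][c] and masked < target:
                    mask[r][c] = True
                    masked += 1
    return mask


def visible_indices(mask):
    """Flattened (row-major) indices of the patches that stay visible."""
    grid_w = len(mask[0])
    return [
        r * grid_w + c
        for r, row in enumerate(mask)
        for c, is_masked in enumerate(row)
        if not is_masked
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed setup: a 256x128 person image and two 192x96 crops.
    region_a = sample_region(256, 128, 192, 96, rng)  # encoder input
    region_b = sample_region(256, 128, 192, 96, rng)  # prediction target
    # RegionA resized to 256x128 with 16x16 patches gives a 16x8 patch grid.
    mask = blockwise_mask(16, 8, 0.5, rng)
    visible = visible_indices(mask)
    print("RegionA box:", region_a)
    print("RegionB box:", region_b)
    print("visible patches fed to the encoder:", len(visible))
```

Masking contiguous blocks rather than independent random patches is what makes the corruption resemble a real occluder, such as a bag or another pedestrian covering part of the body.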

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition