PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders
Hezhen Hu, Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Lu Yuan, Dong Chen, Houqiang Li

TL;DR
PersonMAE introduces a masked autoencoder pre-training framework for Person Re-Identification that enhances occlusion robustness, cross-region invariance, and multi-level awareness, leading to state-of-the-art results across multiple datasets.
Contribution
The paper proposes PersonMAE, a novel masked autoencoder approach specifically designed for Person ReID, incorporating region prediction and occlusion simulation to improve representation quality.
Findings
Achieves 79.8% mAP on MSMT17, surpassing the previous SOTA by 8.0 percentage points.
Attains 69.5% mAP on OccDuke, outperforming prior methods by 5.3 percentage points.
Effective in both supervised and unsupervised ReID settings.
Abstract
Pre-training is playing an increasingly important role in learning generic feature representation for Person Re-identification (ReID). We argue that a high-quality ReID representation should have three properties, namely, multi-level awareness, occlusion robustness, and cross-region invariance. To this end, we propose a simple yet effective pre-training framework, namely PersonMAE, which introduces two core designs into masked autoencoders to better serve the task of Person Re-ID. 1) PersonMAE generates two regions from the given image with RegionA as the input and RegionB as the prediction target. RegionA is corrupted with block-wise masking to mimic common occlusion in ReID and its remaining visible parts are fed into the encoder. 2) Then PersonMAE aims to predict the whole RegionB at both pixel level and semantic feature level. It encourages its pre-trained feature…
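The input-preparation step described in the abstract (two regions drawn from one image, with RegionA corrupted by block-wise masking) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `crop_region` and `blockwise_mask`, the crop scale, the 16 x 8 patch grid, the 0.5 mask ratio, and the maximum block size are all assumptions chosen for illustration, and the encoder, decoder, and prediction losses are omitted.

```python
import numpy as np

def crop_region(image, rng, scale=0.6):
    """Crop a random window covering `scale` of each side (hypothetical choice)."""
    h, w = image.shape[:2]
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return image[top:top + ch, left:left + cw]

def blockwise_mask(grid_h, grid_w, mask_ratio, rng, max_block=4):
    """Boolean mask over a patch grid; True marks a masked (occluded) patch.

    Random rectangular blocks are added until at least `mask_ratio`
    of all patches are masked.
    """
    mask = np.zeros((grid_h, grid_w), dtype=bool)
    target = int(np.ceil(mask_ratio * grid_h * grid_w))
    while mask.sum() < target:
        bh = rng.integers(1, min(max_block, grid_h) + 1)
        bw = rng.integers(1, min(max_block, grid_w) + 1)
        top = rng.integers(0, grid_h - bh + 1)
        left = rng.integers(0, grid_w - bw + 1)
        mask[top:top + bh, left:left + bw] = True
    return mask

rng = np.random.default_rng(0)
image = rng.random((256, 128, 3))   # stand-in pedestrian image, H x W x C

region_a = crop_region(image, rng)  # input region (would be resized, then patchified)
region_b = crop_region(image, rng)  # prediction target region

# Assumed 16 x 8 patch grid for RegionA after resizing.
mask = blockwise_mask(16, 8, mask_ratio=0.5, rng=rng)
visible_idx = np.flatnonzero(~mask)  # indices of patches an encoder would see
```

Masking contiguous blocks rather than independent random patches is what makes the corruption resemble a real occluder, such as a bag or another pedestrian covering part of the body.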
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition
