Unsupervised Person Re-identification via Simultaneous Clustering and Consistency Learning
Junhui Yin, Jiayan Qiu, Siqing Zhang, Jiyang Xie, Zhanyu Ma, and Jun Guo

TL;DR
This paper introduces an unsupervised person re-identification method that leverages simultaneous clustering and consistency learning to improve semantic representation and clustering accuracy, outperforming existing methods.
Contribution
The paper proposes a novel pretext task combining visual and temporal consistency learning to enhance unsupervised person re-ID performance.
Findings
Outperforms state-of-the-art methods on multiple datasets
Learns semantically meaningful representations without labels
Enhances clustering accuracy through consistency-based training
Abstract
Unsupervised person re-identification (re-ID) has become an important topic due to its potential to resolve the scalability problem of supervised re-ID models. However, existing methods simply use pseudo labels from clustering for supervision and thus have not yet fully explored the semantic information in the data itself, which limits the representation capability of the learned models. To address this problem, we design a pretext task for unsupervised re-ID that learns visual consistency from still images and temporal consistency during the training process, such that the clustering network can separate the images into semantic clusters automatically. Specifically, the pretext task learns semantically meaningful representations by maximizing the agreement between two encoded views of the same image via a consistency loss in latent space. Meanwhile, we optimize the model by grouping the two…
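The abstract does not give the exact form of the consistency loss, so the following is only a minimal sketch of one common way to measure agreement between two encoded views in latent space. It assumes the loss is the squared Euclidean distance between L2-normalized embeddings (equivalent to 2 - 2 * cosine similarity). The function names `l2_normalize`, `consistency_loss`, and `batch_consistency_loss` are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def consistency_loss(z1, z2):
    """Assumed loss form: squared distance between unit-normalized embeddings.

    The value is 0 when the two views point in the same direction and
    grows up to 4 when they point in opposite directions.
    """
    a = l2_normalize(z1)
    b = l2_normalize(z2)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def batch_consistency_loss(views_a, views_b):
    """Average the per-image loss over paired embeddings of two views."""
    if len(views_a) != len(views_b) or not views_a:
        raise ValueError("need two non-empty batches of equal size")
    losses = [consistency_loss(a, b) for a, b in zip(views_a, views_b)]
    return sum(losses) / len(losses)
```

Minimizing this quantity pulls the embeddings of two views of the same image together, which is one way to read "maximizing the agreement" in the abstract. The encoder, the augmentations that produce the two views, and the clustering objective are outside the scope of this sketch.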
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Human Pose and Action Recognition
