TransReID: Transformer-based Object Re-Identification
Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, Wei Jiang

TL;DR
TransReID is a pure transformer framework for object re-identification (ReID). It targets two limitations of CNN-based methods, their restriction to local neighborhoods and the loss of detail caused by downsampling, and adds dedicated modules that make the learned features more robust and diverse, achieving state-of-the-art results on multiple benchmarks.
Contribution
This work is the first to apply a pure transformer approach to object ReID, introducing modules that improve feature discrimination and mitigate view bias.
Findings
Achieves state-of-the-art performance on person ReID benchmarks.
Outperforms CNN-based methods in robustness and diversity of features.
Demonstrates the effectiveness of transformer-based models in ReID tasks.
Abstract
Extracting robust feature representations is one of the key challenges in object re-identification (ReID). Although convolutional neural network (CNN)-based methods have achieved great success, they process only one local neighborhood at a time and suffer from loss of detail caused by convolution and downsampling operators (e.g., pooling and strided convolution). To overcome these limitations, we propose a pure transformer-based object ReID framework named TransReID. Specifically, we first encode an image as a sequence of patches and build a strong transformer-based baseline with a few critical improvements, which achieves results competitive with CNN-based methods on several ReID benchmarks. To further enhance robust feature learning in the context of transformers, two novel modules are carefully designed. (i) The jigsaw patch module (JPM) is proposed to rearrange the…
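The first step named in the abstract, encoding an image as a sequence of patches, can be illustrated with a minimal sketch. The function name `image_to_patch_sequence`, the non-overlapping split, and the toy patch size are assumptions made for illustration; the abstract does not specify how TransReID forms its patches, so this should not be read as the authors' implementation.

```python
from typing import List

# Hypothetical image type for this sketch: image[row][col][channel].
Image = List[List[List[float]]]


def image_to_patch_sequence(image: Image, patch_size: int = 16) -> List[List[float]]:
    """Split an image into flattened, non-overlapping square patches.

    Patches are emitted in row-major order (left to right, top to bottom),
    so the result is a sequence a transformer could consume after a
    linear projection. Non-overlapping patches are an assumption here.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")

    sequence = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch: rows, then columns, then channels.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            sequence.append(patch)
    return sequence


# Toy 32x16 image with 3 channels; each pixel stores (row, col, 0).
image = [[[float(r), float(c), 0.0] for c in range(16)] for r in range(32)]
sequence = image_to_patch_sequence(image, patch_size=8)
print(len(sequence), len(sequence[0]))  # 8 192
```

With a patch size of 8, the 32x16 toy image yields 4 x 2 = 8 patches, each flattened to 8 x 8 x 3 = 192 values. In a transformer pipeline each flattened patch would typically be passed through a linear layer and combined with a position encoding, both of which appear in the Methods list for this paper.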
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Jigsaw · Attention Is All You Need · Byte Pair Encoding · Softmax · Dropout · Label Smoothing · Layer Normalization
