Tran-GCN: A Transformer-Enhanced Graph Convolutional Network for Person Re-Identification in Monitoring Videos
Xiaobin Hong, Tarmizi Adam, Masitah Ghazali

TL;DR
This paper introduces Tran-GCN, a model that combines a transformer, graph convolution, and pose estimation to improve person re-identification accuracy in monitoring videos by capturing both local and global features.
Contribution
The paper presents a Transformer-enhanced Graph Convolutional Network that integrates pose estimation, global dependency learning, and local feature extraction for improved person Re-ID.
Findings
Significantly improves identification accuracy on multiple datasets.
Effectively captures pose and local feature relationships.
Outperforms existing methods in challenging scenarios.
Abstract
Person Re-Identification (Re-ID) has gained popularity in computer vision, enabling cross-camera pedestrian recognition. Although the development of deep learning has provided a robust technical foundation for person Re-ID research, most existing person Re-ID methods overlook the potential relationships among local person features and therefore fail to adequately address the impact of pedestrian pose variations and the occlusion of local body parts. We therefore propose a Transformer-enhanced Graph Convolutional Network (Tran-GCN) model to improve person Re-ID performance in monitoring videos. The model comprises four key components: (1) a Pose Estimation Learning branch estimates pedestrian pose information and inherent skeletal structure data, extracting pedestrian key point information; (2) a Transformer learning branch learns the global dependencies between fine-grained…
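To make the graph-convolution idea in the model's name concrete, here is a minimal NumPy sketch of one graph convolution step applied to per-keypoint features. It is an illustration only, not the paper's architecture: the symmetric normalization follows the widely used Kipf and Welling GCN formulation, and the 5-keypoint skeleton, the feature sizes, and the function names `normalized_adjacency` and `gcn_layer` are all assumptions made for this example.

```python
import numpy as np


def normalized_adjacency(edges, num_nodes):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    adj = np.eye(num_nodes)  # self-loops so each node keeps its own feature
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    degree = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(node_features, adj_norm, weight):
    """One graph convolution: ReLU(A_hat @ X @ W)."""
    return np.maximum(adj_norm @ node_features @ weight, 0.0)


# Hypothetical skeleton (assumption, not from the paper):
# 0 = head, 1 = neck, 2 = hip, 3 = left hand, 4 = right hand.
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))  # one 8-dim local feature per keypoint
weight = rng.normal(size=(8, 4))    # stand-in for a learned projection

adj_norm = normalized_adjacency(EDGES, num_nodes=5)
out = gcn_layer(features, adj_norm, weight)  # shape (5, 4)
```

Each output row mixes a keypoint's own feature with those of its skeletal neighbors, which is one simple way relationships among local body-part features can be propagated along the pose graph.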
Taxonomy
Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Human Pose and Action Recognition
Methods: Attention Is All You Need · Kaiming Initialization · Max Pooling · Byte Pair Encoding · Absolute Position Encodings · Average Pooling · Softmax · Label Smoothing · Layer Normalization · Dropout
