Tran-GCN: A Transformer-Enhanced Graph Convolutional Network for Person Re-Identification in Monitoring Videos

Xiaobin Hong; Tarmizi Adam; Masitah Ghazali

arXiv:2409.09391 · cs.CV · September 17, 2024

TL;DR

This paper introduces Tran-GCN, a novel model combining transformer, graph convolution, and pose estimation to enhance person re-identification accuracy in monitoring videos by effectively capturing local and global features.

Contribution

The paper presents a Transformer-enhanced Graph Convolutional Network that integrates pose estimation, global dependency learning, and local feature extraction for improved person Re-ID.

Findings

01. Significantly improves identification accuracy on multiple datasets.

02. Effectively captures pose and local feature relationships.

03. Outperforms existing methods in challenging scenarios.

Abstract

Person Re-Identification (Re-ID) has gained popularity in computer vision, enabling cross-camera pedestrian recognition. Although deep learning has provided a robust technical foundation for person Re-ID research, most existing methods overlook the potential relationships among local person features and thus fail to adequately address pedestrian pose variations and occlusion of local body parts. We therefore propose a Transformer-enhanced Graph Convolutional Network (Tran-GCN) to improve person Re-ID performance in monitoring videos. The model comprises four key components: (1) a Pose Estimation Learning branch estimates pedestrian pose information and inherent skeletal structure, extracting pedestrian key point information; (2) a Transformer learning branch learns the global dependencies between fine-grained…
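
To make the graph-convolution idea concrete, here is a minimal Python sketch of one graph convolution layer applied to features attached to skeleton key points. It is not the authors' implementation. The five-node skeleton, the feature values, the weight matrix, and the helper names (`normalize_adjacency`, `matmul`, `gcn_layer`) are all assumptions for illustration. The propagation rule is the standard symmetric-normalized form ReLU(D^{-1/2}(A + I)D^{-1/2} H W), which the abstract does not confirm Tran-GCN uses.

```python
import math


def normalize_adjacency(edges, num_nodes):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list-of-lists matrix."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop so each key point keeps its own feature
    for i, j in edges:
        a[i][j] = 1.0  # undirected bone between key points i and j
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(x, y):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical 5-key-point skeleton (assumption, not from the paper):
# head(0), neck(1), left hand(2), right hand(3), hip(4).
SKELETON_EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

a_norm = normalize_adjacency(SKELETON_EDGES, 5)

# Illustrative 2-dimensional local feature per key point (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]

# Illustrative 2x2 weight matrix; in a real model this would be learned.
weight = [[1.0, -1.0], [0.5, 0.5]]

updated = gcn_layer(a_norm, features, weight)
for node, row in enumerate(updated):
    print(node, row)
```

Each row of `updated` mixes a key point's own feature with those of the key points it is connected to, which is one simple way to model relationships among local body-part features. In practice the features would come from a learned backbone rather than being written by hand.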

Taxonomy

Topics: Video Surveillance and Tracking Methods · Gait Recognition and Analysis · Human Pose and Action Recognition

Methods: Attention Is All You Need · Kaiming Initialization · Max Pooling · Byte Pair Encoding · Absolute Position Encodings · Average Pooling · Softmax · Label Smoothing · Layer Normalization · Dropout