Video Person Re-identification by Temporal Residual Learning
Ju Dai, Pingping Zhang, Huchuan Lu, Hongyu Wang

TL;DR
This paper introduces a novel framework for video person re-identification that leverages temporal residual learning and spatial-temporal transformer networks to improve feature extraction and spatial alignment, achieving superior results on multiple datasets.
Contribution
The paper proposes a new feature learning framework combining temporal residual learning with spatial-temporal transformer networks for enhanced video person re-ID.
Findings
Achieves superior performance on large-scale datasets.
Effectively handles spatial misalignment and appearance changes.
Outperforms recent state-of-the-art methods.
Abstract
In this paper, we propose a novel feature learning framework for video person re-identification (re-ID). The framework aims to exploit the temporal information in video sequences and to tackle the poor spatial alignment of moving pedestrians. To exploit the temporal information, we design a temporal residual learning (TRL) module that simultaneously extracts the generic and specific features of consecutive frames. The TRL module is equipped with two bidirectional LSTMs (BiLSTMs), each responsible for describing a moving person from a different aspect, which provides complementary information for better feature representations. To deal with the poor spatial alignment in video re-ID datasets, we propose a spatial-temporal transformer network (ST^2N) module. Transformation parameters in the ST^2N module are learned by leveraging the…
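The split into generic and specific features can be illustrated with a deliberately simplified sketch. This is not the authors' TRL module: the paper uses two BiLSTMs, while the code below swaps in a plain temporal mean for the generic part and per-frame residuals for the specific part, purely to show the idea of a shared component plus frame-level deviations. The function names `split_generic_specific` and `sequence_descriptor`, and the way the residuals are summarised, are assumptions made for this example.

```python
from typing import List, Tuple

Vector = List[float]


def split_generic_specific(frames: List[Vector]) -> Tuple[Vector, List[Vector]]:
    """Split per-frame features into a shared mean and per-frame residuals.

    Hypothetical stand-in for the paper's two BiLSTMs: the temporal mean
    plays the role of the generic feature, and each frame's deviation from
    that mean plays the role of its specific feature.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frame features must have the same length")

    n = len(frames)
    # Generic part: what the frames have in common (per-dimension mean).
    generic = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Specific part: what each frame adds on top of the generic part.
    specific = [[f[d] - generic[d] for d in range(dim)] for f in frames]
    return generic, specific


def sequence_descriptor(frames: List[Vector]) -> Vector:
    """Concatenate the generic feature with a summary of the residuals.

    The residuals are summarised by their mean absolute value per
    dimension; this choice is an assumption of the sketch, not the paper's.
    """
    generic, specific = split_generic_specific(frames)
    n = len(specific)
    residual_summary = [
        sum(abs(r[d]) for r in specific) / n for d in range(len(generic))
    ]
    return generic + residual_summary


# Two frames with 2-dimensional features (toy values).
toy_frames = [[1.0, 2.0], [3.0, 2.0]]
generic, specific = split_generic_specific(toy_frames)
descriptor = sequence_descriptor(toy_frames)
```

In this toy input the second feature dimension is identical across frames, so it contributes only to the generic part and its residuals are zero; the first dimension varies, so it shows up in both parts.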
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Sigmoid Activation · Tanh Activation · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing
