Multi-Granularity Graph Pooling for Video-based Person Re-Identification
Honghu Pan, Yongyong Chen, Zhenyu He

TL;DR
This paper introduces GPNet, a graph pooling network that learns a multi-granularity graph representation and downsamples the graph with an attention-based pooling layer, reporting competitive video-based person re-identification results on four datasets.
Contribution
The paper proposes a multi-granularity graph pooling method with an attention-based pooling layer for better graph representation in person ReID tasks.
Findings
Achieves competitive results on four major datasets.
Outperforms existing graph pooling methods in ReID accuracy.
Effectively captures global and local graph information.
Abstract
Video-based person re-identification (ReID) aims to identify a given pedestrian video sequence across multiple non-overlapping cameras. Graph neural networks (GNNs) have been introduced to aggregate the temporal and spatial features of video samples. However, existing graph-based models, such as STGCN, apply mean/max pooling to the node features to obtain the graph representation, which neglects the graph topology and node importance. In this paper, we propose the graph pooling network (GPNet) to learn a multi-granularity graph representation for video retrieval, where a graph pooling layer is implemented to downsample the graph. We first construct a multi-granular graph, whose node features denote image embeddings learned by the backbone, and whose edges are established between temporal and Euclidean neighborhood nodes. We then implement…
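The graph construction and the mean/max readout criticized in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes one embedding per frame as the node feature (a single granularity only), links consecutive frames as temporal neighbors, and uses k-nearest neighbors in feature space as the Euclidean neighborhood. The function names `build_adjacency` and `readout` and the parameter `k` are hypothetical.

```python
import numpy as np


def build_adjacency(features, k=1):
    """Build a symmetric boolean adjacency matrix (illustrative assumption).

    features: (T, D) array, one embedding per frame, in temporal order.
    k: number of Euclidean nearest neighbors per node (must be < T).
    """
    t = features.shape[0]
    adj = np.zeros((t, t), dtype=bool)

    # Temporal edges: connect each frame to the next one.
    idx = np.arange(t - 1)
    adj[idx, idx + 1] = True

    # Euclidean edges: connect each node to its k nearest neighbors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-matches
    nearest = np.argsort(dist, axis=1)[:, :k]
    rows = np.repeat(np.arange(t), k)
    adj[rows, nearest.ravel()] = True

    # Make the graph undirected.
    return adj | adj.T


def readout(features, mode="mean"):
    """Mean/max pooling over nodes; note that it never looks at the edges."""
    if mode == "mean":
        return features.mean(axis=0)
    if mode == "max":
        return features.max(axis=0)
    raise ValueError(f"unknown mode: {mode}")


# Toy sequence of 4 frames with 2-dimensional embeddings.
feats = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
adj = build_adjacency(feats, k=1)
graph_mean = readout(feats, "mean")
graph_max = readout(feats, "max")
```

Because `readout` takes only the node features, any two graphs with the same nodes but different edges yield the same representation, which is the limitation the abstract attributes to mean/max pooling.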
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
