TL;DR
This paper analyzes spatial attention in spatio-temporal GCNs for skeleton-based human action recognition, proposing a symmetric attention mechanism and a bilinear network that enhances model flexibility and efficiency.
Contribution
It introduces a symmetric spatial attention mechanism and a bilinear network architecture that removes the need for predefined adjacency matrices in spatio-temporal GCNs.
Findings
Symmetric spatial attention better reflects the symmetric relative positions of the body joints during an action.
ST-BLN achieves comparable performance without predefined adjacency matrices.
The proposed models increase efficiency while maintaining accuracy.
Abstract
Graph convolutional networks (GCNs) have achieved promising performance in skeleton-based human action recognition by modeling a sequence of skeletons as a spatio-temporal graph. Most recently proposed GCN-based methods improve performance by learning the graph structure at each layer of the network, using a spatial attention applied to a predefined graph adjacency matrix that is optimized jointly with the model's parameters in an end-to-end manner. In this paper, we analyze the spatial attention used in spatio-temporal GCN layers and propose a symmetric spatial attention that better reflects the symmetric property of the relative positions of the human body joints when executing actions. We also highlight the connection of spatio-temporal GCN layers employing additive spatial attention to bilinear layers, and we propose the spatio-temporal bilinear network (ST-BLN) which does not…
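To make the ideas in the abstract concrete, here is a minimal NumPy sketch for a single frame of joint features. It is a loose illustration under stated assumptions, not the paper's formulation: the function names, the embedding-based attention `tanh(E E^T / D)`, and the single-matrix layers are all hypothetical choices made for this example. It shows an attention matrix that is symmetric by construction, a spatial layer that adds that attention to a predefined adjacency matrix, and an adjacency-free variant in the spirit of ST-BLN.

```python
import numpy as np


def symmetric_attention(x, w_embed):
    """Return a (V, V) attention matrix that is symmetric by construction.

    x:       (V, C) features of V joints in one frame.
    w_embed: (C, D) hypothetical learnable embedding.
    Assumption: attention = tanh(E E^T / D) with E = x @ w_embed.
    E E^T is symmetric, and an elementwise tanh keeps it symmetric.
    """
    e = x @ w_embed
    return np.tanh(e @ e.T / e.shape[1])


def spatial_graph_conv(x, adjacency, attention, w_out):
    """Spatial layer with additive attention: (A + M) X W."""
    return (adjacency + attention) @ x @ w_out


def bilinear_spatial_layer(x, w_embed, w_out):
    """Adjacency-free variant: the mixing matrix comes only from the features."""
    return symmetric_attention(x, w_embed) @ x @ w_out


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))        # 5 joints, 3 channels (toy sizes)
w_embed = rng.normal(size=(3, 4))
w_out = rng.normal(size=(3, 2))
adjacency = np.eye(5)              # placeholder for a predefined skeleton graph

attention = symmetric_attention(x, w_embed)
with_graph = spatial_graph_conv(x, adjacency, attention, w_out)
without_graph = bilinear_spatial_layer(x, w_embed, w_out)
```

Because the attention depends on the input, the adjacency-free layer multiplies a function of `x` by `x` again, which is the sense in which this sketch resembles a bilinear layer. A row-wise softmax is deliberately avoided here, since normalizing each row separately would generally break the symmetry.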
Taxonomy
Methods: Graph Convolutional Network
