TL;DR
This paper introduces an efficient GCN-based baseline for skeleton-based action recognition that outperforms many SOTA models while significantly reducing computational complexity and enhancing explainability.
Contribution
The authors propose a novel GCN architecture with early fused multi-input branches, residual bottleneck modules, and part-wise attention, improving efficiency and interpretability.
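To give a rough sense of why a bottleneck structure cuts parameters, the sketch below counts convolution weights for a plain block and for a bottleneck block (1x1 reduce, wide-kernel convolution, 1x1 expand). The channel width, kernel size, and reduction ratio are hypothetical values chosen for illustration, not figures taken from the paper, and biases and batch-normalization parameters are ignored.

```python
def conv_weights(c_in, c_out, kernel):
    """Weight count of a convolution with a (kernel x 1) window, bias ignored."""
    return c_in * c_out * kernel


def basic_block_weights(channels, kernel):
    """A plain block: one full-width convolution."""
    return conv_weights(channels, channels, kernel)


def bottleneck_block_weights(channels, kernel, reduction):
    """A bottleneck block: 1x1 reduce, (kernel x 1) conv, 1x1 expand."""
    inner = channels // reduction
    return (
        conv_weights(channels, inner, 1)
        + conv_weights(inner, inner, kernel)
        + conv_weights(inner, channels, 1)
    )


# Hypothetical settings, assumed only for this illustration.
channels, kernel, reduction = 256, 9, 4

basic = basic_block_weights(channels, kernel)
bottleneck = bottleneck_block_weights(channels, kernel, reduction)

print(basic)       # 589824
print(bottleneck)  # 69632
print(round(basic / bottleneck, 2))  # 8.47
```

Under these assumed settings the bottleneck block uses roughly 8.5 times fewer weights than the plain block. The whole-model savings reported for the paper depend on its actual configuration.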
Findings
Outperforms other SOTA models on NTU datasets
Requires up to 34 times fewer parameters than competing SOTA models
Maintains high accuracy with reduced complexity
Abstract
One essential problem in skeleton-based action recognition is how to extract discriminative features over all skeleton joints. However, the State-Of-The-Art (SOTA) models for this task tend to be exceedingly sophisticated and over-parameterized, and their low efficiency in training and inference has obstructed development in the field, especially for large-scale action datasets. In this work, we propose an efficient but strong baseline based on Graph Convolutional Network (GCN), which combines three main improvements: early fused Multiple Input Branches (MIB), a Residual GCN (ResGCN) with bottleneck structure, and a Part-wise Attention (PartAtt) block. Firstly, an MIB is designed to enrich informative skeleton features and retain compact representations at an early fusion stage. Then, inspired by the success of the ResNet architecture in Convolutional…
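As a point of reference for the GCN building block the abstract refers to, here is a minimal sketch of one graph-convolution step over skeleton joints. It assumes the widely used symmetric normalization of the adjacency matrix with self-loops; the source does not confirm that the paper uses exactly this form. The 3-joint chain, the features, and the weight matrix are hypothetical.

```python
import math


def normalized_adjacency(edges, num_joints):
    """Return D^(-1/2) (A + I) D^(-1/2) for an undirected skeleton graph."""
    # Start from the identity so every joint has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def graph_conv(features, weight, adj):
    """One step: aggregate neighbor features, then apply a linear map."""
    return matmul(matmul(adj, features), weight)


# Hypothetical 3-joint chain: joint 0 - joint 1 - joint 2.
edges = [(0, 1), (1, 2)]
adj = normalized_adjacency(edges, 3)

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one 2-d feature per joint
weight = [[1.0, 0.0], [0.0, 1.0]]                # identity, for readability

out = graph_conv(features, weight, adj)
for row in out:
    print([round(v, 3) for v in row])
```

Each output row mixes a joint's own feature with those of its connected neighbors, which is how information spreads along the skeleton.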
Taxonomy
Methods: 1x1 Convolution · Average Pooling · Batch Normalization · Residual Connection · Residual Block · Bottleneck Residual Block · Max Pooling · Convolution · Graph Convolutional Network
