Improving Skeleton-based Action Recognition with Interactive Object Information
Hao Wen, Ziqian Lu, Fengli Shen, Zhe-Ming Lu, Jialin Cui

TL;DR
This paper improves skeleton-based action recognition by adding interactive object information to the skeleton graph, using a new graph convolutional network (ST-VGCN), two new datasets, and data augmentation, and reports state-of-the-art accuracy.
Contribution
It introduces object nodes into skeleton graphs, proposes ST-VGCN for variable graph modeling, and develops data augmentation and fusion modules to improve recognition performance.
Findings
Achieved 96.7% accuracy on the NTU RGB+D 60 cross-subject split.
Established a new dataset, JXGC 24, and an extended dataset, NTU RGB+D+Object 60, including more than 2 million additional object nodes.
Surpassed previous state-of-the-art results on multiple benchmarks.
Abstract
Human skeleton information is important in skeleton-based action recognition because it provides a simple and efficient way to describe human pose. However, existing skeleton-based methods focus on the skeleton and ignore the objects that humans interact with, which leads to poor performance on actions that involve object interactions. We propose a new action recognition framework that introduces object nodes to supply the missing interactive object information. We also propose Spatial Temporal Variable Graph Convolutional Networks (ST-VGCN) to effectively model the Variable Graph (VG) containing object nodes. Specifically, to validate the role of interactive object information, we use a simple self-training approach to establish a new dataset, JXGC 24, and an extended dataset, NTU RGB+D+Object 60, including more than 2 million additional object nodes. At the same…
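To make the idea of a skeleton graph with a varying number of object nodes more concrete, here is a minimal Python sketch. It is not the authors' ST-VGCN. The five-joint toy skeleton, the rule that every object node attaches to both hands, the padding budget `MAX_OBJECTS`, and the function names `build_variable_graph` and `graph_conv` are all assumptions made for illustration. The sketch pads the graph to a fixed size and masks out unused object slots, which is one common way to batch graphs whose node count changes from sample to sample.

```python
import numpy as np

# Toy 5-joint skeleton (hypothetical layout, NOT the NTU RGB+D joint set):
# 0 = torso, 1 = left hand, 2 = right hand, 3 = left foot, 4 = right foot
SKELETON_EDGES = [(0, 1), (0, 2), (0, 3), (0, 4)]
NUM_JOINTS = 5
MAX_OBJECTS = 2        # assumed padding budget for object nodes
HAND_JOINTS = (1, 2)   # assumption: each object node links to both hands


def build_variable_graph(num_objects):
    """Return (normalized adjacency, node mask) for a skeleton plus object nodes.

    The matrix always has NUM_JOINTS + MAX_OBJECTS rows; unused object
    slots stay all-zero and are marked False in the mask.
    """
    if not 0 <= num_objects <= MAX_OBJECTS:
        raise ValueError("num_objects must be between 0 and MAX_OBJECTS")

    n = NUM_JOINTS + MAX_OBJECTS
    adj = np.zeros((n, n), dtype=np.float32)
    for i, j in SKELETON_EDGES:
        adj[i, j] = adj[j, i] = 1.0

    mask = np.zeros(n, dtype=bool)
    mask[:NUM_JOINTS] = True
    for k in range(num_objects):
        obj = NUM_JOINTS + k
        mask[obj] = True
        for hand in HAND_JOINTS:
            adj[obj, hand] = adj[hand, obj] = 1.0

    # Self-loops for active nodes only, so padded slots remain isolated.
    active = np.flatnonzero(mask)
    adj[active, active] = 1.0

    # Symmetric normalization D^{-1/2} A D^{-1/2}, skipping padded nodes.
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[mask] = deg[mask] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :], mask


def graph_conv(features, norm_adj, weight):
    """One graph convolution step: aggregate neighbours, then project."""
    return norm_adj @ features @ weight
```

Calling `build_variable_graph(1)` yields a 7×7 matrix with six active nodes and one zeroed padding slot. Because the padded row is all zeros, `graph_conv` produces a zero feature vector for that slot, so the missing object contributes nothing to its neighbours.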
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
