Improving Skeleton-based Action Recognition with Interactive Object Information

Hao Wen; Ziqian Lu; Fengli Shen; Zhe-Ming Lu; Jialin Cui

arXiv:2501.05066·cs.CV·January 10, 2025



TL;DR

This paper improves skeleton-based action recognition by adding interactive object information to the skeleton graph. It combines a new graph convolutional network (ST-VGCN), two new datasets, and data augmentation techniques, and reports state-of-the-art accuracy.

Contribution

The paper introduces object nodes into skeleton graphs, proposes ST-VGCN to model the resulting variable graph, and develops data augmentation and fusion modules to improve recognition performance.
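To make the object-node idea concrete, here is a minimal sketch of a skeleton graph whose node count varies with the number of interacting objects. It is an illustration under stated assumptions, not the paper's ST-VGCN implementation: the five-joint toy skeleton, the choice to attach each object node to both hands, the padding mask, and the function names `build_variable_adjacency` and `normalize_adjacency` are all hypothetical.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton (NOT the NTU RGB+D joint layout):
# 0 = torso, 1 = left elbow, 2 = left hand, 3 = right elbow, 4 = right hand
SKELETON_EDGES = [(0, 1), (1, 2), (0, 3), (3, 4)]
NUM_JOINTS = 5
HAND_JOINTS = (2, 4)  # assumption: object nodes are linked to the hands


def build_variable_adjacency(num_objects, max_objects):
    """Return (adjacency, node_mask) for a skeleton plus object nodes.

    The matrix is padded to NUM_JOINTS + max_objects nodes so that samples
    with different object counts share one shape; node_mask marks real nodes.
    """
    if not 0 <= num_objects <= max_objects:
        raise ValueError("num_objects must be in [0, max_objects]")
    size = NUM_JOINTS + max_objects
    adj = np.zeros((size, size), dtype=np.float32)

    # Fixed skeleton bones (undirected).
    for i, j in SKELETON_EDGES:
        adj[i, j] = adj[j, i] = 1.0

    # Variable part: one extra node per object, linked to each hand joint.
    for k in range(num_objects):
        obj = NUM_JOINTS + k
        for hand in HAND_JOINTS:
            adj[obj, hand] = adj[hand, obj] = 1.0

    mask = np.zeros(size, dtype=bool)
    mask[: NUM_JOINTS + num_objects] = True

    # Self-loops on real nodes only; padded nodes stay fully disconnected.
    real = np.flatnonzero(mask)
    adj[real, real] = 1.0
    return adj, mask


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 A D^-1/2, leaving padded rows at zero."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


if __name__ == "__main__":
    # One object present, room for up to two: 5 joints + 1 object + 1 pad.
    adj, mask = build_variable_adjacency(num_objects=1, max_objects=2)
    print(adj.shape, int(mask.sum()))
    print(normalize_adjacency(adj).round(2))
```

Padding with a mask is only one way to batch graphs of different sizes; how the paper itself handles the variable graph is not specified in this summary.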

Findings

01. Achieved 96.7% accuracy on the NTU RGB+D 60 cross-subject split.

02. Created two new datasets, JXGC 24 and NTU RGB+D+Object 60, with more than 2 million object nodes.

03. Surpassed previous state-of-the-art results on multiple benchmarks.

Abstract

Human skeleton information, a simple and efficient description of human pose, is central to skeleton-based action recognition. However, existing skeleton-based methods focus on the skeleton and ignore the objects that humans interact with, which leads to poor performance on actions that involve object interactions. We propose a new action recognition framework that introduces object nodes to supply the missing interactive object information. We also propose Spatial Temporal Variable Graph Convolutional Networks (ST-VGCN) to effectively model the Variable Graph (VG) containing object nodes. Specifically, to validate the role of interactive object information, we use a simple self-training approach to establish a new dataset, JXGC 24, and an extended dataset, NTU RGB+D+Object 60, including more than 2 million additional object nodes. At the same…


Code & Models

Repositories

moonlight52137/st-vgcn (PyTorch, official)


Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
