Multi-Semantic Fusion Model for Generalized Zero-Shot Skeleton-Based Action Recognition

Ming-Zhe Li; Zhen Jia; Zhang Zhang; Zhanyu Ma; and Liang Wang

arXiv:2309.09592 · cs.CV · September 19, 2023

TL;DR

This paper introduces a multi-semantic fusion model that leverages rich textual descriptions and skeleton features to improve generalized zero-shot skeleton-based action recognition, addressing the limitations of previous label-only approaches.

Contribution

The proposed MSF model integrates multiple semantic sources and a VAE-based alignment to enhance recognition of unseen actions in GZSSAR.

Findings

01. Outperforms previous models in GZSSAR tasks
02. Effectively utilizes action and motion descriptions for better generalization
03. Demonstrates robustness in recognizing unseen actions

Abstract

Generalized zero-shot skeleton-based action recognition (GZSSAR) is a new and challenging problem in the computer vision community, which requires models to recognize actions without any training samples. Previous studies only utilize the action labels of verb phrases as the semantic prototypes for learning the mapping from skeleton-based actions to a shared semantic space. However, the limited semantic information of action labels restricts the generalization ability of skeleton features for recognizing unseen actions. To solve this dilemma, we propose a multi-semantic fusion (MSF) model for improving the performance of GZSSAR, where two kinds of class-level textual descriptions (i.e., action descriptions and motion descriptions) are collected as auxiliary semantic information to enhance the learning efficacy of generalizable skeleton features. Specifically, a pre-trained language…
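
The abstract describes mapping skeleton features into a shared semantic space and comparing them against class-level semantic prototypes. The sketch below is only a toy illustration of that general idea, not the paper's MSF model: it assumes nearest-prototype classification by cosine similarity, and the vectors, class names, and the helpers `cosine` and `classify` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(skeleton_embedding, prototypes):
    """Return the class whose semantic prototype is most similar."""
    return max(prototypes, key=lambda name: cosine(skeleton_embedding, prototypes[name]))


# Toy 3-D prototypes with made-up values (assumption, not from the paper).
# In the generalized setting, the candidate set mixes seen and unseen classes.
prototypes = {
    "walking": [1.0, 0.0, 0.0],  # seen during training
    "waving": [0.0, 1.0, 0.0],   # seen during training
    "kicking": [0.0, 0.0, 1.0],  # unseen: only a text-derived prototype exists
}

# Hypothetical output of a skeleton encoder projected into the shared space.
embedding = [0.1, 0.2, 0.9]
predicted = classify(embedding, prototypes)
print(predicted)
```

Because the unseen class is represented only by its prototype, richer text (such as the action and motion descriptions mentioned above) would plausibly yield more discriminative prototypes than a bare label; the sketch does not model that step.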

Code & Models

Repositories

EHZ9NIWI7/MSF-GZSSAR (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Hand Gesture Recognition Systems