SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders

Sheng-Wei Li; Zi-Xiang Wei; Wei-Jie Chen; Yi-Hsin Yu; Chih-Yuan Yang; Jane Yung-jen Hsu

arXiv:2407.13460·cs.CV·July 19, 2024

TL;DR

SA-DVAE uses disentangled variational autoencoders to improve zero-shot skeleton-based action recognition. It aligns semantic embeddings with only the semantic-related part of the skeleton features, which addresses the imbalance between highly variable skeleton sequences and constant class labels.

Contribution

The paper proposes a disentangled VAE framework that separates skeleton features into a semantic-related part and a semantic-irrelevant part, so that only the semantic-related part is aligned with the class-label embeddings in zero-shot action recognition.

Findings

01  SA-DVAE outperforms existing methods on the NTU RGB+D, NTU RGB+D 120, and PKU-MMD benchmark datasets.

02  Feature disentanglement improves semantic-skeleton feature alignment.

03  Experimental results demonstrate significant accuracy gains.

Abstract

Existing zero-shot skeleton-based action recognition methods utilize projection networks to learn a shared latent space of skeleton features and semantic embeddings. The inherent imbalance in action recognition datasets, characterized by variable skeleton sequences yet constant class labels, presents significant challenges for alignment. To address the imbalance, we propose SA-DVAE -- Semantic Alignment via Disentangled Variational Autoencoders, a method that first adopts feature disentanglement to separate skeleton features into two independent parts -- one semantic-related and the other semantic-irrelevant -- to better align skeleton and semantic features. We implement this idea via a pair of modality-specific variational autoencoders coupled with a total correlation penalty. We conduct experiments on three benchmark datasets: NTU RGB+D, NTU RGB+D 120 and PKU-MMD, and our experimental…
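
The disentanglement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (the official repository is PyTorch-based); it uses only the Python standard library, and every function name, dimension, and number below is a hypothetical choice made for illustration. It assumes a skeleton encoder that outputs a diagonal-Gaussian posterior, splits that posterior into a semantic-related part and an irrelevant part, and computes two closed-form KL terms. The KL between the semantic-related part and a text posterior is only a stand-in for an alignment objective, and the penalty that discourages dependence between the two parts is deliberately left out, since it typically requires a learned estimator.

```python
import math
import random


def split_latent(mu, logvar, n_semantic):
    """Split a diagonal-Gaussian posterior into a semantic-related part
    (first n_semantic dims) and an irrelevant part (remaining dims).
    The split point is a hypothetical hyperparameter."""
    semantic = (mu[:n_semantic], logvar[:n_semantic])
    irrelevant = (mu[n_semantic:], logvar[n_semantic:])
    return semantic, irrelevant


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


def kl_between_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """Closed-form KL(q || p) for two diagonal Gaussians.
    Used here as an assumed stand-in for an alignment term."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0
    return 0.5 * total


# Hypothetical encoder outputs: a 4-d skeleton posterior and a 2-d text posterior.
rng = random.Random(0)
skel_mu, skel_logvar = [0.2, -0.1, 0.4, 0.0], [-1.0, -1.0, -0.5, -0.5]
text_mu, text_logvar = [0.1, 0.0], [-1.0, -1.0]

(sem_mu, sem_logvar), (irr_mu, irr_logvar) = split_latent(skel_mu, skel_logvar, 2)
z_semantic = reparameterize(sem_mu, sem_logvar, rng)      # would feed a decoder
align_term = kl_between_gaussians(sem_mu, sem_logvar, text_mu, text_logvar)
prior_term = kl_to_standard_normal(irr_mu, irr_logvar)    # regularizes the irrelevant part
```

In this sketch only the semantic-related dimensions are compared with the text posterior, so variation that a class label cannot explain is free to live in the irrelevant dimensions. That is the intuition behind the alignment improvement described above, under the stated assumptions.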

Code & Models

Repositories

pha123661/SA-DVAE (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications

Methods: ALIGN