Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition

Yu Wang; Sanping Zhou; Kun Xia; Le Wang

arXiv:2404.16416·cs.CV·April 26, 2024·1 cites

Open Access

TL;DR

This paper introduces a semi-supervised action recognition framework that sharpens spatio-temporal discrimination through adaptive contrastive learning and multi-scale temporal modeling, improving accuracy on the UCF101, HMDB51, and Kinetics400 benchmarks.

Contribution

It proposes two techniques, Adaptive Contrastive Learning (ACL) and a multi-scale temporal modeling technique (MTL), and integrates them into a unified framework that better distinguishes actions when labeled data is limited.

Findings

01 Outperforms state-of-the-art methods on UCF101, HMDB51, and Kinetics400

02 Effectively leverages unlabeled data for improved recognition

03 Enhances discriminative spatio-temporal feature learning

Abstract

Semi-supervised action recognition aims to improve spatio-temporal reasoning using a small amount of labeled data together with a large amount of unlabeled data. Despite recent advances, existing powerful methods are still prone to ambiguous predictions when labeled data is scarce, which manifests as difficulty distinguishing different actions that share similar spatio-temporal information. In this paper, we approach this problem by equipping the model with two capabilities, namely discriminative spatial modeling and temporal structure modeling, for learning discriminative spatio-temporal representations. Specifically, we propose an Adaptive Contrastive Learning (ACL) strategy. It assesses the confidence of all unlabeled samples using the class prototypes of the labeled data, and adaptively selects positive-negative samples from a pseudo-labeled sample bank to construct contrastive…
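
The prototype-based confidence step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cosine-similarity scoring, the softmax temperature, the fixed confidence threshold, and the helper names `class_prototypes`, `prototype_confidence`, and `select_pairs` are all assumptions made for illustration. How the selected pairs enter a loss is not sketched here.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def class_prototypes(feats, labels, num_classes):
    """Mean normalized feature per class, computed from labeled data.
    Assumes every class has at least one labeled sample."""
    feats = l2_normalize(feats)
    protos = np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])
    return l2_normalize(protos)

def prototype_confidence(unlabeled_feats, protos, temperature=0.1):
    """Pseudo-label and confidence from a softmax over cosine similarity
    to the prototypes (the scoring rule is an assumption)."""
    logits = (l2_normalize(unlabeled_feats) @ protos.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs.max(axis=1)

def select_pairs(anchor_label, bank_labels, bank_conf, threshold=0.8):
    """Indices of confident bank samples that share the anchor's
    pseudo-label (positives) or differ from it (negatives).
    The fixed threshold is a hypothetical stand-in for adaptive selection."""
    confident = bank_conf >= threshold
    pos = np.flatnonzero(confident & (bank_labels == anchor_label))
    neg = np.flatnonzero(confident & (bank_labels != anchor_label))
    return pos, neg

# Toy 2-D features: two classes, three unlabeled samples.
labeled = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
unlabeled = np.array([[1.0, 0.05], [0.05, 1.0], [0.7, 0.7]])

protos = class_prototypes(labeled, labels, num_classes=2)
pseudo, conf = prototype_confidence(unlabeled, protos)
pos, neg = select_pairs(anchor_label=0, bank_labels=pseudo, bank_conf=conf)
```

In this toy setup the third unlabeled sample lies midway between the two prototypes, so its confidence stays near 0.5 and it is excluded from both the positive and negative sets.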

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis

Methods: Contrastive Language-Image Pre-training