Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation

Jay Patravali; Gaurav Mittal; Ye Yu; Fuxin Li; Mei Chen

arXiv:2109.15317·cs.CV·October 12, 2021

TL;DR

MetaUVFS is an unsupervised meta-learning approach for video few-shot action recognition. It trains on large-scale unlabeled videos with a novel action-appearance alignment module, outperforms prior unsupervised methods on few-shot benchmarks, and in some cases outperforms supervised methods.

Contribution

It introduces MetaUVFS, the first unsupervised meta-learning algorithm for video few-shot action recognition, with a novel Action-Appearance Aligned Meta-adaptation module.

Findings

1. Outperforms all unsupervised methods on few-shot benchmarks.

2. Requires no labeled base classes or supervised pretraining.

3. Can sometimes outperform supervised methods on popular datasets.
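
Few-shot benchmarks of this kind are usually scored over N-way K-shot episodes: a handful of labeled support examples per class, then unlabeled queries to classify. The sketch below illustrates that protocol only. The names `sample_episode` and `nearest_centroid_accuracy` are hypothetical, and the nearest-centroid classifier is a simple stand-in chosen for illustration, not the learner used by MetaUVFS.

```python
import random

def sample_episode(features_by_class, n_way=5, k_shot=1, n_query=1, seed=None):
    """Draw one N-way K-shot episode from a {label: [feature vectors]} dict."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(features_by_class), n_way)
    support, query = [], []
    for label in classes:
        # Disjoint support and query items for each sampled class.
        items = rng.sample(features_by_class[label], k_shot + n_query)
        support += [(feat, label) for feat in items[:k_shot]]
        query += [(feat, label) for feat in items[k_shot:]]
    return support, query

def nearest_centroid_accuracy(support, query):
    """Classify each query by its closest class centroid (illustrative stand-in)."""
    sums, counts = {}, {}
    for feat, label in support:
        acc = sums.setdefault(label, [0.0] * len(feat))
        for j, x in enumerate(feat):
            acc[j] += x
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [x / counts[c] for x in s] for c, s in sums.items()}

    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    correct = sum(
        min(centroids, key=lambda c: sq_dist(feat, centroids[c])) == label
        for feat, label in query
    )
    return correct / len(query)
```

In practice the feature vectors would come from a trained video encoder, and accuracy would be averaged over many sampled episodes rather than read from a single one.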

Abstract

We present MetaUVFS as the first Unsupervised Meta-learning algorithm for Video Few-Shot action recognition. MetaUVFS leverages over 550K unlabeled videos to train a two-stream 2D and 3D CNN architecture via contrastive learning to capture the appearance-specific spatial and action-specific spatio-temporal video features, respectively. MetaUVFS comprises a novel Action-Appearance Aligned Meta-adaptation (A3M) module that learns to focus on the action-oriented video features in relation to the appearance features via explicit few-shot episodic meta-learning over unsupervised hard-mined episodes. Our action-appearance alignment and explicit few-shot learner condition the unsupervised training to mimic the downstream few-shot task, enabling MetaUVFS to significantly outperform all unsupervised methods on few-shot benchmarks. Moreover, unlike previous few-shot action recognition methods…
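
The abstract says the two streams are trained via contrastive learning but does not give the objective here. As a rough illustration, the sketch below implements the widely used InfoNCE loss in plain Python. It is an assumption that a loss of this family is meant; the function names, the temperature value, and the list-of-lists embedding format are all hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE loss over two batches of embeddings of equal size.

    view_a[i] and view_b[i] are embeddings of two augmented views of the
    same clip (the positive pair); every other row of view_b acts as a
    negative for view_a[i].
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every candidate, sharpened by the temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature
                  for cand in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the positive pair as the target class.
        total += log_denominator - logits[i]
    return total / len(a)
```

The loss is small when each embedding is closest to its own positive and grows when positives are confused with negatives, which is the property a contrastive pretraining stage relies on.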

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

Methods: 3-Dimensional Convolutional Neural Network · Contrastive Learning