Universal Prototype Transport for Zero-Shot Action Recognition and Localization

Pascal Mettes

arXiv:2203.03971·cs.CV·August 2, 2023

TL;DR

This paper introduces universal prototype transport, a novel method that repositions semantic prototypes to improve zero-shot action recognition and localization by reducing bias and matching test video distributions.

Contribution

It proposes a new transport-based approach to re-position semantic prototypes, enhancing zero-shot recognition and localization performance over existing universal models.

Findings

01. Boosts zero-shot classification accuracy.
02. Reduces bias in unseen action inference.
03. Improves localization of actions in videos.

Abstract

This work addresses the problem of recognizing action categories in videos when no training examples are available. The current state of the art enables such zero-shot recognition by learning universal mappings from videos to a semantic space, trained either on large-scale seen actions or on objects. While effective, we find that universal action and object mappings are biased towards specific regions in the semantic space. These biases lead to a fundamental problem: many unseen action categories are simply never inferred during testing. For example, on UCF-101, a quarter of the unseen actions are out of reach with a state-of-the-art universal action model. To that end, this paper introduces universal prototype transport for zero-shot action recognition. The main idea is to re-position the semantic prototypes of unseen actions by matching them to the distribution of all test videos. For…
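The two ideas in the abstract, measuring how many unseen classes are never predicted and re-positioning prototypes toward the test video distribution, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract is truncated before the transport is specified, so the entropic optimal transport (Sinkhorn iterations), the cosine cost, the uniform masses, the interpolation `step`, and all function names below are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def predict(videos, prototypes):
    """Assign each video embedding to its most similar prototype."""
    return np.argmax(videos @ prototypes.T, axis=1)


def unreached_fraction(videos, prototypes):
    """Fraction of classes that no test video is ever assigned to."""
    reached = np.unique(predict(videos, prototypes))
    return 1.0 - reached.size / prototypes.shape[0]


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Entropic optimal transport plan between videos (rows) and prototypes
    (columns). Uniform mass on both sides is an assumption of this sketch."""
    n, k = cost.shape
    a = np.full(n, 1.0 / n)
    b = np.full(k, 1.0 / k)
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    return u[:, None] * kernel * v[None, :]


def transport_prototypes(videos, prototypes, step=0.5, reg=0.05):
    """Move each prototype part of the way toward the transport-weighted mean
    of the test videos matched to it (hypothetical update rule)."""
    cost = 1.0 - videos @ prototypes.T  # cosine distance
    plan = sinkhorn_plan(cost, reg=reg)
    targets = (plan.T @ videos) / plan.sum(axis=0)[:, None]
    moved = (1.0 - step) * prototypes + step * l2_normalize(targets)
    return l2_normalize(moved)
```

Under these assumptions, one would compare `unreached_fraction(videos, prototypes)` before and after calling `transport_prototypes`, where `videos` holds unit-normalized test embeddings in the semantic space and `prototypes` holds the unit-normalized class embeddings of the unseen actions. Forcing equal mass per prototype is a design choice of the sketch: it pushes every class to claim some share of the test videos, which is one way to counter classes that are otherwise never inferred.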

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications