Learning Using Privileged Information for Zero-Shot Action Recognition

Zhiyi Gao; Yonghong Hou; Wanqing Li; Zihui Guo; Bin Yu

arXiv:2206.08632 · cs.CV · June 23, 2022


Open Access

TL;DR

This paper introduces a novel approach for zero-shot action recognition that leverages object semantics as privileged information, using a hallucination network and cross-attention to improve recognition of unseen video actions.

Contribution

It proposes a new method combining object semantics with visual features via a hallucination network and cross-attention, effectively narrowing the semantic gap in ZSAR.
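As a rough illustration of how a cross-attention step can augment visual features with object semantics, the following is a minimal NumPy sketch, not the authors' implementation. The function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the residual addition, and all dimensions are assumptions made for this example; the summary above does not specify them.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(visual, obj_sem, w_q, w_k, w_v):
    """Augment visual features with object semantics (hypothetical layout).

    visual:  (T, d_v) per-segment visual features, used as queries
    obj_sem: (N, d_s) object-semantic embeddings, used as keys and values
    Returns the augmented features (T, d_v) and the attention map (T, N).
    """
    q = visual @ w_q                      # (T, d)
    k = obj_sem @ w_k                     # (N, d)
    v = obj_sem @ w_v                     # (N, d_v)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)       # each row sums to 1
    # Residual addition is an assumption; concatenation would also be plausible.
    return visual + attn @ v, attn


rng = np.random.default_rng(0)
T, N, d_v, d_s, d = 4, 3, 8, 6, 5         # toy sizes, chosen arbitrarily
visual = rng.normal(size=(T, d_v))
obj_sem = rng.normal(size=(N, d_s))
w_q = rng.normal(size=(d_v, d))
w_k = rng.normal(size=(d_s, d))
w_v = rng.normal(size=(d_s, d_v))

augmented, attn = cross_attention(visual, obj_sem, w_q, w_k, w_v)
```

Here the visual features act as queries and the object semantics as keys and values, so each visual segment draws on the object embeddings most relevant to it. In a trained model the projection matrices would be learned rather than sampled at random.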

Findings

01 Outperforms state-of-the-art methods on the Olympic Sports, HMDB51, and UCF101 datasets

02 Narrows the semantic gap by using object semantics as privileged information

03 Improves accuracy in zero-shot action recognition

Abstract

Zero-Shot Action Recognition (ZSAR) aims to recognize video actions that have never been seen during training. Most existing methods assume a shared semantic space between seen and unseen actions and attempt to directly learn a mapping from a visual space to that semantic space. This approach is hindered by the semantic gap between the visual space and the semantic space. This paper presents a novel method that uses object semantics as privileged information to narrow the semantic gap and thereby assist learning. In particular, a simple hallucination network is proposed to implicitly extract object semantics during testing without explicitly extracting objects, and a cross-attention module is developed to augment visual features with the object semantics. Experiments on the Olympic Sports, HMDB51 and UCF101 datasets have shown that the proposed method outperforms the…
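The privileged-information idea in the abstract can be sketched as follows: object semantics are available only during training, and a hallucination model learns to predict them from visual features so that no object extraction is needed at test time. This is a toy sketch under strong assumptions, not the paper's network: a closed-form linear least-squares fit stands in for the hallucination network, the data are synthetic, and the names `w_hall` and `hallucinate` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, d_v, d_s = 200, 16, 8            # toy sizes, chosen arbitrarily

# Synthetic training set: visual features plus privileged object semantics.
# In this toy setup the semantics are a noisy linear function of the visuals.
true_map = rng.normal(size=(d_v, d_s))
visual_train = rng.normal(size=(n_train, d_v))
obj_sem_train = visual_train @ true_map + 0.01 * rng.normal(size=(n_train, d_s))

# "Training": regress the privileged object semantics from visual features.
# A linear least-squares fit stands in for a learned hallucination network.
w_hall, *_ = np.linalg.lstsq(visual_train, obj_sem_train, rcond=None)


def hallucinate(visual):
    """Predict object semantics from visual features alone (test-time path)."""
    return visual @ w_hall


# "Testing": no object semantics are provided, only visual features.
visual_test = rng.normal(size=(5, d_v))
hallucinated = hallucinate(visual_test)

# Compare against the noise-free semantics the toy generator would produce.
target = visual_test @ true_map
rel_err = np.linalg.norm(hallucinated - target) / np.linalg.norm(target)
```

The point of the sketch is the asymmetry between the two phases: the regression target exists only at training time, while the test-time path consumes visual features alone. The hallucinated semantics could then be passed to a fusion step such as cross-attention.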


Taxonomy

Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning

Methods: Softmax · Concatenated Skip Connection