Learning Using Privileged Information for Zero-Shot Action Recognition
Zhiyi Gao, Yonghong Hou, Wanqing Li, Zihui Guo, Bin Yu

TL;DR
This paper introduces a novel approach for zero-shot action recognition that leverages object semantics as privileged information, using a hallucination network and cross-attention to improve recognition of unseen video actions.
Contribution
It proposes a method that combines object semantics with visual features through a hallucination network and a cross-attention module, narrowing the semantic gap in zero-shot action recognition (ZSAR).
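To make the cross-attention idea concrete, here is a minimal sketch of how visual tokens could attend to object-semantic tokens. It is not the authors' implementation: the function name, the projection matrices, the dimensions, the random inputs, and the residual connection are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(visual, obj_sem, w_q, w_k, w_v):
    """Augment visual tokens with object semantics (illustrative only).

    visual:  (T, d_v) visual tokens, used as queries
    obj_sem: (N, d_s) object-semantic tokens, used as keys and values
    """
    q = visual @ w_q                                   # (T, d)
    k = obj_sem @ w_k                                  # (N, d)
    v = obj_sem @ w_v                                  # (N, d_v)
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (T, N), rows sum to 1
    # Residual connection is an assumption, not taken from the paper.
    return visual + weights @ v, weights

# Hypothetical sizes and random inputs, purely for demonstration.
rng = np.random.default_rng(0)
T, N, d_v, d_s, d = 4, 3, 8, 6, 5
visual = rng.normal(size=(T, d_v))
obj_sem = rng.normal(size=(N, d_s))
w_q = rng.normal(size=(d_v, d))
w_k = rng.normal(size=(d_s, d))
w_v = rng.normal(size=(d_s, d_v))

augmented, weights = cross_attention(visual, obj_sem, w_q, w_k, w_v)
print(augmented.shape, weights.shape)
```

Each row of `weights` shows how strongly one visual token draws on each object-semantic token; the augmented output keeps the shape of the visual input so it can feed the same downstream mapping to the semantic space.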
Findings
Outperforms state-of-the-art methods on the Olympic Sports, HMDB51, and UCF101 datasets
Narrows the semantic gap between the visual and semantic spaces by using object semantics as privileged information
Demonstrates significant accuracy improvements in zero-shot action recognition
Abstract
Zero-Shot Action Recognition (ZSAR) aims to recognize video actions that have never been seen during training. Most existing methods assume a shared semantic space between seen and unseen actions and attempt to learn a direct mapping from the visual space to the semantic space. This approach is challenged by the semantic gap between the visual space and the semantic space. This paper presents a novel method that uses object semantics as privileged information to narrow the semantic gap and hence assist the learning effectively. In particular, a simple hallucination network is proposed to implicitly extract object semantics during testing without explicitly extracting objects, and a cross-attention module is developed to augment the visual features with the object semantics. Experiments on the Olympic Sports, HMDB51 and UCF101 datasets have shown that the proposed method outperforms the…
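The role of the hallucination network can be illustrated with a toy example of learning using privileged information: object semantics are available as a regression target during training, and at test time they are predicted from visual features alone. The sketch below is an assumption-heavy simplification, not the paper's model: it uses a single linear layer, a mean-squared-error loss, plain gradient descent, and synthetic data, none of which are confirmed by the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_v, d_s = 64, 10, 4  # hypothetical sample count and feature sizes

# Synthetic training set: visual features plus privileged object semantics.
visual = rng.normal(size=(n, d_v))
true_map = rng.normal(size=(d_v, d_s))
object_sem = visual @ true_map + 0.01 * rng.normal(size=(n, d_s))

def mse(pred, target):
    # Mean squared error over all elements.
    return float(np.mean((pred - target) ** 2))

# "Hallucination network": one linear layer (an assumption for brevity).
W = np.zeros((d_v, d_s))
lr = 0.5
initial_loss = mse(visual @ W, object_sem)
for _ in range(200):
    pred = visual @ W
    # Gradient of the mean squared error with respect to W.
    grad = 2.0 * visual.T @ (pred - object_sem) / pred.size
    W -= lr * grad
final_loss = mse(visual @ W, object_sem)

# Test time: no object extraction, semantics are hallucinated from video features.
test_visual = rng.normal(size=(5, d_v))
hallucinated = test_visual @ W
print(initial_loss, final_loss, hallucinated.shape)
```

The point of the design is that the privileged signal shapes training but is never required at inference: only `test_visual` and the learned weights are used to produce `hallucinated`.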
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning
Methods: Softmax · Concatenated Skip Connection
