Zero-Shot Action Recognition in Videos: A Survey
Valter Estevam, Helio Pedrini, David Menotti

TL;DR
This survey reviews methods for zero-shot action recognition in videos, highlighting techniques for feature extraction, mapping, datasets, and protocols, and discusses open challenges and future research directions.
Contribution
It provides the first comprehensive survey focused specifically on zero-shot action recognition in videos, detailing existing methods, datasets, and experimental protocols.
Findings
Identifies key techniques for visual and semantic feature extraction.
Summarizes datasets and experimental protocols used in the field.
Highlights open issues and future research directions.
Abstract
Zero-shot action recognition has attracted attention in recent years, and many approaches have been proposed for recognizing objects, events, and actions in images and videos. There is demand for methods that can classify instances from classes absent from model training, especially in the complex problem of automatic video understanding, since collecting, annotating, and labeling videos are difficult and laborious tasks. Many methods are available in the literature; however, it is difficult to determine which techniques can be considered state of the art. Although some surveys cover zero-shot action recognition in still images and experimental protocols, no work focuses on videos. Therefore, we present a survey of the methods that comprise techniques to perform visual feature extraction and semantic feature…
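The zero-shot setting described in the abstract, classifying videos from classes never seen in training, can be illustrated with a minimal sketch. It assumes a linear projection from visual features into a semantic space, followed by a nearest-neighbor search over class embeddings using cosine similarity. This is one common formulation, not necessarily the one any surveyed method uses. The function names, the projection matrix, the class names, and all numbers are hypothetical toy values.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def project(features, matrix):
    """Map a visual feature vector into the semantic space.

    `matrix` has one row per visual dimension and one column per
    semantic dimension (assumed to be learned on seen classes).
    """
    n_semantic = len(matrix[0])
    return [sum(f * row[j] for f, row in zip(features, matrix))
            for j in range(n_semantic)]

def zero_shot_predict(features, matrix, class_embeddings):
    """Return the unseen class whose embedding is closest to the projected video."""
    projected = project(features, matrix)
    return max(class_embeddings,
               key=lambda name: cosine(projected, class_embeddings[name]))

# Hypothetical toy data: 3-d visual features, 2-d semantic space.
projection = [[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]]
unseen_classes = {"surfing": [1.0, 0.0], "archery": [0.0, 1.0]}

prediction_a = zero_shot_predict([0.9, 0.1, 0.3], projection, unseen_classes)
prediction_b = zero_shot_predict([0.2, 0.8, 0.5], projection, unseen_classes)
```

In this sketch the unseen classes contribute only their semantic embeddings, never labeled videos, which is what makes the prediction zero-shot.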
