A New Split for Evaluating True Zero-Shot Action Recognition
Shreyank N Gowda, Laura Sevilla-Lara, Kiyoon Kim, Frank Keller, and Marcus Rohrbach

TL;DR
This paper introduces a new evaluation split for zero-shot action recognition that prevents class overlap with pre-training data, leading to more realistic and challenging assessments of model performance.
Contribution
It proposes a true zero-shot split avoiding class overlap with pre-training datasets and benchmarks recent methods on this split, highlighting the need for more rigorous evaluation.
Findings
Zero-shot performance drops by up to 8.9% with the new split.
The new split is significantly more challenging than random splits.
Similar issues are found in few-shot splits, with performance differences up to 17.1%.
Abstract
Zero-shot action recognition is the task of classifying action categories that are not available in the training set. In this setting, the standard evaluation protocol is to use existing action recognition datasets (e.g. UCF101) and randomly split the classes into seen and unseen. However, most recent work builds on representations pre-trained on the Kinetics dataset, where classes largely overlap with classes in the zero-shot evaluation datasets. As a result, classes that are supposed to be unseen are present during supervised pre-training, invalidating the condition of the zero-shot setting. A similar concern was noted several years ago for image-based zero-shot recognition, but has not been considered by the zero-shot action recognition community. In this paper, we propose a new split for true zero-shot action recognition with no overlap between unseen test classes and…
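The overlap problem described in the abstract can be illustrated with a minimal sketch. It assumes overlap is detected by comparing normalized class names; this is an illustrative simplification, not necessarily the criterion the authors use, and the function names and toy class lists below are hypothetical rather than the real dataset label sets.

```python
def normalize(name: str) -> str:
    """Lowercase and drop non-alphanumerics so 'PlayingGuitar' matches 'playing guitar'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def partition_by_overlap(eval_classes, pretrain_classes):
    """Split evaluation classes into those that also appear in pre-training
    and those that do not (candidates for a true zero-shot test set).

    Assumption: exact match on normalized names. Synonyms or near-duplicate
    actions would need a semantic check and are not caught here.
    """
    pretrain_keys = {normalize(c) for c in pretrain_classes}
    overlapping, unseen_candidates = [], []
    for cls in eval_classes:
        if normalize(cls) in pretrain_keys:
            overlapping.append(cls)
        else:
            unseen_candidates.append(cls)
    return overlapping, unseen_candidates


# Toy lists for illustration only (not the actual label sets).
eval_classes = ["PlayingGuitar", "YoYo", "Archery"]
pretrain_classes = ["playing guitar", "archery", "abseiling"]

overlapping, unseen_candidates = partition_by_overlap(eval_classes, pretrain_classes)
print("overlap with pre-training:", overlapping)
print("true zero-shot candidates:", unseen_candidates)
```

Under a random seen/unseen split, a class such as the first toy entry could land in the "unseen" set even though the backbone was trained on it with labels. Filtering on overlap first removes that leak, at the cost of a smaller and typically harder test set.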
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Human Pose and Action Recognition · Advanced Neural Network Applications
