How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?
Fida Mohammad Thoker, Hazel Doughty, Piyush Bagad, Cees Snoek

TL;DR
This paper investigates how sensitive video self-supervised learning models are to four factors: domain, samples, actions, and task. It finds that current benchmarks poorly predict generalization along these factors, and that self-supervised methods considerably lag behind supervised pre-training, especially when the domain shift is large and few downstream samples are available.
Contribution
The study provides a comprehensive analysis of benchmark sensitivity in video self-supervised learning and introduces the SEVERE-benchmark for better evaluation of generalization.
Findings
Current benchmarks do not reliably indicate generalization.
Self-supervised methods lag behind supervised pre-training under domain shifts.
The SEVERE-benchmark offers a more robust evaluation framework.
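One way to make the first finding concrete is to ask whether the ranking of methods on a conventional benchmark is preserved in a shifted downstream setting. The sketch below computes a Spearman rank correlation between two sets of scores. It is an illustration only, not the paper's evaluation protocol: the method names and all numbers are made-up placeholders, and the helper functions `ranks`, `pearson`, and `spearman` are hypothetical names introduced here.

```python
from typing import Dict, List


def ranks(scores: List[float]) -> List[float]:
    """Return 1-based ranks (highest score gets rank 1), averaging ties."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / (var_x * var_y) ** 0.5


def spearman(a: List[float], b: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


# Placeholder accuracies (NOT results from the paper) for four
# hypothetical methods on a conventional benchmark and a shifted setting.
conventional: Dict[str, float] = {
    "method_a": 90.0, "method_b": 85.0, "method_c": 80.0, "method_d": 75.0,
}
shifted: Dict[str, float] = {
    "method_a": 40.0, "method_b": 55.0, "method_c": 50.0, "method_d": 45.0,
}

methods = sorted(conventional)
rho = spearman(
    [conventional[m] for m in methods],
    [shifted[m] for m in methods],
)
print(f"Spearman rank correlation: {rho:.2f}")
```

A correlation near 1 would mean the conventional benchmark ranks methods the same way as the shifted setting; a value near zero or below would suggest it is a poor indicator of generalization along that factor.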
Abstract
Despite the recent success of video self-supervised learning models, there is much still to be understood about their generalization capability. In this paper, we investigate how sensitive video self-supervised learning is to the current conventional benchmark and whether methods generalize beyond the canonical evaluation setting. We do this across four different factors of sensitivity: domain, samples, actions and task. Our study, which encompasses over 500 experiments on 7 video datasets, 9 self-supervised methods and 6 video understanding tasks, reveals that current benchmarks in video self-supervised learning are not good indicators of generalization along these sensitivity factors. Further, we find that self-supervised methods considerably lag behind vanilla supervised pre-training, especially when the domain shift is large and the number of available downstream samples is low. From…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research · Human Pose and Action Recognition
