SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos
Salar Hosseini Khorasgani, Yuxuan Chen, Florian Shkurti

TL;DR
SLIC introduces an iterative clustering approach to improve self-supervised learning for human action videos, enhancing the quality of positive and negative sample selection and achieving state-of-the-art results on standard benchmarks.
Contribution
The paper proposes a novel clustering-based method, SLIC, that improves positive sampling in self-supervised video learning by leveraging pseudo-labels from iterative clustering.
Findings
SLIC outperforms state-of-the-art video retrieval baselines by 15.4% in top-1 recall on UCF101.
SLIC achieves 83.2% top-1 accuracy on UCF101 after finetuning.
SLIC is competitive with the state-of-the-art in action classification after self-supervised pretraining.
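For readers unfamiliar with the retrieval metric cited above, the following is a minimal sketch of how top-1 recall is commonly computed for video retrieval: each query clip counts as a hit if its nearest gallery clip carries the same action label. The function names, the choice of cosine similarity, and the toy data are assumptions for illustration; the exact evaluation protocol used in the paper is not given in this summary.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top1_recall(query_embs, query_labels, gallery_embs, gallery_labels):
    """Fraction of queries whose single nearest gallery item shares their label.

    Hypothetical helper: assumes non-zero embeddings and a non-empty gallery.
    """
    hits = 0
    for q, q_label in zip(query_embs, query_labels):
        # Index of the gallery embedding most similar to this query.
        nearest = max(
            range(len(gallery_embs)),
            key=lambda i: cosine_similarity(q, gallery_embs[i]),
        )
        if gallery_labels[nearest] == q_label:
            hits += 1
    return hits / len(query_embs)


# Toy data (made up): two gallery clips, three query clips.
gallery_embs = [(2.0, 0.1), (0.1, 3.0)]
gallery_labels = ["a", "b"]
query_embs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
query_labels = ["a", "b", "b"]

recall = top1_recall(query_embs, query_labels, gallery_embs, gallery_labels)
```

In this toy setup the third query is slightly closer to the first gallery clip, so it is retrieved with the wrong label and lowers the score.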
Abstract
Self-supervised methods have significantly closed the gap with end-to-end supervised learning for image classification. In the case of human action videos, however, where both appearance and motion are significant factors of variation, this gap remains significant. One of the key reasons for this is that sampling pairs of similar video clips, a required step for many self-supervised contrastive learning methods, is currently done conservatively to avoid false positives. A typical assumption is that similar clips only occur temporally close within a single video, leading to insufficient examples of motion similarity. To mitigate this, we propose SLIC, a clustering-based self-supervised contrastive learning method for human action videos. Our key contribution is that we improve upon the traditional intra-video positive sampling by using iterative clustering to group similar video…
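The idea of drawing positives from clusters rather than only from temporally adjacent clips of one video can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: plain k-means stands in for whatever clustering algorithm the paper uses, the embeddings are toy 2-D points, and all function names are hypothetical. It shows a single clustering round; in an iterative scheme, the embeddings would be recomputed with the updated encoder and clustered again.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pseudo_labels(embeddings, k, iters=10, seed=0):
    """Plain k-means (a stand-in clusterer); returns one cluster id per clip."""
    rng = random.Random(seed)
    centroids = [list(e) for e in rng.sample(embeddings, k)]
    labels = [0] * len(embeddings)
    for _ in range(iters):
        # Assignment step: each clip takes the id of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(e, centroids[c]))
            for e in embeddings
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [e for e, lab in zip(embeddings, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def sample_positive(anchor_idx, labels, rng):
    """Pick another clip with the anchor's pseudo-label (may be from a different video).

    Falls back to the anchor itself if its cluster has no other member.
    """
    candidates = [
        i for i, lab in enumerate(labels)
        if lab == labels[anchor_idx] and i != anchor_idx
    ]
    return rng.choice(candidates) if candidates else anchor_idx


def sample_negatives(anchor_idx, labels, rng, n):
    """Pick up to n clips whose pseudo-label differs from the anchor's."""
    candidates = [i for i, lab in enumerate(labels) if lab != labels[anchor_idx]]
    return rng.sample(candidates, min(n, len(candidates)))
```

A training loop built on this sketch would call `kmeans_pseudo_labels` on the current clip embeddings, use `sample_positive` and `sample_negatives` to form contrastive pairs or triplets, update the encoder, and then repeat the clustering on the refreshed embeddings.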
Taxonomy
Topics: Human Pose and Action Recognition · Cancer-related molecular mechanisms research · Multimodal Machine Learning Applications
Methods: Contrastive Learning
