"Knights": First Place Submission for VIPriors21 Action Recognition Challenge at ICCV 2021
Ishan Dave, Naman Biyani, Brandon Clark, Rohit Gupta, Yogesh Rawat, and Mubarak Shah

TL;DR
The paper introduces 'Knights', an approach that combines self-supervised pretraining, video transformers, and optical flow to take first place in the small-dataset action recognition challenge at ICCV 2021.
Contribution
It presents a new method that effectively leverages self-supervised learning and multimodal inputs for data-efficient action recognition without external data.
Findings
Achieved 73% accuracy on the Kinetics400ViPriors test set.
Outperformed all other entries in the ICCV 2021 challenge.
Demonstrated effectiveness of self-supervised pretraining with video transformers.
Abstract
This technical report presents our approach, "Knights", to the action recognition task on Kinetics400ViPriors, a small subset of Kinetics-400, without using any extra data. Our approach has three main components: state-of-the-art Temporal Contrastive self-supervised pretraining, video transformer models, and the optical flow modality. Combined with standard test-time augmentation, our proposed solution achieves 73% on the Kinetics400ViPriors test set, the best among all entries in the Visual Inductive Priors for Data-Efficient Computer Vision Action Recognition Challenge at ICCV 2021.
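The abstract names test-time augmentation and a second (optical flow) modality but does not say how the predictions are combined. The sketch below shows one common way this is done, not necessarily the authors' pipeline: class probabilities are averaged over augmented views of each stream, then the RGB and flow streams are mixed with a weighted average. The function names, the `flow_weight` parameter, and the toy logits are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_views(view_logits):
    """Test-time augmentation: average class probabilities over
    several augmented views (e.g. crops or flips) of the same video."""
    probs = [softmax(v) for v in view_logits]
    n_views = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_views for c in range(n_classes)]


def fuse_modalities(rgb_probs, flow_probs, flow_weight=0.5):
    """Late fusion: weighted average of per-stream probabilities.
    flow_weight is an assumed hyperparameter, not a value from the report."""
    return [
        (1.0 - flow_weight) * r + flow_weight * f
        for r, f in zip(rgb_probs, flow_probs)
    ]


def predict(rgb_view_logits, flow_view_logits, flow_weight=0.5):
    """Return (predicted class index, fused probabilities)."""
    fused = fuse_modalities(
        average_views(rgb_view_logits),
        average_views(flow_view_logits),
        flow_weight,
    )
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Toy example: 2 augmented views per stream, 3 classes.
rgb_logits = [[2.0, 1.0, 0.0], [1.5, 1.0, 0.5]]
flow_logits = [[0.0, 3.0, 0.0], [0.0, 2.5, 0.5]]
label, fused = predict(rgb_logits, flow_logits)
```

In this toy input the RGB stream alone favors class 0, while the flow stream is confident in class 1, so the fused prediction depends on `flow_weight`. In practice such a weight would be tuned on a validation split.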
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
Methods: Test
