"Knights": First Place Submission for VIPriors21 Action Recognition Challenge at ICCV 2021

Ishan Dave; Naman Biyani; Brandon Clark; Rohit Gupta; Yogesh Rawat; Mubarak Shah

arXiv:2110.07758 · cs.CV · October 18, 2021 · 1 cites

TL;DR

The paper introduces "Knights", an approach that combines temporal contrastive self-supervised pretraining, video transformers, and optical flow, and that placed first in the VIPriors action recognition challenge at ICCV 2021.

Contribution

It presents a new method that effectively leverages self-supervised learning and multimodal inputs for data-efficient action recognition without external data.

Findings

01. Achieved 73% accuracy on the Kinetics400ViPriors test set.

02. Outperformed all other entries in the ICCV 2021 challenge.

03. Demonstrated effectiveness of self-supervised pretraining with video transformers.

Abstract

This technical report presents our approach, "Knights", to the action recognition task on a small subset of Kinetics-400, i.e., Kinetics400ViPriors, without using any extra data. Our approach has three main components: state-of-the-art Temporal Contrastive self-supervised pretraining, video transformer models, and the optical flow modality. Along with standard test-time augmentation, our proposed solution achieves 73% on the Kinetics400ViPriors test set, which is the best among all entries in the Visual Inductive Priors for Data-Efficient Computer Vision Action Recognition Challenge, ICCV 2021.
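The abstract names test-time augmentation and an optical flow modality alongside RGB but does not say how predictions are combined. The sketch below is one common way to do it, not the authors' implementation: class probabilities are averaged over an original and a horizontally flipped view, then the RGB and flow streams are merged by weighted late fusion. All names here (`tta_scores`, `fuse`, `predict`, `flow_weight`) and the toy clip layout are hypothetical.

```python
import math
from typing import Callable, List, Sequence

# Toy clip layout (an assumption for illustration): a list of frames,
# each frame a list of values along the width axis.
Clip = List[List[float]]
Model = Callable[[Clip], Sequence[float]]  # maps a clip to per-class logits


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hflip(clip: Clip) -> Clip:
    """Mirror every frame along the width axis.

    Simplification: for real optical flow, the horizontal flow component
    must also be negated when flipping; that step is omitted here.
    """
    return [list(reversed(frame)) for frame in clip]


def tta_scores(model: Model, clip: Clip) -> List[float]:
    """Average class probabilities over the original and flipped views."""
    views = [clip, hflip(clip)]
    probs = [softmax(model(view)) for view in views]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def fuse(rgb: Sequence[float], flow: Sequence[float],
         flow_weight: float = 0.5) -> List[float]:
    """Weighted late fusion of per-class scores from the two streams."""
    return [(1 - flow_weight) * r + flow_weight * f for r, f in zip(rgb, flow)]


def predict(rgb_model: Model, flow_model: Model,
            rgb_clip: Clip, flow_clip: Clip,
            flow_weight: float = 0.5) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse(tta_scores(rgb_model, rgb_clip),
                 tta_scores(flow_model, flow_clip),
                 flow_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

In practice `flow_weight` would be tuned on a validation split, and the set of views could include multiple temporal clips and spatial crops rather than a single flip.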

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications

Methods: Test