A Short Note about Kinetics-600
Joao Carreira, Eric Noland, Andras Banki-Horvath, Chloe Hillier, and Andrew Zisserman

TL;DR
This paper details the extension of the Kinetics dataset from 400 to 600 classes (Kinetics-600), increasing data volume and diversity through multilingual queries, and provides baseline results using the I3D architecture.
Contribution
It introduces a larger, more diverse version of the Kinetics dataset with new data collection methods and baseline benchmarks.
Findings
Expanded dataset to 600 classes with at least 600 clips each
Implemented multilingual data collection process
Provided baseline results with I3D neural network
Abstract
We describe an extension of the DeepMind Kinetics human action dataset from 400 classes, each with at least 400 video clips, to 600 classes, each with at least 600 video clips. To scale up the dataset, we changed the data collection process so that it uses multiple queries per class, some of them in a language other than English (Portuguese). This paper details the changes between the two versions of the dataset and includes a comprehensive set of statistics for the new version, as well as baseline results using the I3D neural network architecture. The paper is a companion to the release of the ground truth labels for the public test set.
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Analysis and Summarization
