UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
Khurram Soomro, Amir Roshan Zamir, Mubarak Shah

TL;DR
UCF101 is a large, challenging dataset of 101 human action classes drawn from realistic videos. It is intended to advance action recognition research and comes with baseline recognition results.
Contribution
This paper introduces UCF101, which the authors describe as the largest and most challenging dataset of human actions from unconstrained videos at the time of publication, and reports baseline recognition results on it.
Findings
Baseline accuracy of 44.5% using a standard bag-of-words approach
Contains over 13,000 clips and 27 hours of video data
Includes realistic, user-uploaded videos with cluttered backgrounds
Abstract
We introduce UCF101 which is currently the largest dataset of human actions. It consists of 101 action classes, over 13k clips and 27 hours of video data. The database consists of realistic user uploaded videos containing camera motion and cluttered background. Additionally, we provide baseline action recognition results on this new dataset using standard bag of words approach with overall performance of 44.5%. To the best of our knowledge, UCF101 is currently the most challenging dataset of actions due to its large number of classes, large number of clips and also unconstrained nature of such clips.
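The abstract names a "standard bag of words approach" but does not say here which descriptors, codebook size, or classifier were used. The following is a minimal sketch of the generic encoding step only, assuming each clip is already represented by a set of local feature vectors and that a codebook is available. The names `quantize` and `bow_histogram`, the toy 2-D codebook, and the sample descriptors are all hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from collections import Counter
from math import dist


def quantize(descriptor, codebook):
    """Return the index of the codeword nearest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Encode one clip as an L1-normalized histogram of codeword assignments."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    total = len(descriptors)
    # An empty clip yields an all-zero histogram instead of dividing by zero.
    return [counts[i] / total if total else 0.0 for i in range(len(codebook))]


# Hypothetical toy codebook; in practice a codebook is usually learned,
# for example by clustering descriptors from training clips.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Hypothetical local descriptors extracted from a single clip.
clip_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]

hist = bow_histogram(clip_descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

In a full pipeline, one such fixed-length histogram per clip would be passed to a classifier trained on the 101 action labels; that stage is omitted because the abstract does not describe it.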
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
