Rescaling Egocentric Vision
Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Antonino Furnari, Evangelos Kazakos, Jian Ma, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray

TL;DR
This paper presents EPIC-KITCHENS-100, a significantly expanded egocentric video dataset with dense annotations, enabling new research challenges like action detection, recognition, and generalisation over time.
Contribution
The paper introduces a novel pipeline for annotating egocentric videos, resulting in a larger, more detailed dataset that supports multiple new research challenges.
Findings
EPIC-KITCHENS-100 contains 100 hours of video with 20 million frames.
Annotations are 54% denser (more actions per minute) and contain 128% more action segments than the previous version.
The dataset is aligned with six challenges, including action detection and cross-modal retrieval; action detection and the "test of time" evaluation are newly enabled by this collection.
Abstract
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments, using head-mounted cameras. Compared to its previous version, EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotations of fine-grained actions (+128% more action segments). This collection enables new challenges such as action detection and evaluating the "test of time" - i.e. whether models trained on data collected in 2018 can generalise to new footage collected two years later. The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal…
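To put the headline figures quoted in the abstract in perspective, the short sketch below derives a few average rates from them. The inputs are the rounded numbers from the abstract (100 hours, 20M frames, 90K actions, 700 videos), so the results are approximations over the whole collection, not statistics reported by the authors. The function names are hypothetical and chosen only for illustration.

```python
# Rounded headline figures as quoted in the abstract (assumed exact here
# purely for illustration; the true counts will differ slightly).
HOURS = 100
FRAMES = 20_000_000
ACTIONS = 90_000
VIDEOS = 700


def summary_rates(hours, frames, actions, videos):
    """Derive collection-wide averages from the headline totals."""
    minutes = hours * 60
    seconds = minutes * 60
    return {
        "actions_per_minute": actions / minutes,
        "frames_per_second": frames / seconds,
        "minutes_per_video": minutes / videos,
    }


def percent_increase_to_ratio(percent):
    """Convert an 'X% more' statement into a multiplicative ratio."""
    return 1 + percent / 100


rates = summary_rates(HOURS, FRAMES, ACTIONS, VIDEOS)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")

# "54% more actions per minute" means 1.54x the earlier density;
# "+128% action segments" means 2.28x the earlier segment count.
print(f"density ratio: {percent_increase_to_ratio(54):.2f}")
print(f"segment ratio: {percent_increase_to_ratio(128):.2f}")
```

Under these rounded inputs the collection averages 15 annotated actions per minute of footage. The frames-per-second value is an average over all footage and says nothing about the frame rate of any individual video.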
Taxonomy
Topics: Video Surveillance and Tracking Methods
