Streaming egocentric action anticipation: An evaluation scheme and approach
Antonino Furnari, Giovanni Maria Farinella

TL;DR
This paper introduces a streaming evaluation scheme for egocentric action anticipation that accounts for model runtime, which yields more realistic performance assessments, and proposes a lightweight model optimized with knowledge distillation.
Contribution
The paper presents a streaming evaluation framework for egocentric action anticipation and a lightweight 3D CNN model trained with knowledge distillation.
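For readers unfamiliar with knowledge distillation, the sketch below shows the generic soft-target formulation, in which a small student model is trained to match the temperature-softened output distribution of a larger teacher. It is not necessarily the loss used in the paper, whose exact distillation objective is not given in this summary. The function names, the temperature value, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic soft-target distillation; an illustrative assumption,
    not the paper's confirmed objective.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * math.log(t / s)
        for t, s in zip(p_teacher, p_student)
        if t > 0.0
    )
    return temperature ** 2 * kl


# Hypothetical logits over three action classes.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, -0.5]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In practice this term is usually combined with a standard supervised loss on the ground-truth labels.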
Findings
Streaming evaluation alters model rankings compared to traditional methods.
Lightweight models outperform more complex ones in streaming scenarios.
The proposed model outperforms state-of-the-art approaches in streaming egocentric action anticipation.
Abstract
Egocentric action anticipation aims to predict the future actions the camera wearer will perform from the observation of the past. While predictions about the future should be available before the predicted events take place, most approaches do not pay attention to the computational time required to make such predictions. As a result, current evaluation schemes assume that predictions are available right after the input video is observed, i.e., presuming a negligible runtime, which may lead to overly optimistic evaluations. We propose a streaming egocentric action evaluation scheme which assumes that predictions are performed online and made available only after the model has processed the current input segment, which depends on its runtime. To evaluate all models considering the same prediction horizon, we hence propose that slower models should base their predictions on temporal…
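The timing constraint described in the abstract can be illustrated with a small sketch. It assumes a simplified reading of the scheme: for a prediction to be available a fixed anticipation time before the action starts, a model must stop observing the video at least one runtime earlier than that, so slower models see an older segment. The function name, the runtimes, and the timestamps are hypothetical and are not taken from the paper.

```python
def observed_segment_end(action_start, anticipation_time, runtime=0.0):
    """Latest timestamp (seconds) a model can observe so that its
    prediction is ready `anticipation_time` seconds before the action.

    With runtime=0.0 this reduces to the traditional evaluation,
    which presumes predictions are available instantly.
    """
    if anticipation_time < 0 or runtime < 0:
        raise ValueError("anticipation_time and runtime must be non-negative")
    return action_start - anticipation_time - runtime


# Hypothetical per-prediction runtimes in seconds.
model_runtimes = {"lightweight": 0.125, "heavy": 0.5}

action_start = 10.0       # assumed start time of the future action
anticipation_time = 1.0   # assumed prediction horizon shared by all models

segment_ends = {
    name: observed_segment_end(action_start, anticipation_time, runtime)
    for name, runtime in model_runtimes.items()
}
```

Under these assumptions the heavier model must base its prediction on video that ends earlier than the lightweight model's input, which is one way a streaming evaluation can change model rankings relative to an evaluation that ignores runtime.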
Taxonomy
Methods: Knowledge Distillation · Balanced Selection
