Intention-Conditioned Long-Term Human Egocentric Action Forecasting
Esteve Valls Mascaro, Hyemin Ahn, Dongheui Lee

TL;DR
This paper introduces a hierarchical, intention-conditioned model for long-term egocentric action forecasting, significantly improving the plausibility and consistency of predicted action sequences in first-person videos.
Contribution
It proposes a novel hierarchical architecture with intention conditioning via a variational auto-encoder, advancing long-term action anticipation in egocentric videos.
Findings
Ranked first in the EGO4D Challenge for long-term action prediction
Generated more plausible and time-consistent action sequences
Outperformed baseline methods in anticipation accuracy
Abstract
To anticipate how a human would act in the future, it is essential to understand the human's intention, since it guides the human towards a certain goal. In this paper, we propose a hierarchical architecture which assumes that a sequence of human actions (low-level) can be driven by the human intention (high-level). Based on this, we address the Long-Term Action Anticipation task in egocentric videos. Our framework first extracts two levels of human information from the N observed videos of human actions through a Hierarchical Multi-task MLP Mixer (H3M). Then, we condition the uncertainty of the future through an Intention-Conditioned Variational Auto-Encoder (I-CVAE) that generates K stable predictions of the next Z=20 actions that the observed human might perform. By leveraging human intention as high-level information, we claim that our model is able to anticipate more time-consistent actions in…
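The prediction step in the abstract (K candidate sequences of Z=20 future actions, conditioned on an intention) can be pictured with a toy sketch. This is only an illustration of the data flow, not the authors' I-CVAE: the function `sample_futures`, the parameters `latent_dim` and `seed`, and the arithmetic "decoder" are all hypothetical placeholders, and a real model would use a learned neural decoder instead.

```python
import random

Z = 20  # number of future actions to predict (value taken from the abstract)


def sample_futures(intention_id, num_actions, k, latent_dim=8, seed=0):
    """Toy stand-in for an intention-conditioned generative decoder.

    Draws k latent vectors from a standard normal prior and maps each one,
    together with the intention id, to a sequence of Z action ids.
    The mapping below is a hypothetical placeholder, not a trained model.
    """
    rng = random.Random(seed)
    futures = []
    for _ in range(k):
        # One latent sample per candidate future (the source of diversity).
        latent = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        sequence = []
        for t in range(Z):
            # Placeholder "decoder": mixes intention, time step and latent value.
            score = intention_id * 31 + t * 7 + int(abs(latent[t % latent_dim]) * 10)
            sequence.append(score % num_actions)
        futures.append(sequence)
    return futures


if __name__ == "__main__":
    candidates = sample_futures(intention_id=3, num_actions=100, k=5)
    print(len(candidates), "candidate futures of length", len(candidates[0]))
```

The point of the sketch is the shape of the output: one shared intention, K latent samples, and K sequences of exactly Z action labels, so that every candidate future is conditioned on the same high-level goal.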
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
