Unified Recurrence Modeling for Video Action Anticipation
Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz

TL;DR
This paper introduces a unified recurrence modeling approach using message passing and self-attention for video action anticipation, significantly improving prediction accuracy on large-scale datasets.
Contribution
It proposes a novel message passing framework with learnable edge strategies and self-attention, enhancing temporal inference in action anticipation models.
Findings
Outperforms previous methods on the EPIC-Kitchens dataset
Leverages self-attention for message passing in video modeling
Provides flexible, end-to-end trainable connectivity strategies
Abstract
Forecasting future events based on evidence of current conditions is an innate skill of human beings, and key for predicting the outcome of any decision making. In artificial vision, for example, we would like to predict the next human action before it happens, without observing the future video frames associated with it. Computer vision models for action anticipation are expected to collect the subtle evidence in the preamble of the target actions. In prior studies, recurrence modeling often leads to better performance, and strong temporal inference is assumed to be a key element for reasonable prediction. To this end, we propose a unified recurrence modeling for video action anticipation via a message passing framework. The information flow in space-time can be described by the interaction between vertices and edges, and the changes of vertices for each incoming frame reflect the…
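Illustrative sketch
The idea of recurrent message passing with attention-estimated edges can be made concrete with a small example. The code below is a minimal sketch under stated assumptions, not the authors' implementation: the additive fusion of state and frame features, the tanh update, the random weights, and all function and parameter names (attention_edges, message_passing_step, w_q, w_k, w_v, w_h) are hypothetical choices made for illustration. It uses only the Python standard library.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(s - peak) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_edges(vertices, w_q, w_k):
    """Soft adjacency matrix from scaled dot-product self-attention (assumed edge strategy)."""
    q = matmul(vertices, w_q)
    k = matmul(vertices, w_k)
    scale = math.sqrt(len(w_q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    return [softmax([s / scale for s in row]) for row in scores]


def message_passing_step(state, frame_feats, params):
    """One recurrent update: fuse the new frame, estimate edges, pass messages."""
    # Assumption: vertices are the previous state plus the incoming frame's features.
    vertices = [[s + f for s, f in zip(s_row, f_row)]
                for s_row, f_row in zip(state, frame_feats)]
    edges = attention_edges(vertices, params["w_q"], params["w_k"])
    # Each vertex aggregates value-projected vertices, weighted by its edge row.
    messages = matmul(edges, matmul(vertices, params["w_v"]))
    carried = matmul(state, params["w_h"])
    new_state = [[math.tanh(m + c) for m, c in zip(m_row, c_row)]
                 for m_row, c_row in zip(messages, carried)]
    return new_state, edges


def init_params(dim, rng):
    """Random square projections standing in for learned weights."""
    scale = 1.0 / math.sqrt(dim)
    return {name: [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]
            for name in ("w_q", "w_k", "w_v", "w_h")}


def run_sequence(frames, params):
    """Fold the recurrent step over a list of per-frame vertex features."""
    state = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    edges = None
    for frame_feats in frames:
        state, edges = message_passing_step(state, frame_feats, params)
    return state, edges


if __name__ == "__main__":
    rng = random.Random(0)
    params = init_params(8, rng)
    # 5 frames, 4 vertices per frame, 8 features per vertex (all made-up sizes).
    frames = [[[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
              for _ in range(5)]
    state, edges = run_sequence(frames, params)
    print(len(state), len(state[0]), len(edges), len(edges[0]))
```

Each row of the returned edge matrix sums to one, so it can be read as how strongly a vertex attends to every other vertex at the latest frame. In a trainable model the projection matrices would be learned end to end, and a prediction head would read the final state to anticipate the next action.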
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Data Visualization and Analytics
