Inductive Attention for Video Action Anticipation

Tsung-Ming Tai; Giuseppe Fiameni; Cheng-Kuang Lee; Simon See; Oswald Lanz

arXiv:2212.08830 · cs.CV · March 21, 2023 · 1 citation

Open Access

TL;DR

This paper introduces IAM, an inductive attention model that uses current prediction priors as queries to better anticipate future actions in videos, outperforming existing models on large-scale datasets.

Contribution

The paper proposes a novel inductive attention mechanism that leverages prediction priors as queries, improving future action inference in video understanding tasks.
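The idea of using a prediction prior as the attention query can be illustrated with a minimal sketch. This is not the paper's architecture, which the summary on this page does not specify. It assumes a single dot-product attention step over per-frame feature vectors, a uniform initial prior, and random untrained weights. The names `prior_query_attention`, `anticipate`, `w_query`, and `w_out` are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def prior_query_attention(frame_feats, prior, w_query, w_out):
    """One attention step whose query is derived from the current prediction prior.

    frame_feats: T feature vectors of size D (keys and values; an assumption).
    prior:       current class probabilities, size C.
    w_query:     D x C projection from prior to query space (hypothetical).
    w_out:       C x D projection from attended context to class logits (hypothetical).
    """
    query = matvec(w_query, prior)
    d = len(query)
    # Scaled dot-product score between the prior-derived query and each frame.
    scores = [sum(q * k for q, k in zip(query, feat)) / math.sqrt(d)
              for feat in frame_feats]
    weights = softmax(scores)
    # Weighted sum of frame features.
    context = [sum(w * feat[i] for w, feat in zip(weights, frame_feats))
               for i in range(d)]
    return softmax(matvec(w_out, context)), weights


def anticipate(frame_feats, w_query, w_out, num_classes):
    """Feed each step's prediction back in as the next step's query prior."""
    prior = [1.0 / num_classes] * num_classes  # uniform starting prior (assumption)
    weights = []
    for t in range(1, len(frame_feats) + 1):
        prior, weights = prior_query_attention(frame_feats[:t], prior, w_query, w_out)
    return prior, weights


# Toy data: 5 frames, 4-dim features, 3 action classes, random untrained weights.
rng = random.Random(0)
T, D, C = 5, 4, 3
frames = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(T)]
w_query = [[rng.gauss(0, 1) for _ in range(C)] for _ in range(D)]
w_out = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(C)]
probs, weights = anticipate(frames, w_query, w_out, C)
```

With untrained weights the output probabilities are meaningless. The sketch only shows the data flow: each step's prediction becomes the query that selects evidence from the observed frames for the next prediction.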

Findings

01. Outperforms state-of-the-art models on egocentric video datasets
02. Uses fewer parameters than existing methods
03. Effectively models uncertainty in future action prediction

Abstract

Anticipating future actions based on spatiotemporal observations is essential in video understanding and predictive computer vision. Moreover, a model capable of anticipating the future has important applications: it can help precautionary systems react before an event occurs. However, unlike in the action recognition task, future information is inaccessible at observation time -- a model cannot directly map the video frames to the target action to solve the anticipation task. Instead, temporal inference is required to associate the relevant evidence with possible future actions. Consequently, existing solutions based on action recognition models are suboptimal. Recently, researchers proposed extending the observation window to capture longer pre-action profiles from past moments and leveraging attention to retrieve the subtle evidence to improve the anticipation…

Taxonomy

Topics: Human Pose and Action Recognition · Visual Attention and Saliency Detection · Advanced Neural Network Applications