Action Forecasting with Feature-wise Self-Attention

Yan Bin Ng; Basura Fernando

arXiv:2107.08579·cs.CV·July 20, 2021

Open Access

TL;DR

This paper introduces a human action forecasting architecture that combines a temporal recurrent encoder, feature-wise self-attention, and temporal masking, and reports state-of-the-art results on two standard action forecasting benchmarks.

Contribution

The paper proposes a new architecture integrating self-attention and temporal masking for improved action forecasting from videos.

Findings

01

Self-attention effectively identifies relevant feature dimensions.

02

Temporal masking improves handling of temporal variations.

03

The method achieves state-of-the-art results on two standard action forecasting benchmarks.

Abstract

We present a new architecture for human action forecasting from videos. A temporal recurrent encoder captures temporal information of input videos while a self-attention model is used to attend to relevant feature dimensions of the input space. To handle temporal variations in observed video data, a feature masking technique is employed. We classify observed actions accurately using an auxiliary classifier, which helps to understand what has happened so far. Then the decoder generates actions for the future based on the output of the recurrent encoder and the self-attention model. Experimentally, we validate each component of our architecture and observe the impact of self-attention for identifying relevant feature dimensions, of temporal masking, and of the auxiliary classifier for observed actions. We evaluate our method on two standard action forecasting benchmarks and obtain state-of-the-art results.
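
The abstract does not give the exact formulation, so the following is only a minimal sketch of what attending over feature dimensions (rather than over time steps) and masking observed frames could look like. The function names, the `keep` flags, and the use of a mean activation as the attention score are all assumptions made for illustration; the paper's model presumably learns its scores and is not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_mask(features, keep):
    """Zero out every frame whose keep flag is False (hypothetical masking rule)."""
    return [[x if k else 0.0 for x in frame] for frame, k in zip(features, keep)]


def feature_wise_attention(features):
    """Reweight feature dimensions with one attention weight per dimension.

    Assumption: the score of a dimension is its mean activation over time.
    A trained model would use learned projections instead.
    """
    n_frames = len(features)
    n_dims = len(features[0])
    scores = [sum(frame[d] for frame in features) / n_frames for d in range(n_dims)]
    weights = softmax(scores)
    attended = [[w * x for w, x in zip(weights, frame)] for frame in features]
    return attended, weights


# Two observed frames with three feature dimensions each; the second frame is masked.
frames = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
masked = temporal_mask(frames, keep=[True, False])
attended, weights = feature_wise_attention(masked)
```

Here `weights` holds one value per feature dimension and sums to one, so the attention acts across the feature axis while the time axis is only affected by the mask.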

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods

Methods: Auxiliary Classifier