Attentional Pooling for Action Recognition

Rohit Girdhar; Deva Ramanan

arXiv:1711.01467 · cs.CV · January 3, 2018 · 208 citations

Open Access · 1 Repo

TL;DR

This paper presents a simple attention-based model that significantly improves action recognition accuracy across multiple benchmarks, offering a new perspective by framing action recognition as a fine-grained recognition task.

Contribution

The paper introduces an attention module for action recognition that can be trained with or without extra supervision. It sets a new state of the art on the MPII dataset (12.5% relative improvement) and derives bottom-up and top-down attention as low-rank approximations of bilinear pooling.

Findings

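01 Significant accuracy improvements over the base architecture on three standard action recognition benchmarks, spanning still images and videos.

02 New state of the art on the MPII dataset, with a 12.5% relative improvement.

03 Analytical derivation of bottom-up and top-down attention as low-rank approximations of bilinear pooling.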
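
The low-rank view of attention can be written out in a few lines. The block below is a sketch rather than a transcription of the paper's equations: the symbols X, W_k, a_k, and b are notation assumed here for illustration, with X holding one feature vector per spatial location.

```latex
% Assumed notation (illustrative):
%   X   \in \mathbb{R}^{n \times f} : features at n spatial locations, f channels
%   W_k \in \mathbb{R}^{f \times f} : bilinear classifier weights for class k
\begin{align}
\mathrm{score}_k(X) &= \operatorname{Tr}\!\left(X^\top X \, W_k^\top\right)
  && \text{(second-order pooling)} \\
W_k &\approx a_k b^\top, \quad a_k, b \in \mathbb{R}^{f}
  && \text{(rank-1 approximation)} \\
\mathrm{score}_k(X) &\approx \operatorname{Tr}\!\left(X^\top X \, b \, a_k^\top\right)
  = a_k^\top X^\top X \, b
  = (X a_k)^\top (X b)
\end{align}
```

Under this reading, X b is a class-agnostic (bottom-up) saliency map over the n locations, X a_k is a class-specific (top-down) map, and the class score is their inner product. The per-class weights shrink from f × f to a single f-vector a_k plus the shared vector b.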

Abstract

We introduce a simple yet surprisingly powerful model to incorporate attention in action recognition and human-object interaction tasks. Our proposed attention module can be trained with or without extra supervision, and gives a sizable boost in accuracy while keeping the network size and computational cost nearly the same. It leads to significant improvements over a state-of-the-art base architecture on three standard action recognition benchmarks across still images and videos, and establishes a new state of the art on the MPII dataset with a 12.5% relative improvement. We also perform an extensive analysis of our attention module both empirically and analytically. In terms of the latter, we introduce a novel derivation of bottom-up and top-down attention as low-rank approximations of bilinear pooling methods (typically used for fine-grained classification). From this perspective, our attention…
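
As a rough numerical illustration of the low-rank bilinear pooling idea in the abstract, the NumPy sketch below pools a feature map with one class-agnostic (bottom-up) weight vector and one class-specific (top-down) weight vector per class, then checks that the result matches full bilinear pooling with rank-1 weights. It is not the authors' implementation: the function names, shapes, and random inputs are all hypothetical, chosen only to make the identity concrete.

```python
import numpy as np


def attentional_pool(X, A, b):
    """Rank-1 attentional pooling (illustrative sketch).

    X: (n, f) features at n spatial locations with f channels.
    A: (f, K) hypothetical class-specific (top-down) weights, one column per class.
    b: (f,)   hypothetical class-agnostic (bottom-up) weights shared by all classes.
    Returns a (K,) vector of class scores.
    """
    top_down = X @ A        # (n, K): one spatial map per class
    bottom_up = X @ b       # (n,): a single saliency map
    # Each score is the inner product of a class map with the saliency map.
    return top_down.T @ bottom_up


def rank1_bilinear_scores(X, A, b):
    """Same scores computed through the explicit f x f bilinear form."""
    G = X.T @ X             # (f, f) second-order statistics of the features
    scores = []
    for k in range(A.shape[1]):
        W_k = np.outer(A[:, k], b)      # rank-1 weight matrix a_k b^T
        scores.append(np.sum(G * W_k))  # equals Tr(G W_k^T)
    return np.array(scores)


# Hypothetical sizes: a 7x7 feature map (n = 49), 16 channels, 5 classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((49, 16))
A = rng.standard_normal((16, 5))
b = rng.standard_normal(16)

scores = attentional_pool(X, A, b)
reference = rank1_bilinear_scores(X, A, b)
print(np.allclose(scores, reference))
```

The attentional form never materializes the f × f matrix, which is consistent with the abstract's remark that network size and computational cost stay nearly the same.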

Code & Models

Repositories

rohitgirdhar/AttentionalPoolingAction (TensorFlow)

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications