Action Recognition in Untrimmed Videos with Composite Self-Attention Two-Stream Framework

Dong Cao; Lisha Xu; HaiBo Chen

arXiv:1908.04353·cs.CV·April 24, 2020

TL;DR

This paper introduces a composite self-attention two-stream framework with graph networks for improved zero-shot action recognition in untrimmed videos, emphasizing key frame weighting and multi-aspect attention.

Contribution

The paper proposes a novel composite two-stream framework with 3-channel self-attention and graph networks for zero-shot action recognition in untrimmed videos, enhancing feature extraction and key frame focus.

Findings

01

Effective in zero-shot action recognition

02

Improves focus on key frames in untrimmed videos

03

Validated on relevant datasets with positive results

Abstract

With the rapid development of deep learning algorithms, action recognition in video has achieved many important research results. One problem in action recognition, Zero-Shot Action Recognition (ZSAR), which classifies new categories without any positive examples, has recently attracted considerable attention. Another difficulty in action recognition is that untrimmed data may seriously degrade model performance. We propose a composite two-stream framework with a pre-trained model. The framework includes a classifier branch and a composite feature branch. A graph network model is adopted in each of the two branches, which effectively improves the feature extraction and reasoning ability of the framework. In the composite feature branch, 3-channel self-attention models are constructed to weight each frame in the video and give more attention to the key frames. Each self-attention…
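The frame-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated before it describes the channels, so the scoring function (a scaled dot product against a query vector), the names `attention_pool` and `composite_pool`, and the choice to concatenate the three channel outputs are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, query):
    """Weight each frame and return (weights, pooled feature).

    frame_features: list of T per-frame vectors, each of length D.
    query: hypothetical learned vector of length D (an assumption,
           standing in for whatever each attention channel learns).
    """
    dim = len(query)
    scale = math.sqrt(dim)
    # One relevance score per frame: scaled dot product with the query.
    scores = [
        sum(f * q for f, q in zip(frame, query)) / scale
        for frame in frame_features
    ]
    weights = softmax(scores)
    # Weighted sum over frames, so high-weight (key) frames dominate.
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, pooled


def composite_pool(frame_features, queries):
    """Run one attention channel per query and concatenate the results.

    Concatenation is an assumed way of combining channels.
    """
    combined = []
    for query in queries:
        _, pooled = attention_pool(frame_features, query)
        combined.extend(pooled)
    return combined
```

With three query vectors, `composite_pool` returns a vector of length 3·D, and each channel's weights sum to 1 across the frames. In an untrimmed video, frames that score low against a channel's query contribute little to that channel's pooled feature.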
