Chained Multi-stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection

Mohammadreza Zolfaghari, Gabriel L. Oliveira, Nima Sedaghat, and Thomas Brox

arXiv:1704.00616 · cs.CV · May 30, 2017 · 1 cites

TL;DR

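This paper introduces a multi-stream network architecture that combines pose, motion, and appearance cues through a Markov chain model that adds the cues successively. It achieves state-of-the-art action classification on HMDB51, J-HMDB, and NTU RGB+D, and state-of-the-art spatio-temporal action localization on UCF101 and J-HMDB.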

Contribution

Integrating pose, motion, and appearance cues successively via a Markov chain model improves action classification and localization performance over the respective baselines.

Findings

01. Achieves state-of-the-art accuracy on HMDB51, J-HMDB, and NTU RGB+D datasets.

02. Yields top results in spatio-temporal localization on UCF101 and J-HMDB.

03. Efficient approach applicable to both classification and localization tasks.

Abstract

General human action recognition requires understanding of various visual cues. In this paper, we propose a network architecture that computes and integrates the most important visual cues for action recognition: pose, motion, and the raw images. For the integration, we introduce a Markov chain model which adds cues successively. The resulting approach is efficient and applicable to action classification as well as to spatial and temporal action localization. The two contributions clearly improve the performance over respective baselines. The overall approach achieves state-of-the-art action classification performance on HMDB51, J-HMDB and NTU RGB+D datasets. Moreover, it yields state-of-the-art spatio-temporal action localization results on UCF101 and J-HMDB.
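
The idea of a chain that adds cues successively can be illustrated with a toy sketch. This is not the authors' network: the stream order (pose, then motion, then appearance), the feature sizes, the random weights, and the helper names `make_weights` and `chained_predict` are all assumptions made for illustration. Each stage here sees its own stream's features plus only the previous stage's class probabilities, which is one simple way to realize a Markov-style dependency.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, inputs):
    """Apply a weight matrix (list of rows) to an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def make_weights(rng, n_out, n_in):
    """Hypothetical stand-in for trained parameters: small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def chained_predict(stream_features, stage_weights):
    """Run the stages in order; each stage refines the previous prediction.

    Stage t receives its own features concatenated with the class
    probabilities from stage t-1 (nothing for the first stage).
    """
    previous = []
    outputs = []
    for features, weights in zip(stream_features, stage_weights):
        stage_input = list(features) + previous
        probs = softmax(linear(weights, stage_input))
        outputs.append(probs)
        previous = probs
    return outputs


NUM_CLASSES = 3   # assumed toy value
FEATURE_DIM = 4   # assumed toy value
rng = random.Random(0)

# Assumed order: pose, motion, appearance (placeholder feature vectors).
streams = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(3)]
stage_weights = [
    make_weights(rng, NUM_CLASSES, FEATURE_DIM + (NUM_CLASSES if i > 0 else 0))
    for i in range(3)
]

outputs = chained_predict(streams, stage_weights)
final_prediction = max(range(NUM_CLASSES), key=lambda c: outputs[-1][c])
print("per-stage probabilities:", outputs)
print("final class index:", final_prediction)
```

In this sketch the last stage's output serves as the final prediction, and the intermediate outputs show how the estimate changes as each cue is added.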

Code & Models

Repositories

mzolfaghari/chained-multistream-networks

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods