Rethinking Motion Representation: Residual Frames with 3D ConvNets for Better Action Recognition

Li Tao; Xueting Wang; Toshihiko Yamasaki

arXiv:2001.05661 · cs.CV · January 17, 2020 · 19 cites

TL;DR

This paper introduces a novel residual frame approach for 3D ConvNets that significantly improves action recognition accuracy without relying on computationally expensive optical flow, by combining residual and appearance features in a two-path model.

Contribution

It proposes using residual frames as motion features in 3D ConvNets and combines them with appearance features from a 2D ConvNet in a two-path network, achieving performance better than or comparable to methods that use additional optical flow.
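To make the idea of a residual frame concrete, here is a minimal NumPy sketch. It assumes a residual frame is the pixel-wise difference between two adjacent RGB frames, and that a clip is stored as an array of shape (T, H, W, C); the function name `residual_frames` and the choice of `int16` output are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def residual_frames(clip: np.ndarray) -> np.ndarray:
    """Return differences between adjacent frames of a clip.

    clip: array of shape (T, H, W, C), typically uint8 RGB frames.
    Returns an int16 array of shape (T - 1, H, W, C).
    Assumption: residual = frame[t + 1] - frame[t].
    """
    if clip.ndim != 4 or clip.shape[0] < 2:
        raise ValueError("expected a (T, H, W, C) clip with at least 2 frames")
    # Cast to a signed type first so that uint8 subtraction does not wrap around.
    frames = clip.astype(np.int16)
    return frames[1:] - frames[:-1]
```

A stack of T frames yields T - 1 residual frames, so a loader feeding a 3D ConvNet would likely need to sample one extra frame per clip. Static regions cancel to zero in the difference, which is consistent with the abstract's remark that residual frames carry little appearance information.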

Findings

01

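Top-1 accuracy improves by 20.5 and 12.5 percentage points on UCF101 and HMDB51, respectively, when residual frames replace stacked RGB frames and the network is trained from scratch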

02

Outperforms state-of-the-art on Mini-kinetics

03

Residual frames effectively enhance motion feature extraction

Abstract

Recently, 3D convolutional networks have yielded good performance in action recognition. However, an optical flow stream is still needed to ensure better performance, and its cost is very high. In this paper, we propose a fast but effective way to extract motion features from videos by using residual frames as the input data to 3D ConvNets. By replacing traditional stacked RGB frames with residual ones, improvements of 20.5 and 12.5 percentage points in top-1 accuracy can be achieved on the UCF101 and HMDB51 datasets, respectively, when trained from scratch. Because residual frames contain little information about object appearance, we further use a 2D convolutional network to extract appearance features and combine them with the results from residual frames to form a two-path solution. On three benchmark datasets, our two-path solution achieved performance better than or comparable to those using additional optical…
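The abstract does not say how the two paths are combined. One common choice is late fusion of class probabilities, sketched below as an assumption rather than as the authors' method. The names `fuse_two_paths` and `motion_weight` are hypothetical, and the logits stand in for the outputs of the 3D ConvNet (motion path) and the 2D ConvNet (appearance path).

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_two_paths(motion_logits: np.ndarray,
                   appearance_logits: np.ndarray,
                   motion_weight: float = 0.5) -> np.ndarray:
    """Weighted average of per-class probabilities from the two paths.

    Both inputs have shape (batch, num_classes).
    Assumption: simple late fusion; the paper may combine the paths differently.
    """
    if motion_logits.shape != appearance_logits.shape:
        raise ValueError("both paths must score the same classes")
    motion_probs = softmax(motion_logits)
    appearance_probs = softmax(appearance_logits)
    return motion_weight * motion_probs + (1.0 - motion_weight) * appearance_probs
```

Because each path is converted to probabilities before averaging, the fused scores still sum to one per sample, and the predicted class is the `argmax` over the last axis.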

Taxonomy

Topics: Human Pose and Action Recognition · Gait Recognition and Analysis · Anomaly Detection Techniques and Applications