Cross-Enhancement Transform Two-Stream 3D ConvNets for Action Recognition

Dong Cao, Lisha Xu, and Dongdong Zhang

arXiv:1908.08916·cs.CV·October 23, 2019·1 cites

Open Access

TL;DR

This paper introduces Cross-Enhancement Transform Two-Stream 3D ConvNets, a method for action recognition in which the better-performing of the two streams helps train the other. It is evaluated on UCF-101, HMDB-51, and Kinetics-400.

Contribution

It proposes a cross-enhancement transform in which the better-performing stream of a two-stream 3D ConvNet acts as a teaching model for the other stream during training, with the aim of improving action recognition.

Findings

01

Improved accuracy on UCF-101, HMDB-51, and Kinetics-400 datasets.

02

Effective in handling action variations across different environments.

03

Demonstrates the benefit of mutual stream enhancement in 3D ConvNets.

Abstract

Action recognition is an important research topic in computer vision. It underpins visual understanding and has been applied in many fields. Because human actions vary across environments, it is difficult to infer actions under very different conditions with a single fixed model structure. To address this, we propose the Cross-Enhancement Transform Two-Stream 3D ConvNets algorithm, which takes into account the distribution characteristics of actions in the specific dataset. The better-performing of the two streams serves as a teaching model and assists in training the other stream. The enhanced-trained stream and the teacher stream are then combined to infer actions. We run experiments on the video datasets UCF-101, HMDB-51, and Kinetics-400, and the results confirm the effectiveness of our algorithm.
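
The abstract only outlines the training scheme, so the following is a rough sketch of the general idea rather than the authors' method. It assumes the teacher is chosen by validation accuracy, that the student is trained with a distillation-style loss (a standard technique swapped in here because the abstract does not give the actual cross-enhancement transform), and that the two streams are combined by averaging class probabilities. All function names, the stream names, and the `alpha` weight are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_teacher(val_acc):
    """Return the name of the stream with the higher validation accuracy.

    Assumption: the abstract says only that the better-performing stream
    teaches; using validation accuracy as the criterion is our guess.
    """
    return max(val_acc, key=val_acc.get)

def student_loss(student_logits, teacher_logits, label, alpha=0.5):
    """Distillation-style loss for the student stream (illustrative only).

    Mixes cross-entropy on the true label with cross-entropy against the
    teacher's soft predictions. This is NOT taken from the paper.
    """
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    eps = 1e-12  # guards against log(0)
    hard = -math.log(max(p_s[label], eps))
    soft = -sum(t * math.log(max(s, eps)) for t, s in zip(p_t, p_s))
    return (1 - alpha) * hard + alpha * soft

def fuse(teacher_logits, student_logits):
    """Late fusion by averaging class probabilities (assumed scheme)."""
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    avg = [(a + b) / 2 for a, b in zip(p_t, p_s)]
    return max(range(len(avg)), key=avg.__getitem__)

# Hypothetical accuracies and logits, for illustration only.
teacher = pick_teacher({"rgb": 0.84, "flow": 0.88})
predicted = fuse([2.0, 0.1, 0.1], [0.2, 1.0, 0.1])
```

In practice the logits would come from two 3D ConvNets (for example, an RGB stream and an optical-flow stream), and the loss would be minimized over the student's parameters while the teacher stays fixed.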

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications