R-C3D: Region Convolutional 3D Network for Temporal Activity Detection

Huijuan Xu; Abir Das; Kate Saenko

arXiv:1703.07814 · cs.CV · September 4, 2017 · 129 cites

TL;DR

R-C3D is an end-to-end 3D convolutional network that efficiently detects and classifies activities in untrimmed videos, achieving state-of-the-art results and high speed across multiple datasets.

Contribution

The paper introduces R-C3D, a 3D fully convolutional network for temporal activity detection. Its proposal and classification stages share the same convolutional features, which saves computation and lets the whole model be trained end-to-end.

Findings

01 Runs at 569 frames per second on a single Titan X Maxwell GPU, faster than existing methods.

02 Outperforms existing methods on THUMOS'14.

03 Effective across multiple datasets, including ActivityNet and Charades.

Abstract

We address the problem of activity detection in continuous, untrimmed video streams. This is a difficult task that requires extracting meaningful spatio-temporal features to capture activities, and accurately localizing the start and end times of each activity. We introduce a new model, Region Convolutional 3D Network (R-C3D), which encodes the video streams using a three-dimensional fully convolutional network, then generates candidate temporal regions containing activities, and finally classifies selected regions into specific activities. Computation is saved due to the sharing of convolutional features between the proposal and the classification pipelines. The entire model is trained end-to-end with jointly optimized localization and classification losses. R-C3D is faster than existing methods (569 frames per second on a single Titan X Maxwell GPU) and achieves state-of-the-art results…
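To make "jointly optimized localization and classification losses" concrete, here is a minimal Python sketch of one way such a joint objective could be written. It is not taken from the abstract: the choice of softmax cross-entropy plus smooth L1 (in the style of Faster R-CNN), the convention that label 0 means background, the two-value offset per segment, and the names `joint_loss` and `lam` are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one segment, given raw class scores and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def joint_loss(logits, labels, pred_offsets, target_offsets, lam=1.0):
    """Classification loss over all segments plus lam times the localization
    loss over positive segments only.

    Assumptions (hypothetical, for illustration): label 0 is background, and
    each offset is a pair such as (center shift, length scale).
    """
    n = len(labels)
    cls = sum(softmax_cross_entropy(z, y) for z, y in zip(logits, labels)) / n

    # Background segments have no ground-truth boundaries to regress to.
    positives = [i for i, y in enumerate(labels) if y != 0]
    loc = 0.0
    if positives:
        loc = sum(
            smooth_l1(p - t)
            for i in positives
            for p, t in zip(pred_offsets[i], target_offsets[i])
        ) / len(positives)

    return cls + lam * loc
```

Because both terms are summed into a single scalar, gradients from the classification and the localization errors flow back into the same shared features, which is the sense in which the two objectives are optimized jointly.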

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods