TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals
Jiyang Gao, Zhenheng Yang, Chen Sun, Kan Chen, Ram Nevatia

TL;DR
TURN is a temporal action proposal network that jointly predicts and refines action segments in untrimmed videos. It outperforms prior methods on average recall on THUMOS-14 and ActivityNet while running at over 880 FPS on a TITAN X GPU.
Contribution
The paper introduces TURN, a network that jointly predicts temporal action proposals and refines their boundaries by temporal coordinate regression. It keeps computation fast by reusing unit-level features as the building blocks of every proposal.
Findings
Outperforms state-of-the-art methods in average recall (AR) by a large margin on THUMOS-14 and ActivityNet.
Achieves over 880 FPS on a TITAN X GPU.
Enhances existing action localization pipelines with better proposal generation.
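The average recall metric named in the findings can be illustrated with a minimal sketch. It assumes proposals and ground-truth segments are (start, end) pairs in seconds, and that recall is averaged over a list of temporal IoU thresholds. The function names and the thresholds in the usage note are illustrative assumptions, not the paper's evaluation code.

```python
def temporal_iou(a, b):
    """Temporal intersection-over-union of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_tiou(proposals, ground_truth, tiou):
    """Fraction of ground-truth segments matched by at least one proposal."""
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for gt in ground_truth
        if any(temporal_iou(p, gt) >= tiou for p in proposals)
    )
    return hits / len(ground_truth)


def average_recall(proposals, ground_truth, thresholds):
    """Mean recall over a list of tIoU thresholds (assumed non-empty)."""
    recalls = [recall_at_tiou(proposals, ground_truth, t) for t in thresholds]
    return sum(recalls) / len(recalls)
```

For example, `average_recall([(0, 10), (20, 30)], [(0, 10), (22, 30), (50, 60)], [0.5, 0.7, 0.9])` averages the recalls 2/3, 2/3, and 1/3. The threshold set here is hypothetical and should be replaced by whatever the benchmark protocol specifies.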
Abstract
Temporal Action Proposal (TAP) generation is an important problem, as fast and accurate extraction of semantically important (e.g. human actions) segments from untrimmed videos is an important step for large-scale video analysis. We propose a novel Temporal Unit Regression Network (TURN) model. There are two salient aspects of TURN: (1) TURN jointly predicts action proposals and refines the temporal boundaries by temporal coordinate regression; (2) Fast computation is enabled by unit feature reuse: a long untrimmed video is decomposed into video units, which are reused as basic building blocks of temporal proposals. TURN outperforms the state-of-the-art methods under average recall (AR) by a large margin on THUMOS-14 and ActivityNet datasets, and runs at over 880 frames per second (FPS) on a TITAN X GPU. We further apply TURN as a proposal generation stage for existing temporal action…
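The two aspects described in the abstract, unit feature reuse and temporal coordinate regression, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: unit features are plain lists of floats standing in for the output of a visual encoder, clips are pooled by averaging (an assumed pooling choice), and the regression offsets are hand-picked numbers rather than the output of a trained network. All function names are hypothetical.

```python
def split_into_units(num_frames, unit_size):
    """Return (start_frame, end_frame) for each full, non-overlapping unit."""
    return [
        (s, s + unit_size)
        for s in range(0, num_frames - unit_size + 1, unit_size)
    ]


def mean_pool(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def clip_feature(unit_features, start_unit, end_unit):
    """Build a clip feature from cached unit features for [start_unit, end_unit).

    The per-unit features are computed once per video and reused by every
    candidate clip that covers them; mean pooling is an assumption here.
    """
    return mean_pool(unit_features[start_unit:end_unit])


def refine_boundaries(start_unit, end_unit, start_offset, end_offset, unit_size):
    """Shift clip boundaries by regressed unit-level offsets; return frames."""
    return (
        (start_unit + start_offset) * unit_size,
        (end_unit + end_offset) * unit_size,
    )
```

With a hypothetical unit size of 16 frames, `split_into_units(70, 16)` yields four units, and `refine_boundaries(1, 3, -0.5, 0.25, 16)` moves a clip spanning units 1 to 3 to frames 8.0 through 52.0. Because `clip_feature` only slices and pools cached vectors, evaluating many overlapping candidate clips does not require re-encoding any frames, which is the intuition behind the speed claim.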
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Video Surveillance and Tracking Methods
