Deep Point-wise Prediction for Action Temporal Proposal

Luxuan Li; Tao Kong; Fuchun Sun; Huaping Liu

arXiv:1909.07725·cs.CV·September 18, 2019

TL;DR

This paper introduces Deep Point-wise Prediction (DPP), a fast, end-to-end method for generating temporal action proposals in videos without relying on sliding windows or grouping, achieving real-time performance.

Contribution

The paper proposes an end-to-end approach to temporal action proposal generation that predicts, at each temporal location, both the likelihood that an action is present and the action's temporal boundaries, removing the need for handcrafted sliding-window or grouping strategies.
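To make the point-wise idea concrete, here is a minimal sketch of how per-location predictions could be decoded into proposals. It assumes that each temporal location outputs an action score plus distances to the start and end boundaries; that parameterization, the function name `decode_proposals`, and the `stride` and `score_threshold` parameters are illustrative assumptions, not details confirmed by this page.

```python
from typing import List, Sequence, Tuple

Proposal = Tuple[float, float, float]  # (start, end, score), in frames


def decode_proposals(
    scores: Sequence[float],
    dist_to_start: Sequence[float],
    dist_to_end: Sequence[float],
    stride: float = 1.0,           # hypothetical: frames covered by one location
    score_threshold: float = 0.5,  # hypothetical cut-off for keeping a location
) -> List[Proposal]:
    """Turn per-location predictions into scored temporal proposals.

    Assumed parameterization: location t sits at the centre of its
    stride-wide cell and predicts how far the action extends to the
    left (dist_to_start) and to the right (dist_to_end), in frames.
    """
    proposals: List[Proposal] = []
    for t, (p, d_s, d_e) in enumerate(zip(scores, dist_to_start, dist_to_end)):
        if p < score_threshold:
            continue  # this location is unlikely to lie inside an action
        center = (t + 0.5) * stride
        start = max(0.0, center - d_s)  # clip to the beginning of the video
        end = center + d_e
        if end > start:
            proposals.append((start, end, p))
    # Highest-scoring proposals first; no sliding windows or grouping involved.
    proposals.sort(key=lambda item: item[2], reverse=True)
    return proposals


example = decode_proposals(
    scores=[0.1, 0.9, 0.6],
    dist_to_start=[1.0, 1.0, 2.0],
    dist_to_end=[1.0, 2.0, 0.5],
    stride=8.0,
)
```

With these inputs, the first location is dropped by the threshold and the remaining two each yield one proposal directly from their own predictions.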

Findings

01

DPP runs at more than 1000 frames per second.

02

DPP demonstrates superior effectiveness, generality, and robustness on THUMOS14.

03

The method outperforms previous approaches in speed and accuracy.

Abstract

Detecting actions in videos is an important yet challenging task. Previous works usually enumerate the possible temporal locations with (a) sliding-window paradigms or (b) per-frame action scoring and grouping, so their performance is limited by the design of the sliding windows or grouping strategies. In this paper, we present a simple and effective method for temporal action proposal generation, named Deep Point-wise Prediction (DPP). DPP simultaneously predicts the probability that an action exists and the corresponding temporal locations, without using any handcrafted sliding window or grouping. The whole system is trained end-to-end with a joint loss for temporal action proposal classification and location prediction. We conduct extensive experiments to verify its effectiveness, generality, and robustness on the standard THUMOS14 dataset. DPP runs more than 1000 frames per…
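The joint loss mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact loss terms, so the choices here are assumptions made for illustration only: binary cross-entropy for the classification term, `1 - tIoU` for the location term (applied to positive locations only), and a hypothetical weight `loc_weight` balancing the two.

```python
import math
from typing import Optional, Sequence, Tuple

Segment = Tuple[float, float]  # (start, end)


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection-over-union of two 1-D temporal segments."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def joint_loss(
    probs: Sequence[float],                  # predicted action probability per location
    labels: Sequence[int],                   # 1 if the location lies inside an action
    pred_segments: Sequence[Segment],        # predicted (start, end) per location
    gt_segments: Sequence[Optional[Segment]],  # ground truth; None for negatives
    loc_weight: float = 1.0,                 # hypothetical balancing weight
    eps: float = 1e-7,
) -> float:
    """Classification loss plus location loss (an assumed formulation)."""
    # Classification term: mean binary cross-entropy over all locations.
    cls = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        cls -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    cls /= max(len(probs), 1)

    # Location term: mean (1 - tIoU) over positive locations only.
    positives = [i for i, y in enumerate(labels) if y == 1]
    loc = sum(
        1.0 - temporal_iou(pred_segments[i], gt_segments[i]) for i in positives
    ) / max(len(positives), 1)

    return cls + loc_weight * loc


loss = joint_loss(
    probs=[0.5, 0.5],
    labels=[1, 0],
    pred_segments=[(0.0, 10.0), (0.0, 0.0)],
    gt_segments=[(0.0, 10.0), None],
)
```

In this example the positive location's segment matches its ground truth exactly, so the location term is zero and only the classification term contributes. Because both terms are computed from the same per-location outputs, a single backward pass could update scoring and localization together, which is what end-to-end training refers to here.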

Code & Models

Repositories

liluxuan1997/DPP (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Multimodal Machine Learning Applications