Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking
Nannan Li, Dan Xu, Zhenqiang Ying, Zhihao Li, Ge Li

TL;DR
This paper introduces a method for searching action proposals in videos that combines spatial actionness estimation with temporal path inference and tracking, improving both proposal accuracy and the comprehensiveness of the generated proposals.
Contribution
It proposes a new actionness estimation technique that uses human appearance and motion cues, and formulates path association as a maximum set coverage problem with an improved optimization objective solved by greedy search.
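As a rough illustration of the greedy strategy commonly used for maximum set coverage, the sketch below picks up to `k` candidate paths so that the total weight of newly covered boxes is largest at each step. The names `greedy_max_coverage`, `candidates`, `weights`, and `k` are hypothetical, and using per-box actionness scores as weights is an assumption; the paper's improved objective is not reproduced here.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_coverage(
    candidates: Dict[Hashable, Set[Hashable]],
    weights: Dict[Hashable, float],
    k: int,
) -> List[Hashable]:
    """Pick up to k candidate sets, each time taking the largest weighted gain.

    candidates maps a (hypothetical) path id to the set of box ids it covers.
    weights maps a box id to a score, assumed here to be its actionness.
    """
    covered: Set[Hashable] = set()
    chosen: List[Hashable] = []
    for _ in range(k):
        best_id, best_gain = None, 0.0
        for cid, elems in candidates.items():
            if cid in chosen:
                continue
            # Gain counts only boxes that no chosen path covers yet.
            gain = sum(weights.get(e, 0.0) for e in elems - covered)
            if gain > best_gain:
                best_id, best_gain = cid, gain
        if best_id is None:  # nothing left adds positive weight
            break
        chosen.append(best_id)
        covered |= candidates[best_id]
    return chosen
```

Greedy selection of this kind is the standard approximation for maximum coverage; whether it matches the authors' exact search procedure cannot be confirmed from this summary.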
Findings
Outperforms state-of-the-art methods in proposal accuracy
Generates more comprehensive action proposals
Effective on the challenging UCF-Sports and UCF-101 datasets
Abstract
In this paper, we address the problem of searching action proposals in unconstrained video clips. Our approach starts from actionness estimation on frame-level bounding boxes, and then aggregates the bounding boxes belonging to the same actor across frames via linking, associating, and tracking to generate spatio-temporally continuous action paths. To this end, a novel actionness estimation method is first proposed that utilizes both human appearance and motion cues. Then, the association of the action paths is formulated as a maximum set coverage problem with the results of actionness estimation as a prior. To further improve performance, we design an improved optimization objective for the problem and provide a greedy search algorithm to solve it. Finally, a tracking-by-detection scheme is designed to further refine the searched action paths. Extensive experiments on two…
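The linking step described in the abstract, which joins frame-level boxes of the same actor into a path, can be pictured with a minimal sketch. It assumes each frame provides `(box, actionness)` pairs and that consecutive boxes are linked by combining actionness with spatial overlap (IoU). The function names, the `min_iou` threshold, and the additive score are illustrative assumptions, not the paper's formulation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_path(
    frames: List[List[Tuple[Box, float]]], min_iou: float = 0.3
) -> List[Box]:
    """Greedily follow the highest-scoring start box through later frames."""
    if not frames or not frames[0]:
        return []
    current = max(frames[0], key=lambda d: d[1])[0]
    path = [current]
    for detections in frames[1:]:
        # Keep only boxes that overlap the current one enough to be the same actor.
        scored = [
            (score + iou(current, box), box)
            for box, score in detections
            if iou(current, box) >= min_iou
        ]
        if not scored:  # path ends when no box overlaps sufficiently
            break
        current = max(scored, key=lambda s: s[0])[1]
        path.append(current)
    return path
```

In this sketch a path simply stops when no overlapping box is found; the abstract indicates that the authors handle such gaps through path association and tracking-by-detection instead.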
Taxonomy
Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Advanced Vision and Imaging
