Complementary Boundary Generator with Scale-Invariant Relation Modeling for Temporal Action Localization: Submission to ActivityNet Challenge 2020

Haisheng Su; Jinyuan Feng; Hao Shao; Zhenyu Jiang; Manyuan Zhang; Wei Wu; Yu Liu; Hongsheng Li; Junjie Yan

arXiv:2007.09883 · cs.CV · August 27, 2020

Open Access

TL;DR

This paper introduces a novel approach for temporal action localization that enhances proposal diversity and accuracy through a boundary generator and scale-invariant relation modeling, achieving state-of-the-art results.

Contribution

It proposes a complementary boundary generator with scale-invariant relation modeling to improve proposal quality and classification in temporal action localization.

Findings

01

Achieved 42.26% average mAP on ActivityNet Challenge 2020 test set.

02

Enhanced proposal diversity by exploring the video feature encoder, the proposal generator, proposal-proposal relations, scale imbalance, and the ensemble strategy.

03

State-of-the-art performance in temporal action localization.
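
For readers unfamiliar with how localization quality is scored, the following is a minimal sketch of temporal IoU (tIoU), the overlap measure that mAP-style localization metrics are built on. It is not the authors' evaluation code. The threshold grid (0.5 to 0.95 in steps of 0.05) is an assumption based on the common ActivityNet convention and is not stated on this page; the function names `temporal_iou` and `matched_thresholds` are hypothetical.

```python
def temporal_iou(seg_a, seg_b):
    """Temporal IoU of two (start, end) segments, e.g. in seconds."""
    # Overlap is clamped at zero for disjoint segments.
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0


# Assumed threshold grid: 0.50, 0.55, ..., 0.95 (not given in the source).
TIOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(proposal, ground_truth):
    """Thresholds at which a proposal would count as a correct localization."""
    iou = temporal_iou(proposal, ground_truth)
    return [t for t in TIOU_THRESHOLDS if iou >= t]
```

Under this assumed convention, an average mAP is the mean of the per-threshold mAP values, so a proposal with loose boundaries contributes only at the lower thresholds.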

Abstract

This technical report presents an overview of our solution used in the submission to ActivityNet Challenge 2020 Task 1 (temporal action localization/detection). Temporal action localization requires not only precisely locating the temporal boundaries of action instances, but also accurately classifying the untrimmed videos into specific categories. In this paper, we decouple the temporal action localization task into two stages (i.e., proposal generation and classification) and enrich the proposal diversity by exhaustively exploring the influence of multiple components from different but complementary perspectives. Specifically, in order to generate high-quality proposals, we consider several factors including the video feature encoder, the proposal generator, the proposal-proposal relations, the scale imbalance, and the ensemble strategy. Finally, in order to obtain accurate…

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications