TL;DR
This paper introduces A2Net, a temporal action localization framework that combines anchor-based and anchor-free modules. The combination improves detection of actions with widely varying durations and achieves state-of-the-art results on THUMOS14.
Contribution
The paper proposes an anchor-free module for temporal action localization that complements traditional anchor-based methods, improving both flexibility and performance.
Findings
Achieves 45.5% mAP on THUMOS14, surpassing the previous state of the art (42.8%).
Demonstrates the complementarity between anchor-free and anchor-based modules.
Shows improved detection of extremely short and long actions.
Abstract
Most current action localization methods follow an anchor-based pipeline: depicting action instances by pre-defined anchors, learning to select the anchors closest to the ground truth, and predicting the confidence of anchors with refinements. Pre-defined anchors set priors on the location and duration of action instances, which facilitates localization of common action instances but limits the flexibility for tackling action instances with drastic variation, especially extremely short or extremely long ones. To address this problem, this paper proposes a novel anchor-free action localization module that assists action localization by temporal points. Specifically, this module represents an action instance as a point with its distances to the starting boundary and ending boundary, alleviating the pre-defined anchor restrictions in terms of action localization and…
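The point-based representation described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the function name `decode_anchor_free`, the `stride` parameter, and the clamping of start times at zero are all assumptions made for illustration. It only shows how a temporal point plus its predicted distances to the starting and ending boundaries can be turned back into a (start, end) segment.

```python
from typing import List, Tuple


def decode_anchor_free(
    distances: List[Tuple[float, float]],
    stride: float = 1.0,
) -> List[Tuple[float, float]]:
    """Convert per-point (distance_to_start, distance_to_end) pairs into segments.

    Hypothetical helper: the i-th entry belongs to the temporal point located
    at i * stride. The stride and the clamping at 0 are illustrative assumptions.
    """
    segments = []
    for index, (dist_start, dist_end) in enumerate(distances):
        center = index * stride
        # The segment is recovered directly from the point and its two distances,
        # so no pre-defined anchor length constrains how short or long it can be.
        start = max(0.0, center - dist_start)
        end = center + dist_end
        segments.append((start, end))
    return segments


# Three neighbouring points that all lie inside the same action instance.
segments = decode_anchor_free([(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)])
print(segments)
```

In this toy input, every point inside the action decodes to the same segment, which is why such predictions would typically be scored and de-duplicated afterwards (that step is not shown here).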
