TL;DR
This paper introduces Agent-Aware Boundary Networks (ABN), a novel framework that captures agent-environment interactions to improve temporal action proposal generation in videos, outperforming existing methods across multiple datasets and backbones.
Contribution
The paper proposes a new agent-aware representation network that models agent-agent and agent-environment interactions for better temporal proposal generation.
Findings
ABN outperforms state-of-the-art methods on the THUMOS-14 and ActivityNet-1.3 datasets.
ABN improves proposal quality when integrated with temporal action detection (TAD) frameworks.
The framework is effective across different backbone networks.
Abstract
Temporal action proposal generation (TAPG) aims to estimate the temporal intervals of actions in untrimmed videos, a challenging task that plays an important role in many areas of video analysis and understanding. Despite great progress in TAPG, most existing works ignore the human perception of interactions between agents and the surrounding environment, instead applying a deep learning model as a black box to untrimmed videos to extract visual representations. Capturing these interactions between agents and the environment is therefore beneficial and can potentially improve TAPG performance. In this paper, we propose a novel framework named Agent-Aware Boundary Network (ABN), which consists of two sub-networks: (i) an Agent-Aware Representation Network to obtain both agent-agent and agent-environment relationships in the video representation, and (ii) a Boundary…
