Agent-Environment Network for Temporal Action Proposal Generation

Viet-Khoa Vo-Ho; Ngan Le; Kashu Yamazaki; Akihiro Sugimoto; Minh-Triet Tran

arXiv:2107.08323·cs.CV·March 18, 2022

TL;DR

This paper introduces a novel contextual Agent-Environment Network that models human agents and their interactions with the environment to improve temporal action proposal generation in videos, outperforming existing methods.

Contribution

It proposes a new agent-environment network architecture that captures local agent actions and global interactions, addressing limitations of previous attention-based models.

Findings

01. Outperforms state-of-the-art methods on THUMOS-14 and ActivityNet-1.3 datasets

02. Effective with different backbone networks like C3D and SlowFast

03. Demonstrates robustness across various action categories

Abstract

Temporal action proposal generation is an essential and challenging task that aims to localize temporal intervals containing human actions in untrimmed videos. Most existing approaches cannot follow the human cognitive process of understanding video context because they lack an attention mechanism to express the concept of an action, the agent who performs the action, or the interaction between the agent and the environment. Based on the definition of an action as a human, known as an agent, interacting with the environment and performing an action that affects the environment, we propose a contextual Agent-Environment Network (AEN). The proposed contextual AEN involves (i) an agent pathway, operating at a local level to identify which humans/agents are acting, and (ii) an environment pathway, operating at a global level to capture how the agents interact with the environment. Comprehensive…
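The two-pathway idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`softmax`, `agent_pathway`, `fuse_snippet`), the use of the environment feature as the attention query, and fusion by concatenation are all assumptions made for illustration. It only shows how per-agent (local) features might be pooled with attention and combined with a single snippet-level (global) environment feature.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def agent_pathway(agent_feats, query):
    """Local level (hypothetical): attention-weighted pooling of per-agent features.

    agent_feats: list of feature vectors, one per detected agent.
    query: vector used to score each agent (assumed here to be the
           environment feature; the paper may do this differently).
    """
    dim = len(query)
    if not agent_feats:
        # No agents detected in this snippet: fall back to a zero vector.
        return [0.0] * dim
    # Dot-product score between each agent feature and the query.
    scores = [sum(a * q for a, q in zip(feat, query)) for feat in agent_feats]
    weights = softmax(scores)
    # Weighted sum of agent features, one output value per dimension.
    return [sum(w * feat[i] for w, feat in zip(weights, agent_feats))
            for i in range(dim)]


def fuse_snippet(agent_feats, env_feat):
    """Global + local (hypothetical): concatenate the environment feature
    with the pooled agent feature for one video snippet."""
    local = agent_pathway(agent_feats, env_feat)
    return list(env_feat) + local


if __name__ == "__main__":
    env = [0.2, 0.8]                   # one global feature for the snippet
    agents = [[1.0, 0.0], [0.0, 1.0]]  # two detected agents
    print(fuse_snippet(agents, env))
```

In this sketch, agents whose features align more closely with the environment feature receive larger attention weights, and a snippet with no detected agents still yields a fixed-length vector.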
