Learning to Refactor Action and Co-occurrence Features for Temporal Action Localization

Kun Xia; Le Wang; Sanping Zhou; Nanning Zheng; Wei Tang

arXiv:2206.11493 · cs.CV · June 24, 2022

TL;DR

This paper introduces RefactorNet, a novel approach that decouples and recombines action and co-occurrence features in videos to improve the accuracy of temporal action localization.

Contribution

The paper proposes a new feature decoupling and recombination method that enhances action localization by emphasizing salient action content.

Findings

01

Significant performance improvements on THUMOS14 and ActivityNet v1.3 datasets.

02

Effective decoupling of action and co-occurrence features.

03

Improved localization accuracy with a simple detector.

Abstract

The main challenge of Temporal Action Localization is to retrieve subtle human actions from various co-occurring ingredients, e.g., context and background, in an untrimmed video. While prior approaches have achieved substantial progress through devising advanced action detectors, they still suffer from these co-occurring ingredients, which often dominate the actual action content in videos. In this paper, we explore two orthogonal but complementary aspects of a video snippet, i.e., the action features and the co-occurrence features. Specifically, we develop a novel auxiliary task by decoupling these two types of features within a video snippet and recombining them to generate a new feature representation with more salient action information for accurate action localization. We term our method RefactorNet, which first explicitly factorizes the action content and regularizes its…
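The decouple-then-recombine idea in the abstract can be illustrated with a toy sketch. This is not the RefactorNet architecture, which the abstract describes only at a high level and which learns the factorization. As a stand-in, the sketch assumes a fixed linear projection onto a hypothetical "action direction" and treats the residual as the co-occurrence component. The names `decouple`, `recombine`, `action_dir`, and both weights are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def decouple(feature, action_dir):
    """Split a snippet feature into two parts (toy assumption, not the paper's method):
    the projection onto action_dir (action part) and the residual (co-occurrence part)."""
    scale = dot(feature, action_dir) / dot(action_dir, action_dir)
    action = [scale * a for a in action_dir]
    cooccurrence = [f - a for f, a in zip(feature, action)]
    return action, cooccurrence


def recombine(action, cooccurrence, action_weight=1.0, cooc_weight=0.5):
    """Rebuild a feature that keeps the action part and down-weights the rest.
    The weights are hypothetical; a learned model would not use fixed constants."""
    return [action_weight * a + cooc_weight * c
            for a, c in zip(action, cooccurrence)]


# Hypothetical 2-D snippet feature and action direction.
snippet = [3.0, 4.0]
action_dir = [1.0, 0.0]

action, cooc = decouple(snippet, action_dir)
refactored = recombine(action, cooc)
print(action, cooc, refactored)
```

In this toy setting the two parts are orthogonal vectors that sum back to the original feature, and the recombined feature shrinks only the co-occurrence part. The paper's notion of "orthogonal but complementary" aspects is conceptual, so reading it as vector orthogonality is an assumption of this sketch.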

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications