Temporal Action Localization with Cross Layer Task Decoupling and Refinement

Qiang Li; Di Liu; Jun Kong; Sen Li; Hui Xu; Jianzhong Wang

arXiv:2412.09202·cs.CV·December 16, 2024

TL;DR

This paper introduces a novel approach for temporal action localization that effectively disentangles classification and localization tasks using cross layer feature decoupling and refinement, leading to state-of-the-art results.

Contribution

The proposed CLTDR method integrates multi-layer features for better task decoupling and refinement, and introduces the lightweight GMG module for multi-granularity feature extraction.

Findings

01 Achieves state-of-the-art performance on five benchmarks.

02 Effectively disentangles classification and localization tasks.

03 Improves feature utilization with the GMG module.

Abstract

Temporal action localization (TAL) involves two tasks: classifying and localizing actions within untrimmed videos. However, the two tasks often have conflicting feature requirements. Existing methods typically employ separate heads for classification and localization but feed both heads the same input feature, leading to suboptimal performance. To address this issue, we propose a novel TAL method with Cross Layer Task Decoupling and Refinement (CLTDR). Based on the feature pyramid of a video, the CLTDR strategy integrates semantically strong features from higher pyramid layers and detailed boundary-aware features from lower pyramid layers to effectively disentangle the action classification and localization tasks. Moreover, the features from multiple layers are also used to refine and align the disentangled classification and regression results. Finally, a lightweight Gated…
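The cross-layer idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the function names (`upsample2`, `downsample2`, `add`, `decouple`), the use of plain summation for fusion, the factor-of-two temporal pyramid, and the fallback at the top and bottom levels are all assumptions made for illustration. It only shows how a classification branch might draw on the next-higher (more semantic) level while a localization branch draws on the next-lower (more detailed) level.

```python
# Illustrative sketch only; NOT the CLTDR implementation from the paper.
# Assumptions: each pyramid level halves the temporal length, features are
# lists of per-time-step vectors, and fusion is a plain element-wise sum.

def upsample2(seq):
    """Nearest-neighbour upsampling: repeat every time step twice."""
    return [vec for vec in seq for _ in range(2)]

def downsample2(seq):
    """Average-pool adjacent pairs of time steps (stride 2)."""
    return [
        [(a + b) / 2 for a, b in zip(seq[i], seq[i + 1])]
        for i in range(0, len(seq) - 1, 2)
    ]

def add(seq_a, seq_b):
    """Element-wise sum of two equally long feature sequences."""
    return [[a + b for a, b in zip(va, vb)] for va, vb in zip(seq_a, seq_b)]

def decouple(pyramid):
    """Return (cls_feats, loc_feats), one entry per pyramid level.

    cls_feats[l]: level l fused with the upsampled next-higher level.
    loc_feats[l]: level l fused with the downsampled next-lower level.
    Levels without such a neighbour fall back to their own feature.
    """
    cls_feats, loc_feats = [], []
    for level, feat in enumerate(pyramid):
        if level + 1 < len(pyramid):
            cls_feats.append(add(feat, upsample2(pyramid[level + 1])))
        else:
            cls_feats.append(feat)
        if level > 0:
            loc_feats.append(add(feat, downsample2(pyramid[level - 1])))
        else:
            loc_feats.append(feat)
    return cls_feats, loc_feats

# Toy pyramid: temporal lengths 8, 4, 2 with 2 channels per time step.
pyramid = [[[float(t), 1.0] for t in range(n)] for n in (8, 4, 2)]
cls_feats, loc_feats = decouple(pyramid)
```

In this sketch the two branches receive different inputs at every level that has a neighbour, which is the property the abstract contrasts with heads that share one input feature. A real model would replace the summation with learned layers.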

Code & Models

Repositories

liqiang0307/cltdr-gmg (PyTorch, official)

Videos

Temporal Action Localization with Cross Layer Task Decoupling and Refinement · Underline

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis

Methods: ALIGN