Hierarchical Deep Residual Reasoning for Temporal Moment Localization

Ziyang Ma; Xianjing Han; Xuemeng Song; Yiran Cui; Liqiang Nie

arXiv:2111.00417 · cs.MM · November 2, 2021

TL;DR

This paper introduces a Hierarchical Deep Residual Reasoning model for more precise temporal moment localization in videos, leveraging multi-level semantic representations and adaptive feature fusion.

Contribution

The paper presents a novel hierarchical reasoning framework that decomposes video and sentence semantics for finer localization, and introduces Res-BiGRUs for adaptive feature fusion.

Findings

01 HDRR outperforms state-of-the-art methods on the Charades-STA and ActivityNet-Captions datasets.

02 Hierarchical semantic decomposition improves localization accuracy.

03 Res-BiGRUs effectively handle varying video resolutions and sentence lengths.

Abstract

Temporal Moment Localization (TML) in untrimmed videos is a challenging multimedia task that aims to localize the start and end points of the activity in a video described by a sentence query. Existing methods mainly focus on mining the correlation between video and sentence representations or on investigating how to fuse the two modalities. These works understand the video and sentence only coarsely, ignoring the fact that a sentence can be understood at several semantic levels and that the dominant words affecting moment localization are the action and object references. To this end, we propose a Hierarchical Deep Residual Reasoning (HDRR) model, which decomposes the video and sentence into multi-level representations with different semantics to achieve finer-grained localization. Furthermore, considering that videos with different…
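To make the task concrete: a TML model outputs a (start, end) pair, and such predictions are commonly scored against the annotated moment with temporal Intersection over Union (tIoU). The sketch below is not taken from this page or from the HDRR code; the function name `temporal_iou` and the example timestamps are assumptions chosen only for illustration.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) moments given in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # The overlap is clamped to zero when the two intervals do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    # Union = sum of both durations minus the shared part.
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical moments (in seconds), not data from the paper.
predicted = (2.0, 8.0)
ground_truth = (4.0, 10.0)
score = temporal_iou(predicted, ground_truth)
```

A prediction is typically counted as correct when its tIoU with the ground truth reaches a chosen threshold; the thresholds used in the paper are not stated on this page.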

Code & Models

Repositories

ddlbojack/hdrr (PyTorch, official)