Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Zongmeng Zhang; Xianjing Han; Xuemeng Song; Yan Yan; Liqiang Nie

arXiv:2110.06058 · cs.CV · October 13, 2021

TL;DR

This paper introduces MIGCN, a graph convolutional network that jointly models intra-modal relations and inter-modal interactions between a video and a sentence query for temporal language localization, and reports better accuracy and efficiency than existing methods on Charades-STA and ActivityNet.

Contribution

The work proposes a multi-modal interaction graph convolutional network with adaptive context-aware localization for more accurate video moment detection.

Findings

01. Outperforms existing methods on the Charades-STA and ActivityNet datasets.

02. Effectively captures complex intra- and inter-modal relations.

03. Demonstrates superior efficiency in localization tasks.

Abstract

This paper focuses on tackling the problem of temporal language localization in videos, which aims to identify the start and end points of a moment described by a natural language sentence in an untrimmed video. However, it is non-trivial since it requires not only the comprehensive understanding of the video and sentence query, but also the accurate semantic correspondence capture between them. Existing efforts are mainly centered on exploring the sequential relation among video clips and query words to reason the video and sentence query, neglecting the other intra-modal relations (e.g., semantic similarity among video clips and syntactic dependency among the query words). Towards this end, in this work, we propose a Multi-modal Interaction Graph Convolutional Network (MIGCN), which jointly explores the complex intra-modal relations and inter-modal interactions residing in the video…
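To make the idea of an intra-modal relation graph concrete, here is a minimal sketch of one graph-convolution step over video-clip nodes whose edges come from semantic (cosine) similarity. It is an illustration under stated assumptions, not the authors' formulation: the function names, the similarity threshold of 0.5, the mean (row-normalized) aggregation, and the toy features are all hypothetical. The same propagation step could in principle run over a joint graph that also contains query-word nodes, with edges taken from syntactic dependencies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_adjacency(feats, threshold=0.5):
    """Link nodes whose cosine similarity exceeds `threshold`.

    The threshold is a hypothetical choice; self-loops are always kept.
    """
    n = len(feats)
    return [
        [1.0 if i == j or cosine(feats[i], feats[j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def graph_conv(adj, feats, weight):
    """One propagation step: ReLU(A_norm @ H @ W), with A_norm row-normalized."""
    n, d_in, d_out = len(feats), len(weight), len(weight[0])
    # Row-normalize so each node averages over itself and its neighbours.
    # Row sums are never zero because every node has a self-loop.
    a_norm = [[a / sum(row) for a in row] for row in adj]
    # Aggregate neighbour features.
    agg = [
        [sum(a_norm[i][k] * feats[k][c] for k in range(n)) for c in range(d_in)]
        for i in range(n)
    ]
    # Linear projection followed by ReLU.
    return [
        [max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
         for o in range(d_out)]
        for i in range(n)
    ]


# Toy example: clips 0 and 1 look alike, clip 2 is different.
clips = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix

adj = similarity_adjacency(clips)
out = graph_conv(adj, clips, identity)
```

In this toy run, clips 0 and 1 are linked and end up with the same averaged feature (roughly [0.95, 0.05]), while clip 2 keeps its own feature because it has no similar neighbour. A real model would learn the weight matrix and stack several such layers.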

Code & Models

Repositories

zmzhang2000/MIGCN (PyTorch, official)