Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network

Xiang Fang; Wanlong Fang; Changshuo Wang; Daizong Liu; Keke Tang; Jianfeng Dong; Pan Zhou; Beibei Li

arXiv:2412.15678·cs.CV·April 7, 2025

TL;DR

This paper introduces a multi-pair temporal sentence grounding framework that co-trains multiple video-query pairs, leveraging shared semantics and prototypes to improve efficiency and accuracy in locating relevant video segments.

Contribution

It proposes a novel multi-pair co-training approach with cross-modal contrast, prototype alignment, and adaptive negative selection for more effective temporal sentence grounding.
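
To make the "cross-modal contrast" idea concrete, here is a minimal sketch of a generic symmetric InfoNCE-style loss over a batch of video-query pairs. It is an illustration under stated assumptions, not the paper's formulation: the function names, the pooled per-pair feature vectors, and the `temperature` value are all hypothetical, and prototype alignment and adaptive negative selection are not modeled.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [a / norm for a in v]


def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy where candidates[i] is the positive for anchors[i].

    Every other candidate in the batch serves as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, c) / temperature for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def cross_modal_contrast(video_feats, query_feats, temperature=0.1):
    """Symmetric video-to-query and query-to-video contrastive loss.

    Assumption: video_feats[i] and query_feats[i] are pooled embeddings
    of the i-th video-query pair in a shared feature space.
    """
    v = [l2_normalize(x) for x in video_feats]
    q = [l2_normalize(x) for x in query_feats]
    return 0.5 * (info_nce(v, q, temperature) + info_nce(q, v, temperature))


if __name__ == "__main__":
    videos = [[1.0, 0.0], [0.0, 1.0]]
    queries = [[2.0, 0.0], [0.0, 3.0]]
    print(cross_modal_contrast(videos, queries))        # matched pairs: small loss
    print(cross_modal_contrast(videos, queries[::-1]))  # mismatched pairs: large loss
```

Because several pairs are scored together, each pair's loss depends on the other pairs in the batch, which is one simple way in which co-training can expose relationships between pairs.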

Findings

01 Outperforms existing methods in accuracy and efficiency

02 Effectively models cross-modal semantic relationships

03 Avoids repeatedly re-obtaining redundant knowledge

Abstract

Given video-query pairs consisting of untrimmed videos and sentence queries, temporal sentence grounding (TSG) aims to locate the query-relevant segments in these videos. Although previous TSG methods have achieved remarkable success, they train each video-query pair separately and ignore the relationships between different pairs. We observe that similar video/query content not only helps the TSG model better understand and generalize the cross-modal representation but also assists the model in locating some complex video-query pairs. Previous methods follow a single-thread framework that cannot co-train different pairs and usually spend much time re-obtaining redundant knowledge, limiting their real-world applications. To this end, we pose a brand-new setting, Multi-Pair TSG, which aims to co-train these pairs. In particular, we propose a novel video-query…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: ALIGN