D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

Hanjun Li; Xiujun Shu; Sunan He; Ruizhi Qiao; Wei Wen; Taian Guo; Bei Gan; Xing Sun

arXiv:2308.04197·cs.CV·August 9, 2023

TL;DR

This paper introduces D3G, a novel weakly supervised framework for temporal sentence grounding that uses glance annotations and Gaussian priors to effectively locate video moments with reduced annotation effort.

Contribution

The paper proposes a Dynamic Gaussian prior based Grounding framework with Glance annotation (D3G), combining semantic alignment and dynamic distribution adjustment for improved weakly supervised TSG.

Findings

01 Substantially outperforms state-of-the-art weakly supervised methods.

02 Narrows the performance gap with fully supervised approaches.

03 Demonstrates effectiveness on three challenging benchmarks.

Abstract

Temporal sentence grounding (TSG) aims to locate a specific moment in an untrimmed video given a natural language query. Weakly supervised methods still show a large performance gap compared to fully supervised ones, while the latter require laborious timestamp annotations. In this study, we aim to reduce the annotation cost while keeping performance on the TSG task competitive with fully supervised methods. To achieve this goal, we investigate a recently proposed glance-supervised temporal sentence grounding task, which requires only a single-frame annotation (referred to as a glance annotation) for each query. Under this setup, we propose a Dynamic Gaussian prior based Grounding framework with Glance annotation (D3G), which consists of a Semantic Alignment Group Contrastive Learning module (SA-GCL) and a Dynamic Gaussian prior Adjustment module (DGA). Specifically, SA-GCL…
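To make the idea of a Gaussian prior around a glance annotation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the video is split into a fixed number of clips and that the glance falls in one of them. The function names, the fixed `sigma`, and the averaging rule for a candidate moment are all hypothetical choices for illustration. The abstract is cut off before it describes how SA-GCL and DGA use or adjust the prior, so none of that is modeled here.

```python
import math


def gaussian_prior(num_clips, glance_idx, sigma):
    """Return one weight per clip, peaking at the glance-annotated clip.

    Hypothetical helper: a plain, unnormalized Gaussian over clip indices.
    """
    return [
        math.exp(-((i - glance_idx) ** 2) / (2 * sigma ** 2))
        for i in range(num_clips)
    ]


def moment_weight(prior, start, end):
    """Average prior weight over the clips in [start, end], inclusive.

    Hypothetical scoring rule, assumed here only for illustration.
    """
    span = prior[start:end + 1]
    return sum(span) / len(span)


# Illustrative values only: 16 clips, glance in clip 5, sigma of 2 clips.
prior = gaussian_prior(num_clips=16, glance_idx=5, sigma=2.0)

# A candidate moment around the glance gets a higher weight than a distant one.
near = moment_weight(prior, 4, 6)
far = moment_weight(prior, 12, 14)
```

In this sketch, candidate moments that cover the glance receive larger weights than moments far from it, which is the intuition behind using a single annotated frame as weak supervision.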

Code & Models

Repositories

solicucu/d3g (PyTorch, official)

Videos

D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation · YouTube

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Video Analysis and Summarization

Methods: Contrastive Learning