Weakly-Supervised Temporal Article Grounding
Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang

TL;DR
This paper introduces a new weakly-supervised task called temporal article grounding, addressing limitations of existing video grounding methods by handling multiple semantic scales and ungroundable sentences, supported by a new dataset and a novel DualMIL method.
Contribution
It proposes WSAG as a new task, creates the YouwikiHow dataset, and develops the DualMIL model with specialized loss functions for weak supervision and hierarchical semantics.
Findings
DualMIL outperforms baseline models in experiments.
The dataset enables research on multi-scale and ungroundable sentence grounding.
Extensive ablations confirm the effectiveness of the proposed method.
Abstract
Given a long untrimmed video and natural language queries, video grounding (VG) aims to temporally localize the semantically-aligned video segments. Almost all existing VG work holds two simple but unrealistic assumptions: 1) All query sentences can be grounded in the corresponding video. 2) All query sentences for the same video are always at the same semantic scale. Unfortunately, both assumptions make today's VG models fail to work in practice. For example, in real-world multimodal assets (e.g., news articles), most of the sentences in the article cannot be grounded in their affiliated videos, and they typically have rich hierarchical relations (i.e., at different semantic scales). To this end, we propose a new challenging grounding task: Weakly-Supervised temporal Article Grounding (WSAG). Specifically, given an article and a relevant video, WSAG aims to localize all "groundable"…
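The task setup described above (an article whose sentences may or may not be groundable in the video) can be illustrated with a minimal sketch. This is not the paper's DualMIL method: it is a generic max-over-segments rule in the spirit of multiple-instance learning, and the function name `ground_sentences`, the score matrix, and the 0.5 threshold are all hypothetical, chosen only for illustration. It assumes a sentence-to-segment similarity score is already available and that every sentence has at least one candidate segment.

```python
from typing import List, Optional, Tuple


def ground_sentences(
    scores: List[List[float]],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Optional[Tuple[int, float]]]:
    """Pick the best-matching segment per sentence, or None if ungroundable.

    scores[i][j] is an assumed similarity between article sentence i and
    candidate video segment j. Each row must be non-empty.
    """
    results: List[Optional[Tuple[int, float]]] = []
    for row in scores:
        # Index of the highest-scoring segment for this sentence
        best_idx = max(range(len(row)), key=row.__getitem__)
        best_score = row[best_idx]
        # A sentence whose best score is below the cutoff is left ungrounded
        results.append((best_idx, best_score) if best_score >= threshold else None)
    return results


# Two sentences, three candidate segments (made-up scores)
example_scores = [
    [0.1, 0.9, 0.2],  # sentence 0 matches segment 1
    [0.2, 0.1, 0.3],  # sentence 1 matches nothing well
]
print(ground_sentences(example_scores))  # [(1, 0.9), None]
```

Returning None for the second sentence reflects the point made in the abstract that many article sentences have no corresponding video segment, so a grounding model needs a way to abstain rather than always returning a segment.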
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling
