Learning Sample Importance for Cross-Scenario Video Temporal Grounding
Peijun Bao, Yadong Mu

TL;DR
This paper identifies superficial biases in video temporal grounding models and introduces DebiasTLL, a method that mitigates these biases by training dual models and re-weighting data, significantly improving cross-scenario performance.
Contribution
According to the authors, this is the first work to analyze superficial biases specific to temporal grounding. It proposes a debiasing approach that trains dual models and re-weights training data to improve generalization.
Findings
DebiasTLL outperforms state-of-the-art methods in cross-scenario tests.
Models that rely on biases perform poorly on heterogeneous data.
Dual-model discrepancy effectively identifies biased samples.
Abstract
The task of temporal grounding aims to locate a video moment in an untrimmed video, given a sentence query. This paper for the first time investigates some superficial biases that are specific to the temporal grounding task, and proposes a novel targeted solution. Most alarmingly, we observe that existing temporal grounding models heavily rely on some biases (e.g., a strong preference for frequent concepts or certain temporal intervals) in the visual modality. This leads to inferior performance when the model is evaluated in a cross-scenario test setting. To this end, we propose a novel method called Debiased Temporal Language Localizer (DebiasTLL) to prevent the model from naively memorizing the biases and force it to ground the query sentence based on the true inter-modal relationship. DebiasTLL simultaneously trains two models. By our design, a large discrepancy of these two models' predictions…
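The dual-model re-weighting idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated before it says how the discrepancy is used, so the total-variation distance, the exponential mapping, the `temperature` parameter, and the choice to down-weight high-discrepancy samples are all assumptions made for illustration. Each prediction is taken to be a score distribution over a fixed set of candidate moments.

```python
import math

def discrepancy(p, q):
    """Total-variation distance between two score distributions
    over the same candidate moments (assumed measure, not from the paper)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def sample_weights(preds_a, preds_b, temperature=0.5):
    """Map per-sample discrepancy to weights with mean 1.
    Assumption: a larger discrepancy marks a more biased sample,
    which therefore receives a smaller weight."""
    raw = [math.exp(-discrepancy(p, q) / temperature)
           for p, q in zip(preds_a, preds_b)]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_loss(losses, weights):
    """Average of per-sample losses scaled by their weights."""
    return sum(l * w for l, w in zip(losses, weights)) / len(losses)

# Hypothetical predictions of the two models for two training samples.
preds_a = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
preds_b = [[0.6, 0.3, 0.1], [0.8, 0.1, 0.1]]

weights = sample_weights(preds_a, preds_b)
loss = weighted_loss([1.0, 1.0], weights)
```

In this toy input the two models nearly agree on the first sample and disagree strongly on the second, so the second sample contributes less to the weighted loss.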
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Human Pose and Action Recognition
