Unhackable Temporal Rewarding for Scalable Video MLLMs

En Yu; Kangheng Lin; Liang Zhao; Yana Wei; Zining Zhu; Haoran Wei; Jianjian Sun; Zheng Ge; Xiangyu Zhang; Jingyu Wang; Wenbing Tao

arXiv:2502.12081·cs.CV·February 18, 2025

TL;DR

This paper identifies the problem of temporal hacking in video MLLMs, introduces a theoretical framework and a new reward method to improve temporal understanding, and demonstrates significant performance gains.

Contribution

The paper establishes a theory of temporal hacking, proposes the Unhackable Temporal Rewarding (UTR) framework to mitigate it, and introduces the Temporal Perplexity (TPL) score as a metric for temporal modeling quality.

Findings

01 TPL correlates with frame activation patterns

02 UTR effectively counters temporal hacking

03 UTR improves video comprehension performance

Abstract

In the pursuit of superior video-processing MLLMs, we have encountered a perplexing paradox: the "anti-scaling law", where more data and larger models lead to worse performance. This study unmasks the culprit: "temporal hacking", a phenomenon where models shortcut by fixating on select frames, missing the full video narrative. In this work, we systematically establish a comprehensive theory of temporal hacking, defining it from a reinforcement learning perspective, introducing the Temporal Perplexity (TPL) score to assess this misalignment, and proposing the Unhackable Temporal Rewarding (UTR) framework to mitigate temporal hacking. Both theoretically and empirically, TPL proves to be a reliable indicator of temporal modeling quality, correlating strongly with frame activation patterns. Extensive experiments reveal that UTR not only counters temporal hacking but significantly…
