Open-Ended Video Game Glitch Detection with Agentic Reasoning and Temporal Grounding
Muyang Zheng, Tong Zhou, Geyang Wu, Zihao Lin, Haibo Wang, Lifu Huang

TL;DR
This paper introduces VideoGlitchBench, a comprehensive benchmark for detecting and localizing glitches in gameplay videos, and proposes GliDe, an agentic framework that enhances reasoning and temporal grounding for this task.
Contribution
The paper presents the first benchmark for open-ended video game glitch detection and introduces GliDe, a novel agentic framework with multi-perspective reasoning and temporal localization capabilities.
Findings
GliDe outperforms baseline models in glitch detection accuracy.
VideoGlitchBench contains 5,238 annotated gameplay videos from 120 games.
Current multimodal models find the task highly challenging.
Abstract
Open-ended video game glitch detection aims to identify glitches in gameplay videos, describe them in natural language, and localize when they occur. Conventional game glitch understanding tasks have largely been framed as image-level recognition or closed-form question answering. This task instead requires reasoning directly over continuous gameplay videos about game-specific dynamics such as mechanics, physics, rendering, animation, and expected state transitions, and it requires distinguishing true glitches from unusual but valid in-game events. To support this task, we introduce VideoGlitchBench, the first benchmark for open-ended video game glitch detection with temporal localization. VideoGlitchBench contains 5,238 gameplay videos from 120 games, each annotated with detailed glitch descriptions and precise temporal spans, enabling unified evaluation of semantic understanding and temporal…
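To make the temporal localization part of the task concrete, here is a minimal sketch of temporal intersection-over-union (IoU) between a predicted glitch span and an annotated one. This excerpt does not say which localization metric VideoGlitchBench uses, so temporal IoU appears here only as a common choice for span-level evaluation, not as the paper's method. The `Span` type, the `temporal_iou` name, and the use of seconds are all assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical representation: a glitch span as (start_sec, end_sec).
Span = Tuple[float, float]


def temporal_iou(pred: Span, gold: Span) -> float:
    """Return the intersection-over-union of two time intervals, in [0, 1]."""
    pred_start, pred_end = pred
    gold_start, gold_end = gold
    if pred_end < pred_start or gold_end < gold_start:
        raise ValueError("span end must not precede span start")

    # Overlap is zero when the intervals are disjoint.
    intersection = max(0.0, min(pred_end, gold_end) - max(pred_start, gold_start))
    union = (pred_end - pred_start) + (gold_end - gold_start) - intersection

    # Two zero-length spans give an empty union; treat that as no overlap.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # A prediction covering 2 s to 6 s against an annotation covering 4 s to 8 s.
    print(temporal_iou((2.0, 6.0), (4.0, 8.0)))
```

In this example the two spans share 2 seconds out of a combined 6, so the score is one third. A benchmark would typically also pair such a score with a separate judgment of whether the generated glitch description matches the annotation; how the two are combined is not specified in this excerpt.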
