ViBR: Automated Bug Replay from Video-based Reports using Vision-Language Models

Sidong Feng; Dingbang Wang; Nikola Tomic; Tingting Yu; Aldeida Aleti; Chunyang Chen

arXiv:2604.19905 · cs.SE · April 23, 2026

TL;DR

ViBR is an automated system that uses vision-language models to reproduce bugs from GUI videos, improving accuracy over previous methods and reducing setup complexity.

Contribution

ViBR introduces a fully automated approach that pairs CLIP-based embedding similarity (for action boundary segmentation) with VLMs (for region-aware GUI state comparison and guided replay), eliminating the need for app-specific instrumentation.

Findings

01. Successfully reproduces 72% of bug recordings

02. Outperforms state-of-the-art baselines

03. Significantly improves bug reproduction accuracy

Abstract

Bug reports play a critical role in software maintenance by helping users convey encountered issues to developers. Recently, GUI screen capture videos have gained popularity as a bug reporting artifact due to their ease of use and ability to retain rich contextual information. However, automatically reproducing bugs from such recordings remains a significant challenge. Existing methods often rely on fragile image-processing heuristics, explicit touch indicators, or pre-constructed UI transition graphs, which require non-trivial instrumentation and app-specific setup. This paper presents ViBR, a lightweight and fully automated approach that reproduces bugs directly from GUI recordings. Specifically, ViBR combines CLIP-based embedding similarity for action boundary segmentation with Vision-Language Models (VLMs) for region-aware GUI state comparison and guided bug replay. Experimental…
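
The abstract mentions embedding similarity for action boundary segmentation without giving details. As a minimal sketch of that general idea (not the authors' implementation), the Python below marks a boundary wherever the cosine similarity between consecutive frame embeddings falls below a threshold. The function names, the threshold value of 0.9, and the toy two-dimensional vectors are all assumptions for illustration; in practice the embeddings would come from a CLIP image encoder.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_action_boundaries(
    embeddings: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical cutoff, not taken from the paper
) -> List[int]:
    """Return indices i where frame i differs sharply from frame i - 1.

    A low similarity between neighbouring frames is treated as a sign
    that the GUI changed, i.e. a candidate action boundary.
    """
    boundaries = []
    for i in range(1, len(embeddings)):
        if cosine_similarity(embeddings[i - 1], embeddings[i]) < threshold:
            boundaries.append(i)
    return boundaries


# Toy stand-ins for per-frame embeddings: frames 0-1 look alike,
# then the screen changes at frame 2 and stays stable.
frames = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.0, 1.0]]
print(find_action_boundaries(frames))
```

A fixed threshold is the simplest possible choice; a real pipeline would likely need smoothing or an adaptive cutoff to cope with animations and scrolling.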
