Cerberus: Multi-Agent Reasoning and Coverage-Guided Exploration for Static Detection of Runtime Errors

Hridya Dhulipala; Xiaokai Rong; Tien N. Nguyen

arXiv:2512.21431·cs.SE·December 29, 2025

Open Access

TL;DR

Cerberus is a novel execution-free testing framework that uses large language models to predict code coverage and detect runtime errors in code snippets, improving error detection efficiency.

Contribution

It introduces a two-phase feedback loop leveraging LLMs for coverage-guided testing without executing code, enhancing runtime error detection in incomplete snippets.

Findings

01. Outperforms traditional testing methods in error detection.

02. Generates high-coverage test cases efficiently.

03. Discovers more runtime errors in code snippets.

Abstract

In several software development scenarios, it is desirable to detect runtime errors and exceptions in code snippets without actually executing them. A typical example is detecting runtime exceptions in online code snippets before integrating them into a codebase. In this paper, we propose Cerberus, a novel predictive, execution-free, coverage-guided testing framework. Cerberus uses LLMs to generate inputs that trigger runtime errors and to perform code coverage prediction and error detection without code execution. With a two-phase feedback loop, Cerberus first aims both to increase code coverage and to detect runtime errors, then shifts to focus only on detecting runtime errors once coverage reaches 100% or its maximum, enabling it to perform better than prompting the LLMs for both purposes. Our empirical evaluation demonstrates that Cerberus performs better than conventional and…
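The two-phase feedback loop described in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `two_phase_loop`, `Prediction`, `generate`, and `predict` are hypothetical, the LLM calls are replaced by plain callables, and "coverage reaches its maximum" is approximated by a `patience` counter that trips when predicted coverage stops growing.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Prediction:
    """What the (hypothetical) predictor reports for one test input."""
    covered_lines: Set[int]       # lines predicted to execute
    error: Optional[str] = None   # predicted runtime error, or None


# Hypothetical interfaces; in the paper these roles are played by LLMs.
Generate = Callable[[str, Set[int], bool], str]  # (code, uncovered, errors_only) -> input
Predict = Callable[[str, str], Prediction]       # (code, input) -> prediction


def two_phase_loop(
    code: str,
    all_lines: Set[int],
    generate: Generate,
    predict: Predict,
    max_iters: int = 10,
    patience: int = 2,  # assumption: plateau length treated as "maximum coverage"
) -> Tuple[Set[int], List[Tuple[str, str]]]:
    covered: Set[int] = set()
    errors: List[Tuple[str, str]] = []
    stalled = 0
    errors_only = False  # False = phase 1 (coverage + errors), True = phase 2

    for _ in range(max_iters):
        test_input = generate(code, all_lines - covered, errors_only)
        pred = predict(code, test_input)  # no code is executed here

        if pred.error is not None:
            errors.append((test_input, pred.error))
        if errors_only:
            continue  # phase 2 ignores coverage feedback

        gained = (pred.covered_lines & all_lines) - covered
        covered |= gained
        stalled = 0 if gained else stalled + 1
        # Switch phases at 100% predicted coverage or when it plateaus.
        if covered == all_lines or stalled >= patience:
            errors_only = True

    return covered, errors


# Toy stand-ins for the LLM-backed components, for illustration only.
def stub_generate(code: str, uncovered: Set[int], errors_only: bool) -> str:
    return "x=0" if errors_only else "x=1"


def stub_predict(code: str, test_input: str) -> Prediction:
    if test_input == "x=0":
        return Prediction({1, 3}, "ZeroDivisionError")
    return Prediction({1, 2, 3})


covered, errors = two_phase_loop(
    "snippet", {1, 2, 3}, stub_generate, stub_predict, max_iters=3
)
```

With the toy stubs, the first iteration reaches full predicted coverage, so the remaining iterations run in the error-only phase. A real setup would replace the stubs with prompts to a model, and the phase-switch condition would follow whatever criterion the paper actually specifies.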

Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Software System Performance and Reliability