FlyCatcher: Neural Inference of Runtime Checkers from Tests

Beatriz Souza, Chang Lou, Suman Nath, and Michael Pradel

arXiv:2604.22028 · cs.SE · April 27, 2026

TL;DR

FlyCatcher automatically derives runtime checkers from existing tests using LLMs, static analysis, and validation, significantly improving silent failure detection in complex software systems.

Contribution

FlyCatcher introduces an automated method for generating stateful runtime checkers from tests, improving error detection over previous approaches.

Findings

01. Inferred 334 checkers from 400 tests across four systems.
02. Validated 300 of the inferred checkers as correct via cross-validation.
03. Detected 5.2 times more errors than a state-of-the-art approach.

Abstract

Complex software systems often suffer from silent failures, i.e., violations of the intended semantics that do not cause explicit errors. A promising approach to detect such errors is to use system-specific runtime checkers that monitor the execution of a system and check for violations of the intended semantics. However, writing such checkers for a given software system is challenging and time-consuming, and hence, rarely done in practice. This work presents FlyCatcher, an automated approach to derive runtime checkers from existing tests, i.e., from a resource available for most software systems. The critical challenge of such an approach is to generalize the behavioral properties encoded in a test case to arbitrary executions of a system. FlyCatcher addresses this challenge through a combination of LLM-based synthesis, static analysis, and dynamic validation, which infers a checker…
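To make the idea of generalizing a test into a runtime checker concrete, here is a minimal hand-written sketch. It is not FlyCatcher's output or implementation: the `KVStore` class, its injected bug, the `PutGetChecker` wrapper, and `run_workload` are all hypothetical names assumed for illustration. The test checks one concrete key and value, while the checker tracks state across calls and checks the same property ("a get returns the most recent put") on arbitrary executions.

```python
class KVStore:
    """Hypothetical system under test: a tiny in-memory key-value store."""

    def __init__(self, buggy=False):
        self._data = {}
        self._buggy = buggy

    def put(self, key, value):
        # Injected silent failure (assumption for the demo): updates to
        # existing keys are dropped without raising any error.
        if self._buggy and key in self._data:
            return
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def test_put_then_get():
    """An ordinary test: asserts the property for one concrete input."""
    store = KVStore()
    store.put("a", 1)
    assert store.get("a") == 1


class PutGetChecker:
    """Hand-written stateful checker generalizing the test's property:
    a get must return the value of the most recent put for that key."""

    def __init__(self, store):
        self._store = store
        self._shadow = {}       # checker state: last value written per key
        self.violations = []    # recorded as (key, expected, actual)

    def put(self, key, value):
        self._store.put(key, value)
        self._shadow[key] = value

    def get(self, key):
        actual = self._store.get(key)
        if key in self._shadow and actual != self._shadow[key]:
            self.violations.append((key, self._shadow[key], actual))
        return actual


def run_workload(store):
    """Drive a small workload through the checker and return violations."""
    checked = PutGetChecker(store)
    checked.put("a", 1)
    checked.put("a", 2)   # overwrite, which the original test never exercises
    checked.get("a")
    return checked.violations
```

In this sketch, the overwrite goes unreported by the buggy store itself, yet the checker records the mismatch because it remembers what was last written. Wrapping the store is only one way to observe calls; a real monitor could equally hook into the system through instrumentation.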
