FlyCatcher: Neural Inference of Runtime Checkers from Tests
Beatriz Souza, Chang Lou, Suman Nath, and Michael Pradel

TL;DR
FlyCatcher automatically derives runtime checkers from existing tests using LLMs, static analysis, and validation, significantly improving silent failure detection in complex software systems.
Contribution
FlyCatcher introduces an automated method for generating stateful runtime checkers from existing tests, and these checkers detect more errors than previous approaches.
Findings
Inferred 334 checkers from 400 tests across four systems.
300 checkers were validated as correct via cross-validation.
Detected 5.2 times more errors than a state-of-the-art approach.
Abstract
Complex software systems often suffer from silent failures, i.e., violations of the intended semantics that do not cause explicit errors. A promising approach to detect such errors is to use system-specific runtime checkers that monitor the execution of a system and check for violations of the intended semantics. However, writing such checkers for a given software system is challenging and time-consuming, and hence, rarely done in practice. This work presents FlyCatcher, an automated approach to derive runtime checkers from existing tests, i.e., from a resource available for most software systems. The critical challenge of such an approach is to generalize the behavioral properties encoded in a test case to arbitrary executions of a system. FlyCatcher addresses this challenge through a combination of LLM-based synthesis, static analysis, and dynamic validation, which infers a checker…
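To make the idea of generalizing a test into a runtime checker concrete, here is a minimal hand-written sketch in Python. It is not FlyCatcher's output or implementation. The toy `KVStore`, the `BuggyKVStore` variant, the `PutGetChecker` wrapper, and `run_workload` are all hypothetical names introduced only for illustration. The test checks one concrete put/get pair. The checker tracks shadow state so that the same property can be checked on arbitrary executions.

```python
class KVStore:
    """Toy system under test (hypothetical, not taken from the paper)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class BuggyKVStore(KVStore):
    """Silently drops writes for keys starting with 'tmp'; raises no error."""

    def put(self, key, value):
        if not key.startswith("tmp"):
            super().put(key, value)


def test_put_then_get():
    """An existing test: it encodes the property for one concrete input."""
    store = KVStore()
    store.put("a", 1)
    assert store.get("a") == 1


class PutGetChecker:
    """Stateful checker generalizing the test above:
    every get(key) must return the value of the latest put(key, value)."""

    def __init__(self, store):
        self._store = store
        self._expected = {}   # shadow state: latest value written per key
        self.violations = []  # (key, expected, actual) tuples

    def put(self, key, value):
        self._store.put(key, value)
        self._expected[key] = value

    def get(self, key):
        actual = self._store.get(key)
        # Only check keys this checker has seen written.
        if key in self._expected and actual != self._expected[key]:
            self.violations.append((key, self._expected[key], actual))
        return actual


def run_workload(store):
    """Run a small workload through the checker and return its violations."""
    checker = PutGetChecker(store)
    checker.put("a", 1)
    checker.put("tmp1", 2)
    checker.get("a")
    checker.get("tmp1")
    return checker.violations
```

On the correct store the workload reports no violations. On `BuggyKVStore` the dropped write surfaces as a violation even though no exception is raised, which is the kind of silent failure the abstract describes.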
