Testing Database Systems with Large Language Model Synthesized Fragments
Suyang Zhong, Manuel Rigger

TL;DR
This paper introduces ShQveL, a method that augments existing SQL test-case generators with SQL fragments synthesized by Large Language Models (LLMs), leading to the discovery of previously unknown bugs in widely used DBMSs.
Contribution
ShQveL systematically incorporates LLM-generated SQL fragments into existing test-case generators, expanding feature coverage and improving bug detection capabilities.
Findings
Discovered 55 new bugs across 5 DBMSs
50 bugs were fixed shortly after reporting
Enhanced SQL feature coverage in testing process
Abstract
Various automated testing approaches have been proposed for Database Management Systems (DBMSs). Many such approaches generate pairs of equivalent queries to identify bugs that cause DBMSs to compute incorrect results, and have found hundreds of bugs in mature, widely used DBMSs. Most of these approaches are based on manually written SQL generators; however, their bug-finding capabilities remain constrained by the limited set of SQL features supported by the generators. In this work, we propose ShQveL, an approach that augments existing SQL test-case generators by leveraging Large Language Models (LLMs) to synthesize SQL fragments. Our key idea is to systematically incorporate SQL features gained through automated interactions with LLMs into the SQL generators, increasing the features covered while efficiently generating test cases. Specifically, ShQveL uses SQL sketches -- SQL…
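The idea of plugging synthesized fragments into a sketch and then checking equivalent queries can be illustrated with a minimal sketch. This is not ShQveL's implementation: the hard-coded `CANDIDATE_FRAGMENTS` list is a hypothetical stand-in for fragments an LLM might return, the `SKETCH` template and all function names are assumptions made for illustration, SQLite is used only because it ships with Python, and the partitioning check is just one example of an equivalence-based oracle.

```python
import sqlite3

# Hypothetical stand-in for LLM-synthesized fragments (hard-coded here;
# no LLM is called in this illustration).
CANDIDATE_FRAGMENTS = [
    "abs(a) > 1",
    "a BETWEEN 0 AND 2",
    "coalesce(b, 'x') LIKE 'x%'",
    "no_such_function(a)",  # invalid on purpose: rejected during validation
]

# Assumed sketch format: a query with one hole to be filled by a fragment.
SKETCH = "SELECT COUNT(*) FROM t WHERE {hole}"


def make_db():
    """Create a small in-memory database that includes NULL values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(1, "x1"), (2, None), (None, "y"), (-3, "xz")],
    )
    return conn


def validate(conn, fragments):
    """Keep only fragments that the target DBMS accepts when filled into the sketch."""
    valid = []
    for frag in fragments:
        try:
            conn.execute(SKETCH.format(hole=frag)).fetchone()
            valid.append(frag)
        except sqlite3.Error:
            pass  # unsupported or malformed fragment: discard it
    return valid


def count(conn, predicate):
    """Run the sketch with the given predicate and return the row count."""
    return conn.execute(SKETCH.format(hole=predicate)).fetchone()[0]


def partition_check(conn, frag):
    """Rows where the predicate is TRUE, FALSE, or NULL must add up to all rows."""
    total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    parts = (
        count(conn, f"({frag})")
        + count(conn, f"NOT ({frag})")
        + count(conn, f"({frag}) IS NULL")
    )
    return total == parts


if __name__ == "__main__":
    conn = make_db()
    for frag in validate(conn, CANDIDATE_FRAGMENTS):
        status = "ok" if partition_check(conn, frag) else "POSSIBLE BUG"
        print(f"{frag!r}: {status}")
```

Validating each fragment against the target system before reuse reflects the general motivation stated in the abstract: once a fragment is known to be accepted, it can be reused in many generated test cases without further LLM calls. A mismatch in `partition_check` would indicate that the three partitioned queries disagree with the unpartitioned one, which is the kind of incorrect-result bug that equivalent-query approaches target.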
Taxonomy
Topics: Service-Oriented Architecture and Web Services · Advanced Database Systems and Queries · Data Quality and Management
