Explaining Wrong Queries Using Small Examples
Zhengjie Miao, Sudeepa Roy, Jun Yang

TL;DR
This paper introduces algorithms that, given a database instance on which two SQL queries return different results, find the smallest counterexample explaining why the queries differ. The goal is to aid debugging and understanding of query inequivalence; the approach is evaluated on student and benchmark queries.
Contribution
The paper presents novel algorithms for identifying the smallest counterexamples for various classes of SQL queries, including an efficient provenance-based method for complex queries.
Findings
Algorithms effectively find minimal counterexamples in practical scenarios.
Provenance-based approach scales well with complex queries.
User study shows the tool helps students understand query errors.
Abstract
For testing the correctness of SQL queries, e.g., evaluating student submissions in a database course, a standard practice is to execute the query in question on some test database instance and compare its result with that of the correct query. Given two queries Q1 and Q2, we say that a database instance D is a counterexample (for Q1 and Q2) if Q1(D) differs from Q2(D); such a counterexample can serve as an explanation of why Q1 and Q2 are not equivalent. While the test database instance may serve as a counterexample, it may be too large or complex to read and understand where the inequivalence comes from. Therefore, in this paper, given a known counterexample D for Q1 and Q2, we aim to find the smallest counterexample D' ⊆ D where Q1(D') ≠ Q2(D'). The problem in general is NP-hard. We give a suite of algorithms for finding the smallest…
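The problem statement in the abstract can be made concrete with a small brute-force sketch. This is not one of the paper's algorithms: it simply enumerates sub-instances of a known counterexample in order of increasing size and returns the first one on which the two queries disagree, which takes exponential time in the worst case. The relation names, the sample tuples, and the helper names `q1`, `q2`, and `smallest_counterexample` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical test instance: each fact is (relation_name, tuple).
DB = [
    ("Student", ("Mary", "CS")),
    ("Student", ("John", "ECON")),
    ("Student", ("Jesse", "CS")),
    ("Registration", ("Mary", "216")),
    ("Registration", ("John", "230")),
]


def _split(facts):
    """Separate a collection of facts into the two illustrative relations."""
    students = [t for rel, t in facts if rel == "Student"]
    regs = [t for rel, t in facts if rel == "Registration"]
    return students, regs


def q1(facts):
    """Reference query: names of CS students registered for some course."""
    students, regs = _split(facts)
    registered = {name for name, _course in regs}
    return {name for name, major in students
            if major == "CS" and name in registered}


def q2(facts):
    """Wrong query: forgets the major filter."""
    students, regs = _split(facts)
    registered = {name for name, _course in regs}
    return {name for name, _major in students if name in registered}


def smallest_counterexample(db, query_a, query_b):
    """Return a smallest subset of db on which the queries disagree.

    Brute force over subsets by increasing size, so the first hit has
    minimum cardinality. Returns None if the queries agree on every subset.
    """
    for size in range(len(db) + 1):
        for subset in combinations(db, size):
            if query_a(subset) != query_b(subset):
                return list(subset)
    return None


result = smallest_counterexample(DB, q1, q2)
print(result)
```

On this toy instance the search stops at a two-fact sub-instance: the ECON student John together with his registration, which the wrong query returns and the reference query does not. Reading two facts is far easier than reading the full test database, which is the motivation the abstract gives.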
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Distributed and Parallel Computing Systems
