FuncEvalGMN: Evaluating Functional Correctness of SQL via Graph Matching Network

Yi Zhan; Yang Sun; Han Weng; Longjie Cui; Guifeng Wang; Jiajun Xie; Yu Tian; Xiaoming Yin; Boyi Liu; Dongchi Huang

arXiv:2407.14530 · cs.DB · July 23, 2024

TL;DR

This paper introduces FuncEvalGMN, a graph-based method for evaluating the functional correctness of generated SQL. It addresses limitations of existing matching-based and execution-based metrics by comparing queries with a graph matching network built on a GNN.

Contribution

It presents a novel graph matching framework with a GNN for SQL correctness evaluation, along with a new dataset for training and testing.

Findings

1. Accurately predicts SQL functional correctness.

2. Outperforms traditional matching and execution-based metrics.

3. Provides a new benchmark dataset for SQL evaluation.

Abstract

In this paper, we propose a novel graph-based methodology to evaluate the functional correctness of SQL generation. Conventional metrics for assessing SQL code generation, such as matching-based and execution-based methods (e.g., exact set match and execution accuracy), are subject to two primary limitations. Firstly, the former fails to effectively assess functional correctness, as different SQL queries may possess identical functionalities. Secondly, the latter is susceptible to producing false positive samples in evaluations. Our proposed evaluation method, FuncEvalGMN, does not depend on the sufficient preparation of the test data, and it enables precise testing of the functional correctness of the code. Firstly, we parse SQL using a relational operator tree (ROT) called Relnode, which contains rich semantic information from the perspective of logical…
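The two limitations named in the abstract can be illustrated with a small sketch. This is not the paper's method or dataset: the `emp` table, the three queries, and the helpers `make_db` and `run` are hypothetical, and plain string equality stands in as a crude proxy for a matching-based metric. It shows a matching check rejecting an equivalent query, and an execution check accepting a wrong query when the test data is too sparse.

```python
import sqlite3

def make_db(rows):
    """Build an in-memory database with a hypothetical emp table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
    return conn

def run(conn, sql):
    """Execute a query and return its rows sorted, so row order is ignored."""
    return sorted(conn.execute(sql).fetchall())

# Reference query and a differently written query with the same functionality.
gold = "SELECT name FROM emp WHERE salary > 50 AND dept = 'eng'"
equiv = "SELECT name FROM emp WHERE dept = 'eng' AND NOT salary <= 50"
# Functionally different query: the dept filter is missing.
wrong = "SELECT name FROM emp WHERE salary > 50"

# Sparse test data: no row distinguishes gold from wrong.
sparse_db = make_db([("ann", "eng", 60), ("bob", "eng", 40)])
# Richer test data: the 'ops' row exposes the missing filter.
rich_db = make_db([("ann", "eng", 60), ("bob", "eng", 40), ("cat", "ops", 70)])

# Limitation 1: a string-level match rejects an equivalent query.
string_match = gold == equiv

# Limitation 2: execution comparison yields a false positive on sparse data.
exec_match_sparse = run(sparse_db, gold) == run(sparse_db, wrong)
exec_match_rich = run(rich_db, gold) == run(rich_db, wrong)

print("string match (gold vs equiv):", string_match)
print("execution match on sparse data (gold vs wrong):", exec_match_sparse)
print("execution match on rich data (gold vs wrong):", exec_match_rich)
```

In this sketch the execution check only separates `gold` from `wrong` once the database contains a row outside the `eng` department, which is the kind of dependence on test data preparation that the abstract says FuncEvalGMN avoids.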

Taxonomy

Topics: Data Quality and Management · Semantic Web and Ontologies · Scientific Computing and Data Management

Methods: Sparse Evolutionary Training · Focus