BIS: NL2SQL Service Evaluation Benchmark for Business Intelligence Scenarios
Bora Caglayan, Mingxue Wang, John D. Kelleher, Shen Fei, Gui Tong, Jiandong Ding, Puchao Zhang

TL;DR
This paper introduces BIS, a new NL2SQL benchmark tailored for business intelligence scenarios, addressing limitations of existing benchmarks by focusing on common BI questions and proposing new evaluation metrics.
Contribution
The paper presents a BI-specific NL2SQL benchmark with question categories and two novel semantic similarity metrics for better assessment in BI applications.
Findings
Benchmark reflects typical BI questions
Proposes two semantic similarity evaluation metrics
Addresses gaps in existing NL2SQL benchmarks
Abstract
NL2SQL (Natural Language to Structured Query Language) transformation has seen wide adoption in Business Intelligence (BI) applications in recent years. However, existing NL2SQL benchmarks are not suitable for production BI scenarios, as they are not designed for common business intelligence questions. To address this gap, we have developed a new benchmark focused on typical NL questions in industrial BI scenarios. We discuss the challenges of constructing a BI-focused benchmark and the shortcomings of existing benchmarks. Additionally, we introduce question categories in our benchmark that reflect common BI inquiries. Lastly, we propose two novel semantic similarity evaluation metrics for assessing NL2SQL capabilities in BI applications and services.
Taxonomy
Topics: Big Data and Business Intelligence
