A System and Benchmark for LLM-based Q&A on Heterogeneous Data
Achille Fokoue, Srideepika Jayaraman, Elham Khabiri, Jeffrey O. Kephart, Yingjie Li, Dhruv Shah, Youssef Drissi, Fenno F. Heath III, Anu Bhamidipaty, Fateh A. Tipu, Robert J. Baseman

TL;DR
This paper introduces siwarex, a platform that enables natural language querying across heterogeneous data sources like databases and APIs, addressing a key challenge in industrial data access.
Contribution
The paper presents siwarex, a system for querying heterogeneous data sources, together with an extension of the Spider benchmark that adds API data, to facilitate research in LLM-based data querying.
Findings
siwarex effectively manages data source heterogeneity
Extended Spider benchmark with API data is publicly available
Demonstrates improved natural language access to diverse data sources
Abstract
In many industrial settings, users wish to ask questions whose answers may be found in structured data sources such as spreadsheets, databases, APIs, or combinations thereof. Often, the user doesn't know how to identify or access the right data source. This problem is compounded even further if multiple (and potentially siloed) data sources must be assembled to derive the answer. Recently, various Text-to-SQL applications that leverage Large Language Models (LLMs) have addressed some of these problems by enabling users to ask questions in natural language. However, these applications remain impractical in realistic industrial settings because they fail to cope with the data source heterogeneity that typifies such environments. In this paper, we address heterogeneity by introducing the siwarex platform, which enables seamless natural language access to both databases and APIs. To…
Taxonomy
Topics: Expert finding and Q&A systems
