TL;DR
TiInsight is an SQL-based system leveraging large language models to automate cross-domain exploratory data analysis with a user-friendly interface and visualization capabilities.
Contribution
It introduces a comprehensive pipeline combining natural language queries, hierarchical data context, question clarification, text-to-SQL, and visualization, deployed in a real-world environment.
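The stage ordering described above can be illustrated with a minimal sketch. It is not TiInsight's implementation: every function and class name below (`build_context`, `clarify`, `text_to_sql`, `to_chart`, `explore`, `ChartSpec`) is hypothetical, the LLM-backed stages are replaced by hard-coded stand-ins, and an in-memory SQLite table named `orders` is assumed in place of a real database.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ChartSpec:
    """Hypothetical chart description handed to a rendering layer."""
    kind: str
    x: str
    y: str
    rows: list


def build_context(conn):
    """Stand-in for data context generation: map each table to its column names."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    return {t: [c[1] for c in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}


def clarify(question, context):
    """Stand-in for clarification and decomposition: returns a single sub-question."""
    return [question.strip().rstrip("?")]


def text_to_sql(sub_question, context):
    """Stand-in for text-to-SQL: a fixed query instead of an LLM call (assumption)."""
    return ("SELECT region, SUM(amount) AS total "
            "FROM orders GROUP BY region ORDER BY region")


def to_chart(columns, rows):
    """Stand-in for visualization: wrap the result set in a bar-chart spec."""
    return ChartSpec(kind="bar", x=columns[0], y=columns[1], rows=rows)


def explore(conn, question):
    """Run the stages in order: context, clarification, SQL, chart."""
    context = build_context(conn)
    charts = []
    for sub in clarify(question, context):
        cur = conn.execute(text_to_sql(sub, context))
        columns = [d[0] for d in cur.description]
        charts.append(to_chart(columns, cur.fetchall()))
    return charts


# Toy data, invented purely for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("east", 10.0), ("west", 5.0), ("east", 2.5)])
charts = explore(conn, "What is the total order amount per region?")
```

In a real system each stand-in would presumably call a model and consult the generated context; the sketch only shows how the output of one stage feeds the next.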
Findings
Successfully deployed in PingCAP's production environment.
Demonstrated effective cross-domain data exploration with representative datasets.
Provides a user-friendly GUI for natural language data analysis.
Abstract
SQL-based exploratory data analysis has garnered significant attention within the data analysis community. The emergence of large language models (LLMs) has facilitated a paradigm shift from manual to automated data exploration. However, existing methods generally lack the ability to perform cross-domain analysis, and the capabilities of LLMs remain insufficiently explored. This paper presents TiInsight, an SQL-based automated cross-domain exploratory data analysis system. First, TiInsight offers a user-friendly GUI that enables users to explore data using natural language queries. Second, TiInsight offers a robust cross-domain exploratory data analysis pipeline: hierarchical data context (i.e., HDC) generation, question clarification and decomposition, text-to-SQL (i.e., TiSQL), and data visualization (i.e., TiChart). Third, we have implemented and deployed TiInsight in the production…
