Automated Question Generation on Tabular Data for Conversational Data Exploration
Ritwik Chaudhuri, Rajmohan C, Kirushikesh DB, Arvind Agarwal

TL;DR
This paper presents a system that uses a fine-tuned language model to automatically generate natural language questions about a dataset, so that non-technical users can explore the data through conversational interaction.
Contribution
It introduces a novel approach combining interestingness measures and a fine-tuned T5 model to generate and rank questions for dataset exploration in a conversational manner.
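To make the idea of ranking by interestingness concrete, here is a minimal sketch in Python. It is not the authors' implementation. The scoring rule (how far a slice's mean deviates from the overall mean, in units of the overall standard deviation), the function names `rank_slices` and `to_question`, and the toy `rows` data are all assumptions made for illustration. The paper generates question text with a fine-tuned T5 model, and the fixed template below only stands in for that step.

```python
from statistics import mean, pstdev


def rank_slices(rows, category_col, numeric_col):
    """Rank slices (category_col == value) by a hypothetical interestingness score.

    Assumed score: |slice mean - overall mean| / overall standard deviation.
    """
    values = [r[numeric_col] for r in rows]
    overall_mean = mean(values)
    overall_std = pstdev(values) or 1.0  # guard against division by zero

    # Group the numeric values by the value of the categorical column.
    groups = {}
    for r in rows:
        groups.setdefault(r[category_col], []).append(r[numeric_col])

    scored = [
        (abs(mean(vals) - overall_mean) / overall_std, key)
        for key, vals in groups.items()
    ]
    # Highest score first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def to_question(category_col, value, numeric_col):
    """Template stand-in for the model-generated question (assumption)."""
    return (
        f"How does the average {numeric_col} for {category_col} = {value} "
        f"compare with the rest of the data?"
    )


# Toy data, invented for this sketch.
rows = [
    {"region": "North", "sales": 10},
    {"region": "North", "sales": 12},
    {"region": "South", "sales": 11},
    {"region": "South", "sales": 13},
    {"region": "West", "sales": 40},
    {"region": "West", "sales": 44},
]

ranked = rank_slices(rows, "region", "sales")
questions = [to_question("region", value, "sales") for _, value in ranked]
```

In this toy data the West slice deviates most from the overall mean, so its question is recommended first. A real system would score many column combinations and could use other interestingness measures.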
Findings
Effective question generation for datasets demonstrated.
Improved user engagement in exploratory data analysis.
System tested on real datasets showing practical utility.
Abstract
Exploratory data analysis (EDA) is an essential step in analyzing a dataset to derive insights. Several EDA techniques have been explored in the literature, many of which rely on visualizations through various plots. However, such plots are not easy for a non-technical user to interpret, and producing appropriate visualizations is difficult when a dataset has a large number of columns. A few other works present interesting slices of the data, but it remains difficult for the user to draw relevant insights from them. Recently, conversational data exploration has been gaining traction among non-technical users, because it lets them explore a dataset without deep technical knowledge about the data. To this end, we propose a system that recommends interesting questions in natural language based on relevant slices of a dataset in a conversational setting. Specifically, given a…
Taxonomy
Topics: Speech and dialogue systems · Advanced Text Analysis Techniques · Expert finding and Q&A systems
Methods: Sparse Evolutionary Training
