Urania: Visualizing Data Analysis Pipelines for Natural Language-Based Data Exploration
Yi Guo, Nan Cao, Xiaoyu Qi, Haoyang Li, Danqing Shi, Jing Zhang, Qing Chen, and Daniel Weiskopf

TL;DR
Urania is an interactive system that visualizes data analysis pipelines generated from natural language questions, improving interpretability and exploration in data analysis.
Contribution
It introduces a novel data-aware question decomposition algorithm and visualizes analysis pipelines to enhance data exploration and understanding.
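To make the idea of an analysis pipeline concrete, here is a minimal sketch in which a question has already been split by hand into sub-questions, each paired with one data operation. It is not the paper's decomposition algorithm; the `Step` class, the helper functions, and the toy `cars` table are all hypothetical names chosen for illustration. Every intermediate table is kept, since those are the stages a pipeline visualization could display.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Table = List[Row]


@dataclass
class Step:
    """Hypothetical pipeline step: a sub-question and the operation answering it."""
    sub_question: str
    operation: Callable[[Table], Table]


def run_pipeline(table: Table, steps: List[Step]) -> List[Tuple[str, Table]]:
    """Apply each step in order and keep every intermediate table."""
    trace = []
    current = table
    for step in steps:
        current = step.operation(current)
        trace.append((step.sub_question, current))
    return trace


def filter_rows(column: str, value: object) -> Callable[[Table], Table]:
    """Keep only rows whose `column` equals `value`."""
    return lambda t: [r for r in t if r[column] == value]


def mean_by(group_col: str, value_col: str) -> Callable[[Table], Table]:
    """Group rows by `group_col` and average `value_col` within each group."""
    def op(t: Table) -> Table:
        groups: Dict[object, List[float]] = {}
        for r in t:
            groups.setdefault(r[group_col], []).append(r[value_col])
        return [{group_col: k, value_col: sum(v) / len(v)} for k, v in groups.items()]
    return op


def top_k(column: str, k: int) -> Callable[[Table], Table]:
    """Return the k rows with the largest value in `column`."""
    return lambda t: sorted(t, key=lambda r: r[column], reverse=True)[:k]


# Toy data (made up for this sketch).
cars = [
    {"brand": "A", "year": 2020, "price": 30.0},
    {"brand": "A", "year": 2020, "price": 50.0},
    {"brand": "B", "year": 2020, "price": 45.0},
    {"brand": "B", "year": 2019, "price": 90.0},
]

# Manual decomposition of: "Which brand had the highest average price in 2020?"
steps = [
    Step("Which cars are from 2020?", filter_rows("year", 2020)),
    Step("What is the average price per brand?", mean_by("brand", "price")),
    Step("Which brand has the highest average?", top_k("price", 1)),
]

trace = run_pipeline(cars, steps)
for sub_question, result in trace:
    print(sub_question, "->", result)
```

Keeping the full `trace` rather than only the final answer is the design choice that matters here: each entry pairs a sub-question with the table it produced, which is the kind of intermediate state a pipeline view can expose to the user.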
Findings
The decomposition algorithm outperforms state-of-the-art methods in accuracy.
Urania improves dataset exploration effectiveness.
Expert feedback supports system usability and interpretability.
Abstract
Exploratory Data Analysis (EDA) is an essential yet tedious process for examining a new dataset. To facilitate it, natural language interfaces (NLIs) can help people intuitively explore the dataset via data-oriented questions. However, existing NLIs primarily focus on providing accurate answers to questions, with few offering explanations or presentations of the data analysis pipeline used to uncover the answer. Such presentations are crucial for EDA as they enhance the interpretability and reliability of the answer, while also helping users understand the analysis process and derive insights. To fill this gap, we introduce Urania, a natural language interactive system that is able to visualize the data analysis pipelines used to resolve input questions. It integrates a natural language interface that allows users to explore data via questions, and a novel data-aware question…
Taxonomy
Topics: Data Visualization and Analytics · Topic Modeling · Computational Physics and Python Applications
