DataChat: Prototyping a Conversational Agent for Dataset Search and Visualization
Lizhou Fan, Sara Lafia, Lingyao Li, Fangyuan Yang, Libby Hemphill

TL;DR
DataChat is a chatbot-based system that uses a graph database and large language models to improve dataset search and exploration, addressing user challenges in understanding and evaluating data for reuse.
Contribution
The paper introduces DataChat, a novel conversational agent that enhances dataset search and understanding through AI-driven interactions and graph database integration.
Findings
Improves dataset discovery and exploration experience.
Facilitates understanding of dataset context and relevance.
Supports data reuse and decision-making processes.
Abstract
Data users need relevant context and research expertise to effectively search for and identify relevant datasets. Leading data providers, such as the Inter-university Consortium for Political and Social Research (ICPSR), offer standardized metadata and search tools to support data search. Metadata standards emphasize the machine-readability of data and its documentation. There are opportunities to enhance dataset search by improving users' ability to learn about, and make sense of, information about data. Prior research has shown that context and expertise are two main barriers users face in effectively searching for, evaluating, and deciding whether to reuse data. In this paper, we propose a novel chatbot-based search system, DataChat, that leverages a graph database and a large language model to provide novel ways for users to interact with and search for research data. DataChat…
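As a rough illustration of the idea of searching dataset metadata stored as a graph, the sketch below links datasets to subject terms and matches words from a user question against those terms. It is a minimal toy under stated assumptions, not DataChat's implementation: the study names, the HAS_TERM relation, and the functions build_index and search are all hypothetical, and an in-memory dictionary stands in for both the graph database and the large language model.

```python
from collections import defaultdict

# Hypothetical toy metadata graph: (dataset, relation, subject term).
# Titles and terms are invented for illustration only.
EDGES = [
    ("Study A", "HAS_TERM", "voting"),
    ("Study A", "HAS_TERM", "elections"),
    ("Study B", "HAS_TERM", "voting"),
    ("Study C", "HAS_TERM", "health"),
]


def build_index(edges):
    """Map each subject term to the set of datasets linked to it."""
    term_to_datasets = defaultdict(set)
    for dataset, relation, term in edges:
        if relation == "HAS_TERM":
            term_to_datasets[term].add(dataset)
    return term_to_datasets


def search(question, index):
    """Return datasets whose subject terms appear as words in the question.

    Plain keyword matching is an assumption standing in for the step where
    a language model would interpret the user's question.
    """
    words = {w.strip("?.,!").lower() for w in question.split()}
    hits = set()
    for word in words:
        hits |= index.get(word, set())
    return sorted(hits)


if __name__ == "__main__":
    index = build_index(EDGES)
    print(search("Which datasets cover voting?", index))
```

A conversational system would replace the keyword step with model-generated graph queries; the sketch only shows why a term-to-dataset graph makes follow-up questions about related data cheap to answer.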
Taxonomy
Topics: Data Quality and Management · AI in Service Interactions · Context-Aware Activity Recognition Systems
