Towards automated data analysis: A guided framework for LLM-based risk estimation
Panteleimon Rodis

TL;DR
This paper proposes a guided framework using Large Language Models for automated dataset risk estimation, combining AI capabilities with human oversight to improve data analysis robustness.
Contribution
It introduces a novel framework that integrates LLMs with human supervision for automated risk analysis of datasets, addressing limitations of manual and fully automated methods.
Findings
The framework identifies semantic and structural properties of datasets and proposes clustering techniques for them.
Human guidance keeps the analysis aligned with its goals and preserves the integrity of the process.
A proof of concept demonstrates the feasibility of the approach for risk assessment tasks.
Abstract
Large Language Models (LLMs) are increasingly integrated into critical decision-making pipelines, a trend that raises the demand for robust and automated data analysis. Current approaches to dataset risk analysis are limited to manual auditing methods which involve time-consuming and complex tasks, whereas fully automated analysis based on Artificial Intelligence (AI) suffers from hallucinations and issues stemming from AI alignment. To this end, this work proposes a framework for dataset risk estimation that integrates Generative AI under human guidance and supervision, aiming to set the foundations for a future automated risk analysis paradigm. Our approach utilizes LLMs to identify semantic and structural properties in database schemata, subsequently propose clustering techniques, generate the code for them and finally interpret the produced results. The human supervisor guides the…
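The staged workflow described in the abstract (identify schema properties, propose clustering techniques, generate code, interpret results, with a human supervising) can be illustrated with a minimal sketch. This is not the paper's implementation: the stage names, prompt wording, `run_guided_pipeline`, and the `llm` and `supervisor` callables are all hypothetical, and the step that would actually execute the generated code is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StageResult:
    stage: str      # name of the pipeline stage
    output: str     # text returned by the language model for this stage
    approved: bool  # whether the human supervisor accepted the output


# Hypothetical stage list, following the order given in the abstract.
STAGES = [
    ("identify_properties", "Describe the semantic and structural properties of this schema:"),
    ("propose_clustering", "Propose a clustering technique suited to these properties:"),
    ("generate_code", "Write code that implements this clustering proposal:"),
    ("interpret_results", "Interpret the results of this analysis:"),
]


def run_guided_pipeline(
    schema: str,
    llm: Callable[[str], str],
    supervisor: Callable[[str, str], bool],
) -> List[StageResult]:
    """Run each stage in order, stopping at the first output the supervisor rejects.

    `llm` maps a prompt to a completion; `supervisor` receives the stage name
    and the model output and returns True to approve it. Both are assumptions
    made for illustration. Executing the generated code is left out, so the
    last stage only receives the previous stage's text.
    """
    history: List[StageResult] = []
    context = schema
    for name, instruction in STAGES:
        output = llm(f"{instruction}\n{context}")
        approved = supervisor(name, output)
        history.append(StageResult(name, output, approved))
        if not approved:
            break  # the human halts the pipeline; nothing downstream runs
        context = output  # approved output feeds the next stage
    return history
```

Placing the approval check between stages means a hallucinated or misaligned output is caught before later stages build on it, which is the role the abstract assigns to the human supervisor.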
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Data Quality and Management · Business Process Modeling and Analysis
