InsightEval: An Expert-Curated Benchmark for Assessing Insight Discovery in LLM-Driven Data Agents
Zhenghao Zhu, Yuanfeng Song, Xin Chen, Chengzhong Liu, Yakun Cui, Caleb Chen Cao, Sirui Han, Yike Guo

TL;DR
InsightEval is a new benchmark dataset and metric for evaluating the ability of large language models and multi-agent systems to discover insights in data, addressing limitations of previous benchmarks.
Contribution
The paper introduces InsightEval, a curated dataset and a novel metric for assessing insight discovery, improving upon existing benchmarks like InsightBench.
Findings
Highlighting challenges in automated insight discovery
Demonstrating the effectiveness of InsightEval and the new metric
Providing guidance for future research in data insight discovery
Abstract
Data analysis has become an indispensable part of scientific research. To discover the latent knowledge and insights hidden within massive datasets, we need to perform deep exploratory analysis to realize their full value. With the advent of large language models (LLMs) and multi-agent systems, more and more researchers are making use of these technologies for insight discovery. However, there are few benchmarks for evaluating insight discovery capabilities. As one of the most comprehensive existing frameworks, InsightBench also suffers from many critical flaws: format inconsistencies, poorly conceived objectives, and redundant insights. These issues may significantly affect the quality of data and the evaluation of agents. To address these issues, we thoroughly investigate shortcomings in InsightBench and propose essential criteria for a high-quality insight benchmark. Regarding this,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMachine Learning in Materials Science · Advanced Text Analysis Techniques · Advanced Graph Neural Networks
