Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs
Jungsoo Park, Junmo Kang, Gabriel Stanovsky, Alan Ritter

TL;DR
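This paper introduces a semi-automated, LLM-powered approach for extracting and analyzing experimental results from the LLM literature. The results are organized into a structured dataset, LLMEvalDB, which cuts the effort of paper surveying and data extraction by more than 93% compared to manual approaches and enables continuous, up-to-date insights into frontier LLMs.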
Contribution
The paper presents LLMEvalDB, a structured dataset built through LLM-assisted data extraction that supports large-scale, ongoing analysis of LLM research findings.
Findings
In-context examples improve performance on coding and multimodal tasks.
In-context examples yield limited gains in math reasoning.
Automated analysis reproduces and extends manual literature insights.
Abstract
The surge of LLM studies makes synthesizing their findings challenging. Analysis of experimental results from literature can uncover important trends across studies, but the time-consuming nature of manual data extraction limits its use. Our study presents a semi-automated approach for literature analysis that accelerates data extraction using LLMs. It automatically identifies relevant arXiv papers, extracts experimental results and related attributes, and organizes them into a structured dataset, LLMEvalDB. We then conduct an automated literature analysis of frontier LLMs, reducing the effort of paper surveying and data extraction by more than 93% compared to manual approaches. We validate LLMEvalDB by showing that it reproduces key findings from a recent manual analysis of Chain-of-Thought (CoT) reasoning and also uncovers new insights that go beyond it, showing, for example, that…
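The abstract describes organizing extracted results and their attributes into a structured dataset and then analyzing it. The following is a minimal sketch of that idea, not the authors' implementation: the `ResultRecord` fields, the `mean_few_shot_gain` helper, and every value in `records` are hypothetical placeholders invented for illustration and are not taken from LLMEvalDB.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ResultRecord:
    """One extracted experimental result (hypothetical schema, not LLMEvalDB's)."""
    paper_id: str
    model: str
    dataset: str
    task: str
    num_shots: int  # number of in-context examples; 0 means zero-shot
    metric: str
    value: float


def mean_few_shot_gain(records):
    """Average (few-shot - zero-shot) score per task.

    Only records that share paper, model, dataset, and metric are paired,
    so each difference comes from a single experimental setup.
    """
    zero_shot = {}
    few_shot = defaultdict(list)
    for r in records:
        key = (r.paper_id, r.model, r.dataset, r.metric)
        if r.num_shots == 0:
            zero_shot[key] = r
        else:
            few_shot[key].append(r)

    gains = defaultdict(list)
    for key, recs in few_shot.items():
        if key not in zero_shot:
            continue  # no zero-shot baseline to compare against
        for r in recs:
            gains[r.task].append(r.value - zero_shot[key].value)
    return {task: mean(deltas) for task, deltas in gains.items()}


# Synthetic placeholder records, made up purely to exercise the function.
records = [
    ResultRecord("paper-1", "model-a", "dataset-x", "coding", 0, "accuracy", 40.0),
    ResultRecord("paper-1", "model-a", "dataset-x", "coding", 5, "accuracy", 46.0),
    ResultRecord("paper-2", "model-b", "dataset-y", "math", 0, "accuracy", 60.0),
    ResultRecord("paper-2", "model-b", "dataset-y", "math", 5, "accuracy", 60.5),
    ResultRecord("paper-3", "model-c", "dataset-z", "math", 5, "accuracy", 70.0),
]
gains = mean_few_shot_gain(records)
```

Pairing within the same paper, model, dataset, and metric is one design choice among several; it avoids comparing scores across incompatible setups, at the cost of discarding results that lack a matching baseline.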
Taxonomy
Topics: Entrepreneurship Studies and Influences · Private Equity and Venture Capital
