CriticAL: Critic Automation with Language Models

Michael Y. Li; Vivek Vajipey; Noah D. Goodman; Emily B. Fox

arXiv:2411.06590·cs.LG·November 12, 2024

TL;DR

CriticAL leverages large language models to automate scientific model criticism by generating and evaluating discrepancies between models and data, improving model validation and development.

Contribution

This paper introduces CriticAL, a novel framework that automates model criticism using LLMs within a hypothesis testing approach, addressing hallucination issues and enhancing scientific discovery.
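The hypothesis-testing idea can be illustrated with a minimal sketch. The summary here does not give implementation details, so everything below is an assumption for illustration: the function names (`empirical_p_value`, `criticize`), the two-sided empirical p-value, the toy Gaussian model, and the hand-written `max_abs` statistic, which stands in for a discrepancy statistic that an LLM would propose in CriticAL.

```python
import random
import statistics


def empirical_p_value(observed_stat, simulated_stats):
    """Two-sided empirical p-value with add-one smoothing (illustrative choice)."""
    center = statistics.fmean(simulated_stats)
    obs_dev = abs(observed_stat - center)
    # Count simulated statistics at least as far from the center as the observed one.
    extreme = sum(1 for s in simulated_stats if abs(s - center) >= obs_dev)
    return (extreme + 1) / (len(simulated_stats) + 1)


def criticize(observed, simulate, statistic, n_sims=1000, alpha=0.05, seed=0):
    """Compare a statistic on real data with its distribution under the model."""
    rng = random.Random(seed)
    obs = statistic(observed)
    sims = [statistic(simulate(rng, len(observed))) for _ in range(n_sims)]
    p = empirical_p_value(obs, sims)
    return {"observed": obs, "p_value": p, "significant": p < alpha}


# Hypothetical model under criticism: i.i.d. standard normal observations.
def gaussian_model(rng, n):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# Hypothetical discrepancy statistic: the largest absolute value (sensitive to outliers).
def max_abs(xs):
    return max(abs(x) for x in xs)


# Toy data: mostly Gaussian, plus a few outliers the model cannot explain.
data_rng = random.Random(1)
observed = [data_rng.gauss(0.0, 1.0) for _ in range(95)] + [8.0, -9.0, 10.0, 7.5, -8.5]

result = criticize(observed, gaussian_model, max_abs)
print(result)
```

Because the critique is tied to a statistic and a p-value computed from model simulations, a proposed discrepancy that the data does not support would fail the test instead of being reported, which is one way a testing step can guard against hallucinated critiques.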

Findings

01. CriticAL reliably generates accurate critiques without hallucinations.

02. CriticAL's critiques are preferred for transparency and actionability.

03. CriticAL enables LLM scientists to improve models on real datasets.

Abstract

Understanding the world through models is a fundamental goal of scientific research. While large language model (LLM)-based approaches show promise in automating scientific discovery, they often overlook the importance of criticizing scientific models. Criticizing models deepens scientific understanding and drives the development of more accurate models. Automating model criticism is difficult because it traditionally requires a human expert to define how to compare a model with data and to evaluate whether the discrepancies are significant; both tasks rely heavily on understanding the modeling assumptions and the domain. Although LLM-based critic approaches are appealing, they introduce new challenges: LLMs might hallucinate the critiques themselves. Motivated by this, we introduce CriticAL (Critic Automation with Language Models). CriticAL uses LLMs to generate summary statistics that capture…

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Semantic Web and Ontologies