Exploiting LLMs for Automatic Hypothesis Assessment via a Logit-Based Calibrated Prior

Yue Gong; Raul Castro Fernandez

arXiv:2506.03444·cs.LG·June 5, 2025

Open Access

TL;DR

This paper introduces an LLM-based prior over correlation values for automatically assessing which correlations in data are novel. The prior predicts the sign of a correlation with 78.8% accuracy and generalizes to correlations it has not seen before.

Contribution

The paper proposes the Logit-based Calibrated Prior, a method that uses LLMs to predict correlation values. It outperforms a fine-tuned RoBERTa classifier at binary correlation prediction and shows context-sensitive reasoning.

Findings

01 Achieves 78.8% sign accuracy in correlation prediction

02 Outperforms fine-tuned RoBERTa in binary correlation prediction

03 Generalizes to unseen correlations, indicating reasoning beyond memorization
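
To make the sign-accuracy figure concrete, here is a minimal sketch of how such a metric could be computed: the fraction of variable pairs for which the predicted correlation has the same sign as the observed one. The function names and the sample values are illustrative assumptions, not data or code from the paper, and the paper's exact handling of zero-valued correlations is not stated here.

```python
from typing import Sequence


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_accuracy(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Fraction of pairs where predicted and actual correlations share a sign.

    Assumption: a value of exactly 0 only matches another 0.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one pair")
    hits = sum(1 for p, a in zip(predicted, actual) if sign(p) == sign(a))
    return hits / len(predicted)


# Made-up values for illustration only.
predicted = [0.42, -0.10, 0.05, -0.63, 0.30]
actual = [0.55, 0.20, 0.01, -0.40, -0.15]
score = sign_accuracy(predicted, actual)
print(f"sign accuracy: {score:.1%}")
```

In this toy example three of the five predictions agree in sign with the observed values.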

Abstract

As hypothesis generation becomes increasingly automated, a new bottleneck has emerged: hypothesis assessment. Modern systems can surface thousands of statistical relationships (correlations, trends, causal links) but offer little guidance on which ones are novel, non-trivial, or worthy of expert attention. In this work, we study the complementary problem to hypothesis generation: automatic hypothesis assessment. Specifically, we ask: given a large set of statistical relationships, can we automatically assess which ones are novel and worth further exploration? We focus on correlations because they are a common entry point in exploratory data analysis and often serve as the basis for forming deeper scientific or causal hypotheses. To support automatic assessment, we propose to leverage the vast knowledge encoded in LLMs' weights to derive a prior distribution over the correlation value of a…
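
The abstract is truncated above, so the paper's actual procedure is not shown here. As a generic illustration of what a logit-based prior over a correlation value might look like, the sketch below turns hypothetical model logits over a few discretized correlation bins into a probability distribution with a softmax, then reads off a mean and the probability of a positive correlation. The bin centers, the logit values, and all names are assumptions made for illustration; this is not the Logit-based Calibrated Prior itself, and it omits any calibration step.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical discretization of the correlation range [-1, 1].
bin_centers = [-0.8, -0.4, 0.0, 0.4, 0.8]

# Hypothetical logits a model might assign to each bin for one variable pair.
logits = [-1.0, 0.0, 1.0, 2.0, 0.5]

# Prior distribution over the bins.
prior = softmax(logits)

# Summary statistics of the prior.
expected_corr = sum(p * c for p, c in zip(prior, bin_centers))
prob_positive = sum(p for p, c in zip(prior, bin_centers) if c > 0)

for c, p in zip(bin_centers, prior):
    print(f"bin {c:+.1f}: {p:.3f}")
print(f"expected correlation: {expected_corr:+.3f}")
print(f"P(correlation > 0): {prob_positive:.3f}")
```

Under this kind of setup, an observed correlation that falls in a low-probability bin would be a candidate for "novel", though how the paper scores novelty is not recoverable from the truncated text.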

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference