Trustworthy scientific inference with generative models

James Carzon; Luca Masserano; Joshua D. Ingram; Alex Shen; Antonio Carlos Herling Ribeiro Junior; Tommaso Dorigo; Michele Doro; Joshua S. Speagle; Rafael Izbicki; Ann B. Lee

arXiv:2508.02602 · stat.ML · January 15, 2026

Open Access

TL;DR

This paper introduces Frequentist-Bayes (FreB), a mathematically rigorous protocol that reshapes AI-generated posterior distributions into locally valid confidence regions, so that inference remains reliable even when the likelihood is intractable.

Contribution

FreB converts posteriors produced by generative models into confidence regions that include the true parameters with the expected probability and that achieve minimum size when training and target data align.

Findings

01 FreB produces confidence regions with correct coverage probabilities.

02 It effectively handles dataset shifts and model misspecifications.

03 FreB enhances trustworthiness of AI-based scientific conclusions.

Abstract

Generative artificial intelligence (AI) excels at producing complex data structures (text, images, videos) by learning patterns from training examples. Across scientific disciplines, researchers are now applying generative models to "inverse problems" to directly predict hidden parameters from observed data along with measures of uncertainty. While these predictive or posterior-based methods can handle intractable likelihoods and large-scale studies, they can also produce biased or overconfident conclusions even without model misspecifications. We present a solution with Frequentist-Bayes (FreB), a mathematically rigorous protocol that reshapes AI-generated posterior probability distributions into (locally valid) confidence regions that consistently include true parameters with the expected probability, while achieving minimum size when training and target data align. We demonstrate…
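
The gap between a posterior credible region and a confidence region can be made concrete with a toy calculation. The sketch below is not the paper's implementation. It assumes a conjugate Gaussian model (prior N(0, 1), one observation x ~ N(theta, 1), so the exact posterior is N(x/2, 1/2)) in place of a learned generative model, and it uses a textbook Neyman inversion with the posterior density as the test statistic to obtain regions calibrated separately at each theta. All function names are hypothetical.

```python
import numpy as np

Z95 = 1.959963984540054  # 97.5% quantile of the standard normal


def credible_interval(x):
    """Central 95% credible interval of the toy posterior N(x/2, 1/2)."""
    half_width = Z95 * np.sqrt(0.5)
    return x / 2 - half_width, x / 2 + half_width


def coverage_credible(theta, n_sim=200_000, seed=0):
    """Fraction of simulated datasets whose credible interval contains theta."""
    rng = np.random.default_rng(seed)
    x = rng.normal(theta, 1.0, size=n_sim)
    lo, hi = credible_interval(x)
    return float(np.mean((lo <= theta) & (theta <= hi)))


def posterior_density(theta, x):
    """Density of N(x/2, 1/2) at theta; used here as the test statistic."""
    return np.exp(-((theta - x / 2) ** 2)) / np.sqrt(np.pi)


def critical_values(theta_grid, alpha=0.05, n_sim=20_000, seed=1):
    """Per-theta alpha-quantile of the statistic for data simulated at theta."""
    rng = np.random.default_rng(seed)
    thetas = theta_grid[:, None]
    x = rng.normal(thetas, 1.0, size=(theta_grid.size, n_sim))
    return np.quantile(posterior_density(thetas, x), alpha, axis=1)


def confidence_set(x_obs, theta_grid, crit):
    """Grid points whose posterior density at x_obs clears their own cutoff."""
    return theta_grid[posterior_density(theta_grid, x_obs) >= crit]


def coverage_calibrated(theta, alpha=0.05, n_sim=20_000, seed=2):
    """Coverage of the inverted test at theta, checked on fresh simulations."""
    crit = critical_values(np.array([theta]), alpha=alpha)[0]
    rng = np.random.default_rng(seed)
    x = rng.normal(theta, 1.0, size=n_sim)
    return float(np.mean(posterior_density(theta, x) >= crit))


if __name__ == "__main__":
    for theta in (0.0, 3.0):
        print(theta, coverage_credible(theta), coverage_calibrated(theta))
```

In this toy model the model is correctly specified, yet the 95% credible interval covers a true value of theta = 3 only about 41% of the time, because the prior pulls the posterior toward zero. The inverted test covers close to 95% of the time at every theta by construction. A practical protocol would have to learn the cutoffs across the parameter space rather than re-simulate at each grid point, which this sketch does not attempt.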

Taxonomy

Topics: Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis · Computational and Text Analysis Methods