Robust LLM Performance Certification via Constrained Maximum Likelihood Estimation

Minghe Shen; Ananth Balashankar; Adam Fisch; David Madras; Miguel Rodrigues

arXiv:2604.03257 · cs.CL · April 7, 2026

TL;DR

This paper introduces a constrained maximum likelihood estimation method for accurately estimating LLM failure rates by combining human labels, judge annotations, and domain constraints, outperforming existing approaches.

Contribution

The paper presents a novel, practical constrained MLE approach that integrates multiple signals and domain knowledge for more accurate LLM failure rate estimation.

Findings

1. Constrained MLE outperforms state-of-the-art baselines across various settings.

2. The method provides more accurate and lower-variance failure rate estimates.

3. Empirical validation demonstrates robustness across diverse experimental regimes.

Abstract

The ability to rigorously estimate the failure rates of large language models (LLMs) is a prerequisite for their safe deployment. Currently, however, practitioners often face a tradeoff between expensive human gold standards and potentially severely biased automatic annotation schemes such as "LLM-as-a-Judge" labeling. In this paper, we propose a new, practical, and efficient approach to LLM failure rate estimation based on constrained maximum-likelihood estimation (MLE). Our method integrates three distinct signal sources: (i) a small, high-quality human-labeled calibration set, (ii) a large corpus of LLM-judge annotations, and, most importantly, (iii) additional side information via domain-specific constraints derived from known bounds on judge performance statistics. We validate our approach through a comprehensive empirical study, benchmarking it against state-of-the-art baselines…
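To make the three signal sources concrete, here is a minimal Python sketch of one way a constrained MLE of this kind could be set up. It is not the authors' implementation: the abstract does not give the likelihood or the optimizer, so everything here is an assumption for illustration. It assumes binary labels (1 = failure), models the judge by a true-positive rate `tpr` and a false-positive rate `fpr`, treats the "known bounds on judge performance statistics" as simple intervals on those two rates, and uses a plain grid search in place of a real solver. The names `log_likelihood`, `constrained_mle`, `calib`, `judge_pos`, and `judge_total` are hypothetical.

```python
import math
from itertools import product


def log_likelihood(theta, tpr, fpr, calib, judge_pos, judge_total):
    """Joint log-likelihood of calibration pairs and judge-only labels.

    theta: failure rate P(human label = 1) (assumed parameterization)
    tpr:   P(judge = 1 | human = 1); fpr: P(judge = 1 | human = 0)
    calib: dict mapping (human_label, judge_label) -> count
    """
    # Probability of each (human, judge) cell in the calibration set.
    cell = {
        (1, 1): theta * tpr,
        (1, 0): theta * (1 - tpr),
        (0, 1): (1 - theta) * fpr,
        (0, 0): (1 - theta) * (1 - fpr),
    }
    ll = sum(n * math.log(cell[k]) for k, n in calib.items() if n > 0)
    # Marginal probability that the judge flags a failure on unlabeled data.
    q = theta * tpr + (1 - theta) * fpr
    ll += judge_pos * math.log(q)
    ll += (judge_total - judge_pos) * math.log(1 - q)
    return ll


def linspace(lo, hi, steps):
    """Evenly spaced grid from lo to hi inclusive."""
    return [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]


def constrained_mle(calib, judge_pos, judge_total, tpr_bounds, fpr_bounds, steps=21):
    """Grid-search MLE with tpr and fpr restricted to the given intervals.

    Bounds must lie strictly inside (0, 1) so every log term is finite.
    Returns the maximizing (theta, tpr, fpr) triple on the grid.
    """
    thetas = linspace(0.005, 0.995, 199)  # step of 0.005
    tprs = linspace(*tpr_bounds, steps)
    fprs = linspace(*fpr_bounds, steps)
    return max(
        product(thetas, tprs, fprs),
        key=lambda p: log_likelihood(*p, calib, judge_pos, judge_total),
    )


# Made-up counts for illustration only: 100 human-labeled items and
# 10,000 judge-only items, of which the judge flagged 2,600.
calib = {(1, 1): 18, (1, 0): 2, (0, 1): 8, (0, 0): 72}
theta_hat, tpr_hat, fpr_hat = constrained_mle(
    calib, judge_pos=2600, judge_total=10000,
    tpr_bounds=(0.85, 0.95), fpr_bounds=(0.05, 0.15),
)
print(theta_hat, tpr_hat, fpr_hat)
```

The small calibration set pins down the judge's error rates only loosely, while the large judge-only corpus pins down the judge's overall flag rate tightly. In this sketch the interval constraints restrict the search to values of `tpr` and `fpr` that are considered plausible for the judge, which is the role the abstract assigns to side information. A production version would likely use a continuous constrained optimizer rather than a grid.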
