Uncertainty Quantification for LLM-based Code Generation

Senrong Xu; Yuhao Tan; Yanke Zhou; Guangyuan Wu; Zenan Li; Yuan Yao; Taolue Chen; Feng Xu; Xiaoxing Ma

arXiv:2605.12201·cs.SE·May 13, 2026

TL;DR

This paper introduces RisCoSet, a method for quantifying uncertainty in LLM-based code generation. It constructs risk-controlling prediction sets, each represented as a partial program, that contain a correct solution with high confidence.

Contribution

RisCoSet uses multiple hypothesis testing to construct its prediction sets. This avoids two limitations of the earlier PAC prediction set approach: a strong monotonicity assumption on risk and a single-label classification framework.

Findings

01

Removes up to 24.5% less code than the state-of-the-art baseline.

02

Guarantees that the prediction set contains a correct solution with high confidence.

03

Effective across three different large language models.

Abstract

Prediction sets provide a theoretically grounded framework for quantifying uncertainty in machine learning models. Adapting them to structured generation tasks, in particular large language model (LLM)-based code generation, remains a challenging problem. An existing attempt proposes PAC prediction sets, but it is limited by its strong monotonicity assumption on risk and its single-label classification framework, which severely restrict the space of candidate programs and cannot accommodate the multiple valid outputs inherent to code generation. To address these limitations, we propose RisCoSet, an approach that leverages multiple hypothesis testing to construct risk-controlling prediction sets for LLM-based code generation. Given a trained code generation model, we produce a prediction set represented by a partial program, which is guaranteed to contain a correct solution with high confidence.…
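The abstract does not spell out the testing procedure, so the following is only a minimal sketch of one generic way multiple hypothesis testing can give a risk-control guarantee; it is not RisCoSet's actual algorithm. It assumes a finite grid of candidate configurations (here called thresholds, a hypothetical name), a 0/1 loss on a held-out calibration set (1 when the resulting prediction set contains no correct solution), a Hoeffding-based p-value, and a Bonferroni correction. The names `alpha` (risk tolerance) and `delta` (allowed failure probability) are likewise assumptions. Under these assumptions, with probability at least 1 - delta every selected threshold has true risk at most `alpha`.

```python
import math


def hoeffding_p_value(empirical_risk, n, alpha):
    """P-value for the null hypothesis 'true risk > alpha'.

    Valid for losses bounded in [0, 1], by Hoeffding's inequality.
    """
    gap = max(alpha - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def select_valid_thresholds(losses_by_threshold, alpha, delta):
    """Return the thresholds whose null hypothesis is rejected.

    losses_by_threshold maps each candidate threshold (hypothetical
    parameterization) to its list of 0/1 calibration losses.
    Bonferroni correction: each of the m tests is run at level delta / m.
    """
    m = len(losses_by_threshold)
    valid = []
    for threshold, losses in losses_by_threshold.items():
        n = len(losses)
        empirical_risk = sum(losses) / n
        if hoeffding_p_value(empirical_risk, n, alpha) <= delta / m:
            valid.append(threshold)
    return sorted(valid)


# Synthetic calibration data, for illustration only.
calibration = {
    0.1: [0] * 1000,              # no failures observed
    0.5: [1] * 200 + [0] * 800,   # 20% failures observed
}
print(select_valid_thresholds(calibration, alpha=0.1, delta=0.05))  # prints [0.1]
```

Unlike a search that relies on risk being monotone in the threshold, this kind of test treats each candidate independently, which is one reason hypothesis-testing approaches can drop the monotonicity assumption. Among the valid candidates, one would then pick whichever keeps the most code in the partial program.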
