Uncertainty Quantification for LLM-based Code Generation
Senrong Xu, Yuhao Tan, Yanke Zhou, Guangyuan Wu, Zenan Li, Yuan Yao, Taolue Chen, Feng Xu, Xiaoxing Ma

TL;DR
This paper introduces RisCoSet, a novel method for quantifying uncertainty in LLM-based code generation by constructing risk-controlling prediction sets that include correct solutions with high confidence.
Contribution
RisCoSet leverages multiple hypothesis testing to improve uncertainty quantification in code generation, overcoming two limitations of the earlier PAC prediction sets approach: its strong monotonicity assumption on risk and its single-label classification framework.
Findings
Reduces the amount of code removed from generated programs by up to 24.5% compared with the state-of-the-art method.
Guarantees high-confidence inclusion of correct solutions.
Effective across three different large language models.
Abstract
Prediction sets provide a theoretically grounded framework for quantifying uncertainty in machine learning models. Adapting them to structured generation tasks, in particular, large language model (LLM) based code generation, remains a challenging problem. An existing attempt proposes PAC prediction sets but is limited by its strong monotonicity assumption on risk and single-label classification framework, which severely limits the space of candidate programs and cannot accommodate the multiple valid outputs inherent to code generation. To address these limitations, we propose an approach RisCoSet that leverages multiple hypothesis testing to construct risk-controlling predictions for LLM-based code generation. Given a trained code generation model, we produce a prediction set represented by a partial program, which is guaranteed to contain a correct solution with high confidence.…
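The abstract does not spell out the testing procedure, so the following is only a minimal sketch of how multiple hypothesis testing can certify risk levels in general. It assumes a finite list of candidate pruning levels, an exact binomial tail p-value for each level, and a Bonferroni correction; none of these choices is confirmed as RisCoSet's actual procedure, and the names `binom_cdf`, `select_valid_levels`, `error_counts`, `epsilon`, and `delta` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Exact P(Bin(n, p) <= k)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def select_valid_levels(error_counts, n, epsilon, delta):
    """Return indices of candidate levels certified to have risk <= epsilon.

    error_counts[j] is the number of calibration problems (out of n) on which
    the prediction set at candidate level j contained no correct solution.
    For each level we test the null hypothesis "risk_j > epsilon" and reject
    it when the binomial tail p-value is at most delta / m (Bonferroni, an
    assumption of this sketch). With probability at least 1 - delta, every
    returned level has true risk <= epsilon.
    """
    m = len(error_counts)
    valid = []
    for j, k in enumerate(error_counts):
        # Super-uniform p-value under the null: observing k or fewer errors
        # is unlikely if the true risk exceeds epsilon.
        p_value = binom_cdf(k, n, epsilon)
        if p_value <= delta / m:
            valid.append(j)
    return valid


# Hypothetical calibration results: each level removes a larger fraction of
# the generated program, so fewer sets miss a correct solution.
removal_fractions = [0.0, 0.1, 0.2, 0.3, 0.4]
error_counts = [60, 35, 18, 8, 3]

valid = select_valid_levels(error_counts, n=200, epsilon=0.1, delta=0.05)
if valid:
    # Among certified levels, prefer the one that removes the least code.
    best = min(valid, key=lambda j: removal_fractions[j])
    print("certified levels:", valid, "chosen removal:", removal_fractions[best])
```

Unlike a search that relies on risk decreasing monotonically with the pruning level, this style of test treats every candidate level independently, which is the property the abstract highlights as the motivation for using multiple hypothesis testing.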
