Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees
Ye Li, Anqi Hu, Yuanchang Ye, Shiyan Tong, Zhiyuan Wang, Bo Fu

TL;DR
This paper introduces a set-valued prediction framework for large language models that offers feasibility-aware coverage guarantees, addressing the limitations of point predictions by producing candidate sets with statistical validity.
Contribution
It develops a principled, data-driven calibration method for set predictions in LLMs that accounts for finite sampling limitations and guarantees coverage under feasible risk levels.
Findings
Framework provides statistically valid coverage guarantees.
Calibration method adapts to finite sampling constraints.
Experimental results show improved prediction reliability.
Abstract
Large language models (LLMs) inherently operate over a large generation space, yet conventional usage typically reports the most likely generation (MLG) as a point prediction, which underestimates the model's capability: although the top-ranked response can be incorrect, valid answers may still exist within the broader output space and can potentially be discovered through repeated sampling. This observation motivates moving from point prediction to set-valued prediction, where the model produces a set of candidate responses rather than a single MLG. In this paper, we propose a principled framework for set-valued prediction, which provides feasibility-aware coverage guarantees. We show that, given the finite-sampling nature of LLM generation, coverage is not always achievable: even with multiple samplings, LLMs may fail to yield an acceptable response for certain questions within the…
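The feasibility idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic split-conformal style calibration, and the data layout (a list of `(score, is_acceptable)` pairs per calibration question) and the function names `min_feasible_risk`, `calibrate_threshold`, and `predict_set` are all hypothetical. It assumes each sampled response carries a confidence score and that calibration and test questions are exchangeable. Questions whose samples contain no acceptable response put a floor on the miscoverage rate that any set built from those samples can reach.

```python
import math

def min_feasible_risk(calib):
    """Fraction of calibration questions with no acceptable sampled response.

    calib: list of questions, each a list of (score, is_acceptable) pairs.
    No prediction set drawn from these samples can cover such questions.
    """
    misses = sum(1 for samples in calib if not any(ok for _, ok in samples))
    return misses / len(calib)

def calibrate_threshold(calib, alpha):
    """Return a score threshold tau, or None if alpha is not feasible.

    Uses the usual split-conformal rank k = ceil((n + 1) * (1 - alpha)):
    tau is the k-th largest "best acceptable score" over the calibration set.
    """
    n = len(calib)
    # Best score among acceptable samples; -inf when the question has none.
    best = sorted(
        (max((s for s, ok in samples if ok), default=-math.inf)
         for samples in calib),
        reverse=True,
    )
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n or best[k - 1] == -math.inf:
        return None  # requested risk level cannot be met with these samples
    return best[k - 1]

def predict_set(samples, tau):
    """Keep every sampled response whose score is at least tau.

    samples: list of (score, response_text) pairs for one test question.
    """
    return [resp for score, resp in samples if score >= tau]

# Toy calibration data (made up for illustration only).
calib = [
    [(0.9, True), (0.1, False)],
    [(0.6, False), (0.4, True)],
    [(0.7, True), (0.3, False)],
    [(0.8, False), (0.2, False)],  # no acceptable sample at all
]
floor = min_feasible_risk(calib)
tau_ok = calibrate_threshold(calib, alpha=0.5)
tau_bad = calibrate_threshold(calib, alpha=0.1)
```

In this toy data one of the four questions has no acceptable sample, so a request for `alpha=0.1` is reported as infeasible rather than answered with a set that cannot deliver the promised coverage.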
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
