Prune 'n Predict: Optimizing LLM Decision-making with Conformal Prediction

Harit Vishwakarma; Alan Mishler; Thomas Cook; Niccolò Dalmasso; Natraj Raman; Sumitra Ganesh

arXiv:2501.00555·cs.LG·July 15, 2025

TL;DR

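This paper introduces CROQ, a method that uses conformal prediction to revise the multiple-choice questions posed to a large language model: choices outside the prediction set are removed and the model is asked the narrowed question, which improves accuracy. The gains are larger when CROQ is combined with CP-OPT, an optimization framework that reduces prediction set sizes.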

Contribution

The paper proposes CROQ, a novel question revision technique leveraging conformal prediction, and CP-OPT, an optimization framework to minimize set sizes while maintaining coverage, enhancing LLM decision-making.

Findings

1. CROQ improves LLM accuracy over standard inference.
2. Combining CROQ with CP-OPT yields larger gains in accuracy.
3. The methods are effective across multiple datasets and LLMs.

Abstract

Large language models (LLMs) are empowering decision-making in several applications, including tool or API usage and answering multiple-choice questions (MCQs). However, incorrect outputs pose significant risks in high-stakes domains like healthcare and finance. To quantify LLM uncertainty and thereby mitigate these risks, recent works employ conformal prediction (CP), a model- and distribution-agnostic framework that uses LLM outputs to generate a prediction set containing the true answer with high probability. Leveraging CP, we propose conformal revision of questions (CROQ), which revises the question by narrowing down the available choices to those in the prediction set and asking the LLM the revised question. We expect LLMs to be more accurate on revised questions with fewer choices. Furthermore, we expect CROQ to be effective when the prediction sets from CP are…
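The revision step described in the abstract can be illustrated with a minimal sketch. It assumes standard split conformal prediction with the nonconformity score 1 - p, where p is the probability the LLM assigns to a choice; this score is a common default, not necessarily the one used in the paper. The function names, the calibration scores, and the example question are all hypothetical, and the second LLM call on the revised prompt is left out.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold from calibration nonconformity scores.

    cal_scores: scores of the TRUE answer on held-out calibration questions.
    alpha: target miscoverage rate (coverage is at least 1 - alpha).
    """
    n = len(cal_scores)
    # Finite-sample corrected rank (1-indexed): ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points: keep every choice
    return sorted(cal_scores)[k - 1]

def prediction_set(probs, threshold):
    """Keep each choice whose score 1 - p does not exceed the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= threshold]

def revise_question(stem, choices, kept):
    """Rebuild the prompt with only the kept choices, relabelled A, B, ...

    Returns the revised prompt and a map from new labels to original labels.
    """
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    lines = [stem]
    mapping = {}
    for new_label, old_label in zip(letters, kept):
        lines.append(f"{new_label}. {choices[old_label]}")
        mapping[new_label] = old_label
    return "\n".join(lines), mapping

# Hypothetical calibration scores (1 - p of the true answer), illustration only.
cal_scores = [0.2, 0.35, 0.5, 0.6, 0.7, 0.8, 0.9]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical test question with made-up LLM choice probabilities.
stem = "Which planet is closest to the Sun?"
choices = {"A": "Venus", "B": "Mercury", "C": "Mars", "D": "Jupiter"}
probs = {"A": 0.05, "B": 0.55, "C": 0.10, "D": 0.30}

kept = prediction_set(probs, threshold)
revised_prompt, label_map = revise_question(stem, choices, kept)
print(revised_prompt)
```

In this toy run the prediction set keeps two of the four choices, so the revised prompt offers only those two. The label map lets the answer to the revised question be translated back to the original labelling. Smaller prediction sets remove more distractors, which is why the findings above pair CROQ with a method that reduces set size.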

Videos

Prune 'n Predict: Optimizing LLM Decision-making with Conformal Prediction (SlidesLive)

Taxonomy

Topics: Big Data and Business Intelligence

Methods: Sparse Evolutionary Training