COPU: Conformal Prediction for Uncertainty Quantification in Natural Language Generation

Sean Wang; Yicheng Jiang; Yuxin Tang; Lu Cheng; Hanjie Chen

arXiv:2502.12601·cs.CL·April 9, 2025

TL;DR

This paper introduces COPU, a conformal prediction approach for natural language generation that explicitly adds the ground truth to the candidate outputs, improving uncertainty quantification across various models and tasks.

Contribution

We propose COPU, a conformal prediction technique for NLG that explicitly adds the ground truth to the candidate outputs and uses logit scores to measure nonconformity, enhancing uncertainty quantification for large language models.

Findings

01

COPU outperforms baseline methods in calibration accuracy.

02

It provides reliable uncertainty estimates across diverse NLG tasks.

03

The method is effective for multiple large language models.

Abstract

Uncertainty Quantification (UQ) for Natural Language Generation (NLG) is crucial for assessing the performance of Large Language Models (LLMs), as it reveals confidence in predictions, identifies failure modes, and gauges output reliability. Conformal Prediction (CP), a model-agnostic method that generates prediction sets with a specified error rate, has been adopted for UQ in classification tasks, where the size of the prediction set indicates the model's uncertainty. However, when adapting CP to NLG, the sampling-based method for generating candidate outputs cannot guarantee the inclusion of the ground truth, limiting its applicability across a wide range of error rates. To address this, we propose COPU, a method that explicitly adds the ground truth to the candidate outputs and uses logit scores to measure nonconformity. Our experiments with six LLMs on four NLG tasks show that…
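To make the conformal prediction step in the abstract concrete, the following is a minimal sketch of generic split conformal prediction, not the authors' implementation. The function names, the calibration scores, and the test candidates are all hypothetical, and the nonconformity scores are made-up numbers rather than the logit-based scores the paper uses. It assumes every calibration example has a score for its ground-truth answer, which is the property that sampling alone cannot guarantee.

```python
import math

def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the threshold score
    if k > n:
        return math.inf  # too few calibration points for this error rate
    return sorted(cal_scores)[k - 1]

def prediction_set(candidate_scores, threshold):
    """Keep every candidate whose nonconformity score is within the threshold."""
    return [c for c, s in candidate_scores.items() if s <= threshold]

# Hypothetical nonconformity scores of the ground-truth answers
# on a held-out calibration set (lower means more conforming).
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.80]
threshold = conformal_quantile(cal_scores, alpha=0.2)

# Hypothetical candidate outputs and scores for one test prompt.
test_candidates = {"Paris": 0.08, "Lyon": 0.70, "Marseille": 0.92}
result = prediction_set(test_candidates, threshold)
print(threshold, result)
```

A larger prediction set signals higher uncertainty for that prompt. If the ground truth of a calibration example never appears among the sampled candidates, it has no score to contribute to the quantile, which is the gap the abstract describes.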

Taxonomy

Topics: Topic Modeling