Robust Uncertainty Quantification for Self-Evolving Large Language Models via Continual Domain Pretraining

Xiaofan Zhou; Lu Cheng

arXiv:2510.22931·cs.LG·October 29, 2025

TL;DR

This paper proposes an adaptive conformal prediction framework for large language models undergoing continual domain pretraining, improving the reliability and informativeness of uncertainty estimates amid domain shifts.

Contribution

The paper introduces an adaptive rejection and non-exchangeable conformal prediction framework tailored to continual domain pretraining of LLMs. It targets two problems: test data drawn from unknown or shifting domain distributions, and excessively large prediction sets for unanswerable queries.

Findings

01. Enhanced reliability of uncertainty quantification under domain shifts

02. Improved prediction set efficiency with adaptive reweighting

03. Effective abstention mechanism for unanswerable queries

Abstract

Continual Learning (CL) is essential for enabling self-evolving large language models (LLMs) to adapt and remain effective amid rapid knowledge growth. Yet, despite its importance, little attention has been given to establishing statistical reliability guarantees for LLMs under CL, particularly in the setting of continual domain pretraining (CDP). Conformal Prediction (CP) has shown promise in offering correctness guarantees for LLMs, but it faces major challenges in CDP: testing data often stems from unknown or shifting domain distributions, under which CP may no longer provide valid guarantees. Moreover, when high coverage is required, CP can yield excessively large prediction sets for unanswerable queries, reducing informativeness. To address these challenges, we introduce an adaptive rejection and non-exchangeable CP framework. Our method first estimates the distribution of…
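
The abstract is cut off before the method is described, so the following is only a minimal sketch of the general ideas it names, not the authors' algorithm. It assumes a multiple-choice setting with nonconformity score 1 - p, a standard weighted conformal quantile in which calibration points carry weights (for example, similarity to the test domain), and a hypothetical abstention rule that rejects a query when its prediction set exceeds `max_set_size`. All function and parameter names are illustrative.

```python
from typing import Dict, List, Optional, Sequence


def weighted_quantile(scores: Sequence[float],
                      weights: Sequence[float],
                      alpha: float) -> float:
    """Weighted (1 - alpha) quantile of calibration nonconformity scores.

    One unit of weight is reserved for the unseen test point and placed at
    +infinity, as in standard weighted conformal prediction (an assumption
    here; the source does not specify the weighting scheme).
    """
    total = sum(weights) + 1.0
    cumulative = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight / total
        if cumulative >= 1.0 - alpha:
            return score
    # Not enough calibration mass: the set must contain every option.
    return float("inf")


def prediction_set(probs: Dict[str, float], threshold: float) -> List[str]:
    """Keep every option whose score 1 - p does not exceed the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= threshold]


def predict_or_abstain(probs: Dict[str, float],
                       threshold: float,
                       max_set_size: int) -> Optional[List[str]]:
    """Return the prediction set, or None to abstain when it is too large.

    The size-based rejection rule is hypothetical, chosen only to show how
    abstention can keep prediction sets informative.
    """
    labels = prediction_set(probs, threshold)
    if len(labels) > max_set_size:
        return None
    return labels


# Calibration scores; the last one comes from a domain unlike the test query.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.9]
uniform_threshold = weighted_quantile(cal_scores, [1, 1, 1, 1, 1], alpha=0.2)
shifted_threshold = weighted_quantile(cal_scores, [2, 2, 2, 2, 0.5], alpha=0.2)

confident = {"A": 0.7, "B": 0.2, "C": 0.05, "D": 0.05}
unsure = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

answer = predict_or_abstain(confident, shifted_threshold, max_set_size=2)
rejected = predict_or_abstain(unsure, uniform_threshold, max_set_size=2)
```

In this toy setup, down-weighting the out-of-domain calibration point lowers the threshold and shrinks the prediction set for the confident query, while the uniformly uncertain query produces a set covering every option and is rejected.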
