Online Conformal Abstention for Factuality Control Under Adversarial Bandit Feedback
Minjae Lee, Yoonjae Jung, Sangdon Park

TL;DR
This paper introduces ExAUL, an online learning framework that improves the reliability of generative models by controlling false responses through conformal abstention, even with partial feedback and adversarial conditions.
Contribution
ExAUL provides a novel online learning method with theoretical guarantees for false discovery rate (FDR) control using partial feedback, addressing challenges in real-world, adversarial environments.
Findings
ExAUL achieves an $O(\sqrt{T})$ regret bound, ensuring effective FDR control.
Empirical results show ExAUL maintains high coverage while controlling false responses.
The framework is effective for large language models in non-stationary, adversarial settings.
Abstract
As interactive generative systems are increasingly deployed in real-world applications, their tendency to generate unreliable or false responses raises serious concerns. Conformal abstention mitigates this risk by ensuring that the system answers only when confident. However, real-world deployments typically provide only partial user feedback (e.g., thumbs up/down) on the selected response and often operate in non-stationary or adversarial environments, for which effective learning methods are largely missing. To bridge this gap, we propose ExAUL, a novel online learning framework for conformal abstention with adversarial and partial feedback. Technically, we introduce (i) a novel conversion lemma that translates the regret of any bandit algorithm into an FDR bound, and (ii) feedback unlocking, a strategy that exploits the structure of conformal abstention to extract additional learning…
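To make the setting in the abstract concrete, here is a minimal sketch of online threshold-based abstention under partial feedback. It is not the ExAUL algorithm itself: it is a generic Exp3-style learner over a fixed grid of confidence thresholds, written under several assumptions that do not come from the paper. The function names (`unlocked_loss_estimates`, `run_online_abstention`), the loss (1 for a wrong answer, 0 for a correct one, a fixed `abstain_cost` for abstaining), and the learning rate `eta` are all hypothetical. The "unlocking" step is only one possible reading of the idea: because the confidence score is always visible, one thumbs-up/down on an answered query tells us the loss of every threshold that would also have answered, and the loss of every abstaining threshold is known without any feedback.

```python
import math
import random


def unlocked_loss_estimates(conf, answered, is_wrong, probs, taus, abstain_cost):
    """Importance-weighted loss estimate for every threshold (assumed loss model).

    Thresholds above `conf` would abstain, so their loss is known exactly.
    Thresholds at or below `conf` would answer; their loss is observed only
    when the sampled threshold answered, which happens with probability p_ans.
    """
    p_ans = sum(p for p, t in zip(probs, taus) if conf >= t)
    estimates = []
    for t in taus:
        if conf < t:
            estimates.append(abstain_cost)            # abstains: cost always known
        elif answered:
            estimates.append(float(is_wrong) / p_ans)  # feedback shared across thresholds
        else:
            estimates.append(0.0)                      # no feedback this round
    return estimates


def _probabilities(log_weights):
    """Normalize log-weights into a sampling distribution (numerically stable)."""
    m = max(log_weights)
    w = [math.exp(v - m) for v in log_weights]
    total = sum(w)
    return [x / total for x in w]


def run_online_abstention(stream, taus, eta=0.05, abstain_cost=0.3, seed=0):
    """Exp3-style sketch. `stream` yields (confidence, is_correct) pairs;
    is_correct is revealed to the learner only on rounds where it answers."""
    rng = random.Random(seed)
    log_w = [0.0] * len(taus)
    n_answered = n_wrong = 0
    for conf, is_correct in stream:
        probs = _probabilities(log_w)
        k = rng.choices(range(len(taus)), weights=probs)[0]
        answered = conf >= taus[k]
        is_wrong = (not is_correct) if answered else None  # hidden when abstaining
        if answered:
            n_answered += 1
            n_wrong += int(is_wrong)
        est = unlocked_loss_estimates(conf, answered, is_wrong, probs, taus, abstain_cost)
        log_w = [v - eta * e for v, e in zip(log_w, est)]
    return _probabilities(log_w), n_answered, n_wrong
```

The ratio `n_wrong / n_answered` is the empirical fraction of false responses among answered queries, which is the quantity an FDR guarantee would bound. How the paper's conversion lemma ties regret to that ratio is not reproduced here.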
