Belief-Guided Inference Control for Large Language Model Services via Verifiable Observations

Wenhao Yuan; Chenchen Lin; Jian Chen; Jinfeng Xu; Shuo Yang; Edith Cheuk Han Ngai

arXiv:2604.27536 · cs.AI · May 1, 2026

TL;DR

Veroic is a framework that improves large language model response reliability by adaptively controlling inference based on verifiable observations and risk-aware decision-making.

Contribution

The paper introduces a novel verifiable observation channel and formulates inference control as a partially observable Markov decision process (POMDP) to obtain better quality-cost trade-offs in black-box LLM services.

Findings

01. Veroic achieves better quality-cost trade-offs.
02. It provides stronger risk estimation and calibration.
03. It demonstrates robustness in long-horizon inference control.

Abstract

In black-box large language model (LLM) services, response reliability is often only partially observable at decision time, while stronger inference pathways incur substantial computational cost. This induces a budgeted sequential decision problem: for each request, the system should decide whether the default low-cost response is sufficiently reliable or whether additional computation should be allocated to improve response quality. In this paper, we propose Verifiable Observations for Risk-aware Inference Control (Veroic), a framework for adaptive inference control in black-box LLM settings. Veroic formulates request-time control as a partially observable Markov decision process (POMDP) to capture partial observability and sequential budget coupling. It constructs a lightweight verifiable observation channel from the input-output pair by…
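The budgeted decision described in the abstract can be illustrated with a toy controller. The sketch below is not the Veroic method: it is a minimal, myopic threshold rule, assuming a single binary verification check per request, hand-picked likelihoods, and a fixed per-escalation cost. All names (`ControllerConfig`, `posterior_unreliable`, `run_controller`) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ControllerConfig:
    # All values below are illustrative assumptions, not taken from the paper.
    prior_unreliable: float = 0.3      # P(default response is unreliable)
    p_fail_if_unreliable: float = 0.8  # P(check fails | response unreliable)
    p_fail_if_reliable: float = 0.1    # P(check fails | response reliable)
    escalate_threshold: float = 0.5    # escalate when belief reaches this value
    escalation_cost: float = 1.0       # budget consumed by one escalation


def posterior_unreliable(cfg: ControllerConfig, check_failed: bool) -> float:
    """Bayes update of the belief that the default response is unreliable."""
    if check_failed:
        like_u, like_r = cfg.p_fail_if_unreliable, cfg.p_fail_if_reliable
    else:
        like_u, like_r = 1 - cfg.p_fail_if_unreliable, 1 - cfg.p_fail_if_reliable
    numerator = like_u * cfg.prior_unreliable
    denominator = numerator + like_r * (1 - cfg.prior_unreliable)
    return numerator / denominator


def run_controller(
    cfg: ControllerConfig, observations: List[bool], budget: float
) -> Tuple[List[bool], float]:
    """Decide per request whether to escalate, subject to a shared budget."""
    decisions = []
    remaining = budget
    for check_failed in observations:
        belief = posterior_unreliable(cfg, check_failed)
        # Escalate only if the risk is high enough and the budget allows it.
        escalate = (
            belief >= cfg.escalate_threshold
            and remaining >= cfg.escalation_cost
        )
        if escalate:
            remaining -= cfg.escalation_cost
        decisions.append(escalate)
    return decisions, remaining


if __name__ == "__main__":
    cfg = ControllerConfig()
    decisions, remaining = run_controller(cfg, [True, False, True, True], budget=2.0)
    print(decisions, remaining)
```

With a budget of two escalations, the fourth request is served by the default pathway even though its check failed, which shows how the shared budget couples decisions across requests. A POMDP policy would additionally plan over future requests rather than act greedily as this sketch does.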
