Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs
Hamed Khosravi, Xiaoming Huo

TL;DR
This paper introduces Conformal Selective Acting (CSA), a new online risk control method for deploying RLVR-trained LLMs with per-deployment error budgets, ensuring safety guarantees without requiring model changes.
Contribution
CSA is a novel conformal wrapper that maintains pathwise validity and selective risk control in adaptive, online settings, filling a gap in existing conformal risk methods.
Findings
CSA achieves anytime-pathwise selective-risk bounds with $R_T^{\mathrm{act}}\le\alpha+O(N_T^{-1/2})$.
CSA provides rate-optimal certification matching $\Theta(\bar\eta^{-2}\log(1/\delta))$.
CSA outperforms other methods on 8 specialist benchmarks, 16 adversarial cells, and 5 live RLVR cells, while keeping its safety guarantee and requiring no model modifications.
Abstract
A local specialist LLM, fine-tuned with reinforcement learning from verifiable rewards (RLVR) on operator-local data, is installed in a regulated organization with per-deployment error budget $\alpha$. The operator needs a safety certificate for this deployment's stream at every round: no pooling across deployments, no waiting for a long-run average. Existing wrappers cannot deliver this on adaptive, online-updated streams: offline conformal-risk methods require exchangeability; online-conformal methods bound only long-run averages; non-exchangeable extensions are only marginally valid; and the closest anytime wrapper, A-RCPS, controls marginal rather than selective risk. Using a (test statistic, validity guarantee, deployment rule) framework, we identify one empty cell forced by deployment requirements: e-process per threshold, selective risk, anytime-pathwise validity,…
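To make the abstract's terms concrete, here is a minimal sketch of a generic testing-by-betting e-process for a single threshold, together with an empirical selective-risk calculation. It is not the paper's CSA procedure. The function names, the fixed bet size `lam`, the binary losses, and the reading of selective risk as the error rate over acted rounds only are all assumptions made for illustration. The certification rule relies on Ville's inequality: a nonnegative supermartingale starting at 1 reaches 1/δ with probability at most δ.

```python
def betting_wealth(losses, alpha, lam):
    """Wealth path of a fixed-bet e-process (illustrative, not CSA).

    Null hypothesis: the conditional mean loss is at least alpha.
    Under the null, each factor 1 + lam * (alpha - loss) has conditional
    mean at most 1, so the wealth is a nonnegative supermartingale.
    Losses are assumed to lie in [0, 1].
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0.0 <= lam <= 1.0 / (1.0 - alpha):
        raise ValueError("lam must keep every factor nonnegative")
    wealth = 1.0
    path = []
    for loss in losses:
        wealth *= 1.0 + lam * (alpha - loss)
        path.append(wealth)
    return path


def first_certified_round(losses, alpha, delta, lam):
    """First round (1-indexed) at which wealth reaches 1/delta, else None."""
    threshold = 1.0 / delta
    for t, wealth in enumerate(betting_wealth(losses, alpha, lam), start=1):
        if wealth >= threshold:
            return t
    return None


def selective_risk(losses, acted):
    """Mean loss over acted rounds only; None if the wrapper never acted."""
    n_acted = sum(1 for a in acted if a)
    if n_acted == 0:
        return None
    return sum(l for l, a in zip(losses, acted) if a) / n_acted
```

With an error-free stream, `first_certified_round([0] * 50, alpha=0.1, delta=0.05, lam=1.0)` certifies once the wealth 1.1^t passes 20. A stream of constant errors drives the wealth toward zero and never certifies. Because the wealth is checked at every round, the certificate holds at any stopping time rather than only at a fixed horizon.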
