What Do We Care About in Bandits with Noncompliance? BRACE: Bandits with Recommendations, Abstention, and Certified Effects
Nicolás Della Penna

TL;DR
This paper introduces BRACE, a parameter-free phase-doubling algorithm for finite-context square-IV bandit problems with noncompliance. BRACE performs IV inversion only after matrix certification and otherwise returns full-range but honest structural intervals. The paper supports it with theoretical guarantees and empirical validation.
Contribution
The paper formalizes the objective-choice problem in bandits with noncompliance and proposes BRACE, a parameter-free algorithm that offers valid inference and optimal policy identification.
Findings
BRACE achieves simultaneous policy-value validity.
It identifies operationally optimal recommendation and treatment policies.
Experiments demonstrate safety, abstention, and structural uncertainty handling.
Abstract
Bandits with noncompliance separate the learner's recommendation from the treatment actually delivered, so the learning target itself must be chosen. A platform may care about recommendation welfare in the current mediated workflow, treatment learning for a future direct-control regime, or anytime-valid uncertainty for one of those targets. These objectives need not agree. We formalize this objective-choice problem, identify the direct-control regime in which recommendation and treatment objectives collapse, and show by example that recommendation welfare can strictly exceed every learner-measurable treatment policy when downstream actors use private information. For finite-context square-IV problems we propose BRACE, a parameter-free phase-doubling algorithm that performs IV inversion only after matrix certification and otherwise returns full-range but honest structural intervals.…
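The "invert only after certification" idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a two-arm problem, reads the certified matrix as the estimated recommendation-to-treatment matrix, and uses a hypothetical uniform confidence radius `eps` on that matrix's entries. The function name, the certification rule, and the fallback are all assumptions made for illustration.

```python
def certified_iv_estimate(p_hat, m_hat, eps):
    """Toy two-arm IV inversion gated by a certification check.

    p_hat[z][x]: estimated Pr(treatment x | recommendation z), rows sum to 1.
    m_hat[z]:    estimated mean reward under recommendation z.
    eps:         assumed uniform confidence radius on entries of p_hat.

    Returns [mu0, mu1] if the matrix is certified invertible, else None.
    """
    det = p_hat[0][0] * p_hat[1][1] - p_hat[0][1] * p_hat[1][0]
    # For a row-stochastic 2x2 matrix, det = p_hat[0][0] - p_hat[1][0],
    # so entrywise errors of at most eps move det by at most 2 * eps.
    if abs(det) <= 2 * eps:
        return None  # not certified: do not invert
    # Solve p_hat @ mu = m_hat by Cramer's rule.
    mu0 = (m_hat[0] * p_hat[1][1] - p_hat[0][1] * m_hat[1]) / det
    mu1 = (p_hat[0][0] * m_hat[1] - p_hat[1][0] * m_hat[0]) / det
    return [mu0, mu1]


def structural_report(p_hat, m_hat, eps, reward_range=(0.0, 1.0)):
    """Point estimates if certified, else the full reward range per arm."""
    estimate = certified_iv_estimate(p_hat, m_hat, eps)
    if estimate is None:
        return {"certified": False, "intervals": [reward_range, reward_range]}
    return {"certified": True, "estimates": estimate}


# Strong compliance: recommendations shift treatment, so inversion is allowed.
strong = structural_report([[0.9, 0.1], [0.2, 0.8]], [0.34, 0.62], eps=0.05)
# Weak compliance: the two rows are nearly identical, so the sketch abstains.
weak = structural_report([[0.5, 0.5], [0.48, 0.52]], [0.5, 0.5], eps=0.05)
```

In the weak-compliance case the sketch returns the whole reward range rather than a numerically unstable inverse. A fuller version would also propagate the confidence region through the inversion to produce intervals in the certified case.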
Taxonomy
Topics: Advanced Bandit Algorithms Research · Advanced Causal Inference Techniques · Explainable Artificial Intelligence (XAI)
