Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making
Abhirami Pillai

TL;DR
This paper introduces Budget-Constrained Causal Bandits (BCCB), an online learning framework for treatment allocation under budget constraints that is effective in cold-start scenarios and outperforms traditional offline methods and existing online algorithms.
Contribution
The paper presents BCCB, a unified online approach that learns ad effectiveness, explores uncertain responses, and manages budget pacing simultaneously, addressing cold-start challenges.
Findings
BCCB is effective from the first user, whereas offline methods need 10,000 observations.
BCCB has 3-5x lower variance in performance across runs.
BCCB outperforms Thompson Sampling, budgeted Thompson Sampling, and greedy HTE estimation.
Abstract
Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard approach follows a two-stage offline pipeline: first collect historical data to estimate heterogeneous treatment effects (HTE), then solve a constrained optimization to allocate the budget. This works well with abundant data, but fails in cold-start settings such as new campaigns, new markets, or new customer segments where little historical data exists. We propose Budget-Constrained Causal Bandits (BCCB), an online framework that learns which users respond to ads while simultaneously spending the budget, making treatment decisions one user at a time. BCCB unifies three components into a single sequential process: learning individual-level ad effectiveness, exploring users whose response is…
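To make the idea of learning, exploring, and pacing in one sequential loop concrete, here is a minimal toy sketch in Python. It is not the paper's BCCB algorithm, whose details are not given in this summary. It assumes discrete user segments, binary conversions, a fixed cost per ad, Beta(1, 1) priors with Thompson-style sampling of the uplift, and a simple additive pacing threshold. The function name `run_paced_uplift_bandit`, its parameters, and the 0.01 step size are all hypothetical choices made for illustration.

```python
import random


def run_paced_uplift_bandit(users, budget, cost=1.0, seed=0):
    """Toy loop: sampled uplift per segment plus a budget-pacing threshold.

    `users` is a list of (segment, p_control, p_treated) tuples. The true
    probabilities are used only to simulate outcomes, never to decide.
    """
    rng = random.Random(seed)
    posteriors = {}  # segment -> {arm: [alpha, beta]} with Beta(1, 1) priors
    spent, treated, conversions, threshold = 0.0, 0, 0, 0.0
    horizon = len(users)

    for t, (segment, p_control, p_treated) in enumerate(users):
        arms = posteriors.setdefault(segment, {0: [1, 1], 1: [1, 1]})

        # Exploration: draw plausible conversion rates from each posterior,
        # so uncertain segments sometimes look attractive and get tried.
        sampled_uplift = rng.betavariate(*arms[1]) - rng.betavariate(*arms[0])

        # Decision: treat only if affordable and uplift per unit cost
        # clears the current pacing threshold.
        can_afford = spent + cost <= budget
        treat = can_afford and (sampled_uplift / cost) > threshold
        if treat:
            spent += cost
            treated += 1

        # Learning: simulate the outcome and update the chosen arm.
        converted = rng.random() < (p_treated if treat else p_control)
        conversions += int(converted)
        arms[int(treat)][0 if converted else 1] += 1

        # Pacing: compare spend with a uniform schedule over the horizon;
        # raise the bar when ahead of schedule, lower it when behind.
        target_spend = budget * (t + 1) / horizon
        threshold += 0.01 if spent > target_spend else -0.01
        threshold = max(-1.0, min(1.0, threshold))

    return {"spent": spent, "treated": treated, "conversions": conversions}


if __name__ == "__main__":
    # Hypothetical population: segment "a" responds to ads, segment "b" does not.
    population = [("a", 0.05, 0.30)] * 200 + [("b", 0.10, 0.10)] * 200
    print(run_paced_uplift_bandit(population, budget=50.0))
```

The loop starts deciding from the first user with no historical data, which is the cold-start property the abstract emphasizes. The affordability check guarantees the budget is never exceeded, while the threshold only spreads spending over time. A real system would replace the per-segment Beta posteriors with a model over user features.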
