Catoni Contextual Bandits are Robust to Heavy-tailed Rewards
Chenlu Ye, Yujia Jin, Alekh Agarwal, Tong Zhang

TL;DR
This paper introduces a robust algorithm for contextual bandits that handles heavy-tailed rewards, achieving regret bounds that depend only on reward variance and logarithmically on the reward range, improving robustness over traditional methods.
Contribution
The paper develops a new algorithm using Catoni's estimator for robust contextual bandits, with regret bounds that are less sensitive to reward range and heavy tails, including unknown variance scenarios.
Findings
Regret depends only on reward variance and logarithmically on reward range R.
Proposed algorithms are robust to heavy-tailed rewards and unknown variances.
Matching lower bounds demonstrate the optimality of the regret bounds.
Abstract
Typical contextual bandit algorithms assume that the rewards at each round lie in some fixed range [0, R], and their regret scales polynomially with this reward range R. However, many practical scenarios naturally involve heavy-tailed rewards or rewards where the worst-case range can be substantially larger than the variance. In this paper, we develop an algorithmic approach building on Catoni's estimator from robust statistics, and apply it to contextual bandits with general function approximation. When the variance of the reward at each round is known, we use a variance-weighted regression approach and establish a regret bound that depends only on the cumulative reward variance and logarithmically on the reward range R as well as the number of rounds T. For the unknown-variance case, we further propose a careful peeling-based algorithm and remove the need for cumbersome…
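For readers unfamiliar with Catoni's estimator, the following is a minimal sketch of its classical scalar form (robust mean estimation), not the paper's regression-based bandit algorithm. The function names `catoni_psi` and `catoni_mean`, the parameters `variance_bound` and `delta`, and the simplified choice of the scale `alpha` are all assumptions made for illustration. The estimate is the root in theta of the sum of psi(alpha * (x_i - theta)), where psi grows only logarithmically, so a single extreme reward cannot dominate the estimate.

```python
import numpy as np


def catoni_psi(x):
    # Catoni's influence function: roughly linear near 0,
    # but growing only logarithmically for large |x|.
    return np.sign(x) * np.log1p(np.abs(x) + 0.5 * x**2)


def catoni_mean(samples, variance_bound, delta=0.05, tol=1e-10, max_iter=200):
    """Sketch of Catoni's robust mean estimate (hypothetical helper).

    variance_bound: assumed known upper bound on the variance.
    delta: target failure probability, used only to set the scale alpha.
    """
    x = np.asarray(samples, dtype=float)
    n = x.size
    # Simplified scale choice (an assumption; Catoni's analysis uses a
    # slightly more refined expression).
    alpha = np.sqrt(2.0 * np.log(1.0 / delta) / (n * variance_bound))

    # The score sum(psi(alpha * (x - theta))) is non-increasing in theta,
    # non-negative at theta = min(x) and non-positive at theta = max(x),
    # so bisection on [min(x), max(x)] finds the root.
    lo, hi = x.min(), x.max()
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        score = catoni_psi(alpha * (x - mid)).sum()
        if score > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # 99 rewards at 0 and one extreme outlier at 1000.
    rewards = np.array([0.0] * 99 + [1000.0])
    print("sample mean :", rewards.mean())
    print("Catoni mean :", catoni_mean(rewards, variance_bound=1.0))
```

In this toy example the sample mean is pulled far toward the outlier, while the Catoni estimate stays close to the bulk of the data. This illustrates the logarithmic, rather than polynomial, sensitivity to the reward range that the abstract describes.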
Taxonomy
Topics: Advanced Bandit Algorithms Research · Decision-Making and Behavioral Economics · Auction Theory and Applications
