High-Dimensional Linear Bandits under Stochastic Latent Heterogeneity
Elynn Chen, Xi Chen, Wenbo Jing, Xiao Liu

TL;DR
This paper introduces a framework for high-dimensional linear bandits with stochastic latent heterogeneity, modeling unobserved subgroups and establishing fundamental limits on regret due to inherent classification uncertainty.
Contribution
The paper proposes a phased EM-greedy algorithm that jointly learns latent groups and rewards, and it identifies a stochastic barrier that prevents sublinear regret against fully informed oracles.
Findings
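The phased EM-greedy algorithm achieves optimal estimation and classification guarantees.
Strong regret, measured against a fully informed oracle, grows linearly because the classification uncertainty is irreducible.
Regular regret is sublinear, and the bound is minimax-optimal.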
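The gap between strong and regular regret can be illustrated with a toy simulation. The sketch below is not the paper's model or algorithm: the two-group, two-arm setup, the logistic gating, and the parameters `GAMMA` and `BETA` are hypothetical choices made for illustration. It also assumes that the "regular" benchmark is a policy that knows every model parameter but not the realized group. Even that policy accumulates regret against a strong oracle that sees each individual's group, and the total grows roughly in proportion to the horizon.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy parameters (assumptions, not taken from the paper).
GAMMA = [1.0, -0.5]  # gating: P(group 1 | x) = sigmoid(x . GAMMA)
BETA = {             # BETA[group][arm] = reward coefficients
    0: {0: [0.5, 0.0], 1: [-0.5, 0.2]},
    1: {0: [-0.4, 0.1], 1: [0.6, 0.0]},
}

def mean_reward(x, group, arm):
    return dot(x, BETA[group][arm])

def strong_oracle_arm(x, group):
    # Knows the realized group, so it picks the best arm for that group.
    return max((0, 1), key=lambda a: mean_reward(x, group, a))

def regular_oracle_arm(x):
    # Knows GAMMA and BETA but not the realized group, so it maximizes
    # the reward averaged over the group probabilities.
    p1 = sigmoid(dot(x, GAMMA))
    def expected(a):
        return (1.0 - p1) * mean_reward(x, 0, a) + p1 * mean_reward(x, 1, a)
    return max((0, 1), key=expected)

def cumulative_strong_regret(horizon, seed=0):
    rng = random.Random(seed)
    total, path = 0.0, []
    for _ in range(horizon):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        group = 1 if rng.random() < sigmoid(dot(x, GAMMA)) else 0
        best = mean_reward(x, group, strong_oracle_arm(x, group))
        got = mean_reward(x, group, regular_oracle_arm(x))
        total += best - got  # never negative: best is the per-group maximum
        path.append(total)
    return path

path = cumulative_strong_regret(4000)
print("regret at t=1000, 2000, 4000:", path[999], path[1999], path[3999])
```

Doubling the horizon roughly doubles the accumulated regret in this toy setting, which is the signature of linear growth. No learning is involved here, so the regret comes entirely from the randomness of the group draw rather than from estimation error.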
Abstract
This paper addresses the critical challenge of stochastic latent heterogeneity in online decision-making, where individuals' responses to actions vary not only with observable contexts but also with unobserved, randomly realized subgroups. Existing data-driven approaches largely capture observable heterogeneity through contextual features but fail when the sources of variation are latent and stochastic. We propose a latent heterogeneous bandit framework that explicitly models probabilistic subgroup membership and group-specific reward functions, using promotion targeting as a motivating example. Our phased EM-greedy algorithm jointly learns latent group probabilities and reward parameters in high dimensions, achieving optimal estimation and classification guarantees. Our analysis reveals a new phenomenon unique to decision-making with stochastic latent subgroups: randomness in group…
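As a rough illustration of how an EM procedure can jointly estimate group probabilities and group-specific reward parameters, the sketch below fits a two-component mixture of linear regressions. It is a heavily simplified stand-in, not the paper's phased EM-greedy algorithm: it assumes a scalar context, a single arm, a context-independent mixing probability, a known noise level `sigma`, and no high-dimensional regularization. All names (`em_mixture_regression`, `b0`, `b1`, `pi1`) and the synthetic data are hypothetical.

```python
import math
import random

def stable_sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def em_mixture_regression(xs, ys, sigma, b0, b1, pi1=0.5, iters=50):
    """EM for y = b_g * x + noise, with latent group g in {0, 1}."""
    ws = []
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to group 1.
        ws = []
        for x, y in zip(xs, ys):
            log_odds = (math.log(pi1) - math.log(1.0 - pi1)
                        + ((y - b0 * x) ** 2 - (y - b1 * x) ** 2)
                        / (2.0 * sigma ** 2))
            ws.append(stable_sigmoid(log_odds))
        # M-step: update the mixing probability and weighted least-squares slopes.
        pi1 = min(max(sum(ws) / len(ws), 1e-6), 1.0 - 1e-6)
        num1 = sum(w * x * y for w, x, y in zip(ws, xs, ys))
        den1 = sum(w * x * x for w, x in zip(ws, xs)) + 1e-12
        num0 = sum((1.0 - w) * x * y for w, x, y in zip(ws, xs, ys))
        den0 = sum((1.0 - w) * x * x for w, x in zip(ws, xs)) + 1e-12
        b1, b0 = num1 / den1, num0 / den0
    # ws holds the posteriors computed in the final E-step.
    return b0, b1, pi1, ws

# Synthetic data under assumed ground truth: slopes -1 and 2, P(group 1) = 0.4.
rng = random.Random(1)
TRUE_B0, TRUE_B1, TRUE_PI1, SIGMA = -1.0, 2.0, 0.4, 0.3
xs, ys = [], []
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    slope = TRUE_B1 if rng.random() < TRUE_PI1 else TRUE_B0
    xs.append(x)
    ys.append(slope * x + rng.gauss(0.0, SIGMA))

b0_hat, b1_hat, pi1_hat, posteriors = em_mixture_regression(
    xs, ys, SIGMA, b0=-0.5, b1=0.5)
print("estimated slopes:", b0_hat, b1_hat, "estimated P(group 1):", pi1_hat)
```

The posteriors stay close to the prior mixing probability for contexts near zero, where the two groups produce almost the same response. A greedy step would then choose the action with the highest estimated reward under these posteriors; that part is omitted here.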
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Smart Grid Energy Management
Methods: Attentive Walk-Aggregating Graph Neural Network
