Catching a Moving Subspace: Low-Rank Bandits Beyond Stationarity
Hamed Khosravi, Xiaoming Huo

TL;DR
This paper introduces a method for low-rank bandit problems with drifting subspaces, achieving tight theoretical bounds and demonstrating superior empirical performance across diverse datasets.
Contribution
It characterizes the identification boundary for recovering moving subspaces and develops an algorithm with optimal dynamic regret in non-stationary low-rank bandits.
Findings
The moving subspace is recoverable from single-play scalar rewards only under probe-side conditions involving the noise (known variance, bounded state-noise coupling) and the probe support.
The proposed SPSC algorithm achieves regret bounds that scale with the intrinsic rank rather than the ambient dimension.
Empirical results show SPSC outperforms baselines on multiple real-world and synthetic datasets.
Abstract
Many bandit deployments (recommendation, clinical dosing, ad targeting) share two facts that prior work handles only in isolation: rewards live on a low-dimensional latent subspace, and that subspace drifts. Stationary low-rank bandits exploit rank but break under subspace change; non-stationary linear bandits adapt to drift but pay a rate that scales with the ambient dimension. We study piecewise-stationary low-rank linear contextual bandits with scalar feedback, in which the low-rank factor is constant within each unknown segment and can shift at segment boundaries. Our results are tight along three axes. (i) Identification boundary. With single-play scalar rewards, the moving subspace is recoverable through quadratic functionals of rewards iff three probe-side conditions hold: known noise variance, bounded state-noise coupling, and…
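To make the setting concrete, here is a minimal simulation sketch of a piecewise-stationary low-rank linear bandit with scalar feedback. It is not the paper's SPSC algorithm or its exact model. The reward form (context times a parameter that lies in a rank-r subspace, plus Gaussian noise), the function names `random_subspace` and `simulate`, and all default values are assumptions chosen for illustration.

```python
import numpy as np


def random_subspace(d, r, rng):
    """Return a d x r matrix with orthonormal columns (a random subspace basis)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return q


def simulate(d=20, r=2, horizon=300, boundaries=(100, 200), noise_std=0.1, seed=0):
    """Simulate scalar rewards whose parameter lies in a subspace that shifts.

    Assumed model (illustrative only): reward_t = x_t . (U_t w) + noise,
    where U_t is constant within a segment and is redrawn at each boundary.
    """
    rng = np.random.default_rng(seed)
    basis = random_subspace(d, r, rng)
    weights = rng.standard_normal(r)  # latent coordinates, held fixed here
    contexts = np.empty((horizon, d))
    rewards = np.empty(horizon)
    bases = []
    for t in range(horizon):
        if t in boundaries:
            # Subspace shift at a segment boundary.
            basis = random_subspace(d, r, rng)
        x = rng.standard_normal(d)
        x /= np.linalg.norm(x)  # unit-norm context
        theta = basis @ weights  # ambient parameter confined to the subspace
        contexts[t] = x
        rewards[t] = x @ theta + noise_std * rng.standard_normal()
        bases.append(basis)
    return contexts, rewards, bases
```

The learner in this sketch would observe only `contexts` and `rewards`; `bases` is returned so that a subspace estimate can be compared against the ground truth in each segment.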
