Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems
Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

TL;DR
This paper introduces efficient algorithms G-ESTT and G-ESTS for the generalized low-rank matrix bandit problem under the GLM framework, achieving improved regret bounds and computational tractability.
Contribution
The paper proposes novel algorithms G-ESTT and G-ESTS that improve regret bounds and computational efficiency for the generalized low-rank matrix bandit problem.
Findings
G-ESTT achieves regret of $\tilde{O}\big(\sqrt{(d_1+d_2)\, M r T}\big)$
G-ESTS achieves regret of $\tilde{O}\big(\sqrt{(d_1+d_2)^{3/2} M r^{3/2} T}\big)$
Experiments show G-ESTS outperforms existing methods in computational efficiency and reward maximization.
Abstract
In the stochastic contextual low-rank matrix bandit problem, the expected reward of an action is given by the inner product between the action's feature matrix and some fixed, but initially unknown, $d_1$ by $d_2$ matrix with rank $r$, and an agent sequentially takes actions based on past experience to maximize the cumulative reward. In this paper, we study the generalized low-rank matrix bandit problem, which has been recently proposed in \cite{lu2021low} under the Generalized Linear Model (GLM) framework. To overcome the computational infeasibility and theoretical restrictions of existing algorithms on this problem, we first propose the G-ESTT framework, which modifies the idea from \cite{jun2019bilinear} by using Stein's method for the subspace estimation and then leverages the estimated subspaces via a regularization idea. Furthermore, we remarkably improve the…
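To make the problem setting concrete, here is a minimal simulation sketch of the reward model described in the abstract: a rank-$r$ parameter matrix, arms given as feature matrices, and an expected reward obtained by passing the matrix inner product through a GLM link. It is not an implementation of G-ESTT or G-ESTS. The logistic link, the fixed Gaussian arm set, the Frobenius normalisation, the uniformly random policy, and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def make_low_rank_matrix(d1, d2, r, rng):
    """Draw a d1-by-d2 matrix of rank r as a product of two Gaussian factors."""
    U = rng.standard_normal((d1, r))
    V = rng.standard_normal((d2, r))
    theta = U @ V.T
    # Frobenius normalisation is an illustrative choice, not taken from the paper.
    return theta / np.linalg.norm(theta)


def logistic(z):
    """Logistic link, one common GLM choice (assumed here)."""
    return 1.0 / (1.0 + np.exp(-z))


def expected_rewards(arms, theta, link=logistic):
    """Return link(<X_k, theta>) for every arm; arms has shape (K, d1, d2)."""
    inner = np.einsum("kij,ij->k", arms, theta)  # matrix inner products
    return link(inner)


def simulate_random_policy(d1=8, d2=6, r=2, n_arms=20, horizon=500, seed=0):
    """Play arms uniformly at random and track cumulative (pseudo-)regret."""
    rng = np.random.default_rng(seed)
    theta = make_low_rank_matrix(d1, d2, r, rng)
    arms = rng.standard_normal((n_arms, d1, d2))   # fixed arm set (assumption)
    mu = expected_rewards(arms, theta)
    choices = rng.integers(n_arms, size=horizon)   # uniform random baseline
    rewards = rng.binomial(1, mu[choices])         # Bernoulli rewards under the logistic link
    regret = np.cumsum(mu.max() - mu[choices])     # gap to the best arm, accumulated
    return theta, mu, rewards, regret


if __name__ == "__main__":
    theta, mu, rewards, regret = simulate_random_policy()
    print("rank of theta:", np.linalg.matrix_rank(theta))
    print("cumulative regret of the random policy:", regret[-1])
```

A random policy accumulates regret linearly in the horizon, which is the baseline that sublinear regret guarantees are meant to improve on.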
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research
