Feature and Parameter Selection in Stochastic Linear Bandits

Ahmadreza Moradipari; Berkay Turan; Yasin Abbasi-Yadkori; Mahnoosh Alizadeh; Mohammad Ghavamzadeh

arXiv:2106.05378·cs.LG·June 20, 2022·1 cites

Open Access

TL;DR

This paper introduces efficient algorithms for feature and parameter selection in stochastic linear bandits, achieving near-optimal regret bounds and demonstrating effectiveness through synthetic and real-world experiments.

Contribution

The paper proposes computationally efficient algorithms for model selection in linear bandits, with theoretical regret guarantees that improve the dependence on the number of models.

Findings

01 Regret bounds are no worse, up to a $\log M$ factor, than in the case where the true model is known.

02 Algorithms perform well in synthetic and real-world experiments.

03 The regret bounds achieve the best-reported dependence on the number of models $M$.

Abstract

We study two model selection settings in stochastic linear bandits (LB). In the first setting, which we refer to as feature selection, the expected reward of the LB problem is in the linear span of at least one of $M$ feature maps (models). In the second setting, the reward parameter of the LB problem is arbitrarily selected from $M$ models represented as (possibly) overlapping balls in $\mathbb{R}^{d}$. However, the agent only has access to misspecified models, i.e., estimates of the centers and radii of the balls. We refer to this setting as parameter selection. For each setting, we develop and analyze a computationally efficient algorithm that is based on a reduction from bandits to full-information problems. This allows us to obtain regret bounds that are not worse (up to a $\log M$ factor) than the case where the true model is known. This is the best-reported dependence on the…
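To illustrate why a full-information reduction can yield a mild dependence on $M$, here is a minimal sketch of the standard exponential-weights (Hedge) update over $M$ candidate models. It is not the paper's algorithm: the loss matrix, the learning rate `eta`, and the names `hedge_weights` and `run_hedge` are assumptions made for illustration, and the losses are synthetic values in $[0, 1]$.

```python
import numpy as np

def hedge_weights(cum_losses, eta):
    """Exponential-weights distribution over M models from cumulative losses."""
    shifted = cum_losses - cum_losses.min()  # shift for numerical stability
    w = np.exp(-eta * shifted)
    return w / w.sum()

def run_hedge(loss_matrix, eta):
    """Run Hedge on a (T, M) matrix of per-round losses in [0, 1].

    Full information: after each round, the loss of every model is observed.
    Returns the learner's cumulative expected loss and each model's total loss.
    """
    T, M = loss_matrix.shape
    cum = np.zeros(M)
    learner_loss = 0.0
    for t in range(T):
        p = hedge_weights(cum, eta)
        learner_loss += float(p @ loss_matrix[t])  # expected loss under p
        cum += loss_matrix[t]                      # all M losses are revealed
    return learner_loss, cum

# Hypothetical synthetic data: model 3 has the smallest losses on average.
rng = np.random.default_rng(0)
T, M = 2000, 8
losses = rng.uniform(0.0, 1.0, size=(T, M))
losses[:, 3] *= 0.5

eta = np.sqrt(8.0 * np.log(M) / T)      # standard tuning for losses in [0, 1]
learner_loss, cum = run_hedge(losses, eta)
regret = learner_loss - cum.min()       # regret against the best single model
bound = np.sqrt(T * np.log(M) / 2.0)    # classical Hedge guarantee
print(f"regret = {regret:.2f}, bound = {bound:.2f}")
```

With this tuning, the classical guarantee bounds the regret against the best of the $M$ models by $\sqrt{(T/2)\ln M}$, so the number of models enters only logarithmically in this full-information toy setting.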

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Age of Information Optimization