Incentivizing Exploration with Linear Contexts and Combinatorial Actions

Mark Sellke

arXiv:2306.01990 · cs.GT · September 25, 2024 · 1 citation

TL;DR

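This paper extends incentivized bandit exploration to linear bandits with combinatorial actions. It shows that Thompson sampling becomes Bayesian incentive compatible after enough initial samples when the prior satisfies a natural convexity condition, and it improves the sample complexity of the initial data-collection phase in the semibandit model.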

Contribution

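The paper replaces the prior-independence assumption of earlier work with a natural convexity condition under which Thompson sampling is Bayesian incentive compatible in linear bandits, and it reduces the number of initial samples needed before Thompson sampling can start in the semibandit model.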

Findings

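1. In linear bandits, Thompson sampling becomes Bayesian incentive compatible after enough initial samples are collected, provided the prior satisfies a natural convexity condition.

2. In the semibandit model, the sample complexity of the initial data-collection phase that precedes Thompson sampling is improved.

3. These results open up the possibility of efficient, regret-optimal incentivized exploration in high-dimensional action spaces.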

Abstract

We advance the study of incentivized bandit exploration, in which arm choices are viewed as recommendations and are required to be Bayesian incentive compatible. Recent work has shown under certain independence assumptions that after collecting enough initial samples, the popular Thompson sampling algorithm becomes incentive compatible. We give an analog of this result for linear bandits, where the independence of the prior is replaced by a natural convexity condition. This opens up the possibility of efficient and regret-optimal incentivized exploration in high-dimensional action spaces. In the semibandit model, we also improve the sample complexity for the pre-Thompson sampling phase of initial data collection.
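To make the two-phase structure in the abstract concrete, here is a minimal sketch of Thompson sampling in a linear bandit that starts only after an initial data-collection phase. It is an illustration under stated assumptions, not the paper's algorithm: the Gaussian prior and Gaussian noise, the round-robin initial phase, and the names `linear_thompson_sampling`, `n_initial`, and `prior_var` are all hypothetical choices made for the example. In particular, the sketch does not check Bayesian incentive compatibility or the convexity condition, and the round-robin phase is only a placeholder for an incentive-compatible initial exploration procedure.

```python
import numpy as np


def linear_thompson_sampling(actions, theta_true, n_initial, n_rounds,
                             noise_sd=1.0, prior_var=1.0, seed=0):
    """Two-phase linear bandit sketch (illustrative assumptions only).

    actions:    (k, d) array, one feature vector per action.
    theta_true: (d,) unknown parameter, used here only to simulate rewards.
    """
    rng = np.random.default_rng(seed)
    k, d = actions.shape

    # Assumed prior: theta ~ N(0, prior_var * I). The posterior is tracked
    # through its precision matrix and precision-weighted mean.
    precision = np.eye(d) / prior_var
    b = np.zeros(d)

    choices = []
    for t in range(n_initial + n_rounds):
        if t < n_initial:
            # Placeholder initial phase: cycle through the actions.
            idx = t % k
        else:
            # Thompson sampling: draw theta from the posterior and play
            # the action that is best for the sampled parameter.
            cov = np.linalg.inv(precision)
            cov = (cov + cov.T) / 2  # symmetrize against round-off
            mean = cov @ b
            theta_sample = rng.multivariate_normal(mean, cov)
            idx = int(np.argmax(actions @ theta_sample))

        x = actions[idx]
        reward = x @ theta_true + noise_sd * rng.standard_normal()

        # Conjugate Gaussian update for linear regression.
        precision += np.outer(x, x) / noise_sd**2
        b += x * reward / noise_sd**2
        choices.append(idx)

    posterior_mean = np.linalg.solve(precision, b)
    return choices, posterior_mean
```

A combinatorial action set can be passed as 0/1 indicator vectors, for example `np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])` for the subsets {1}, {2}, and {1, 2} of two base items. The function returns the sequence of chosen action indices and the final posterior mean.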


Videos

Incentivizing Exploration with Linear Contexts and Combinatorial Actions (SlidesLive)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Mobile Crowdsensing and Crowdsourcing