Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

Tanner Fiez; Shreyas Sekar; Liyuan Zheng; Lillian J. Ratliff

arXiv:1807.02297 · cs.LG · July 9, 2018

TL;DR

This paper introduces a multi-armed bandit framework for dynamically matching incentives to users with evolving preferences, optimizing engagement in resource-constrained digital platforms.

Contribution

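The paper develops an algorithm that combines greedy matching, the upper confidence bound (UCB) algorithm for bandits, and Markov chain mixing times. It provides theoretical regret bounds and evaluates the algorithm on synthetic and realistic bike-sharing examples.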

Findings

01 The algorithm achieves sublinear regret.

02 Performance is demonstrated on synthetic examples.

03 The algorithm is also evaluated on a realistic example: matching supply and demand in a bike-sharing platform.

Abstract

The design of personalized incentives or recommendations to improve user engagement is gaining prominence as digital platform providers continually emerge. We propose a multi-armed bandit framework for matching incentives to users, whose preferences are unknown a priori and evolving dynamically in time, in a resource constrained environment. We design an algorithm that combines ideas from three distinct domains: (i) a greedy matching paradigm, (ii) the upper confidence bound algorithm (UCB) for bandits, and (iii) mixing times from the theory of Markov chains. For this algorithm, we provide theoretical bounds on the regret and demonstrate its performance via both synthetic and realistic (matching supply and demand in a bike-sharing platform) examples.
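
The first two ingredients named in the abstract, greedy matching and UCB indices, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the UCB1-style bonus, and the one-incentive-per-agent constraint are assumptions made for illustration, and the sketch ignores the dynamic-preference (mixing time) component entirely.

```python
import math


def ucb_index(mean, pulls, t):
    """Empirical mean plus a UCB1-style exploration bonus (assumed form)."""
    if pulls == 0:
        return math.inf  # untried pairs are explored first
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def greedy_matching(scores):
    """Greedily match agents to incentives in descending order of score.

    scores: dict mapping (agent, incentive) -> index value.
    Hypothetical resource constraint: each agent and each incentive
    appears in at most one matched pair.
    """
    used_agents, used_incentives, matching = set(), set(), {}
    # Sort by score (highest first); ties are broken by the pair's name.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    for (agent, incentive), _ in ordered:
        if agent not in used_agents and incentive not in used_incentives:
            matching[agent] = incentive
            used_agents.add(agent)
            used_incentives.add(incentive)
    return matching


# Illustrative (made-up) empirical means; every pair was tried 10 times.
means = {
    ("a1", "i1"): 0.9,
    ("a1", "i2"): 0.5,
    ("a2", "i1"): 0.8,
    ("a2", "i2"): 0.1,
}
scores = {pair: ucb_index(m, pulls=10, t=40) for pair, m in means.items()}
matching = greedy_matching(scores)
print(matching)
```

With equal pull counts the exploration bonus is identical for every pair, so the greedy pass simply follows the empirical means: it takes (a1, i1) first and is then left with (a2, i2). The resulting total of 1.0 is below the 1.3 of the alternative matching, which shows that a greedy matching need not be the optimal one.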

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Reinforcement Learning in Robotics