Combinatorial Allocation Bandits with Nonlinear Arm Utility

Yuki Shibukawa; Koichi Tanaka; Yuta Saito; Shinji Ito

arXiv:2603.07005·cs.LG·March 10, 2026

TL;DR

This paper introduces Combinatorial Allocation Bandits (CAB), an online learning framework for matching platforms in which the learner's objective is to maximize arm satisfaction. The proposed algorithms achieve near-optimal regret bounds and perform well in synthetic experiments.

Contribution

The paper formulates the CAB problem, which incorporates arm satisfaction into the learner's objective, and develops upper confidence bound (UCB) and Thompson sampling (TS) algorithms, supported by theoretical regret guarantees and empirical validation.

Findings

01. The proposed algorithms achieve near-optimal regret bounds.

02. The algorithms outperform existing methods in synthetic experiments.

03. Maximizing arm satisfaction is motivated as a way to reduce participant churn on matching platforms.

Abstract

A matching platform is a system that matches different types of participants, such as companies and job-seekers. In such a platform, merely maximizing the number of matches can result in matches being concentrated on highly popular participants, which may increase dissatisfaction among other participants, such as companies, and ultimately lead to their churn, reducing the platform's profit opportunities. To address this issue, we propose a novel online learning problem, Combinatorial Allocation Bandits (CAB), which incorporates the notion of *arm satisfaction*. In CAB, at each round $t = 1, \dots, T$, the learner observes $K$ feature vectors corresponding to $K$ arms for each of $N$ users, assigns each user to an arm, and then observes feedback following a generalized linear model (GLM). Unlike prior work, the learner's objective is not to maximize the amount of positive feedback, but…
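The interaction protocol described in the abstract can be illustrated with a minimal sketch of a single round. This is not the paper's algorithm: the logistic link, the feature dimension, the Gaussian features, and the uniformly random assignment policy are all assumptions made for illustration, and the names `simulate_round`, `theta`, and `arm_positives` are hypothetical. The sketch deliberately stops at tallying positive feedback per arm, because the abstract is truncated before it defines the arm-satisfaction objective.

```python
import numpy as np


def sigmoid(z):
    """Logistic link, assumed here as one possible GLM link function."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate_round(rng, theta, n_users, n_arms):
    """Simulate one round: observe features, assign users, observe feedback.

    theta is a hypothetical unknown GLM parameter of shape (d,).
    """
    d = theta.shape[0]
    # features[i, k] is the feature vector for user i paired with arm k
    # (Gaussian features are an assumption, not taken from the paper).
    features = rng.normal(size=(n_users, n_arms, d)) / np.sqrt(d)

    # Placeholder policy: assign each user to an arm uniformly at random.
    # A UCB or TS learner would replace this step.
    assignment = rng.integers(0, n_arms, size=n_users)

    # Binary feedback drawn from the GLM for each chosen (user, arm) pair.
    chosen = features[np.arange(n_users), assignment]  # shape (n_users, d)
    probs = sigmoid(chosen @ theta)
    feedback = rng.binomial(1, probs)

    # Positive feedback received by each arm in this round.
    arm_positives = np.bincount(assignment, weights=feedback, minlength=n_arms)
    return features, assignment, feedback, arm_positives


rng = np.random.default_rng(0)
theta = rng.normal(size=5)
features, assignment, feedback, arm_positives = simulate_round(
    rng, theta, n_users=8, n_arms=3
)
```

Tallying feedback per arm, rather than only in total, reflects the abstract's point that the objective depends on how matches are distributed across arms, not only on how many occur.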

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Mobile Crowdsensing and Crowdsourcing