An Arm-Wise Randomization Approach to Combinatorial Linear Semi-Bandits
Kei Takemura, Shinji Ito

TL;DR
This paper introduces an arm-wise randomization technique for combinatorial linear semi-bandits, improving performance in clustered feature scenarios through new algorithms and theoretical regret bounds.
Contribution
The paper proposes a novel arm-wise randomization method and two algorithms, PC^2UCB and TS, with theoretical analysis and empirical validation for clustered feature cases.
Findings
Outperforms existing algorithms in clustered scenarios
Provides high-probability asymptotic regret bounds
Demonstrates effectiveness on real-world datasets
Abstract
Combinatorial linear semi-bandits (CLS) are widely applicable frameworks of sequential decision-making, in which a learner chooses a subset of arms from a given set of arms associated with feature vectors. Existing algorithms work poorly for the clustered case, in which the feature vectors form several large clusters. This shortcoming is critical in practice because the clustered case arises in many applications, including recommender systems. In this paper, we clarify why such a shortcoming occurs, and we introduce a key technique of arm-wise randomization to overcome it. We propose two algorithms with this technique: the perturbed C^2UCB (PC^2UCB) and the Thompson sampling (TS). Our empirical evaluation with artificial and real-world datasets demonstrates that the proposed algorithms with the arm-wise randomization technique outperform the existing algorithms without this technique,…
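The abstract names arm-wise randomization without giving its mechanics. The following is a minimal sketch of one plausible reading, not the authors' algorithm: in a linear Thompson-sampling style scorer, draw a separate parameter sample for each arm instead of one sample shared by all arms in a round. The function names (sample_scores, select_top_k), the Gaussian sampling distribution, the ridge statistics V and b, and the top-k selection rule are all assumptions made for illustration.

```python
import numpy as np

def sample_scores(X, V, b, scale=1.0, arm_wise=True, rng=None):
    """Score arms by sampled linear rewards (hypothetical helper).

    X: (n_arms, d) feature matrix; V: (d, d) regularized Gram matrix;
    b: (d,) sum of reward-weighted features. These statistics and the
    Gaussian form are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ b            # ridge-regression estimate
    cov = scale ** 2 * V_inv         # sampling covariance
    if arm_wise:
        # One independent parameter draw per arm, shape (n_arms, d).
        thetas = rng.multivariate_normal(theta_hat, cov, size=X.shape[0])
        return np.einsum("ij,ij->i", X, thetas)
    # Round-wise: a single draw shared by every arm.
    theta = rng.multivariate_normal(theta_hat, cov)
    return X @ theta

def select_top_k(scores, k):
    """Return indices of the k highest-scoring arms (assumed selection rule)."""
    return np.argsort(-scores)[:k]
```

Under this sketch, arms with identical features receive identical scores when the draw is shared, so a whole cluster is ranked together; with per-arm draws their scores differ, which is one way to picture why per-arm randomness could matter in the clustered case.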
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics
