Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach
Arman Rahbar, Niklas Åkerblom, Morteza Haghir Chehreghani

TL;DR
This paper introduces a cost-aware combinatorial multi-armed bandit framework for online decision making, balancing test costs and accuracy, with theoretical analysis and real-world experiments demonstrating its effectiveness.
Contribution
The paper presents a novel formulation that incorporates test costs into combinatorial multi-armed bandits, and it analyzes Thompson Sampling in this setting.
Findings
Thompson Sampling effectively balances test cost against decision accuracy.
The framework outperforms baseline methods in experiments.
A theoretical analysis of Thompson Sampling backs the framework's use on real-world problems.
Abstract
Online decision making plays a crucial role in numerous real-world applications. In many scenarios, the decision is made based on performing a sequence of tests on the incoming data points. However, performing all tests can be expensive and is not always possible. In this paper, we provide a novel formulation of the online decision making problem based on combinatorial multi-armed bandits and take the (possibly stochastic) cost of performing tests into account. Based on this formulation, we provide a new framework for cost-efficient online decision making which can utilize posterior sampling or BayesUCB for exploration. We provide a theoretical analysis of Thompson Sampling for cost-efficient online decision making, and present various experimental results that demonstrate the applicability of our framework to real-world problems.
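To make the idea of cost-aware posterior sampling concrete, here is a minimal sketch in Python. It is not the paper's algorithm or formulation. It is a toy combinatorial semi-bandit, assuming that each test yields a Bernoulli "usefulness" outcome with an unknown mean, that each test has a known fixed cost on the same scale, and that a test is worth performing when its mean usefulness exceeds its cost. The names `thompson_select` and `run`, the Beta(1, 1) priors, and the selection rule are all illustrative assumptions.

```python
import random


def thompson_select(alpha, beta, costs, rng):
    """Draw one sample per test from its Beta posterior and keep
    the tests whose sampled mean exceeds their cost (toy rule)."""
    chosen = []
    for i, cost in enumerate(costs):
        theta = rng.betavariate(alpha[i], beta[i])
        if theta > cost:
            chosen.append(i)
    return chosen


def run(true_means, costs, rounds=2000, seed=0):
    """Simulate the toy problem and return per-test pull counts
    together with the final Beta posterior parameters."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k  # Beta(1, 1) prior: assumed, uniform on [0, 1]
    beta = [1.0] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Semi-bandit feedback: an outcome is observed only for
        # the tests that were actually performed this round.
        for i in thompson_select(alpha, beta, costs, rng):
            outcome = 1 if rng.random() < true_means[i] else 0
            alpha[i] += outcome
            beta[i] += 1 - outcome
            pulls[i] += 1
    return pulls, alpha, beta


if __name__ == "__main__":
    # Test 0 is useful and cheap; test 1 is rarely useful and costly.
    pulls, _, _ = run(true_means=[0.9, 0.1], costs=[0.3, 0.6])
    print("times each test was performed:", pulls)
```

In this toy setting the sampler keeps performing the test whose usefulness outweighs its cost and quickly stops paying for the other one, which is the exploration and cost trade-off the abstract refers to. Stochastic costs, or BayesUCB in place of posterior sampling, would need changes to the selection rule that are not shown here.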
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques
