Cost-aware Cascading Bandits

Ruida Zhou; Chao Gan; Jing Yan; Cong Shen

arXiv:1805.08638 · cs.LG · May 23, 2018 · 6 cites

TL;DR

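This paper introduces a cost-aware cascading bandits model in which the learning agent chooses both the order in which items are examined and when to stop examining them, so as to maximize the net reward (the reward minus the random cost of examination). It shows which policy is optimal in the offline setting and proposes an online algorithm with a regret bound.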

Contribution

It proposes a novel cost-aware cascading bandits framework, deriving optimal offline policies and an online algorithm with logarithmic regret bounds.

Findings

01

The Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal in the offline setting.

02

The CC-UCB algorithm achieves O(log T) regret in the online setting.

03

Experimental results validate the effectiveness of the proposed methods.

Abstract

In this paper, we propose a cost-aware cascading bandits model, a new variant of multi-armed bandits with cascading feedback, by considering the random cost of pulling arms. In each step, the learning agent chooses an ordered list of items and examines them sequentially, until a certain stopping condition is satisfied. Our objective is then to maximize the expected net reward in each step, i.e., the reward obtained in each step minus the total cost incurred in examining the items, by deciding the ordered list of items, as well as when to stop examination. We study both the offline and online settings, depending on whether the state and cost statistics of the items are known beforehand. For the offline setting, we show that the Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal. For the online setting, we propose a Cost-aware Cascading Upper Confidence Bound…
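
The offline objective described in the abstract can be illustrated with a small sketch. The abstract does not spell out the model details, so everything specific here is an assumption made for illustration: each item i has a Bernoulli state with mean theta[i] and an examination cost with mean cost[i], examination stops at the first item whose state is 1 (which yields reward 1), and the policy name is read literally as "rank items by theta[i] / cost[i] and keep those above a threshold of 1". The function names are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def expected_net_reward(order, theta, cost):
    """Expected reward minus expected cost when items in `order` are
    examined one by one until the first success (assumed model)."""
    reach = 1.0  # probability that examination reaches the current item
    total = 0.0
    for i in order:
        # If reached, item i pays theta[i] in expected reward and
        # cost[i] in expected cost.
        total += reach * (theta[i] - cost[i])
        reach *= 1.0 - theta[i]  # continue only if item i's state was 0
    return total


def ratio_threshold_list(theta, cost, threshold=1.0):
    """Assumed reading of a unit-cost ranking with a threshold: keep items
    whose mean-state-to-mean-cost ratio exceeds `threshold`, sorted by
    that ratio in decreasing order."""
    keep = [i for i in range(len(theta)) if theta[i] / cost[i] > threshold]
    return sorted(keep, key=lambda i: theta[i] / cost[i], reverse=True)


def brute_force_best(theta, cost):
    """Exhaustively search every ordered sub-list (including the empty
    list, which has net reward 0) for the best expected net reward."""
    best_order, best_value = (), 0.0
    for k in range(1, len(theta) + 1):
        for order in permutations(range(len(theta)), k):
            value = expected_net_reward(order, theta, cost)
            if value > best_value + 1e-12:
                best_order, best_value = order, value
    return best_order, best_value


# Hypothetical statistics for four items (not from the paper).
theta = [0.6, 0.3, 0.5, 0.1]
cost = [0.2, 0.2, 0.4, 0.3]

ranked = ratio_threshold_list(theta, cost)
ranked_value = expected_net_reward(ranked, theta, cost)
best_order, best_value = brute_force_best(theta, cost)
print(ranked, ranked_value, best_order, best_value)
```

On this toy instance the ratio-ranked list can be compared directly against the exhaustive search, which is a quick sanity check of the assumed model rather than a substitute for the paper's optimality proof.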

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms