CONQUER: Confusion Queried Online Bandit Learning

Daniel Barsky; Koby Crammer

arXiv:1510.08974·cs.LG·January 26, 2016

TL;DR

This paper introduces CONQUER, an online bandit learning framework for recommendation systems that pick two items to show a user based on context. Its algorithms are second-order and use relative upper-confidence bounds or sampling to explore; the paper supports them with a regret bound analysis and experiments on product reviews.

Contribution

It proposes a novel second-order algorithm framework for two-item recommendation, proves a regret bound for one algorithm in the framework, and validates the methods empirically on product reviews from 33 domains.

Findings

01

UCB-based algorithms are less effective than greedy or sampling methods.

02

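One algorithm in the framework, analyzed in an adversarial setting, achieves a regret bound of $O(Q_{T} + \sqrt{T Q_{T} \log T} + \sqrt{T} \log T)$, where $T$ is the number of rounds and $Q_{T}$ is the cumulative approximation error of a linear model of item values.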

03

Experimental results show advantages over related algorithms across 33 domains.
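
As a quick reading of the regret bound in finding 02 (plain arithmetic on the stated expression, not a result quoted from the paper): if the linear model fits the item values exactly, so that $Q_{T} = 0$, the first two terms vanish and only the last one remains:

$$
O\!\left(Q_{T} + \sqrt{T Q_{T} \log T} + \sqrt{T} \log T\right)\Big|_{Q_{T} = 0} = O\!\left(\sqrt{T} \log T\right)
$$

More generally, the middle term is $o(T)$ exactly when $Q_{T} \log T = o(T)$, so the whole bound is sublinear in $T$ whenever $Q_{T} = o(T / \log T)$.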

Abstract

We present a new recommendation setting for picking out two items from a given set to be highlighted to a user, based on contextual input. These two items are presented to a user who chooses one of them, possibly stochastically, with a bias that favours the item with the higher value. We propose a second-order algorithm framework whose members use relative upper-confidence bounds to trade off exploration and exploitation, and some of which explore via sampling. We analyze one algorithm in this framework in an adversarial setting with only mild assumptions on the data, and prove a regret bound of $O(Q_{T} + \sqrt{T Q_{T} \log T} + \sqrt{T} \log T)$, where $T$ is the number of rounds and $Q_{T}$ is the cumulative approximation error of item values using a linear model. Experiments with product reviews from 33 domains show the advantage of our methods over algorithms designed for related settings,…
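
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the CONQUER algorithm as specified in the paper; it is a minimal illustration under stated assumptions. The helper names (`pick_pair`, `user_choice`, `simulate`), the logistic choice model, the `eta` and `temperature` parameters, and the regularised least-squares update are all hypothetical choices made here for illustration. It shows the general shape of a round: each item has a context vector, the learner shows two items (one greedy, one chosen by an optimistic relative estimate), and the only feedback is which of the two the user picked.

```python
import numpy as np


def pick_pair(w, A_inv, X, eta=1.0):
    """Pick two distinct item indices from the rows of X.

    First item: highest estimated value w @ x.
    Second item: highest optimistic estimate of its value relative to the
    first, i.e. the score difference plus a confidence width (assumed form).
    """
    scores = X @ w
    first = int(np.argmax(scores))
    diffs = X - X[first]  # x_i - x_first for every item i
    # Width sqrt(d^T A^{-1} d) of each relative estimate, clipped at zero.
    quad = np.einsum("ij,jk,ik->i", diffs, A_inv, diffs)
    widths = np.sqrt(np.maximum(quad, 0.0))
    optimistic = scores - scores[first] + eta * widths
    optimistic[first] = -np.inf  # force a distinct second item
    second = int(np.argmax(optimistic))
    return first, second


def user_choice(rng, value_a, value_b, temperature=0.5):
    """Return 0 if the user picks the first item, 1 otherwise.

    Assumed logistic model: the higher-valued item is favoured, not certain.
    """
    p_first = 1.0 / (1.0 + np.exp(-(value_a - value_b) / temperature))
    return 0 if rng.random() < p_first else 1


def simulate(rounds=200, n_items=10, dim=5, seed=0):
    """Run the loop and return the fraction of rounds in which the
    truly best item was one of the two items shown."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)  # hidden linear model of item values
    w = np.zeros(dim)              # learner's estimate
    A = np.eye(dim)                # regularised second-order matrix
    b = np.zeros(dim)
    shown_best = 0
    for _ in range(rounds):
        X = rng.normal(size=(n_items, dim))  # one context vector per item
        values = X @ w_true
        i, j = pick_pair(w, np.linalg.inv(A), X)
        choice = user_choice(rng, values[i], values[j])
        # Feedback is only the relative preference between the shown pair.
        d = X[i] - X[j]
        y = 1.0 if choice == 0 else -1.0
        A += np.outer(d, d)
        b += y * d
        w = np.linalg.solve(A, b)
        shown_best += int(np.argmax(values) in (i, j))
    return shown_best / rounds
```

Calling `simulate()` returns a value between 0 and 1. Setting `eta=0.0` in `pick_pair` reduces the second pick to a purely greedy runner-up, which gives a simple way to compare an optimistic rule against a greedy one in this toy setting.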

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems