A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

Bob Mankoff; Robert Nowak; Ervin Tanczos

arXiv:1709.03570 · math.ST · September 13, 2017

TL;DR

This paper introduces a new bandit algorithm combining lil-UCB and KL-LUCB for efficient best-arm identification in large-scale crowdsourcing, supported by theoretical bounds and experimental validation.

Contribution

It proves a novel anytime confidence bound for the mean of bounded distributions and uses it to fuse lil-UCB and KL-LUCB into a single best-arm identification algorithm aimed at large-scale crowdsourcing.

Findings

01

Proves a new confidence bound for bounded rewards

02

Demonstrates improved identification efficiency in experiments

03

Corroborates the theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest

Abstract

This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
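
To make the idea of a KL-based confidence interval for bounded rewards concrete, the following is a minimal sketch in Python. It is not the bound proved in the paper: the Bernoulli KL divergence, the bisection search, and the `threshold` parameter (set here to a fixed log(1/delta) for illustration, rather than an anytime LIL-type threshold) are all assumptions chosen for this example, as are the function names.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_upper_bound(mean, pulls, threshold, iters=50):
    """Largest q in [mean, 1] with pulls * kl(mean, q) <= threshold (bisection).

    `threshold` is a hypothetical exploration level, not the paper's.
    """
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if pulls * bernoulli_kl(mean, mid) <= threshold:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid
    return lo


def kl_lower_bound(mean, pulls, threshold, iters=50):
    """Smallest q in [0, mean] with pulls * kl(mean, q) <= threshold (bisection)."""
    lo, hi = 0.0, mean
    for _ in range(iters):
        mid = (lo + hi) / 2
        if pulls * bernoulli_kl(mean, mid) <= threshold:
            hi = mid  # mid is still inside the confidence region
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    threshold = math.log(1 / 0.05)  # illustrative fixed-confidence level
    for pulls in (10, 100, 1000):
        lower = kl_lower_bound(0.9, pulls, threshold)
        upper = kl_upper_bound(0.9, pulls, threshold)
        print(pulls, round(lower, 4), round(upper, 4))
```

By Pinsker's inequality, kl(p, q) >= 2(p - q)^2, so an interval built this way is never wider than the Hoeffding-style interval at the same threshold, and it always stays inside [0, 1]. This is one reason KL-based bounds are attractive for bounded rewards whose means lie near 0 or 1.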

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference