Prior Ordering and Monotonicity in Dirichlet Bandits

Yaming Yu

arXiv:1101.4903 · math.ST · January 26, 2011 · 1 cites

Open Access

TL;DR

This paper investigates how the expected payoff and optimal strategies in Dirichlet bandit problems change with prior information, revealing monotonic relationships and extending classical results in Bayesian bandit theory.

Contribution

It establishes new monotonicity properties of the maximum expected payoff with respect to Dirichlet process priors, settling a conjecture and extending previous work on Bernoulli bandits.

Findings

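01 For a fixed prior weight, the maximum expected payoff increases as the prior mean becomes larger in the increasing convex order.

02 For a fixed prior mean, the maximum expected payoff decreases as the prior weight increases.

03 Results generalize classical bandit theory to Dirichlet process priors.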

Abstract

One of two independent stochastic processes (arms) is to be selected at each of n stages. The selection is sequential and depends on past observations as well as the prior information. Observations from arm i are independent given a distribution P_i, and, following Clayton and Berry (1985), P_i's have independent Dirichlet process priors. The objective is to maximize the expected future-discounted sum of the n observations. We study structural properties of the bandit, in particular how the maximum expected payoff and the optimal strategy vary with the Dirichlet process priors. The main results are (i) for a particular arm and a fixed prior weight, the maximum expected payoff increases as the mean of the Dirichlet process prior becomes larger in the increasing convex order; (ii) for a fixed prior mean, the maximum expected payoff decreases as the prior weight increases. Specializing to…
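
Results (i) and (ii) in the abstract can be checked numerically in a simple special case. The sketch below is an illustration, not the paper's method: it assumes 0/1 observations, so the Dirichlet process prior with mean `mean` and weight `weight` reduces to a Beta(mean * weight, (1 - mean) * weight) prior, it assumes the second arm has a known expected payoff `known_mean`, and it assumes geometric discounting with factor `discount`. The function name `max_expected_payoff` and all parameter names are hypothetical. The maximum expected payoff is computed by backward induction over the number of remaining stages.

```python
from functools import lru_cache


def max_expected_payoff(mean, weight, known_mean, horizon, discount=1.0):
    """Maximum expected discounted payoff over `horizon` stages.

    Assumed setting (not taken from the paper): arm 1 yields 0/1 rewards
    with a Beta(mean * weight, (1 - mean) * weight) prior on its success
    probability; arm 2 pays `known_mean` in expectation at every pull.
    """

    @lru_cache(maxsize=None)
    def value(successes, failures, remaining):
        if remaining == 0:
            return 0.0
        # Posterior Beta parameters after the observed successes and failures.
        a = mean * weight + successes
        b = (1.0 - mean) * weight + failures
        p = a / (a + b)  # posterior mean of the unknown arm
        # Pull the unknown arm: collect the reward, then update the posterior.
        pull_unknown = (
            p * (1.0 + discount * value(successes + 1, failures, remaining - 1))
            + (1.0 - p) * discount * value(successes, failures + 1, remaining - 1)
        )
        # Pull the known arm: the posterior on the unknown arm is unchanged.
        pull_known = known_mean + discount * value(successes, failures, remaining - 1)
        return max(pull_unknown, pull_known)

    return value(0, 0, horizon)


if __name__ == "__main__":
    # Fixed prior mean 0.5, increasing prior weight.
    for w in (1, 2, 5, 10, 50):
        print("weight", w, max_expected_payoff(0.5, w, 0.5, 10))
    # Fixed prior weight 2, increasing prior mean.
    for m in (0.3, 0.5, 0.7):
        print("mean", m, max_expected_payoff(m, 2, 0.5, 10))
```

In this restricted setting the printed values should be non-increasing in the weight for a fixed mean and non-decreasing in the mean for a fixed weight, which matches the direction of the two main results stated above.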

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques