Asymptotic properties of a multicolored random reinforced urn model with an application to multi-armed bandits

Li Yang; Jiang Hu; Jianghao Li; Zhidong Bai

arXiv:2406.10854·math.ST·June 18, 2024

Open Access

TL;DR

This paper analyzes the asymptotic behavior of a multicolored, multiple-drawing random reinforced urn model, establishes its convergence properties, and applies the results to hypothesis testing in multi-armed bandit problems.

Contribution

It studies a multicolored, multiple-drawing variant of the random reinforced urn model and derives the limiting behavior of the urn composition, strongly convergent estimators of the reinforcement means, and the asymptotic independence of those estimators, with an application to multi-armed bandits.

Findings

01

Established the limiting behavior of the normalized urn composition and strong convergence of the suitably scaled counts of each color.

02

Derived joint asymptotic normality of the reinforcement mean estimators.

03

Showed that the estimators of the largest reinforcement mean are asymptotically independent of the estimators of the smaller reinforcement means.

Abstract

The random self-reinforcement mechanism, characterized by the principle of "the rich get richer", has demonstrated significant utility across various domains. One prominent model embodying this mechanism is the random reinforcement urn model. This paper investigates a multicolored, multiple-drawing variant of the random reinforced urn model. We establish the limiting behavior of the normalized urn composition and demonstrate strong convergence upon scaling the counts of each color. Additionally, we derive strong convergence estimators for the reinforcement means, i.e., for the expectations of the replacement matrix's diagonal elements, and prove their joint asymptotic normality. It is noteworthy that the estimators of the largest reinforcement mean are asymptotically independent of the estimators of the other smaller reinforcement means. Additionally, if a reinforcement mean is not…
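
To make the "rich get richer" mechanism concrete, here is a minimal simulation sketch of a multicolored, multiple-drawing reinforced urn. It is not the paper's exact model. The function name `simulate_urn`, drawing with replacement, and the Uniform(0, 2m) reinforcement law are all assumptions chosen for illustration. The only feature taken from the abstract is that each color is reinforced by its own random amount, whose expectation is that color's reinforcement mean.

```python
import random


def simulate_urn(initial, means, draws_per_step, steps, seed=0):
    """Simulate a hypothetical multiple-drawing randomly reinforced urn.

    Assumptions (illustrative, not taken from the paper):
    - at each step, `draws_per_step` balls are drawn with replacement,
      with probability proportional to the current composition;
    - each drawn ball of color k adds an independent Uniform(0, 2*means[k])
      amount of color k, so the expected reinforcement is means[k].
    """
    rng = random.Random(seed)
    urn = [float(x) for x in initial]
    colors = list(range(len(urn)))
    drawn = [0] * len(urn)    # number of times each color was drawn
    added = [0.0] * len(urn)  # total reinforcement added to each color

    for _ in range(steps):
        # All draws in a step use the composition at the start of the step.
        picks = rng.choices(colors, weights=urn, k=draws_per_step)
        for k in picks:
            x = rng.uniform(0.0, 2.0 * means[k])
            urn[k] += x
            drawn[k] += 1
            added[k] += x

    total = sum(urn)
    proportions = [u / total for u in urn]
    # Sample-mean estimate of each reinforcement mean (NaN if never drawn).
    estimates = [added[k] / drawn[k] if drawn[k] else float("nan")
                 for k in colors]
    return proportions, estimates


if __name__ == "__main__":
    props, est = simulate_urn([1, 1, 1], [1.0, 0.5, 0.25],
                              draws_per_step=2, steps=5000)
    print("normalized composition:", props)
    print("estimated reinforcement means:", est)
```

Under these assumptions, the sample mean of the reinforcements observed for a color is a natural estimator of its reinforcement mean. Colors with smaller means are drawn less often as the urn evolves, so their estimates are based on fewer observations.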

Taxonomy

Topics: Advanced Bandit Algorithms Research