Collaboratively Learning the Best Option, Using Bounded Memory

Lili Su; Martin Zubeldia; Nancy Lynch

arXiv:1802.08159 · cs.LG · November 13, 2018 · 1 citation

TL;DR

This paper shows that social interaction lets groups of individuals with bounded memory learn the best option in multi-armed bandit problems, a task that is provably impossible for an isolated individual. The analysis relies on mean-field approximation.

Contribution

It introduces a novel mean-field analysis approach to study social learning dynamics with bounded memory, proving group convergence to the best option.

Findings

01. As the group size increases, every individual learns the best option with probability approaching 1.

02. The fraction of individuals preferring the best option grows to 1 exponentially fast over time.

03. The mean-field approximation simplifies the analysis by modeling the stochastic dynamics with deterministic ODEs.
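
To make the mean-field idea concrete, here is a minimal sketch that integrates a deterministic ODE for the fraction of individuals preferring each arm. The ODE is an assumption chosen for illustration, not the system derived in the paper: an individual samples arm k with probability q_k = (1 - explore) * x_k + explore / K and adopts it with probability equal to its mean reward. The function name, the reward means, the exploration rate, and the step size are all hypothetical.

```python
def mean_field_trajectory(means=(0.3, 0.5, 0.8), explore=0.05, dt=0.1, steps=500):
    """Forward-Euler integration of an assumed mean-field ODE.

    x[k] is the fraction of the population preferring arm k.
    Assumed (illustrative) drift: dx_k/dt = q_k * p_k - x_k * sum_a(q_a * p_a).
    """
    k = len(means)
    x = [1.0 / k] * k  # start from a uniform split of preferences
    trajectory = [x[:]]
    for _ in range(steps):
        # Probability that a random individual tries each arm this instant.
        q = [(1 - explore) * xi + explore / k for xi in x]
        # Rate at which individuals adopt each arm (tried it and got a reward).
        adopt = [qi * pi for qi, pi in zip(q, means)]
        total = sum(adopt)
        # Inflow to arm k minus outflow from arm k to any adopted arm.
        x = [xi + dt * (ai - xi * total) for xi, ai in zip(x, adopt)]
        trajectory.append(x[:])
    return trajectory


if __name__ == "__main__":
    final = mean_field_trajectory()[-1]
    print([round(v, 3) for v in final])
```

Under these assumptions the fractions always sum to 1, and the share of the highest-mean arm dominates after a short transient. The point of the sketch is only that a population-level quantity can be tracked with a small deterministic system instead of per-individual probabilities.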

Abstract

We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $t \to \infty$) it pulls only the arm with the highest average reward. While this goal is provably impossible for an isolated individual, we show that, in social groups, this goal can be achieved easily with the aid of social persuasion, i.e., communication. Specifically, we study the learning dynamics wherein an individual sequentially decides on which arm to pull next based on not only its private reward feedback but also the suggestions provided by randomly chosen peers. Our learning dynamics are hard to analyze via explicit probabilistic calculations due to the stochastic dependency induced by social interaction. Instead, we employ the mean-field approximation…
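
The kind of dynamics the abstract describes can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' exact protocol: each individual remembers only the index of its preferred arm, rewards are Bernoulli, an individual occasionally tries a uniformly random arm, and otherwise tries the arm suggested by one randomly chosen peer, adopting it only if the pull pays off. The function `simulate` and all of its parameters are hypothetical.

```python
import random


def simulate(num_agents=500, means=(0.3, 0.5, 0.8), steps=200, explore=0.05, seed=0):
    """Return the fraction of agents preferring the best arm after each round.

    Assumed rules (illustrative only): an agent stores a single arm index,
    tries a peer's suggestion or a random arm, and switches on a reward of 1.
    """
    rng = random.Random(seed)
    num_arms = len(means)
    best = max(range(num_arms), key=lambda a: means[a])
    # Bounded memory: one preferred-arm index per agent, nothing else.
    prefs = [rng.randrange(num_arms) for _ in range(num_agents)]
    history = []
    for _ in range(steps):
        snapshot = prefs[:]  # peers report their preference from the last round
        for i in range(num_agents):
            if rng.random() < explore:
                arm = rng.randrange(num_arms)  # occasional independent exploration
            else:
                arm = snapshot[rng.randrange(num_agents)]  # suggestion from a random peer
            # Bernoulli reward; adopt the tried arm only if it paid off.
            if rng.random() < means[arm]:
                prefs[i] = arm
        history.append(sum(p == best for p in prefs) / num_agents)
    return history


if __name__ == "__main__":
    fractions = simulate()
    print(f"round 1: {fractions[0]:.2f}, final round: {fractions[-1]:.2f}")
```

With the illustrative parameters above, most of the population ends up preferring the highest-mean arm, even though no individual keeps reward counts or estimates. Note that each agent's update depends on other agents' states, which is the stochastic dependency the abstract identifies as the obstacle to direct probabilistic analysis.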

Taxonomy

Topics: Advanced Bandit Algorithms Research · Quantum many-body systems · Opinion Dynamics and Social Influence