Fixed confidence community mode estimation

Meera Pai; Nikhil Karamchandani; Jayakrishnan Nair

arXiv:2309.12687 · math.ST · September 25, 2023

TL;DR

This paper develops and analyzes algorithms for estimating the largest community in a population with fixed confidence, demonstrating how identity information can improve sampling efficiency.

Contribution

The paper introduces two sampling models for community mode estimation (identityless and identity-based), derives lower bounds on the expected number of samples, and proposes asymptotically optimal algorithms, highlighting the benefits of identity information.

Findings

01. Identity information improves sample efficiency.

02. Lower bounds match the proposed algorithms' complexity.

03. Algorithms are asymptotically optimal.

Abstract

Our aim is to estimate the largest community (a.k.a., mode) in a population composed of multiple disjoint communities. This estimation is performed in a fixed confidence setting via sequential sampling of individuals with replacement. We consider two sampling models: (i) an identityless model, wherein only the community of each sampled individual is revealed, and (ii) an identity-based model, wherein the learner is able to discern whether or not each sampled individual has been sampled before, in addition to the community of that individual. The former model corresponds to the classical problem of identifying the mode of a discrete distribution, whereas the latter seeks to capture the utility of identity information in mode estimation. For each of these models, we establish information theoretic lower bounds on the expected number of samples needed to meet the prescribed confidence…
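To make the two observation models concrete, the following is a minimal simulation sketch, not the paper's algorithms. It assumes a toy population given as a list of community sizes, and all function names (`make_population`, `sample_identityless`, `sample_identity_based`, `run`) are hypothetical. It uses a fixed number of samples purely for illustration; a fixed-confidence procedure would instead decide when to stop adaptively, and the per-community distinct-individual count shown here is only one plausible way identity information could be summarized.

```python
import random
from collections import Counter


def make_population(sizes):
    """Map each individual (list index) to its community label."""
    return [c for c, n in enumerate(sizes) for _ in range(n)]


def sample_identityless(population, rng):
    """Identityless model: only the sampled individual's community is revealed."""
    return population[rng.randrange(len(population))]


def sample_identity_based(population, seen, rng):
    """Identity-based model: also reveal whether the individual was sampled before."""
    i = rng.randrange(len(population))
    is_new = i not in seen
    seen.add(i)
    return population[i], is_new


def run(sizes, num_samples, seed=0):
    """Draw num_samples observations under each model (illustrative only)."""
    rng = random.Random(seed)
    population = make_population(sizes)

    # Identityless: all the learner can track is how often each community appears.
    counts = Counter()
    for _ in range(num_samples):
        counts[sample_identityless(population, rng)] += 1

    # Identity-based: the learner can additionally count distinct individuals
    # seen per community, which can never exceed the true community size.
    seen = set()
    distinct = Counter()
    for _ in range(num_samples):
        community, is_new = sample_identity_based(population, seen, rng)
        if is_new:
            distinct[community] += 1

    return counts, distinct


if __name__ == "__main__":
    counts, distinct = run([5, 3, 2], num_samples=50)
    print("appearance counts:", dict(counts))
    print("distinct individuals seen:", dict(distinct))
```

In the identityless case the appearance counts are draws from a discrete distribution proportional to the community sizes, which is the classical mode-identification setting the abstract mentions. In the identity-based case the distinct counts are bounded by the true sizes, which hints at why repeat information could help.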

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Data Stream Mining Techniques