Learning Poisson Binomial Distributions

Constantinos Daskalakis; Ilias Diakonikolas; Rocco A. Servedio

arXiv:1107.2702·cs.DS·February 18, 2015·2 cites

TL;DR

This paper introduces efficient algorithms for learning Poisson Binomial Distributions, achieving near-optimal sample complexity and running time, and advances understanding of this fundamental distribution class.

Contribution

It provides the first highly efficient algorithms for learning PBDs with near-optimal sample complexity and running time, settling the complexity of this basic problem.

Findings

01

An algorithm learns PBDs with $\tilde{O}(1/\epsilon^{3})$ samples, independent of $n$.

02

A proper learning algorithm uses $\tilde{O}(1/\epsilon^{2})$ samples with near-optimal complexity.

03

The algorithms are nearly optimal in sample complexity, matching lower bounds.

Abstract

We consider a basic problem in unsupervised learning: learning an unknown \emph{Poisson Binomial Distribution}. A Poisson Binomial Distribution (PBD) over $\{0, 1, \dots, n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 \cite{Poisson:37} and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result we give a highly efficient algorithm which learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^{3})$ samples…
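To make the definitions in the abstract concrete, the following is a minimal Python sketch of a PBD and of the total variation distance used as the accuracy measure. It is not the paper's learning algorithm: the function names (`pbd_pmf`, `sample_pbd`, `total_variation`, `empirical_pmf`) and the example parameters are hypothetical, and the empirical histogram at the end is only a naive baseline whose sample cost generally grows with $n$, shown here for illustration.

```python
import random


def pbd_pmf(ps):
    """Exact pmf of a sum of independent Bernoulli(p_i) variables.

    Uses a simple O(n^2) dynamic program: fold in one Bernoulli at a time.
    Returns a list of length n + 1 where entry k is Pr[sum = k].
    """
    pmf = [1.0]  # the empty sum equals 0 with probability 1
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this Bernoulli came up 0
            nxt[k + 1] += mass * p      # this Bernoulli came up 1
        pmf = nxt
    return pmf


def sample_pbd(ps, rng):
    """Draw one sample from the PBD with parameters ps."""
    return sum(1 for p in ps if rng.random() < p)


def total_variation(p, q):
    """Total variation distance between two pmfs on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def empirical_pmf(samples, n):
    """Naive histogram estimate over {0, ..., n} (illustrative baseline only)."""
    counts = [0] * (n + 1)
    for s in samples:
        counts[s] += 1
    return [c / len(samples) for c in counts]


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, chosen arbitrarily
    ps = [0.1, 0.5, 0.9, 0.3]       # hypothetical, non-equal expectations
    true_pmf = pbd_pmf(ps)
    samples = [sample_pbd(ps, rng) for _ in range(20000)]
    estimate = empirical_pmf(samples, len(ps))
    print("TV distance of histogram estimate:", total_variation(true_pmf, estimate))
```

When all entries of `ps` are equal, `pbd_pmf` reduces to the ordinary Binomial pmf, which is the sense in which the PBD is an $n$-parameter generalization of the Binomial Distribution.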

Taxonomy

Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Algorithms and Data Compression