Learning Poisson Binomial Distributions
Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio

TL;DR
This paper gives efficient algorithms for learning Poisson Binomial Distributions (PBDs) with near-optimal sample complexity and running time; the number of samples needed does not depend on the number n of summands.
Contribution
It provides the first highly efficient algorithms for learning PBDs with near-optimal sample complexity and running time, settling the complexity of this basic problem.
Findings
An algorithm learns PBDs to accuracy \(\epsilon\) using \(\tilde{O}(1/\epsilon^3)\) samples, independent of n.
A proper learning algorithm uses \(\tilde{O}(1/\epsilon^2)\) samples.
The proper learner's sample complexity is nearly optimal: any algorithm for this problem must use \(\Omega(1/\epsilon^2)\) samples.
Abstract
We consider a basic problem in unsupervised learning: learning an unknown \emph{Poisson Binomial Distribution}. A Poisson Binomial Distribution (PBD) over $\{0,1,\dots,n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 \cite{Poisson:37} and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result we give a highly efficient algorithm which learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^3)$ samples…
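To make the definition in the abstract concrete, here is a minimal Python sketch of a PBD: it computes the exact probability mass function of a sum of independent Bernoulli variables by dynamic programming, draws samples, and measures total variation distance. This is not the paper's learning algorithm. The function names (`pbd_pmf`, `sample_pbd`, `empirical_pmf`, `total_variation`) and the example parameters are illustrative assumptions, and the empirical histogram is only a naive baseline for comparison.

```python
import random


def pbd_pmf(probs):
    """Exact pmf of X = sum of independent Bernoulli(p_i), as a list of length n + 1."""
    pmf = [1.0]  # distribution of the empty sum: X = 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this Bernoulli contributes 0
            nxt[k + 1] += mass * p      # this Bernoulli contributes 1
        pmf = nxt
    return pmf


def sample_pbd(probs, rng):
    """Draw one sample from the PBD with the given expectations."""
    return sum(1 for p in probs if rng.random() < p)


def empirical_pmf(samples, n):
    """Naive histogram estimate over {0, ..., n} (a baseline, not the paper's method)."""
    counts = [0] * (n + 1)
    for s in samples:
        counts[s] += 1
    return [c / len(samples) for c in counts]


def total_variation(p, q):
    """Total variation distance between two pmfs on the same support."""
    if len(p) != len(q):
        raise ValueError("pmfs must have the same length")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.1, 0.5, 0.9, 0.3]  # hypothetical, non-equal expectations
    truth = pbd_pmf(probs)
    samples = [sample_pbd(probs, rng) for _ in range(10_000)]
    estimate = empirical_pmf(samples, len(probs))
    print("exact pmf:", truth)
    print("TV(empirical, exact):", total_variation(estimate, truth))
```

When all the expectations are equal, `pbd_pmf` reduces to the ordinary Binomial pmf, which is the sense in which a PBD is an n-parameter generalization of the Binomial Distribution.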
Taxonomy
Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Algorithms and Data Compression
