Learning mixtures of structured distributions over discrete domains
Siu-on Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun

TL;DR
This paper introduces an efficient algorithm for learning mixtures of structured distributions over discrete domains, leveraging the property that these distributions can be approximated by histograms with few bins.
Contribution
The paper presents a general method for learning mixtures of distributions that are well-approximated by histograms, applicable to various distribution classes like log-concave and unimodal.
Findings
Efficient algorithms for mixture learning with near-optimal sample complexity.
Distribution classes such as log-concave and unimodal satisfy the histogram approximation property.
The approach achieves near-optimal performance for multiple natural distribution families.
Abstract
Let $\mathfrak{C}$ be a class of probability distributions over the discrete domain $[n] = \{1, \ldots, n\}$. We show that if $\mathfrak{C}$ satisfies a rather general condition -- essentially, that each distribution in $\mathfrak{C}$ can be well-approximated by a variable-width histogram with few bins -- then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from $\mathfrak{C}$. We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems.
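To make the structural condition concrete, the following is a minimal Python sketch of what "well-approximated by a variable-width histogram with few bins" can mean. It is an illustration only, not the paper's learning algorithm: it works with a known probability vector rather than samples, and the helper names (`flatten`, `equal_mass_boundaries`, `tv_distance`), the equal-mass binning rule, and the example mixture of two discretized Gaussians are all assumptions chosen for this sketch.

```python
import numpy as np

def flatten(p, boundaries):
    """Replace p by its average on each interval [boundaries[i], boundaries[i+1])."""
    q = np.empty_like(p, dtype=float)
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):
        q[lo:hi] = p[lo:hi].mean()
    return q

def equal_mass_boundaries(p, t):
    """Variable-width bins: about t intervals, each holding roughly 1/t of the mass.

    This binning rule is an assumption of the sketch, not taken from the paper.
    """
    cdf = np.cumsum(p)
    targets = np.arange(1, t) / t
    cuts = np.searchsorted(cdf, targets) + 1
    # np.unique sorts the cut points and drops duplicates, so every bin is non-empty.
    return np.unique(np.concatenate(([0], cuts, [len(p)])))

def tv_distance(p, q):
    """Total variation distance between two probability vectors."""
    return 0.5 * np.abs(p - q).sum()

def discretized_gaussian(n, mu, sigma):
    """A log-concave distribution over {0, ..., n-1} (hypothetical example component)."""
    x = np.arange(n)
    w = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return w / w.sum()

n = 1000
# A mixture of k = 2 log-concave components with mixing weights 0.4 and 0.6.
mixture = 0.4 * discretized_gaussian(n, 300, 40) + 0.6 * discretized_gaussian(n, 700, 80)

for t in (5, 20, 50):
    histogram = flatten(mixture, equal_mass_boundaries(mixture, t))
    print(t, tv_distance(mixture, histogram))
```

Running the loop prints the total variation distance between the mixture and its flattened version for each bin count. Bins are narrow where the mixture places a lot of mass and wide elsewhere, which is the sense in which the histogram is variable-width.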
Taxonomy
Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Domain Adaptation and Few-Shot Learning
