Efficient Parameter Estimation of Truncated Boolean Product Distributions
Dimitris Fotakis, Alkis Kalavasis, Christos Tzamos

TL;DR
This paper investigates the problem of learning Boolean product distributions from truncated samples, introducing the concept of set fatness to enable efficient parameter estimation and extending results to ranking models.
Contribution
It introduces the notion of fatness of truncation sets, enabling efficient learning from truncated samples in discrete models, and generalizes to ranking distributions.
Findings
Samples from sufficiently fat truncation sets contain enough information about the true distribution.
Efficient learning is possible under certain conditions on the truncation set.
The approach extends to Mallows models for ranking distributions.
Abstract
We study the problem of estimating the parameters of a Boolean product distribution in d dimensions, when the samples are truncated by a set S accessible through a membership oracle. This is the first time that the computational and statistical complexity of learning from truncated samples is considered in a discrete setting. We introduce a natural notion of fatness of the truncation set S, under which truncated samples reveal enough information about the true distribution. We show that if the truncation set is sufficiently fat, samples from the true distribution can be generated from truncated samples. A stunning consequence is that virtually any statistical task (e.g., learning in total variation distance, parameter estimation, uniformity or identity testing) that can be performed efficiently for Boolean product distributions, can also be performed from…
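To make the sampling model in the abstract concrete, the following is a minimal sketch of how truncated samples could arise: draw from a Boolean product distribution and keep only the points that a membership oracle accepts. It illustrates the data-generating process only, not the paper's estimation method. All names (`sample_product`, `sample_truncated`, `in_S`, `empirical_mean`) and the example oracle are assumptions made for illustration.

```python
import random

def sample_product(p, rng):
    """Draw one point from the Boolean product distribution with
    coordinate-wise success probabilities p (hypothetical helper)."""
    return tuple(1 if rng.random() < p_i else 0 for p_i in p)

def sample_truncated(p, in_S, rng, max_tries=100_000):
    """Rejection sampling: redraw until the membership oracle in_S accepts.
    The learner in the truncated setting only ever sees these accepted points."""
    for _ in range(max_tries):
        x = sample_product(p, rng)
        if in_S(x):
            return x
    raise RuntimeError("oracle rejected every draw; the set may have tiny mass")

def empirical_mean(samples):
    """Coordinate-wise average of a list of equal-length 0/1 tuples."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[i] for x in samples) / n for i in range(d)]

def demo(n=2000, seed=0):
    """Assumed example: d = 3, uniform parameters, and a truncation set
    containing the points with at least two ones."""
    rng = random.Random(seed)
    p = [0.5, 0.5, 0.5]
    in_S = lambda x: sum(x) >= 2
    samples = [sample_truncated(p, in_S, rng) for _ in range(n)]
    return samples, empirical_mean(samples)
```

In this assumed example the truncation set contains four points (110, 101, 011, 111), and each coordinate equals 1 in three of them, so the naive empirical mean of the truncated samples concentrates near 0.75 rather than the true parameter 0.5. This bias is why estimating parameters from truncated samples needs more than plain averaging.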
