Efficient Statistics, in High Dimensions, from Truncated Samples
Constantinos Daskalakis, Themis Gouleakis, Christos Tzamos, and Manolis Zampetakis

TL;DR
This paper gives a polynomial-time algorithm for estimating the mean and covariance of a multivariate normal distribution from truncated samples, given oracle access to the truncation set. The problem is a classical one, going back to Galton, Pearson, and Fisher.
Contribution
The paper presents the first polynomial-time algorithm for accurate parameter estimation of truncated multivariate normals with oracle access to the truncation set.
Findings
Accurate estimation of mean and covariance is possible with oracle access.
Any non-trivial estimation is impossible without oracle access to the truncation set.
Arbitrary accuracy is achievable in polynomial time, provided the truncation set has non-trivial measure under the unknown normal distribution.
Abstract
We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples. Truncated samples from a $d$-variate normal $\mathcal{N}(\mu, \Sigma)$ means that a sample is only revealed if it falls in some subset $S \subseteq \mathbb{R}^d$; otherwise the sample is hidden, and the count of hidden samples in proportion to the revealed samples is also hidden. We show that the mean $\mu$ and covariance matrix $\Sigma$ can be estimated with arbitrary accuracy in polynomial time, as long as we have oracle access to $S$, and $S$ has non-trivial measure under the unknown $d$-variate normal distribution. Additionally, we show that without oracle access to $S$, any non-trivial estimation is impossible.
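To make the sampling model in the abstract concrete, the following is a minimal sketch of how truncated samples arise and why the naive sample mean is a biased estimate of the true mean. It is an illustration only, not the paper's estimation algorithm. The oracle `in_truncation_set` (here the positive orthant), the identity covariance, and all function names are assumptions chosen for the example.

```python
import random

def in_truncation_set(x):
    """Hypothetical membership oracle for S: the positive orthant (an assumption)."""
    return all(coord > 0.0 for coord in x)

def draw_truncated_samples(mean, n, oracle, rng):
    """Draw n revealed samples from N(mean, I) truncated to the set accepted by oracle.

    Identity covariance is a simplifying assumption for this illustration.
    """
    samples = []
    while len(samples) < n:
        x = [rng.gauss(m, 1.0) for m in mean]
        if oracle(x):
            # Only points inside S are revealed; rejected draws are
            # discarded and their number is never recorded.
            samples.append(x)
    return samples

def empirical_mean(samples):
    """Coordinate-wise average of the revealed samples."""
    d = len(samples[0])
    return [sum(x[i] for x in samples) / len(samples) for i in range(d)]

rng = random.Random(0)
true_mean = [0.0, 0.0]
truncated = draw_truncated_samples(true_mean, 5000, in_truncation_set, rng)
naive_mean = empirical_mean(truncated)
print("true mean: ", true_mean)
print("naive mean:", naive_mean)
```

With this choice of S, each coordinate of the revealed samples follows a half-normal distribution, whose mean is sqrt(2/pi), roughly 0.8, rather than 0. The gap between the naive average and the true mean is the bias that an estimator for truncated data has to correct, using only the revealed samples and membership queries to S.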
