Efficient Statistics With Unknown Truncation, Polynomial Time Algorithms, Beyond Gaussians
Jane H. Lee, Anay Mehrotra, Manolis Zampetakis

TL;DR
This paper develops polynomial-time algorithms for estimating distributional parameters from samples truncated to an unknown set, extending beyond Gaussians and covering simple truncation sets such as halfspaces, with applications to Gaussian estimation and linear regression.
Contribution
It introduces new algorithms for parameter estimation in exponential families with unknown truncation sets, including Gaussian and linear regression models, under structural assumptions.
Findings
First polynomial-time algorithm for Gaussian estimation with unknown truncation.
First polynomial-time algorithm for linear regression with unknown truncation and Gaussian features.
Develops tools for robust PAC learning with positive and unlabeled samples.
Abstract
We study the estimation of distributional parameters when samples are shown only if they fall in some unknown set $S \subseteq \mathbb{R}^d$. Kontonis, Tzamos, and Zampetakis (FOCS'19) gave a $d^{\mathrm{poly}(1/\varepsilon)}$ time algorithm for finding $\varepsilon$-accurate parameters for the special case of Gaussian distributions with diagonal covariance matrix. Recently, Diakonikolas, Kane, Pittas, and Zarifis (COLT'24) showed that this exponential dependence on $1/\varepsilon$ is necessary even when $S$ belongs to some well-behaved classes. These works leave the following open problems which we address in this work: Can we estimate the parameters of any Gaussian or even extend beyond Gaussians? Can we design $\mathrm{poly}(d/\varepsilon)$ time algorithms when $S$ is a simple set such as a halfspace? We make progress on both of these questions by providing the following results:…
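To make the sampling model in the abstract concrete, here is a minimal sketch of how truncated data arises. It is not the paper's algorithm. It only simulates the setting, assuming a one-dimensional standard Gaussian and a hypothetical truncation set (the halfline x >= 0.5). The names `sample_truncated` and `halfline` are illustrative and do not come from the paper.

```python
import random

def sample_truncated(n, in_set, mean=0.0, std=1.0, seed=0):
    """Draw n samples from N(mean, std^2) conditioned on in_set(x) being True.

    Rejection sampling: draw from the untruncated Gaussian and keep only
    the draws that land in the set, mimicking how truncated data arises.
    """
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x = rng.gauss(mean, std)
        if in_set(x):
            kept.append(x)
    return kept

# Hypothetical truncation set: the halfline {x : x >= 0.5}, a 1-D halfspace.
# In the paper's setting the learner does NOT know this set.
def halfline(x):
    return x >= 0.5

samples = sample_truncated(20000, halfline)
naive_mean = sum(samples) / len(samples)
print(f"true mean: 0.0, naive mean of truncated samples: {naive_mean:.3f}")
```

Every retained sample lies in the set, so the naive empirical mean sits above 0.5 even though the true mean is 0. This bias is why estimation from truncated samples needs dedicated algorithms, and why it is harder when the set itself is unknown.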
