Efficient Statistics With Unknown Truncation, Polynomial Time Algorithms, Beyond Gaussians

Jane H. Lee; Anay Mehrotra; Manolis Zampetakis

arXiv:2410.01656·math.ST·May 12, 2026


TL;DR

This paper develops polynomial-time algorithms for estimating distributional parameters from samples truncated to an unknown set. It extends prior work beyond Gaussians and targets simple truncation sets such as halfspaces, with applications to Gaussian estimation and linear regression.

Contribution

It introduces new algorithms for parameter estimation in exponential families with unknown truncation sets, including Gaussian and linear regression models, under structural assumptions.

Findings

01. First polynomial-time algorithm for Gaussian estimation with unknown truncation.

02. First polynomial-time algorithm for linear regression with unknown truncation and Gaussian features.

03. Develops tools for robust PAC learning with positive and unlabeled samples.

Abstract

We study the estimation of distributional parameters when samples are shown only if they fall in some unknown set $S \subseteq \mathbb{R}^{d}$. Kontonis, Tzamos, and Zampetakis (FOCS'19) gave a $d^{\mathrm{poly}(1/\varepsilon)}$-time algorithm for finding $\varepsilon$-accurate parameters for the special case of Gaussian distributions with diagonal covariance matrix. Recently, Diakonikolas, Kane, Pittas, and Zarifis (COLT'24) showed that this exponential dependence on $1/\varepsilon$ is necessary even when $S$ belongs to some well-behaved classes. These works leave the following open problems which we address in this work: Can we estimate the parameters of any Gaussian or even extend beyond Gaussians? Can we design $\mathrm{poly}(d/\varepsilon)$-time algorithms when $S$ is a simple set such as a halfspace? We make progress on both of these questions by providing the following results:…
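The truncation model in the abstract's first sentence can be illustrated with a small simulation. The sketch below is not taken from the paper and does not implement its estimator. It assumes an identity-covariance Gaussian and a halfspace $S = \{x : \langle w, x \rangle \ge b\}$; the names `sample_truncated_gaussian`, `w`, and `b` are hypothetical, chosen only for illustration. It shows why the naive empirical mean of the visible samples is a biased estimate of the true mean, which is the difficulty the estimation problem has to overcome.

```python
import random


def sample_truncated_gaussian(n, mean, w, b, rng):
    """Draw n samples from N(mean, I), keeping only those in the
    halfspace S = {x : <w, x> >= b} (simple rejection sampling).

    Illustrative assumption: identity covariance and a halfspace S.
    """
    d = len(mean)
    kept = []
    while len(kept) < n:
        x = [rng.gauss(mean[i], 1.0) for i in range(d)]
        # The sample is "shown" only if it falls inside S.
        if sum(wi * xi for wi, xi in zip(w, x)) >= b:
            kept.append(x)
    return kept


def empirical_mean(samples):
    """Coordinate-wise average of a list of equal-length vectors."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[i] for x in samples) / n for i in range(d)]


rng = random.Random(0)          # fixed seed for reproducibility
true_mean = [0.0, 0.0]
w, b = [1.0, 0.0], 0.5          # hypothetical halfspace: x[0] >= 0.5

samples = sample_truncated_gaussian(5000, true_mean, w, b, rng)
naive = empirical_mean(samples)

# The first coordinate is pushed above the threshold b, far from the
# true value 0; the second coordinate is unaffected by the truncation.
print("true mean: ", true_mean)
print("naive mean:", naive)
```

In this toy setting the learner sees `samples` but knows neither `true_mean` nor the pair `(w, b)`, so it cannot simply correct for the truncation. Recovering the parameters efficiently in that situation is the problem the abstract describes.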
