Linear Regression with Unknown Truncation Beyond Gaussian Features

Alexandros Kouridakis; Anay Mehrotra; Alkis Kalavasis; Constantine Caramanis

arXiv:2602.12534 · stat.ML · February 16, 2026

TL;DR

This paper introduces a polynomial-time algorithm for truncated linear regression with an unknown survival set. The algorithm requires only sub-Gaussian features, whereas previously available algorithms needed strong distributional assumptions such as Gaussianity.

Contribution

The paper presents the first efficient algorithm for truncated linear regression with an unknown survival set, relaxing the distributional assumption on the features to sub-Gaussianity.

Findings

01. The algorithm runs in time polynomial in d and 1/ε

02. Learns unions of intervals using positive examples only

03. Advances positive-only PAC learning methods
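To make the second finding concrete: learning a union of intervals from positive examples only means recovering the set from points known to lie inside it, with no labeled points outside. The sketch below is a naive gap-splitting heuristic, not the paper's algorithm. The function name `fit_intervals`, the assumption that the number of intervals `k` is known, and the uniform toy data are all hypothetical choices made for illustration.

```python
import random


def fit_intervals(positives, k):
    """Naive heuristic (an assumption, not the paper's method): sort the
    positive examples and split them at the k-1 largest gaps."""
    ys = sorted(positives)
    if k < 1 or len(ys) < k:
        raise ValueError("need k >= 1 and at least k positive examples")
    # Indices i where the gap ys[i+1] - ys[i] is among the k-1 largest.
    largest_gaps = sorted(
        range(len(ys) - 1), key=lambda i: ys[i + 1] - ys[i], reverse=True
    )[: k - 1]
    intervals, start = [], 0
    for i in sorted(largest_gaps):
        intervals.append((ys[start], ys[i]))  # close the current interval
        start = i + 1                         # next interval starts after the gap
    intervals.append((ys[start], ys[-1]))
    return intervals


# Toy data: positives drawn uniformly from a hypothetical set [0, 1] U [2, 3].
rng = random.Random(0)
true_set = [(0.0, 1.0), (2.0, 3.0)]
positives = [rng.uniform(*rng.choice(true_set)) for _ in range(1000)]
estimate = fit_intervals(positives, k=2)
print(estimate)
```

Because each estimated endpoint is a sample extreme, the estimate always sits inside the true set and slightly underestimates it. That one-sided error is characteristic of learning from positive examples alone.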

Abstract

In truncated linear regression, samples $(x, y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^{\star}$, and the goal is to estimate the unknown $d$-dimensional regressor $w^{\star}$. This problem has a long history of study in Statistics and Machine Learning going back to the works of (Galton, 1897; Tobin, 1958) and more recently in, e.g., (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024). Despite this long history, however, most prior works are limited to the special case where $S^{\star}$ is precisely known. The more practically relevant case, where $S^{\star}$ is unknown and must be learned from data, remains open: indeed, here the only available algorithms require strong assumptions on the distribution of the feature vectors (e.g., Gaussianity) and, even then, have a $d^{\mathrm{poly}(1/\varepsilon)}$ run time for achieving $\varepsilon$ accuracy. In…
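The truncation model described in the abstract can be simulated in a few lines to show why it matters. The sketch below is a toy illustration under stated assumptions, not anything taken from the paper: it uses a one-dimensional feature drawn uniformly from $[-2, 2]$ (bounded, hence sub-Gaussian), standard Gaussian noise, a hypothetical survival set $[0, 1] \cup [2, 3]$, and plain least squares in place of any truncation-aware estimator. All function names are made up for the example.

```python
import random


def in_survival_set(y, intervals):
    """True if y lies in the union of closed intervals."""
    return any(lo <= y <= hi for lo, hi in intervals)


def ols_slope(pairs):
    """Ordinary least-squares slope (with intercept) for (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    return cov / var


def simulate(n=100_000, w_star=1.0, intervals=((0.0, 1.0), (2.0, 3.0)), seed=0):
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)            # assumed feature distribution
        y = w_star * x + rng.gauss(0.0, 1.0)  # linear model plus Gaussian noise
        full.append((x, y))
        if in_survival_set(y, intervals):     # the sample is shown only if y is in S
            observed.append((x, y))
    return ols_slope(full), ols_slope(observed), len(observed) / n


slope_full, slope_truncated, survival_rate = simulate()
print(f"OLS slope on all samples:       {slope_full:.3f}")
print(f"OLS slope on truncated samples: {slope_truncated:.3f}")
print(f"fraction of samples surviving:  {survival_rate:.3f}")
```

On the untruncated data, least squares recovers a slope close to the true value of 1. On the surviving samples alone, the slope is noticeably biased, which is why truncation has to be modeled explicitly rather than ignored.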

Taxonomy

Topics: Statistical Methods and Inference · Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques