Fundamental Limits of Learning High-dimensional Simplices in Noisy Regimes

Seyed Amir Hossein Saberi; Amir Najafi; Abolfazl Motahari; Babak H. Khalaj

arXiv:2506.10101·stat.ML·June 13, 2025

TL;DR

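This paper establishes sample complexity bounds for learning high-dimensional simplices in R^K from data corrupted by additive Gaussian noise. It proves an upper bound of (K^2/ε^2) e^{O(K/SNR^2)} samples and a lower bound of Ω(K^3 σ^2/ε^2 + K/ε) samples, and shows that the noisy sample complexity matches the noiseless one when SNR ≥ Ω(√K).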

Contribution

The paper introduces new bounds for simplex learning in noisy regimes, extending previous work with a Fourier-based recovery method and matching lower bounds in specific SNR regimes.

Findings

01

n ≥ (K^2/ε^2) e^{O(K/SNR^2)} samples suffice to learn a simplex in R^K to within ℓ2 or total variation (TV) distance ε, with high probability.

02

Estimating the simplex to within TV distance ε requires at least n ≥ Ω(K^3 σ^2/ε^2 + K/ε) samples, where σ^2 is the noise variance.

03

When SNR ≥ Ω(√K), the sample complexity of the noisy case matches that of the noiseless case.
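
As a rough numerical illustration of the bounds listed above, the sketch below evaluates both expressions for given K, ε, SNR, and σ. The function names and the constant `c` are hypothetical: the paper states the bounds only up to the unspecified constants hidden in O(·) and Ω(·), so `c = 1` is a placeholder and the outputs indicate scaling, not actual sample counts.

```python
import math


def upper_bound_samples(K, eps, snr, c=1.0):
    """Evaluate (K^2 / eps^2) * exp(c * K / snr^2).

    c is a hypothetical stand-in for the constant hidden in O(K / SNR^2).
    """
    return (K ** 2 / eps ** 2) * math.exp(c * K / snr ** 2)


def lower_bound_samples(K, eps, sigma, c=1.0):
    """Evaluate c * (K^3 * sigma^2 / eps^2 + K / eps).

    c is a hypothetical stand-in for the constant hidden in Omega(.).
    """
    return c * (K ** 3 * sigma ** 2 / eps ** 2 + K / eps)
```

With `snr = math.sqrt(K)` the exponent equals `c`, so the exponential factor in the upper bound reduces to a constant; with `sigma = 0` the lower bound reduces to its `K / eps` term.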

Abstract

In this paper, we establish sample complexity bounds for learning high-dimensional simplices in $\mathbb{R}^{K}$ from noisy data. Specifically, we consider $n$ i.i.d. samples uniformly drawn from an unknown simplex in $\mathbb{R}^{K}$, each corrupted by additive Gaussian noise of unknown variance. We prove an algorithm exists that, with high probability, outputs a simplex within $\ell_{2}$ or total variation (TV) distance at most $\varepsilon$ from the true simplex, provided $n \geq (K^{2} / \varepsilon^{2}) e^{O(K / \mathrm{SNR}^{2})}$, where $\mathrm{SNR}$ is the signal-to-noise ratio. Extending our prior work \citep{saberi2023sample}, we derive new information-theoretic lower bounds, showing that simplex estimation within TV distance $\varepsilon$ requires at least $n \geq \Omega(K^{3} \sigma^{2} / \varepsilon^{2} + K / \varepsilon)$ samples, where $\sigma^{2}$ denotes the noise variance. In the…
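
The sampling model described in the abstract (points drawn uniformly from a simplex, then corrupted by additive Gaussian noise) can be sketched as follows. This is a minimal illustration, not the paper's code: the function name, the `seed` parameter, and the choice of isotropic noise with the same standard deviation `sigma` in every coordinate are assumptions. Uniform sampling relies on the standard fact that normalised i.i.d. Exp(1) draws form Dirichlet(1, ..., 1) weights, and a convex combination of the vertices with those weights is uniform on the simplex.

```python
import random


def sample_noisy_simplex(vertices, n, sigma, seed=None):
    """Draw n points uniformly from the simplex spanned by `vertices`
    (K + 1 points in R^K) and add Gaussian noise to each coordinate.

    Assumption: the noise is isotropic with standard deviation sigma.
    """
    rng = random.Random(seed)
    dim = len(vertices[0])
    samples = []
    for _ in range(n):
        # Normalised Exp(1) draws give Dirichlet(1, ..., 1) weights.
        raw = [rng.expovariate(1.0) for _ in vertices]
        total = sum(raw)
        weights = [r / total for r in raw]
        # Convex combination of the vertices, plus per-coordinate noise.
        point = [
            sum(w * v[j] for w, v in zip(weights, vertices))
            + rng.gauss(0.0, sigma)
            for j in range(dim)
        ]
        samples.append(point)
    return samples
```

For example, `sample_noisy_simplex([[0, 0], [1, 0], [0, 1]], 1000, 0.1, seed=0)` returns 1000 noisy observations of the standard triangle in R^2; setting `sigma=0` returns the clean points.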

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods