On the probability of linear separability through intrinsic volumes

Felix Kuchelmeister

arXiv:2404.12889 · math.ST · August 26, 2025

TL;DR

This paper derives a formula and bounds for the probability of linear separability in Gaussian datasets, linking geometric properties of polyhedral cones to statistical questions about data separability.

Contribution

It introduces a geometric approach using intrinsic volumes of polyhedral cones to compute the probability of linear separability, extending recent theoretical results.

Findings

01 Provides a formula for the probability of linear separability with Gaussian features

02 Derives an upper bound complementing the recent result of Hayakawa, Lyons, and Oberhauser [2023]

03 Calculates intrinsic volumes using a new projection algorithm

Abstract

A dataset with two labels is linearly separable if it can be split into its two classes with a hyperplane. This inflicts a curse on some statistical tools (such as logistic regression) but forms a blessing for others (e.g. support vector machines). Recently, the following question has regained interest: What is the probability that the data are linearly separable? We provide a formula for the probability of linear separability for Gaussian features and labels depending only on one marginal of the features (as in generalized linear models). In this setting, we derive an upper bound that complements the recent result by Hayakawa, Lyons, and Oberhauser [2023], and a sharp upper bound for sign-flip noise. To prove our results, we exploit that this probability can be expressed as a sum of the intrinsic volumes of a polyhedral cone of the form $\operatorname{span}\{v\} \oplus [0, \infty)^{n}$, as…
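To make the question in the abstract concrete, here is a minimal Monte Carlo sketch in Python. It is not the paper's method, which works through intrinsic volumes. It assumes a toy setting chosen purely for illustration: two-dimensional standard Gaussian features, separating lines through the origin, and labels equal to the sign of the first coordinate, flipped independently with probability `flip_prob`. All function names are hypothetical. When `flip_prob = 0.5` the labels are independent of the features, and the estimate can be compared with the classical count $2^{1-n} \sum_{k=0}^{d-1} \binom{n-1}{k}$ for points in general position (Wendel 1962, Cover 1965).

```python
import math
import random


def separable_through_origin_2d(points, labels):
    """True if some line through the origin strictly separates the classes.

    This holds iff the vectors y_i * x_i all lie in one open half-plane,
    i.e. iff their angles leave a circular gap larger than pi.
    Assumes at least one point and labels in {-1, +1}.
    """
    angles = sorted(
        math.atan2(y * x2, y * x1) for (x1, x2), y in zip(points, labels)
    )
    # Gaps between consecutive angles, plus the wrap-around gap.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))
    return max(gaps) > math.pi


def estimate_separability(n, flip_prob, trials=20000, seed=0):
    """Monte Carlo estimate of the separability probability (toy 2D setting)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        labels = []
        for x1, _x2 in points:
            y = 1 if x1 > 0 else -1      # label depends on one marginal only
            if rng.random() < flip_prob:  # sign-flip noise (assumed model)
                y = -y
            labels.append(y)
        hits += separable_through_origin_2d(points, labels)
    return hits / trials


def noise_only_probability(n, d):
    """Classical Wendel/Cover value for labels independent of the features."""
    return 2.0 ** (1 - n) * sum(math.comb(n - 1, k) for k in range(d))


if __name__ == "__main__":
    n = 6
    for p in (0.0, 0.1, 0.3, 0.5):
        print(f"flip_prob={p}: estimate={estimate_separability(n, p):.4f}")
    print(f"noise-only reference (d=2): {noise_only_probability(n, 2):.4f}")
```

With `flip_prob = 0` every sample is separated by the first coordinate, so the estimate equals 1. Intermediate noise levels can be explored by varying `flip_prob`, and the angular-gap test only applies in two dimensions; higher dimensions would need a linear-programming feasibility check instead.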

Taxonomy

Topics: Soil Geostatistics and Mapping · Statistical and numerical algorithms