The Geometric Structure of Models Learning Sparse Data

Thomas Walker; T. Mitchell Roddenberry; Ahmed Imtiaz Humayun; Randall Balestriero; Richard Baraniuk

arXiv:2605.08464 · cs.LG · May 18, 2026

TL;DR

This paper argues that models succeed in sparse data regimes by exploiting a highly structured local geometry, which it formalizes as normal alignment, and it proposes regularization strategies that improve robustness and training efficiency.

Contribution

The paper formalizes normal alignment, proves its benefits for training and robustness, and introduces GrokAlign and RFAMs as practical methods based on these insights.

Findings

01. Normal-aligned classifiers minimize the training objective under norm constraints.

02. GrokAlign accelerates training dynamics in deep networks.

03. RFAMs show increased adversarial robustness over RFMs.

Abstract

The manifold hypothesis (MH) is often used to explain how machine learning can overcome the curse of dimensionality. However, the MH is only applicable in regimes where the training data provides a sufficiently dense sample of the underlying low-dimensional data manifold, or where such a low-dimensional manifold is conceivably present. We describe the regimes where the MH is not applicable as sparse. In this paper, we demonstrate that models succeed in the sparse regime by exploiting a highly structured local geometry, a property we formalize as normal alignment. We prove that normal-aligned classifiers -- whose input-output Jacobians are rank-one and align perfectly with the training data -- minimize the training objective under norm constraints and achieve maximal local robustness under a non-zero Jacobian constraint. For continuous piecewise-affine deep networks, normal alignment…
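The abstract describes normal-aligned classifiers as having input-output Jacobians that are rank-one and aligned with the training data. A minimal numerical sketch of that property follows. The score used here, the fraction of the Jacobian's squared Frobenius norm that lies along the input direction, is an illustrative assumption and not the paper's definition; the helper names `jacobian_fd` and `alignment_score` and both toy models are hypothetical. The score equals 1 exactly when every row of a non-zero Jacobian is parallel to the input, which also forces the Jacobian to be rank-one.

```python
import math


def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, returned as a list of rows."""
    n = len(x)
    cols = []
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        # Column j holds the partial derivatives of every output w.r.t. x[j].
        cols.append([(a - b) / (2 * eps) for a, b in zip(f_plus, f_minus)])
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]


def alignment_score(J, x):
    """Illustrative score (an assumption, not the paper's metric).

    Returns ||J x_hat||^2 / ||J||_F^2, which lies in [0, 1] and equals 1
    only if every row of J is parallel to x (so J is rank-one and aligned).
    """
    norm_x = math.sqrt(sum(v * v for v in x))
    x_hat = [v / norm_x for v in x]
    energy_along_x = sum(sum(r * h for r, h in zip(row, x_hat)) ** 2 for row in J)
    total_energy = sum(r * r for row in J for r in row)
    return energy_along_x / total_energy if total_energy > 0 else 0.0


# Hypothetical training point.
x0 = [3.0, 4.0]


def aligned_model(x):
    # Both outputs depend on x only through its projection onto x0,
    # so the Jacobian is rank-one with rows parallel to x0.
    s = x0[0] * x[0] + x0[1] * x[1]
    return [2.0 * s, -1.0 * s]


def generic_model(x):
    # Identity map: the Jacobian is full rank and not aligned with x0.
    return [x[0], x[1]]


score_aligned = alignment_score(jacobian_fd(aligned_model, x0), x0)
score_generic = alignment_score(jacobian_fd(generic_model, x0), x0)
print(score_aligned, score_generic)
```

Under these assumptions the aligned toy model scores close to 1 and the identity map scores close to 0.5. For a real network the finite-difference Jacobian would normally be replaced by automatic differentiation.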
