Does Sparse Connectivity Improve Generalization? Convolutional Networks Below the Edge of Stability

Tongtong Liang; Esha Singh; Rahul Parhi; Alexander Cloninger; Yu-Xiang Wang

arXiv:2603.04807 · stat.ML · May 8, 2026

TL;DR

This paper investigates how sparse connectivity in convolutional networks influences generalization, revealing that small receptive fields and structured patch geometry enable better generalization below the Edge of Stability.

Contribution

The paper shows that sparse connectivity changes the constraint imposed by the stability condition, which yields non-vacuous generalization bounds and helps explain the generalization advantage of convolutional architectures.

Findings

01. Sparsely connected networks process low-dimensional patches rather than the full input vector, which improves the generalization bounds.

02. The structured patch geometry of natural images supports better generalization.

03. When patches are unstructured, overfitting is not prevented, which highlights the importance of data geometry.

Abstract

Gradient descent on overparameterized neural networks typically operates at the Edge of Stability (EoS), where the largest Hessian eigenvalue hovers around a step-size-dependent threshold. We study how sparse connectivity changes generalization below this threshold in two-layer ReLU networks. Prior results have shown that for fully-connected networks (FCNs), generalization guarantees in this regime degrade and become vacuous on high-dimensional spherical inputs. Our analysis reveals that sparse connectivity fundamentally alters this picture. Under sparse connectivity, the network processes a collection of low-dimensional patches rather than the full input vector, so the effective constraint imposed by the stability condition is governed by the geometry of the training patch collection. We prove that when the receptive fields are small relative to the ambient dimension, the effective…
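The patch-based view in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' construction: the helper names (extract_patches, sparse_two_layer_relu, below_edge_of_stability), the 1-D contiguous patches, and the sum-pooling over patches are all assumptions made for illustration. The threshold 2 / step_size is the value commonly used in the Edge of Stability literature for gradient descent; the abstract itself only says the threshold is step-size-dependent.

```python
import numpy as np


def extract_patches(x, k, stride=1):
    """Return all contiguous length-k windows of the 1-D vector x.

    Hypothetical helper: each row is one low-dimensional patch (dimension k)
    taken from the ambient input (dimension d = len(x)).
    """
    d = x.shape[0]
    starts = range(0, d - k + 1, stride)
    return np.stack([x[s:s + k] for s in starts])


def sparse_two_layer_relu(x, W, b, v, k):
    """Two-layer ReLU model whose hidden units only ever see length-k patches.

    Assumed shapes: W is (m, k), b is (m,), v is (m,).
    Each hidden unit is applied to every patch and the activations are
    summed over patches (an illustrative pooling choice).
    """
    P = extract_patches(x, k)            # (num_patches, k)
    H = np.maximum(P @ W.T + b, 0.0)     # (num_patches, m)
    return float(H.sum(axis=0) @ v)


def below_edge_of_stability(sharpness, step_size):
    """True if the largest Hessian eigenvalue is under 2 / step_size."""
    return sharpness < 2.0 / step_size


x = np.arange(6.0)                       # ambient dimension d = 6
patches = extract_patches(x, k=3)        # 4 patches, each of dimension 3
W = np.array([[1.0, 0.0, 0.0]])          # one hidden unit reading the first patch coordinate
b = np.array([0.0])
v = np.array([1.0])
output = sparse_two_layer_relu(x, W, b, v, k=3)
```

In this sketch the hidden weights live in dimension k rather than d, which is the sense in which the network sees a collection of low-dimensional patches instead of the full input vector.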
