Beyond the storage capacity: data driven satisfiability transition
Pietro Rotondo, Mauro Pastore, Marco Gherardi

TL;DR
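This paper computes the Vapnik-Chervonenkis entropy of a kernel machine operating on structured data, where inputs are grouped into equally labelled subsets. Unlike the unstructured case, the entropy is non-monotonic in the size of the training set and shows an additional critical point besides the storage capacity, which points towards more realistic bounds on the generalization error.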
Contribution
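It introduces a data-driven perspective on the satisfiability transition: for structured data, the Vapnik-Chervonenkis entropy is non-monotonic in the training set size and has a critical point in addition to the storage capacity, an aspect of expressivity that the storage capacity alone does not capture.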
Findings
The Vapnik-Chervonenkis entropy is non-monotonic in the size of the training set.
An additional critical point, besides the storage capacity, is identified.
The same behavior occurs in margin classifiers, even with randomly labelled data.
Abstract
Data structure has a dramatic impact on the properties of neural networks, yet its significance in the established theoretical frameworks is poorly understood. Here we compute the Vapnik-Chervonenkis entropy of a kernel machine operating on data grouped into equally labelled subsets. At variance with the unstructured scenario, entropy is non-monotonic in the size of the training set, and displays an additional critical point besides the storage capacity. Remarkably, the same behavior occurs in margin classifiers even with randomly labelled data, as is elucidated by identifying the synaptic volume encoding the transition. These findings reveal aspects of expressivity lying beyond the condensed description provided by the storage capacity, and they indicate the path towards more realistic bounds for the generalization error of neural networks.
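For orientation, the unstructured baseline that the abstract contrasts with can be sketched numerically. The snippet below is a minimal illustration and is not taken from the paper. It assumes a homogeneous linear classifier on p points in general position in n dimensions and uses Cover's classical counting formula, C(p, n) = 2 Σ_{k=0}^{n-1} binom(p-1, k). The function names are hypothetical. In this baseline the entropy log C(p, n) never decreases with p, and the fraction of realizable labellings equals 1/2 at p = 2n, the storage capacity. The structured-data result described in the abstract is not reproduced here.

```python
from math import comb, log


def cover_count(p: int, n: int) -> int:
    """Cover's count of linearly realizable dichotomies.

    Assumption (not from the paper): p >= 1 points in general position
    in R^n, separated by a hyperplane through the origin.
    """
    if p < 1 or n < 1:
        raise ValueError("p and n must be positive integers")
    # comb(p - 1, k) is 0 when k > p - 1, so the sum saturates at 2**(p - 1).
    return 2 * sum(comb(p - 1, k) for k in range(n))


def vc_entropy(p: int, n: int) -> float:
    """Logarithm of the number of realizable dichotomies (unstructured case)."""
    return log(cover_count(p, n))


def realizable_fraction(p: int, n: int) -> float:
    """Fraction of the 2**p labellings that the classifier can realize."""
    return cover_count(p, n) / 2**p


if __name__ == "__main__":
    n = 10
    for p in (5, 10, 20, 30, 40):
        print(p, round(vc_entropy(p, n), 3), round(realizable_fraction(p, n), 4))
```

Running the loop with n = 10 shows the entropy growing with p throughout, while the realizable fraction drops from 1 through 1/2 at p = 2n. The non-monotonic entropy and the second critical point reported in the abstract are departures from this picture.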
