Beyond the storage capacity: data driven satisfiability transition

Pietro Rotondo; Mauro Pastore; Marco Gherardi

arXiv:2005.09992·cs.LG·October 21, 2020

TL;DR

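This paper investigates how data structure affects the expressivity of neural networks. For a kernel machine operating on data grouped into equally labelled subsets, the Vapnik-Chervonenkis entropy is non-monotonic in the size of the training set and shows an additional critical point besides the storage capacity. These results point towards more realistic bounds on the generalization error.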

Contribution

The paper takes a data-driven view of the satisfiability transition: it computes the Vapnik-Chervonenkis entropy of a kernel machine on structured data and shows that the entropy is non-monotonic and has a critical point besides the storage capacity.

Findings

01. The Vapnik-Chervonenkis entropy is non-monotonic in the size of the training set.

02. An additional critical point besides the storage capacity is identified.

03. The same behavior occurs in margin classifiers, even with randomly labelled data.

Abstract

Data structure has a dramatic impact on the properties of neural networks, yet its significance in the established theoretical frameworks is poorly understood. Here we compute the Vapnik-Chervonenkis entropy of a kernel machine operating on data grouped into equally labelled subsets. At variance with the unstructured scenario, entropy is non-monotonic in the size of the training set, and displays an additional critical point besides the storage capacity. Remarkably, the same behavior occurs in margin classifiers even with randomly labelled data, as is elucidated by identifying the synaptic volume encoding the transition. These findings reveal aspects of expressivity lying beyond the condensed description provided by the storage capacity, and they indicate the path towards more realistic bounds for the generalization error of neural networks.
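For comparison with the "unstructured scenario" mentioned in the abstract, the following is a minimal sketch of the classical baseline, assuming Cover's counting function for points in general position and a homogeneous linear classifier. It does not reproduce the structured computation of the paper; the function names (`cover_count`, `unstructured_entropy`, `realizable_fraction`) are illustrative and not taken from the source.

```python
from math import comb, log


def cover_count(p: int, n: int) -> int:
    """Cover's count of linearly realizable dichotomies of p points
    in general position in R^n (homogeneous separating hyperplanes)."""
    if p < 1 or n < 1:
        raise ValueError("p and n must be positive integers")
    # C(p, n) = 2 * sum_{k=0}^{n-1} binom(p - 1, k)
    return 2 * sum(comb(p - 1, k) for k in range(n))


def unstructured_entropy(p: int, n: int) -> float:
    """Log of the number of realizable dichotomies (unstructured baseline)."""
    return log(cover_count(p, n))


def realizable_fraction(p: int, n: int) -> float:
    """Fraction of all 2^p dichotomies that are linearly realizable."""
    return cover_count(p, n) / 2 ** p


if __name__ == "__main__":
    n = 20  # input dimension (assumed value, for illustration only)
    for p in (10, 20, 40, 80):
        # Print load alpha = p / n, baseline entropy, and realizable fraction.
        print(p / n, unstructured_entropy(p, n), realizable_fraction(p, n))
```

In this baseline the entropy never decreases as the number of points p grows, and the realizable fraction equals exactly 1/2 at p = 2n, the classical storage capacity of a linear classifier on unstructured points. The non-monotonic entropy described in the abstract is a departure from this behavior.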
