On the Sample Complexity of One Hidden Layer Networks with Equivariance, Locality and Weight Sharing
Arash Behboodi, Gabriele Cesa

TL;DR
This paper analyzes how weight sharing, equivariance, and locality affect the sample complexity of neural networks, providing theoretical bounds and empirical insights into their roles in generalization and expressivity.
Contribution
It derives lower and upper sample complexity bounds for a class of single hidden layer networks with these design choices, clarifying the individual impact of each.
Findings
Locality improves generalization but involves a trade-off with expressivity.
Non-equivariant weight sharing can achieve bounds similar to those of equivariant weight sharing.
Experimental results confirm theoretical predictions and reveal consistent trends.
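As a rough illustration of the three design choices discussed here, the sketch below builds a toy one hidden layer network on a 1-D cyclic signal. It is not the paper's construction: the function names (`circular_patches`, `hidden_layer`, `network`), the sizes, the ReLU activation, and the average pooling are all assumptions chosen for the example. Locality means each filter sees only `k` of the `d` inputs, weight sharing means the same filters are applied at every position, and equivariance means a cyclic shift of the input shifts the hidden features by the same amount.

```python
import numpy as np

def circular_patches(x, k):
    """Return all d cyclic patches of length k from a 1-D signal x of shape (d,)."""
    d = x.shape[0]
    idx = (np.arange(d)[:, None] + np.arange(k)[None, :]) % d
    return x[idx]  # shape (d, k)

def hidden_layer(x, filters):
    """Apply the same local filters to every patch, then ReLU. Shape (d, n_filters)."""
    k = filters.shape[1]
    return np.maximum(circular_patches(x, k) @ filters.T, 0.0)

def network(x, filters):
    """Average-pool the hidden features over positions and filters (scalar output)."""
    return hidden_layer(x, filters).mean()

rng = np.random.default_rng(0)
d, k, n_filters = 16, 3, 4          # hypothetical sizes for illustration
x = rng.normal(size=d)
filters = rng.normal(size=(n_filters, k))

shifted = np.roll(x, 5)

# Equivariance: shifting the input shifts the hidden features along the position axis.
equivariant = np.allclose(
    hidden_layer(shifted, filters),
    np.roll(hidden_layer(x, filters), 5, axis=0),
)

# Invariance: pooling over positions removes the shift.
invariant = np.isclose(network(shifted, filters), network(x, filters))

# Locality reduces the number of weights per filter from d to k.
local_params = n_filters * k
dense_params = n_filters * d
```

In this toy setup the local, shared filters use `n_filters * k` weights, while filters spanning the full input would use `n_filters * d`. The sketch only shows what the architectural terms mean and says nothing about the bounds themselves.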
Abstract
Weight sharing, equivariance, and local filters, as in convolutional neural networks, are believed to contribute to the sample efficiency of neural networks. However, it is not clear how each one of these design choices contributes to the generalization error. Through the lens of statistical learning theory, we aim to provide insight into this question by characterizing the relative impact of each choice on the sample complexity. We obtain lower and upper sample complexity bounds for a class of single hidden layer networks. For a large class of activation functions, the bounds depend merely on the norm of filters and are dimension-independent. We also provide bounds for max-pooling and an extension to multi-layer networks, both with mild dimension dependence. We provide a few takeaways from the theoretical results. It can be shown that depending on the weight-sharing mechanism, the…
Taxonomy
Topics: Graph theory and applications · Neural Networks Stability and Synchronization · Machine Learning and ELM
