On the Sample Complexity of Two-Layer Networks: Lipschitz vs. Element-Wise Lipschitz Activation

Amit Daniely; Elad Granot

arXiv:2211.09634·cs.LG·January 23, 2024

TL;DR

This paper analyzes the sample complexity of bounded two-layer neural networks with Lipschitz activation functions. It shows that element-wise activations give a sample complexity that depends only logarithmically on network width, whereas certain non-element-wise Lipschitz activations lead to a linear dependence on width.

Contribution

The paper establishes that the element-wise property of the activation function is essential for a sample complexity that depends only logarithmically on network width.

Findings

01 Logarithmic sample complexity for element-wise activations

02 Linear width dependency for certain non-element-wise activations

03 Development of new techniques using Approximate Description Length (ADL)

Abstract

We investigate the sample complexity of bounded two-layer neural networks using different activation functions. In particular, we consider the class $H = \{ x \mapsto \langle v, \sigma \circ W x + b \rangle : b \in \mathbb{R}^{d}, W \in \mathbb{R}^{T \times d}, v \in \mathbb{R}^{T} \}$, where the spectral norms of $W$ and $v$ are bounded by $O(1)$, the Frobenius distance of $W$ from its initialization is bounded by $R > 0$, and $\sigma$ is a Lipschitz activation function. We prove that if $\sigma$ is element-wise, then the sample complexity of $H$ has only a logarithmic dependency on width, and that this complexity is tight up to logarithmic factors. We further show that the element-wise property of $\sigma$ is essential for a logarithmic dependency bound in width, in the sense…
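To make the distinction in the abstract concrete, here is a minimal pure-Python sketch of one predictor of the form $x \mapsto \langle v, \sigma(Wx + b) \rangle$, evaluated once with an element-wise activation and once with a non-element-wise one. Several things here are assumptions for illustration and do not come from the paper: the reading of the class as $\sigma$ applied to $Wx + b$, a bias with one entry per hidden unit (length $T$) so that the sum is defined, the function names, the toy weights, ReLU as the element-wise activation, and projection onto the Euclidean unit ball as the non-element-wise one. The projection is 1-Lipschitz but mixes coordinates; it is not claimed to be the construction used in the paper's lower bound.

```python
import math

def matvec(W, x):
    # Plain matrix-vector product; W is a list of T rows, each of length d.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def relu_elementwise(z):
    # Element-wise activation: each output coordinate depends only on
    # the matching input coordinate.
    return [max(0.0, z_i) for z_i in z]

def project_to_unit_ball(z):
    # Non-element-wise, 1-Lipschitz activation (illustrative assumption):
    # every output coordinate depends on the norm of the whole vector.
    norm = math.sqrt(sum(z_i * z_i for z_i in z))
    return list(z) if norm <= 1.0 else [z_i / norm for z_i in z]

def two_layer(x, W, b, v, sigma):
    # h(x) = <v, sigma(W x + b)>, with b assumed to have length T.
    pre = [p + b_i for p, b_i in zip(matvec(W, x), b)]
    hidden = sigma(pre)
    return sum(v_i * h_i for v_i, h_i in zip(v, hidden))

# Toy instance with width T = 3 and input dimension d = 2 (hypothetical values).
W = [[1.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
v = [1.0, 1.0, 1.0]
x = [3.0, 4.0]

out_relu = two_layer(x, W, b, v, relu_elementwise)
out_proj = two_layer(x, W, b, v, project_to_unit_ball)
print(out_relu, out_proj)
```

Both activations are Lipschitz, so the two predictors differ only in whether the hidden layer treats coordinates independently, which is the property the abstract identifies as decisive for the width dependency.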

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Machine Learning in Materials Science