Topological properties of the set of functions generated by neural networks of fixed size

Philipp Petersen; Mones Raslan; Felix Voigtlaender

arXiv:1806.08459·math.GN·January 24, 2020·Found. Comput. Math.

TL;DR

This paper investigates the topological structure of the set of functions representable by fixed-size neural networks. It shows that this set is highly non-convex and not closed in common norms, and that the map from weights to realized functions is not inverse stable; these properties may hinder training convergence.

Contribution

The paper gives a detailed analysis of the topological limitations of the set of functions realized by fixed-size neural networks, highlighting properties that could cause training difficulties.

Findings

01

The set is highly non-convex, except possibly for a few exotic activation functions.

02

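The set is not closed in the $L^{p}$-norms, $0 < p < \infty$, for all practically used activation functions. It is also not closed in the $L^{\infty}$-norm, except for the ReLU and the parametric ReLU.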

03

The map from network weights to the realized function is not inverse stable, which may affect training stability.

Abstract

We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^{p}$-norms, $0 < p < \infty$, for all practically-used activation functions, and also not closed with respect to the $L^{\infty}$-norm for all practically-used activation functions except for the ReLU and the parametric ReLU. Finally, the function that maps a family of weights to the function computed by the associated network is not inverse stable for every practically used activation function. In other words, if $f_{1}, f_{2}$ are two functions realized by neural networks and if $f_{1}, f_{2}$ are close in the sense that $\| f_{1} - f_{2} \|_{L^{\infty}} \leq \varepsilon$ for $\varepsilon > 0$, it is,…
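The non-closedness statement for $L^{p}$-norms can be made concrete with a small numerical sketch. It is an illustrative construction under stated assumptions, not necessarily the one used in the paper: the domain $[-1, 1]$, the choice $p = 2$, and the names `f_n`, `step`, `lp_error`, `exact_lp_error`, and `sup_error` are all hypothetical. The two-neuron ReLU networks $f_n(x) = \mathrm{ReLU}(nx + 1) - \mathrm{ReLU}(nx)$ have fixed size. They converge in $L^{p}$ to a step function, which is discontinuous and therefore not realized by any ReLU network.

```python
import numpy as np


def relu(z):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(z, 0.0)


def f_n(x, n):
    """Fixed-size ReLU network (two hidden neurons); only the weight n grows."""
    return relu(n * x + 1.0) - relu(n * x)


def step(x):
    """Discontinuous limit: the indicator function of [0, infinity)."""
    return (x >= 0.0).astype(float)


def lp_error(n, p=2.0, num=200_001):
    """Riemann-sum estimate of the L^p distance between f_n and step on [-1, 1]."""
    x = np.linspace(-1.0, 1.0, num)
    integrand = np.abs(f_n(x, n) - step(x)) ** p
    return (2.0 * np.mean(integrand)) ** (1.0 / p)


def exact_lp_error(n, p=2.0):
    """Closed form for n >= 1: the integral of (n*x + 1)^p over [-1/n, 0]
    equals 1 / (n * (p + 1))."""
    return (1.0 / (n * (p + 1.0))) ** (1.0 / p)


def sup_error(n, num=200_001):
    """Grid estimate of the sup-norm distance, which does not shrink with n."""
    x = np.linspace(-1.0, 1.0, num)
    return float(np.max(np.abs(f_n(x, n) - step(x))))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, lp_error(n), exact_lp_error(n), sup_error(n))
```

In this sketch the $L^{p}$ distance decays like $n^{-1/p}$ while the weight $n$ grows without bound, and the sup-norm distance stays close to 1. The latter is consistent with the abstract's statement that the ReLU is an exception in the $L^{\infty}$ case.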
