Learning on a Razor's Edge: Identifiability and Singularity of Polynomial Neural Networks
Vahid Shahverdi, Giovanni Luca Marchetti, Kathlén Kohn

TL;DR
This paper investigates the geometry of function spaces parametrized by neural networks (neuromanifolds), focusing on identifiability and singularities for MLPs and CNNs with polynomial activations. It shows that the two architectures differ both in how parameters map to functions and in how singularities relate to critical points of the loss.
Contribution
It provides an algebro-geometric analysis of neuromanifolds for polynomial neural networks, computing their dimension and characterizing singularities completely for CNNs and partially for MLPs.
Findings
For almost all functions in an MLP neuromanifold, only finitely many parameter choices yield that function
CNN parametrization is generically one-to-one
Singularities arise from sparse subnetworks
Abstract
We study function spaces parametrized by neural networks, referred to as neuromanifolds. Specifically, we focus on deep Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) with an activation function that is a sufficiently generic polynomial. First, we address the identifiability problem, showing that, for almost all functions in the neuromanifold of an MLP, there exist only finitely many parameter choices yielding that function. For CNNs, the parametrization is generically one-to-one. As a consequence, we compute the dimension of the neuromanifold. Second, we describe singular points of neuromanifolds. We characterize singularities completely for CNNs, and partially for MLPs. In both cases, they arise from sparse subnetworks. For MLPs, we prove that these singularities often correspond to critical points of the mean-squared error loss, which does not hold for CNNs.…
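The finiteness statement for MLPs can be illustrated numerically. The sketch below is not taken from the paper: the activation `sigma`, the layer sizes, and the bias-free two-layer architecture are all assumptions chosen for illustration. It checks one well-known source of non-uniqueness, permuting hidden neurons, and shows that rescaling a hidden neuron does not preserve the function for this non-homogeneous polynomial activation. It demonstrates that several distinct parameters can give the same function; it does not prove that these are the only ones.

```python
import itertools
import numpy as np

def sigma(x):
    # Hypothetical non-homogeneous polynomial activation (an assumption, not the paper's choice).
    return x**3 + 0.5 * x**2 + x

def mlp(W1, W2, x):
    # Two-layer polynomial MLP without biases (assumed architecture): x -> W2 @ sigma(W1 @ x).
    return W2 @ sigma(W1 @ x)

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 3, 4, 2
W1 = rng.normal(size=(d_hidden, d_in))
W2 = rng.normal(size=(d_out, d_hidden))
X = rng.normal(size=(d_in, 50))  # test inputs stored as columns

reference = mlp(W1, W2, X)

# Count hidden-neuron permutations that leave the network function unchanged.
# Permuting rows of W1 together with the matching columns of W2 is a parameter symmetry.
n_equivalent = 0
for perm in itertools.permutations(range(d_hidden)):
    p = list(perm)
    if np.allclose(mlp(W1[p, :], W2[:, p], X), reference):
        n_equivalent += 1

# Rescale the first hidden neuron by 2 and compensate in the output layer.
# This would preserve the function for a homogeneous degree-1 activation, but not for sigma above.
scale = np.ones(d_hidden)
scale[0] = 2.0
rescaled = mlp(W1 * scale[:, None], W2 / scale[None, :], X)
scaling_preserves_function = np.allclose(rescaled, reference)

print("permutations giving the same function:", n_equivalent)
print("rescaling preserves the function:", scaling_preserves_function)
```

Every permutation of the four hidden neurons yields the same function, so the set of parameters mapping to this function contains at least 4! = 24 points, while the continuous rescaling family does not survive the non-homogeneous activation. This is consistent with a finite, rather than one-to-one, parametrization.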
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Model Reduction and Neural Networks
