Geometry and Optimization of Shallow Polynomial Networks
Yossi Arjevani, Joan Bruna, Joe Kileel, Elzbieta Polak, Matthew Trager

TL;DR
This paper explores the geometry and optimization of shallow polynomial neural networks, analyzing their function space, training dynamics, and critical points, especially for quadratic activations with Gaussian data.
Contribution
The paper introduces a tensor-based framework for shallow polynomial networks, analyzes their optimization landscape, and characterizes the critical points for quadratic activations with Gaussian data.
Findings
The function space of these networks is identified with a set of symmetric tensors of bounded rank.
A teacher-metric data discriminant is introduced; it encodes the qualitative behavior of the optimization as a function of the training data distribution.
All critical points are characterized for quadratic networks with Gaussian data.
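The identification of the function space with symmetric tensors can be illustrated in the quadratic case, where the tensor is simply a symmetric matrix. The following is a minimal sketch, not the authors' code: the names `shallow_quadratic_net` and `network_to_symmetric_matrix`, the dimensions, and the random weights are all illustrative assumptions. It checks that a width-r network f(x) = sum_i a_i (w_i . x)^2 equals the quadratic form x^T M x with M = sum_i a_i w_i w_i^T, a symmetric matrix of rank at most r.

```python
import numpy as np

def shallow_quadratic_net(x, W, a):
    """Evaluate f(x) = sum_i a_i * (w_i . x)^2 for a width-r network.

    W has shape (r, n) with rows w_i; a has shape (r,). Names are hypothetical.
    """
    return float(a @ (W @ x) ** 2)

def network_to_symmetric_matrix(W, a):
    """Return M = sum_i a_i * w_i w_i^T, so that f(x) = x^T M x."""
    return W.T @ (a[:, None] * W)

rng = np.random.default_rng(0)
n, r = 5, 2  # input dimension and width (illustrative values)
W = rng.standard_normal((r, n))
a = rng.standard_normal(r)
x = rng.standard_normal(n)

M = network_to_symmetric_matrix(W, a)
f_net = shallow_quadratic_net(x, W, a)   # network evaluated directly
f_mat = float(x @ M @ x)                 # same function as a quadratic form
rank_M = np.linalg.matrix_rank(M)        # bounded by the width r
```

For higher-degree monomial activations the same construction yields a symmetric tensor of order equal to the degree, with the width again bounding the rank.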
Abstract
We study shallow neural networks with monomial activations and output dimension one. The function space for these models can be identified with a set of symmetric tensors with bounded rank. We describe general features of these networks, focusing on the relationship between width and optimization. We then consider teacher-student problems, which can be viewed as problems of low-rank tensor approximation with respect to non-standard inner products that are induced by the data distribution. In this setting, we introduce a teacher-metric data discriminant which encodes the qualitative behavior of the optimization as a function of the training data distribution. Finally, we focus on networks with quadratic activations, presenting an in-depth analysis of the optimization landscape. In particular, we present a variation of the Eckart-Young Theorem characterizing all critical points and their…
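For context on the low-rank approximation viewpoint, the classical Eckart-Young Theorem can be sketched for symmetric matrices under the standard Frobenius norm. This is an assumption-laden illustration of the classical statement only, not the variation presented in the paper, which concerns inner products induced by the data distribution. The function name `best_rank_r_symmetric` and the random "teacher" matrix are hypothetical.

```python
import numpy as np

def best_rank_r_symmetric(T, r):
    """Best rank-r approximation of a symmetric matrix T in Frobenius norm.

    Classical Eckart-Young: keep the r eigenvalues of largest magnitude.
    Returns the approximation and the predicted approximation error.
    """
    eigvals, eigvecs = np.linalg.eigh(T)
    order = np.argsort(-np.abs(eigvals))   # sort by decreasing magnitude
    keep = order[:r]
    approx = (eigvecs[:, keep] * eigvals[keep]) @ eigvecs[:, keep].T
    # Error equals the root of the sum of squares of discarded eigenvalues.
    residual = np.sqrt(np.sum(eigvals[order[r:]] ** 2))
    return approx, residual

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
teacher = (A + A.T) / 2                    # hypothetical full-rank teacher
student, predicted_err = best_rank_r_symmetric(teacher, 2)
actual_err = np.linalg.norm(teacher - student)  # Frobenius norm by default
```

Here the rank-2 `student` plays the role of a width-2 quadratic network fitted to the teacher; under a data-induced inner product the optimal solution would generally differ from this Frobenius-norm one.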
Taxonomy
Topics: Polynomial and algebraic computation
Methods: Sparse Evolutionary Training · Focus
