On the Geometry and Optimization of Polynomial Convolutional Networks

Vahid Shahverdi; Giovanni Luca Marchetti; Kathlén Kohn

arXiv:2410.00722 · cs.LG · March 4, 2025


TL;DR

This paper studies the geometry and optimization landscape of convolutional networks with monomial activations, using tools from algebraic geometry to characterize their expressivity and to count the critical points of a regression loss.

Contribution

It provides a geometric analysis of polynomial CNNs, covering the dimension, degree, and singularities of their neuromanifold, and derives an explicit formula for the number of critical points of a regression loss on a generic large dataset.

Findings

01

Parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters.

02

Computed the dimension and degree of the neuromanifold, which measure the expressivity of the model, and described its singularities.

03

Derived an explicit formula for the number of critical points of a regression loss on a generic large dataset.

Abstract

We study convolutional neural networks with monomial activation functions. Specifically, we prove that their parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters. By leveraging tools from algebraic geometry, we explore the geometric properties of the image of this map in function space, typically referred to as the neuromanifold. In particular, we compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model, and describe its singularities. Moreover, for a generic large dataset, we derive an explicit formula that quantifies the number of critical points arising in the optimization of a regression loss.
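The phrase "up to rescaling the filters" can be checked numerically on a toy model. The sketch below is not taken from the paper: it assumes a two-layer 1D network with stride 1, no biases, and the monomial activation t -> t**r applied after every layer except the last. The helper names `conv1d` and `poly_cnn` are hypothetical. Scaling the first filter by lam and the second by lam**(-r) should leave the network function unchanged.

```python
import numpy as np

def conv1d(w, x, stride=1):
    """Valid 1D cross-correlation of input x with filter w (assumed layer type)."""
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return np.array([np.dot(w, x[i * stride : i * stride + k]) for i in range(n_out)])

def poly_cnn(filters, x, r=2):
    """Apply each filter in turn, with the monomial activation t -> t**r
    after every layer except the last (an assumption of this sketch)."""
    h = np.asarray(x, dtype=float)
    for i, w in enumerate(filters):
        h = conv1d(np.asarray(w, dtype=float), h)
        if i < len(filters) - 1:
            h = h ** r
    return h

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=3), rng.normal(size=2)  # arbitrary illustrative filters
x = rng.normal(size=8)                           # arbitrary illustrative input
lam, r = 1.7, 2

out = poly_cnn([w1, w2], x, r=r)
# Rescale: first filter by lam, second by lam**(-r). The activation turns the
# factor lam into lam**r, which the second filter's scaling cancels.
out_rescaled = poly_cnn([lam * w1, w2 / lam**r], x, r=r)

print(np.allclose(out, out_rescaled))
```

The two outputs agree up to floating-point error, so distinct parameter choices related by such a rescaling define the same function. This is the ambiguity the abstract sets aside when it states that the parameterization map is an isomorphism almost everywhere.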

Taxonomy

Topics: Graph theory and applications · Polynomial and algebraic computation · Matrix Theory and Algorithms