Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
Tolga Ergen, Mert Pilanci

TL;DR
This paper develops convex optimization formulations for training shallow CNNs with ReLU activations, enabling globally optimal solutions in polynomial time and revealing implicit regularizers tied to architecture.
Contribution
It introduces exact convex optimization frameworks for two- and three-layer CNNs, providing polynomial-time solutions and insights into architectural biases as convex regularizers.
Findings
Two-layer CNNs can be globally optimized via convex programs with ℓ2 regularization.
Multi-layer circular CNNs are equivalent to ℓ1 regularized convex programs promoting spectral sparsity.
Extensions to pooling methods reveal implicit architectural regularizers.
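As background for the spectral-domain finding, the sketch below checks the standard convolution theorem: the discrete Fourier transform of a circular convolution equals the elementwise product of the two transforms. It is not the paper's convex program, only an illustration of why circular convolutions are naturally described in the frequency domain. The helper names `dft` and `circular_conv` and the sample vectors are illustrative assumptions.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def circular_conv(x, h):
    """Circular convolution: y[i] = sum_j h[j] * x[(i - j) mod n]."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]

# Hypothetical signal and filter, chosen only for illustration.
x = [1.0, 2.0, 0.0, -1.0]
h = [0.5, 0.0, 0.0, 0.25]

# Convolution theorem: DFT(x circ-conv h) == DFT(x) * DFT(h), elementwise.
lhs = dft(circular_conv(x, h))
rhs = [a * b for a, b in zip(dft(x), dft(h))]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

Here `max_gap` should be at the level of floating-point rounding error, showing that a circular filter acts on each frequency independently.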
Abstract
We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with a polynomial complexity with respect to the number of data samples, the number of neurons, and data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an ℓ2 norm regularized convex program. We then show that multi-layer circular CNN training problems with a single ReLU layer are equivalent to an ℓ1 regularized convex program that encourages sparsity in the spectral domain. We also extend these results to three-layer CNNs with two ReLU layers. Furthermore, we present extensions of our approach to different pooling methods,…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
