Neural Spectrahedra and Semidefinite Lifts: Global Convex Optimization of Polynomial Activation Neural Networks in Fully Polynomial-Time
Burak Bartan, Mert Pilanci

TL;DR
This paper introduces a convex optimization approach, based on semidefinite programming, for training two-layer neural networks with second-degree polynomial activations. The approach finds a global optimum in time polynomial in the input dimension and sample size, and is reported to outperform standard backpropagation in accuracy and speed.
Contribution
The paper develops exact semidefinite programming formulations for two-layer neural networks with second-degree polynomial activations, guaranteeing global optimality, and extends them to other architectures.
Findings
Semidefinite lifting is always exact for these networks.
Convex penalties can make training polynomial-time solvable.
Standard backpropagation often fails to find global optima.
Abstract
The training of two-layer neural networks with nonlinear activation functions is an important non-convex optimization problem with numerous applications and promising performance in layerwise deep learning. In this paper, we develop exact convex optimization formulations for two-layer neural networks with second degree polynomial activations based on semidefinite programming. Remarkably, we show that semidefinite lifting is always exact and therefore computational complexity for global optimization is polynomial in the input dimension and sample size for all input data. The developed convex formulations are proven to achieve the same global optimal solution set as their non-convex counterparts. More specifically, the globally optimal two-layer neural network with polynomial activations can be found by solving a semidefinite program (SDP) and decomposing the solution using a procedure we…
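The lifting idea mentioned in the abstract can be illustrated with a minimal sketch, assuming the simplest case of a pure quadratic activation sigma(t) = t^2 (the paper treats general second-degree polynomials). In that case the network output is a quadratic form in a single symmetric matrix Z, which is what makes a convex reformulation over Z possible. The function names below are hypothetical, and the eigendecomposition used to recover neurons is a standard stand-in chosen for this sketch, not the decomposition procedure the paper describes.

```python
import numpy as np

def quad_net(X, U, alpha):
    """Two-layer net with sigma(t) = t**2: f(x) = sum_j alpha_j * (u_j^T x)**2."""
    return ((X @ U) ** 2) @ alpha

def lift(U, alpha):
    """Lifted matrix Z = sum_j alpha_j u_j u_j^T, so that f(x) = x^T Z x."""
    return (U * alpha) @ U.T

def neurons_from_lift(Z):
    """Assumed stand-in: read an equivalent net off the eigendecomposition of Z."""
    eigvals, eigvecs = np.linalg.eigh(Z)
    # Columns of eigvecs act as hidden weights, eigvals as output weights.
    return eigvecs, eigvals

rng = np.random.default_rng(0)
d, m, n = 4, 7, 5  # input dimension, hidden neurons, samples (illustrative sizes)
X = rng.standard_normal((n, d))
U = rng.standard_normal((d, m))
alpha = rng.standard_normal(m)

Z = lift(U, alpha)
f_net = quad_net(X, U, alpha)                # non-convex parameterization
f_lift = np.einsum("ni,ij,nj->n", X, Z, X)   # same outputs, linear in Z
U_rec, alpha_rec = neurons_from_lift(Z)
f_rec = quad_net(X, U_rec, alpha_rec)        # equivalent net with at most d neurons

print(np.allclose(f_net, f_lift), np.allclose(f_net, f_rec))
```

Because the outputs are linear in Z, a convex loss on the outputs stays convex in Z. The sketch only checks this equivalence numerically; it does not solve the SDP itself.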
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications
Methods: Weight Decay
