Neural Spectrahedra and Semidefinite Lifts: Global Convex Optimization of Polynomial Activation Neural Networks in Fully Polynomial-Time

Burak Bartan; Mert Pilanci

arXiv:2101.02429 · cs.LG · January 11, 2021 · 5 cites

TL;DR

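This paper introduces a convex optimization approach, based on semidefinite programming, for training two-layer neural networks with second-degree polynomial activations. The global optimum can be found in time polynomial in the input dimension and sample size, and the method is reported to outperform standard backpropagation in accuracy and training speed.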

Contribution

It develops exact semidefinite programming (SDP) formulations for two-layer neural networks with polynomial activations. The formulations guarantee global optimality and extend to architectures beyond the basic fully connected case.

Findings

01. Semidefinite lifting is always exact for these networks.

02. Convex penalties can make training polynomial-time solvable.

03. Standard backpropagation often fails to find global optima.

Abstract

The training of two-layer neural networks with nonlinear activation functions is an important non-convex optimization problem with numerous applications and promising performance in layerwise deep learning. In this paper, we develop exact convex optimization formulations for two-layer neural networks with second degree polynomial activations based on semidefinite programming. Remarkably, we show that semidefinite lifting is always exact and therefore computational complexity for global optimization is polynomial in the input dimension and sample size for all input data. The developed convex formulations are proven to achieve the same global optimal solution set as their non-convex counterparts. More specifically, the globally optimal two-layer neural network with polynomial activations can be found by solving a semidefinite program (SDP) and decomposing the solution using a procedure we…
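The lifting idea in the abstract can be illustrated with a minimal sketch, assuming the simplest special case of a purely quadratic activation σ(t) = t² (the paper treats general second-degree polynomials). Under that assumption, a network f(x) = Σ_j α_j (u_j^T x)² equals x^T Z x with Z = Σ_j α_j u_j u_j^T, so the prediction is linear in the lifted matrix Z. The function names below are hypothetical, no SDP is solved, and the eigendecomposition is only a stand-in for the decomposition procedure the abstract refers to.

```python
import numpy as np

def quadratic_network(X, U, alpha):
    """f(x) = sum_j alpha_j * (u_j^T x)^2 for each row x of X.
    Columns of U are the hidden neurons (assumed sigma(t) = t^2)."""
    return ((X @ U) ** 2) @ alpha

def lift(U, alpha):
    """Lifted matrix Z = sum_j alpha_j * u_j u_j^T (symmetric, d x d)."""
    return (U * alpha) @ U.T

def lifted_predict(X, Z):
    """Same predictions written as x^T Z x, which is linear in Z."""
    return np.einsum("ni,ij,nj->n", X, Z, X)

def decompose(Z, tol=1e-10):
    """Recover neurons from a symmetric Z by eigendecomposition.
    Illustrative stand-in only, not the paper's procedure."""
    eigvals, eigvecs = np.linalg.eigh(Z)
    keep = np.abs(eigvals) > tol
    return eigvecs[:, keep], eigvals[keep]

rng = np.random.default_rng(0)
d, m, n = 4, 6, 10                      # input dim, neurons, samples
U = rng.standard_normal((d, m))
alpha = rng.standard_normal(m)
X = rng.standard_normal((n, d))

Z = lift(U, alpha)
U_rec, alpha_rec = decompose(Z)

# The lifted form and the recovered network agree with the original one.
print(np.allclose(quadratic_network(X, U, alpha), lifted_predict(X, Z)))
print(np.allclose(quadratic_network(X, U_rec, alpha_rec), lifted_predict(X, Z)))
print(U_rec.shape[1], "recovered neurons for input dimension", d)
```

In this toy setting the recovered network never needs more than d neurons, because a d × d symmetric matrix has at most d nonzero eigenvalues. Training over Z instead of (U, α) is what turns the non-convex problem into a convex one in this special case.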

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications

Methods: Weight Decay