Efficient Global Optimization of Two-Layer ReLU Networks: Quadratic-Time Algorithms and Adversarial Training
Yatong Bai, Tanmay Gautam, Somayeh Sojoudi

TL;DR
This paper introduces two quadratic-time algorithms that train two-layer ReLU neural networks with global convergence guarantees, working around the non-convexity of the standard training problem, and extends the approach to robust adversarial training.
Contribution
It develops efficient convex optimization algorithms for globally training two-layer ReLU networks, including adversarially robust models, with theoretical convergence guarantees.
Findings
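The ADMM-based algorithm achieves linear global convergence, with quadratic per-iteration time complexity when solving the approximate formulation.
The first several iterations often already yield high prediction accuracy.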
Robust convex formulations enable adversarially resilient training.
Abstract
The non-convexity of the artificial neural network (ANN) training landscape brings inherent optimization difficulties. While the traditional back-propagation stochastic gradient descent (SGD) algorithm and its variants are effective in certain cases, they can become stuck at spurious local minima and are sensitive to initializations and hyperparameters. Recent work has shown that the training of an ANN with ReLU activations can be reformulated as a convex program, bringing hope to globally optimizing interpretable ANNs. However, naively solving the convex training formulation has an exponential complexity, and even an approximation heuristic requires cubic time. In this work, we characterize the quality of this approximation and develop two efficient algorithms that train ANNs with global convergence guarantees. The first algorithm is based on the alternating direction method of…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM
