Convex Formulations for Training Two-Layer ReLU Neural Networks
Karthik Prakhya, Tolga Birdal, Alp Yurtsever

TL;DR
This paper introduces a convex reformulation of training infinite-width two-layer ReLU neural networks, enabling polynomial-time approximations and demonstrating competitive test accuracy in classification tasks.
Contribution
It reformulates the training of infinite-width two-layer ReLU networks as a convex completely positive program in a finite-dimensional lifted space, and proposes a semidefinite relaxation that makes the problem computationally tractable.
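For readers unfamiliar with the terminology, the relationship between the completely positive cone and a tractable semidefinite outer approximation can be written as below. This is a standard textbook inclusion, not a statement of the authors' exact relaxation: the summary here does not say which constraints their relaxation keeps, so the doubly nonnegative cone on the right-hand side is an assumption chosen for illustration, and the symbols $\mathcal{CP}_n$, $\mathcal{S}^n_{+}$, and $\mathcal{N}^n$ are notation introduced here.

```latex
% Completely positive cone: sums of outer products of entrywise nonnegative vectors.
\mathcal{CP}_n
  = \Big\{ \textstyle\sum_{k} z_k z_k^{\top} \;:\; z_k \in \mathbb{R}^{n}_{\ge 0} \Big\}
% Every such matrix is positive semidefinite and entrywise nonnegative, so
  \;\subseteq\; \mathcal{S}^{n}_{+} \cap \mathcal{N}^{n},
% where S^n_+ is the positive semidefinite cone and N^n is the set of
% symmetric matrices with nonnegative entries.
```

Optimizing over the larger set on the right is a semidefinite program, which is why replacing the complete positivity constraint with it trades exactness for polynomial-time solvability.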
Findings
The semidefinite relaxation can be solved in polynomial time.
The relaxation achieves competitive test accuracy on various classification tasks.
The convex formulation provides insights into neural network training dynamics.
Abstract
Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex formulations have been used for verifying neural network robustness, their application to training neural networks remains less explored. In response to this challenge, we reformulate the problem of training infinite-width two-layer ReLU networks as a convex completely positive program in a finite-dimensional (lifted) space. Despite the convexity, solving this problem remains NP-hard due to the complete positivity constraint. To overcome this challenge, we introduce a semidefinite relaxation that can be solved in polynomial time. We then experimentally evaluate the tightness of this relaxation, demonstrating its competitive performance in test…
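To make the non-convexity mentioned in the abstract concrete, here is a minimal sketch of a two-layer ReLU network trained with a squared loss. The bias-free architecture and the L2 loss follow the setting described in the reviews; the function names, the toy data, and the finite width of two hidden units are illustrative assumptions, not the authors' code. Swapping the two hidden units gives a second parameter set with the same loss, yet the midpoint between the two has a strictly larger loss, which a convex objective would not allow.

```python
from typing import List, Sequence


def relu(t: float) -> float:
    """Rectified linear unit."""
    return t if t > 0.0 else 0.0


def predict(x: Sequence[float], W: List[List[float]], a: List[float]) -> float:
    """Bias-free two-layer network: f(x) = sum_j a[j] * relu(<W[j], x>)."""
    return sum(
        a_j * relu(sum(w * xi for w, xi in zip(w_j, x)))
        for w_j, a_j in zip(W, a)
    )


def squared_loss(X, y, W, a) -> float:
    """0.5 * sum of squared residuals over the dataset."""
    return 0.5 * sum((predict(x, W, a) - t) ** 2 for x, t in zip(X, y))


# Toy data (hypothetical): one scalar input with target 1.
X = [[1.0]]
y = [1.0]

# Parameter set A, and set B obtained by swapping the two hidden units.
W_a, a_a = [[1.0], [-1.0]], [1.0, 0.0]
W_b, a_b = [[-1.0], [1.0]], [0.0, 1.0]

# Midpoint of A and B in parameter space.
W_mid = [[0.5 * (p + q) for p, q in zip(r, s)] for r, s in zip(W_a, W_b)]
a_mid = [0.5 * (p + q) for p, q in zip(a_a, a_b)]

loss_a = squared_loss(X, y, W_a, a_a)
loss_b = squared_loss(X, y, W_b, a_b)
loss_mid = squared_loss(X, y, W_mid, a_mid)

# Convexity would require loss_mid <= 0.5 * (loss_a + loss_b); it fails here.
print(loss_a, loss_b, loss_mid)
```

Both A and B fit the single data point exactly, while the midpoint sets every first-layer weight to zero and predicts zero. This permutation symmetry is one simple source of non-convexity in the weight-space objective that a lifted convex formulation is meant to avoid.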
Peer Reviews
Decision: ICLR 2025 Poster
This is an interesting paper that derives an equivalence between infinite-width ReLU network training and solving a certain convex copositive program. This work contributes to the growing literature relating neural network training and convex optimization. The empirical results are also promising. Overall, I think this is an interesting work that provides valuable insight.
The empirical results seem somewhat weak to me. The authors acknowledge the similarity of their work to earlier work relating copositive programming and ReLU network training. Although the existing methods make additional assumptions on the data distribution, a numerical comparison to prior work (e.g. the approximation ratio) would help in understanding how the proposed framework compares to earlier work in practice, as would further discussion of the applications of their technique.
The paper is clear, and reads well. The supplementary material provides the code to reproduce the results. Section 2 provides a complete yet brief background on the relevant optimization topics and concepts.
The paper does not consider bias terms in the linear layers. The tightness of the proposed relaxation is evaluated only empirically. There is no convergence guarantee for the TOS rounding step. The rounding step is performed using the critical width, which is infeasible in practice. Time complexity is not considered in the evaluations. The empirical evaluation is performed only on classification tasks, using L2 loss.
- The idea of the proposed method based on convex completely positive program and semidefinite relaxation is interesting.
- The presentation of the paper is fairly clear.
- The proposed method applies only to training wide two-layer neural networks with the ReLU activation function. Extending the method to deeper networks appears non-trivial, which limits its practical value. It would be beneficial if the authors could discuss the possibility of using the proposed method to train modern deep neural networks.
- As the authors have commented, the problem (CP-NN) is NP-hard due to the complete positivity constraint, and as far as I can
