Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization
Zhifeng Kong

TL;DR
This paper extends convergence analysis for two-layer ReLU neural networks by incorporating $\ell_1$ and $\ell_2$ regularization, proving convergence to optimal solutions under certain conditions, supported by numerical experiments.
Contribution
It provides a theoretical convergence guarantee for regularized two-layer neural networks with ReLU activation, considering both $\ell_1$ and $\ell_2$ regularization terms.
Findings
Weight vectors converge to the optimal solution with high probability.
A small regularization coefficient ensures convergence.
Numerical experiments validate the theoretical results.
Abstract
In this paper, we extended the convergence analysis of the dynamics of two-layered bias-free networks with one ReLU output. We took into consideration two popular regularization terms, the $\ell_1$ and $\ell_2$ norms of the parameter vector, and added each to the square loss function, scaled by a regularization coefficient. We proved that when this coefficient is small, the weight vector converges to the optimal solution (with respect to the new loss function) with a probability that is bounded below under random initializations in a sphere centered at the origin; the bound depends on a small value and a constant. Numerical experiments, including phase diagrams and repeated simulations, verified our theory.
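The setting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's code: the teacher vector `w_star` that generates the targets, the empirical (sample-mean) form of the square loss, the exact scaling of the penalty by `lam`, the step size `lr`, and all function names are assumptions made for illustration. The $\ell_1$ term uses the subgradient `sign(w)`.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sample_in_ball(rng, dim, radius):
    """Draw a point uniformly from the ball of given radius centered at the origin."""
    direction = rng.standard_normal(dim)
    direction /= np.linalg.norm(direction)
    # Radius scaled by U^(1/dim) makes the draw uniform over the ball's volume.
    return radius * rng.uniform() ** (1.0 / dim) * direction

def regularized_loss(w, X, y, lam, norm="l2"):
    """Square loss of a bias-free single-ReLU network plus an l1 or l2 penalty."""
    residual = relu(X @ w) - y
    data_term = 0.5 * np.mean(residual ** 2)
    penalty = np.sum(np.abs(w)) if norm == "l1" else 0.5 * np.sum(w ** 2)
    return data_term + lam * penalty

def regularized_grad(w, X, y, lam, norm="l2"):
    """(Sub)gradient of regularized_loss with respect to w."""
    pre = X @ w
    residual = relu(pre) - y
    active = (pre > 0).astype(float)  # derivative of ReLU, taken as 0 at the kink
    data_grad = X.T @ (residual * active) / X.shape[0]
    penalty_grad = np.sign(w) if norm == "l1" else w
    return data_grad + lam * penalty_grad

def gradient_descent(w0, X, y, lam, norm="l2", lr=0.05, steps=2000):
    """Plain gradient descent from the initial weight vector w0."""
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * regularized_grad(w, X, y, lam, norm)
    return w
```

To reproduce the flavor of the repeated simulations, one could draw Gaussian inputs `X`, set `y = relu(X @ w_star)`, start from `sample_in_ball(rng, dim, radius)` many times, and count how often `gradient_descent` ends near the minimizer of `regularized_loss`.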
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
