Finite-Sum Optimization: A New Perspective for Convergence to a Global Solution
Lam M. Nguyen, Trang H. Tran, Marten van Dijk

TL;DR
This paper introduces a reformulation of finite-sum minimization for deep neural network training, together with a recursive algorithmic framework built on it, and proves convergence to an ε-global minimum under bounded style assumptions.
Contribution
It offers a new perspective on convergence analysis for non-convex DNN training: a reformulated minimization problem and an accompanying algorithmic framework, with theoretical guarantees under bounded style assumptions.
Findings
Proves convergence to an ε-global minimum using $\tilde{\mathcal{O}}(1/\varepsilon^3)$ gradient computations.
Introduces a reformulation enabling a recursive optimization framework.
Broadens understanding of conditions for global convergence in DNN training.
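For readers new to the terminology, the finite-sum problem and the notion of an ε-global minimum are commonly written as below. This is a sketch in standard notation, not the paper's own statement: the symbols $w$, $f_i$, $n$, $\hat{w}$, and $F_*$ are assumed here for illustration, and the paper's precise definitions and assumptions may differ.

```latex
\[
  \min_{w \in \mathbb{R}^d} \; F(w) = \frac{1}{n} \sum_{i=1}^{n} f_i(w),
  \qquad
  F_* = \inf_{w \in \mathbb{R}^d} F(w),
\]
\[
  \hat{w} \text{ is an } \varepsilon\text{-global minimum if } \;
  F(\hat{w}) - F_* \le \varepsilon .
\]
```

Read this way, a cost that scales like $1/\varepsilon^3$ means the leading term grows by a factor of 1000 for each tenfold reduction in ε, ignoring logarithmic factors.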
Abstract
Deep neural networks (DNNs) have shown great success in many machine learning tasks. Their training is challenging since the loss surface of the network architecture is generally non-convex, or even non-smooth. How and under what assumptions is guaranteed convergence to a global minimum possible? We propose a reformulation of the minimization problem allowing for a new recursive algorithmic framework. By using bounded style assumptions, we prove convergence to an $\varepsilon$-(global) minimum using $\tilde{\mathcal{O}}(1/\varepsilon^3)$ gradient computations. Our theoretical foundation motivates further study, implementation, and optimization of the new algorithmic framework and further investigation of its non-standard bounded style assumptions. This new direction broadens our understanding of why and under what circumstances training of a DNN converges to a global minimum.
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM
