TL;DR
AGGLIO is a stage-wise graduated optimization method with provable global convergence guarantees for non-convex objectives that are only locally convex, such as those arising in learning problems with sigmoid, softplus, or SiLU activations.
Contribution
Introduces AGGLIO, a stage-wise graduated optimization technique with provable global convergence for non-convex objectives that are only locally convex, including learning problems that use popular activation functions.
Findings
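Outperforms several recently proposed optimization techniques for non-convex and locally convex objectives in convergence rate
Reaches higher accuracy at convergence than those techniques
Can be implemented with single-point as well as mini-batch SGD updates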
Abstract
This paper presents AGGLIO (Accelerated Graduated Generalized LInear-model Optimization), a stage-wise, graduated optimization technique that offers global convergence guarantees for non-convex optimization problems whose objectives are only locally convex and may fail to be even quasi-convex at a global scale. In particular, this includes learning problems that utilize popular activation functions such as sigmoid, softplus and SiLU, which yield non-convex training objectives. AGGLIO can be readily implemented using point as well as mini-batch SGD updates and offers provable convergence to the global optimum under general conditions. In experiments, AGGLIO outperformed several recently proposed optimization techniques for non-convex and locally convex objectives in terms of convergence rate as well as accuracy at convergence. AGGLIO relies on a graduation technique for generalized linear…
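The abstract is cut off before it describes the graduation technique, so the following is only a generic sketch of stage-wise graduated optimization with mini-batch SGD, not the paper's algorithm. It assumes that graduation is done by scaling the input of a sigmoid activation with a temperature `tau` that increases across stages, with each stage warm-started from the previous one. The temperature schedule, the squared loss, the helper names (`graduated_sgd`, `make_toy_data`, `mse`) and the toy ground-truth weights are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))


def mse(w, X, y, tau=1.0):
    """Mean squared error of the model sigmoid(tau * X @ w)."""
    return float(np.mean((sigmoid(tau * (X @ w)) - y) ** 2))


def graduated_sgd(X, y, temperatures=(0.25, 0.5, 1.0), epochs_per_stage=100,
                  lr=2.0, batch_size=40, seed=0):
    """Stage-wise mini-batch SGD with a warm start between stages.

    ASSUMPTION: graduation is done by scaling the activation input with a
    temperature tau; this schedule is illustrative, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for tau in temperatures:  # one stage per temperature, ending at tau = 1
        for _ in range(epochs_per_stage):
            order = rng.permutation(n)
            for start in range(0, n, batch_size):
                idx = order[start:start + batch_size]
                p = sigmoid(tau * (X[idx] @ w))
                # Gradient of 0.5 * mean((p - y)^2) with respect to w.
                residual = (p - y[idx]) * p * (1.0 - p) * tau
                w -= lr * (X[idx].T @ residual) / len(idx)
    return w


def make_toy_data(n=400, seed=0):
    """Noiseless synthetic data from a hypothetical ground-truth model."""
    rng = np.random.default_rng(seed)
    w_true = np.array([1.0, -0.5, 0.5, 0.0, 0.25])
    X = rng.standard_normal((n, w_true.size))
    return X, sigmoid(X @ w_true), w_true


if __name__ == "__main__":
    X, y, w_true = make_toy_data()
    w_hat = graduated_sgd(X, y)
    print("final MSE:", mse(w_hat, X, y))
    print("max abs error in w:", np.max(np.abs(w_hat - w_true)))
```

At small `tau` the sigmoid is close to linear over the range of the data, so the early stages work on a better-behaved objective, and the last stage optimizes the original one. On this small noiseless problem plain SGD would likely succeed as well. The sketch is meant to show the stage-wise structure only, not to reproduce the paper's guarantees or experiments.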
Taxonomy
Methods: Sigmoid Linear Unit · Stochastic Gradient Descent
