SAGRAD: A Program for Neural Network Training with Simulated Annealing and the Conjugate Gradient Method
Javier Bernal, Jose Torres-Jimenez

TL;DR
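SAGRAD is a Fortran 77 program that trains neural networks for classification with batch learning. It combines simulated annealing with Møller's scaled conjugate gradient algorithm: simulated annealing (re)initializes the weights whenever the gradient method shows insufficient progress toward a possibly local minimum.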
Contribution
The paper presents SAGRAD, a neural network training program that integrates simulated annealing with Møller's scaled conjugate gradient algorithm, and discusses implementation details such as the efficient computation of gradients and of Hessian-vector products.
Findings
Effective in avoiding local minima during training
Demonstrated on two classification datasets
Combines gradient computation with stochastic reinitialization
Abstract
SAGRAD (Simulated Annealing GRADient), a Fortran 77 program for computing neural networks for classification using batch learning, is discussed. Neural network training in SAGRAD is based on a combination of simulated annealing and Møller's scaled conjugate gradient algorithm, the latter a variation of the traditional conjugate gradient method, better suited for the nonquadratic nature of neural networks. Different aspects of the implementation of the training process in SAGRAD are discussed, such as the efficient computation of gradients and multiplication of vectors by Hessian matrices that are required by Møller's algorithm; the (re)initialization of weights with simulated annealing required to (re)start Møller's algorithm the first time and each time thereafter that it shows insufficient progress in reaching a possibly local minimum; and the use of simulated annealing when…
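The restart logic described in the abstract can be illustrated with a minimal Python sketch. This is not SAGRAD's Fortran 77 code: plain gradient descent stands in for Møller's scaled conjugate gradient algorithm, a toy nonconvex function stands in for a network's training error, and "insufficient progress" is assumed to mean that the iteration budget ran out before the gradient norm fell below a tolerance. All names (`loss`, `grad`, `anneal`, `descend`, `train`) and all parameter values are hypothetical.

```python
import math
import random


def loss(w):
    """Toy nonconvex stand-in for a training error (two minima per weight)."""
    return sum(x**4 - 3 * x**2 + x for x in w)


def grad(w):
    """Exact gradient of the toy loss."""
    return [4 * x**3 - 6 * x + 1 for x in w]


def anneal(w, rng, steps=200, temp=1.0, cooling=0.98, step_size=0.5):
    """Metropolis-style simulated annealing; returns the best weights visited."""
    cur, cur_f = list(w), loss(w)
    best, best_f = list(cur), cur_f
    for _ in range(steps):
        cand = [x + rng.gauss(0.0, step_size) for x in cur]
        cand_f = loss(cand)
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if cand_f <= cur_f or rng.random() < math.exp((cur_f - cand_f) / temp):
            cur, cur_f = cand, cand_f
            if cur_f < best_f:
                best, best_f = list(cur), cur_f
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best


def descend(w, lr=0.02, max_iters=500, tol=1e-6):
    """Plain gradient descent (NOT Moller's algorithm); reports convergence."""
    w = list(w)
    for _ in range(max_iters):
        g = grad(w)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            return w, True
        w = [x - lr * gi for x, gi in zip(w, g)]
    return w, False  # budget exhausted: treated here as insufficient progress


def train(dim=2, seed=0, max_restarts=5):
    """Anneal to initialize, descend, and re-anneal whenever descent stalls."""
    rng = random.Random(seed)
    w = anneal([rng.uniform(-1.0, 1.0) for _ in range(dim)], rng)
    w, converged = descend(w)
    restarts = 0
    while not converged and restarts < max_restarts:
        w, converged = descend(anneal(w, rng))
        restarts += 1
    return w
```

Calling `train(dim=2, seed=0)` returns a weight vector at which the toy gradient is close to zero. In a real setting, `loss` and `grad` would be the network's batch error and its gradient, and `descend` would be replaced by a scaled conjugate gradient routine that also needs Hessian-vector products.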
