AdaSwarm: Augmenting Gradient-Based optimizers in Deep Learning with Swarm Intelligence

Rohan Mohapatra; Snehanshu Saha; Carlos A. Coello Coello; Anwesh Bhattacharya; Soma S. Dhavala; Sriparna Saha

arXiv:2006.09875·cs.NE·May 28, 2024

TL;DR

AdaSwarm is a new gradient-free optimizer that uses swarm intelligence to approximate gradients, achieving performance comparable or superior to that of Adam in neural network training.

Contribution

The paper introduces AdaSwarm, a novel optimizer that approximates gradients using the parameters of an Exponentially weighted Momentum Particle Swarm Optimizer (EMPSO), bridging numerical methods and swarm intelligence for deep learning.

Findings

01 AdaSwarm performs on par with or better than Adam across various tasks.

02 It handles a variety of loss functions, including non-differentiable ones such as mean absolute error (MAE).

03 Mathematical proofs support the gradient approximation method.

Abstract

This paper introduces AdaSwarm, a novel gradient-free optimizer which has similar or even better performance than the Adam optimizer adopted in neural networks. To support AdaSwarm, a novel Exponentially weighted Momentum Particle Swarm Optimizer (EMPSO) is proposed. The ability of AdaSwarm to tackle optimization problems is attributed to its capability to perform good gradient approximations. We show that the gradient of any function, differentiable or not, can be approximated by using the parameters of EMPSO. This is a novel technique to simulate gradient descent (GD) which lies at the boundary between numerical methods and swarm intelligence. Mathematical proofs of the gradient approximation produced are also provided. AdaSwarm competes closely with several state-of-the-art (SOTA) optimizers. We also show that AdaSwarm is able to handle a variety of loss functions during…
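To make the idea of a momentum-based particle swarm concrete, here is a minimal sketch in plain Python. It is not the authors' EMPSO: the momentum rule (an exponential moving average of past velocities added to the usual personal-best and global-best pulls), the function names `mae` and `momentum_pso`, and every hyperparameter value are assumptions chosen for illustration. The sketch also does not reproduce the paper's gradient approximation; it only shows a swarm minimising a non-differentiable MAE loss without computing any derivative.

```python
import random


def mae(w, xs, ys):
    """Mean absolute error of the line y = w[0] * x + w[1] (kinked wherever a residual is zero)."""
    return sum(abs(w[0] * x + w[1] - y) for x, y in zip(xs, ys)) / len(xs)


def momentum_pso(loss, dim, n_particles=30, iters=200, beta=0.9, c1=0.8, c2=0.9, seed=0):
    """Minimise `loss` with a particle swarm whose velocity carries an
    exponentially weighted momentum term. All hyperparameters are illustrative."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    mom = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide best seen so far.
    pbest = [p[:] for p in pos]
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Assumed momentum rule: exponential moving average of past velocities.
                mom[i][d] = beta * mom[i][d] + (1.0 - beta) * vel[i][d]
                # Attraction towards the personal best and the global best.
                pull = c1 * r1 * (pbest[i][d] - pos[i][d]) + c2 * r2 * (gbest[d] - pos[i][d])
                vel[i][d] = mom[i][d] + pull
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * x + 1.0 for x in xs]
    best, best_val, _ = momentum_pso(lambda w: mae(w, xs, ys), dim=2)
    print(best, best_val)
```

Only loss evaluations are used, so the same loop runs unchanged on losses with kinks. The recorded best loss can never increase from one iteration to the next, because the global best is replaced only when a particle finds a strictly lower value.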

Taxonomy

Methods: Adam