CaAdam: Improving Adam optimizer using connection aware methods

Remi Genet; Hugo Inzirillo

arXiv:2410.24216·cs.LG·November 1, 2024

TL;DR

CaAdam is a novel optimizer inspired by Adam that leverages architectural information like layer depth and connectivity to improve convergence speed and accuracy in neural network training.

Contribution

This paper introduces CaAdam, an architecture-aware optimizer that dynamically adjusts learning rates based on network structure, a feature absent in traditional optimizers.

Findings

01. Faster convergence on standard datasets
02. Higher accuracy compared to Adam
03. Effective use of structural proxies for optimization

Abstract

We introduce a new method inspired by Adam that enhances convergence speed and achieves better loss function minima. Traditional optimizers, including Adam, apply uniform or globally adjusted learning rates across neural networks without considering their architectural specifics. This architecture-agnostic approach is deeply embedded in most deep learning frameworks, where optimizers are implemented as standalone modules without direct access to the network's structural information. For instance, in popular frameworks like Keras or PyTorch, optimizers operate solely on gradients and parameters, without knowledge of layer connectivity or network topology. Our algorithm, CaAdam, explores this overlooked area by introducing connection-aware optimization through carefully designed proxies of architectural information. We propose multiple scaling methodologies that dynamically adjust…
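To make the idea of connection-aware scaling concrete, the following is a minimal sketch of an Adam step whose learning rate is multiplied by a per-layer factor. It is not the authors' implementation: the abstract is truncated before it describes the actual scaling methodologies, so the `depth_scale` function, its `gamma` parameter, and the choice to give earlier layers a slightly larger multiplier are all hypothetical stand-ins for "a proxy of architectural information". Each layer is reduced to a single scalar parameter to keep the example short.

```python
import math


def depth_scale(layer_index, num_layers, gamma=0.1):
    """Hypothetical depth proxy (an assumption, not the paper's formula).

    Returns a multiplier in [1 - gamma, 1 + gamma]: the first layer gets
    1 + gamma, the last layer gets 1 - gamma.
    """
    if num_layers <= 1:
        return 1.0
    relative_depth = layer_index / (num_layers - 1)  # 0.0 (first) to 1.0 (last)
    return 1.0 + gamma * (1.0 - 2.0 * relative_depth)


def scaled_adam_step(params, grads, state, scales,
                     lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update, with lr multiplied by scales[i] per layer."""
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction, as in plain Adam.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        # The only change from plain Adam: a per-layer multiplier on lr.
        new_params.append(p - lr * scales[i] * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


# Three "layers", one scalar parameter each, all with the same gradient.
num_layers = 3
scales = [depth_scale(i, num_layers) for i in range(num_layers)]
state = {"t": 0, "m": [0.0] * num_layers, "v": [0.0] * num_layers}
params = scaled_adam_step([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], state, scales)
```

With identical gradients, plain Adam would move all three parameters by the same amount; here the first layer moves slightly further than the last, which is the kind of structure-dependent behaviour the abstract attributes to CaAdam. Setting every entry of `scales` to 1.0 recovers ordinary Adam.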

Code & Models

Repositories

remigenet/Caadam (pytorch, Official)

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Video Surveillance and Tracking Methods

Methods: Adam · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings