CaAdam: Improving Adam optimizer using connection aware methods
Remi Genet, Hugo Inzirillo

TL;DR
CaAdam is a novel optimizer inspired by Adam that leverages architectural information like layer depth and connectivity to improve convergence speed and accuracy in neural network training.
Contribution
This paper introduces CaAdam, an architecture-aware optimizer that dynamically adjusts learning rates based on network structure, a feature absent in traditional optimizers.
Findings
Faster convergence on standard datasets
Higher accuracy compared to Adam
Effective use of structural proxies for optimization
Abstract
We introduce a new method inspired by Adam that enhances convergence speed and achieves better loss function minima. Traditional optimizers, including Adam, apply uniform or globally adjusted learning rates across neural networks without considering their architectural specifics. This architecture-agnostic approach is deeply embedded in most deep learning frameworks, where optimizers are implemented as standalone modules without direct access to the network's structural information. For instance, in popular frameworks like Keras or PyTorch, optimizers operate solely on gradients and parameters, without knowledge of layer connectivity or network topology. Our algorithm, CaAdam, explores this overlooked area by introducing connection-aware optimization through carefully designed proxies of architectural information. We propose multiple scaling methodologies that dynamically adjust…
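To make the idea of connection-aware scaling concrete, here is a minimal NumPy sketch of an Adam update in which each layer's step is multiplied by a factor derived from its position in the network. The abstract is truncated before it specifies the actual scaling rules, so the linear depth rule in `depth_scale`, the `strength` parameter, and the function names are all illustrative assumptions rather than the authors' method.

```python
import numpy as np

def depth_scale(layer_index, num_layers, strength=0.1):
    """Hypothetical architectural proxy (an assumption, not the paper's formula):
    earlier layers get a slightly larger multiplier, later layers a smaller one."""
    rel_depth = layer_index / max(num_layers - 1, 1)  # 0.0 = first layer, 1.0 = last
    return 1.0 + strength * (1.0 - 2.0 * rel_depth)

def init_state(params):
    """Adam state: step counter plus first and second moment estimates per layer."""
    return {
        "t": 0,
        "m": [np.zeros_like(p) for p in params],
        "v": [np.zeros_like(p) for p in params],
    }

def scaled_adam_step(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999,
                     eps=1e-8, strength=0.1):
    """One Adam step where each layer's update is multiplied by depth_scale(...).
    With strength=0.0 this reduces to standard Adam."""
    state["t"] += 1
    t = state["t"]
    num_layers = len(params)
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Standard Adam moment updates with bias correction.
        state["m"][i] = beta1 * state["m"][i] + (1.0 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1.0 - beta2) * g * g
        m_hat = state["m"][i] / (1.0 - beta1 ** t)
        v_hat = state["v"][i] / (1.0 - beta2 ** t)
        # Structure-dependent multiplier: the only departure from plain Adam.
        s = depth_scale(i, num_layers, strength)
        new_params.append(p - lr * s * m_hat / (np.sqrt(v_hat) + eps))
    return new_params

# Usage: three "layers" of zeros, each with a unit gradient.
params = [np.zeros(4), np.zeros(4), np.zeros(4)]
grads = [np.ones(4), np.ones(4), np.ones(4)]
state = init_state(params)
params = scaled_adam_step(params, grads, state)
```

The point of the sketch is that the optimizer needs one extra input beyond gradients and parameters (here, the layer index and layer count), which is exactly the structural information that the abstract says standard optimizer interfaces do not expose.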
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
Methods: Adam · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
