Multiplicative update rules for accelerating deep learning training and increasing robustness

Manos Kirtas; Nikolaos Passalis; Anastasios Tefas

arXiv:2307.07189·cs.LG·August 22, 2024

Open Access

TL;DR

This paper introduces a novel optimization framework using multiplicative update rules to accelerate deep learning training and enhance model robustness across various tasks and architectures.

Contribution

It proposes a new multiplicative update rule and a hybrid update method, extending optimization techniques for faster and more robust deep learning training.
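The summary above does not state the proposed rule itself. As a minimal sketch of the general idea only, the following contrasts a standard additive gradient step with a generic sign-preserving multiplicative step. The exponential form, the function names `additive_step` and `multiplicative_step`, and the toy quadratic objective are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def additive_step(w, grad, lr):
    """Standard gradient descent: the step does not depend on |w|."""
    return w - lr * grad


def multiplicative_step(w, grad, lr):
    """Generic multiplicative step (illustrative, not the paper's rule).

    Each weight is scaled by exp(-lr * sign(w) * grad), so the change is
    proportional to the weight's magnitude and its sign never flips.
    """
    return w * np.exp(-lr * np.sign(w) * grad)


# Toy objective f(w) = 0.5 * ||w - target||^2, whose gradient is (w - target).
target = np.array([2.0, -0.5])
w_init = np.array([1.0, -1.0])
w_add = w_init.copy()
w_mul = w_init.copy()

for _ in range(200):
    w_add = additive_step(w_add, w_add - target, lr=0.1)
    w_mul = multiplicative_step(w_mul, w_mul - target, lr=0.1)

print("additive:      ", w_add)
print("multiplicative:", w_mul)
```

In this sketch the multiplicative step can only reach targets that share the sign of the initial weight, and a weight initialized at exactly zero would never move. Limitations of this kind are one plausible motivation for combining multiplicative and additive steps, although the hybrid method mentioned above may work differently.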

Findings

01. Accelerates training across multiple deep learning tasks.

02. Produces more robust models compared to traditional additive updates.

03. Effective with various optimization algorithms and neural network architectures.

Abstract

Even today, when Deep Learning (DL) has achieved state-of-the-art performance in a wide range of research domains, accelerating training and building robust DL models remain challenging tasks. To this end, generations of researchers have sought to develop robust methods for training DL architectures that are less sensitive to weight distributions, model architectures, and loss landscapes. However, such methods are limited to adaptive learning rate optimizers, initialization schemes, and gradient clipping, without investigating the fundamental parameter update rule. Although multiplicative updates contributed significantly to the early development of machine learning and hold strong theoretical claims, to the best of our knowledge this is the first work that investigates them in the context of DL training acceleration and robustness. In this work, we propose an optimization…

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM