Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers

Yineng Chen; Zuchao Li; Lefei Zhang; Bo Du; Hai Zhao

arXiv:2307.00631 · cs.LG · July 4, 2023

TL;DR

This paper introduces Admeta, a novel optimizer combining a double exponential moving average with a dynamic lookahead strategy, demonstrating improved convergence and performance over existing optimizers in deep learning tasks.

Contribution

The paper proposes a new optimizer framework, Admeta, integrating a DEMA-based backward-looking component and a dynamic lookahead forward-looking strategy, with implementations based on RAdam and SGDM.
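
For readers unfamiliar with DEMA, the following is a minimal Python sketch of the textbook double exponential moving average used in technical analysis, DEMA = 2 * EMA(x) - EMA(EMA(x)). It is not Admeta's DEMA variant, which this page does not spell out. The function names `ema` and `dema`, the smoothing factor 0.9, and the ramp input are illustrative assumptions.

```python
def ema(values, beta):
    """Exponential moving average: m_t = beta * m_{t-1} + (1 - beta) * x_t.

    The running value is seeded with the first observation (an assumption
    made here for simplicity; other initialisations are common).
    """
    out = []
    m = values[0]
    for x in values:
        m = beta * m + (1.0 - beta) * x
        out.append(m)
    return out


def dema(values, beta):
    """Textbook DEMA indicator: 2 * EMA(x) - EMA(EMA(x)).

    NOTE: this is the classic finance formula, not the variant
    proposed in the paper.
    """
    first = ema(values, beta)
    second = ema(first, beta)
    return [2.0 * a - b for a, b in zip(first, second)]


# A linear ramp makes the lag of each average easy to compare.
ramp = [float(t) for t in range(50)]
ema_lag = ramp[-1] - ema(ramp, 0.9)[-1]
dema_lag = ramp[-1] - dema(ramp, 0.9)[-1]
```

On a steadily rising signal the plain EMA settles at a lag of roughly beta / (1 - beta) steps, while the DEMA correction cancels most of that lag, which is the property that makes it attractive as a backward-looking momentum estimate.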

Findings

01 Admeta outperforms baseline optimizers in diverse tasks.

02 Theoretical proof confirms convergence of Admeta algorithms.

03 Experimental results show advantages over recent competitive optimizers.

Abstract

The optimizer is an essential component of successful deep learning: it guides the neural network to update its parameters according to the loss on the training set. SGD and Adam are two classical and effective optimizers, on which researchers have built many variants, such as SGDM and RAdam. In this paper, we combine the backward-looking and forward-looking aspects of the optimizer algorithm and propose a novel Admeta (A Double exponential Moving averagE To Adaptive and non-adaptive momentum) optimizer framework. For the backward-looking part, we propose a DEMA variant scheme, motivated by a metric in the stock market, to replace the common exponential moving average scheme. In the forward-looking part, we present a dynamic lookahead strategy which asymptotically approaches a set value,…
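
The forward-looking idea can be illustrated with a Lookahead-style loop: fast weights take k inner steps, then slow weights move a fraction eta toward them and the fast weights restart from the slow ones. The Python sketch below wraps SGD with momentum and lets eta rise toward a set value. The schedule `eta_schedule`, its constants, and the toy quadratic objective are hypothetical; the abstract is cut off before describing the paper's actual rule, so this should not be read as Admeta's update.

```python
def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def sgdm_step(w, v, g, lr=0.1, momentum=0.5):
    """One SGD-with-momentum step on a scalar parameter."""
    v = momentum * v + g
    return w - lr * v, v


def eta_schedule(sync, eta_target=0.5, decay=0.8):
    """HYPOTHETICAL schedule: eta rises toward eta_target as syncs accumulate.

    Chosen only to illustrate "asymptotically approaches a set value";
    it is not taken from the paper.
    """
    return eta_target * (1.0 - decay ** sync)


def lookahead_sgdm(w0, steps=200, k=5):
    """Lookahead-style wrapper: k fast steps, then interpolate slow weights."""
    slow = fast = w0
    v = 0.0
    syncs = 0
    for t in range(1, steps + 1):
        fast, v = sgdm_step(fast, v, grad(fast))
        if t % k == 0:
            syncs += 1
            eta = eta_schedule(syncs)
            slow = slow + eta * (fast - slow)  # move slow weights toward fast
            fast = slow                        # restart fast weights from slow
    return slow


w_final = lookahead_sgdm(0.0)
```

With these toy settings the slow weights approach the minimiser at w = 3. A small eta early on keeps the slow weights conservative, and the larger eta later lets them follow the fast weights more closely.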

Code & Models

Repositories

chernyn/admeta-optimizer (jax, Official)

Videos

Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers · slideslive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stock Market Forecasting Methods · Metaheuristic Optimization Algorithms Research

Methods: Adam · Balanced Selection · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent · RAdam