Implementation of a modified Nesterov's Accelerated quasi-Newton Method on Tensorflow

S. Indrapriyadarsini; Shahrzad Mahboubi; Hiroshi Ninomiya; Hideki Asai

arXiv:1910.09158·cs.LG·October 16, 2020

TL;DR

This paper implements a modified Nesterov's Accelerated Quasi-Newton method (mNAQ) in TensorFlow and reports faster, more robust convergence on non-convex optimization problems than first-order optimizers (AdaGrad, RMSProp, Adam) and the conventional quasi-Newton method.

Contribution

The paper introduces two modifications to the NAQ algorithm: one to ensure global convergence and one to eliminate the line search. Both are aimed at improving its performance in non-convex optimization.

Findings

01 mNAQ converges faster than first-order optimizers (AdaGrad, RMSProp, Adam).

02 mNAQ converges faster than the conventional quasi-Newton method.

03 The algorithm is robust on the benchmark problems.

Abstract

Recent studies incorporate Nesterov's accelerated gradient method to accelerate gradient-based training. Nesterov's Accelerated Quasi-Newton (NAQ) method has been shown to drastically improve convergence speed compared to the conventional quasi-Newton method. This paper implements NAQ for non-convex optimization on Tensorflow. Two modifications to the original NAQ algorithm are proposed to ensure global convergence and eliminate line search. The performance of the proposed algorithm, mNAQ, is evaluated on standard non-convex function approximation benchmark problems and microwave circuit modelling problems. The results show that the improved algorithm converges better and faster than first-order optimizers such as AdaGrad, RMSProp, and Adam, and second-order methods such as the quasi-Newton method.
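To make the idea concrete, the following is a minimal sketch of one NAQ-style iteration as the method is commonly described: the gradient is evaluated at the Nesterov lookahead point, scaled by a BFGS inverse-Hessian approximation, and combined with a momentum term. It is an illustration under stated assumptions, not the paper's TensorFlow implementation. The two mNAQ modifications (the global-convergence term and the step-size rule that replaces line search) are not reproduced here; a fixed step size `alpha` stands in for them. The function names, the default values of `mu` and `alpha`, and the curvature threshold `eps` are all hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def mat_vec(M: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in M]


def bfgs_inverse_update(H: Matrix, p: Vector, q: Vector,
                        eps: float = 1e-10) -> Matrix:
    """Standard BFGS update of a symmetric inverse-Hessian approximation H.

    The update is skipped (H returned unchanged) when the curvature p.q
    is not safely positive; eps is an assumed threshold.
    """
    pq = dot(p, q)
    if pq <= eps:
        return H
    rho = 1.0 / pq
    Hq = mat_vec(H, q)
    qHq = dot(q, Hq)
    n = len(p)
    # Expanded form of (I - rho p q^T) H (I - rho q p^T) + rho p p^T
    return [[H[i][j]
             - rho * (p[i] * Hq[j] + Hq[i] * p[j])
             + rho * (1.0 + rho * qHq) * p[i] * p[j]
             for j in range(n)]
            for i in range(n)]


def naq_step(grad: Callable[[Vector], Vector], w: Vector, v: Vector,
             H: Matrix, mu: float = 0.8, alpha: float = 1.0
             ) -> Tuple[Vector, Vector, Matrix]:
    """One NAQ-style step with a fixed step size (assumption, see lead-in)."""
    lookahead = [wi + mu * vi for wi, vi in zip(w, v)]       # w + mu * v
    g_look = grad(lookahead)                                 # gradient at lookahead
    direction = [-d for d in mat_vec(H, g_look)]             # -H * gradient
    v_new = [mu * vi + alpha * di for vi, di in zip(v, direction)]
    w_new = [wi + vi for wi, vi in zip(w, v_new)]
    p = [a - b for a, b in zip(w_new, lookahead)]            # parameter change
    q = [a - b for a, b in zip(grad(w_new), g_look)]         # gradient change
    return w_new, v_new, bfgs_inverse_update(H, p, q)


def run_naq(grad: Callable[[Vector], Vector], w0: Vector,
            steps: int = 50, mu: float = 0.8, alpha: float = 1.0) -> Vector:
    """Iterate naq_step from w0, starting with H = I and zero velocity."""
    n = len(w0)
    w, v = list(w0), [0.0] * n
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        w, v, H = naq_step(grad, w, v, H, mu=mu, alpha=alpha)
    return w
```

With `mu = 0` the step reduces to a plain quasi-Newton update with a fixed step size. Whether the iteration converges for a given problem depends on the choice of `mu` and `alpha`, which is the sensitivity the paper's step-size modification is meant to address.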

Taxonomy

Methods: Adam · AdaGrad · RMSProp