ICLR Reproducibility Challenge Report (Padam: Closing The Generalization Gap Of Adaptive Gradient Methods in Training Deep Neural Networks)

Harshal Mittal; Kartikey Pandey; Yash Kant

arXiv:1901.09517 · cs.LG · January 29, 2019 · 1 citation

TL;DR

This paper reproduces and reviews the PADAM optimizer, which aims to improve the generalization of adaptive gradient methods by bridging the gap with SGD, and discusses future research directions.

Contribution

It reproduces the PADAM optimizer, evaluates its performance, and provides insights and future directions for improving adaptive gradient methods.

Findings

01. PADAM improves generalization over traditional adaptive methods.

02. The partially adaptive parameter p influences optimizer performance.

03. Reproduction confirms PADAM's effectiveness in training deep neural networks.

Abstract

This work is part of the ICLR Reproducibility Challenge 2019. We try to reproduce the results in the conference submission PADAM: Closing The Generalization Gap of Adaptive Gradient Methods In Training Deep Neural Networks. Adaptive gradient methods proposed in the past demonstrate worse generalization performance than stochastic gradient descent (SGD) with momentum. The authors try to address this problem by designing a new optimization algorithm that bridges the gap between adaptive gradient algorithms and SGD with momentum. This method introduces a new tunable hyperparameter, the partially adaptive parameter p, which takes values in [0, 0.5]. We build the proposed optimizer and use it to mirror the experiments performed by the authors. We review and comment on the empirical analysis performed by the authors. Finally, we also propose a future direction for…
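The role of the partially adaptive parameter p can be illustrated with a small sketch. It assumes an AMSGrad-style update in which the momentum term is divided by the running maximum of the second-moment estimate raised to the power p, so that p = 0 behaves like SGD with momentum and p = 0.5 behaves like a fully adaptive method. The names `PadamState`, `init_state`, and `padam_step`, the default hyperparameter values, and the stabilizing constant `eps` are illustrative assumptions and are not taken from the report or its repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PadamState:
    """Per-parameter running statistics (hypothetical container)."""
    m: List[float]      # first-moment (momentum) estimate
    v: List[float]      # second-moment estimate
    v_hat: List[float]  # running maximum of v, as in AMSGrad


def init_state(n: int) -> PadamState:
    """Create zero-initialized statistics for n parameters."""
    return PadamState([0.0] * n, [0.0] * n, [0.0] * n)


def padam_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999,
               p=0.125, eps=1e-8):
    """Apply one partially adaptive update and return the new parameters.

    Assumed update rule: theta <- theta - lr * m / (v_hat + eps) ** p.
    `eps` is an assumption added here for numerical stability.
    """
    if not 0.0 <= p <= 0.5:
        raise ValueError("p must lie in [0, 0.5]")
    new_theta = []
    for i, (w, g) in enumerate(zip(theta, grad)):
        state.m[i] = beta1 * state.m[i] + (1 - beta1) * g
        state.v[i] = beta2 * state.v[i] + (1 - beta2) * g * g
        state.v_hat[i] = max(state.v_hat[i], state.v[i])
        # p = 0 makes the denominator 1, leaving a pure momentum step.
        new_theta.append(w - lr * state.m[i] / (state.v_hat[i] + eps) ** p)
    return new_theta


if __name__ == "__main__":
    # Toy problem: minimize 0.5 * ||theta||^2, whose gradient is theta itself.
    theta = [1.0, -2.0]
    state = init_state(len(theta))
    for _ in range(100):
        theta = padam_step(theta, list(theta), state, lr=0.05, p=0.125)
    print(theta)
```

Sweeping `p` between 0 and 0.5 on the same toy problem is one way to see how the step size moves from a uniform scale toward a per-coordinate scale.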

Code & Models

Repositories

yashkant/Padam-Tensorflow (TensorFlow, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification

Methods: Stochastic Gradient Descent