Tune My Adam, Please!

Theodoros Athanasiadis; Steven Adriaensen; Samuel Müller; Frank Hutter

arXiv:2508.19733·cs.LG·August 29, 2025

TL;DR

This paper introduces Adam-PFN, a pre-trained surrogate model, and CDF-augment, a new learning curve augmentation method. Together they improve tuning of the Adam optimizer's hyperparameters with freeze-thaw Bayesian Optimization, including on out-of-distribution tasks.

Contribution

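The paper presents Adam-PFN, a surrogate model pre-trained on learning curves from TaskSet, and CDF-augment, a data augmentation technique that increases the number of available training curves. Together they improve learning curve extrapolation and speed up hyperparameter tuning for Adam.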

Findings

01

Enhanced learning curve extrapolation accuracy.

02

Accelerated hyperparameter optimization process.

03

Strong performance on out-of-distribution tasks.

Abstract

The Adam optimizer remains one of the most widely used optimizers in deep learning, and effectively tuning its hyperparameters is key to optimizing performance. However, tuning can be tedious and costly. Freeze-thaw Bayesian Optimization (BO) is a recent promising approach for low-budget hyperparameter tuning, but is limited by generic surrogates without prior knowledge of how hyperparameters affect learning. We propose Adam-PFN, a new surrogate model for freeze-thaw BO of Adam's hyperparameters, pre-trained on learning curves from TaskSet, together with a new learning curve augmentation method, CDF-augment, which artificially increases the number of available training examples. Our approach both improves learning curve extrapolation and accelerates hyperparameter optimization on TaskSet evaluation tasks, with strong performance on out-of-distribution (OOD) tasks.
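To make the freeze-thaw idea concrete, here is a minimal Python sketch of the control loop: training budget is spent one step at a time, and after every step a surrogate decides which configuration to resume (thaw) while the others stay paused (frozen). This is an illustration only, not the paper's method. The function names `toy_loss`, `predict_next`, and `freeze_thaw` are hypothetical, the learning curves are synthetic, only the learning rate is varied, and the surrogate is a naive one-step extrapolation standing in for where a learned model such as Adam-PFN would go.

```python
import math


def toy_loss(lr, step):
    """Hypothetical synthetic learning curve (an assumption, not real training)."""
    floor = 0.1 + 0.2 * abs(math.log10(lr) + 3.0)  # lowest floor at lr = 1e-3
    return floor + math.exp(-50.0 * lr * step)      # larger lr decays faster


def predict_next(curve):
    """Naive stand-in surrogate: assume the last improvement repeats once.

    A learned surrogate (such as Adam-PFN in the paper) would replace this.
    """
    if len(curve) < 2:
        return float("-inf")  # too little data: force this config to be explored
    return curve[-1] - (curve[-2] - curve[-1])


def freeze_thaw(lrs, budget, max_steps):
    """Spend `budget` single training steps across the candidate learning rates."""
    curves = {lr: [] for lr in lrs}
    for _ in range(budget):
        # Only configurations that have not reached max_steps can be resumed.
        candidates = [lr for lr in lrs if len(curves[lr]) < max_steps]
        if not candidates:
            break
        # Thaw the configuration with the lowest predicted loss; the rest stay frozen.
        lr = min(candidates, key=lambda c: predict_next(curves[c]))
        curves[lr].append(toy_loss(lr, len(curves[lr]) + 1))
    started = [lr for lr in lrs if curves[lr]]
    best = min(started, key=lambda c: curves[c][-1])  # lowest observed loss so far
    return best, curves


lrs = [1e-4, 1e-3, 1e-2, 1e-1]
best, curves = freeze_thaw(lrs, budget=30, max_steps=20)
print("best lr:", best)
print("steps per config:", {lr: len(c) for lr, c in curves.items()})
```

The quality of the loop depends almost entirely on `predict_next`: a surrogate with no prior knowledge of how hyperparameters shape a learning curve can only react to the points it has seen, which is the limitation the abstract attributes to generic surrogates.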
