How far away are truly hyperparameter-free learning algorithms?

Priya Kasimbeg; Vincent Roulet; Naman Agarwal; Sourabh Medapati; Fabian Pedregosa; Atish Agarwala; George E. Dahl

arXiv:2505.24005 · cs.LG · June 2, 2025

Open Access

TL;DR

This paper evaluates how close learning algorithms are to being hyperparameter-free, focusing on methods that remove learning rate tuning. It finds that, although these methods are promising, they still lag behind well-calibrated baselines, leaving room for further development.

Contribution

The study assesses learning-rate-free methods on a comprehensive benchmark and highlights the gap between these methods and strong baseline algorithms, underscoring the need for better ways to reduce hyperparameter tuning.

Findings

01 Default settings of learning-rate-free methods perform poorly on the benchmark.

02 Calibrating learning-rate-free methods improves their performance.

03 Calibrated learning-rate-free methods still lag behind strong baseline algorithms.

Abstract

Despite major advances in methodology, hyperparameter tuning remains a crucial (and expensive) part of the development of machine learning systems. Even ignoring architectural choices, deep neural networks have a large number of optimization and regularization hyperparameters that need to be tuned carefully per workload in order to obtain the best results. In a perfect world, training algorithms would not require workload-specific hyperparameter tuning, but would instead have default settings that performed well across many workloads. Recently, there has been a growing literature on optimization methods which attempt to reduce the number of hyperparameters -- particularly the learning rate and its accompanying schedule. Given these developments, how far away is the dream of neural network training algorithms that completely obviate the need for painful tuning? In this paper, we…

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Neural Networks and Applications