Can LLMs Beat Classical Hyperparameter Optimization Algorithms? A Study on autoresearch

Fabio Ferreira; Lucca Wobbe; Arjun Krishnakumar; Frank Hutter; Arber Zela

arXiv:2603.24647·cs.LG·April 21, 2026

TL;DR

This study uses the autoresearch repository as a testbed to compare classical hyperparameter optimization (HPO) algorithms with LLM-based methods. Classical methods generally outperform LLM-based ones; the exception is Centaur, a hybrid approach that combines the strengths of both.

Contribution

The paper introduces Centaur, a hybrid hyperparameter optimization method that combines CMA-ES with LLMs and achieves better results than purely classical or purely LLM-based approaches.

Findings

01 Classical methods such as CMA-ES and TPE consistently outperform LLM-based agents when the search space is fixed.

02 Allowing the LLM to edit source code directly narrows the gap to classical methods but does not close it.

03 Centaur, a hybrid approach, outperforms all tested methods.

Abstract

The autoresearch repository enables an LLM agent to optimize hyperparameters by editing training code directly. We use it as a testbed to compare classical HPO algorithms against LLM-based methods on tuning the hyperparameters of a small language model under a fixed compute budget. When defining a fixed search space over autoresearch, classical methods such as CMA-ES and TPE consistently outperform LLM-based agents, where avoiding out-of-memory failures matters more than search diversity. Allowing the LLM to directly edit source code narrows the gap to the classical methods but does not close it, even with frontier models available at the time of writing such as Claude Opus 4.6 and Gemini 3.1 Pro Preview. We observe that LLMs struggle to track optimization state across trials. In contrast, classical methods lack the domain knowledge of LLMs. To combine the strengths of both, we…
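To make the fixed-search-space setting concrete, here is a minimal sketch of a trial loop under a fixed budget that records out-of-memory failures instead of crashing. It is not the paper's code: it uses plain random search rather than CMA-ES or TPE, and the search space, the `FAILURE_LOSS` penalty, and the `toy_train` objective are all hypothetical stand-ins chosen for illustration.

```python
import math
import random

# Hypothetical fixed search space: name -> (low, high, log_scale).
SEARCH_SPACE = {
    "learning_rate": (1e-5, 1e-1, True),
    "batch_size": (8, 512, True),
}

# Assumed penalty assigned to trials that run out of memory.
FAILURE_LOSS = 1e9


def sample_config(rng):
    """Draw one configuration uniformly (log-uniformly where flagged)."""
    config = {}
    for name, (low, high, log_scale) in SEARCH_SPACE.items():
        if log_scale:
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            config[name] = rng.uniform(low, high)
    config["batch_size"] = int(round(config["batch_size"]))
    return config


def toy_train(config):
    """Stand-in for a training run; raises MemoryError for large batches."""
    if config["batch_size"] > 256:
        raise MemoryError("simulated out-of-memory failure")
    # Toy validation loss, minimized at learning_rate = 1e-3.
    return (math.log10(config["learning_rate"]) + 3.0) ** 2


def run_search(num_trials, seed=0):
    """Random search under a fixed trial budget, recording failed trials."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_trials):
        config = sample_config(rng)
        try:
            loss = toy_train(config)
            failed = False
        except MemoryError:
            loss, failed = FAILURE_LOSS, True
        history.append({"config": config, "loss": loss, "failed": failed})
    best = min(history, key=lambda trial: trial["loss"])
    return best, history
```

Calling `run_search(50)` returns the best trial and the full history. Every failed trial still consumes one unit of the budget, which is one way to see why avoiding out-of-memory configurations can matter more than search diversity in this setting.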

Code & Models

Repositories

ferreirafabio/autoresearch-automl (GitHub)