Improving LLM-based Global Optimization with Search Space Partitioning

Andrej Schwanke; Lyubomir Ivanov; David Salinas; Fabio Ferreira; Aaron Klein; Frank Hutter; Arber Zela

arXiv:2505.21372·cs.LG·January 28, 2026

TL;DR

HOLLM is a new global optimization algorithm that improves LLM-based search by partitioning the search space into promising regions, balancing exploration and exploitation, and achieving superior results on benchmarks.

Contribution

The paper introduces HOLLM, a novel search space partitioning method that enhances LLM-driven optimization without requiring domain knowledge.

Findings

01

HOLLM outperforms existing LLM-based sampling strategies.

02

HOLLM matches or exceeds leading global optimization methods.

03

Partitioning search space improves high-dimensional optimization performance.

Abstract

Large Language Models (LLMs) have recently emerged as effective surrogate models and candidate generators within global optimization frameworks for expensive blackbox functions. Despite promising results, LLM-based methods often struggle in high-dimensional search spaces or when lacking domain-specific priors, leading to sparse or uninformative suggestions. To overcome these limitations, we propose HOLLM, a novel global optimization algorithm that enhances LLM-driven sampling by partitioning the search space into promising subregions. Each subregion acts as a "meta-arm" selected via a bandit-inspired scoring mechanism that effectively balances exploration and exploitation. Within each selected subregion, an LLM then proposes high-quality candidate points, without any explicit domain knowledge. Empirical evaluation on standard optimization benchmarks shows that HOLLM consistently…
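
The loop the abstract describes (partition the space, score each subregion as an arm, sample inside the chosen subregion) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: the partition is a simple median split along the widest dimension, the score is a plain UCB1-style bonus rather than the paper's scoring function, the objective is maximized, and the LLM proposal step is replaced by a uniform-sampling placeholder. All function names (`partition`, `ucb_score`, `propose_in_region`, `optimize`) are hypothetical.

```python
import math
import random


def partition(bounds, data, max_points=4):
    """Recursively median-split a box until each leaf holds at most max_points points.

    bounds: list of (lo, hi) per dimension; data: list of (x, y) evaluations.
    Returns a list of (leaf_bounds, leaf_data) pairs.
    """
    if len(data) <= max_points:
        return [(bounds, data)]
    # Split along the widest dimension at the median observed coordinate.
    dim = max(range(len(bounds)), key=lambda d: bounds[d][1] - bounds[d][0])
    cut = sorted(x[dim] for x, _ in data)[len(data) // 2]
    left = [(x, y) for x, y in data if x[dim] < cut]
    right = [(x, y) for x, y in data if x[dim] >= cut]
    if not left or not right:  # degenerate split: stop here
        return [(bounds, data)]
    lb, rb = list(bounds), list(bounds)
    lb[dim] = (bounds[dim][0], cut)
    rb[dim] = (cut, bounds[dim][1])
    return partition(lb, left, max_points) + partition(rb, right, max_points)


def ucb_score(ys, total, c=1.0):
    """Assumed UCB1-style score: mean value plus an exploration bonus."""
    if not ys:
        return math.inf
    return sum(ys) / len(ys) + c * math.sqrt(math.log(total) / len(ys))


def propose_in_region(box, rng):
    """Placeholder for the LLM proposal step: uniform sampling inside the box."""
    return tuple(rng.uniform(lo, hi) for lo, hi in box)


def optimize(f, bounds, n_init=8, n_iters=20, seed=0):
    """Maximize f over an axis-aligned box; returns the best (x, y) found."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_init):
        x = propose_in_region(bounds, rng)
        data.append((x, f(x)))
    for _ in range(n_iters):
        # Rebuild the partition from all evaluations, then pick the best-scoring leaf.
        leaves = partition(list(bounds), data)
        box, _ = max(
            leaves,
            key=lambda leaf: ucb_score([y for _, y in leaf[1]], len(data)),
        )
        x = propose_in_region(box, rng)
        data.append((x, f(x)))
    return max(data, key=lambda xy: xy[1])
```

Calling `optimize(lambda x: -(x[0] ** 2 + x[1] ** 2), [(-1.0, 1.0), (-1.0, 1.0)])` returns the best point found and its value. Swapping `propose_in_region` for a call to a language model that is prompted with the box bounds and the evaluations inside it would be the natural place to plug in an LLM sampler.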

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- The authors showcase extensive results demonstrating the applicability of HOLLM.
- The authors provide ablation studies of the design choices in the appendix.
- In general, the paper is well written and presented.

Weaknesses

- My only doubt is that the authors develop the method from the motivation that LLMs do not cover the solution space efficiently. However, the method does not necessarily seem to need an LLM in its design. Why, then, are LLMs relevant for this method? My first assumption would be that LLMs have a strong inductive bias about the problem, which makes them sample more efficiently. However, the inductive biases would come from the context given to the LLM. This is a point tha

Reviewer 02 · Rating 4 · Confidence 5

Strengths

- The paper clearly identifies and demonstrates a practical weakness of LLM-based samplers (their high bias and inability to cover a space effectively, as shown in Figure 1) and proposes an intuitive solution.
- The inclusion of a "global LLM" baseline is crucial, as it provides strong evidence that the partitioning framework itself, not just the use of an LLM, is responsible for the performance gains

Weaknesses

- The paper's primary contribution appears incremental. The core components (hierarchical space partitioning, e.g., KD-trees, and bandit-based region selection) are well-established techniques in the black-box and hierarchical optimization literature (e.g., HOO, MABs). The method seems to primarily substitute a traditional sampler within this framework with an LLM, which may limit the work's fundamental novelty.
- There is a notable disconnect between the paper's motivation and its empirical evalu

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The paper gives intuitive and empirical evidence of LLM failure modes in high-dimensional spaces (biased coverage, mode-seeking), which it addresses by searching/sampling within bounded subregions.
- The algorithm makes sense, combining KD-tree partitioning, UCB-style scoring/selection over the partitions (as arms), and LLM-guided local BO. While each of those components exists in some form in the BO/MAB/hierarchical bandit literature, the particular design feels suitable and well-motivated

Weaknesses

- The arm-level scoring function feels slightly ad hoc and slightly complicated; it is not clear why the HV-based term and UCB-V (as opposed to other UCB variants) are necessary. The HOLLM vs. global LLM-based BO comparison is a good ablation, but it would also be interesting to isolate the different terms (e.g., UCB-variance) to see if they matter in practice.
- It would be interesting to understand the LLM performance specifically, e.g., how well the surrogate performs vs. a GP.
- One possibl

Code & Models

Repositories

automl/hollm
pytorch · Official

Taxonomy

Topics: Metaheuristic Optimization Algorithms Research · Distributed and Parallel Computing Systems · Advanced Manufacturing and Logistics Optimization