Local Entropy Search over Descent Sequences for Bayesian Optimization

David Stenger; Armin Lindicke; Alexander von Rohr; Sebastian Trimpe

arXiv:2511.19241 · cs.LG · November 25, 2025

TL;DR

This paper introduces Local Entropy Search (LES), a Bayesian optimization method that targets the solutions reachable by the descent sequences of a local optimizer rather than the global optimum, and reports strong sample efficiency on synthetic objectives and benchmark problems.

Contribution

LES explicitly models descent sequences in Bayesian optimization, combining analytic and sampling methods to improve sample efficiency on complex problems.

Findings

01. LES outperforms existing Bayesian optimization methods in sample efficiency.
02. LES effectively models local descent sequences for complex objectives.
03. Empirical results demonstrate LES's robustness on synthetic and benchmark problems.

Abstract

Searching large and complex design spaces for a global optimum can be infeasible and unnecessary. A practical alternative is to iteratively refine the neighborhood of an initial design using local optimization methods such as gradient descent. We propose local entropy search (LES), a Bayesian optimization paradigm that explicitly targets the solutions reachable by the descent sequences of iterative optimizers. The algorithm propagates the posterior belief over the objective through the optimizer, resulting in a probability distribution over descent sequences. It then selects the next evaluation by maximizing mutual information with that distribution, using a combination of analytic entropy calculations and Monte-Carlo sampling of descent sequences. Empirical results on high-complexity synthetic objectives and benchmark problems show that LES achieves strong sample efficiency compared to…
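The step of propagating the posterior belief through the optimizer can be illustrated with a minimal Monte-Carlo sketch. It is not the authors' implementation: it assumes a Gaussian-process surrogate approximated with random Fourier features for an RBF kernel, and plain gradient descent as the local optimizer. The function name `sample_descent_sequences` and all of its parameters are hypothetical, chosen only for this illustration. The mutual-information maximization that selects the next evaluation is not shown.

```python
import numpy as np


def sample_descent_sequences(X, y, x0, n_samples=16, n_steps=20, step_size=0.1,
                             n_features=200, lengthscale=0.5, noise=1e-2, seed=0):
    """Draw approximate GP posterior samples and run gradient descent on each.

    Assumptions (not from the paper): RBF kernel approximated by random
    Fourier features, fixed-step gradient descent, zero prior mean.
    Returns an array of shape (n_samples, n_steps + 1, d).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]

    # Random Fourier features: f(x) ~= w . phi(x), with w ~ N(0, I) a priori.
    omega = rng.normal(scale=1.0 / lengthscale, size=(n_features, d))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    scale = np.sqrt(2.0 / n_features)

    def features(x):
        return scale * np.cos(x @ omega.T + phase)

    # Bayesian linear regression in feature space: posterior over weights.
    Phi = features(X)                                    # (n, m)
    A = Phi.T @ Phi / noise + np.eye(n_features)         # posterior precision
    mean = np.linalg.solve(A, Phi.T @ y / noise)         # posterior mean
    L = np.linalg.cholesky(A)                            # A = L L^T
    z = rng.standard_normal((n_samples, n_features))
    # mean + L^{-T} z has covariance A^{-1}, the posterior covariance.
    weights = mean + np.linalg.solve(L.T, z.T).T         # (s, m)

    sequences = np.empty((n_samples, n_steps + 1, d))
    sequences[:, 0, :] = x0
    for t in range(n_steps):
        x = sequences[:, t, :]                           # (s, d)
        # Analytic gradient of each sampled function at its own iterate.
        dphi = -scale * np.sin(x @ omega.T + phase)      # (s, m)
        grad = (weights * dphi) @ omega                  # (s, d)
        sequences[:, t + 1, :] = x - step_size * grad
    return sequences


# Usage: three noisy observations of a 1-D objective, descent started at 0.4.
X_obs = np.array([[0.0], [0.5], [1.0]])
y_obs = np.array([1.0, 0.2, 0.8])
seqs = sample_descent_sequences(X_obs, y_obs, x0=np.array([0.4]),
                                n_samples=8, n_steps=5)
```

Each row of `seqs` is one plausible descent sequence under the current posterior; the spread of the rows is the kind of distribution over sequences that the abstract describes the acquisition function as reasoning about.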

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 10 · Confidence 4

Strengths

1. The paper shows real depth, both theoretically and empirically. The authors clearly know the literature, handle the fine details, and take a principled path to build LES.
2. The empirical study is broad, with enough variants to support strong claims about LES performance across complex settings, different distribution regimes (in-model and out-of-model), and a thoughtful set of baselines.
3. The paper gives a robust treatment of edge cases. It shows the one-step equivalence to GIBO, explai

Weaknesses

1. The most prominent weakness, which the authors already acknowledge, is the tendency to get trapped in a local minimum basin, especially in complex settings with a highly multimodal posterior. The paper offers a careful discussion of this issue and introduces a stopping rule to trigger restarts, but the risk remains in challenging landscapes.
2. LES performance depends on several core components, including the chosen optimizer, the surrogate model specification, and the underlying problem com

Reviewer 02 · Rating 8 · Confidence 3

Strengths

- The proposed AF belongs to the class of information-theoretic AFs, a class of theoretically grounded strategies. Framing the problem as trying to gain information over local optima rather than global optima is novel and well-motivated, specifically for high-dimensional settings, and the practical execution into a computationally reasonable algorithm is also a contribution in itself.
- The experiments section is quite extensive, to say the least. I have to say that I rarely encounter BO paper

Weaknesses

I cannot pinpoint concrete weaknesses. The limitations of the approach are clearly stated, albeit briefly in the conclusion, but are presented in more detail in the appendix. I agree with such limitations; they do not represent grounds for rejection in my opinion.

Reviewer 03 · Rating 4 · Confidence 3

Strengths

- The acquisition function is very straightforward.
- The combination of entropy search with local search is refreshing.

Weaknesses

- The empirical benchmark, although it goes up to 124 dimensions, is somewhat easy and not very representative of BO benchmarks; it is unclear whether the method will really be competitive in high input dimensionality. To demonstrate the real benefit of the acquisition function, which to me is the most promising point of this work, the authors should conduct a more thorough empirical study that provides more compelling results.
- The algorithm is rather heuristic without too much theoretical insight. The exi

Taxonomy

Topics: Advanced Multi-Objective Optimization Algorithms · Metaheuristic Optimization Algorithms Research · Stochastic Gradient Optimization Techniques