Task-free Adaptive Meta Black-box Optimization
Chao Wang, Licheng Jiao, Lingling Li, Jiaxuan Zhao, Guanchun Wang, Fang Liu, Shuyuan Yang

TL;DR
This paper introduces ABOM, an online adaptive meta-optimizer for black-box problems that learns and updates parameters during optimization, enabling zero-shot performance without prior training tasks.
Contribution
ABOM is a novel online adaptive meta-optimizer that self-updates its parameters during optimization, eliminating the need for predefined training tasks and enabling zero-shot black-box optimization.
Findings
Achieves competitive results on synthetic benchmarks
Performs effectively on UAV path planning problems
Operators exhibit meaningful search patterns
Abstract
Handcrafted optimizers become prohibitively inefficient for complex black-box optimization (BBO) tasks. MetaBBO addresses this challenge by meta-learning to automatically configure optimizers for low-level BBO tasks, thereby eliminating heuristic dependencies. However, existing methods typically require extensive handcrafted training tasks to learn meta-strategies that generalize to target tasks, which poses a critical limitation for realistic applications with unknown task distributions. To overcome this issue, we propose the Adaptive meta Black-box Optimization Model (ABOM), which performs online parameter adaptation using solely optimization data from the target task, obviating the need for predefined task distributions. Unlike conventional MetaBBO frameworks that decouple meta-training and optimization phases, ABOM introduces a closed-loop adaptive parameter learning mechanism, where…
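The closed-loop idea can be illustrated with a deliberately small toy. The sketch below is not ABOM: the paper's operators are attention-driven modules, whereas this stand-in has a single learnable scalar `theta` (a hypothetical step fraction toward a paired elite). It only shows the loop structure the abstract describes, using the elite-distance loss quoted in the peer reviews: generate offspring, take a gradient step on the squared offspring-to-elite distance, then select, with no pre-training. The sphere objective, the normalised and clamped update, and all names and default values are assumptions made for illustration.

```python
import random


def sphere(x):
    """Toy black-box objective standing in for the target task."""
    return sum(v * v for v in x)


def run_online_adaptation(dim=5, pop_size=20, n_elite=5, generations=50,
                          lr=0.1, sigma=0.1, seed=0):
    """Toy closed loop: the optimizer parameter is updated while optimizing."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(pop_size)]
    theta = 0.1   # learnable step fraction toward the paired elite (assumed)
    history = []  # best objective value at the start of each generation

    for _ in range(generations):
        pop.sort(key=sphere)
        history.append(sphere(pop[0]))
        elites = pop[:n_elite]  # elite archive taken from the current population

        offspring, grad, norm = [], 0.0, 1e-12
        for i, parent in enumerate(pop):
            elite = elites[i % n_elite]
            child = [p + theta * (e - p) + rng.gauss(0.0, sigma)
                     for p, e in zip(parent, elite)]
            offspring.append(child)
            # d/d(theta) ||child - elite||^2 = 2 (child - elite) . (elite - parent)
            grad += 2.0 * sum((c - e) * (e - p)
                              for c, e, p in zip(child, elite, parent))
            norm += sum((e - p) ** 2 for e, p in zip(elite, parent))

        # Scale-normalised gradient step on the elite-distance loss,
        # clamped to [0, 1] so the toy stays well behaved.
        theta = min(1.0, max(0.0, theta - lr * grad / norm))

        # Elitist survivor selection over parents and offspring.
        pop = sorted(pop + offspring, key=sphere)[:pop_size]

    history.append(sphere(pop[0]))
    return theta, history


if __name__ == "__main__":
    theta, history = run_online_adaptation()
    print("final theta:", theta)
    print("best value: start", history[0], "end", history[-1])
```

Because the loss is minimised by placing offspring on top of the elites, `theta` is pushed toward 1 in this toy, which mirrors the greediness concern raised in the reviews.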
Peer Reviews
Decision: ICLR 2026 Oral
I appreciate the novelty of this paper. Since existing MetaBBO approaches require a pre-defined problem distribution and corresponding pre-training to make the meta-level policy generalizable, the idea in this paper (online adaptation through self-supervision) points the way to more efficient MetaBBO methods.
1. Theoretical perspective: I have to say that although the authors provide an intuitive proof of what they claim (Corollary 1, 2, Theorem 3.1), a strong assumption (one that may not hold in real optimization problems) makes me doubt the rationale behind the proof. This assumption is: "the elite solution is ϵ-suboptimal". I wonder how such an assumption can be guaranteed, since we face randomized (stochastic) optimization here. This question becomes more pressing when we consider Eq. (10), what if the elite infor
1. Sound Method Design: The parameterization of evolutionary operators (selection, crossover, mutation) into differentiable, attention-driven modules is well-designed. The online update of optimizer parameters is achieved by using population data generated during the target problem's runtime, with the objective of minimizing the distance between offspring and the elite archive as the loss function. 2. Theoretical Guarantee (to a degree): The paper provides a convergence proof under idealized as
1. The paper needs to explain in detail the differences between the proposed method and other methods, such as GLHF and B2OPT, especially regarding the parameterization of evolutionary operators. If prior methods were adopted, proper citations are required. 2. The loss function $\min _\theta||\hat{P}^{(t)}-E^{(t)}||_2$ encourages the offspring population $\hat{P}^{(t)}$ to be close to the current elite archive $E^{(t)}$, essentially using the current local optima to guide optimizer updates. This is a f
1. Unlike existing MetaBBO methods based on deep neural networks, ABOM does not require pre-training, saving substantial time and computational resources. Additionally, the model takes the population itself as input and produces offspring in an end-to-end manner, reducing the human effort required to design optimization states and model actions. Although this end-to-end approach is also adopted in other MetaBBO methods such as RNN-OI and GLHF, ABOM's loss function avoids requiring the gradient o
1. The loss function is calculated as the distance between offspring and the elite archive. Given that the search range and problem dimensionality can be large (e.g., [-100, 100] range and 500 dimensions for BBOB in this paper), the scale of the loss and gradients could be large, potentially causing unstable training. Furthermore, the loss function encourages the model to generate offspring that completely surpass the parent population, representing a greedy strategy that may reduce exploration.
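The scale concern in the last review can be made concrete with a rough calculation. The sketch below is not from the paper. It assumes, as a hypothetical worst case (for example at initialization), that offspring and elites are independent uniform draws over [-a, a]^d. Each coordinate difference then has variance 2a²/3, so the root-mean-square distance is sqrt(2a²d/3). The code compares this closed form with a Monte Carlo estimate; the function names are illustrative.

```python
import math
import random


def rms_distance_uniform(half_width, dim):
    """Closed-form RMS Euclidean distance between two independent points
    drawn uniformly from [-half_width, half_width]^dim."""
    return math.sqrt(2.0 * half_width ** 2 * dim / 3.0)


def sampled_rms_distance(half_width, dim, trials=200, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sq = 0.0
        for _ in range(dim):
            diff = (rng.uniform(-half_width, half_width)
                    - rng.uniform(-half_width, half_width))
            sq += diff * diff
        total += sq
    return math.sqrt(total / trials)


if __name__ == "__main__":
    # Setting cited in the review: range [-100, 100], 500 dimensions.
    print(rms_distance_uniform(100.0, 500))   # about 1825.74
    print(sampled_rms_distance(100.0, 500))
    # Same dimensionality with coordinates rescaled to [-1, 1].
    print(rms_distance_uniform(1.0, 500))     # about 18.26
```

Under this assumption a per-individual distance is on the order of 1.8 × 10³ in the cited setting and shrinks by a factor of 100 if coordinates are rescaled to [-1, 1]. Whether the paper applies any such normalization is not stated in the excerpts here.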
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Robotic Path Planning Algorithms · Spacecraft Dynamics and Control
