Towards Universal Offline Black-Box Optimization via Learning Language Model Embeddings

Rong-Xi Tan; Ming Chen; Ke Xue; Yao Wang; Yaoyuan Wang; Sheng Fu; Chao Qian

arXiv:2506.07109·cs.LG·June 10, 2025

Open Access · 1 Repo

TL;DR

This paper proposes using language model embeddings to build universal offline black-box optimization algorithms that handle heterogeneous data across multiple domains, moving beyond the single-task, fixed-dimensional settings of existing methods.

Contribution

It introduces a novel approach using language model priors and embedding spaces to enable cross-domain universal black-box optimization.

Findings

01

Experiments show the proposed methods achieve universality and effectiveness.

02

Unifying language model priors with string embedding spaces overcomes the single-task, fixed-dimensional barriers of traditional offline BBO.

03

The approach enables general-purpose black-box optimization across diverse data types.

Abstract

The pursuit of universal black-box optimization (BBO) algorithms is a longstanding goal. However, unlike domains such as language or vision, where scaling structured data has driven generalization, progress in offline BBO remains hindered by the lack of unified representations for heterogeneous numerical spaces. As a result, existing offline BBO approaches are constrained to single-task and fixed-dimensional settings and fail to achieve cross-domain universal optimization. Recent advances in language models (LMs) offer a promising path forward: their embeddings capture latent relationships in a unifying way, making universal optimization across different data types possible. In this paper, we discuss multiple potential approaches, including an end-to-end learning framework in the form of next-token prediction, as well as prioritizing the learning of latent spaces with strong representational…
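To make the idea of a shared string embedding space concrete, here is a minimal sketch. It is not the authors' implementation. It assumes designs from tasks of different dimensionality are serialized to text, embedded into one fixed-size space, and scored by a single surrogate. The serialization format, the `toy_embed` function (a hashed character-trigram stand-in for a real LM embedding), and the ridge surrogate are all illustrative assumptions.

```python
import hashlib
import numpy as np

EMBED_DIM = 64  # assumed embedding size, chosen only for illustration


def serialize(task_name, x):
    """Render a design vector of any length as a string (hypothetical format)."""
    values = ", ".join(f"x{i}={v:.3f}" for i, v in enumerate(x))
    return f"task: {task_name}; design: {values}"


def toy_embed(text, dim=EMBED_DIM):
    """Stand-in for an LM embedding: hashed character-trigram counts, L2-normalized."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        digest = hashlib.sha256(text[i:i + 3].encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "little") % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def fit_ridge(Z, y, lam=1e-2):
    """Closed-form ridge regression on embedded designs."""
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ y)


rng = np.random.default_rng(0)

# Two toy offline datasets with different input dimensions (3 and 5).
task_a = [rng.uniform(-1, 1, size=3) for _ in range(50)]
task_b = [rng.uniform(-1, 1, size=5) for _ in range(50)]

texts = [serialize("a", x) for x in task_a] + [serialize("b", x) for x in task_b]
scores = np.array(
    [-np.sum(x ** 2) for x in task_a] + [-np.sum(np.abs(x)) for x in task_b]
)

# Both tasks land in the same fixed-size space, so one surrogate covers both.
Z = np.stack([toy_embed(t) for t in texts])
w = fit_ridge(Z, scores)
preds = Z @ w
```

The point of the sketch is only that string serialization removes the fixed-dimension constraint: a 3-dimensional and a 5-dimensional design both map to a 64-dimensional vector. A real system would replace `toy_embed` with embeddings from a language model.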

Code & Models

Repositories

lamda-bbo/universal-offline-bbo
PyTorch · Official

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis