Advanced Black-Box Tuning of Large Language Models with Limited API Calls

Zhikang Xie; Weilin Wan; Peizhu Gong; Weizhong Zhang; Cheng Jin

arXiv:2511.10210·cs.AI·December 17, 2025

TL;DR

This paper introduces a novel black-box tuning method for large language models that uses a Gaussian Process surrogate to minimize API calls, significantly improving accuracy and efficiency in model adaptation.

Contribution

The paper proposes a new black-box tuning approach that leverages a Gaussian Process surrogate with LogitMap Pairs to reduce API queries while maintaining high accuracy.
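As a rough illustration of the surrogate idea, the sketch below fits a Gaussian Process regressor on a handful of (input, logits) pairs and then predicts logits for new inputs without further queries. It is not the paper's implementation: treating a "LogitMap Pair" as an (input features, foundation-model logits) pair is an assumption, and `fake_foundation_logits`, `GPSurrogate`, the RBF kernel, and all hyperparameters are hypothetical choices made only for this example.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """RBF (squared-exponential) kernel between the rows of a and b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def fake_foundation_logits(x):
    """Hypothetical stand-in for an API call that returns logits."""
    w = np.random.default_rng(1).normal(size=(x.shape[1], 3))
    return np.tanh(x @ w)


class GPSurrogate:
    """Minimal GP regression (posterior mean only), one output per logit."""

    def __init__(self, length_scale=1.0, noise=1e-6):
        self.length_scale = length_scale
        self.noise = noise

    def fit(self, x, y):
        # x: (n, d) input features; y: (n, k) logits from the queried model.
        self.x_train = x
        k = rbf_kernel(x, x, self.length_scale) + self.noise * np.eye(len(x))
        chol = np.linalg.cholesky(k)
        # Solve K alpha = y using the Cholesky factor.
        self.alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
        return self

    def predict(self, x_new):
        # Posterior mean: K(x_new, x_train) @ alpha.
        return rbf_kernel(x_new, self.x_train, self.length_scale) @ self.alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_all = rng.normal(size=(200, 4))      # assumed input features
    subset = x_all[:20]                    # small queried subset
    pairs_y = fake_foundation_logits(subset)  # the only "API calls" made
    surrogate = GPSurrogate().fit(subset, pairs_y)
    approx = surrogate.predict(x_all[20:])  # no further queries needed
    print(approx.shape)
```

In this toy setup the foundation model is queried on 20 of 200 inputs, and the surrogate supplies approximate logits for the rest. How the paper selects the subset and how it uses the surrogate's outputs during tuning are not covered by this sketch.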

Findings

01. Model accuracy improved from 55.92% to 86.85%.

02. API query frequency reduced to 1.38%.

03. Outperforms offline and query-intensive methods.

Abstract

Black-box tuning is an emerging paradigm for adapting large language models (LLMs) to better achieve desired behaviors, particularly when direct access to model parameters is unavailable. Current strategies, however, often present a dilemma of suboptimal extremes: either separately training a small proxy model and then using it to shift the predictions of the foundation model, which offers notable efficiency but often yields limited improvement; or making API calls to the foundation model in each tuning iteration, which entails prohibitive computational costs. Therefore, we propose a novel advanced black-box tuning method for LLMs with limited API calls. Our core strategy involves training a Gaussian Process (GP) surrogate model with "LogitMap Pairs" derived from querying the foundation model on a minimal but highly informative training subset. This surrogate can approximate the outputs of the…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Machine Learning in Materials Science