Inference-Aware Prompt Optimization for Aligning Black-Box Large Language Models

Saaduddin Mahmud; Mason Nakamura; Kyle Hollins Wray; Shlomo Zilberstein

arXiv:2508.10030·cs.CL·February 11, 2026

TL;DR

This paper introduces IAPO, a unified framework that jointly optimizes prompts and inference strategies for large language models, improving alignment by considering inference costs and user preferences.

Contribution

The paper presents an inference-aware prompt optimization framework and a fixed-budget training algorithm with theoretical guarantees, addressing the gap left by existing methods that optimize prompts without accounting for the inference strategy.

Findings

01 IAPO improves alignment across multiple tasks.

02 Incorporating inference-awareness enhances prompt optimization effectiveness.

03 PSST provides finite-budget guarantees on error probability.

Abstract

Prompt optimization methods have demonstrated significant effectiveness in aligning black-box large language models (LLMs). In parallel, inference scaling strategies such as Best-of-N Sampling and Majority Voting have likewise been shown to improve alignment and performance by trading additional computation for better output. However, existing prompt optimization approaches are inference-strategy agnostic; that is, they optimize prompts without accounting for the inference strategy. This constitutes a significant methodological gap, as our empirical and theoretical analysis reveals a strong interdependence between these two paradigms. Moreover, we find that user preferences regarding trade-offs among multiple objectives and inference budgets substantially influence the choice of prompt and inference configuration. To address this gap, we introduce a novel unified framework named IAPO…
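For readers unfamiliar with the two inference scaling strategies named in the abstract, the following is a minimal sketch of Best-of-N Sampling and Majority Voting. It is not the paper's implementation: the names `best_of_n`, `majority_vote`, `sample`, and `score` are hypothetical, and the toy sampler stands in for a call to a black-box LLM under a fixed prompt. Both strategies spend `n` samples (the inference budget) to produce one output.

```python
import random
from collections import Counter
from typing import Callable


def best_of_n(sample: Callable[[], str], score: Callable[[str], float], n: int) -> str:
    """Draw n candidates and return the one the scorer rates highest.

    `sample` is a hypothetical stand-in for one LLM call under a fixed prompt;
    `score` is a hypothetical stand-in for a reward or preference model.
    """
    candidates = [sample() for _ in range(n)]
    return max(candidates, key=score)


def majority_vote(sample: Callable[[], str], n: int) -> str:
    """Draw n candidates and return the most frequent answer.

    Ties go to the answer that was sampled first.
    """
    candidates = [sample() for _ in range(n)]
    return Counter(candidates).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)

    # Toy sampler (an assumption for illustration): answers "42" most of the time.
    def toy_sample() -> str:
        return rng.choices(["42", "41", "forty-two"], weights=[0.6, 0.3, 0.1])[0]

    # Toy scorer (an assumption for illustration): prefers longer answers.
    def toy_score(answer: str) -> float:
        return float(len(answer))

    print("majority vote:", majority_vote(toy_sample, n=9))
    print("best of n:", best_of_n(toy_sample, toy_score, n=9))
```

In both functions the output distribution depends on the sampler, which in practice depends on the prompt, and on `n`. This is the kind of coupling between prompt choice and inference configuration that the abstract describes.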
