TL;DR
This paper introduces IAPO, a unified framework that jointly optimizes prompts and inference strategies for large language models, improving alignment by considering inference costs and user preferences.
Contribution
It presents a novel inference-aware prompt optimization framework and a fixed-budget training algorithm with theoretical guarantees, addressing a key gap in existing methods.
Findings
IAPO improves alignment across multiple tasks.
Incorporating inference-awareness enhances prompt optimization effectiveness.
PSST, the fixed-budget training algorithm, provides finite-budget guarantees on error probability.
Abstract
Prompt optimization methods have demonstrated significant effectiveness in aligning black-box large language models (LLMs). In parallel, inference scaling strategies such as Best-of-N Sampling and Majority Voting have likewise been shown to improve alignment and performance by trading additional computation for better output. However, existing prompt optimization approaches are inference-strategy agnostic; that is, they optimize prompts without accounting for the inference strategy. This constitutes a significant methodological gap, as our empirical and theoretical analysis reveals a strong interdependence between these two paradigms. Moreover, we find that user preferences regarding trade-offs among multiple objectives and inference budgets substantially influence the choice of prompt and inference configuration. To address this gap, we introduce a novel unified framework named IAPO…
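For readers unfamiliar with the two inference scaling strategies named in the abstract, the following is a minimal sketch of each, not the paper's implementation. The `sample` and `score` callables are hypothetical stand-ins for a call to an LLM under a fixed prompt and for a reward or preference scorer; both are assumptions made for illustration.

```python
from collections import Counter
from typing import Callable


def best_of_n(sample: Callable[[], str], score: Callable[[str], float], n: int) -> str:
    """Draw n candidate outputs and return the one with the highest score."""
    candidates = [sample() for _ in range(n)]
    return max(candidates, key=score)


def majority_vote(sample: Callable[[], str], n: int) -> str:
    """Draw n candidate outputs and return the most frequent one."""
    candidates = [sample() for _ in range(n)]
    # most_common(1) returns a list holding the single (answer, count) pair
    # with the highest count.
    return Counter(candidates).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical stand-in for sampling from an LLM: replay a fixed list.
    answers = iter(["42", "41", "42", "43", "42"])
    print(majority_vote(lambda: next(answers), n=5))

    # Hypothetical scorer: here, longer outputs simply score higher.
    drafts = iter(["ok", "a fuller answer", "short"])
    print(best_of_n(lambda: next(drafts), score=len, n=3))
```

In both functions the budget `n` is the knob that trades extra computation for output quality, which is why the choice of prompt and the choice of inference strategy can interact.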
