Rethinking Prompt Optimization: Reinforcement, Diversification, and Migration in Blackbox LLMs

MohammadReza Davari; Utkarsh Garg; Weixin Cai; Eugene Belilovsky

arXiv:2507.09839 · cs.LG · July 15, 2025

TL;DR

This paper introduces a new prompt optimization framework for black-box LLMs that leverages positive reinforcement, feedback diversification, and prompt migration techniques to improve effectiveness, efficiency, and adaptability across models.

Contribution

The paper proposes a novel Automatic Prompt Optimization (APO) framework that incorporates positive reinforcement and feedback diversification, and it formalizes continual prompt optimization for model migration.

Findings

1. Outperforms existing prompt optimization methods in accuracy and convergence speed.

2. Effectively mitigates feedback noise through diversification techniques.

3. Enables efficient prompt migration across different LLM versions.

Abstract

An increasing number of NLP applications interact with large language models (LLMs) through black-box APIs, making prompt engineering critical for controlling model outputs. Recent Automatic Prompt Optimization (APO) methods iteratively refine prompts using model-generated feedback (textual gradients), but they focus primarily on error correction and neglect valuable insights from correct predictions. This limits both their effectiveness and efficiency. In this paper, we propose a novel APO framework centered on enhancing the feedback mechanism. We reinterpret the textual gradient as a form of negative reinforcement and introduce the complementary positive reinforcement to explicitly preserve beneficial prompt components identified through successful predictions. To mitigate the noise inherent in LLM-generated feedback, we introduce a technique called feedback diversification, which…
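The pairing of negative and positive reinforcement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`split_by_outcome`, `optimization_step`), the instruction wording, and the `predict` and `llm` callables are all hypothetical stand-ins for a black-box model API. Feedback diversification is left out because the abstract excerpt above does not describe it.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]            # (input text, expected output)
Predict = Callable[[str, str], str]  # hypothetical: (prompt, input text) -> model output
LLM = Callable[[str], str]           # hypothetical: instruction -> free-text reply


def split_by_outcome(
    prompt: str, batch: List[Example], predict: Predict
) -> Tuple[List[Example], List[Example]]:
    """Partition a batch into examples the prompt handles correctly and incorrectly."""
    correct: List[Example] = []
    incorrect: List[Example] = []
    for text, expected in batch:
        bucket = correct if predict(prompt, text) == expected else incorrect
        bucket.append((text, expected))
    return correct, incorrect


def format_examples(examples: List[Example]) -> str:
    """Render examples as plain-text lines for inclusion in an instruction."""
    return "\n".join(f"- input: {t!r} | expected: {e!r}" for t, e in examples)


def optimization_step(
    prompt: str, batch: List[Example], predict: Predict, llm: LLM
) -> str:
    """One assumed refinement step: fix what fails, keep what works."""
    correct, incorrect = split_by_outcome(prompt, batch, predict)
    if not incorrect:
        return prompt  # no errors in this batch, so nothing to correct

    # Negative reinforcement: feedback derived from failed predictions.
    negative = llm(
        "This prompt failed on the examples below. Explain what to change.\n"
        f"Prompt: {prompt}\n{format_examples(incorrect)}"
    )

    # Positive reinforcement: feedback derived from successful predictions.
    positive = ""
    if correct:
        positive = llm(
            "This prompt succeeded on the examples below. "
            "Name the parts of the prompt that should be preserved.\n"
            f"Prompt: {prompt}\n{format_examples(correct)}"
        )

    # Rewrite the prompt using both signals.
    return llm(
        "Rewrite the prompt. Apply the changes and keep the preserved parts.\n"
        f"Prompt: {prompt}\nChange: {negative}\nPreserve: {positive}"
    )
```

In this sketch the positive feedback is passed to the rewrite request alongside the corrective feedback, so the rewriting model is told what to keep as well as what to change.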

Taxonomy

Topics: Scheduling and Optimization Algorithms