Are Large Language Models Good Prompt Optimizers?

Ruotian Ma; Xiaolei Wang; Xin Zhou; Jian Li; Nan Du; Tao Gui; Qi Zhang; Xuanjing Huang

arXiv:2402.02101 · cs.CL · February 6, 2024 · 1 citation

TL;DR

This paper critically examines how effective LLMs are as prompt optimizers, shows that they are limited in both error reflection and prompt generation, and proposes a new paradigm, Automatic Behavior Optimization, that optimizes the target model's behavior directly and in a more controllable way.

Contribution

The study uncovers the mechanisms and limitations of LLM-based prompt optimization and introduces the novel Automatic Behavior Optimization paradigm for improved prompt refinement.

Findings

01. LLMs struggle to identify true errors due to bias from prior knowledge.

02. Even valid reflections often fail to produce effective prompts for target models.

03. Target models' unpredictable behaviors hinder single-step prompt refinement.

Abstract

LLM-based Automatic Prompt Optimization, which typically utilizes LLMs as Prompt Optimizers to self-reflect and refine prompts, has shown promising performance in recent studies. Despite the success, the underlying mechanism of this approach remains unexplored, and the true effectiveness of LLMs as Prompt Optimizers requires further validation. In this work, we conducted a comprehensive study to uncover the actual mechanism of LLM-based Prompt Optimization. Our findings reveal that the LLM optimizers struggle to identify the true causes of errors during reflection, tending to be biased by their own prior knowledge rather than genuinely reflecting on the errors. Furthermore, even when the reflection is semantically valid, the LLM optimizers often fail to generate appropriate prompts for the target models with a single prompt refinement step, partly due to the unpredictable behaviors of…
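
The reflect-then-refine loop that the abstract describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not the Automatic Behavior Optimization method: `optimize_prompt`, `collect_errors`, and the three `toy_*` functions are hypothetical names, and the toy functions are deterministic stand-ins for real LLM calls so that the loop runs without any model.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]        # (input, expected output)
Failure = Tuple[str, str, str]   # (input, expected output, actual output)


def collect_errors(prompt: str, examples: List[Example],
                   target: Callable[[str, str], str]) -> List[Failure]:
    """Run the target model on every example and return the failed cases."""
    failures = []
    for question, expected in examples:
        predicted = target(prompt, question)
        if predicted != expected:
            failures.append((question, expected, predicted))
    return failures


def optimize_prompt(prompt: str, examples: List[Example],
                    target: Callable[[str, str], str],
                    reflect: Callable[[str, List[Failure]], str],
                    refine: Callable[[str, str], str],
                    max_steps: int = 3) -> Tuple[str, int]:
    """Hypothetical reflect-then-refine loop; returns (best prompt, error count)."""
    best_prompt = prompt
    best_errors = collect_errors(best_prompt, examples, target)
    for _ in range(max_steps):
        if not best_errors:
            break
        # Reflection: the optimizer explains why the target model failed.
        feedback = reflect(best_prompt, best_errors)
        # Refinement: the optimizer rewrites the prompt using that feedback.
        candidate = refine(best_prompt, feedback)
        candidate_errors = collect_errors(candidate, examples, target)
        # Keep the candidate only if it reduces errors on the target model.
        if len(candidate_errors) < len(best_errors):
            best_prompt, best_errors = candidate, candidate_errors
    return best_prompt, len(best_errors)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_target(prompt: str, question: str) -> str:
    """Echoes the input; upper-cases it only if the prompt asks for that."""
    return question.upper() if "uppercase" in prompt.lower() else question


def toy_reflect(prompt: str, failures: List[Failure]) -> str:
    """Pretends to diagnose the failures."""
    return "Expected outputs are uppercase but the model kept the input case."


def toy_refine(prompt: str, feedback: str) -> str:
    """Pretends to rewrite the prompt based on the feedback."""
    return prompt + " Answer in uppercase."


examples = [("a", "A"), ("b", "B")]
result = optimize_prompt("Echo the input.", examples,
                         toy_target, toy_reflect, toy_refine)
```

Re-scoring each candidate on the target model before accepting it is a design choice of this sketch. It reflects the abstract's point that a semantically valid reflection does not guarantee a prompt the target model actually follows.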

Code & Models

Repositories

rtmaww/LLM_AutoPromptStudy (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques