PACE: Improving Prompt with Actor-Critic Editing for Large Language Model

Yihong Dong, Kangcheng Luo, Xue Jiang, Zhi Jin, and Ge Li

arXiv:2308.10088 · cs.CL · May 17, 2024 · 2 cites

TL;DR

PACE introduces an automatic prompt editing method for large language models using an actor-critic framework, significantly improving prompt quality and model performance across diverse tasks with minimal human effort.

Contribution

This paper presents PACE, a novel reinforcement learning-inspired approach that automatically refines prompts by leveraging LLMs as actor and critic, reducing reliance on human expertise.

Findings

01 Improves the relative performance of low-quality human-written prompts by up to 98%.

02 Achieves performance comparable to high-quality human-written prompts.

03 Is also effective for prompt generation across multiple tasks.

Abstract

Large language models (LLMs) have showcased remarkable potential across various tasks by conditioning on prompts. However, the varying quality of human-written prompts leads to substantial discrepancies in LLMs' performance, and improving prompts usually necessitates considerable human effort and expertise. To this end, this paper proposes Prompt with Actor-Critic Editing (PACE) for LLMs to enable automatic prompt editing. Drawing inspiration from the actor-critic algorithm in reinforcement learning, PACE uses LLMs in the dual roles of actor and critic, conceptualizing the prompt as a type of policy. PACE refines the prompt by taking into account feedback from both the actors, which execute the prompt, and the critics, which critique the resulting responses. This process helps LLMs better align the prompt to a specific task, thanks to real responses and thinking from LLMs. We conduct extensive experiments on 24 instruction…
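The actor-critic editing loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable interface, the function names (`actor`, `critic`, `update`, `pace_edit`), the prompt templates with their `ACT` / `CRITIQUE` / `REWRITE` tags, and the fixed iteration count are all hypothetical choices made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function mapping a text prompt to a completion.
LLM = Callable[[str], str]


def actor(llm: LLM, prompt: str, x: str) -> str:
    """Actor role: execute the current prompt on one task input."""
    return llm(f"ACT\nInstruction: {prompt}\nInput: {x}\nOutput:")


def critic(llm: LLM, prompt: str, x: str, y: str, response: str) -> str:
    """Critic role: critique the actor's response against the expected output."""
    return llm(
        "CRITIQUE\n"
        f"Instruction: {prompt}\nInput: {x}\n"
        f"Response: {response}\nExpected: {y}\n"
        "How should the instruction change?"
    )


def update(llm: LLM, prompt: str, critiques: List[str]) -> str:
    """Edit step: rewrite the prompt in light of the collected critiques."""
    joined = "\n".join(f"- {c}" for c in critiques)
    return llm(
        f"REWRITE\nInstruction: {prompt}\nCritiques:\n{joined}\nNew instruction:"
    ).strip()


def pace_edit(
    llm: LLM,
    prompt: str,
    examples: List[Tuple[str, str]],
    n_iters: int = 2,  # assumed stopping rule: a fixed number of rounds
) -> str:
    """Run n_iters rounds of act -> critique -> rewrite and return the final prompt."""
    for _ in range(n_iters):
        critiques = []
        for x, y in examples:
            response = actor(llm, prompt, x)
            critiques.append(critic(llm, prompt, x, y, response))
        prompt = update(llm, prompt, critiques)
    return prompt
```

To try the sketch, pass any function that sends a string to a model and returns its reply as `llm`, together with a starting prompt and a few (input, expected output) pairs. Treating the prompt as the "policy" here simply means the prompt is the only thing updated between rounds; the model itself is left unchanged.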

Taxonomy

Topics: Reinforcement Learning in Robotics · Topic Modeling

Methods: ALIGN