Prompt-R1: Collaborative Automatic Prompting Framework via End-to-end Reinforcement Learning

Wenjin Liu; Haoran Luo; Xueyuan Lin; Haoming Liu; Tiesunlong Shen; Jiapu Wang; Rui Mao; Erik Cambria

arXiv:2511.01016·cs.CL·April 17, 2026

TL;DR

Prompt-R1 is an end-to-end reinforcement learning framework enabling small LLMs to collaboratively generate prompts for large LLMs, improving problem-solving performance through multi-turn interactions.

Contribution

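The paper introduces a collaborative prompting framework, trained end to end with reinforcement learning, in which a small LLM writes the prompts for a large LLM so that the large model can be used effectively without the user having to craft prompts.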

Findings

01 Prompt-R1 outperforms baseline models on multiple datasets.

02 A dual-constrained reward optimizes for correctness, generation quality, and reasoning accuracy.

03 The framework is plug-and-play and supports both inference and training with various large-scale LLMs.

Abstract

Recently, advanced large language models (LLMs) have emerged at an increasingly rapid pace. However, when faced with complex problems, most users are often unable to provide accurate and effective prompts to interact with LLMs, thus limiting the performance of LLMs. To address this challenge, we propose Prompt-R1, an end-to-end reinforcement learning framework that uses a small-scale LLM to collaborate with large-scale LLMs, replacing user interaction to solve problems better. This collaboration is cast as a multi-turn prompt interaction, where the small-scale LLM thinks and generates prompts, and the large-scale LLM performs complex reasoning. A dual-constrained reward is designed to optimize for correctness, generation quality, and reasoning accuracy. Prompt-R1 provides a plug-and-play framework that supports both inference and training with various large-scale LLMs. Experiments on…
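The multi-turn interaction described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the `Episode` container, the `FINAL:` stop marker, the `max_turns` budget, and the two toy model functions are hypothetical names introduced only to show the control flow, in which a small model writes prompts, a large model answers them, and the small model decides when to stop. The reinforcement learning update and the dual-constrained reward are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical aliases: each model is treated as a plain text-to-text function.
SmallLLM = Callable[[str], str]  # returns either a new prompt or a final answer
LargeLLM = Callable[[str], str]  # returns a reply to a prompt

FINAL_TAG = "FINAL:"  # assumed marker the small model emits to end the episode


@dataclass
class Episode:
    question: str
    turns: List[Tuple[str, str]] = field(default_factory=list)  # (prompt, reply)
    answer: str = ""


def render_context(ep: Episode) -> str:
    """Flatten the question and all previous turns into one context string."""
    lines = [f"Question: {ep.question}"]
    for i, (prompt, reply) in enumerate(ep.turns, start=1):
        lines.append(f"Prompt {i}: {prompt}")
        lines.append(f"Reply {i}: {reply}")
    return "\n".join(lines)


def run_episode(question: str, small: SmallLLM, large: LargeLLM,
                max_turns: int = 3) -> Episode:
    """Alternate small-model prompting and large-model answering."""
    ep = Episode(question)
    for _ in range(max_turns):
        out = small(render_context(ep))
        if out.startswith(FINAL_TAG):
            # The small model has decided it can answer.
            ep.answer = out[len(FINAL_TAG):].strip()
            return ep
        # Otherwise treat the output as a prompt for the large model.
        ep.turns.append((out, large(out)))
    # Turn budget exhausted: fall back to the last large-model reply, if any.
    ep.answer = ep.turns[-1][1] if ep.turns else ""
    return ep


# Toy stand-ins for the two models, used only to exercise the loop.
def toy_small(context: str) -> str:
    if "Reply 1: " in context:
        reply = context.rsplit("Reply 1: ", 1)[1]
        return f"{FINAL_TAG} {reply}"
    return "Compute 17 * 3 step by step and give only the number."


def toy_large(prompt: str) -> str:
    return "51"


episode = run_episode("What is 17 * 3?", toy_small, toy_large)
print(episode.answer, len(episode.turns))
```

In a real setup the two toy functions would be replaced by calls to actual models, and the finished episode would be scored by a reward before a policy update of the small model; both of those steps are deliberately left out of this sketch.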

Code & Models

Repositories

QwenQKing/Prompt-R1
github

Models

🤗 QwenQKing/Prompt-R1
model · ♡ 1

Datasets

QwenQKing/Prompt-R1
dataset · 542 downloads
