Guiding Large Language Models via Directional Stimulus Prompting
Zekun Li, Baolin Peng, Pengcheng He, Michel Galley, Jianfeng Gao, Xifeng Yan

TL;DR
This paper proposes Directional Stimulus Prompting, a new method to guide large language models using a small policy model to generate instance-specific prompts, improving performance on various tasks without direct model tuning.
Contribution
It introduces a novel framework that employs a tunable policy model to generate directional prompts, enhancing LLM outputs without modifying the models directly.
Findings
Performance improves on summarization, dialogue response generation, and reasoning tasks.
With only 80 dialogues from MultiWOZ, the method improves ChatGPT's performance by 41.4%.
Instance-specific prompts generated by the policy model yield higher reasoning accuracy than human-crafted and automatically generated prompts.
Abstract
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs. Instead of directly adjusting LLMs, our method employs a small tunable policy model (e.g., T5) to generate an auxiliary directional stimulus prompt for each input instance. These directional stimulus prompts act as nuanced, instance-specific hints and clues to guide LLMs in generating desired outcomes, such as including specific keywords in the generated summary. Our approach sidesteps the challenges of direct LLM tuning by optimizing the policy model to explore directional stimulus prompts that align LLMs with desired behaviors. The policy model can be optimized through 1) supervised fine-tuning using labeled data and 2) reinforcement learning from offline or online rewards based on the LLM's output. We assess our method across…
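The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_policy` is a hypothetical stand-in for the tunable policy model (the paper uses a model such as T5), `llm` is any callable that wraps a black-box LLM, the prompt wording is invented for the example, and `keyword_reward` is a toy hit-rate score used here as a placeholder for whatever reward is computed from the LLM's output.

```python
from typing import Callable, List, Tuple


def toy_policy(article: str, k: int = 3) -> List[str]:
    """Hypothetical stand-in for the policy model: pick the k longest
    distinct words of the input as keyword hints."""
    words = {w.strip(".,;:!?").lower() for w in article.split()}
    words.discard("")
    # Sort by length (descending), then alphabetically for determinism.
    return sorted(words, key=lambda w: (-len(w), w))[:k]


def build_prompt(article: str, hints: List[str]) -> str:
    """Combine the original input with the directional stimulus.
    The exact wording is an assumption made for this sketch."""
    return (
        f"Article: {article}\n"
        f"Keywords: {'; '.join(hints)}\n"
        "Write a short summary of the article that includes the keywords."
    )


def keyword_reward(output: str, hints: List[str]) -> float:
    """Toy reward: fraction of hints that appear in the LLM output."""
    if not hints:
        return 0.0
    text = output.lower()
    return sum(h in text for h in hints) / len(hints)


def run_dsp(
    article: str,
    policy: Callable[[str], List[str]],
    llm: Callable[[str], str],
) -> Tuple[str, float]:
    """One forward pass: the policy proposes hints, the frozen LLM answers,
    and a reward is computed that could later drive policy updates."""
    hints = policy(article)
    output = llm(build_prompt(article, hints))
    return output, keyword_reward(output, hints)
```

Only the policy would be trained in such a setup; the LLM is called as a fixed function, which is what lets the approach work with models that expose nothing but a text interface.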
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
