TL;DR
SPRIG is an edit-based genetic algorithm that optimizes system prompts, improving large language model performance across diverse tasks, models, and languages.
Contribution
The paper presents a method for optimizing general system prompts. A single optimized system prompt performs on par with task-specific prompts and generalizes across models and languages.
Findings
A single optimized system prompt matches task-specific prompt performance.
Combining system and task prompts yields further improvements.
Optimized prompts generalize across model types, sizes, and languages.
Abstract
Large Language Models (LLMs) have shown impressive capabilities in many scenarios, but their performance depends, in part, on the choice of prompt. Past research has focused on optimizing prompts specific to a task. However, much less attention has been given to optimizing the general instructions included in a prompt, known as a system prompt. To address this gap, we propose SPRIG, an edit-based genetic algorithm that iteratively constructs prompts from prespecified components to maximize the model's performance in general scenarios. We evaluate the performance of system prompts on a collection of 47 different types of tasks to ensure generalizability. Our study finds that a single optimized system prompt performs on par with task prompts optimized for each individual task. Moreover, combining system and task-level optimizations leads to further improvement, which showcases their…
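The abstract describes SPRIG only at a high level, so the following is a minimal sketch of what an edit-based genetic search over prespecified prompt components could look like, not the authors' implementation. The component pool, the edit operations (add, delete, swap), the hyperparameters, and `toy_score` are all assumptions made for illustration; in a real setting the score would come from evaluating an LLM on a benchmark suite.

```python
import random

# Hypothetical component pool; the paper's actual components are not given here.
COMPONENTS = [
    "You are a helpful assistant.",
    "Think step by step.",
    "Answer concisely.",
    "Double-check your answer before responding.",
]


def mutate(prompt, rng):
    """Apply one random edit (add, delete, or swap) to a tuple of components."""
    parts = list(prompt)
    ops = ["add"]
    if parts:
        ops.append("delete")
    if len(parts) >= 2:
        ops.append("swap")
    op = rng.choice(ops)
    if op == "add":
        # Insert a random component at a random position (end included).
        parts.insert(rng.randint(0, len(parts)), rng.choice(COMPONENTS))
    elif op == "delete":
        parts.pop(rng.randrange(len(parts)))
    else:
        i, j = rng.sample(range(len(parts)), 2)
        parts[i], parts[j] = parts[j], parts[i]
    return tuple(parts)


def toy_score(prompt):
    """Stand-in fitness (assumption): reward distinct components, penalize length."""
    return len(set(prompt)) - 0.1 * len(prompt)


def optimize(score, generations=10, population_size=4, children_per_parent=3, seed=0):
    """Iteratively edit prompts and keep the highest-scoring candidates."""
    rng = random.Random(seed)
    population = [()]  # start from the empty system prompt
    for _ in range(generations):
        candidates = set(population)  # parents survive, so the best score never drops
        for parent in population:
            for _ in range(children_per_parent):
                candidates.add(mutate(parent, rng))
        population = sorted(candidates, key=score, reverse=True)[:population_size]
    return population[0]


best = optimize(toy_score)
print(" ".join(best))
```

Keeping parents in the candidate set is a simple form of elitism: the best score found so far cannot get worse between generations, which matters when each evaluation is expensive.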
