Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs

Aobo Kong; Shiwan Zhao; Hao Chen; Qicheng Li; Yong Qin; Ruiqi Sun; Xin Zhou; Jiaming Zhou; Haoqin Sun

arXiv:2407.08995 · cs.CL · July 15, 2024 · 3 cites

TL;DR

This paper introduces self-prompt tuning, which fine-tunes LLMs to generate their own role-play prompts, improving performance on NLP benchmarks without manual prompt design.

Contribution

The authors propose a novel self-prompt tuning approach that enables LLMs to autonomously generate role prompts via fine-tuning, reducing manual effort and enhancing performance.

Findings

01. Self-prompt tuned LLMs outperform instruction-tuned baselines on multiple NLP benchmarks.

02. The method automates complex prompt design, making LLMs more autonomous.

03. The approach is validated on Llama-2-7B and Mistral-7B models.

Abstract

Recent advancements in LLMs have showcased their remarkable role-playing capabilities: they can accurately simulate the dialogue styles and cognitive processes of various roles based on different instructions and contexts. Studies indicate that assigning LLMs the roles of experts, a strategy known as role-play prompting, can enhance their performance in the corresponding domains. However, the prompt needs to be manually designed for the given problem, requiring certain expertise and iterative modifications. To this end, we propose self-prompt tuning, making LLMs themselves generate role-play prompts through fine-tuning. Leveraging the LIMA dataset as our foundational corpus, we employ GPT-4 to annotate role-play prompts for each data point, resulting in the creation of the LIMA-Role dataset. We then fine-tune LLMs like Llama-2-7B and Mistral-7B on LIMA-Role. Consequently, the…
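
The data-construction step described in the abstract (annotate each LIMA example with a role-play prompt, then fine-tune on the result) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the class names, the `annotate` callable, the `toy_annotator` stand-in for GPT-4, and the assumption that the training target is the role-play prompt followed by the original response are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Example:
    """One instruction/response pair, as in an instruction-tuning corpus."""
    instruction: str
    response: str


@dataclass
class RoleExample:
    """The same pair plus an annotated role-play prompt."""
    instruction: str
    role_prompt: str
    response: str

    def to_training_pair(self) -> Dict[str, str]:
        # Assumed layout (not confirmed by the source): the model sees only
        # the instruction and learns to emit the role-play prompt first,
        # followed by the answer.
        return {
            "prompt": self.instruction,
            "completion": f"{self.role_prompt}\n\n{self.response}",
        }


def build_role_dataset(
    examples: Iterable[Example],
    annotate: Callable[[str], str],
) -> List[RoleExample]:
    """Attach a role-play prompt to every example using `annotate`.

    `annotate` is a hypothetical hook; in the paper this role is played
    by GPT-4, which is not called here.
    """
    return [
        RoleExample(e.instruction, annotate(e.instruction).strip(), e.response)
        for e in examples
    ]


def toy_annotator(instruction: str) -> str:
    # Placeholder annotator so the sketch runs offline.
    return "You are an expert on the topic of this question."


if __name__ == "__main__":
    dataset = build_role_dataset([Example("What is 2+2?", "4")], toy_annotator)
    print(dataset[0].to_training_pair())
```

Passing the annotator in as a callable keeps the sketch independent of any particular model API; the resulting prompt/completion pairs could then be fed to whatever supervised fine-tuning setup is in use.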

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Business Process Modeling and Analysis · Model-Driven Software Engineering Techniques

Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Adam · Dropout · Multi-Head Attention · Dense Connections