RNR: Teaching Large Language Models to Follow Roles and Rules
Kuan Wang, Alexander Bukharin, Haoming Jiang, Qingyu Yin, Zhengyang Wang, Tuo Zhao, Jingbo Shang, Chao Zhang, Bing Yin, Xian Li, Jianshu Chen, Shiyang Li

TL;DR
This paper introduces RNR, a data generation pipeline that enhances large language models' ability to follow complex roles and rules, ensuring safer and more aligned interactions without sacrificing general instruction-following performance.
Contribution
We propose RNR, an automated method to generate diverse role and rule data for training LLMs, significantly improving their adherence to complex system prompts.
Findings
Over 25% increase in rule adherence pass-rate.
No regression on standard instruction-following benchmarks.
Improved performance on role and rule following benchmarks.
Abstract
Instruction fine-tuning (IFT) elicits instruction-following capabilities and steers the behavior of large language models (LLMs) via supervised learning. However, existing models trained on open-source IFT datasets can only follow instructions from users, and often fail to follow the complex roles and rules specified by developers (a.k.a. system prompts). The ability to follow these roles and rules is essential for deployment, as it ensures that the model interacts safely with users within developer-defined guidelines. To improve this role- and rule-following ability, we propose RNR, an automated data generation pipeline that generates diverse roles and rules from existing IFT instructions, along with corresponding responses. This data can then be used to train models that follow complex system prompts. The models are evaluated on our newly created benchmarks for role and…
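The data flow the abstract describes (existing IFT instruction in, system prompt plus matching response out) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the names `IFTExample`, `RoleRuleExample`, `build_system_prompt`, and `augment` are hypothetical, the system prompt layout is an assumption, and the two stub functions stand in for whatever model calls the real pipeline uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class IFTExample:
    """One existing instruction fine-tuning pair (hypothetical container)."""
    instruction: str
    response: str


@dataclass
class RoleRuleExample:
    """Augmented example: system prompt, user instruction, and response."""
    system_prompt: str
    instruction: str
    response: str


def build_system_prompt(role: str, rules: List[str]) -> str:
    # Assumed layout: a role line followed by numbered rules.
    lines = [f"Role: {role}", "Rules:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(rules, start=1)]
    return "\n".join(lines)


def augment(
    example: IFTExample,
    propose_role_and_rules: Callable[[str], Tuple[str, List[str]]],
    respond: Callable[[str, str], str],
) -> RoleRuleExample:
    # Step 1: derive a role and rules from the existing instruction.
    role, rules = propose_role_and_rules(example.instruction)
    system_prompt = build_system_prompt(role, rules)
    # Step 2: produce a response that is conditioned on the system prompt.
    response = respond(system_prompt, example.instruction)
    return RoleRuleExample(system_prompt, example.instruction, response)


# Stubs standing in for model calls (assumptions, for illustration only).
def stub_propose(instruction: str) -> Tuple[str, List[str]]:
    return "a concise cooking assistant", [
        "Answer in one sentence.",
        "Do not mention brand names.",
    ]


def stub_respond(system_prompt: str, instruction: str) -> str:
    return "Boil the pasta in salted water until al dente."


if __name__ == "__main__":
    seed = IFTExample("How do I cook pasta?", "Original response text.")
    print(augment(seed, stub_propose, stub_respond))
```

Passing the generator and responder in as callables keeps the sketch independent of any particular model API; swapping the stubs for real model calls would not change the surrounding structure.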
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
