RuleR: Improving LLM Controllability by Rule-based Data Recycling

Ming Li; Han Chen; Chenguang Wang; Dang Nguyen; Dianqi Li; Tianyi Zhou

arXiv:2406.15938 · cs.CL · February 18, 2025

TL;DR

RuleR is a data augmentation technique that enhances large language model controllability by applying rule-based modifications to existing data, avoiding costly human or proprietary data curation.

Contribution

The paper introduces RuleR, a novel method for improving LLM controllability through rule-based data recycling that requires no new data collection.

Findings

01. Improves LLM controllability effectively

02. Maintains general instruction-following capabilities

03. Reduces reliance on human or proprietary data

Abstract

Large language models (LLMs) still lack delicate controllability over their responses, which is critical to enhancing their performance and the user experience. However, curating supervised fine-tuning (SFT) datasets to improve LLM controllability usually relies on human experts or proprietary LLMs, which incurs additional cost. To bridge this gap, we propose Rule-based Data Recycling (RuleR), a data augmentation method that incorporates multiple constraints into the original data samples according to predefined rules, creating new training tasks that consolidate the controllability of LLMs. Instead of creating new data from scratch, RuleR "recycles" existing data by simply applying rule-based edits to their responses and appending the corresponding rule-instructions to their original instructions. Experimental results demonstrate RuleR's effectiveness in improving LLM controllability while…
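
The recycling step described in the abstract (edit the response by rule, then append the matching rule-instruction to the original instruction) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Rule` class, the `recycle` function, the two example rules, and the `instruction`/`response` field names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: an instruction suffix paired with a deterministic response edit."""
    instruction: str
    edit: Callable[[str], str]


# Illustrative rules only; the paper's actual rule set is not reproduced here.
UPPERCASE = Rule("Respond in all uppercase letters.", str.upper)
END_MARKER = Rule(
    "End your response with the line 'END'.",
    lambda text: text.rstrip() + "\nEND",
)


def recycle(sample: Dict[str, str], rules: List[Rule]) -> Dict[str, str]:
    """Return a new sample with each rule applied; the input sample is not modified."""
    instruction = sample["instruction"]
    response = sample["response"]
    for rule in rules:
        # Append the rule-instruction to the instruction text...
        instruction = f"{instruction} {rule.instruction}"
        # ...and apply the matching edit so the response satisfies it.
        response = rule.edit(response)
    return {"instruction": instruction, "response": response}


original = {
    "instruction": "Name the capital of France.",
    "response": "Paris is the capital.",
}
augmented = recycle(original, [UPPERCASE, END_MARKER])
print(augmented["instruction"])
print(augmented["response"])
```

Because each edit is a plain function over the existing response, the augmented pair is consistent by construction and needs no human annotator or proprietary model, which is the cost argument the abstract makes.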

Taxonomy

Topics: Digital Rights Management and Security

Methods: Shrink and Fine-Tune