A Lightweight Multi Aspect Controlled Text Generation Solution For Large Language Models

Chenyang Zhang; Jiayi Lin; Haibo Tong; Bingxuan Hou; Dongyu Zhang; Jialin Li; Junli Wang

arXiv:2410.14144·cs.CL·October 21, 2024

Open Access

TL;DR

This paper introduces a lightweight data augmentation approach to enhance large language models' ability to perform multi-aspect controllable text generation, addressing dataset biases and correlations for improved task performance.

Contribution

The paper proposes a novel data augmentation pipeline that improves LLMs' multi-aspect controllable text generation (MCTG) capabilities without complex model modifications, focusing on reducing dataset bias and aspect correlations.

Findings

01. 20% increase in accuracy for MCTG tasks

02. Reduced aspect correlations in augmented datasets

03. Enhanced adaptability of LLMs to controllable generation

Abstract

Large language models (LLMs) show remarkable abilities with instruction tuning. However, they fall short on target tasks when high-quality instruction-tuning data for those tasks is lacking. Multi-Aspect Controllable Text Generation (MCTG) is a representative task for this dilemma, where aspect datasets are usually biased and correlated. Existing work relies on additional model structures and strategies, which limits adaptability to LLMs. To activate the MCTG ability of LLMs, we propose a lightweight MCTG pipeline based on data augmentation. We analyze bias and correlations in traditional datasets, and address these concerns with augmented control attributes and sentences. The augmented datasets are suitable for instruction tuning. In our experiments, LLMs perform better in MCTG after data augmentation, with a 20% accuracy rise and weaker aspect correlations.
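The abstract mentions two ideas that a small example may make more concrete: measuring how strongly two aspects are correlated in a dataset, and turning attribute combinations plus sentences into instruction-tuning records. The sketch below is illustrative only and is not taken from the paper. It uses mutual information as one possible correlation measure, and the function names `mutual_information` and `build_instruction`, the prompt template, and the toy labels are all assumptions.

```python
from collections import Counter
from math import log


def mutual_information(pairs):
    """Mutual information (in nats) between two aspect labels.

    `pairs` is a list of (aspect_a_label, aspect_b_label) tuples, one per
    sample. A value of 0 means the aspects are independent in the data;
    larger values mean stronger correlation. This metric is an assumption
    chosen for illustration, not necessarily the paper's measure.
    """
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b]).
        mi += p_ab * log(count * n / (left[a] * right[b]))
    return mi


def build_instruction(attributes, sentence):
    """Format one instruction-tuning record (hypothetical template)."""
    constraints = ", ".join(
        f"{aspect}: {value}" for aspect, value in sorted(attributes.items())
    )
    return {
        "instruction": f"Write a sentence with these attributes ({constraints}).",
        "output": sentence,
    }


# Toy data: sentiment is fully tied to topic in `skewed`, independent in `balanced`.
skewed = [("positive", "sports")] * 2 + [("negative", "politics")] * 2
balanced = [
    ("positive", "sports"),
    ("positive", "politics"),
    ("negative", "sports"),
    ("negative", "politics"),
]

print(mutual_information(skewed))
print(mutual_information(balanced))
print(build_instruction({"topic": "sports", "sentiment": "positive"},
                        "What a great match!"))
```

Under this reading, an augmentation step would aim to move a dataset from the `skewed` case toward the `balanced` case by adding samples for under-represented attribute combinations.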

Videos

A Lightweight Multi Aspect Controlled Text Generation Solution For Large Language Models · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques