TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation

He Zhu; Zhiwen Ruan; Junyou Su; Xingwei He; Yun Chen; Wenjia Zhang; Guanhua Chen

arXiv:2505.18557 · cs.CL · June 3, 2025

Open Access

TL;DR

TAG-INSTRUCT is a structured method for increasing the complexity of instruction data for large language models. It compresses each instruction into a compact tag space and expands the tags under RL guidance, which gives better control and stability than prompt-based rewriting of raw text.

Contribution

The paper proposes a tag-based framework for instruction complexity augmentation that outperforms previous prompt-based methods and improves controllability.

Findings

01. Outperforms existing instruction complexity augmentation methods

02. Operates in a compact tag space for better control

03. Provides stable and controllable instruction synthesis

Abstract

High-quality instruction data is crucial for developing large language models (LLMs), yet existing approaches struggle to effectively control instruction complexity. We present TAG-INSTRUCT, a novel framework that enhances instruction complexity through structured semantic compression and controlled difficulty augmentation. Unlike previous prompt-based methods operating on raw text, TAG-INSTRUCT compresses instructions into a compact tag space and systematically enhances complexity through RL-guided tag expansion. Through extensive experiments, we show that TAG-INSTRUCT outperforms existing instruction complexity augmentation approaches. Our analysis reveals that operating in tag space provides superior controllability and stability across different instruction synthesis frameworks.
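
The compress-then-expand idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the tag vocabulary, the candidate tags, the reward function, and every function name below are hypothetical, and a greedy one-step choice stands in for the RL-guided expansion, whose details the abstract does not give.

```python
from typing import Callable, List

# Hypothetical tag vocabulary; the paper's actual tag space is not described here.
TAG_KEYWORDS = {
    "sort": "task:sorting",
    "list": "data:list",
    "python": "language:python",
}

# Hypothetical tags that could be added to make an instruction harder.
CANDIDATE_TAGS = [
    "constraint:time-complexity",
    "constraint:no-builtins",
    "format:docstring",
]


def compress(instruction: str) -> List[str]:
    """Map raw instruction text to a compact list of tags (toy keyword match)."""
    text = instruction.lower()
    return [tag for keyword, tag in TAG_KEYWORDS.items() if keyword in text]


def expand(
    tags: List[str],
    candidates: List[str],
    reward: Callable[[List[str]], float],
) -> List[str]:
    """Add the one unused candidate tag with the highest reward.

    A greedy stand-in for an RL policy, used only for illustration.
    """
    unused = [c for c in candidates if c not in tags]
    if not unused:
        return list(tags)
    best = max(unused, key=lambda c: reward(tags + [c]))
    return tags + [best]


def toy_reward(tags: List[str]) -> float:
    """Assumed reward: more tags score higher, constraint tags get a bonus."""
    return len(tags) + 0.5 * sum(t.startswith("constraint:") for t in tags)


def reconstruct(instruction: str, tags: List[str]) -> str:
    """Render tags back into text; a real system would likely prompt an LLM."""
    return f"{instruction} [required tags: {', '.join(tags)}]"


instruction = "Write a Python function to sort a list."
tags = compress(instruction)
expanded = expand(tags, CANDIDATE_TAGS, toy_reward)
print(reconstruct(instruction, expanded))
```

Because the expansion step edits a short tag list rather than free text, each added unit of difficulty is explicit and easy to inspect, which is the kind of controllability the abstract attributes to working in tag space.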

Taxonomy

Topics: Subtitles and Audiovisual Media · Advanced Computing and Algorithms · Digital Accessibility for Disabilities