CYCLE-INSTRUCT: Fully Seed-Free Instruction Tuning via Dual Self-Training and Cycle Consistency

Zhanming Shen; Hao Chen; Yulei Tang; Shaolin Zhu; Wentao Ye; Xiaomeng Hu; Haobo Wang; Gang Chen; Junbo Zhao

arXiv:2508.16100·cs.CL·August 25, 2025


TL;DR

Cycle-Instruct is a fully seed-free instruction tuning framework for large language models. It uses dual self-training and cycle consistency to learn from raw text without human annotations or seed data.

Contribution

The paper presents a cycle-consistency-based method that bootstraps models from unlabeled data, eliminating the need for seed datasets in instruction tuning.

Findings

01 Outperforms seed-driven back-translation baselines

02 Achieves performance comparable to supervised methods

03 Effective across diverse data domains

Abstract

Instruction tuning is vital for aligning large language models (LLMs) with human intent, but current methods typically rely on costly human-annotated seed data or powerful external teacher models. While instruction back-translation techniques reduce this dependency, they remain fundamentally tethered to an initial seed set, which limits full automation, introduces biases, and can lead to inefficient use of unlabeled corpora. In this paper, we propose Cycle-Instruct, a novel framework that achieves fully seed-free instruction tuning. Inspired by cycle consistency, Cycle-Instruct employs a dual self-training loop where two models, an answer generator and a question generator, are bootstrapped solely from raw, unlabeled text. These models mutually supervise each other by reconstructing original text segments from their counterpart's generated pseudo-labels, effectively learning from the…
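The data flow of the dual loop described in the abstract can be sketched roughly as follows. This is a toy illustration only, not the authors' implementation: the names `build_cycle_pairs`, `toy_question_generator`, and `toy_answer_generator` are hypothetical, the generators are string stubs standing in for LLMs, and the split of raw text into answer-like and question-like segments is an assumption made for the example. The sketch shows one idea: each model's training pair uses the counterpart's pseudo-label as input and the original raw segment as the reconstruction target.

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch (not from the paper).
Generator = Callable[[str], str]  # maps input text to generated text
Pair = Tuple[str, str]            # (model input, reconstruction target)


def build_cycle_pairs(segments: List[str], counterpart: Generator) -> List[Pair]:
    """Pair each counterpart-generated pseudo-label with its raw source segment."""
    pairs = []
    for segment in segments:
        pseudo_label = counterpart(segment)     # counterpart produces the pseudo-label
        pairs.append((pseudo_label, segment))   # target is the original raw text
    return pairs


def toy_question_generator(answer_text: str) -> str:
    # Stub standing in for an LLM that writes a question for a given answer.
    return "What is stated by: " + answer_text


def toy_answer_generator(question_text: str) -> str:
    # Stub standing in for an LLM that writes an answer for a given question.
    return "An answer to: " + question_text


# Assumed split of raw text into answer-like and question-like segments.
answer_like = ["Water boils at 100 degrees Celsius at sea level."]
question_like = ["Why is the sky blue?"]

# The answer generator learns to reconstruct raw answers from pseudo-questions.
answer_model_data = build_cycle_pairs(answer_like, toy_question_generator)
# The question generator learns to reconstruct raw questions from pseudo-answers.
question_model_data = build_cycle_pairs(question_like, toy_answer_generator)
```

In a real setup the two pair lists would feed fine-tuning steps for the respective models, and the loop would repeat with the updated generators; those steps are omitted here because the abstract does not specify them.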
