Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation
Senyu Li, Zipeng Sun, Jiayi Wang, Xue Liu, Pontus Stenetorp, Siva Reddy, David Ifeoluwa Adelani

TL;DR
This paper proposes a task-agnostic method that generates warmup sequences to improve sequence-to-sequence learning, enhancing performance without relying on external supervision or predefined intermediate formats.
Contribution
Introduces a novel, scalable framework for unsupervised warmup sequence generation that guides sequence-to-sequence models across diverse tasks.
Findings
Outperforms traditional supervised fine-tuning methods.
Effective across translation, summarization, and reasoning tasks.
Enhances model performance without external annotations.
Abstract
Traditional supervised fine-tuning (SFT) strategies for sequence-to-sequence tasks often train models to directly generate the target output. Recent work has shown that guiding models with intermediate steps, such as keywords, outlines, or reasoning chains, can significantly improve performance, coherence, and interpretability. However, these methods often depend on predefined intermediate formats and annotated data, limiting their scalability and generalizability. In this work, we introduce a task-agnostic framework that enables models to generate intermediate "warmup" sequences. These warmup sequences, serving as an initial state for subsequent generation, are optimized to enhance the probability of generating the target sequence without relying on external supervision or human-designed structures. Drawing inspiration from reinforcement learning principles, our method iteratively…
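A minimal sketch of the core idea, not the authors' algorithm (the abstract is cut off before the method is described): treat the warmup sequence as a latent choice and adjust the distribution over warmups so that the expected log-probability of the target goes up. The example below assumes a toy setting with three candidate warmups and a fixed, made-up table of target log-probabilities standing in for log p(target | input, warmup). It uses the exact gradient of the expected reward over this tiny set, where a real system would have to sample warmups from the model. All names (`optimize_warmup_policy`, `TOY_TARGET_LOGPROBS`) and all numbers are hypothetical.

```python
import math


def softmax(logits):
    """Convert unnormalized scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(probs, rewards):
    """Expected reward under the warmup distribution."""
    return sum(p * r for p, r in zip(probs, rewards))


def optimize_warmup_policy(target_logprobs, steps=500, lr=0.5):
    """Gradient ascent on the expected target log-probability.

    target_logprobs[i] is an assumed stand-in for
    log p(target | input, warmup_i); it is held fixed here.
    For a softmax policy, d E[r] / d logit_i = p_i * (r_i - E[r]).
    """
    logits = [0.0] * len(target_logprobs)
    for _ in range(steps):
        probs = softmax(logits)
        baseline = expected_reward(probs, target_logprobs)
        logits = [
            logit + lr * p * (r - baseline)
            for logit, p, r in zip(logits, probs, target_logprobs)
        ]
    return softmax(logits)


# Hypothetical values: the second warmup makes the target most likely.
TOY_TARGET_LOGPROBS = [math.log(0.1), math.log(0.6), math.log(0.3)]

policy = optimize_warmup_policy(TOY_TARGET_LOGPROBS)
print(policy)
```

In this toy setting the distribution concentrates on the warmup that makes the target most probable, with no label saying which warmup is "correct"; the only training signal is the target likelihood itself, which matches the abstract's claim of needing no external supervision.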
Taxonomy
Topics: Machine Learning and Data Classification · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
