Non-instructional Fine-tuning: Enabling Instruction-Following Capabilities in Pre-trained Language Models without Instruction-Following Data

Juncheng Xie; Shensian Syu; Hung-yi Lee

arXiv:2409.00096·cs.CL·September 4, 2024

TL;DR

This paper demonstrates that pre-trained language models can acquire instruction-following abilities through fine-tuning on non-instructional data (OpenWebText passages whose second halves are completed by GPT models), challenging the assumption that explicit instruction data is necessary.

Contribution

It introduces a novel method of using non-instructional text as training data to enable instruction-following in LLMs, bypassing the need for instruction-specific datasets.

Findings

1. Models fine-tuned on non-instructional data gain instruction-following capabilities.

2. Non-instructional data also improves models that have already been instruction-tuned or aligned.

3. LLaMA-3-70B-Instruct fine-tuned on non-instructional data is comparable to LLaMA-3.1-70B-Instruct on the Arena Hard benchmark.

Abstract

Instruction fine-tuning is crucial for today's large language models (LLMs) to learn to follow instructions and align with human preferences. Conventionally, supervised data, including the instruction and the correct response, is required for instruction fine-tuning. To obtain such data, some researchers prompted well-trained models like GPT-4 to generate instructions and correct responses. In this paper, we propose a novel approach that uses the first half of a random text from OpenWebText as the instruction and GPT-3.5-turbo or GPT-4-turbo to complete the text as the response. Despite the data being "non-instructional", we found that pre-trained LLMs fine-tuned on this data can gain instruction-following capabilities. This observation is verified by fine-tuning several well-known pre-trained LLMs (e.g., LLaMA-2-7B, LLaMA-3-8B, LLaMA-3-70B, Mistral-7B-v0.1). The "non-instructional…
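The data construction described in the abstract (first half of a text as the "instruction", a model-written continuation as the "response") can be sketched roughly as follows. This is a minimal illustration, not the authors' code: splitting on whitespace-separated words is an assumption (the abstract does not say how the midpoint is chosen), and `split_first_half`, `build_pair`, and `placeholder_complete` are hypothetical names. The `complete` argument stands in for a call to GPT-3.5-turbo or GPT-4-turbo, which is deliberately not shown here.

```python
from typing import Callable, Dict


def split_first_half(text: str) -> str:
    """Return the first half of `text`, measured in whitespace-separated words.

    Assumption: the split unit (words) is chosen for illustration only.
    """
    words = text.split()
    return " ".join(words[: len(words) // 2])


def build_pair(text: str, complete: Callable[[str], str]) -> Dict[str, str]:
    """Build one non-instructional training pair from a raw text.

    `complete` is a hypothetical stand-in for a model that continues the prefix.
    """
    prefix = split_first_half(text)
    return {"instruction": prefix, "response": complete(prefix)}


def placeholder_complete(prefix: str) -> str:
    """Dummy continuation so the sketch runs without any external service."""
    return "[continuation of: " + prefix + "]"


# Example usage with a made-up sentence in place of an OpenWebText sample.
sample = "The quick brown fox jumps over the lazy dog today"
pair = build_pair(sample, placeholder_complete)
```

Passing the completion function as an argument keeps the pairing logic independent of whichever model produces the continuation, so the same sketch applies to either GPT variant named in the abstract.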

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis