Synthesizing Instruction-Tuning Datasets with Contrastive Decoding

Tatsuya Ichinose; Youmi Ma; Masanari Oi; Ryuto Koike; Naoaki Okazaki

arXiv:2604.13538 · cs.CL · April 16, 2026

TL;DR

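This paper introduces CoDIT, which applies contrastive decoding between a post-trained model and its pre-trained counterpart when generating responses for instruction tuning. The generated responses emphasize instruction-following behavior over pre-trained knowledge shared by the two models, and models trained on them perform better.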

Contribution

The paper proposes CoDIT, a novel contrastive decoding approach that isolates instruction-following behavior in generated responses, improving instruction-tuning datasets and model performance.

Findings

1. Models trained on CoDIT-generated datasets outperform those trained on directly generated responses.

2. Training on CoDIT datasets yields better performance than training on existing instruction-tuning datasets.

3. CoDIT can be interpreted as distilling instruction-tuning capabilities from parameter space to text space.

Abstract

Using responses generated by high-performing large language models (LLMs) for instruction tuning has become a widely adopted approach. However, the existing literature overlooks a property of LLM-generated responses: they conflate world knowledge acquired during pre-training with instruction-following capabilities acquired during post-training. We hypothesize that disentangling the instruction-following capabilities from pre-trained knowledge improves the effectiveness of instruction tuning. To this end, we propose CoDIT, a method that applies contrastive decoding between a post-trained model and its pre-trained counterpart during response generation. The method suppresses pre-trained knowledge shared between the two models while amplifying the instruction-following behavior acquired via post-training, resulting in responses that more purely reflect instruction-following capabilities.…
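The contrast described in the abstract can be illustrated for a single decoding step. The abstract does not give CoDIT's scoring rule, so this is only a minimal sketch under stated assumptions: it uses the commonly used contrastive decoding score (post-trained log-probability minus a weighted pre-trained log-probability) with a plausibility cutoff. The function names and the parameters `alpha` and `tau` are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_next_token(post_logits, pre_logits, alpha=1.0, tau=0.1):
    """Pick one token id by contrasting two models' next-token logits.

    post_logits: logits from the post-trained (instruction-tuned) model.
    pre_logits:  logits from its pre-trained counterpart, same vocabulary.
    alpha:       assumed weight on the pre-trained log-probability.
    tau:         assumed plausibility threshold; tokens whose post-trained
                 probability is below tau * (max probability) are skipped.
    """
    lp_post = log_softmax(post_logits)
    lp_pre = log_softmax(pre_logits)
    cutoff = max(lp_post) + math.log(tau)

    best_id, best_score = None, -math.inf
    for token_id, (a, b) in enumerate(zip(lp_post, lp_pre)):
        if a < cutoff:
            continue  # implausible under the post-trained model
        score = a - alpha * b  # reward what post-training added
        if score > best_score:
            best_id, best_score = token_id, score
    return best_id
```

In this sketch, a token that both models rate equally highly gains nothing from the contrast, while a token whose probability rose after post-training is favored. Setting `alpha=0.0` reduces the rule to ordinary greedy decoding from the post-trained model. The cutoff prevents very unlikely tokens from winning only because the pre-trained model rates them even lower.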
