Evolutionary Contrastive Distillation for Language Model Alignment

Julian Katz-Samuels; Zheng Li; Hyokun Yun; Priyanka Nigam; Yi Xu; Vaclav Petricek; Bing Yin; Trishul Chilimbi

arXiv:2410.07513·cs.LG·October 11, 2024

TL;DR

This paper introduces Evolutionary Contrastive Distillation (ECD), a method that improves large language models' ability to follow complex instructions by generating synthetic contrastive preference data and training on it. The resulting 7B model surpasses current SOTA 7B models in instruction following and is competitive with open-source 70B models.

Contribution

It proposes a novel data generation technique that creates challenging contrastive examples for instruction tuning, improving model alignment with complex tasks.

Findings

01. 7B model surpasses current SOTA 7B models in instruction following
02. Method achieves competitive performance with open-source 70B models
03. Contrastive data improves complex instruction adherence

Abstract

The ability of large language models (LLMs) to execute complex instructions is essential for their real-world applications. However, several recent studies indicate that LLMs struggle with challenging instructions. In this paper, we propose Evolutionary Contrastive Distillation (ECD), a novel method for generating high-quality synthetic preference data designed to enhance the complex instruction-following capability of language models. ECD generates data that specifically illustrates the difference between a response that successfully follows a set of complex instructions and a response that is high-quality, but nevertheless makes some subtle mistakes. This is done by prompting LLMs to progressively evolve simple instructions to more complex instructions. When the complexity of an instruction is increased, the original successful response to the original instruction becomes a "hard…
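The data-generation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract is cut off before it finishes describing the role of the original response, so the sketch assumes that the response to the simpler instruction serves as the less-preferred side of a preference pair for the evolved instruction. The names `PreferencePair`, `build_preference_pairs`, `evolve`, and `respond` are hypothetical, and the two callables are stand-ins for LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreferencePair:
    """One synthetic preference example (hypothetical container)."""
    prompt: str    # the evolved, more complex instruction
    chosen: str    # response written for the evolved instruction
    rejected: str  # response written for the previous, simpler instruction


def build_preference_pairs(
    instruction: str,
    evolve: Callable[[str], str],
    respond: Callable[[str], str],
    rounds: int = 2,
) -> List[PreferencePair]:
    """Evolve an instruction step by step and pair old vs. new responses.

    `evolve` and `respond` are placeholders for LLM calls. Treating the
    earlier response as the rejected side is an assumption of this sketch.
    """
    pairs: List[PreferencePair] = []
    current = instruction
    previous_response = respond(current)
    for _ in range(rounds):
        evolved = evolve(current)
        new_response = respond(evolved)
        pairs.append(
            PreferencePair(
                prompt=evolved,
                chosen=new_response,
                rejected=previous_response,
            )
        )
        # The evolved instruction becomes the starting point of the next round.
        current, previous_response = evolved, new_response
    return pairs


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model.
    toy_evolve = lambda text: text + " Use at most 20 words."
    toy_respond = lambda text: "Response to: " + text
    for pair in build_preference_pairs("Describe the ocean.", toy_evolve, toy_respond):
        print(pair)
```

Records with `prompt`, `chosen`, and `rejected` fields are a common input shape for preference-optimization training such as Direct Preference Optimization, which this page lists under Methods. Whether the paper uses exactly this layout is not stated in the visible text.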

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems

Methods: Direct Preference Optimization · Sparse Evolutionary Training · Contrastive Learning