Black-Box On-Policy Distillation of Large Language Models
Tianzhu Ye, Li Dong, Zewen Chi, Xun Wu, Shaohan Huang, Furu Wei

TL;DR
This paper introduces Generative Adversarial Distillation (GAD), a novel black-box on-policy distillation method for large language models that improves student model performance by using a discriminator as an adaptive reward signal.
Contribution
The paper presents GAD, an adversarial framework for black-box LLM distillation that requires no access to teacher logits or parameters, and shows that it outperforms sequence-level knowledge distillation.
Findings
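GAD consistently outperforms the commonly used sequence-level knowledge distillation.
Qwen2.5-14B-Instruct (student) trained with GAD becomes comparable to its teacher, GPT-5-Chat, on the LMSYS-Chat automatic evaluation.
The discriminator co-evolves with the student and acts as an on-policy reward model, providing stable, adaptive feedback.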
Abstract
Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work, we introduce Generative Adversarial Distillation (GAD), which enables on-policy and black-box distillation. GAD frames the student LLM as a generator and trains a discriminator to distinguish its responses from the teacher LLM's, creating a minimax game. The discriminator acts as an on-policy reward model that co-evolves with the student, providing stable, adaptive feedback. Experimental results show that GAD consistently surpasses the commonly used sequence-level knowledge distillation. In particular, Qwen2.5-14B-Instruct (student) trained with GAD becomes comparable to its teacher, GPT-5-Chat, on the LMSYS-Chat automatic evaluation. The results establish GAD as a promising and effective…
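The minimax game described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The "responses" are just integer ids, the student is a softmax over them, and the discriminator is a table of scalar scores. The pairwise logistic discriminator loss, the REINFORCE update with a baseline, and every name here (`train_toy_gad`, `discriminator_loss`, `lr_d`, `lr_g`) are assumptions chosen for illustration. The abstract only says that a discriminator is trained to tell student responses from teacher responses and that it serves as an on-policy reward.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(score_teacher: float, score_student: float) -> float:
    # Assumed pairwise logistic loss: small when the teacher's response
    # is scored above the student's response.
    return -math.log(sigmoid(score_teacher - score_student))


def train_toy_gad(teacher_probs, steps=2000, lr_d=0.05, lr_g=0.05, seed=0):
    """Toy adversarial distillation over a finite set of response ids.

    The teacher is black-box: it is only ever sampled from; its
    probabilities are never shown to the student or the discriminator.
    """
    rng = random.Random(seed)
    n = len(teacher_probs)
    responses = list(range(n))
    student_logits = [0.0] * n   # generator (student) parameters
    scores = [0.0] * n           # tabular discriminator, one score per response

    for _ in range(steps):
        probs = softmax(student_logits)
        y_t = rng.choices(responses, weights=teacher_probs)[0]  # teacher output
        y_s = rng.choices(responses, weights=probs)[0]          # on-policy student sample

        # Discriminator step: gradient descent on the pairwise loss,
        # pushing the teacher's score up and the student's score down.
        g = 1.0 - sigmoid(scores[y_t] - scores[y_s])
        scores[y_t] += lr_d * g
        scores[y_s] -= lr_d * g

        # Student step: REINFORCE, using the discriminator score as reward
        # and the expected score under the current policy as a baseline.
        advantage = scores[y_s] - sum(p * s for p, s in zip(probs, scores))
        for k in range(n):
            grad_log_prob = (1.0 if k == y_s else 0.0) - probs[k]
            student_logits[k] += lr_g * advantage * grad_log_prob

    return softmax(student_logits), scores


if __name__ == "__main__":
    student_probs, scores = train_toy_gad([0.7, 0.2, 0.1])
    print("student distribution:", [round(p, 3) for p in student_probs])
    print("discriminator scores:", [round(s, 3) for s in scores])
```

Because the discriminator is updated on the student's own samples at every step, the reward changes as the student changes, which is the on-policy, co-evolving behaviour the abstract describes. A real system would replace the score table with a learned model over text and the softmax with an LLM policy. Adversarial training of this kind can oscillate, so the toy run should be read as an illustration rather than a convergence claim.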
Code & Models
- 🤗 ytz20/GAD-GPT-5-Chat-Qwen2.5-14B-Instruct · model · 18 dl · ♡ 1
- 🤗 ytz20/GAD-GPT-5-Chat-Qwen2.5-7B-Instruct · model · 14 dl · ♡ 1
- 🤗 ytz20/GAD-GPT-5-Chat-Qwen2.5-3B-Instruct · model · 15 dl · ♡ 1
- 🤗 ytz20/GAD-GPT-5-Chat-Llama-3.2-3B-Instruct · model · 12 dl · ♡ 1
- 🤗 ytz20/GAD-GPT-5-Chat-Llama-3.1-8B-Instruct · model · 14 dl · ♡ 3
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications
