Intermediate Distillation: Data-Efficient Distillation from Black-Box LLMs for Information Retrieval
Zizhong Li, Haopeng Zhang, Jiawei Zhang

TL;DR
This paper introduces Intermediate Distillation, a resource-efficient method to distill knowledge from black-box LLMs into retriever models using only their ranking outputs, significantly enhancing retrieval performance with minimal data.
Contribution
The paper proposes a novel black-box distillation approach that leverages LLM ranking outputs for training retrievers, reducing resource requirements and enabling effective knowledge transfer.
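As an illustration of how a ranking alone can serve as a supervision signal, here is a minimal sketch. It assumes a ListMLE-style (Plackett-Luce) listwise loss, a standard learning-to-rank objective swapped in for illustration; this summary does not state which loss the paper uses. The sketch also collapses the paper's LLM-ranker-retriever pipeline into a single teacher-to-student step. The function name `listmle_loss` and the toy scores are hypothetical.

```python
import math


def listmle_loss(scores, teacher_ranking):
    """Negative Plackett-Luce log-likelihood of a teacher ranking.

    scores: student relevance scores, one per candidate passage.
    teacher_ranking: candidate indices ordered best-first
        (a stand-in for the ranking a black-box LLM would generate).
    """
    # Reorder the student scores into the teacher's preferred order.
    ordered = [scores[i] for i in teacher_ranking]
    loss = 0.0
    for k in range(len(ordered)):
        tail = ordered[k:]
        # Numerically stable log-sum-exp over the remaining candidates.
        m = max(tail)
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        # Penalize the student when the k-th ranked item does not
        # dominate the items the teacher placed after it.
        loss += log_z - ordered[k]
    return loss


# Hypothetical teacher ranking over three passages: 2 is best, then 0, then 1.
teacher_ranking = [2, 0, 1]

# Student scores that agree with the teacher's order.
agree = listmle_loss([1.0, 0.0, 2.0], teacher_ranking)
# Student scores that reverse the teacher's order.
disagree = listmle_loss([1.0, 2.0, 0.0], teacher_ranking)
```

The loss needs only the order of the candidates, not the teacher's weights or output probabilities, which is what makes this kind of objective compatible with a black-box teacher. In practice the scores would come from a trainable retriever and the loss would be minimized by gradient descent.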
Findings
Improves retriever performance with only 1,000 training instances.
Enhances question-answering in RAG frameworks through distilled retrievers.
Demonstrates effectiveness of black-box distillation in real-world tasks.
Abstract
Recent research has explored distilling knowledge from large language models (LLMs) to optimize retriever models, especially within the retrieval-augmented generation (RAG) framework. However, most existing training methods rely on extracting supervision signals from LLMs' weights or their output probabilities, which is not only resource-intensive but also incompatible with black-box LLMs. In this paper, we introduce Intermediate Distillation, a data-efficient knowledge distillation training scheme that treats LLMs as black boxes and distills their knowledge via an innovative LLM-ranker-retriever pipeline, solely using LLMs' ranking generation as the supervision signal. Extensive experiments demonstrate that our proposed method can significantly improve the performance of retriever models with only 1,000 training instances. Moreover, our distilled retriever model significantly…
Taxonomy
Topics: Data Mining Algorithms and Applications
Methods: WordPiece · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dropout
