DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation

Jennifer Chen; Aidar Myrzakhan; Yaxin Luo; Hassaan Muhammad Khan; Sondos Mahmoud Bsharat; Zhiqiang Shen

arXiv:2506.01954 · cs.CL · June 3, 2025

TL;DR

DRAG is a framework that distills RAG knowledge from large language models (LLMs) into small language models (SLMs) using ranked evidence and knowledge graphs, reducing hallucinations and computational cost while maintaining factual accuracy.

Contribution

The paper introduces a novel evidence- and graph-based distillation method for transferring RAG knowledge from large language models (LLMs) to small language models (SLMs), improving factual accuracy and efficiency.

Findings

01. Outperforms prior RAG methods such as MiniRAG by up to 27.7% in accuracy.

02. Effectively reduces hallucinations in small LMs.

03. Enhances factual knowledge retention in distilled models.

Abstract

Retrieval-Augmented Generation (RAG) methods have proven highly effective for tasks requiring factual consistency and robust knowledge retrieval. However, large-scale RAG systems consume significant computational resources and are prone to generating hallucinated content from Humans. In this work, we introduce DRAG, a novel framework for distilling RAG knowledge from large-scale Language Models (LLMs) into small LMs (SLMs). Our approach leverages evidence- and knowledge graph-based distillation, ensuring that the distilled model retains critical factual knowledge while significantly reducing model size and computational cost. By aligning the smaller model's predictions with a structured knowledge graph and ranked evidence, DRAG effectively mitigates hallucinations and improves factual accuracy. We further present a case demonstrating how our framework mitigates…

Taxonomy

Topics: Traditional Chinese Medicine Studies · Machine Learning in Healthcare · Scientific Computing and Data Management