Distilling LLM Agent into Small Models with Retrieval and Code Tools
Minki Kang, Jongwon Jeong, Seanie Lee, Jaewoong Cho, Sung Ju Hwang

TL;DR
This paper introduces Agent Distillation, a method that transfers full task-solving behavior from LLM-based agents to small language models equipped with retrieval and code tools, improving the small models' reasoning and robustness.
Contribution
It proposes an agent distillation framework together with two techniques: a first-thought prefix prompting method that improves the quality of teacher-generated trajectories, and self-consistent action generation for small models.
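Self-consistent action generation is only named here, not specified. One plausible reading, offered as an assumption rather than the authors' exact procedure, is to sample several candidate code actions, execute each, discard the ones that fail, and keep an action whose outcome agrees with the majority. The sketch below illustrates that reading; `run_action`, `self_consistent_action`, and the convention that an action stores its answer in a variable called `result` are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def run_action(code: str) -> Optional[str]:
    """Run one candidate action; return repr of its `result`, or None on failure."""
    scope: dict = {}
    try:
        # Illustration only: real agents should execute code in a sandbox.
        exec(code, {}, scope)
    except Exception:
        return None  # parsing or runtime error: the candidate is discarded
    return repr(scope["result"]) if "result" in scope else None


def self_consistent_action(candidates: Sequence[str]) -> Optional[str]:
    """Pick a candidate whose execution outcome matches the majority outcome."""
    outcomes = [(code, run_action(code)) for code in candidates]
    valid = [(code, out) for code, out in outcomes if out is not None]
    if not valid:
        return None  # every candidate failed
    winner, _ = Counter(out for _, out in valid).most_common(1)[0]
    # Return the first candidate that produced the majority outcome.
    return next(code for code, out in valid if out == winner)


# Hypothetical candidates, as if sampled from a small model for "17 * 24".
candidates = [
    "result = 17 * 24",
    "result = 17 * 20 + 17 * 4",
    "result = 17 * 25",   # runs, but disagrees with the majority
    "result = 17 * (24",  # syntax error, filtered out
]
chosen = self_consistent_action(candidates)  # → 'result = 17 * 24'
```

Voting on execution outcomes rather than on the code text lets differently written but equivalent actions support each other, which is why the two correct candidates above outvote the incorrect one.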
Findings
Small models (0.5B, 1.5B, 3B) achieve performance competitive with next-tier larger models (1.5B, 3B, 7B) fine-tuned with CoT distillation.
Enhanced robustness and accuracy in reasoning tasks.
Effective transfer of reasoning and task-solving capabilities.
Abstract
Large language models (LLMs) excel at complex reasoning tasks but remain computationally expensive, limiting their practical deployment. To address this, recent works have focused on distilling reasoning capabilities into smaller language models (sLMs) using chain-of-thought (CoT) traces from teacher LLMs. However, this approach struggles in scenarios requiring rare factual knowledge or precise computation, where sLMs often hallucinate due to limited capability. In this work, we propose Agent Distillation, a framework for transferring not only reasoning capability but full task-solving behavior from LLM-based agents into sLMs with retrieval and code tools. We improve agent distillation along two complementary axes: (1) we introduce a prompting method called first-thought prefix to enhance the quality of teacher-generated trajectories; and (2) we propose a self-consistent action…
Code & Models
- 🤗 agent-distillation/agent_distilled_Qwen2.5-1.5B-Instruct (model, 12 downloads)
- 🤗 agent-distillation/agent_distilled_Qwen2.5-0.5B-Instruct (model, 3 downloads)
- 🤗 agent-distillation/agent_distilled_Qwen2.5-3B-Instruct (model, 6 downloads)
- 🤗 agent-distillation/agent_distilled_Qwen2.5-7B-Instruct (model, 4 downloads)
- 🤗 agent-distillation/agent_distilled_ftp_Qwen2.5-1.5B-Instruct (model, 5 downloads)
- 🤗 agent-distillation/agent_distilled_ftp_Qwen2.5-0.5B-Instruct (model, 7 downloads)
- 🤗 agent-distillation/agent_distilled_ftp_Qwen2.5-3B-Instruct (model, 1 download)
- 🤗 agent-distillation/agent_distilled_ftp_Qwen2.5-7B-Instruct (model, 5 downloads)
- agent-distillation/Qwen2.5-32B-Instruct_agent_trajectories_2k (dataset, 62 downloads)
- agent-distillation/Qwen2.5-32B-Instruct_agent_trajectories_2k_prefix (dataset, 74 downloads)
- agent-distillation/Qwen2.5-32B-Instruct_cot_trajectories_2k (dataset, 18 downloads)
- agent-distillation/Qwen2.5-32B-Instruct_prefix_memory_3k (dataset, 11 downloads)
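The model names in the list follow a regular pattern, so a repository ID can be built from the model size and variant. The sketch below is a minimal example under three assumptions: that `ftp` stands for the first-thought prefix variant, that the repositories are hosted on the Hugging Face Hub, and that they hold full checkpoints loadable with the standard `transformers` Auto classes rather than adapter-only weights. The helpers `repo_id` and `load` are hypothetical names introduced for illustration.

```python
SIZES = ("0.5B", "1.5B", "3B", "7B")  # sizes that appear in the list above


def repo_id(size: str, first_thought_prefix: bool = False) -> str:
    """Build a repository ID matching the naming pattern in the list above."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    # Assumption: the "ftp" infix marks the first-thought prefix variant.
    variant = "agent_distilled_ftp" if first_thought_prefix else "agent_distilled"
    return f"agent-distillation/{variant}_Qwen2.5-{size}-Instruct"


def load(size: str, first_thought_prefix: bool = False):
    """Download tokenizer and model (needs `transformers` and network access)."""
    # Imported lazily so that repo_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = repo_id(size, first_thought_prefix)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    return tokenizer, model


name = repo_id("1.5B")  # → 'agent-distillation/agent_distilled_Qwen2.5-1.5B-Instruct'
```

Calling `load("0.5B")` would fetch the smallest listed checkpoint. If a repository turns out to contain adapter weights only, it would need to be loaded on top of the base model instead.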
