Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs
Auksarapak Kietkajornrit, Jad Tarifi, Nima Asgharbeygi

TL;DR
This paper introduces a modular framework for large language models that explicitly separates planning from retrieval and answer synthesis, improving reliability and efficiency in fact-seeking tasks.
Contribution
It proposes a lightweight, trainable planner that generates structured reasoning plans, enhancing the accuracy and speed of search-augmented LLMs.
Findings
Supervised planning improves accuracy on the SEAL-0 benchmark.
The framework reduces latency compared to monolithic models.
Explicit planning structures are crucial for reliable fact-seeking in LLMs.
Abstract
Fact-seeking question answering with large language models (LLMs) remains unreliable when answers depend on up-to-date or conflicting information. Although retrieval-augmented and tool-using LLMs reduce hallucinations, they often rely on implicit planning, leading to inefficient tool usage. We propose a modular framework that explicitly separates planning from factual retrieval and answer synthesis. A lightweight student planner is trained via a teacher-student framework to generate structured decompositions consisting of abstract reasoning steps and searchable fact requests. The supervision signals contain only planning traces and fact requests, without providing factual answers or retrieved evidence. At inference, the planner produces plans, while prompt-engineered modules perform retrieval and response synthesis. We evaluate the proposed framework on SEAL-0, an extremely challenging…
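The separation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Plan`, `answer_question`, `planner`, `retrieve`, and `synthesize` are hypothetical, and the plan is assumed to be a simple pair of lists (abstract reasoning steps and searchable fact requests). The point it shows is that the planner emits only structure, while facts enter through the retrieval module alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Plan:
    """Hypothetical plan format: abstract steps plus searchable fact requests."""
    reasoning_steps: List[str] = field(default_factory=list)
    fact_requests: List[str] = field(default_factory=list)


# Assumed module interfaces; the paper does not specify these signatures.
Planner = Callable[[str], Plan]                           # trained student planner
Retriever = Callable[[str], str]                          # prompt-engineered search module
Synthesizer = Callable[[str, Plan, Dict[str, str]], str]  # prompt-engineered answer module


def answer_question(
    question: str,
    planner: Planner,
    retrieve: Retriever,
    synthesize: Synthesizer,
) -> str:
    """Run the three stages in order: plan, retrieve, synthesize."""
    # Stage 1: the planner decomposes the question and supplies no facts itself.
    plan = planner(question)
    # Stage 2: each fact request is resolved independently by the retriever.
    evidence = {request: retrieve(request) for request in plan.fact_requests}
    # Stage 3: the synthesizer combines the plan and the retrieved evidence.
    return synthesize(question, plan, evidence)
```

Because each stage is passed in as a plain callable, any one of them can be swapped or stubbed out for testing without touching the others, which is one way to read the modularity the abstract argues for.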
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Intelligent Tutoring Systems and Adaptive Learning
