Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning

Sanket Badhe; Deep Shah

arXiv:2602.21103·cs.CL·February 25, 2026

Open Access

TL;DR

Prompt-Level Distillation (PLD) offers a non-parametric, interpretable alternative to fine-tuning for efficient reasoning, significantly improving performance with minimal latency overhead and enabling human verification of logic.

Contribution

We introduce Prompt-Level Distillation, a method that extracts a Teacher model's reasoning patterns into explicit instructions for a Student model's system prompt, improving interpretability and efficiency without fine-tuning.

Findings

01

Macro F1 improved from 57% to 90.0% on StereoSet and from 67% to 83% on Contract-NLI, using Gemma-3 4B

02

The compact Gemma-3 4B model matched frontier-model performance with negligible latency overhead

03

Enhanced interpretability for regulated industries
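The headline numbers above are Macro F1 scores, i.e. the unweighted mean of per-class F1, so rare classes count as much as common ones. The following is a minimal sketch of that standard metric; the function name and the toy labels are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for illustration only.
toy_true = ["a", "a", "b", "b"]
toy_pred = ["a", "b", "b", "b"]
toy_score = macro_f1(toy_true, toy_pred)
```

Here class "a" scores 2/3 and class "b" scores 4/5, so the macro average is their plain mean regardless of class frequency.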

Abstract

Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational overhead. To address these limitations, we introduce Prompt-Level Distillation (PLD). We extract explicit reasoning patterns from a Teacher model and organize them into a structured list of expressive instructions for the Student model's System Prompt. Evaluated on the StereoSet and Contract-NLI datasets using Gemma-3 4B, PLD improved Macro F1 scores from 57% to 90.0% and 67% to 83% respectively, enabling this compact model to match frontier performance with negligible latency overhead. These expressive instructions render the decision-making process transparent, allowing for full human…
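The two steps the abstract names (extract reasoning patterns from a Teacher, then organize them into a list of instructions for the Student's system prompt) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify how patterns are elicited or organized, so the prompt wording, the exact-match deduplication, the function names, the stub teacher, and the toy examples are all hypothetical.

```python
from typing import Callable, Iterable


def extract_patterns(
    examples: Iterable[tuple[str, str]],
    teacher: Callable[[str], str],
) -> list[str]:
    """Ask the teacher for one general rule per labeled example, dropping repeats."""
    patterns: list[str] = []
    seen: set[str] = set()
    for text, label in examples:
        # Hypothetical elicitation prompt; the paper's wording is not given here.
        rule = teacher(
            f"Input: {text}\nLabel: {label}\n"
            "State, in one sentence, the general rule that justifies this label."
        ).strip()
        key = rule.lower()
        if rule and key not in seen:  # assumption: exact-match deduplication
            seen.add(key)
            patterns.append(rule)
    return patterns


def build_system_prompt(task: str, patterns: list[str]) -> str:
    """Organize the extracted rules into a numbered instruction list."""
    lines = [task, "", "Apply the following rules:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(patterns, start=1)]
    return "\n".join(lines)


def fake_teacher(prompt: str) -> str:
    """Stub standing in for a real Teacher model call (hypothetical)."""
    if "Label: entailment" in prompt:
        return "If the clause explicitly grants the stated right, answer entailment."
    return "If the clause never addresses the hypothesis, answer not_mentioned."


# Toy examples, made up for illustration only.
toy_examples = [
    ("clause A", "entailment"),
    ("clause B", "not_mentioned"),
    ("clause C", "entailment"),
]
task = "Classify whether the contract clause supports the hypothesis."
patterns = extract_patterns(toy_examples, fake_teacher)
system_prompt = build_system_prompt(task, patterns)
```

Because the distilled knowledge lives in `system_prompt` as plain text rather than in model weights, a reviewer can read and edit each rule directly, which is the interpretability property the abstract emphasizes.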

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI