ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
Sharut Gupta, Phillip Isola, Stefanie Jegelka, David Lopez-Paz, Kartik Ahuja, Mark Ibrahim, Mohammad Pezeshki

TL;DR
ReasonCACHE enables large language models to learn reasoning skills without weight updates by distilling demonstrations into a fixed key-value cache, surpassing traditional in-context learning and matching in-weight learning performance efficiently.
Contribution
The paper introduces ReasonCACHE, a method that uses Prefix Tuning to teach LLMs to reason without overloading the context window or updating weights. It outperforms standard in-context learning (ICL) and matches or surpasses in-weight learning (IWL).
Findings
ReasonCACHE outperforms standard ICL on reasoning benchmarks.
It matches or surpasses in-weight learning approaches.
It is more efficient in data, inference cost, and parameters.
Abstract
Can large language models (LLMs) learn to reason without any weight updates and only through in-context learning (ICL)? ICL is strikingly sample-efficient, often learning from only a handful of demonstrations, but complex reasoning tasks typically demand many training examples to learn from. However, naively scaling ICL by adding more demonstrations breaks down at this scale: attention costs grow quadratically, performance saturates or degrades with longer contexts, and the approach remains a shallow form of learning. Due to these limitations, practitioners predominantly rely on in-weight learning (IWL) to induce reasoning. In this work, we show that by using Prefix Tuning, LLMs can learn to reason without overloading the context window and without any weight updates. We introduce ReasonCACHE, an instantiation of this mechanism that distills demonstrations into a fixed…
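The general prefix-tuning idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-head scaled dot-product attention on plain Python lists, and the names `attend`, `attend_with_prefix`, `prefix_keys`, and `prefix_values` are hypothetical. In prefix tuning, a fixed number of trainable key-value vectors is prepended to the keys and values the frozen model computes, so only the prefix would be optimized. The sketch shows the forward pass only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    out_dim = len(values[0])
    return [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(out_dim)
    ]


def attend_with_prefix(query, keys, values, prefix_keys, prefix_values):
    """Attention where learned prefix key-value pairs (hypothetical,
    assumed trainable) are prepended to the frozen model's own keys
    and values. The model weights that produced `keys` and `values`
    are untouched."""
    return attend(query, prefix_keys + keys, prefix_values + values)


# Toy inputs, chosen only for illustration.
query = [1.0, 0.0]
keys = [[1.0, 0.0]]
values = [[0.0, 1.0]]
prefix_keys = [[1.0, 0.0]]
prefix_values = [[1.0, 0.0]]

without_prefix = attend(query, keys, values)
with_prefix = attend_with_prefix(query, keys, values, prefix_keys, prefix_values)
```

In this toy case the prefix key scores the same as the model's own key, so the output moves halfway toward the prefix value. The prefix length stays fixed however many demonstrations were used to train it, so it takes up no room in the prompt.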
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Multimodal Machine Learning Applications
