TL;DR
This paper shows that using thinking traces as a retrieval corpus improves language-model performance on reasoning tasks, outperforming traditional web-based retrieval.
Contribution
The paper introduces T3, an offline method that transforms thinking traces into structured, retrieval-friendly representations, improving reasoning performance with minimal additional inference cost.
Findings
Retrieving thinking traces improves reasoning performance across benchmarks.
Transforming traces into structured representations enhances retrieval effectiveness.
RAG with traces outperforms non-RAG baselines and web retrieval in reasoning tasks.
Abstract
Retrieval-augmented generation (RAG) has proven effective for knowledge-intensive tasks, but is widely believed to offer limited benefit for reasoning-intensive problems such as math and code generation. We challenge this assumption by showing that the limitation lies not in RAG itself, but in the choice of corpus. Instead of retrieving documents, we propose retrieving thinking traces, i.e., intermediate thinking trajectories generated during problem-solving attempts. We show that thinking traces are already a strong retrieval source, and further introduce T3, an offline method that transforms them into structured, retrieval-friendly representations to improve usability. Using these traces as a corpus, a simple retrieve-then-generate pipeline consistently improves reasoning performance across strong models and benchmarks such as AIME 2025–2026, LiveCodeBench, and GPQA-Diamond,…
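The retrieve-then-generate idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the bag-of-words cosine retriever, the function names (`retrieve`, `build_prompt`), the prompt layout, and the three example traces are all assumptions made for illustration. The T3 transformation step is not reproduced here, since its details are not given in this summary.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, traces, k=2):
    """Return the k traces most similar to the query.

    Assumption: a toy lexical retriever stands in for whatever
    retriever the paper actually uses.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(t))), t) for t in traces]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]


def build_prompt(question, retrieved):
    """Prepend retrieved traces to the question (hypothetical prompt format)."""
    context = "\n\n".join(f"Trace {i + 1}: {t}" for i, t in enumerate(retrieved))
    return f"Relevant thinking traces:\n{context}\n\nProblem: {question}\nSolution:"


# Hypothetical corpus of thinking traces, invented for this example.
traces = [
    "To count lattice paths in a grid, use binomial coefficients: choose which steps go right.",
    "For a modular arithmetic problem, reduce each term modulo n before multiplying.",
    "When sorting intervals, sort by start and merge overlapping ranges in one pass.",
]
query = "How many lattice paths go from corner to corner of a 5 by 5 grid?"

top = retrieve(query, traces, k=1)
prompt = build_prompt(query, top)
```

The resulting `prompt` would then be passed to a language model for generation. In a realistic setting the lexical scorer would likely be replaced by a dense or hybrid retriever, and the corpus entries by the structured representations the paper describes.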
