RTTC: Reward-Guided Collaborative Test-Time Compute

J. Pablo Muñoz; Jinjie Yuan

arXiv:2508.10024·cs.CL·August 15, 2025

TL;DR

RTTC is an adaptive framework that uses a reward model to choose a test-time compute strategy for each query, improving large language model accuracy while avoiding unnecessary computation across diverse tasks.

Contribution

This work presents RTTC, a reward-guided, adaptive approach for selecting test-time compute strategies, incorporating distributed retrieval, lightweight fine-tuning, and query-state caching.

Findings

01. RTTC outperforms vanilla RAG and TTT in accuracy across multiple benchmarks.

02. Adaptive strategy selection reduces computational overhead.

03. Query-State Caching improves efficiency by reusing historical query states.
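
The idea of reusing historical query states can be illustrated with a small similarity cache. This is only a sketch under stated assumptions: the summary does not say what a "query state" contains or how matches are found, so the class name `QueryStateCache`, the use of query embeddings with cosine similarity, the `min_similarity` threshold, and the stored state (here just a strategy label) are all hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class QueryStateCache:
    """Hypothetical cache mapping past query embeddings to a saved state.

    Assumption: the state is whatever the caller wants to reuse for a
    similar query (here, a strategy label such as "rag").
    """

    def __init__(self, min_similarity: float = 0.9) -> None:
        self.min_similarity = min_similarity
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, embedding: Sequence[float], state: str) -> None:
        """Record the state that was used for a past query."""
        self._entries.append((list(embedding), state))

    def get(self, embedding: Sequence[float]) -> Optional[str]:
        """Return the state of the most similar past query, or None on a miss."""
        best_state: Optional[str] = None
        best_sim = self.min_similarity
        for past_embedding, state in self._entries:
            sim = cosine(embedding, past_embedding)
            if sim >= best_sim:
                best_state, best_sim = state, sim
        return best_state


# Toy usage with hand-written 2-D "embeddings".
cache = QueryStateCache(min_similarity=0.9)
cache.put([1.0, 0.0], "rag")
hit = cache.get([0.99, 0.05])   # close to the stored query, so it is reused
miss = cache.get([0.0, 1.0])    # orthogonal to the stored query, so no reuse
```

A linear scan is enough for illustration; a real deployment would more likely use an approximate nearest-neighbor index.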

Abstract

Test-Time Compute (TTC) has emerged as a powerful paradigm for enhancing the performance of Large Language Models (LLMs) at inference, leveraging strategies such as Test-Time Training (TTT) and Retrieval-Augmented Generation (RAG). However, the optimal adaptation strategy varies across queries, and indiscriminate application of TTC strategies incurs substantial computational overhead. In this work, we introduce Reward-Guided Test-Time Compute (RTTC), a novel framework that adaptively selects the most effective TTC strategy for each query via a pretrained reward model, maximizing downstream accuracy across diverse domains and tasks. RTTC operates in a distributed server-client architecture, retrieving relevant samples from a remote knowledge base and applying RAG or lightweight fine-tuning on client devices only when necessary. To further mitigate redundant computation, we propose…
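
One way to picture reward-guided selection "only when necessary" is a cheapest-first cascade that stops as soon as a reward score clears a threshold. The sketch below is an assumption about how such a selector could look, not the authors' procedure: the `Strategy` type, the `select_answer` function, the threshold rule, and the toy reward function are all hypothetical, and the generators are stand-ins for a base model, RAG, and TTT.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Hypothetical signatures; the abstract does not specify an API.
Generator = Callable[[str], str]         # query -> answer
RewardFn = Callable[[str, str], float]   # (query, answer) -> score


@dataclass
class Strategy:
    name: str
    generate: Generator


def select_answer(
    query: str,
    strategies: Sequence[Strategy],
    reward: RewardFn,
    threshold: float,
) -> Tuple[str, str, float]:
    """Try strategies in the given (assumed cheapest-first) order.

    Stops at the first answer whose reward meets the threshold; if none
    does, returns the best-scoring answer seen.
    """
    if not strategies:
        raise ValueError("at least one strategy is required")
    best: Tuple[str, str, float] = ("", "", float("-inf"))
    for strategy in strategies:
        answer = strategy.generate(query)
        score = reward(query, answer)
        if score > best[2]:
            best = (strategy.name, answer, score)
        if score >= threshold:
            break  # good enough: skip the more expensive strategies
    return best


# Toy stand-ins for a base model, RAG, and test-time training.
strategies = [
    Strategy("base", lambda q: "unsure"),
    Strategy("rag", lambda q: "Paris"),
    Strategy("ttt", lambda q: "Paris (after fine-tuning)"),
]


def toy_reward(query: str, answer: str) -> float:
    """Placeholder for a pretrained reward model."""
    return 0.9 if answer.startswith("Paris") else 0.1


name, answer, score = select_answer(
    "What is the capital of France?", strategies, toy_reward, threshold=0.5
)
```

In this toy run the base answer scores below the threshold, the RAG answer clears it, and the fine-tuning step is never invoked.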
