TL;DR
This paper introduces QueST, a test-time self-training method that adapts large language models to individual queries using query-derived supervision, improving reasoning performance without external data.
Contribution
QueST is a novel framework that enables query-specific model adaptation during inference by generating supervision directly from the input query.
Findings
QueST outperforms existing test-time optimization baselines on seven reasoning benchmarks.
The method effectively adapts models to individual queries without external data.
Results demonstrate improved accuracy in mathematical and scientific reasoning tasks.
Abstract
Large language models (LLMs) are typically deployed with fixed parameters, and their performance is often improved by allocating more computation at inference time. While such test-time scaling can be effective, it cannot correct model misconceptions or adapt the model to the specific structure of an individual query. Test-time optimization addresses this limitation by enabling parameter updates during inference, but existing approaches either rely on external data or optimize generic self-supervised objectives that lack query-specific alignment. In this work, we propose Query-Conditioned Test-Time Self-Training (QueST), a framework that adapts model parameters during inference using supervision derived directly from the input query. Our key insight is that the input query itself encodes latent signals sufficient for constructing structurally related problem-solution pairs. Based on…
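The general shape of per-query adaptation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated before it explains how pairs are constructed or how the model is updated, so everything here is an assumption made for illustration. The "model" is a single scalar weight rather than an LLM, and `hypothetical_pairs`, `adapt_and_answer`, and the rule `y = 2 * x` are invented names and data. The sketch only shows the control flow: derive related pairs from the query, fine-tune a per-query copy of the parameters on them, then answer the query with the adapted copy.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[float, float]  # toy stand-in for a (problem, solution) pair


def hypothetical_pairs(query: float) -> List[Pair]:
    """Hypothetical generator: problems near the query, solved by a fixed rule.

    Assumption: the rule y = 2 * x is invented for this toy; the source does
    not specify how the related pairs are built.
    """
    return [(x, 2.0 * x) for x in (query - 1.0, query + 1.0, query + 2.0)]


def adapt_and_answer(
    base_params: Dict[str, float],
    query: float,
    make_pairs: Callable[[float], List[Pair]] = hypothetical_pairs,
    epochs: int = 50,
    lr: float = 0.01,
) -> float:
    """Fine-tune a per-query copy of the parameters, then answer the query."""
    params = dict(base_params)  # copy, so the deployed parameters stay fixed
    for _ in range(epochs):
        for x, y in make_pairs(query):
            err = params["w"] * x - y    # prediction error on one pair
            params["w"] -= lr * err * x  # gradient step on 0.5 * err ** 2
    return params["w"] * query


base = {"w": 0.0}
answer = adapt_and_answer(base, query=3.0)
```

In this toy run the adapted answer ends up close to 6.0 while `base` is left unchanged, which mirrors the idea that adaptation is scoped to a single query and does not alter the deployed parameters.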
