TL;DR
This paper introduces TF-TTCL, a training-free, test-time adaptation framework for large language models that improves reasoning by dynamically distilling and applying semantic rules during inference.
Contribution
It proposes a novel 'Explore-Reflect-Steer' loop enabling frozen LLMs to adapt online without additional training or gradient updates.
Findings
TF-TTCL outperforms zero-shot baselines on reasoning tasks.
The method effectively distills semantic rules from inference trajectories.
Dynamic rule retrieval improves reasoning robustness.
Abstract
Large language models (LLMs) demonstrate strong reasoning capabilities, but their performance often degrades under distribution shift. Existing test-time adaptation (TTA) methods rely on gradient-based updates that require white-box access and incur substantial overhead, while training-free alternatives are either static or depend on external guidance. In this paper, we propose Training-Free Test-Time Contrastive Learning (TF-TTCL), a training-free adaptation framework that enables a frozen LLM to improve online by distilling supervision from its own inference experiences. Specifically, TF-TTCL implements a dynamic "Explore-Reflect-Steer" loop through three core modules: 1) Semantic Query Augmentation first diversifies problem views via multi-agent role-playing to generate different reasoning trajectories; 2) Contrastive Experience Distillation then captures the semantic gap between…
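The abstract is cut off before the loop is fully described, so the following is only a rough sketch of what an "Explore-Reflect-Steer" loop over a frozen model could look like, not the authors' implementation. Every name here (`ROLES`, `explore`, `reflect`, `retrieve`, `answer_online`, `toy_llm`) is hypothetical. Two further assumptions are not from the source: a majority vote decides which trajectory is treated as "preferred" in the contrast step, and rule retrieval uses simple word overlap. The model is passed in as a plain text-to-text callable, and no gradients or weights are touched.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical role prompts; the paper's actual roles are not given in the abstract.
ROLES = ["a careful mathematician", "a skeptical reviewer", "a pragmatic engineer"]

LLM = Callable[[str], str]  # frozen model: prompt in, text out


def explore(llm: LLM, question: str, rules: List[str]) -> List[str]:
    """Explore: pose the same question under several role-played views."""
    hints = "\n".join(f"Rule: {r}" for r in rules)
    return [llm(f"{hints}\nYou are {role}.\nQuestion: {question}") for role in ROLES]


def reflect(llm: LLM, question: str, answers: List[str]) -> Optional[str]:
    """Reflect: contrast a majority answer with a dissenting one and ask for a rule.

    Using the majority vote as the 'preferred' side is an assumption of this sketch.
    """
    counts = Counter(answers)
    if len(counts) < 2:
        return None  # all views agree, so there is nothing to contrast
    majority = counts.most_common(1)[0][0]
    minority = next(a for a in answers if a != majority)
    return llm(
        f"Question: {question}\nPreferred: {majority}\nRejected: {minority}\n"
        "State one rule that explains the gap."
    )


def retrieve(rules: List[str], question: str, k: int = 2) -> List[str]:
    """Steer: return up to k stored rules ranked by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(rules, key=lambda r: len(q_words & set(r.lower().split())), reverse=True)
    return ranked[:k]


def answer_online(llm: LLM, questions: List[str]) -> Tuple[List[str], List[str]]:
    """Process a stream of questions, growing a textual rule memory along the way."""
    memory: List[str] = []
    finals: List[str] = []
    for q in questions:
        rules = retrieve(memory, q)
        answers = explore(llm, q, rules)
        finals.append(Counter(answers).most_common(1)[0][0])
        rule = reflect(llm, q, answers)
        if rule and rule not in memory:
            memory.append(rule)
    return finals, memory


def toy_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, used only to exercise the loop."""
    if "State one rule" in prompt:
        return "check units before adding"
    if "skeptical" in prompt and "Rule:" not in prompt:
        return "41"  # one view disagrees until a rule is supplied
    return "42"


finals, memory = answer_online(toy_llm, ["what is 40 plus 2", "what is 40 plus 2 units"])
```

With the toy model, the first question produces a disagreement that yields one stored rule, and the second question is answered with that rule prepended to every prompt. A real deployment would replace `toy_llm` with a call to an actual model and would likely need a stronger retrieval scheme than word overlap.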
