TARSE: Test-Time Adaptation via Retrieval of Skills and Experience for Reasoning Agents

Junda Wang; Zonghai Tao; Hansi Zeng; Zhichao Yang; Hamed Zamani; Hong Yu

arXiv:2603.01241 · cs.IR · March 3, 2026

Open Access

TL;DR

This paper introduces TARSE, a method that improves clinical question answering by retrieving relevant skills and experiences and performing test-time adaptation to enhance reasoning accuracy.

Contribution

It presents a novel framework that explicitly retrieves and aligns clinical skills and experiences at test time for better reasoning in medical agents.

Findings

01 Consistent performance improvements over baseline methods.

02 Effective retrieval of relevant skills and experiences.

03 Enhanced reasoning accuracy in medical question answering.

Abstract

Complex clinical decision making often fails not because a model lacks facts, but because it cannot reliably select and apply the right procedural knowledge and the right prior example at the right reasoning step. We frame clinical question answering as an agent problem with two explicit, retrievable resources: skills, reusable clinical procedures such as guidelines, protocols, and pharmacologic mechanisms; and experience, verified reasoning trajectories from previously solved cases (e.g., chain-of-thought solutions and their step-level decompositions). At test time, the agent retrieves both relevant skills and experiences from curated libraries and performs lightweight test-time adaptation to align the language model's intermediate reasoning with clinically valid logic. Concretely, we build (i) a skills library from guideline-style documents organized as executable decision rules, (ii)…
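
The two-library retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine scorer, the function names (`bow`, `cosine`, `retrieve`, `build_prompt`), and the tiny `SKILLS` and `EXPERIENCES` lists are all hypothetical placeholders chosen for illustration, and the test-time adaptation step is omitted entirely. It only shows the general idea of pulling one decision rule and one solved case for a question and placing both in the model's context.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector; punctuation is dropped."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, library, k=1):
    """Return the k library entries most similar to the query."""
    q = bow(query)
    ranked = sorted(library, key=lambda item: cosine(q, bow(item)), reverse=True)
    return ranked[:k]


# Hypothetical toy libraries (assumption): placeholders only, not clinical guidance
# and not the curated libraries described in the paper.
SKILLS = [
    "if potassium is high then obtain ecg and give calcium gluconate",
    "if fever and neutropenia then start broad spectrum antibiotics",
]
EXPERIENCES = [
    "case: patient with high potassium; step 1 check ecg; step 2 stabilize membrane",
    "case: patient with fever after chemotherapy; step 1 check neutrophil count",
]


def build_prompt(question):
    """Assemble a prompt from one retrieved skill and one retrieved experience."""
    skills = retrieve(question, SKILLS)
    experiences = retrieve(question, EXPERIENCES)
    return "\n".join(["Skills:", *skills, "Experience:", *experiences, "Question:", question])


if __name__ == "__main__":
    print(build_prompt("patient with high potassium and abnormal ecg"))
```

A real system would presumably replace the word-overlap scorer with a learned dense retriever and retrieve at each reasoning step rather than once per question; the sketch keeps a single retrieval call per library so that the data flow stays visible.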

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare