Skills on the Fly: Test-Time Adaptive Skill Synthesis for LLM Agents
Jingxing Wang, Chenyu Zhou, Zhihui Fu, Jun Wang, Weiwen Liu, Weinan Zhang, Jianghao Lin

TL;DR
This paper introduces SkillTTA, a method for dynamically synthesizing task-specific skills at test time for LLM agents, improving performance without retraining.
Contribution
It presents a novel test-time skill synthesis approach that retrieves relevant training trajectories to generate temporary, task-specific skills for LLM agents.
Findings
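Improves Pass@1 over static trajectory-to-skill synthesis on SpreadsheetBench (0.397 to 0.505) and BigCodeBench (0.517 to 0.651).
Comes within four points of a heavier memory-learning baseline's success rate on ALFWorld while producing shorter successful trajectories.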
Synthesized skills outperform raw trajectory prompts in experiments.
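To put the Pass@1 figures reported in the abstract in perspective, the short sketch below computes the absolute and relative gains over the static-skill baseline. The function name `pass_at_1_gain` is only illustrative, and the relative gains are derived here from the reported numbers rather than quoted from the paper.

```python
def pass_at_1_gain(baseline: float, adapted: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain) of adapted over baseline Pass@1."""
    absolute = adapted - baseline
    relative = absolute / baseline
    return absolute, relative


# Pass@1 values as reported in the abstract (static skills vs. task-specific skills).
reported = {
    "SpreadsheetBench": (0.397, 0.505),
    "BigCodeBench": (0.517, 0.651),
}

for name, (baseline, adapted) in reported.items():
    absolute, relative = pass_at_1_gain(baseline, adapted)
    print(f"{name}: +{absolute:.3f} absolute, +{relative:.1%} relative")
```

Under this calculation the gains are roughly 0.108 and 0.134 in absolute terms, or about 27% and 26% relative to the static baseline.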
Abstract
LLM agents benefit from reusable skills, yet test-time tasks often require guidance more specific than a static skill library can provide. We propose SkillTTA, a Test-Time Adaptive Skill Synthesis method that retrieves a small set of training trajectories relevant to the current task and synthesizes them into a temporary, task-specific textual skill. The solver model is kept fixed, so adaptation happens entirely through generated context rather than parameter updates. We evaluate the method on SpreadsheetBench, ALFWorld, and BigCodeBench. Compared with static trajectory-to-skill synthesis using GPT-5.5, task-specific skills improve SpreadsheetBench Pass@1 from 0.397 to 0.505 and BigCodeBench Pass@1 from 0.517 to 0.651. On ALFWorld, the method matches a heavier memory-learning baseline within four points of success rate while producing the shortest successful trajectories among…
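The retrieve-then-synthesize loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the bag-of-words cosine retriever is a stand-in (the abstract does not say how relevance is scored), and `Trajectory`, `retrieve`, `solve_with_temporary_skill`, and the `synthesize` and `solver` callables are hypothetical names introduced here.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    task: str   # task description from the training set
    steps: str  # logged actions and observations for that task


def _bow(text: str) -> Counter:
    """Lowercased bag-of-words counts (assumed stand-in for a real retriever)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(task: str, pool: List[Trajectory], k: int = 3) -> List[Trajectory]:
    """Return the k training trajectories whose task text is most similar."""
    query = _bow(task)
    ranked = sorted(pool, key=lambda t: _cosine(query, _bow(t.task)), reverse=True)
    return ranked[:k]


def solve_with_temporary_skill(
    task: str,
    pool: List[Trajectory],
    synthesize: Callable[[str, List[Trajectory]], str],  # hypothetical LLM call
    solver: Callable[[str], str],                        # fixed solver, no updates
    k: int = 3,
) -> str:
    """Build a task-specific skill, use it once as context, then discard it."""
    examples = retrieve(task, pool, k)
    skill = synthesize(task, examples)
    prompt = f"Skill:\n{skill}\n\nTask:\n{task}"
    return solver(prompt)
```

In this sketch the solver only ever receives a different prompt, which mirrors the abstract's point that adaptation happens through generated context rather than parameter updates. The skill is rebuilt per task and never written back to a library.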
