Evidence Over Plans: Online Trajectory Verification for Skill Distillation
Yang Zhou, Zihan Dong, Zhenting Wang, Can Jin, Shiyu Zhao, Bangwei Guo, Difei Gu, Linjun Zhang, Mu Zhou, Dimitris N. Metaxas

TL;DR
This paper introduces SPARK, a pipeline that uses the Posterior Distillation Index (PDI) to verify trajectories online, so that distilled skills are grounded in environment interaction; the resulting skills outperform no-skill baselines and, on student models, human-written skills.
Contribution
The paper proposes PDI and SPARK for environment-grounded skill distillation, enabling online diagnostics and interventions to produce more effective skills.
Findings
SPARK-generated skills outperform no-skill baselines.
SPARK-generated skills surpass human-written skills on student models.
PDI-guided distillation yields efficient, transferable skills.
Abstract
Agent skills can markedly improve task success rates by using human-written procedural documents, but their quality is difficult to assess without environment-grounded verification. Existing skill generation methods rely heavily on preference logs rather than direct environment interaction, often yielding negligible or even degraded gains. We identify a fundamental timing bottleneck: robust skills should be posterior-based, distilled from empirical environment interaction rather than from prior plans. In this study, we introduce the Posterior Distillation Index (PDI), a trajectory-level metric that quantifies how well a distilled skill is grounded in task-environment evidence. To operationalize PDI, we present SPARK (Structured Pipelines for Autonomous Runnable tasKs and sKill generation), which preserves task execution evidence for full trajectory-level analysis. SPARK…
