Harnesses for Inference-Time Alignment over Execution Trajectories

Boyuan Wang; Bochao Li; Minghan Wang; Yuxin Tao; Fang Kong

arXiv:2605.21516·cs.LG·May 22, 2026

TL;DR

This paper analyzes harness engineering for LLM agents at inference time, focusing on how task decomposition and guided execution affect performance. It shows that more elaborate harnesses are not always better and proposes partial harnesses that can achieve higher success rates.

Contribution

The paper introduces an inference-time trajectory alignment perspective that quantifies the effects of harness design, identifies failure modes such as over-decomposition and over-pruning, and demonstrates the effectiveness of partial harnesses.

Findings

01. Over-decomposition and over-pruning can reduce success.

02. Guided execution reshapes local action distributions.

03. Partial harnesses can outperform fully structured workflows.
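
The first finding can be illustrated with a toy calculation. The sketch below is not the paper's model. It assumes that sub-goals succeed independently, that every sub-goal gets the same retry budget, and that the per-attempt success rate follows a made-up saturating curve (`toy_per_attempt_prob`, with hypothetical parameters `p_single` and `p_cap`). Under those assumptions, splitting a task helps at first and then hurts, because each added sub-goal is one more place where the chain can fail.

```python
def subgoal_success(p_attempt, retries):
    """Probability that a sub-goal succeeds within `retries` independent attempts."""
    return 1.0 - (1.0 - p_attempt) ** retries


def workflow_success(per_attempt_probs, retries):
    """Probability that every sub-goal in a sequential workflow succeeds."""
    total = 1.0
    for p in per_attempt_probs:
        total *= subgoal_success(p, retries)
    return total


def toy_per_attempt_prob(k, p_single=0.3, p_cap=0.9):
    """Hypothetical curve: finer decomposition makes each step easier,
    but the per-step success rate saturates at p_cap."""
    return p_cap - (p_cap - p_single) / k


def best_granularity(max_k, retries):
    """Number of sub-goals (1..max_k) that maximizes success in the toy model."""
    def score(k):
        return workflow_success([toy_per_attempt_prob(k)] * k, retries)
    return max(range(1, max_k + 1), key=score)


if __name__ == "__main__":
    for retries in (1, 2):
        for k in range(1, 9):
            p = toy_per_attempt_prob(k)
            print(retries, k, round(workflow_success([p] * k, retries), 4))
        print("best k for retries =", retries, "is", best_granularity(8, retries))
```

In this toy setup, success peaks at a small number of sub-goals and then declines as the workflow is split further, and a larger retry budget shifts the peak toward finer splits. The specific numbers depend entirely on the assumed curve.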

Abstract

Harness engineering has emerged as an important inference-time technique for large language model (LLM) agents, aiming to improve long-term performance through task decomposition and guided execution. However, more elaborate harnesses are not uniformly better: increasing decomposition or guidance can sometimes improve execution, but can also reduce final task success. We study harness design through the lens of inference-time trajectory alignment. This perspective separates a harness into two mechanisms: task decomposition, which structures a task into sub-goals, and guided execution, which reshapes local action distributions during execution. This separation allows us to quantify how workflow granularity, retry budgets, and guidance-induced action reweighting shape the performance limits of harness design. It further reveals concrete failure modes, including over-decomposition,…
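
The abstract describes guided execution as reshaping local action distributions. One common way to write such a reshaping is exponential tilting of the agent's base distribution, and the sketch below assumes that form. The abstract does not state the paper's actual reweighting rule, so the function `reweight` and the parameters `guidance_scores` and `beta` are hypothetical names used only for illustration.

```python
import math


def reweight(action_probs, guidance_scores, beta=1.0):
    """Tilt a base action distribution toward actions the guidance favors.

    action_probs: dict mapping action -> base probability (sums to 1).
    guidance_scores: dict mapping action -> finite score; missing actions get 0.
    beta: guidance strength; beta = 0 leaves the distribution unchanged.
    """
    weights = {
        action: prob * math.exp(beta * guidance_scores.get(action, 0.0))
        for action, prob in action_probs.items()
    }
    normalizer = sum(weights.values())
    return {action: w / normalizer for action, w in weights.items()}


if __name__ == "__main__":
    base = {"run_tests": 0.2, "edit_file": 0.5, "give_up": 0.3}
    scores = {"run_tests": 1.0, "give_up": -2.0}  # assumed guidance signal
    for beta in (0.0, 1.0, 3.0):
        print(beta, reweight(base, scores, beta))
```

Raising `beta` concentrates probability on the favored actions. If the guidance scores are wrong, the same mechanism pushes probability away from the action the task actually needs, which is one way stronger guidance could lower final success.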
