HARBOR: Automated Harness Optimization
Biswa Sengupta, Jinhua Wang

TL;DR
This paper introduces HARBOR, a Bayesian optimization method that automatically tunes the harness wrapped around a long-horizon language-model agent, replacing manual flag-by-flag configuration.
Contribution
It formalizes harness optimization as a constrained noisy Bayesian optimization problem and provides a reference solver, HARBOR, applicable across various agent configurations.
Findings
HARBOR outperforms manual tuning in a controlled case study.
The formulation is task-class agnostic and adaptable to different agent harnesses.
The authors argue that automated configuration search dominates manual stacking once the flag space exceeds a handful of bits.
Abstract
Long-horizon language-model agents are dominated, in lines of code and in operational complexity, not by their underlying model but by the harness that wraps it: context compaction, tool caching, semantic memory, trajectory reuse, speculative tool prediction, and the glue that binds the model to a sandboxed execution environment. We argue that harness design is a first-class machine-learning problem and that automated configuration search dominates manual stacking once the flag space exceeds a handful of bits. We defend this claim in two steps. First, we formalize automated harness optimization as constrained noisy Bayesian optimization over a mixed-variable, cost-heterogeneous configuration space with cold-start-corrected rewards and a posterior chance-constrained safety check, and give a reference solver, HARBOR (Harness Axis-aligned Regularized Bayesian Optimization Routine), built…
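To make the "posterior chance-constrained safety check" in the abstract concrete, the following is a minimal sketch and not the HARBOR implementation. It assumes that a surrogate model supplies a Gaussian posterior (mean and standard deviation) for both the reward and a constrained metric at each candidate configuration. A candidate passes the check only if the posterior probability that its constrained metric stays within a limit is at least 1 - delta. The names `Posterior`, `Candidate`, `is_probably_safe`, and `pick_next`, the flag names, and the upper-confidence-bound ranking are all illustrative assumptions; the abstract does not specify HARBOR's acquisition function.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Posterior:
    """Assumed Gaussian posterior over a scalar metric at one configuration."""
    mean: float
    std: float


@dataclass(frozen=True)
class Candidate:
    """A harness configuration with hypothetical flags and surrogate beliefs."""
    flags: Dict[str, bool]
    reward: Posterior
    cost: Posterior  # the constrained metric, e.g. cost per episode (assumed)


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def is_probably_safe(cost: Posterior, limit: float, delta: float = 0.05) -> bool:
    """Chance constraint: require P(cost <= limit) >= 1 - delta."""
    if cost.std <= 0.0:
        # Degenerate posterior: the metric is treated as known exactly.
        return cost.mean <= limit
    prob_ok = normal_cdf((limit - cost.mean) / cost.std)
    return prob_ok >= 1.0 - delta


def pick_next(candidates: List[Candidate], limit: float,
              delta: float = 0.05, beta: float = 2.0) -> Optional[Candidate]:
    """Return the safe candidate with the highest upper confidence bound.

    The UCB ranking (mean + beta * std) is a stand-in acquisition rule,
    chosen here only for illustration.
    """
    safe = [c for c in candidates if is_probably_safe(c.cost, limit, delta)]
    if not safe:
        return None
    return max(safe, key=lambda c: c.reward.mean + beta * c.reward.std)
```

In this sketch, a candidate with a high predicted reward is still rejected when its constrained metric is too likely to exceed the limit, which is the behavior a chance constraint is meant to enforce under noisy evaluations.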
