Meta-Harness: End-to-End Optimization of Model Harnesses

Yoonho Lee; Roshen Nair; Qizheng Zhang; Kangwook Lee; Omar Khattab; Chelsea Finn

arXiv:2603.28052·cs.AI·March 31, 2026

TL;DR

Meta-Harness automates the design of harness code for large language model systems, improving over hand-built harnesses on text classification, math reasoning, and agentic coding tasks.

Contribution

The paper introduces Meta-Harness, an outer-loop system that searches over harness code for LLM applications and outperforms hand-engineered baselines.

Findings

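01 Improves online text classification by 7.7 points over a state-of-the-art context management system while using 4x fewer context tokens.

02 Improves accuracy on 200 IMO-level problems in retrieval-augmented math reasoning by 4.7 points on average.

03 Surpasses hand-engineered baselines in agentic coding tasks.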

Abstract

The performance of large language model (LLM) systems depends not only on model weights, but also on their harness: the code that determines what information to store, retrieve, and present to the model. Yet harnesses are still designed largely by hand, and existing text optimizers are poorly matched to this setting because they compress feedback too aggressively. We introduce Meta-Harness, an outer-loop system that searches over harness code for LLM applications. It uses an agentic proposer that accesses the source code, scores, and execution traces of all prior candidates through a filesystem. On online text classification, Meta-Harness improves over a state-of-the-art context management system by 7.7 points while using 4x fewer context tokens. On retrieval-augmented math reasoning, a single discovered harness improves accuracy on 200 IMO-level problems by 4.7 points on average across…
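The outer loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the directory layout, the file names (`harness.py`, `score.json`, `trace.log`), and every function name below are hypothetical, and the proposer is a toy hand-written rule standing in for the agentic LLM proposer. The sketch only shows the general shape: each candidate's source code, score, and execution trace are written to the filesystem, and the proposer reads all prior candidates from there before writing the next one.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def save_candidate(root: Path, idx: int, source: str, score: float, trace: str) -> Path:
    """Store one candidate's source, score, and trace (hypothetical layout)."""
    cand = root / f"candidate_{idx:03d}"
    cand.mkdir(parents=True)
    (cand / "harness.py").write_text(source)
    (cand / "score.json").write_text(json.dumps({"score": score}))
    (cand / "trace.log").write_text(trace)
    return cand


def load_history(root: Path) -> list[dict]:
    """Read every prior candidate back from the filesystem, in creation order."""
    history = []
    for cand in sorted(root.glob("candidate_*")):
        history.append({
            "name": cand.name,
            "source": (cand / "harness.py").read_text(),
            "score": json.loads((cand / "score.json").read_text())["score"],
            "trace": (cand / "trace.log").read_text(),
        })
    return history


def run_search(
    root: Path,
    propose: Callable[[Path], str],
    evaluate: Callable[[str], tuple[float, str]],
    seed_source: str,
    steps: int,
) -> dict:
    """Outer loop: evaluate a seed harness, then repeatedly propose and evaluate."""
    score, trace = evaluate(seed_source)
    save_candidate(root, 0, seed_source, score, trace)
    for idx in range(1, steps + 1):
        source = propose(root)  # the proposer only receives the filesystem path
        score, trace = evaluate(source)
        save_candidate(root, idx, source, score, trace)
    return max(load_history(root), key=lambda c: c["score"])


def toy_evaluate(source: str) -> tuple[float, str]:
    """Toy stand-in for a benchmark: the score peaks when CONTEXT_SIZE == 4."""
    namespace: dict = {}
    exec(source, namespace)  # run the candidate harness code
    k = namespace["CONTEXT_SIZE"]
    score = -abs(k - 4)
    return score, f"context_size={k} score={score}"


def toy_propose(root: Path) -> str:
    """Toy stand-in for the LLM proposer: bump the best candidate's setting by one."""
    best = max(load_history(root), key=lambda c: c["score"])
    namespace: dict = {}
    exec(best["source"], namespace)
    return f"CONTEXT_SIZE = {namespace['CONTEXT_SIZE'] + 1}\n"


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    best = run_search(root, toy_propose, toy_evaluate, "CONTEXT_SIZE = 1\n", steps=5)
    print(best["name"], best["score"])
```

Keeping the full history on disk, rather than passing the proposer a compressed summary, mirrors the abstract's point that existing text optimizers compress feedback too aggressively. In a real system the toy rule would be replaced by an agent that chooses which files to read.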

Code & Models

Models

🤗 dkhanal/meta-harness (model, ♡ 1)
