ExecTune: Effective Steering of Black-Box LLMs with Guide Models
Vijay Lingam, Aditya Golatkar, Anwesan Pal, Ben Vo, Narayanan Sadagopan, Alessandro Achille, Jun Huan, Anoop Deoras, Stefano Soatto

TL;DR
This paper introduces ExecTune, a training method for Guide-Core Policies (GCoP) that trains the guide model to produce strategies the black-box core model can actually execute, improving accuracy while lowering inference cost.
Contribution
It proposes ExecTune, a training recipe that combines multiple techniques to improve the guide's strategies in GCoP, yielding higher accuracy and lower inference cost.
Findings
GCoP with ExecTune improves accuracy by up to 9.2% over prior methods.
Inference costs are reduced by up to 22.4% with ExecTune.
ExecTune enables models like Claude Haiku 3.5 to outperform competitors on math and code tasks.
Abstract
For large language models deployed through black-box APIs, recurring inference costs often exceed one-time training costs. This motivates composed agentic systems that amortize expensive reasoning into reusable intermediate representations. We study a broad class of such systems, termed Guide-Core Policies (GCoP), in which a guide model generates a structured strategy that is executed by a black-box core model. This abstraction subsumes base, supervised, and advisor-style approaches, which differ primarily in how the guide is trained. We formalize GCoP under a cost-sensitive utility objective and show that end-to-end performance is governed by guide-averaged executability: the probability that a strategy generated by the guide can be faithfully executed by the core. Our analysis shows that existing GCoP instantiations often fail to optimize executability under deployment constraints,…
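To make the guide-core setup and the notion of guide-averaged executability concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names, the toy guide and core, the use of final-answer correctness as a stand-in for "faithfully executed", and the linear form of the cost-sensitive utility are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def estimate_executability(
    guide: Callable[[str], str],
    core: Callable[[str, str], str],
    is_correct: Callable[[str, str], bool],
    tasks: Sequence[str],
    samples_per_task: int = 1,
) -> float:
    """Monte Carlo estimate of how often the core succeeds on guide strategies.

    Assumption: "executed faithfully" is approximated by the core's final
    answer being correct; the paper may define it differently.
    """
    successes = 0
    total = 0
    for task in tasks:
        for _ in range(samples_per_task):
            strategy = guide(task)         # guide writes a strategy
            answer = core(task, strategy)  # black-box core executes it
            successes += int(is_correct(task, answer))
            total += 1
    return successes / total if total else 0.0


def cost_sensitive_utility(accuracy: float, cost: float, cost_weight: float) -> float:
    """Hypothetical linear trade-off between accuracy and inference cost."""
    return accuracy - cost_weight * cost


# Toy stand-ins (hypothetical): the core only knows how to follow "add".
ANSWERS = {"2+3": "5", "4+1": "5", "6*7": "42"}


def toy_guide(task: str) -> str:
    return "add" if "+" in task else "multiply"


def toy_core(task: str, strategy: str) -> str:
    if strategy == "add":
        left, right = task.split("+")
        return str(int(left) + int(right))
    return ""  # the core cannot execute this strategy


def toy_is_correct(task: str, answer: str) -> bool:
    return ANSWERS[task] == answer


rate = estimate_executability(toy_guide, toy_core, toy_is_correct, list(ANSWERS))
utility = cost_sensitive_utility(accuracy=rate, cost=2.0, cost_weight=0.1)
```

In this toy run the guide emits a strategy the core cannot follow for one of the three tasks, so the estimated executability falls below 1 even though the strategy itself is sensible. That gap between a reasonable strategy and one the core can carry out is what the abstract says governs end-to-end performance.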
