SAT: Sequential Agent Tuning for Coordinator Free Plug and Play Multi-LLM Training with Monotonic Improvement Guarantees

Yi Xie; Yangyang Xu; Yi Fan; Bo Liu

arXiv:2605.05216 · cs.LG · May 8, 2026

TL;DR

The paper introduces Sequential Agent Tuning (SAT), a scalable, decentralized training method for multi-LLM teams that guarantees monotonic improvement and allows plug-and-play upgrades, demonstrated on benchmark tasks.

Contribution

SAT provides a novel coordinator-free training paradigm with theoretical guarantees for monotonic improvement and plug-and-play agent upgrades in multi-LLM systems.

Findings

01. A team of three 4B agents trained with SAT outperforms larger models on benchmarks.

02. Swapping in stronger agents improves team performance by over 10%.

03. SAT ensures stable, scalable training with formal performance guarantees.

Abstract

Large language models (LLMs) with a large number of parameters achieve strong performance but are often prohibitively expensive to deploy. Recent work explores using teams of smaller, more efficient LLMs that collectively match or even outperform a single large model. However, jointly updating multiple agents introduces compounding distribution shifts, making coordination and stability during training difficult. We address this by introducing Sequential Agent Tuning (SAT), a coordinator-free training paradigm. SAT represents the team as a factorized policy and employs block-coordinate updates over agents, enabling scalable, decentralized training without a central controller. Specifically, we develop a sequence-aware, on-policy advantage estimator that conditions on the evolving team policy, coupled with per-agent KL trust regions that isolate occupancy drift. Theoretically, this…
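The block-coordinate idea in the abstract can be illustrated with a small tabular toy. The sketch below is not the paper's algorithm: it assumes each agent is a categorical policy over a handful of actions, computes advantages by exact enumeration rather than with the sequence-aware on-policy estimator, and substitutes an exponentiated-weights step with backtracking for the per-agent KL trust region. All names (`team_value`, `trust_region_step`, `sequential_sweep`, `toy_reward`) and the value `delta=0.05` are hypothetical.

```python
import itertools
import math


def team_value(policies, reward):
    """Expected reward under the factorized team policy prod_i pi_i(a_i)."""
    total = 0.0
    for joint in itertools.product(*(range(len(p)) for p in policies)):
        prob = math.prod(p[a] for p, a in zip(policies, joint))
        total += prob * reward(joint)
    return total


def agent_q_values(policies, reward, i):
    """Q_i(a): expected reward when agent i plays a and the others keep their current policies."""
    q = []
    for a_i in range(len(policies[i])):
        one_hot = [1.0 if a == a_i else 0.0 for a in range(len(policies[i]))]
        pinned = [one_hot if j == i else p for j, p in enumerate(policies)]
        q.append(team_value(pinned, reward))
    return q


def kl(p, q):
    """KL(p || q) for two categorical distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def trust_region_step(old, advantages, delta, eta=1.0):
    """Exponentiated update; halve eta until KL(new || old) <= delta."""
    while True:
        weights = [p * math.exp(eta * a) for p, a in zip(old, advantages)]
        z = sum(weights)
        new = [w / z for w in weights]
        if kl(new, old) <= delta:
            return new
        eta *= 0.5


def sequential_sweep(policies, reward, delta=0.05):
    """Update agents one at a time; each agent sees the already-updated team."""
    history = [team_value(policies, reward)]
    for i in range(len(policies)):
        q = agent_q_values(policies, reward, i)
        baseline = sum(p * v for p, v in zip(policies[i], q))
        advantages = [v - baseline for v in q]
        policies[i] = trust_region_step(policies[i], advantages, delta)
        history.append(team_value(policies, reward))
    return history


def toy_reward(joint):
    """Hypothetical coordination game: full reward only if every agent picks action 0."""
    if all(a == 0 for a in joint):
        return 1.0
    if all(a == 1 for a in joint):
        return 0.5
    return 0.0


# Three agents, three actions each, starting from uniform policies.
policies = [[1.0 / 3.0] * 3 for _ in range(3)]
values = []
for _ in range(20):
    values.extend(sequential_sweep(policies, toy_reward))
```

In this toy the team value never decreases after any single-agent update, because with the other agents held fixed the value is linear in the updated agent's policy and the exponentiated step only shifts mass toward higher-advantage actions. That is a property of this simplified setting and says nothing about how the paper's guarantee is derived for LLM policies.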

Code & Models

Repositories

Yydc/SAT-AAMAS (GitHub)
