TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination

Yi Xie; Siao Liu; Falong Fan; Yuanqi Yao; Yue Zhao; Bo Liu

arXiv:2605.15207·cs.LG·May 18, 2026

TL;DR

TeamTR is a trust-region fine-tuning method for multi-agent LLM systems. It resamples trajectories after each agent update and constrains per-agent divergence, which counters the context-distribution shift that otherwise compounds across sequential updates.

Contribution

We formalize the compounding occupancy shift problem in sequential multi-agent LLM fine-tuning and propose TeamTR, a trust-region framework that enforces per-agent divergence control and improves coordination.
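To give some intuition for why stale evaluation compounds, here is a toy counting sketch. It is not the paper's bound. It assumes, purely for illustration, that each agent update shifts the team's context distribution by at most a fixed amount `delta` and that these shifts add up. The function name `total_mismatch` and both parameters are hypothetical.

```python
def total_mismatch(n_agents: int, delta: float, resample: bool) -> float:
    """Toy accounting of accumulated evaluation mismatch over one update stage.

    Assumption (illustrative only): every agent update shifts the context
    distribution by at most `delta`, and shifts accumulate additively.
    """
    total = 0.0
    for k in range(n_agents):
        if resample:
            # Fresh rollouts already reflect the k earlier updates, so only
            # the current agent's own update is unaccounted for.
            unaccounted_updates = 1
        else:
            # Cached rollouts predate all updates, so the k earlier updates
            # plus the current one are unaccounted for.
            unaccounted_updates = k + 1
        total += unaccounted_updates * delta
    return total


stale = total_mismatch(4, 0.1, resample=False)   # grows like n * (n + 1) / 2
fresh = total_mismatch(4, 0.1, resample=True)    # grows like n
```

Under these assumptions the stale total grows quadratically in the number of agents while the resampled total grows linearly, which mirrors the scaling contrast stated in the abstract.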

Findings

01 TeamTR outperforms baselines, with a 7.1% average improvement.

02 It mitigates coordination regressions in multi-agent systems.

03 It supports plug-and-play component replacement.

Abstract

Multi-agent LLM systems have shown promise for complex reasoning, yet recent evaluations reveal they often underperform single-model baselines. We identify a structural failure mode in sequential fine-tuning of shared-context teams: updating one agent shifts the team's context distribution, and when subsequent updates are evaluated on cached rollouts, this mismatch compounds. We formalize this as the compounding occupancy shift and prove that stale-occupancy evaluation incurs a penalty that scales quadratically with the number of agents. In contrast, intermediate-occupancy evaluation reduces this to linear scaling. We propose TeamTR, a trust-region framework that resamples trajectories after each component update and enforces per-agent divergence control, yielding rigorous per-update and per-stage improvement lower bounds. Experiments show that TeamTR outperforms single-agent and…
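The two mechanisms named in the abstract, resampling after each component update and per-agent divergence control, can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: each "agent" is a small categorical policy given by logits, the divergence is KL(old || new), the constraint is enforced by a simple backtracking line search, and `direction_fn` is a hypothetical stand-in for "collect fresh rollouts with the current team and estimate an update direction".

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def kl(p, q):
    """KL(p || q) for two categorical distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def trust_region_step(logits, direction, max_kl, step=1.0, shrink=0.5, max_tries=20):
    """Move along `direction`, shrinking the step until KL(old || new) <= max_kl."""
    old = softmax(logits)
    for _ in range(max_tries):
        candidate = [l + step * d for l, d in zip(logits, direction)]
        if kl(old, softmax(candidate)) <= max_kl:
            return candidate
        step *= shrink
    return list(logits)  # no acceptable step found: keep the old policy


def sequential_update(team_logits, direction_fn, max_kl):
    """Update agents one at a time, recomputing the direction from the CURRENT team."""
    team = [list(l) for l in team_logits]
    for i in range(len(team)):
        # Hypothetical hook: in a real system this would resample trajectories
        # with the already-updated teammates before estimating the update.
        direction = direction_fn(team, i)
        team[i] = trust_region_step(team[i], direction, max_kl)
    return team


# Usage with a fixed toy direction for every agent (an assumption for the demo).
before = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
after = sequential_update(before, lambda team, i: [1.0, -1.0, 0.0], max_kl=0.01)
```

The point of the design is that `direction_fn` sees the team as it stands after earlier agents have moved, rather than a snapshot cached before the stage began, and that no single agent can move further than `max_kl` from its previous policy in one update.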

Code & Models

Repositories

Yydc/TeamTR (GitHub)
