Good Agentic Friends Do Not Just Give Verbal Advice: They Can Update Your Weights
Wenrui Bao, Huan Wang, Jian Wang, Zhangyang Wang, Kai Wang, Yuzhang Shang

TL;DR
This paper introduces TFlow, a weight-space communication framework for multi-agent LLM systems that improves efficiency and accuracy by using transient weight perturbations instead of token-based messages.
Contribution
TFlow enables instance-level adaptation in multi-agent LLMs through low-rank weight perturbations, reducing token processing and inference time while maintaining or improving accuracy.
Findings
TFlow improves accuracy by up to 8.5 points across five benchmarks.
It reduces processed tokens by up to 32.69%.
It speeds up inference by up to 4.6×.
Abstract
Multi-agent LLM systems usually collaborate by exchanging natural-language messages. This interface is simple and interpretable, but it forces each sender's intermediate computation to be serialized into tokens and then reprocessed by the receiver, thereby increasing the generated-token cost, prefill overhead, and KV-cache memory. We study an alternative communication interface: instead of appending a sender's message to the receiver's context, compile the sender's hidden states into a transient, receiver-specific weight perturbation. We introduce TFlow (Thought Flow), a weight-space communication framework for a known and fixed receiver architecture. For each query, frozen role-prompted sender agents process the input, and a learned parameter generator maps their internal activations into low-rank LoRA perturbations targeting the receiver's modules. These perturbations are fused and…
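The core idea in the abstract, mapping a sender's hidden states to a low-rank perturbation of a receiver weight matrix, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the mean-pooling step, the linear generator weights `W_a` and `W_b`, the dimensions, and the function names are hypothetical and are not taken from the paper. The sketch covers a single sender and a single receiver layer only.

```python
import numpy as np

def generate_lora(hidden, W_a, W_b, d_in, d_out, rank):
    """Hypothetical parameter generator (not the paper's architecture).

    Mean-pools the sender's hidden states and linearly projects the result
    into LoRA factors A (rank x d_in) and B (d_out x rank).
    """
    pooled = hidden.mean(axis=0)                 # (d_model,)
    A = (W_a @ pooled).reshape(rank, d_in)
    B = (W_b @ pooled).reshape(d_out, rank)
    return A, B

def receiver_forward(x, W, A=None, B=None, scale=1.0):
    """Linear layer y = x W^T, optionally with a transient update W + scale * B A.

    The update is applied on the fly; W itself is never modified.
    """
    y = x @ W.T
    if A is not None and B is not None:
        y = y + scale * (x @ A.T) @ B.T          # low-rank path, avoids forming B A
    return y

rng = np.random.default_rng(0)
d_model, d_in, d_out, rank, seq_len = 16, 8, 8, 2, 5

# Assumed stand-ins: sender activations for one query, generator weights,
# and one frozen receiver weight matrix.
hidden = rng.normal(size=(seq_len, d_model))
W_a = rng.normal(size=(rank * d_in, d_model)) * 0.1
W_b = rng.normal(size=(d_out * rank, d_model)) * 0.1
W = rng.normal(size=(d_out, d_in))
W_before = W.copy()

A, B = generate_lora(hidden, W_a, W_b, d_in, d_out, rank)
delta = B @ A                                    # per-query perturbation, rank <= 2

x = rng.normal(size=(3, d_in))                   # receiver-side activations
y_base = receiver_forward(x, W)
y_pert = receiver_forward(x, W, A, B)
```

In this sketch the receiver's context length is unchanged: the sender's contribution enters through `delta` rather than through extra tokens, and `W` is left untouched once the query is done.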
