AgentSlimming: Towards Efficient and Cost-Aware Multi-Agent Systems

Yulang Chen; Haoxuan Peng; Jinyan Liu; Zichen Wen; Dongrui Liu; Linfeng Zhang

arXiv:2605.08813·cs.LG·May 12, 2026

TL;DR

AgentSlimming is a framework that compresses graph-structured multi-agent workflows, cutting token cost by pruning redundant agents or replacing them with lower-cost ones while maintaining task performance.

Contribution

The paper introduces a plug-and-play compression method for multi-agent systems that reduces token cost while preserving or improving task performance.

Findings

01 Reduced token cost by up to 78.9% with negligible performance loss.

02 Achieved a Pareto-optimal trade-off between cost and quality.

03 Code is publicly available at https://github.com/CitrusYL/AgentSlimming
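
To make the Pareto-optimality finding above concrete: a configuration is Pareto-optimal when no other configuration is at least as cheap and at least as good while being strictly better on one of the two axes. The short Python sketch below checks this for a handful of (token cost, quality) pairs. The function names and the numbers are illustrative assumptions, not values or code from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (token_cost, quality); hypothetical representation


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up numbers for illustration only, not results from the paper.
configs = [(100.0, 0.80), (40.0, 0.80), (25.0, 0.70), (60.0, 0.75)]
front = pareto_front(configs)
print(front)
```

Here the 100-token and 60-token configurations are dominated by the 40-token one, so only the other two remain on the front.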

Abstract

Large Language Model-based Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in complex tasks. However, manually designing optimal communication topologies is labor-intensive, while automated expansion methods often produce bloated structures with redundant agents, leading to excessive token consumption. To address this problem, we introduce AgentSlimming, a plug-and-play compression framework for graph-structured multi-agent workflows. Motivated by pruning and quantization in neural networks, AgentSlimming compresses workflows by first estimating the importance score of each agent with a hybrid mechanism and then removing redundant agents or replacing them with low-cost ones; each operation is validated using a baseline-anchored acceptance rule to prevent performance collapse. Experiments show that AgentSlimming reduces average token cost by up to…
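
The prune-or-replace loop described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the authors' implementation: the abstract does not specify the hybrid importance mechanism or the exact form of the acceptance rule, so importance scores are taken here as given input, and the rule is assumed to be "the candidate's score must stay within a tolerance of the original workflow's score". All names (`slim_workflow`, `toy_evaluate`, the `"large"`/`"small"` tiers, `tolerance`) are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical representation: agent name -> model tier. A removed agent is
# simply absent from the dict. Graph edges are omitted to keep the sketch short.
Workflow = Dict[str, str]


def slim_workflow(
    workflow: Workflow,
    importance: Dict[str, float],
    evaluate: Callable[[Workflow], float],
    tolerance: float = 0.01,
    cheap_model: str = "small",
) -> Workflow:
    """Greedy prune-or-replace loop with a baseline-anchored acceptance check."""
    baseline = evaluate(workflow)  # fixed anchor: score of the uncompressed workflow
    current = dict(workflow)

    def accepted(candidate: Workflow) -> bool:
        # Assumed rule: compare against the original baseline, not the previous
        # step, so small losses cannot accumulate over many edits.
        return evaluate(candidate) >= baseline - tolerance

    # Visit agents from least to most important.
    for name in sorted(workflow, key=lambda n: importance[n]):
        pruned = {k: v for k, v in current.items() if k != name}
        if pruned and accepted(pruned):
            current = pruned  # the agent was redundant; drop it
            continue
        if current[name] != cheap_model:
            swapped = {**current, name: cheap_model}
            if accepted(swapped):
                current = swapped  # keep the agent, but on the cheaper model
    return current


def toy_evaluate(wf: Workflow) -> float:
    """Made-up quality function standing in for a real benchmark run."""
    score = 0.0
    if "solver" in wf:
        score += 0.6 if wf["solver"] == "large" else 0.4
    if "verifier" in wf:
        score += 0.3
    if "planner" in wf:
        score += 0.1
    return score  # "critic" contributes nothing, so it is redundant


workflow = {"planner": "large", "solver": "large", "critic": "large", "verifier": "large"}
importance = {"critic": 0.05, "planner": 0.2, "verifier": 0.4, "solver": 0.9}
slimmed = slim_workflow(workflow, importance, toy_evaluate)
print(slimmed)
```

In this toy run the critic is pruned, the planner and verifier are moved to the cheaper tier, and the solver is left unchanged because both removing and downgrading it would push the score below the baseline tolerance.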

Code & Models

Repositories

CitrusYL/AgentSlimming (GitHub)
