Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization
Xudong Wang, Chaoning Zhang, Jiaquan Zhang, Chenghao Li, Qigan Sun, Sung-Ho Bae, Peng Wang, Ning Xie, Jie Zou, Yang Yang, Hengtao Shen

TL;DR
This paper introduces AMRO-S, an efficient, interpretable routing framework for multi-agent systems that improves performance and resource utilization by combining semantic intent inference, task-specific memory, and asynchronous updates.
Contribution
AMRO-S is a routing approach that models MAS routing as semantic-aware path selection, combining low-overhead intent inference, task-specific memory, and decoupled inference and learning to address the cost, latency, and transparency limitations of LLM-based selectors and static routing policies.
Findings
Consistently outperforms baseline routing strategies in quality-cost trade-off.
Reduces inference latency by using a small language model for intent inference.
Provides traceable routing evidence via structured pheromone patterns.
Abstract
Large Language Model (LLM)-driven Multi-Agent Systems (MAS) have demonstrated strong capability in complex reasoning and tool use, and heterogeneous agent pools further broaden the quality-cost trade-off space. Despite these advances, real-world deployment is often constrained by high inference cost, latency, and limited transparency, which hinders scalable and efficient routing. Existing routing strategies typically rely on expensive LLM-based selectors or static policies, and offer limited controllability for semantic-aware routing under dynamic loads and mixed intents, often resulting in unstable performance and inefficient resource utilization. To address these limitations, we propose AMRO-S, an efficient and interpretable routing framework for MAS. AMRO-S models MAS routing as a semantic-conditioned path selection problem, enhancing routing performance…
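To make the idea of semantic-conditioned path selection concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a layered agent pool, one pheromone table per task type (a stand-in for "task-specific memory"), and a textbook ACO selection and update rule. The class `PheromoneRouter`, its parameters `alpha`, `beta`, and `rho`, and the `heuristic` dictionary are hypothetical names introduced only for illustration.

```python
import random
from collections import defaultdict


class PheromoneRouter:
    """Illustrative ACO-style router; every name here is an assumption."""

    def __init__(self, layers, alpha=1.0, beta=1.0, rho=0.1, seed=0):
        self.layers = layers          # one list of agent names per layer
        self.alpha = alpha            # weight of the pheromone term
        self.beta = beta              # weight of the heuristic term
        self.rho = rho                # evaporation rate in (0, 1)
        self.rng = random.Random(seed)
        # One table per task type: tau[task][(layer_index, agent)] -> pheromone.
        # Unseen entries start at 1.0 so the initial choice is uniform.
        self.tau = defaultdict(lambda: defaultdict(lambda: 1.0))

    def probabilities(self, task, layer_idx, heuristic):
        """Selection probabilities for one layer, conditioned on the task type."""
        agents = self.layers[layer_idx]
        weights = [
            (self.tau[task][(layer_idx, a)] ** self.alpha)
            * (heuristic.get(a, 1.0) ** self.beta)
            for a in agents
        ]
        total = sum(weights)
        return {a: w / total for a, w in zip(agents, weights)}

    def select_path(self, task, heuristic):
        """Sample one agent per layer according to the probabilities above."""
        path = []
        for i in range(len(self.layers)):
            probs = self.probabilities(task, i, heuristic)
            agents = list(probs)
            chosen = self.rng.choices(agents, weights=[probs[a] for a in agents])[0]
            path.append(chosen)
        return path

    def update(self, task, path, reward):
        """Evaporate this task's pheromones, then deposit reward along the path."""
        table = self.tau[task]
        for i, agents in enumerate(self.layers):
            for a in agents:
                table[(i, a)] *= 1.0 - self.rho
        for i, a in enumerate(path):
            table[(i, a)] += reward
```

In this sketch, `select_path` only reads the pheromone table and `update` only writes it, so the two could be run at different times; this is one simple way to picture inference being decoupled from learning. The per-task tables can also be inspected directly to see which agents a given task type has come to favor.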
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The method offers a clear and intuitive framework for dynamic routing in multi-agent systems.
2. Integrating pheromone-guided decisions improves interpretability and responsiveness.
3. The experimental setup is comprehensive, covering multiple datasets and baselines.
1. The novelty is limited. The approach transfers existing ACO techniques with minimal adaptation to the LLM domain.
2. There is no quantitative evaluation of interpretability, which is central to the paper’s motivation.
3. The method introduces many hyperparameters. Their tuning process is not well justified or automated.
4. Scalability to large agent graphs is not discussed. Real-world feasibility in latency-sensitive environments remains unclear.
S1: This is the first known application of Ant Colony Optimization to the task of LLM-based MAS routing, introducing a biologically inspired mechanism into the AI routing domain.
S2: AMRO’s pheromone-guided routing introduces visualizable decision processes, addressing the black-box limitations common in LLM routing.
S3: The probabilistic routing and pheromone update processes are rigorously formalized, which aids reproducibility.
S4: Sensitivity analyses across various parameters show AMRO’
The best-case performance gains over the strongest baseline (MasRouter) are only 0.97% on average. This is modest considering the added complexity and novelty, and may not justify the overhead in practical systems. The use of ACO in this paper is largely a standard adaptation; the methodology borrows heavily from classical ACO formulations without substantive algorithmic innovation tailored to LLM routing. While routing optimization is valuable, the paper lacks discussion on how agent specia
1. First application of ant colony optimization to LLM-based multi-agent routing presents an interesting cross-domain perspective.
2. Pheromone visualization provides some interpretability for routing decisions, improving over purely black-box LLM routing approaches.
3. Load-aware routing selection mechanism shows practical value in high-concurrency testing scenarios.
4. Detailed hyperparameter sensitivity analysis demonstrates the stability of the approach.
1. ACO is a mature algorithm, and the paper primarily adapts it to LLM routing without deep algorithmic innovation. Equations 7-10 are standard ACO variants with unclear essential differences from traditional ACO.
2. Compared to the strongest baseline, MasRouter, the average gain is only 0.97%, with a 1.7% improvement on the MATH dataset. Such marginal improvements hardly justify the complexity of introducing ACO mechanisms.
3. The system is rigidly designed as an N-layer structure with n nodes per layer.
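For readers unfamiliar with the "traditional ACO" the reviewers compare against, the classical Ant System rules are given below as background. This is the textbook formulation, not a reproduction of the paper's Equations 7-10, which are not shown here. The symbols are the standard ones: $\tau_{ij}$ is the pheromone on edge $(i, j)$, $\eta_{ij}$ is a heuristic desirability, $\alpha$ and $\beta$ weight the two terms, $\rho$ is the evaporation rate, $\mathcal{N}_i^k$ is the set of nodes ant $k$ may move to from $i$, $L_k$ is the cost of ant $k$'s path, and $Q$ is a constant.

```latex
% Transition probability: ant k at node i moves to node j
p_{ij}^{k} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}
                  {\sum_{l \in \mathcal{N}_i^{k}} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}},
\qquad j \in \mathcal{N}_i^{k}

% Pheromone update: evaporation followed by deposits from all m ants
\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}

% Deposit of ant k: inversely proportional to the cost of its path
\Delta\tau_{ij}^{k} =
\begin{cases}
  Q / L_k & \text{if ant } k \text{ used edge } (i, j), \\
  0       & \text{otherwise.}
\end{cases}
```

In a routing setting one would presumably replace the path cost $L_k$ with a quality or reward signal and let $\eta_{ij}$ encode current state such as load or latency, but how AMRO-S does this is not recoverable from the excerpts here.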
1. It proposes a practical, multi-objective dynamic routing mechanism. The essence of AMRO lies in a decentralized, feedback-based routing algorithm that skillfully combines historical performance (pheromones) with current states (load, latency) to dynamically balance multiple objectives, including quality, cost, speed, and load.
2. Using Ant Colony Optimization to optimize the MAS structure is more stable and incurs lower training costs compared to certain RL algorithms, and its effectiveness
1. The modeling and workflow of the proposed method are not clearly introduced in the main text, leading to confusion. For example, the definition of “layer” only appears in the caption of Figure 2 and is not elaborated upon in the main text. This may easily mislead readers into thinking that the “layers” refer to the central parts of Figure 2 (LLM Type, Method Type, and Role Type), which resembles the architecture of MasRouter.
2. The algorithm appears insufficient in customizing the MAS framewor
Taxonomy
Topics: Software-Defined Networks and 5G · Big Data and Digital Economy · Advanced Neural Network Applications
