Understanding Multi-Agent LLM Frameworks: A Unified Benchmark and Experimental Analysis

Abdelghny Orogat; Ana Rostam; Essam Mansour

arXiv:2602.03128·cs.AI·February 4, 2026

TL;DR

This paper introduces a unified benchmark and experimental analysis of multi-agent LLM frameworks, showing that architectural choices alone significantly affect latency, throughput, accuracy, and scalability, and offering guidance for future framework design.

Contribution

The paper presents an architectural taxonomy and MAFBench, a standardized evaluation suite, which together enable systematic comparison and analysis of multi-agent LLM frameworks.

Findings

01 Framework design impacts latency by over 100x

02 Planning accuracy can decrease by up to 30%

03 Coordination success drops from over 90% to below 30%

Abstract

Multi-agent LLM frameworks are widely used to accelerate the development of agent systems powered by large language models (LLMs). These frameworks impose distinct architectural structures that govern how agents interact, store information, and coordinate tasks. However, their impact on system performance remains poorly understood. This gap is critical, as architectural choices alone can induce order-of-magnitude differences in latency and throughput, as well as substantial variation in accuracy and scalability. Addressing this challenge requires (i) jointly evaluating multiple capabilities, such as orchestration overhead, memory behavior, planning, specialization, and coordination, and (ii) conducting these evaluations under controlled, framework-level conditions to isolate architectural effects. Existing benchmarks focus on individual capabilities and lack standardized framework-level…

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Topic Modeling · Natural Language Processing Techniques