MMoA: An AI-Agent framework with recurrence for Memoried Mixture-of-Agent
Rui Chu

TL;DR
MMoA introduces a recurrent, context-aware mixture-of-agents framework with LSTM gating that enhances efficiency and maintains accuracy in large language model aggregation.
Contribution
It proposes MMoA, a novel recurrent architecture with adaptive agent routing, improving efficiency without sacrificing performance in multi-agent LLM systems.
Findings
Achieves accuracy comparable to traditional MoA on AlpacaEval 2.0, MT-Bench, and Arena-Hard.
Reduces computational overhead by dynamically activating fewer agents.
Improves runtime efficiency by up to 4.6%.
Abstract
The Mixture-of-Agents (MoA) framework has shown promise in improving large language model (LLM) performance by aggregating outputs from multiple agents. However, existing MoA systems often rely on static routers that do not fully capture temporal and contextual dependencies across aggregation layers. To address this limitation, we propose MMoA, a recurrent MoA architecture that integrates LSTM-based gating into the agent selection process. The recurrent router adaptively modulates agent contributions based on both the current input and historical routing decisions, enabling more context-aware aggregation. We evaluate MMoA on standard instruction-following benchmarks, including AlpacaEval 2.0, MT-Bench, and Arena-Hard. The results show that MMoA achieves accuracy comparable to traditional MoA while reducing computational overhead by dynamically activating fewer agents. For example, on…
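To make the routing idea in the abstract concrete, here is a minimal sketch of an LSTM-gated router that is stepped once per aggregation layer. It is not the authors' implementation: the class name `RecurrentRouter`, the feature vectors, the random untrained weights, the softmax over agents, and the `threshold` activation rule are all assumptions chosen for illustration. It only shows how a hidden state carried between layers lets earlier routing context influence which agents are activated next.

```python
import math
import random


def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def _matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class RecurrentRouter:
    """Toy LSTM-gated router (hypothetical design, untrained random weights)."""

    def __init__(self, input_dim, hidden_dim, num_agents, threshold=0.2, seed=0):
        rng = random.Random(seed)
        concat = input_dim + hidden_dim

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        # One weight matrix per LSTM gate: input (i), forget (f), output (o), candidate (g).
        self.w = {gate: init(hidden_dim, concat) for gate in "ifog"}
        # Projection from the hidden state to one logit per agent.
        self.w_out = init(num_agents, hidden_dim)
        self.threshold = threshold  # assumed activation rule, not from the paper
        self.h = [0.0] * hidden_dim  # hidden state: summary of past routing context
        self.c = [0.0] * hidden_dim  # cell state

    def step(self, x):
        """Consume the current layer's features; return (agent weights, active agent ids)."""
        z = list(x) + self.h
        i = [_sigmoid(v) for v in _matvec(self.w["i"], z)]
        f = [_sigmoid(v) for v in _matvec(self.w["f"], z)]
        o = [_sigmoid(v) for v in _matvec(self.w["o"], z)]
        g = [math.tanh(v) for v in _matvec(self.w["g"], z)]
        # Standard LSTM update: keep part of the old cell state, add gated new content.
        self.c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, self.c, i, g)]
        self.h = [ok * math.tanh(ck) for ok, ck in zip(o, self.c)]
        # Numerically stable softmax over agent logits.
        logits = _matvec(self.w_out, self.h)
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Only agents whose weight clears the threshold are queried at this layer.
        active = [k for k, w in enumerate(weights) if w >= self.threshold]
        if not active:  # always keep at least the highest-weighted agent
            active = [max(range(len(weights)), key=weights.__getitem__)]
        return weights, active


# Usage: two aggregation layers with made-up 3-dimensional feature vectors.
router = RecurrentRouter(input_dim=3, hidden_dim=4, num_agents=4)
for layer, features in enumerate([[0.1, 0.9, 0.0], [0.8, 0.2, 0.5]]):
    weights, active = router.step(features)
    print(f"layer {layer}: active agents {active}")
```

Because `h` and `c` persist between calls to `step`, the second layer's routing depends on what the router saw at the first layer, which is the contrast with a static router that scores each layer in isolation. In a trained system the weights would be learned and the features would come from the prompt and intermediate agent outputs; both are placeholders here.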
