MMoA: An AI-Agent framework with recurrence for Memoried Mixure-of-Agent

Rui Chu

arXiv:2605.19194·cs.CL·May 20, 2026

TL;DR

MMoA introduces a recurrent, context-aware mixture-of-agents framework with LSTM gating that enhances efficiency and maintains accuracy in large language model aggregation.

Contribution

The paper proposes MMoA, a recurrent mixture-of-agents architecture with adaptive agent routing that improves efficiency without sacrificing performance in multi-agent LLM systems.

Findings

01. Achieves comparable accuracy to traditional MoA on benchmarks.

02. Reduces computational overhead by dynamically activating fewer agents.

03. Improves runtime efficiency by up to 4.6%.

Abstract

The Mixture-of-Agents (MoA) framework has shown promise in improving large language model (LLM) performance by aggregating outputs from multiple agents. However, existing MoA systems often rely on static routers that do not fully capture temporal and contextual dependencies across aggregation layers. To address this limitation, we propose MMoA, a recurrent MoA architecture that integrates LSTM-based gating into the agent selection process. The recurrent router adaptively modulates agent contributions based on both current inputs and historical routing decisions, enabling more context-aware aggregation. We evaluate MMoA on standard instruction-following benchmarks, including AlpacaEval 2.0, MT-Bench, and Arena-Hard. The results show that MMoA achieves comparable accuracy to traditional MoA while reducing computational overhead by dynamically activating fewer agents. For example, on…
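
To make the routing idea concrete, the following is a minimal sketch of an LSTM-gated router, not the paper's implementation. The abstract does not specify the router's inputs, its parameterization, or the rule for activating agents, so everything here is an assumption for illustration: the class name `LSTMRouter`, the feature vector passed to `step`, the sigmoid scoring head, and the `threshold` and `min_agents` parameters are all hypothetical. The weights are random and untrained; the point is only to show how a hidden state carried across aggregation layers lets each routing decision depend on earlier ones.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMRouter:
    """Hypothetical recurrent router: one LSTM cell plus a linear scoring head.

    Illustrative only. The paper's actual architecture is not described
    in the abstract, so names and hyperparameters here are assumptions.
    """

    def __init__(self, input_dim, hidden_dim, num_agents, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hidden_dim = hidden_dim
        # Stacked weights for the input, forget, candidate, and output gates.
        self.W = mat(4 * hidden_dim, input_dim + hidden_dim)
        self.b = [0.0] * (4 * hidden_dim)
        # Scoring head: maps the hidden state to one gate value per agent.
        self.W_out = mat(num_agents, hidden_dim)
        # Recurrent state carried from one aggregation layer to the next.
        self.h = [0.0] * hidden_dim
        self.c = [0.0] * hidden_dim

    def step(self, x, threshold=0.5, min_agents=1):
        """Update the LSTM state with features x and pick the active agents."""
        z = list(x) + self.h  # concatenate current input with previous hidden state
        pre = [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(self.W, self.b)]
        H = self.hidden_dim
        i = [sigmoid(p) for p in pre[0:H]]             # input gate
        f = [sigmoid(p) for p in pre[H:2 * H]]         # forget gate
        g = [math.tanh(p) for p in pre[2 * H:3 * H]]   # candidate cell values
        o = [sigmoid(p) for p in pre[3 * H:4 * H]]     # output gate
        self.c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, self.c, i, g)]
        self.h = [ok * math.tanh(ck) for ok, ck in zip(o, self.c)]

        # One gate value in (0, 1) per agent, computed from the updated state.
        gates = [sigmoid(sum(w * v for w, v in zip(row, self.h))) for row in self.W_out]
        # Assumed activation rule: keep agents whose gate clears the threshold,
        # falling back to the top-scoring agents so at least min_agents run.
        active = [k for k, gk in enumerate(gates) if gk >= threshold]
        if len(active) < min_agents:
            ranked = sorted(range(len(gates)), key=lambda k: gates[k], reverse=True)
            active = sorted(ranked[:min_agents])
        return gates, active


router = LSTMRouter(input_dim=4, hidden_dim=8, num_agents=3)
history = []
for layer in range(3):
    features = [0.1 * layer, 0.2, -0.3, 0.5]  # placeholder per-layer features
    gates, active = router.step(features)
    history.append((gates, active))
    print(f"layer {layer}: active agents {active}")
```

In a full system, only the agents listed in `active` would be queried at each layer, which is where the reduction in computational overhead described in the abstract would come from. A trained router would learn its weights from data rather than use random initialization.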
