M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation

Weizi Shao; Taolin Zhang; Zijie Zhou; Chen Chen; Chengyu Wang; Xiaofeng He

arXiv:2511.19969 · cs.AI · November 26, 2025

TL;DR

M$^3$Prune introduces a hierarchical graph pruning method for multi-modal multi-agent systems that reduces token overhead and improves efficiency without sacrificing performance.

Contribution

The paper presents a novel hierarchical communication graph pruning framework for multi-modal multi-agent systems, optimizing communication efficiency and task performance.

Findings

01. Outperforms existing multi-agent mRAG systems on benchmarks.

02. Reduces token consumption significantly while maintaining high performance.

03. Effective intra- and inter-modal graph sparsification improves efficiency.

Abstract

Recent advancements in multi-modal retrieval-augmented generation (mRAG), which enhance multi-modal large language models (MLLMs) with external knowledge, have demonstrated that the collective intelligence of multiple agents can significantly outperform a single model through effective communication. Despite impressive performance, existing multi-agent systems inherently incur substantial token overhead and increased computational costs, posing challenges for large-scale deployment. To address these issues, we propose a novel Multi-Modal Multi-agent hierarchical communication graph PRUNING framework, termed M$^3$Prune. Our framework eliminates redundant edges across different modalities, achieving an optimal balance between task performance and token overhead. Specifically, M$^3$Prune first applies intra-modal graph sparsification to textual and visual modalities, identifying the edges…
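As a rough illustration of what intra-modal sparsification of an agent communication graph could look like, the sketch below keeps only the highest-scoring edges inside each modality and sets cross-modal edges aside for a later stage. It is not the authors' method: the abstract is truncated before it states how edges are selected, so the edge scores, the `keep_ratio` parameter, and the function name `sparsify_intra_modal` are all assumptions made for this example.

```python
import math
from collections import defaultdict


def sparsify_intra_modal(edge_scores, modality_of, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of edges within each modality.

    edge_scores: dict mapping (src_agent, dst_agent) -> importance score
                 (hypothetical; the source does not say how edges are scored).
    modality_of: dict mapping agent name -> modality label, e.g. "text".
    Returns (kept_intra_modal_edges, untouched_inter_modal_edges).
    """
    by_modality = defaultdict(list)
    inter_modal = []
    for (src, dst), score in edge_scores.items():
        if modality_of[src] == modality_of[dst]:
            by_modality[modality_of[src]].append(((src, dst), score))
        else:
            # Cross-modal edges are not pruned in this stage.
            inter_modal.append((src, dst))

    kept = []
    for scored in by_modality.values():
        # Rank edges by score, highest first, and keep at least one per modality.
        scored.sort(key=lambda item: item[1], reverse=True)
        n_keep = max(1, math.ceil(keep_ratio * len(scored)))
        kept.extend(edge for edge, _ in scored[:n_keep])
    return kept, inter_modal


# Toy graph: three textual agents, two visual agents (all names hypothetical).
modality_of = {"t1": "text", "t2": "text", "t3": "text",
               "v1": "visual", "v2": "visual"}
edge_scores = {
    ("t1", "t2"): 0.9, ("t2", "t3"): 0.2, ("t1", "t3"): 0.6, ("t3", "t1"): 0.1,
    ("v1", "v2"): 0.8, ("v2", "v1"): 0.3,
    ("t1", "v1"): 0.5,
}
kept, inter_modal = sparsify_intra_modal(edge_scores, modality_of, keep_ratio=0.5)
```

With `keep_ratio=0.5`, the four textual edges are cut to two and the two visual edges to one, while the single text-to-visual edge passes through unchanged. Fewer edges means fewer messages exchanged between agents, which is the source of the token savings the abstract describes.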

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning