M$^3$Prune: Hierarchical Communication Graph Pruning for Efficient Multi-Modal Multi-Agent Retrieval-Augmented Generation
Weizi Shao, Taolin Zhang, Zijie Zhou, Chen Chen, Chengyu Wang, Xiaofeng He

TL;DR
M$^3$Prune introduces a hierarchical graph pruning method for multi-modal multi-agent systems that reduces token overhead and improves efficiency without sacrificing performance.
Contribution
The paper presents a novel hierarchical communication graph pruning framework for multi-modal multi-agent systems, optimizing communication efficiency and task performance.
Findings
Outperforms existing multi-agent mRAG systems on the evaluated benchmarks.
Reduces token consumption significantly while maintaining high performance.
Effective intra- and inter-modal graph sparsification improves efficiency.
Abstract
Recent advancements in multi-modal retrieval-augmented generation (mRAG), which enhance multi-modal large language models (MLLMs) with external knowledge, have demonstrated that the collective intelligence of multiple agents can significantly outperform a single model through effective communication. Despite impressive performance, existing multi-agent systems inherently incur substantial token overhead and increased computational costs, posing challenges for large-scale deployment. To address these issues, we propose a novel Multi-Modal Multi-agent hierarchical communication graph PRUNING framework, termed M$^3$Prune. Our framework eliminates redundant edges across different modalities, achieving an optimal balance between task performance and token overhead. Specifically, M$^3$Prune first applies intra-modal graph sparsification to textual and visual modalities, identifying the edges…
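The general idea of pruning a communication graph in two stages, first within each modality and then across modalities, can be illustrated with a minimal sketch. This is not the paper's algorithm: the edge scores, the keep ratios, the agent names, and the helpers `top_k_edges` and `hierarchical_prune` are all hypothetical, and a simple top-k rule stands in for whatever sparsification criterion the authors actually use.

```python
import math
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (sender agent, receiver agent)


def top_k_edges(scores: Dict[Edge, float], keep_ratio: float) -> Dict[Edge, float]:
    """Keep the highest-scoring fraction of edges (at least one if any exist)."""
    if not scores:
        return {}
    k = max(1, math.ceil(len(scores) * keep_ratio))
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


def hierarchical_prune(
    scores: Dict[Edge, float],
    modality: Dict[str, str],
    intra_keep: float,
    inter_keep: float,
) -> Dict[Edge, float]:
    """Prune edges inside each modality first, then prune cross-modality edges."""
    intra: Dict[str, Dict[Edge, float]] = {}
    inter: Dict[Edge, float] = {}
    for (u, v), s in scores.items():
        if modality[u] == modality[v]:
            intra.setdefault(modality[u], {})[(u, v)] = s
        else:
            inter[(u, v)] = s

    kept: Dict[Edge, float] = {}
    for group in intra.values():          # stage 1: intra-modal sparsification
        kept.update(top_k_edges(group, intra_keep))
    kept.update(top_k_edges(inter, inter_keep))  # stage 2: inter-modal sparsification
    return kept


# Hypothetical agents and importance scores, for illustration only.
modality = {"t1": "text", "t2": "text", "t3": "text", "v1": "image", "v2": "image"}
scores = {
    ("t1", "t2"): 0.9,
    ("t2", "t3"): 0.2,
    ("v1", "v2"): 0.7,
    ("t1", "v1"): 0.6,
    ("t3", "v2"): 0.1,
}
kept = hierarchical_prune(scores, modality, intra_keep=0.5, inter_keep=0.5)
```

Fewer surviving edges means fewer messages exchanged between agents, which is where the token savings in a pruned multi-agent system would come from.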
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
