Multi-Agentic AI for Fairness-Aware and Accelerated Multi-modal Large Model Inference in Real-world Mobile Edge Networks

Haiyuan Li; Hari Madhukumar; Shuangyi Yan; Yulei Wu; Dimitra Simeonidou

arXiv:2602.07215·eess.SY·February 10, 2026

Open Access

TL;DR

This paper introduces a multi-agent AI framework for efficient, fair, and low-latency multi-modal large model inference in mobile edge networks, addressing resource heterogeneity and privacy concerns.

Contribution

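The paper proposes a cooperative multi-agent system, powered by foundation language models, that combines a long-term planning agent, a short-term prompt scheduling agent, and multiple on-node LM deployment agents to optimize prompt routing and model deployment in resource-constrained edge environments.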

Findings

01. Reduces average latency by over 80%
02. Achieves fairness with a normalized Jain index of 0.90
03. Adapts quickly without fine-tuning
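
For readers unfamiliar with the fairness metric cited in the findings, the following is a minimal sketch of Jain's fairness index and one common way to normalize it. The standard index for n values x_1..x_n is (sum x_i)^2 / (n * sum x_i^2), which ranges from 1/n (one participant receives everything) to 1 (perfectly equal shares). The excerpt does not say how the paper normalizes the index or which per-request quantity it measures, so the rescaling to [0, 1] below and the function names `jain_index` and `normalized_jain_index` are assumptions made for illustration, not the authors' definition.

```python
from typing import Sequence


def jain_index(values: Sequence[float]) -> float:
    """Standard Jain fairness index: (sum x)^2 / (n * sum x^2)."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    if sum_sq == 0:
        # All shares are zero, so they are trivially equal.
        return 1.0
    return (total * total) / (n * sum_sq)


def normalized_jain_index(values: Sequence[float]) -> float:
    """Rescale the index from [1/n, 1] to [0, 1].

    Assumption: min-max normalization; the paper may define it differently.
    """
    n = len(values)
    if n < 2:
        # With a single participant the index is always 1.
        return 1.0
    lower_bound = 1.0 / n
    return (jain_index(values) - lower_bound) / (1.0 - lower_bound)


if __name__ == "__main__":
    # Hypothetical per-modality service rates, for illustration only.
    shares = [1.0, 3.0]
    print(jain_index(shares))
    print(normalized_jain_index(shares))
```

Under this normalization, a value of 0 means one participant receives all of the measured quantity and a value of 1 means every participant receives the same amount, regardless of how many participants there are.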

Abstract

Generative AI (GenAI) has transformed applications in natural language processing and content creation, yet centralized inference remains hindered by high latency, limited customizability, and privacy concerns. Deploying large models (LMs) in mobile edge networks emerges as a promising solution. However, it also poses new challenges, including heterogeneous multi-modal LMs with diverse resource demands and inference speeds, varied prompt/output modalities that complicate orchestration, and resource-limited infrastructure ill-suited for concurrent LM execution. In response, we propose a Multi-Agentic AI framework for latency- and fairness-aware multi-modal LM inference in mobile edge networks. Our solution includes a long-term planning agent, a short-term prompt scheduling agent, and multiple on-node LM deployment agents, all powered by foundation language models. These agents…

Taxonomy

Topics: Big Data and Digital Economy · Advanced Neural Network Applications · Mobile Crowdsensing and Crowdsourcing