Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception
Guangyuan Liu, Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Zehui Xiong, Sumei Sun, and Abbas Jamalipour

TL;DR
This paper introduces RAMSemCom, a retrieval-augmented semantic communication framework for wireless multi-agent systems that enhances efficiency and fidelity through iterative retrieval and deep reinforcement learning.
Contribution
The paper presents RAMSemCom, a framework that combines retrieval-driven semantic refinement with DRL to optimize multimodal communication in wireless multi-agent scenarios.
Findings
Improved task completion efficiency in an autonomous driving case study.
Reduced communication overhead compared to baseline methods.
Enhanced semantic fidelity through iterative retrieval.
Abstract
The rapid development of multimodal AI and Large Language Models (LLMs) has greatly enhanced real-time interaction, decision-making, and collaborative tasks. However, in wireless multi-agent scenarios, limited bandwidth poses significant challenges to exchanging semantically rich multimodal information efficiently. Traditional semantic communication methods, though effective, struggle with redundancy and loss of crucial details. To overcome these challenges, we propose a Retrieval-Augmented Multimodal Semantic Communication (RAMSemCom) framework. RAMSemCom incorporates iterative, retrieval-driven semantic refinement tailored for distributed multi-agent environments, enabling efficient exchange of critical multimodal elements through local caching and selective transmission. Our approach dynamically optimizes retrieval using deep reinforcement learning (DRL) to balance semantic fidelity…
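The abstract mentions "local caching and selective transmission" without giving details here, so the following is only a minimal sketch of that general idea, not the paper's algorithm. It assumes a sender that tracks which semantic elements the receiver already holds and, under a byte budget, sends the most task-relevant elements that are not yet cached. The names `SemanticElement`, `Sender`, `relevance`, and `budget_bytes` are hypothetical, and the greedy rule is a plain stand-in for the DRL policy that the paper uses to optimize retrieval.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SemanticElement:
    key: str          # hypothetical identifier for a scene element
    relevance: float  # hypothetical task-relevance score in [0, 1]
    size_bytes: int   # cost of transmitting this element


@dataclass
class Sender:
    # Keys the sender believes the receiver has already cached (assumption).
    receiver_cache: set = field(default_factory=set)

    def select(self, elements, budget_bytes):
        """Greedily pick the most relevant uncached elements that fit the budget.

        This greedy rule is an illustrative placeholder; it is not the
        learned DRL policy described in the paper.
        """
        chosen, used = [], 0
        for el in sorted(elements, key=lambda e: e.relevance, reverse=True):
            if el.key in self.receiver_cache:
                continue  # receiver already holds it, so skip retransmission
            if used + el.size_bytes > budget_bytes:
                continue  # does not fit in the remaining budget
            chosen.append(el)
            used += el.size_bytes
        # Assume everything sent is now cached at the receiver.
        self.receiver_cache.update(el.key for el in chosen)
        return chosen


scene = [
    SemanticElement("pedestrian", 0.9, 400),
    SemanticElement("traffic_light", 0.8, 300),
    SemanticElement("billboard", 0.1, 500),
]
sender = Sender()
first_round = sender.select(scene, budget_bytes=800)
second_round = sender.select(scene, budget_bytes=800)
print([el.key for el in first_round])
print([el.key for el in second_round])
```

In this toy run the first round sends the two high-relevance elements and the second round sends only the remaining one, since the others are treated as cached. A learned policy could instead adapt the budget and the selection to channel conditions and task feedback.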
Taxonomy
Topics: Opportunistic and Delay-Tolerant Networks · Age of Information Optimization · Multimodal Machine Learning Applications
