Distributed Collaborative Inference System in Next-Generation Networks and Communication

Chuan Zhang; Xixi Zheng; Xiaolong Tao; Chenfei Hu; Weiting Zhang; Liehuang Zhu

arXiv:2412.12102·cs.NI·December 18, 2024

Open Access

TL;DR

This paper proposes a multi-level collaborative inference system for 6G networks that reduces latency and improves efficiency in generative AI tasks by deploying models across network layers and optimizing task offloading.

Contribution

It introduces a novel deployment and task offloading strategy combined with an early exit mechanism for efficient GAI inference in next-generation networks.

Findings

01 Reduces inference latency by up to 17%

02 Maintains high inference accuracy

03 Enhances efficiency on resource-constrained devices

Abstract

With the rapid advancement of artificial intelligence, generative artificial intelligence (GAI) has taken a leading role in transforming data processing methods. However, the high computational demands of GAI present challenges for devices with limited resources. As we move towards the sixth generation of mobile networks (6G), the higher data rates and improved energy efficiency of 6G create a need for more efficient data processing in GAI. Traditional GAI, however, shows its limitations in meeting these demands. To address these challenges, we introduce a multi-level collaborative inference system designed for next-generation networks and communication. Our proposed system features a deployment strategy that assigns models of varying sizes to devices at different network layers. Then, we design a task offloading strategy to optimise both efficiency and latency. Furthermore, a modified…
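The multi-level idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Tier` class, the confidence threshold, the latency figures, and the toy models are all hypothetical, and the exit rule (stop at the first tier whose confidence meets a threshold, otherwise offload upward) is an assumed stand-in for the paper's offloading and early exit mechanisms.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    """One network layer hosting a model (all fields are hypothetical)."""
    name: str
    latency_ms: float  # assumed cost of running here, including transfer
    model: Callable[[str], Tuple[str, float]]  # returns (answer, confidence)


def collaborative_infer(task: str, tiers: List[Tier], threshold: float = 0.8):
    """Try tiers from smallest to largest and exit early once confident."""
    if not tiers:
        raise ValueError("at least one tier is required")
    total_ms = 0.0
    for tier in tiers:
        answer, confidence = tier.model(task)
        total_ms += tier.latency_ms
        if confidence >= threshold:
            break  # early exit: no need to offload to a larger model
    # If no tier was confident, the last (largest) tier's answer is returned.
    return answer, tier.name, total_ms


def make_toy_model(prefix: str, max_len: int):
    """Toy stand-in for a model: confident only on inputs up to max_len chars."""
    def model(task: str) -> Tuple[str, float]:
        confidence = 0.9 if len(task) <= max_len else 0.4
        return f"{prefix}:{task}", confidence
    return model


# Assumed three-layer deployment: small model on device, larger ones upstream.
TIERS = [
    Tier("device", 5.0, make_toy_model("small", 10)),
    Tier("edge", 20.0, make_toy_model("medium", 30)),
    Tier("cloud", 80.0, make_toy_model("large", 10_000)),
]
```

With these toy settings, `collaborative_infer("hi", TIERS)` stops on the device tier, while a 40-character task is offloaded through all three tiers and accumulates their latencies. A real system would replace the length-based confidence with a model-derived score and account for link conditions when deciding where to offload.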

Taxonomy

Topics: Cognitive Computing and Networks