Edge Large AI Models: Collaborative Deployment and IoT Applications
Zixin Wang, Yuanming Shi, and Khaled B. Letaief

TL;DR
This paper presents a framework for deploying large AI models at the network edge, enabling real-time IoT applications through collaborative training and modular inference to optimize resource use and reduce latency.
Contribution
It introduces a novel collaborative deployment framework for edge large AI models, including adaptive training and microservice-based inference, tailored for resource-constrained IoT environments.
Findings
Reduces communication and computation overhead during model fine-tuning.
Improves resource utilization and inference latency through modular architecture.
Enables diverse IoT applications with context-aware generative tasks.
Abstract
Large artificial intelligence models (LAMs) emulate human-like problem-solving capabilities across diverse domains, modalities, and tasks. By leveraging the communication and computation resources of geographically distributed edge devices, edge LAMs enable real-time intelligent services at the network edge. Unlike conventional edge AI, which relies on small or moderate-sized models for direct feature-to-prediction mappings, edge LAMs leverage the intricate coordination of modular components to enable context-aware generative tasks and multi-modal inference. We propose a collaborative deployment framework for edge LAMs by characterizing the LAM intelligent capabilities and limited edge network resources. Specifically, we propose a collaborative training framework over heterogeneous edge networks that adaptively decomposes LAMs according to computation resources, data modalities,…
Taxonomy
Topics: IoT and Edge/Fog Computing
