GenAI at the Edge: Comprehensive Survey on Empowering Edge Devices
Mozhgan Navardi, Romina Aalishah, Yuzhe Fu, Yueqian Lin, Hai Li, Yiran Chen, Tinoosh Mohsenin

TL;DR
This survey reviews recent techniques for deploying Generative AI models on resource-limited edge devices, addressing challenges such as model size and computational demands to enable real-world applications.
Contribution
It categorizes and summarizes software, hardware, and framework optimization methods for efficient GenAI deployment on edge devices, providing a comprehensive roadmap.
Findings
Identifies key challenges in edge deployment of GenAI.
Summarizes recent optimization techniques across categories.
Provides a roadmap for practical implementation.
Abstract
Generative Artificial Intelligence (GenAI) applies models and algorithms such as Large Language Models (LLMs) and Foundation Models (FMs) to generate new data. GenAI enables advanced capabilities in various applications, including text generation and image processing. In current practice, GenAI algorithms run mainly on cloud servers, which leads to high latency and raises security concerns. These challenges motivate deploying GenAI algorithms directly on edge devices. However, the large size of such models and their significant computational resource requirements pose obstacles to deployment on resource-constrained systems. This survey provides a comprehensive overview of recently proposed techniques that optimize GenAI for efficient deployment on resource-constrained edge devices. To this end, this work highlights three main categories…
Taxonomy
Topics: Advanced Semiconductor Detectors and Materials · IoT and Edge/Fog Computing
