A Review on Edge Large Language Models: Design, Execution, and Applications
Yue Zheng, Yuhao Chen, Bin Qian, Xiufang Shi, Yuanchao Shu, Jiming Chen

TL;DR
This survey reviews recent advancements in deploying large language models on resource-limited edge devices, focusing on design, optimization, and applications to bridge the gap between LLM capabilities and edge constraints.
Contribution
It provides a comprehensive overview of techniques for resource-efficient edge LLM deployment, covering design, optimization, and application domains, and identifies future research directions.
Findings
Summarizes state-of-the-art edge LLM techniques
Highlights challenges in resource constraints and hardware heterogeneity
Suggests future research directions for edge LLMs
Abstract
Large language models (LLMs) have revolutionized natural language processing with their exceptional understanding, synthesizing, and reasoning capabilities. However, deploying LLMs on resource-constrained edge devices presents significant challenges due to computational limitations, memory constraints, and edge hardware heterogeneity. This survey provides a comprehensive overview of recent advancements in edge LLMs, covering the entire lifecycle: from resource-efficient model design and pre-deployment strategies to runtime inference optimizations. It also explores on-device applications across various domains. By synthesizing state-of-the-art techniques and identifying future research directions, this survey bridges the gap between the immense potential of LLMs and the constraints of edge computing.
Taxonomy
Topics: IoT and Edge/Fog Computing · Recommender Systems and Techniques · Technology and Data Analysis
