Cognitive Edge Computing: A Comprehensive Survey on Optimizing Large Models and AI Agents for Pervasive Deployment
Xubin Wang, Qing Li, Weijia Jia

TL;DR
This survey explores methods and system architectures for deploying large, reasoning-capable AI models on resource-limited edge devices, emphasizing optimization, adaptive intelligence, and evaluation protocols.
Contribution
It provides a comprehensive framework for cognitive edge computing, integrating model optimization, system design, and adaptive techniques tailored for resource-constrained environments.
Findings
Unified framework for edge AI deployment
Evaluation protocol for edge-specific metrics
Identification of remaining challenges and guidelines
Abstract
This article surveys Cognitive Edge Computing as a practical and methodical pathway for deploying reasoning-capable Large Language Models (LLMs) and autonomous AI agents on resource-constrained devices at the network edge. We present a unified, cognition-preserving framework spanning: (1) model optimization (quantization, sparsity, low-rank adaptation, distillation) aimed at retaining multi-step reasoning under tight memory/compute budgets; (2) system architecture (on-device inference, elastic offloading, cloud-edge collaboration) that trades off latency, energy, privacy, and capacity; and (3) adaptive intelligence (context compression, dynamic routing, federated personalization) that tailors computation to task difficulty and device constraints. We synthesize advances in efficient Transformer design, multimodal integration, hardware-aware compilation, privacy-preserving learning, and…
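As a concrete illustration of the first pillar (model optimization), here is a minimal sketch of symmetric per-tensor int8 quantization, one of the simplest forms of the quantization the abstract mentions. It is not taken from the survey: the function names `quantize_int8` and `dequantize_int8`, the per-tensor scale, and the sample weights are all assumptions chosen for illustration.

```python
# Illustrative sketch only: symmetric per-tensor int8 quantization.
# Function names and sample values are hypothetical, not from the survey.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The worst-case rounding error is about half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error bounded by roughly half the scale. Practical schemes typically refine this with per-channel or per-group scales and calibration data.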
Taxonomy
Topics: IoT and Edge/Fog Computing
Methods: Sparse Evolutionary Training
