Generative AI on the Edge: Architecture and Performance Evaluation
Zeinab Nezami, Maryam Hafeez, Karim Djemame, and Syed Ali Raza Zaidi

TL;DR
This paper evaluates the performance of various Large Language Models on low-cost edge devices like Raspberry Pi, demonstrating their feasibility for localized AI inference in 6G networks without cloud dependence.
Contribution
It provides a systematic architecture and performance analysis of Generative AI models on commodity edge hardware, filling a gap in practical deployment insights.
Findings
Lightweight LLMs can run effectively on Raspberry Pi with acceptable throughput.
CPU-only deployment supports edge applications with low resource usage.
Edge GenAI enables localized inference in bandwidth-constrained environments.
Abstract
6G's AI-native vision of embedding advanced intelligence in the network while bringing it closer to the user requires a systematic evaluation of Generative AI (GenAI) models on edge devices. Rapidly emerging solutions based on Open RAN (ORAN) and Network-in-a-Box strongly advocate the use of low-cost, off-the-shelf components for simpler and more efficient deployment, e.g., in provisioning rural connectivity. In this context, the conceptual architecture, hardware testbeds, and precise performance quantification of Large Language Models (LLMs) on off-the-shelf edge devices remain largely unexplored. This research investigates computationally demanding LLM inference on a single commodity Raspberry Pi serving as an edge testbed for ORAN. We investigate various LLMs, including small, medium and large models, on a Raspberry Pi 5 cluster using a lightweight Kubernetes distribution (K3s) with modular…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management
