Sustainability Is Not Linear: Quantifying Performance, Energy, and Privacy Trade-offs in On-Device Intelligence
Eziyo Ehsani, Luca Giamattei, Ivano Malavolta, Roberto Pietrantuono

TL;DR
This paper quantifies the trade-offs between performance, energy consumption, and privacy when deploying large language models on mobile devices, and reports counter-intuitive energy behavior from its empirical measurements.
Contribution
The paper introduces a reproducible pipeline for profiling LLMs on mobile devices and reports new empirical findings on quantization, model architecture, and the model sizes best suited to energy-efficient on-device AI.
Findings
Quantization yields negligible energy savings despite memory reduction.
Mixture-of-Experts architectures offer high capacity with low energy use.
Mid-sized models such as Qwen2.5-3B strike a good balance between quality and energy efficiency.
Abstract
The migration of Large Language Models (LLMs) from cloud clusters to edge devices promises enhanced privacy and offline accessibility, but this transition encounters a harsh reality: the physical constraints of mobile batteries, thermal limits, and, most importantly, memory constraints. To navigate this landscape, we constructed a replicable and reproducible experimental pipeline to profile the complex interplay between energy consumption, latency, and quality of LLMs on mobile devices. We harness this pipeline to conduct an empirical case study on a flagship Android device, capturing granular metrics across eight LLMs ranging from 0.5B to 9B parameters without requiring root access, ensuring our findings reflect realistic user conditions. The findings highlight the trade-offs between generation quality, performance, power and resource consumption, revealing which LLMs offer the best…
