Characterizing and Understanding Energy Footprint and Efficiency of Small Language Model on Edges

Md Romyull Islam; Bobin Deng; Nobel Dhar; Tu N. Nguyen; Selena He; Yong Shi; Kun Suo

arXiv:2511.11624 · cs.DC · November 18, 2025

TL;DR

This paper evaluates the energy efficiency of small language models on edge devices, analyzing their performance, power consumption, and tradeoffs to guide deployment in energy-constrained environments.

Contribution

It provides a comprehensive empirical analysis of SLMs on edge hardware, highlighting key factors affecting energy efficiency and practical deployment insights.

Findings

01 Jetson Orin Nano with GPU offers highest energy-to-performance ratio

02 Llama 3.2 balances accuracy and power efficiency effectively

03 TinyLlama is suitable for low-power scenarios with reduced accuracy

Abstract

Cloud-based large language models (LLMs) and their variants have significantly influenced real-world applications. Deploying smaller models, i.e., small language models (SLMs), on edge devices offers additional advantages, such as reduced latency and independence from network connectivity. However, the limited computing resources and constrained energy budgets of edge devices make efficient deployment challenging. This study evaluates the power efficiency of five representative SLMs, including Llama 3.2, Phi-3 Mini, TinyLlama, and Gemma 2, on Raspberry Pi 5, Jetson Nano, and Jetson Orin Nano (CPU and GPU configurations). Results show that Jetson Orin Nano with GPU acceleration achieves the highest energy-to-performance ratio, significantly outperforming CPU-based setups. Llama 3.2 provides the best balance of accuracy and power efficiency, while TinyLlama is well-suited for low-power environments at the cost…
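
The abstract compares configurations by energy efficiency, but this page does not define the metric. As a minimal sketch, assuming efficiency is measured as generated tokens per joule (with energy approximated as average power times run duration), the comparison could look like the following. The `Run` class, the `rank_by_efficiency` helper, the configuration labels, and all numbers are hypothetical placeholders, not measurements from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One inference run on one device/model configuration (hypothetical)."""
    label: str
    tokens_generated: int
    avg_power_watts: float
    duration_seconds: float

    @property
    def energy_joules(self) -> float:
        # Energy approximated as average power multiplied by elapsed time.
        return self.avg_power_watts * self.duration_seconds

    @property
    def tokens_per_second(self) -> float:
        # Throughput: how fast tokens are produced.
        return self.tokens_generated / self.duration_seconds

    @property
    def tokens_per_joule(self) -> float:
        # Assumed efficiency metric: tokens produced per unit of energy.
        return self.tokens_generated / self.energy_joules


def rank_by_efficiency(runs: List[Run]) -> List[Run]:
    """Return runs ordered from most to least energy-efficient."""
    return sorted(runs, key=lambda r: r.tokens_per_joule, reverse=True)


# Placeholder numbers chosen only to illustrate the arithmetic.
runs = [
    Run("config-a (slow, low power)", 200, avg_power_watts=5.0, duration_seconds=40.0),
    Run("config-b (fast, high power)", 200, avg_power_watts=10.0, duration_seconds=10.0),
]

for run in rank_by_efficiency(runs):
    print(f"{run.label}: {run.energy_joules:.0f} J, "
          f"{run.tokens_per_second:.1f} tok/s, {run.tokens_per_joule:.2f} tok/J")
```

In this made-up example, the configuration that draws twice the power still uses half the energy because it finishes four times sooner, which is the general kind of tradeoff an energy-versus-performance comparison is meant to expose.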

Taxonomy

Topics: Big Data and Digital Economy · IoT and Edge/Fog Computing · Advanced Neural Network Applications