ELANA: A Simple Energy and Latency Analyzer for LLMs

Hung-Yueh Chiang, Bokun Wang, and Diana Marculescu

arXiv:2512.09946·cs.DC·December 12, 2025

Open Access

TL;DR

ELANA is a lightweight, open-source profiling tool designed to evaluate the latency and energy consumption of large language models across various hardware platforms, aiding optimization and research.

Contribution

We introduce ELANA, a simple profiler that reports model size, KV cache size, and latency (TTFT, TPOT, TTLT) for LLMs on multi-GPU and edge GPU platforms. It supports all publicly available Hugging Face models and offers optional energy consumption logging.

Findings

01 Supports multi-GPU and edge GPU platforms.

02 Compatible with all Hugging Face models.

03 Includes energy consumption logging.
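
Energy logging of this kind is commonly done by sampling device power at intervals and integrating the samples over time. The following is a minimal sketch of that idea, assuming timestamped power readings in watts are already available. The function name `energy_joules` and its inputs are hypothetical and are not taken from ELANA's code.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Estimate energy (joules) from power samples using the trapezoidal rule.

    Hypothetical helper: timestamps are in seconds, power readings in watts.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    if len(timestamps_s) < 2:
        return 0.0  # a single sample spans no time interval

    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Average power over the interval times the interval length.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total
```

Dividing the result by the number of generated tokens gives a joules-per-token figure, which may be a convenient way to compare runs of different lengths.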

Abstract

The latency and power consumption of large language models (LLMs) are major constraints when serving them across a wide spectrum of hardware platforms, from mobile edge devices to cloud GPU clusters. Benchmarking is crucial for optimizing efficiency in both model deployment and next-generation model development. To address this need, we open-source a simple profiling tool, ELANA, for evaluating LLMs. ELANA is designed as a lightweight, academic-friendly profiler for analyzing model size, key-value (KV) cache size, prefilling latency (Time-to-first-token, TTFT), generation latency (Time-per-output-token, TPOT), and end-to-end latency (Time-to-last-token, TTLT) of LLMs on both multi-GPU and edge GPU platforms. It supports all publicly available models on Hugging Face and offers a simple command-line interface, along with optional energy consumption logging. Moreover, ELANA is…
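
The metrics named in the abstract can be illustrated with a short sketch. It assumes per-token completion timestamps have been recorded for one request, and it assumes a standard transformer that stores one key and one value tensor per layer. The names `LatencyReport`, `latency_report`, and `kv_cache_bytes` are hypothetical illustrations, not ELANA's API, and the KV cache estimate does not apply to architectures that cache state differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LatencyReport:
    ttft_s: float  # time to first token (prefill latency)
    tpot_s: float  # mean time per output token after the first
    ttlt_s: float  # time to last token (end-to-end latency)


def latency_report(request_start_s: float, token_times_s: Sequence[float]) -> LatencyReport:
    """Derive TTFT, TPOT, and TTLT from per-token completion timestamps."""
    if not token_times_s:
        raise ValueError("at least one generated token is required")
    ttft = token_times_s[0] - request_start_s
    ttlt = token_times_s[-1] - request_start_s
    decode_steps = len(token_times_s) - 1
    # TPOT averages over decode steps only; the first token belongs to prefill.
    tpot = (ttlt - ttft) / decode_steps if decode_steps > 0 else 0.0
    return LatencyReport(ttft_s=ttft, tpot_s=tpot, ttlt_s=ttlt)


def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_elem: int = 2,  # assumption: 16-bit cache entries
) -> int:
    """Estimate KV cache size: keys plus values for every layer and position."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem
```

Under these assumptions, TTLT equals TTFT plus TPOT times the number of decode steps. As a size example, `kv_cache_bytes(32, 8, 128, 1024)` evaluates to 134217728 bytes, or 128 MiB.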

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Natural Language Processing Techniques · Machine Learning in Materials Science