Fine-Grained Energy Prediction For Parallelized LLM Inference With PIE-P
Anurag Dutt, Young Won Choi, Avirup Sil, Anshul Gandhi, Aruna Balasubramanian, Niranjan Balasubramanian

TL;DR
This paper introduces PIE-P, a novel fine-grained energy prediction framework for multi-GPU LLM inference that accounts for parallelism complexities, enabling accurate energy estimation where hardware monitors are impractical.
Contribution
PIE-P is the first framework to accurately predict energy consumption for parallelized LLM inference across multiple GPUs, addressing communication and synchronization complexities.
Findings
PIE-P significantly outperforms existing prediction methods.
It provides accurate energy estimates across various parallelism strategies.
The framework effectively handles non-determinism and communication overheads.
Abstract
With the widespread adoption of Large Language Models (LLMs), the energy cost of running LLMs is quickly becoming a critical concern. However, precisely measuring the energy consumption of LLMs is often infeasible because hardware-based power monitors are not always accessible and software-based energy measurement tools are not accurate. While various prediction techniques have been developed to estimate LLM energy consumption, these approaches are limited to single-GPU environments and thus are not applicable to modern LLM inference, which is typically parallelized across multiple GPUs. In this work, we remedy this gap and introduce PIE-P, a fine-grained energy prediction framework for multi-GPU inference, including tensor, pipeline, and data parallelism. Predicting the energy under parallelized inference is complicated by the non-determinism in inter-GPU communication, additional…
Taxonomy
Topics: Big Data and Digital Economy · Machine Learning in Materials Science · Advanced Neural Network Applications
