Towards Resource-Efficient LLMs: End-to-End Energy Accounting of Distillation Pipelines
Katherine Lambert, Sasha Luccioni

TL;DR
This paper introduces an energy accounting framework for large language model distillation pipelines that measures full end-to-end energy costs, including teacher-side workloads, to support more resource-efficient AI development.
Contribution
It presents a detailed energy measurement method for distillation, revealing hidden costs and providing guidelines for energy-efficient model training.
Findings
Energy costs vary significantly across distillation phases.
Synthetic-data fine-tuning can be more energy-efficient than traditional methods.
Open-source tools enable standardized, reproducible energy accounting.
Abstract
The rise in deployment of large language models has driven a surge in GPU demand and datacenter scaling, raising concerns about electricity use, grid stress, and the impacts of modern AI workloads. Distillation is often promoted as one of the most effective paths to obtain cheaper, more efficient models, yet these claims rarely account for the full end-to-end energy and resource costs, including crucial teacher-side workloads such as data generation, logit caching, and evaluation. We present a comprehensive energy accounting framework that measures the complete computational cost of distillation pipelines via detailed stage-wise tracking of GPU device power consumption. In our experiments, we separate and log empirical energy use across distinct phases and systematically measure the energy and emissions of two common distillation methods: the classic logit-based knowledge distillation…
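The stage-wise tracking described in the abstract can be illustrated with a minimal sketch. This is not the authors' tooling. It assumes GPU power has already been sampled as (timestamp in seconds, power in watts, stage label) tuples, and it integrates those samples into per-stage energy with the trapezoidal rule. The function name `stage_energy_wh`, the sample layout, and the choice to attribute each interval to the stage of its starting sample are all assumptions made for illustration. The stage labels are borrowed from the teacher-side workloads the abstract names.

```python
from collections import defaultdict


def stage_energy_wh(samples):
    """Integrate power samples into per-stage energy in watt-hours.

    samples: list of (t_seconds, power_watts, stage) tuples sorted by time.
    Each interval between consecutive samples is attributed to the stage
    of its starting sample (an illustrative convention, not the paper's).
    """
    joules = defaultdict(float)
    for (t0, p0, stage), (t1, p1, _next_stage) in zip(samples, samples[1:]):
        # Trapezoidal rule: average power over the interval times its duration.
        joules[stage] += 0.5 * (p0 + p1) * (t1 - t0)
    # 1 Wh = 3600 J
    return {stage: j / 3600.0 for stage, j in joules.items()}


if __name__ == "__main__":
    # Hypothetical readings, not measurements from the paper.
    readings = [
        (0.0, 100.0, "data_generation"),
        (10.0, 100.0, "data_generation"),
        (20.0, 200.0, "logit_caching"),
        (30.0, 200.0, "logit_caching"),
    ]
    for stage, wh in stage_energy_wh(readings).items():
        print(f"{stage}: {wh:.4f} Wh")
```

Summing the per-stage values gives the end-to-end figure. Keeping the stages separate is what makes teacher-side costs visible rather than folded into a single training total.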
