Towards Resource-Efficient LLMs: End-to-End Energy Accounting of Distillation Pipelines

Katherine Lambert; Sasha Luccioni

arXiv:2605.13981·cs.LG·May 15, 2026

TL;DR

This paper introduces an energy accounting framework for LLM distillation pipelines. It measures end-to-end energy costs, including teacher-side workloads such as data generation, logit caching, and evaluation, through stage-wise tracking of GPU power consumption.

Contribution

The paper presents a stage-wise energy measurement method for distillation, revealing hidden teacher-side costs and providing guidelines for energy-efficient model training.

Findings

01

Energy costs vary significantly across distillation phases.

02

Synthetic-data fine-tuning can be more energy-efficient than traditional methods.

03

Open-source tools enable standardized, reproducible energy accounting.

Abstract

The rise in deployment of large language models has driven a surge in GPU demand and datacenter scaling, raising concerns about electricity use, grid stress, and the impacts of modern AI workloads. Distillation is often promoted as one of the most effective paths to obtain cheaper, more efficient models, yet these claims rarely account for the full end-to-end energy and resource costs, including crucial teacher-side workloads such as data generation, logit caching, and evaluation. We present a comprehensive energy accounting framework that measures the complete computational cost of distillation pipelines via detailed stage-wise tracking of GPU device power consumption. In our experiments, we separate and log empirical energy use across distinct phases and systematically measure the energy and emissions of two common distillation methods: the classic logit-based knowledge distillation…
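As a rough illustration of what stage-wise energy accounting involves, the sketch below integrates timestamped GPU power samples into energy per pipeline stage and sums them into an end-to-end total. It is not the authors' framework: the function names, the stage labels, and the power readings are all hypothetical, and the trapezoidal integration is an assumed choice rather than the method described in the paper.

```python
# Minimal sketch of stage-wise energy accounting (hypothetical, not the paper's code).
# Each stage has a list of (time_in_seconds, power_in_watts) samples, e.g. as
# collected by polling a GPU power sensor at a fixed interval.

JOULES_PER_KWH = 3.6e6


def integrate_energy_joules(samples):
    """Integrate (time_s, power_w) samples into joules with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def energy_by_stage(stage_samples):
    """Return a dict mapping each stage name to its energy in kWh."""
    return {
        stage: integrate_energy_joules(samples) / JOULES_PER_KWH
        for stage, samples in stage_samples.items()
    }


# Made-up readings for illustration only; these are not results from the paper.
stage_samples = {
    "teacher_data_generation": [(0.0, 300.0), (1800.0, 300.0), (3600.0, 300.0)],
    "student_training": [(0.0, 200.0), (3600.0, 200.0)],
    "evaluation": [(0.0, 100.0), (1800.0, 100.0)],
}

per_stage_kwh = energy_by_stage(stage_samples)
total_kwh = sum(per_stage_kwh.values())

for stage, kwh in per_stage_kwh.items():
    print(f"{stage}: {kwh:.3f} kWh ({kwh / total_kwh:.0%} of total)")
print(f"end-to-end: {total_kwh:.3f} kWh")
```

Keeping the samples separated by stage is what lets teacher-side work be reported alongside student training instead of being left out of the total.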
