The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining
Jacob Morrison, Noah A. Smith, Emma Strubell

TL;DR
This paper analyzes the environmental impact of the full development pipeline for the Olmo 3 language models, from pretraining through post-training, and shows that substantial energy and water consumption occurs beyond pretraining.
Contribution
It offers the first detailed breakdown of environmental costs across all stages of language model development, emphasizing overlooked post-training expenses.
Findings
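Post-training a reasoning model takes 17x more datacenter energy than post-training its instruction-tuned counterpart, driven by reinforcement learning rollout generation.
Development costs (experimentation, failed runs, and ablations) account for 82.2% of total compute, roughly 65% higher than the ~50% reported for pretraining-focused pipelines in prior work.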
Total estimated energy consumption is 12.3 GWh, emitting 4,251 tons of CO2 equivalent.
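The arithmetic behind two of these figures can be checked with a short sketch. It assumes that the "65% increase" is the relative change of the 82.2% development share over the ~50% baseline, and that "tons" means metric tons. The function names are hypothetical, and the average carbon intensity is only what the two reported totals imply together, not a number taken from the paper.

```python
def relative_increase(new_share: float, baseline_share: float) -> float:
    """Relative increase of new_share over baseline_share, as a fraction."""
    return (new_share - baseline_share) / baseline_share


def implied_intensity_kg_per_kwh(tons_co2e: float, energy_gwh: float) -> float:
    """Average emissions per kWh implied by total emissions and total energy.

    Assumes metric tons (1 t = 1,000 kg) and 1 GWh = 1,000,000 kWh.
    """
    return (tons_co2e * 1_000) / (energy_gwh * 1_000_000)


# Development share of compute: 82.2% versus the ~50% baseline.
dev_increase = relative_increase(82.2, 50.0)

# Reported totals: 4,251 tons CO2e from 12.3 GWh.
intensity = implied_intensity_kg_per_kwh(4251, 12.3)

print(f"Relative increase in development share: {dev_increase:.1%}")  # 64.4%
print(f"Implied average intensity: {intensity:.3f} kg CO2e/kWh")      # 0.346
```

The 64.4% result matches the "roughly 65%" quoted above under the relative-change reading. The absolute difference between the two shares is 32.2 percentage points.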
Abstract
Modern language model development extends far beyond pretraining, yet environmental reporting remains narrowly focused on the cost of training a single final model. In this work, we provide the first detailed breakdown of the environmental impact of a full model development pipeline, from pretraining through supervised fine-tuning, preference optimization, and reinforcement learning, for Olmo 3, a family of 7 billion and 32 billion parameter models in both instruction-following and reasoning variants. We find that reasoning models are 17x more expensive to post-train than their instruction-tuned counterparts in terms of datacenter energy, driven by reinforcement learning rollout generation. Development costs (including experimentation, failed runs, and ablations) account for 82.2% of total compute, a roughly 65% increase over the ~50% reported for pretraining-focused pipelines in prior…
