# Energy and Policy Considerations for Deep Learning in NLP

**Authors:** Emma Strubell, Ananya Ganesh, Andrew McCallum

arXiv: 1906.02243 · 2019-06-07

## TL;DR

This paper highlights the significant financial and environmental costs of training large NLP models, quantifies these costs, and offers recommendations to mitigate them for more sustainable and equitable NLP research.

## Contribution

The paper quantifies the approximate financial and environmental costs of training a range of recently successful neural network models for NLP, and proposes actionable recommendations for reducing those costs and improving equity in NLP research and practice.

## Key findings

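- Recent accuracy gains in NLP depend on exceptionally large computational resources, which in turn require substantial energy.
- Training and developing these models is costly both financially (hardware, electricity, or cloud compute time) and environmentally (the carbon footprint of powering the hardware), and both costs can be approximately quantified.
- The authors propose actionable recommendations to reduce these costs and improve equity in NLP research and practice.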
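
To make "quantifiable" concrete, here is a minimal sketch of the kind of arithmetic such an estimate involves: average hardware power draw times training time gives energy, which is then converted to emissions and electricity cost. The function names are hypothetical, and the default values for datacenter overhead (`pue`), grid carbon intensity (`lbs_per_kwh`), and electricity price (`usd_per_kwh`) are illustrative assumptions, not figures taken from this summary. Substitute the values from the full paper or from your own region and provider.

```python
def training_energy_kwh(hours, cpu_watts, dram_watts, gpu_watts, num_gpus, pue=1.58):
    """Estimate total energy (kWh) for a training run.

    Sums the average power draw of CPU, DRAM, and all GPUs (in watts),
    multiplies by training time in hours, scales by an assumed datacenter
    overhead factor (PUE), and converts watt-hours to kilowatt-hours.
    """
    total_watts = cpu_watts + dram_watts + num_gpus * gpu_watts
    return pue * hours * total_watts / 1000.0


def co2e_lbs(energy_kwh, lbs_per_kwh=0.954):
    """Convert energy to pounds of CO2-equivalent using an assumed grid intensity."""
    return energy_kwh * lbs_per_kwh


def electricity_cost_usd(energy_kwh, usd_per_kwh=0.12):
    """Convert energy to an electricity bill using an assumed price per kWh."""
    return energy_kwh * usd_per_kwh


if __name__ == "__main__":
    # Hypothetical run: 8 GPUs at 250 W each, 100 W CPU, 50 W DRAM, 100 hours.
    energy = training_energy_kwh(hours=100, cpu_watts=100, dram_watts=50,
                                 gpu_watts=250, num_gpus=8)
    print(f"energy: {energy:.1f} kWh")
    print(f"emissions: {co2e_lbs(energy):.1f} lbs CO2e")
    print(f"electricity: ${electricity_cost_usd(energy):.2f}")
```

An estimate like this covers a single training run only. Hyperparameter tuning and repeated experiments multiply the total, which is why development cost can far exceed the cost of training the final model once.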

## Abstract

Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data. These models have obtained notable gains in accuracy across many NLP tasks. However, these accuracy improvements depend on the availability of exceptionally large computational resources that necessitate similarly substantial energy consumption. As a result these models are costly to train and develop, both financially, due to the cost of hardware and electricity or cloud compute time, and environmentally, due to the carbon footprint required to fuel modern tensor processing hardware. In this paper we bring this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP. Based on these findings, we propose actionable recommendations to reduce costs and improve equity in NLP research and practice.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.02243/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1906.02243/full.md

---
Source: https://tomesphere.com/paper/1906.02243