Batching for Green AI -- An Exploratory Study on Inference
Tim Yarally, Luís Cruz, Daniel Feitosa, June Sallou, Arie van Deursen

TL;DR
This study examines how input batching during inference affects the energy consumption and response times of five fully trained computer-vision neural networks. It finds a significant effect on both metrics and questions the rapid rise in energy use relative to the gains in accuracy.
Contribution
It provides an empirical analysis of batching effects on energy and response times during inference and discusses the evolution of neural network efficiency over the last decade.
Findings
Batching significantly reduces energy consumption and response times during inference.
Energy consumption has increased faster than accuracy over the past decade.
ShuffleNetV2 offers a good balance of performance and low energy use.
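To make the idea of input batching at inference time concrete, here is a minimal sketch in plain Python. It is not the authors' experimental setup. `toy_model`, `make_batches`, `run_batched`, and `energy_joules` are hypothetical names introduced for illustration, and estimating energy as average power multiplied by elapsed time is a simplifying assumption; a real study would read energy from a hardware or software power meter.

```python
import time
from typing import Callable, Iterator, List, Sequence, Tuple


def make_batches(items: Sequence[float], batch_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def toy_model(batch: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a trained network: one output per input."""
    return [x * 2 for x in batch]


def run_batched(
    model: Callable[[Sequence[float]], List[float]],
    inputs: Sequence[float],
    batch_size: int,
) -> Tuple[List[float], float]:
    """Run `model` over `inputs` in batches; return all outputs and the wall-clock time."""
    outputs: List[float] = []
    start = time.perf_counter()
    for batch in make_batches(inputs, batch_size):
        outputs.extend(model(batch))
    elapsed = time.perf_counter() - start
    return outputs, elapsed


def energy_joules(avg_power_watts: float, elapsed_seconds: float) -> float:
    """Assumed estimate: energy (J) = average power (W) * elapsed time (s)."""
    return avg_power_watts * elapsed_seconds
```

Calling `run_batched(toy_model, inputs, batch_size)` for several values of `batch_size` and comparing the elapsed times (and the derived energy estimates) mirrors, in miniature, the kind of comparison the study describes. With a toy model the differences are not meaningful; the sketch only shows the shape of such an experiment.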
Abstract
The batch size is an essential parameter to tune during the development of new neural networks. Amongst other quality indicators, it has a large degree of influence on the model's accuracy, generalisability, training times and parallelisability. This fact is generally known and commonly studied. However, during the application phase of a deep learning model, when the model is utilised by an end-user for inference, we find that there is a disregard for the potential benefits of introducing a batch size. In this study, we examine the effect of input batching on the energy consumption and response times of five fully-trained neural networks for computer vision that were considered state-of-the-art at the time of their publication. The results suggest that batching has a significant effect on both of these metrics. Furthermore, we present a timeline of the energy efficiency and accuracy of…
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Advanced Neural Network Applications · Traffic Prediction and Management Techniques
