Language Models in Software Development Tasks: An Experimental Analysis of Energy and Accuracy
Negar Alizadeh, Boris Belchev, Nishant Saurabh, Patricia Kelbert, Fernando Castor

TL;DR
This paper evaluates the trade-offs between energy consumption and accuracy in 18 families of language models used for software development tasks, highlighting the efficiency of quantized models and the lack of a one-size-fits-all solution.
Contribution
It provides an empirical analysis of the performance and energy efficiency of various locally-deployed language models across different hardware setups for software development.
Findings
Larger models with higher energy budgets do not always yield better accuracy.
Quantized large models often outperform full-precision medium models in efficiency and accuracy.
No single model suits all software development tasks.
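One way to read these findings is as a Pareto trade-off: a model is worth considering only if no other model is both cheaper in energy and at least as accurate. The sketch below illustrates that selection rule under stated assumptions. The model labels, energy values, and accuracy values are invented for illustration and are not results from the paper, and `Run` and `pareto_front` are hypothetical names.

```python
from typing import NamedTuple


class Run(NamedTuple):
    model: str       # hypothetical label, not a model evaluated in the paper
    energy_j: float  # made-up energy per task, in joules
    accuracy: float  # made-up accuracy in [0, 1]


def pareto_front(runs):
    """Return the runs that no other run dominates.

    A run is dominated when another run uses no more energy and is no less
    accurate, and is strictly better on at least one of the two.
    """
    front = []
    for r in runs:
        dominated = any(
            o.energy_j <= r.energy_j
            and o.accuracy >= r.accuracy
            and (o.energy_j < r.energy_j or o.accuracy > r.accuracy)
            for o in runs
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=lambda r: r.energy_j)


# Illustrative numbers only (assumption): they mimic the pattern described
# in the findings, they are not measurements.
runs = [
    Run("small-fp16", 40.0, 0.42),
    Run("medium-fp16", 120.0, 0.55),
    Run("large-int4", 110.0, 0.63),
    Run("large-fp16", 300.0, 0.61),
]

front = pareto_front(runs)
for r in front:
    print(f"{r.model}: {r.energy_j} J, accuracy {r.accuracy}")
```

With these invented numbers, the quantized large model dominates both the full-precision medium model and the full-precision large model, while the small model stays on the front as the lowest-energy option. Which point on the front to pick still depends on the task and the energy budget.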
Abstract
The use of generative AI-based coding assistants like ChatGPT and GitHub Copilot is a reality in contemporary software development. Many of these tools are provided as remote APIs. Using third-party APIs raises data privacy and security concerns for client companies, which motivates the use of locally-deployed language models. In this study, we explore the trade-off between model accuracy and energy consumption, aiming to provide valuable insights to help developers make informed decisions when selecting a language model. We investigate the performance of 18 families of LLMs in typical software development tasks on two real-world infrastructures, a commodity GPU and a powerful AI-specific GPU. Given that deploying LLMs locally requires powerful infrastructure which might not be affordable for everyone, we consider both full-precision and quantized models. Our findings reveal that…
Taxonomy
Topics: Software Engineering Techniques and Practices · Software Engineering Research · Software System Performance and Reliability
