Greening Large Language Models of Code

Jieke Shi; Zhou Yang; Hong Jin Kang; Bowen Xu; Junda He; David Lo

arXiv:2309.04076 · cs.SE · January 15, 2024 · 2 cites

Open Access · 1 Repo

TL;DR

This paper introduces Avatar, a method to create compact, energy-efficient, and deployable models of code from large language models, achieving significant reductions in size, energy use, and latency with minimal performance loss.

Contribution

Avatar formulates model optimization as a multi-objective problem and solves it using Satisfiability Modulo Theories (SMT) solving and tailored algorithms, so that the resulting models can be deployed on resource-constrained devices.

Findings

01 Models reduced to 3 MB, 160× smaller than the original.

02 Energy consumption decreased by up to 184×.

03 Inference latency improved by up to 76×.
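
The reduction factors above are easier to interpret once converted into implied baselines and percentage savings. The following is a minimal sketch of that arithmetic; the helper names `implied_baseline` and `percent_saved` are hypothetical and introduced only for illustration. Only the 3 MB size and the 160×, 184×, and 76× factors come from this page. The energy and latency figures are reported as "up to" values, so the percentages computed for them are upper bounds rather than typical results.

```python
def implied_baseline(optimized_value: float, reduction_factor: float) -> float:
    """Return the pre-optimization value implied by an N-fold reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return optimized_value * reduction_factor


def percent_saved(reduction_factor: float) -> float:
    """Convert an N-fold reduction into the percentage of the original saved."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return (1.0 - 1.0 / reduction_factor) * 100.0


# Figures taken from the findings listed on this page.
OPTIMIZED_SIZE_MB = 3.0
SIZE_FACTOR = 160.0
ENERGY_FACTOR = 184.0   # reported as "up to"
LATENCY_FACTOR = 76.0   # reported as "up to"

if __name__ == "__main__":
    original_mb = implied_baseline(OPTIMIZED_SIZE_MB, SIZE_FACTOR)
    print(f"Implied original model size: {original_mb:.0f} MB")
    for name, factor in [
        ("size", SIZE_FACTOR),
        ("energy (upper bound)", ENERGY_FACTOR),
        ("latency (upper bound)", LATENCY_FACTOR),
    ]:
        print(f"{name}: {percent_saved(factor):.2f}% saved")
```

A 3 MB model that is 160× smaller implies an original of roughly 480 MB. Note that even the smallest factor here (76×) corresponds to a saving of more than 98%, which is why fold-reductions are the more informative way to report these results.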

Abstract

Large language models of code have shown remarkable effectiveness across various software engineering tasks. Although many cloud services are built on these powerful models, there remain several scenarios in which developers cannot take full advantage of them, owing to factors such as restricted or unreliable internet access and institutional privacy policies that prohibit transmitting code to third-party vendors. Developing a compact, efficient, and energy-saving model for deployment on developers' devices is therefore essential. To this end, we propose Avatar, a novel approach that crafts a deployable model from a large language model of code by optimizing it in terms of model size, inference latency, energy consumption, and carbon footprint while maintaining a comparable level of effectiveness. The key idea of Avatar is to formulate…

Code & Models

Repositories

soarsmu/Avatar (PyTorch, official)

Taxonomy

Topics: Software Engineering Research · Software System Performance and Reliability · Software Engineering Techniques and Practices