ALPINE: An adaptive language-agnostic pruning method for language models for code
Mootez Saad, José Antonio Hernández López, Boqi Chen, Dániel Varró, and Tushar Sharma

TL;DR
ALPINE is an adaptive, language-agnostic pruning method for code language models that significantly reduces computational resources and environmental impact while maintaining high predictive performance.
Contribution
It introduces a pluggable, adaptive pruning layer for Transformer models that compresses input sequences up to three times, reducing FLOPs, memory, and CO2 emissions.
Findings
Up to 50% reduction in FLOPs
58.1% decrease in memory footprint
28.1% improvement in throughput
Abstract
Language models of code have demonstrated state-of-the-art performance across various software engineering and source code analysis tasks. However, their demanding computational resource requirements and consequent environmental footprint remain significant challenges. This work introduces ALPINE, an adaptive, programming-language-agnostic pruning technique designed to substantially reduce these models' computational overhead. The proposed method offers a pluggable layer that can be integrated with all Transformer-based models. With ALPINE, input sequences undergo adaptive compression throughout the pipeline, shrinking to as little as one third of their initial size, which significantly reduces the computational load. Our experiments on two software engineering tasks, defect prediction and code clone detection, across three language models, CodeBERT, GraphCodeBERT and UniXCoder…
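To make the idea of adaptive sequence compression concrete, here is a minimal sketch of attention-based token pruning in plain Python. It is an illustration under stated assumptions, not ALPINE's actual algorithm: the scoring rule (average attention a token receives), the threshold (the mean score for the current input), and the names `token_scores` and `prune` are all hypothetical choices made for this example.

```python
from statistics import mean


def token_scores(attention):
    """Score each token by the average attention it receives.

    attention[h][q][k] is the weight that query position q puts on key
    position k in head h. This scoring rule is an assumption made for
    illustration; the paper's own importance measure may differ.
    """
    seq_len = len(attention[0])
    return [
        mean(head[q][k] for head in attention for q in range(seq_len))
        for k in range(seq_len)
    ]


def prune(tokens, attention, keep_first=True):
    """Drop tokens whose score falls below the mean score for this input.

    The threshold is recomputed per input, so the number of surviving
    tokens adapts to the sequence instead of being a fixed ratio.
    keep_first (hypothetical option) always retains position 0, where
    encoder models usually place a classification token.
    """
    scores = token_scores(attention)
    threshold = mean(scores)
    kept = [
        i for i, s in enumerate(scores)
        if s >= threshold or (keep_first and i == 0)
    ]
    return [tokens[i] for i in kept]


# One head, four positions; every query attends mostly to the first two tokens.
tokens = ["<s>", "return", ";", "</s>"]
attention = [[[0.4, 0.4, 0.1, 0.1]] * 4]
print(prune(tokens, attention))  # ['<s>', 'return']
```

Because the threshold depends on each input's own score distribution, a sequence with many low-attention tokens is compressed more aggressively than one whose attention is spread evenly, which is the sense in which such a layer is adaptive.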
