Deriving Coding-Specific Sub-Models from LLMs using Resource-Efficient Pruning

Laura Puccioni; Alireza Farshin; Mariano Scazzariello; Changjie Wang; Marco Chiesa; Dejan Kostic

arXiv:2501.05248·cs.LG·January 10, 2025

Open Access

TL;DR

This paper presents a resource-efficient method to derive programming-language-specific sub-models from large language models using unstructured pruning, enabling faster, more accessible code generation tailored to specific languages.

Contribution

It introduces a novel approach for extracting coding-specific sub-models from LLMs with minimal accuracy loss, supported by analysis of domain-specific activation patterns.

Findings

01. Effective extraction of language-specific sub-models demonstrated
02. Domain-specific tasks activate distinct model regions
03. Significant reduction in computational resources needed

Abstract

Large Language Models (LLMs) have demonstrated their exceptional performance in various complex code generation tasks. However, their broader adoption is limited by significant computational demands and high resource requirements, particularly memory and processing power. To mitigate such requirements, model pruning techniques are used to create more compact models with significantly fewer parameters. However, current approaches do not focus on the efficient extraction of programming-language-specific sub-models. In this work, we explore the idea of efficiently deriving coding-specific sub-models through unstructured pruning (i.e., Wanda). We investigate the impact of different domain-specific calibration datasets on pruning outcomes across three distinct domains and extend our analysis to extracting four language-specific sub-models: Python, Java, C++, and JavaScript. We are the first…
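
To make the role of the calibration dataset concrete, here is a minimal sketch of Wanda-style scoring on a single linear layer. It assumes the commonly described Wanda criterion (each weight is scored by its magnitude times the L2 norm of the matching input feature over the calibration samples, and the lowest-scoring weights are dropped within each output row). The function name `wanda_prune`, the toy weights, and the two toy calibration sets are hypothetical and are not taken from the paper's code or data.

```python
import math


def wanda_prune(weights, calib_inputs, sparsity=0.5):
    """Zero the lowest-scoring weights in each output row (hypothetical helper).

    weights: list of rows, one per output neuron (out_features x in_features).
    calib_inputs: calibration activation vectors (n_samples x in_features).
    """
    n_in = len(weights[0])
    # L2 norm of every input feature, taken across the calibration samples.
    feat_norm = [
        math.sqrt(sum(x[j] ** 2 for x in calib_inputs)) for j in range(n_in)
    ]
    n_prune = int(n_in * sparsity)
    pruned = []
    for row in weights:
        # Score = |weight| * norm of the input feature it multiplies.
        scores = [abs(w) * feat_norm[j] for j, w in enumerate(row)]
        # Drop the n_prune smallest scores within this row only.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# One output neuron with four input weights (toy values).
layer = [[1.0, -0.5, 0.2, 2.0]]

# Two toy calibration sets that excite different input features,
# standing in for activations gathered on two different domains.
calib_a = [[0.1, 4.0, 0.1, 0.1], [0.1, 3.0, 0.1, 0.1]]
calib_b = [[3.0, 0.1, 4.0, 0.1]]

sub_model_a = wanda_prune(layer, calib_a)
sub_model_b = wanda_prune(layer, calib_b)
print(sub_model_a)
print(sub_model_b)
```

The same dense weights yield different masks under the two calibration sets, which is the effect the abstract relies on when it derives separate sub-models per domain or programming language. A real pipeline would collect the activations from a forward pass over calibration text and repeat this for every linear layer.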

Taxonomy

Topics: Digital Rights Management and Security · Algorithms and Data Compression · Advanced Data Storage Technologies

Methods: Pruning · Focus