LEGO: Language Model Building Blocks

Shrenik Bhansali; Alwin Jin; Tyler Lizzo; Larry Heck

arXiv:2410.18287·cs.CL·October 25, 2024

TL;DR

LEGO extracts small, task-specific language models from a large language model through pruning and then recombines them, with the aim of improving fine-tuning and inference efficiency, user data privacy, and robustness.

Contribution

LEGO combines LLM pruning, Federated Learning, and a novel aggregation scheme to build customizable, efficient, privacy-preserving small language models that can be recombined into an LLM.

Findings

01

LEGO enables efficient fine-tuning and inference with task-specific models.

02

LEGO maintains robustness and generalization despite data heterogeneity.

03

LEGO demonstrates versatility in model heterogeneity and privacy preservation.

Abstract

Large language models (LLMs) are essential in natural language processing (NLP) but are costly in data collection, pre-training, fine-tuning, and inference. Task-specific small language models (SLMs) offer a cheaper alternative but lack robustness and generalization. This paper proposes LEGO, a novel technique to extract SLMs from an LLM and recombine them. Using state-of-the-art LLM pruning strategies, we create task- and user-specific SLM building blocks that are efficient for fine-tuning and inference while also preserving user data privacy. LEGO uses Federated Learning and a novel aggregation scheme to reconstruct the LLM, maintaining robustness without high costs while keeping user data private. We experimentally demonstrate the versatility of LEGO, showing its ability to enable model heterogeneity and mitigate the effects of data heterogeneity while maintaining LLM…
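To make the extract-and-recombine idea concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the abstract does not specify the pruning strategy or the aggregation scheme, so this example assumes simple magnitude pruning and a per-position average over the clients that kept each weight. The names `magnitude_mask` and `aggregate`, the toy weight vector, and the fixed shift standing in for local fine-tuning are all hypothetical.

```python
from typing import List

Vector = List[float]
Mask = List[bool]


def magnitude_mask(weights: Vector, keep_ratio: float) -> Mask:
    """Assumed pruning rule: keep the largest-magnitude weights (True = kept)."""
    n_keep = max(1, round(len(weights) * keep_ratio))
    # Rank positions by |w|, largest first, and keep the top n_keep.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:n_keep])
    return [i in kept for i in range(len(weights))]


def aggregate(global_w: Vector, client_ws: List[Vector], masks: List[Mask]) -> Vector:
    """Assumed aggregation rule: average each position over the clients that kept it.

    Positions pruned by every client fall back to the current global value.
    """
    merged = []
    for i, g in enumerate(global_w):
        vals = [w[i] for w, m in zip(client_ws, masks) if m[i]]
        merged.append(sum(vals) / len(vals) if vals else g)
    return merged


# Toy "LLM" with four weights, pruned to two differently sized blocks.
global_w = [0.9, -0.1, 0.5, 0.05]
mask_a = magnitude_mask(global_w, keep_ratio=0.5)   # smaller block
mask_b = magnitude_mask(global_w, keep_ratio=0.75)  # larger block

# Stand-in for local fine-tuning: each client shifts only its kept weights.
client_a = [w + 0.1 if m else 0.0 for w, m in zip(global_w, mask_a)]
client_b = [w - 0.1 if m else 0.0 for w, m in zip(global_w, mask_b)]

new_global = aggregate(global_w, [client_a, client_b], [mask_a, mask_b])
```

In this sketch the two clients hold blocks of different sizes, which is one simple reading of "model heterogeneity": a weight kept by only one client is updated from that client alone, and a weight pruned by both clients keeps its global value. Only masked weights would leave a client, never its data, which is the usual Federated Learning privacy argument.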

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Pruning