AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment

Yonggan Fu; Zhongzhi Yu; Junwei Li; Jiayi Qian; Yongan Zhang; Xiangchi Yuan; Dachuan Shi; Roman Yakunin; Yingyan Celine Lin

arXiv:2411.10606·cs.LG·November 19, 2024

TL;DR

AmoebaLLM is a framework for instantly extracting LLM subnets of adaptable shape after a one-time fine-tuning, so that models can be deployed quickly across diverse platforms with efficiency and accuracy tailored to each.

Contribution

The paper introduces a method for constructing LLM subnets of arbitrary shape (depth and width) that are deployable immediately after a one-time fine-tuning, addressing the challenge of deploying LLMs efficiently.

Findings

01. Achieves state-of-the-art accuracy-efficiency trade-offs.

02. Enables rapid, on-demand LLM subnet deployment.

03. Demonstrates superior adaptability across platforms.

Abstract

Motivated by the transformative capabilities of large language models (LLMs) across various natural language tasks, there has been a growing demand to deploy these models effectively across diverse real-world applications and platforms. However, the challenge of efficiently deploying LLMs has become increasingly pronounced due to the varying application-specific performance requirements and the rapid evolution of computational platforms, which feature diverse resource constraints and deployment flows. These varying requirements necessitate LLMs that can adapt their structures (depth and width) for optimal efficiency across different platforms and application specifications. To address this critical gap, we propose AmoebaLLM, a novel framework designed to enable the instant derivation of LLM subnets of arbitrary shapes, which achieve the accuracy-efficiency frontier and can be extracted…
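The abstract describes subnets whose structure varies in depth and width. As a rough illustration of what "shape" means in that sense, the toy Python sketch below slices a stack of residual MLP blocks by layer subset and hidden width. It is not the AmoebaLLM method or its API: all names (`make_model`, `extract_subnet`, `keep_layers`, `width`) are hypothetical, and keeping the first `width` hidden units is an assumption made purely for simplicity.

```python
import random
from typing import Dict, List

Matrix = List[List[float]]
Layer = Dict[str, Matrix]  # hypothetical block: x + w_out @ relu(w_in @ x)


def make_model(depth: int, d_model: int, hidden: int, seed: int = 0) -> List[Layer]:
    """Build a toy stack of residual MLP blocks with small random weights."""
    rng = random.Random(seed)

    def rand(rows: int, cols: int) -> Matrix:
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return [{"w_in": rand(hidden, d_model), "w_out": rand(d_model, hidden)}
            for _ in range(depth)]


def extract_subnet(layers: List[Layer], keep_layers: List[int], width: int) -> List[Layer]:
    """Depth: keep only the listed blocks. Width: keep the first `width` hidden units.

    Assumption: units are kept by position; a real method would choose them
    by some importance criterion.
    """
    subnet = []
    for i in keep_layers:
        layer = layers[i]
        subnet.append({
            "w_in": [row[:] for row in layer["w_in"][:width]],   # (width, d_model)
            "w_out": [row[:width] for row in layer["w_out"]],    # (d_model, width)
        })
    return subnet


def count_params(layers: List[Layer]) -> int:
    """Total number of weights across all blocks."""
    return sum(len(m) * len(m[0]) for layer in layers for m in layer.values())


def forward(layers: List[Layer], x: List[float]) -> List[float]:
    """Run the residual MLP stack on a single input vector."""
    for layer in layers:
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in layer["w_in"]]
        x = [xi + sum(w * hj for w, hj in zip(row, h))
             for xi, row in zip(x, layer["w_out"])]
    return x


full = make_model(depth=6, d_model=4, hidden=8)
sub = extract_subnet(full, keep_layers=[0, 2, 5], width=4)
print(count_params(full), count_params(sub))  # 384 96
```

Because each block is residual, the sliced stack still maps a `d_model`-sized input to a `d_model`-sized output, which is what lets depth and width be reduced independently in this toy setting.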

Code & Models

Repositories

GATECH-EIC/AmoebaLLM
jax · Official

Videos

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment· slideslive

Taxonomy

Topics: Data Stream Mining Techniques