Resource Management for Low-latency Cooperative Fine-tuning of Foundation Models at the Network Edge

Hai Wu, Xu Chen, and Kaibin Huang

arXiv:2407.09873 · cs.IT · July 16, 2024 · 1 citation

TL;DR

This paper proposes a resource management framework for low-latency cooperative fine-tuning of foundation models at the network edge, optimizing computation and communication to enable efficient multi-device adaptation.

Contribution

The paper introduces the device-edge cooperative fine-tuning (DEFT) paradigm, together with a novel CRUNCH algorithm and a joint bandwidth-and-block allocation method, for low-latency model fine-tuning at the edge.

Findings

01 Significant latency reduction demonstrated on the GLUE benchmark.

02 Effective multi-device cooperation for fine-tuning large models.

03 Optimized resource allocation improves edge device performance.

Abstract

The emergence of large-scale foundation models (FoMos) that exhibit human-like intelligence motivates their deployment at the network edge, where devices can access state-of-the-art artificial intelligence. For a better user experience, pre-trained FoMos need to be adapted to specialized downstream tasks through fine-tuning. To overcome a single device's memory and computation limitations, we advocate multi-device cooperation within the device-edge cooperative fine-tuning (DEFT) paradigm, where edge devices cooperate to simultaneously optimize different parts of the fine-tuning parameters within a FoMo. However, the parameter blocks reside at different depths within a FoMo architecture, leading to varying computation latency and memory costs due to gradient backpropagation-based calculations. The heterogeneous on-device computation and memory capacities and channel conditions…
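The depth dependence mentioned in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's latency model: it assumes one parameter block per layer, a full forward pass for every device, and a backward pass that only has to travel from the output down to the block being updated. The names `BlockCost` and `block_costs` and the per-layer times `fwd_time` and `bwd_time` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BlockCost:
    block: int            # layer index of the block, 1 = closest to the input
    latency: float        # forward + backward time in arbitrary units
    stored_layers: int    # layers whose activations must be kept for backprop


def block_costs(num_layers, fwd_time=1.0, bwd_time=2.0):
    """Toy per-block cost, assuming one block per layer (illustrative only)."""
    costs = []
    for k in range(1, num_layers + 1):
        # Gradients for block k require backpropagating from the output
        # (layer num_layers) down to layer k, so shallower blocks cost more.
        backward_layers = num_layers - k + 1
        latency = num_layers * fwd_time + backward_layers * bwd_time
        costs.append(BlockCost(k, latency, backward_layers))
    return costs


if __name__ == "__main__":
    for cost in block_costs(4):
        print(cost)
```

Under these assumptions, a block near the input is the most expensive to fine-tune and a block near the output is the cheapest, which is why assigning blocks to devices with different compute and memory budgets becomes an allocation problem.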

Taxonomy

Topics: Geological Modeling and Analysis · Modular Robots and Swarm Intelligence

Methods: Attention Is All You Need · Residual Connection · Layer Normalization · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Adam · Dropout · Weight Decay