Resource Management for Low-latency Cooperative Fine-tuning of Foundation Models at the Network Edge
Hai Wu, Xu Chen, and Kaibin Huang

TL;DR
This paper proposes a resource management framework for low-latency cooperative fine-tuning of foundation models at the network edge, optimizing computation and communication to enable efficient multi-device adaptation.
Contribution
It introduces the device-edge cooperative fine-tuning (DEFT) paradigm, together with the CRUNCH algorithm and a joint bandwidth-and-block allocation method, to reduce the latency of fine-tuning foundation models at the edge.
Findings
Experiments on the GLUE benchmark show a significant reduction in fine-tuning latency.
Multiple devices can cooperate effectively to fine-tune a large model.
Optimized allocation of bandwidth and parameter blocks improves performance on edge devices.
Abstract
The emergence of large-scale foundation models (FoMos) that can exhibit human-like intelligence motivates their deployment at the network edge, giving devices access to state-of-the-art artificial intelligence. For a better user experience, pre-trained FoMos need to be adapted to specialized downstream tasks through fine-tuning. To transcend a single device's memory and computation limitations, we advocate multi-device cooperation within the device-edge cooperative fine-tuning (DEFT) paradigm, where edge devices cooperate to simultaneously optimize different parts of the fine-tuning parameters within a FoMo. However, the parameter blocks reside at different depths within the FoMo architecture, so their computation latency and memory costs vary because the calculations rely on gradient backpropagation. The heterogeneous on-device computation and memory capacities and channel conditions…
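The idea of matching parameter blocks to devices can be illustrated with a minimal sketch. It is not the paper's CRUNCH algorithm or its allocation method. It assumes a toy cost model in which a block's backpropagation cost grows linearly with its distance from the model output, fixed per-device compute speeds and uplink rates, and one block per device. All names (`block_cost`, `device_latency`, `best_assignment`) and all numbers are hypothetical.

```python
from itertools import permutations


def block_cost(depth_from_output, per_layer_cost=1.0):
    """Toy assumption: gradients travel from the output down to the block,
    so a block further from the output costs proportionally more."""
    return per_layer_cost * depth_from_output


def device_latency(cost, speed, upload_bits, rate):
    """Computation time plus the time to upload the block's update."""
    return cost / speed + upload_bits / rate


def best_assignment(costs, speeds, rates, upload_bits=1.0):
    """Brute-force search over one-block-per-device assignments,
    minimising the latency of the slowest device."""
    best = None
    for perm in permutations(range(len(costs))):
        # perm[d] is the index of the block given to device d
        latency = max(
            device_latency(costs[b], speeds[d], upload_bits, rates[d])
            for d, b in enumerate(perm)
        )
        if best is None or latency < best[0]:
            best = (latency, perm)
    return best


if __name__ == "__main__":
    costs = [block_cost(d) for d in (1, 2, 3)]  # three blocks, increasing depth
    speeds = [1.0, 2.0, 4.0]                    # hypothetical device speeds
    rates = [1.0, 1.0, 1.0]                     # hypothetical uplink rates
    print(best_assignment(costs, speeds, rates))
```

In this toy setting the search hands the most expensive block to the fastest device. Brute force is only practical for a handful of devices, which is one reason a dedicated algorithm is needed at realistic scale.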
Taxonomy
Topics: Geological Modeling and Analysis · Modular Robots and Swarm Intelligence
Methods: Attention Is All You Need · Residual Connection · Layer Normalization · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Adam · Dropout · Weight Decay
