MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines

Lei Gao; Amir Ziashahabi; Yue Niu; Salman Avestimehr; Murali Annavaram

arXiv:2409.15520·cs.LG·September 23, 2025

TL;DR

MobiZO is a framework for fine-tuning large language models directly on edge devices. It combines parallelized zeroth-order gradient estimation with specialized modules and runs on top of existing inference engines.

Contribution

MobiZO introduces a resource-efficient method for fine-tuning LLMs on edge devices. It uses parallelism and a new module to reduce computational cost and memory usage.

Findings

01 Achieves significant runtime speedups
02 Reduces memory consumption
03 Improves fine-tuning accuracy on edge devices

Abstract

Large Language Models (LLMs) are currently pre-trained and fine-tuned on large cloud servers. The next frontier is LLM personalization, where a foundation model can be fine-tuned with user/task-specific data. Given the sensitive nature of such private data, it is desirable to fine-tune these models on edge devices to improve user trust. However, fine-tuning on resource-constrained edge devices presents significant challenges due to substantial memory and computational demands, as well as limited infrastructure support. We observe that inference engines (e.g., ExecuTorch) can be repurposed for fine-tuning by leveraging zeroth-order (ZO) optimization, which uses multiple forward passes to approximate gradients. While promising, direct application of ZO methods on edge devices is inefficient due to the high computational cost of multiple forward passes required for accurate gradient…
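The abstract describes zeroth-order (ZO) optimization as approximating gradients with multiple forward passes. As a rough illustration of that general idea, the sketch below implements a generic two-point random-perturbation estimator in NumPy. It is not MobiZO's method: the function names (`zo_gradient`, `zo_sgd`, `quadratic_loss`), the toy loss, and all hyperparameters are assumptions chosen for illustration only.

```python
import numpy as np


def zo_gradient(loss_fn, params, num_queries=4, eps=1e-3, seed=0):
    """Estimate the gradient of loss_fn at params using forward passes only.

    Generic two-point ZO estimator (an illustrative assumption, not the
    paper's algorithm): each query perturbs params along a random direction
    z and uses the loss difference as a directional-derivative estimate.
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(params)
    for _ in range(num_queries):
        z = rng.standard_normal(params.shape)
        loss_plus = loss_fn(params + eps * z)   # forward pass 1
        loss_minus = loss_fn(params - eps * z)  # forward pass 2
        grad += (loss_plus - loss_minus) / (2.0 * eps) * z
    return grad / num_queries


def quadratic_loss(w):
    """Toy stand-in for a model's loss; its minimum is at w = 1."""
    return float(np.sum((w - 1.0) ** 2))


def zo_sgd(loss_fn, params, steps=500, lr=0.05, **kwargs):
    """Plain SGD that uses the ZO estimate in place of backpropagation."""
    for step in range(steps):
        params = params - lr * zo_gradient(loss_fn, params, seed=step, **kwargs)
    return params


w = zo_sgd(quadratic_loss, np.zeros(4))
```

Each query costs two forward passes and no backward pass, so the loop needs only the operations an inference engine already provides. It also shows why the cost grows with the number of queries, which is the inefficiency the abstract points to.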

Taxonomy

Topics: VLSI and Analog Circuit Testing · Advancements in Photolithography Techniques · Semiconductor materials and devices