Memory-Efficient Structured Backpropagation for On-Device LLM Fine-Tuning
Juneyoung Park, Yuri Hong, Seongwan Kim, Jaeho Lee

TL;DR
This paper introduces MeSP, a backpropagation method for on-device LLM fine-tuning that cuts memory usage by exploiting LoRA's low-rank structure, making training feasible on memory-constrained devices.
Contribution
MeSP is a novel backward pass derivation that leverages LoRA's low-rank structure to significantly reduce memory consumption while maintaining exact gradients.
Findings
Achieves a 49% average memory reduction compared to MeBP on Qwen2.5 models (0.5B–3B).
Reduces peak memory from 361MB to 136MB for 0.5B models.
Shows near-zero correlation of MeZO's gradient estimates with true gradients.
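The last finding can be made concrete with a toy experiment. The sketch below is not the paper's setup: it assumes a simple quadratic loss and a single two-point (SPSA-style) perturbation estimate of the kind MeZO uses, and all names (`loss`, `zeroth_order_estimate`, `dim`, `eps`) are hypothetical. It only illustrates why one random-direction estimate tends to align poorly with the true gradient when the parameter count is large.

```python
import math
import random


def loss(theta, target):
    # Toy quadratic loss (an assumption for illustration only).
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, target))


def true_gradient(theta, target):
    # Exact gradient of the quadratic loss above.
    return [t - c for t, c in zip(theta, target)]


def zeroth_order_estimate(theta, target, eps, rng):
    # Two-point estimate: perturb along a random Gaussian direction z,
    # then scale z by the central finite difference of the loss.
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    plus = loss([t + eps * zi for t, zi in zip(theta, z)], target)
    minus = loss([t - eps * zi for t, zi in zip(theta, z)], target)
    coeff = (plus - minus) / (2.0 * eps)
    return [coeff * zi for zi in z]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


rng = random.Random(0)
dim = 10_000  # hypothetical parameter count, tiny compared to an LLM
theta = [rng.uniform(-1, 1) for _ in range(dim)]
target = [rng.uniform(-1, 1) for _ in range(dim)]

g = true_gradient(theta, target)
g_hat = zeroth_order_estimate(theta, target, 1e-3, rng)
similarity = cosine(g, g_hat)
print(f"cosine similarity with true gradient: {similarity:.4f}")
```

In this toy setting the cosine similarity of a single estimate scales roughly as 1/sqrt(dim), so it shrinks further as the number of trainable parameters grows.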
Abstract
On-device fine-tuning enables privacy-preserving personalization of large language models, but mobile devices impose severe memory constraints, typically 6–12 GB shared across all workloads. Existing approaches force a trade-off between exact gradients with high memory (MeBP) and low memory with noisy estimates (MeZO). We propose Memory-efficient Structured Backpropagation (MeSP), which bridges this gap by manually deriving backward passes that exploit LoRA's low-rank structure. Our key insight is that the intermediate projection can be recomputed during the backward pass at minimal cost because the LoRA rank is small, eliminating the need to store it. MeSP achieves a 49% average memory reduction compared to MeBP on Qwen2.5 models (0.5B–3B) while computing mathematically identical gradients. Our analysis also reveals that MeZO's gradient estimates show near-zero correlation with true…
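The recomputation idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a LoRA branch of the form y = s · (x A) B, omits the frozen base weight (which does not affect the adapter gradients), and uses hypothetical helper names and shapes. It shows that recomputing the intermediate projection h = x A inside the backward pass yields exactly the same adapter gradients as storing it during the forward pass.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def scale(m, s):
    return [[s * v for v in row] for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def lora_forward(x, A, B, s, store_h):
    # h is the intermediate low-rank projection, shape (n, r).
    h = matmul(x, A)
    y = scale(matmul(h, B), s)
    saved = (x, h) if store_h else (x, None)  # optionally drop h
    return y, saved


def lora_backward(grad_y, saved, A, B, s):
    x, h = saved
    if h is None:
        # Recompute h; this costs one (n x d_in) by (d_in x r) product.
        h = matmul(x, A)
    grad_B = scale(matmul(transpose(h), grad_y), s)  # s * h^T g
    grad_h = scale(matmul(grad_y, transpose(B)), s)  # s * g B^T
    grad_A = matmul(transpose(x), grad_h)            # x^T (s * g B^T)
    return grad_A, grad_B


rng = random.Random(0)
n, d_in, r, d_out, s = 4, 16, 2, 8, 0.5  # hypothetical sizes and scaling
x = rand_matrix(n, d_in, rng)
A = rand_matrix(d_in, r, rng)
B = rand_matrix(r, d_out, rng)
grad_y = [[1.0] * d_out for _ in range(n)]  # gradient of loss = sum(y)

_, saved_full = lora_forward(x, A, B, s, store_h=True)
_, saved_lean = lora_forward(x, A, B, s, store_h=False)
grads_stored = lora_backward(grad_y, saved_full, A, B, s)
grads_recomputed = lora_backward(grad_y, saved_lean, A, B, s)
print("identical gradients:", grads_stored == grads_recomputed)
```

The two paths run the same arithmetic in the same order, so the gradients match exactly rather than approximately; the only difference is whether h is held between the forward and backward passes.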
Taxonomy
Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Machine Learning and Data Classification
