Fine-tuning Multimodal Transformers on Edge: A Parallel Split Learning Approach

Timo Fudala; Vasileios Tsouvalas; Nirvana Meratnia

arXiv:2502.06355 · cs.DC · July 10, 2025

TL;DR

This paper introduces MPSL, a parallel split learning method enabling efficient, scalable fine-tuning of multimodal transformers on edge devices without sharing labels or requiring client synchronization.

Contribution

MPSL is a novel parallel split learning approach that reduces computation and communication costs for multimodal transformer fine-tuning on resource-constrained devices.

Findings

01. MPSL matches or outperforms Federated Learning in accuracy.

02. Client-side computation is reduced by 250x.

03. MPSL offers superior scalability with model growth.

Abstract

Multimodal transformers integrate diverse data types such as images, audio, and text, advancing tasks such as audio-visual understanding and image-text retrieval; yet their high parameter counts limit deployment on resource-constrained edge devices. Split Learning (SL), which partitions a model at a designated cut-layer to offload compute-intensive operations to the server, offers a promising approach for distributed training of multimodal transformers, though its application remains underexplored. We present MPSL, a parallel SL approach for computationally efficient fine-tuning of multimodal transformers in a distributed manner, while eliminating label sharing, client synchronization, and per-client sub-model management. MPSL employs lightweight client-side tokenizers and a unified modality-agnostic encoder, allowing flexible adaptation to task-specific needs. Our evaluation across 7…
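
The cut-layer idea in the abstract can be illustrated with a toy example. The sketch below is a generic U-shaped split learning loop written in plain Python, not the MPSL implementation: the `Client` and `Server` classes, the linear layers, the single client, and the synthetic regression task are all assumptions made for illustration. It shows a frozen client-side "tokenizer", a trainable server-side encoder, and a small client-side head, arranged so that only activations and gradients cross the cut while labels stay on the client.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Client:
    """Hypothetical edge device: frozen tokenizer, small head, private labels."""

    def __init__(self, dim, rng):
        self.W_tok = rand_matrix(dim, dim, rng)  # frozen, never updated
        self.w_head = [rng.uniform(-0.5, 0.5) for _ in range(dim)]

    def tokenize(self, x):
        # Activations sent across the cut-layer to the server.
        return matvec(self.W_tok, x)

    def head_step(self, h, y, lr):
        # Squared-error loss is computed on the client, so y is never sent.
        y_hat = sum(w * a for w, a in zip(self.w_head, h))
        err = y_hat - y
        grad_h = [err * w for w in self.w_head]  # gradient returned to server
        self.w_head = [w - lr * err * a for w, a in zip(self.w_head, h)]
        return 0.5 * err * err, grad_h


class Server:
    """Hypothetical server: holds the (here linear) encoder, sees no labels."""

    def __init__(self, dim, rng):
        self.W = rand_matrix(dim, dim, rng)
        self._z = None

    def forward(self, z):
        self._z = z  # cache input for the backward pass
        return matvec(self.W, z)

    def backward(self, grad_h, lr):
        # dL/dW[i][j] = grad_h[i] * z[j] for a linear layer h = W z.
        for i, g in enumerate(grad_h):
            for j, zj in enumerate(self._z):
                self.W[i][j] -= lr * g * zj


def train(steps=1000, lr=0.1, seed=0):
    rng = random.Random(seed)
    dim = 4
    client, server = Client(dim, rng), Server(dim, rng)
    xs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(8)]
    data = [(x, 0.5 * sum(x)) for x in xs]  # synthetic targets (assumption)
    losses = []
    for step in range(steps):
        x, y = data[step % len(data)]
        z = client.tokenize(x)                     # client -> server
        h = server.forward(z)                      # server -> client
        loss, grad_h = client.head_step(h, y, lr)  # label used locally only
        server.backward(grad_h, lr)                # client -> server (gradient)
        losses.append(loss)
    return losses
```

Calling `train()` returns the per-step losses. In this arrangement the only trainable state on the client is the small head vector, which mirrors the general motivation of keeping client-side compute light; how MPSL itself divides the model and handles multiple clients in parallel is described in the paper, not in this sketch.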

Taxonomy

Topics: Neural Networks and Applications