Sequential Compression Layers for Efficient Federated Learning in Foundational Models

Navyansh Mahla; Sunny Gupta; Amit Sethi

arXiv:2412.07021·cs.LG·March 11, 2025


TL;DR

This paper introduces a parameter-efficient fine-tuning method for federated learning of large models. Instead of LoRA, it inserts a small MLP layer into the transformer's feed-forward network, and the authors report better performance on both language and vision tasks.

Contribution

A novel MLP-based fine-tuning approach that outperforms LoRA in federated learning settings for large language and vision models.

Findings

01 Outperforms LoRA-based methods in federated fine-tuning

02 Effective for both language models and vision encoders

03 Addresses LoRA's bottlenecks in federated settings

Abstract

Federated Learning (FL) has gained popularity for fine-tuning large language models (LLMs) across multiple nodes, each with its own private data. While LoRA has been widely adopted for parameter-efficient federated fine-tuning, recent theoretical and empirical studies highlight its suboptimal performance in the federated learning context. In response, we propose a novel, simple, and more effective parameter-efficient fine-tuning method that does not rely on LoRA. Our approach introduces a small multi-layer perceptron (MLP) layer between two existing MLP layers, the up_proj (the FFN projection layer following the self-attention module) and the down_proj, within the feed-forward network of the transformer block. This solution addresses the bottlenecks associated with LoRA in federated fine-tuning and outperforms recent LoRA-based approaches, demonstrating superior performance for both language…
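
The placement described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract only states that a small MLP sits between the up projection and the down projection. The class name, the names `squeeze` and `expand`, the bottleneck width `d_small`, the ReLU activations, the random initialization, and the absence of a residual path are all assumptions made for illustration.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(v):
    """Element-wise ReLU (assumed activation; the abstract does not name one)."""
    return [max(0.0, a) for a in v]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix with shape (rows, cols)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def count_params(*matrices):
    """Total number of scalar entries across the given matrices."""
    return sum(len(row) for W in matrices for row in W)


class FFNWithInsertedMLP:
    """Hypothetical feed-forward block with a small MLP between up_proj and down_proj."""

    def __init__(self, d_model, d_ff, d_small, seed=0):
        rng = random.Random(seed)
        # Pretrained projections: treated as frozen during fine-tuning.
        self.up_proj = rand_matrix(d_ff, d_model, rng)
        self.down_proj = rand_matrix(d_model, d_ff, rng)
        # Inserted small MLP (hypothetical names): the only trainable part,
        # and so the only part a client would need to send to the server.
        self.squeeze = rand_matrix(d_small, d_ff, rng)   # d_ff -> d_small
        self.expand = rand_matrix(d_ff, d_small, rng)    # d_small -> d_ff

    def forward(self, x):
        h = relu(matvec(self.up_proj, x))                       # existing up projection
        h = matvec(self.expand, relu(matvec(self.squeeze, h)))  # inserted small MLP
        return matvec(self.down_proj, h)                        # existing down projection

    def trainable_param_count(self):
        return count_params(self.squeeze, self.expand)

    def frozen_param_count(self):
        return count_params(self.up_proj, self.down_proj)


ffn = FFNWithInsertedMLP(d_model=8, d_ff=32, d_small=4)
y = ffn.forward([1.0] * 8)
print(len(y))                                                 # 8
print(ffn.trainable_param_count(), ffn.frozen_param_count())  # 256 512
```

In this toy configuration the inserted layer holds 2 × 32 × 4 = 256 parameters against 512 frozen ones. The ratio depends entirely on the assumed `d_small` and says nothing about the sizes used in the paper.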


Taxonomy

Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks · Cryptography and Data Security