Towards the Next Frontier of LLMs, Training on Private Data: A Cross-Domain Benchmark for Federated Fine-Tuning

Daniel M. Jimenez-Gutierrez; Enrique Zuazua; Georgios Kellaris; Joaquin del Rio; Oleksii Sliusarenko; Xabi Uribe-Etxebarria

arXiv:2605.13936·cs.LG·May 15, 2026

TL;DR

This paper presents a federated learning framework for fine-tuning large language models on private, non-IID institutional data in healthcare and finance. It reports performance close to centralized training while using parameter-efficient fine-tuning (PEFT) strategies.

Contribution

The paper introduces a practical approach to federated fine-tuning with PEFT methods, letting institutions adapt LLMs to private data without sharing that data.
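
To make the idea concrete, the following is a minimal sketch of one server-side aggregation round, assuming a FedAvg-style weighted average applied only to the small adapter weights that each institution trains locally. This summary does not state which aggregation rule the paper uses, so FedAvg is an assumption here, and the names `fedavg_adapters`, `hospital`, `bank`, and `lora_A` are hypothetical.

```python
from typing import Dict, List

# Parameter name -> flattened adapter weights (hypothetical layout).
Adapter = Dict[str, List[float]]


def fedavg_adapters(client_adapters: List[Adapter], client_sizes: List[int]) -> Adapter:
    """Average per-client adapter weights, weighted by local sample count."""
    if not client_adapters or len(client_adapters) != len(client_sizes):
        raise ValueError("need exactly one sample count per client adapter")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Adapter = {}
    for name in client_adapters[0]:
        acc = [0.0] * len(client_adapters[0][name])
        for adapter, size in zip(client_adapters, client_sizes):
            weight = size / total  # larger sites contribute more
            for i, value in enumerate(adapter[name]):
                acc[i] += weight * value
        averaged[name] = acc
    return averaged


# Hypothetical example: two institutions, one tiny adapter tensor each.
# Only these adapter weights leave each site; the raw records never do.
hospital = {"lora_A": [1.0, 2.0]}
bank = {"lora_A": [3.0, 6.0]}
global_adapter = fedavg_adapters([hospital, bank], client_sizes=[100, 300])
print(global_adapter)
```

Only the adapter weights are exchanged, which is why PEFT keeps communication small. One caveat of this sketch: averaging low-rank LoRA factors separately is not generally the same as averaging their products, so a real system may need a different aggregation rule.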

Findings

01. Federated fine-tuning approaches perform close to centralized training.

02. PEFT methods like QLoRA and IA3 improve efficiency with minimal accuracy loss.

03. The framework effectively handles non-IID data across different domains.
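
As an illustration of what non-IID means in terms of label distributions, the sketch below measures label skew as the total variation distance between each site's label mix and the pooled mix. This metric is an assumption chosen for illustration, not one taken from the paper, and the site names and labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def label_distribution(labels: List[str], classes: List[str]) -> List[float]:
    """Return the fraction of examples in each class, in `classes` order."""
    if not labels:
        raise ValueError("a site must hold at least one label")
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def label_skew(site_labels: Dict[str, List[str]]) -> Dict[str, float]:
    """Total variation distance between each site's label mix and the pooled mix.

    0.0 means the site matches the pooled distribution; values near 1.0 mean
    the site's labels barely overlap with the pooled distribution.
    """
    pooled = [y for labels in site_labels.values() for y in labels]
    classes = sorted(set(pooled))
    pooled_dist = label_distribution(pooled, classes)
    skew = {}
    for site, labels in site_labels.items():
        site_dist = label_distribution(labels, classes)
        skew[site] = 0.5 * sum(abs(p - q) for p, q in zip(site_dist, pooled_dist))
    return skew


# Hypothetical sites with opposite class balance.
sites = {
    "site_a": ["pos", "pos", "pos", "neg"],
    "site_b": ["neg", "neg", "neg", "pos"],
}
skew = label_skew(sites)
print(skew)
```

A check like this only covers label skew. The abstract also mentions differences in population characteristics, data modalities, and documentation patterns, which this sketch does not capture.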

Abstract

The recent success of large language models (LLMs) has been largely driven by vast public datasets. However, the next frontier for LLM development lies beyond public data. Much of the world's most valuable information is private, especially in highly regulated sectors such as healthcare and finance, where data include patient histories or customer communications. Unlocking this data could represent a major leap forward, enabling LLMs with deeper domain expertise and stronger real-world utility. Yet, these data cannot be shared because they are distributed across institutions and constrained by privacy, regulatory, and organizational barriers. Moreover, institutional datasets are typically not independent and identically distributed (non-IID), differing across sites in population characteristics, data modalities, documentation patterns, and task-specific label distributions. In this…
