MedLeak: Multimodal Medical Data Leakage in Secure Federated Learning with Crafted Models

Shanghao Shi, Md Shahedul Haque, Abhijeet Parida, Chaoyu Zhang, Marius George Linguraru, Y. Thomas Hou, Syed Muhammad Anwar, and Wenjing Lou

arXiv:2407.09972·cs.LG·July 1, 2025

Open Access

TL;DR

MedLeak is a novel attack that enables a malicious federated learning server to recover private medical data from client updates by introducing crafted models, exposing vulnerabilities in secure aggregation protocols.

Contribution

This paper introduces MedLeak, a new privacy attack in which a malicious federated learning server recovers sensitive medical data from client model updates analytically, without costly optimization.

Findings

01 High recovery rates on medical image datasets

02 Effective data recovery on medical text datasets

03 Recovered data supports downstream disease classification

Abstract

Federated learning (FL) allows participants to collaboratively train machine learning models while keeping their data local, making it ideal for collaborations among healthcare institutions on sensitive data. However, in this paper, we propose a novel privacy attack called MedLeak, which allows a malicious FL server to recover high-quality site-specific private medical data from the client model updates. MedLeak works by introducing an adversarially crafted model during the FL training process. Honest clients, unaware of the insidious changes in the published models, continue to send back their updates as per the standard FL protocol. Leveraging a novel analytical method, MedLeak can efficiently recover private client data from the aggregated parameter updates, eliminating costly optimization. In addition, the scheme relies solely on the aggregated updates, thus rendering secure…
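The idea of recovering data analytically from parameter updates, rather than by optimization, can be illustrated with a well-known property of a fully connected layer with a bias term. This is a generic sketch and not MedLeak's crafted-model construction, which the abstract does not detail. For a single input x, the weight gradient of row i equals the bias gradient of row i multiplied by x, so dividing one by the other returns x exactly. The function names, layer sizes, and squared-error loss below are all assumptions chosen for illustration.

```python
import random

def linear_layer_gradients(x, W, b, target):
    """Gradients of 0.5 * ||W x + b - target||^2 with respect to W and b."""
    out = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
           for row, b_i in zip(W, b)]
    grad_b = [o - t for o, t in zip(out, target)]        # dL/db_i
    grad_W = [[g * x_j for x_j in x] for g in grad_b]    # dL/dW_ij = dL/db_i * x_j
    return grad_W, grad_b

def recover_input(grad_W, grad_b, eps=1e-12):
    """Recover x from any row whose bias gradient is non-zero."""
    for row, g in zip(grad_W, grad_b):
        if abs(g) > eps:
            return [v / g for v in row]
    raise ValueError("all bias gradients are zero; input cannot be recovered")

random.seed(0)
# Hypothetical stand-in for one flattened private record held by a client.
x_private = [random.random() for _ in range(8)]
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
b = [0.0] * 4
target = [1.0, 0.0, 0.0, 0.0]

# The client computes gradients locally; the server only sees the gradients.
grad_W, grad_b = linear_layer_gradients(x_private, W, b, target)
x_recovered = recover_input(grad_W, grad_b)
max_error = max(abs(a - r) for a, r in zip(x_private, x_recovered))
```

In this toy setting the recovery is exact up to floating-point error and needs no iterative search. It covers only a single sample in a single update; recovering individual records from updates aggregated across many samples and clients is the harder problem the paper addresses.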

Taxonomy

Topics: Chaos-based Image/Signal Encryption · Cryptography and Data Security · Privacy-Preserving Technologies in Data

Methods: Sparse Evolutionary Training