# Categorical and phenotypic image synthetic learning as an alternative to federated learning

**Authors:** Nghi C. D. Truong, Chandan Ganesh Bangalore Yogananda, Benjamin C. Wagner, James M. Holcomb, Divya D. Reddy, Niloufar Saadat, Jason Bowerman, Kimmo J. Hatanpaa, Toral R. Patel, Baowei Fei, Matthew D. Lee, Rajan Jain, Richard J. Bruce, Ananth J. Madhuranthakam, Marco C. Pinho, Joseph A. Maldjian

PMC · DOI: 10.1038/s41467-025-64385-z · Nature Communications · 2025-10-23

## TL;DR

CATphishing generates synthetic MRI data to enable privacy-preserving collaboration in medical imaging AI without sharing raw data or requiring constant communication.

## Contribution

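Introduces CATphishing, a synthetic data generation method in which each institution trains a latent diffusion model locally and contributes only synthetic samples, as an alternative to federated learning for multi-center medical imaging collaborations.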

## Key findings

- CATphishing achieves accuracy comparable to centralized training and federated learning for isocitrate dehydrogenase (IDH) mutation classification and three-class tumor-type classification.
- Synthetic data generated by CATphishing maintains high fidelity, preserving essential medical imaging features.
- The method reduces privacy risks and communication overhead in multi-center collaborations.

## Abstract

Multi-center collaborations are crucial in developing robust and generalizable machine learning models in medical imaging. Traditional methods, such as centralized data sharing or federated learning (FL), face challenges, including privacy issues, communication burdens, and synchronization complexities. We present CATegorical and PHenotypic Image SyntHetic learnING (CATphishing), an alternative to FL using Latent Diffusion Models (LDM) to generate synthetic multi-contrast three-dimensional magnetic resonance imaging data for downstream tasks, eliminating the need for raw data sharing or iterative inter-site communication. Each institution trains an LDM to capture site-specific data distributions, producing synthetic samples aggregated at a central server. We evaluate CATphishing using data from 2491 patients across seven institutions for isocitrate dehydrogenase mutation classification and three-class tumor-type classification. CATphishing achieves accuracy comparable to centralized training and FL, with synthetic data exhibiting high fidelity. This method addresses privacy, scalability, and communication challenges, offering a promising alternative for collaborative artificial intelligence development in medical imaging.

Methods for developing machine learning models in medical imaging across multi-center collaborations face important challenges, including technical burdens and privacy issues. Here, the authors introduce CATegorical and PHenotypic Image SyntHetic learnING (CATphishing), an alternative to federated learning that generates synthetic multi-contrast 3D MRI data for downstream tasks.
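
The workflow described in the abstract (each site fits a generative model locally, uploads synthetic samples once, and a central server trains on the pooled synthetic data) can be illustrated with a toy sketch. This is not the authors' implementation: a per-class Gaussian stands in for the latent diffusion model, 1-D numbers stand in for 3D MRI volumes, a nearest-centroid rule stands in for the downstream classifier, and every function name, label string, and parameter value below is a hypothetical chosen for illustration.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical class labels and toy 1-D "image features"; not the paper's data.
CLASS_CENTERS = {"IDH-mutant": 2.0, "IDH-wildtype": -2.0}


def make_private_data(site_shift, n_per_class, rng):
    """Simulate one site's private labeled data with a site-specific shift."""
    return [(rng.gauss(center + site_shift, 0.5), label)
            for label, center in CLASS_CENTERS.items()
            for _ in range(n_per_class)]


def fit_site_generator(private_data):
    """Fit a per-class Gaussian locally (toy stand-in for a site's LDM)."""
    by_label = defaultdict(list)
    for value, label in private_data:
        by_label[label].append(value)
    return {label: (statistics.mean(vals), statistics.stdev(vals))
            for label, vals in by_label.items()}


def sample_synthetic(generator, n_per_class, rng):
    """Draw labeled synthetic samples; only these leave the site."""
    return [(rng.gauss(mu, sigma), label)
            for label, (mu, sigma) in generator.items()
            for _ in range(n_per_class)]


def train_central_classifier(pooled):
    """Train a nearest-centroid classifier on pooled synthetic data only."""
    by_label = defaultdict(list)
    for value, label in pooled:
        by_label[label].append(value)
    centroids = {label: statistics.mean(vals) for label, vals in by_label.items()}
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


def run_demo(seed=0, site_shifts=(-0.3, 0.0, 0.3), n_per_class=50):
    """Run the one-shot pipeline and return (pooled sample count, accuracy)."""
    rng = random.Random(seed)
    pooled = []
    for shift in site_shifts:
        private = make_private_data(shift, n_per_class, rng)  # stays on site
        generator = fit_site_generator(private)               # trained on site
        # Single upload per site: synthetic samples, never the private data.
        pooled.extend(sample_synthetic(generator, n_per_class, rng))
    classify = train_central_classifier(pooled)
    # Evaluate on fresh simulated "real" data from an unshifted site.
    held_out = make_private_data(0.0, 200, rng)
    accuracy = sum(classify(x) == y for x, y in held_out) / len(held_out)
    return len(pooled), accuracy


if __name__ == "__main__":
    n_pooled, accuracy = run_demo()
    print(f"pooled synthetic samples: {n_pooled}, held-out accuracy: {accuracy:.3f}")
```

The point of the sketch is the communication pattern: each site contributes once, with no gradient exchange or synchronization rounds, which is the contrast with federated learning that the abstract draws. It says nothing about the fidelity or privacy properties of real diffusion-generated images.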

## Linked entities

- **Genes:** Idh (Isocitrate dehydrogenase) [NCBI Gene 44291]

## Full-text entities

- **Diseases:** tumor (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12550077/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12550077/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/PMC12550077/full.md

---
Source: https://tomesphere.com/paper/PMC12550077