# Privacy-preserving data quality assessment for federated health data networks

**Authors:** Radovan Tomášik, Tobias Kussel, Zdenka Dudová, Radoslava Kacová, Roman Hrstka, Martin Lablans, Petr Holub

PMC · DOI: 10.1186/s12911-025-03328-6 · BMC Medical Informatics and Decision Making · 2026-01-27

## TL;DR

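This paper introduces a privacy-preserving method for assessing data quality in federated health data networks: data providers compute quality metrics locally and share only aggregated, differentially private results, so raw data is never exposed.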

## Contribution

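The paper contributes a framework that applies differential privacy to federated data quality assessment in health data networks, together with a proof-of-concept implementation evaluated on a synthetic dataset of observational medical data.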

## Key findings

- Differential privacy can enable federated quality assessment without compromising privacy.
- A proof-of-concept system successfully provides meaningful quality metrics from synthetic health data.
- Predefined quality checks run locally across different data models, and only aggregated, privacy-protected results are shared.

## Abstract

Assessing data quality in federated health data systems presents unique challenges, particularly when data custodians cannot expose raw data due to privacy regulations. Traditional quality assessment approaches often require centralised access, which conflicts with the principles of data sovereignty and confidentiality.

In this study, we evaluate the utility of federated data quality assessment with differential privacy techniques to safeguard sensitive health data. The aim is to develop tooling and demonstrate a proof-of-concept implementation over a synthetic dataset of observational medical data.

We present a privacy-preserving framework for evaluating data quality in federated environments using differential privacy. Our approach enables individual data providers to compute local quality metrics and share only aggregated, privacy-protected results. We implement a proof-of-concept that supports predefined quality checks across different data models and demonstrate how meaningful insights into data quality can be obtained without compromising sensitive information.
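The local-compute, share-only-protected-results flow described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Laplace mechanism, a single completeness check, one record per individual, and an even split of the privacy budget between two counts. The names `local_check`, `aggregate`, `NoisyCounts`, and the field `birth_date` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class NoisyCounts:
    """What a site shares: two noisy counts, never the records themselves."""
    failed: float
    total: float


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def local_check(records, field, epsilon, rng):
    """Run one completeness check at a site and return noisy counts.

    Assumption: each individual contributes one record, so adding or
    removing a person changes each count by at most 1 (sensitivity 1).
    """
    total = len(records)
    failed = sum(1 for r in records if r.get(field) is None)
    # Two counts are released, so each gets half the budget; by sequential
    # composition the site spends `epsilon` in total.
    scale = 1 / (epsilon / 2)
    return NoisyCounts(
        failed=failed + laplace_noise(scale, rng),
        total=total + laplace_noise(scale, rng),
    )


def aggregate(reports):
    """Combine site reports into a network-wide failure rate in [0, 1]."""
    failed = sum(r.failed for r in reports)
    total = sum(r.total for r in reports)
    rate = failed / total if total > 0 else 0.0
    # Clamping is post-processing and does not weaken the privacy guarantee.
    return min(1.0, max(0.0, rate))


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic site data: every tenth record lacks the hypothetical field.
    site = [{"birth_date": None if i % 10 == 0 else "1970-01-01"}
            for i in range(1000)]
    reports = [local_check(site, "birth_date", epsilon=1.0, rng=rng)
               for _ in range(3)]
    print(f"estimated missing rate: {aggregate(reports):.3f}")
```

Releasing noisy numerators and denominators, rather than a noisy ratio, lets the coordinator pool sites of different sizes without learning any exact local count. A production system would also need a cryptographically secure noise source and budget accounting across repeated checks, both of which this sketch leaves out.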

This work demonstrates that differential privacy can be effectively applied to enable federated quality assessment in health data networks without compromising individual privacy. By implementing a proof-of-concept system over synthetic health data, we show that it is possible to obtain meaningful quality metrics in a decentralised setting.
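A short worked bound shows why aggregated metrics can stay meaningful under added noise. It assumes the Laplace mechanism, which this summary does not confirm the authors use. The symbols are illustrative: $c_i$ is a count at site $i$, $k$ is the number of sites, $\Delta$ is the sensitivity of the count, and $\varepsilon$ is the privacy budget spent on it.

$$
\tilde{c}_i = c_i + \eta_i, \qquad \eta_i \sim \mathrm{Lap}\!\left(\frac{\Delta}{\varepsilon}\right), \qquad \mathrm{Var}(\eta_i) = \frac{2\Delta^2}{\varepsilon^2}
$$

$$
\mathrm{sd}\!\left(\sum_{i=1}^{k} \tilde{c}_i - \sum_{i=1}^{k} c_i\right) = \frac{\Delta\sqrt{2k}}{\varepsilon}
$$

With $\Delta = 1$, $\varepsilon = 1$, and $k = 10$ sites, the standard deviation of the summed noise is $\sqrt{20} \approx 4.47$ records. The error does not depend on how many records the sites hold, so the relative error of a network-wide count shrinks as the underlying counts grow.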

## Full-text entities

- **Diseases:** cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12918291/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12918291/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12918291/full.md

---
Source: https://tomesphere.com/paper/PMC12918291