# EUPID—configurable privacy-preserving record linkage in federated health data spaces

**Authors:** Dieter Hayn, Emanuel Sandner, Martin Baumgartner, Bernhard Jammerbund, Fabian Wiesmüller, Stefan Beyer, Hannah Vinatzer, Angelika Rzepka, Klaus Donsa, Karl Kreiner, Guenter Schreier

PMC · DOI: 10.3389/fdgth.2026.1751234 · Frontiers in Digital Health · 2026-02-09

## TL;DR

EUPID is a privacy-preserving system for linking health data across institutions, helping researchers study rare diseases without compromising patient identities.

## Contribution

The paper introduces EUPID Services, a configurable PPRL solution for secure patient data linkage in federated health data spaces.

## Key findings

- EUPID has pseudonymized over 16 million patients across six contexts in Austria's Health Data Donation Space.
- Only four false negative matches have been identified, all caused by typing errors; no false positive match has been detected so far.
- EUPID supports FAIR data principles and is positioned to support future European health data initiatives like the EHDS.

## Abstract

Rare disease research relies heavily on secondary use of health data due to the scarcity of clinical guidelines and data sharing between research institutions and hospitals. Linking rare disease patients is challenging due to increased re-identification risk in small cohorts, thus limiting the data's potential for research. Privacy-Preserving Record Linkage (PPRL) enables the linkage of disparate datasets while safeguarding the identities of involved participants.

The aim of the present paper is to provide an up-to-date description of the concept and the technical details of the European Patient Identity (EUPID) Services, a configurable PPRL solution which is currently used for rare disease research in Europe to bridge healthcare and research. The services support different algorithms for record linkage (configurable selection of quasi-identifiers, various hashing algorithms, phonetic hashing, Bloom filters), re-identification and flexible specification of the pseudonym format. Deployment is likewise flexible: the services can be installed as standalone instances or integrated with a central EUPID Services deployment.
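To make the Bloom filter option mentioned above more concrete, the following is a minimal, generic sketch of Bloom-filter-based encoding of a single quasi-identifier, compared with the Dice coefficient. It is not taken from the EUPID implementation: the function names (`bigrams`, `bloom_encode`, `dice`), the filter size, the number of hash functions, the use of HMAC-SHA256 and the demo secret are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a normalised string into padded character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, secret: bytes, size: int = 256, num_hashes: int = 4) -> set[int]:
    """Return the set of bit positions set in a Bloom filter of `size` bits.

    Each bigram is mapped to `num_hashes` positions with a keyed hash
    (HMAC-SHA256 is an assumption here, not the documented EUPID choice).
    """
    positions: set[int] = set()
    for gram in bigrams(value):
        for k in range(num_hashes):
            digest = hmac.new(secret, f"{k}:{gram}".encode("utf-8"), hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two sets of bit positions (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


SECRET = b"demo-only-secret"  # hypothetical shared key, for illustration only

exact = dice(bloom_encode("Schneider", SECRET), bloom_encode("schneider ", SECRET))
typo = dice(bloom_encode("Schneider", SECRET), bloom_encode("Schnieder", SECRET))
other = dice(bloom_encode("Schneider", SECRET), bloom_encode("Novak", SECRET))

print(f"same name: {exact:.2f}, transposed letters: {typo:.2f}, different name: {other:.2f}")
```

Because similar strings share most of their bigrams, their encodings overlap heavily, so a similarity threshold can tolerate small spelling differences without exchanging the clear-text value. The threshold itself would have to be tuned per deployment and is not specified here.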

The EUPID Services have been used in various research applications since 2014. As of July 2025, 6,356 unique patients have been registered to the central EUPID Services within the domain Paediatric Oncology in Europe, and 10,340 pseudonyms for 12 EUPID Contexts have been generated. Within the Austrian Health Data Donation Space, which represents a federated PPRL infrastructure supporting asynchronous record linkage, more than 16 million patients were pseudonymised in six different contexts. Overall, four cases of false negative matches have been identified, which were caused by typing errors. So far, no false positive match has ever been detected.
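Why typing errors lead to false negatives can be illustrated with a generic exact-match linkage key. The sketch below is an assumption-laden example, not the EUPID algorithm: the chosen quasi-identifiers (first name, last name, date of birth), the normalisation rules, the helper names `normalize` and `linkage_key`, and the demo secret are all hypothetical.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Strip diacritics, fold case and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def linkage_key(first: str, last: str, birth_date: str, secret: bytes) -> str:
    """Keyed hash over normalised quasi-identifiers (hypothetical field selection)."""
    message = "|".join(normalize(part) for part in (first, last, birth_date))
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


SECRET = b"demo-only-secret"  # hypothetical key, for illustration only

reference = linkage_key("Zoë", "MÜLLER", "2012-03-04", SECRET)
same_person = linkage_key("zoe", " Muller ", "2012-03-04", SECRET)
with_typo = linkage_key("Zoe", "Muler", "2012-03-04", SECRET)

print(reference == same_person)  # formatting differences are absorbed by normalisation
print(reference == with_typo)    # a single missing letter yields a different key
```

Normalisation absorbs differences in case, whitespace and diacritics, but any change to the underlying characters produces an unrelated hash, so the two records are not linked. Error-tolerant encodings such as phonetic hashing or Bloom filters are the usual way to reduce this kind of miss.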

In view of upcoming European legislation such as the European Health Data Space (EHDS), connecting patient data securely and safely will become increasingly important and useful. The EUPID Services support such linkage in a Findable, Accessible, Interoperable and Reusable (FAIR) manner and thus could represent a vital and proven part of future national and European research networks.

## Linked entities

- **Diseases:** rare disease (MONDO:0021200)

## Full-text entities

- **Diseases:** Rare Diseases (MESH:D035583), TTP (MESH:D015840), disease (MESH:D004194), PPRL (MESH:C537758), Cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12927036/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12927036/full.md

## References

73 references — full list in the complete paper: https://tomesphere.com/paper/PMC12927036/full.md

---
Source: https://tomesphere.com/paper/PMC12927036