The Advanced Confidentiality Engine as a Scalable Tool for the Pseudonymization of Biomedical Data in Translational Settings: Development and Usability Study
Armin Müller, Eric Wündisch, Felix Nikolaus Wirth, Sophie Meier zu Ummeln, Joachim Weber, Fabian Prasser

TL;DR
The paper introduces ACE, a scalable open-source tool for pseudonymizing biomedical data, which helps protect privacy while enabling efficient data sharing in research.
Contribution
The paper presents ACE, a novel open-source pseudonymization service with a lean architecture and high scalability for biomedical data processing.
Findings
ACE can handle approximately 6000 transactions per second across various workload scenarios.
ACE combines cryptographic pseudonymization with persistence-based flexibility through features like domain-based structuring and metadata annotation.
Abstract
Pseudonymization refers to a process in which data that directly identify individuals, such as names and addresses, are stored separately from data needed for scientific purposes. The connection between both types of data is maintained through a protected link, represented by pseudonyms. This is a central data protection method in translational research, which enables researchers to collect, process, and share data while adhering to “data protection by design and by default” and data minimization best practices. However, integrating pseudonymization into high-throughput data processing workflows is challenging, and open-source solutions are rare. A typical example is the need to pseudonymize millions of electronic health records for secondary use in translational research platforms. This paper introduces the Advanced Confidentiality Engine (ACE), a highly scalable open-source…
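The separation described above, with identifying data kept apart from research data and linked only through a pseudonym, can be illustrated with a minimal sketch. This is not ACE's API or implementation: the names `PseudonymStore`, `pseudonymize`, `resolve`, and `split_record` are hypothetical, the in-memory dictionary stands in for a persistent store, and HMAC-SHA256 is an assumed choice of keyed hash used only to show how domain-scoped pseudonyms and metadata annotation might fit together.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field


@dataclass
class PseudonymStore:
    """Hypothetical in-memory stand-in for a pseudonymization service."""

    # Secret key for the keyed hash; a real service would manage this securely.
    secret_key: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    # (domain, pseudonym) -> identifying data plus free-form metadata.
    links: dict = field(default_factory=dict)

    def pseudonymize(self, domain: str, identifier: str, **metadata) -> str:
        # Hash the domain together with the identifier so the same person
        # receives unrelated pseudonyms in different domains.
        message = f"{domain}\x00{identifier}".encode("utf-8")
        digest = hmac.new(self.secret_key, message, hashlib.sha256).hexdigest()
        pseudonym = digest[:16]
        # Persist the protected link on first use; later calls reuse it.
        self.links.setdefault(
            (domain, pseudonym),
            {"identifier": identifier, "metadata": metadata},
        )
        return pseudonym

    def resolve(self, domain: str, pseudonym: str) -> str:
        # Re-identification: only possible with access to the store.
        return self.links[(domain, pseudonym)]["identifier"]


def split_record(store, domain, record, identifying_fields):
    """Return the research part of a record, keyed by a pseudonym."""
    identifier = "|".join(str(record[f]) for f in identifying_fields)
    pseudonym = store.pseudonymize(domain, identifier, source="example")
    research = {k: v for k, v in record.items() if k not in identifying_fields}
    research["pseudonym"] = pseudonym
    return research
```

In this sketch, calling `split_record` twice with the same record and domain yields the same pseudonym, while a different domain yields a different one, so data sets from separate domains cannot be joined without access to the store.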
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Ethics in Clinical Research · Scientific Computing and Data Management
