# Immunocompromised Status Definition in Observational Studies Using Electronic Health Records: A Scoping Review and a Proposal for a Phenotype Identification Algorithm

**Authors:** Judit Riera‐Arnau, Nicoletta Luxi, Fabio Riefolo, Martín Solorzano, Irene Pazos, Elena Ballarín, Lise Skovgaard Svingel, Lorenzo Chiusaroli, Elisa Martín‐Merino, Elisa Barbieri, María Lopez‐Lasanta, Sima Mohammadi, Denis Rotta, Alexandra Pacurariu, Catherine Cohet, Miriam Sturkenboom, Carlos E. Durán, Olaf Klungel, Patrick Souverein, Rosa Gini, Davide Messina, Giuseppe Roberto, Carlo Giaquinto, Luca Stona, Felipe Villalobos, Carlo Alberto Bissacco, Antonio Gimeno, Beatriz Poblador, Mercedes Aza, Aida Moreno, Alejandro Santos, Vera Ehrenstein, Benjamin Randeris Johannesen, Cécile Droz‐Perroteau, Laure Carcaillon‐Bentata, Anna‐Mija Tolppanen, Sirpa Hartikainen, Thuan Vo, Anne Paakinaho, Blair Rajamaki, Hedvig Nordeng, Saeed Hayati, Mahmoud Zidan, Juan José Carreras Martínez, Arantxa Urchueguía Fornes, Elisa Correcher Martínez, Javier Díez‐Domingo, Mar Martin, Patricia Garcia‐Poza, Airam de Burgos, Belén Castillo‐Cano

PMC · DOI: 10.1002/pds.70362 · 2026-03-27

## TL;DR

This paper proposes a new algorithm to identify immunocompromised individuals in electronic health records, addressing challenges due to the dynamic nature of immune status.

## Contribution

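The first systematic attempt to describe the operational definitions used to identify immunocompromised populations in EHR-based studies, together with a proposed modular phenotype algorithm built on that review and on expert opinion.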

## Key findings

- HIV/AIDS and organ transplantation were the most frequently used diagnoses to define immunocompromised status.
- Common immunosuppressive drugs included methotrexate, corticosteroids, and TNF-alpha inhibitors.
- A modular algorithm combining diagnoses, medications, and procedures was developed to identify immunocompromised individuals in EHRs.
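To make the idea of a modular combination concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that each module (diagnosis, medication, procedure) is a set of codes and that a person is flagged when any enabled module matches. The code values, the `Record` type, and the function names are hypothetical placeholders; the paper's actual code lists and combination logic are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical code sets per module. These placeholder codes are NOT the
# paper's code lists; a real study would use ICD, ATC, or procedure codes.
MODULES = {
    "diagnosis": {"DX_HIV", "DX_TRANSPLANT"},
    "medication": {"RX_METHOTREXATE", "RX_CORTICOSTEROID", "RX_TNF_INHIBITOR"},
    "procedure": {"PX_STEM_CELL_TRANSPLANT"},
}


@dataclass(frozen=True)
class Record:
    person_id: str
    domain: str  # "diagnosis", "medication", or "procedure"
    code: str


def matched_modules(records, person_id, enabled=tuple(MODULES)):
    """Return the sorted names of enabled modules with at least one matching record."""
    hits = set()
    for r in records:
        if (
            r.person_id == person_id
            and r.domain in enabled
            and r.code in MODULES.get(r.domain, set())
        ):
            hits.add(r.domain)
    return sorted(hits)


def is_immunocompromised(records, person_id, enabled=tuple(MODULES)):
    """Assumed rule for this sketch: any enabled module matching flags the person."""
    return bool(matched_modules(records, person_id, enabled))
```

Keeping the modules separate means a data source that lacks one domain (for example, no procedure codes) can still run the remaining modules by passing a shorter `enabled` tuple.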

## Abstract

Immunocompromised individuals have impaired immune function due to conditions that may be either congenital or acquired over the course of their lives. Epidemiological studies often rely on clinical definitions which, in some cases, benefit from being translated into machine‐readable algorithms for application to electronic health record (EHR) databases. The transient nature of certain immunocompromised states and the variability of phenotypes, definitions, coding practices, and data availability complicate this translation. To address these challenges, we conducted a scoping review of existing immunocompromised status definitions in MEDLINE, focusing on epidemiologic and pharmacoepidemiologic studies involving immunocompromised populations. Data extraction was guided by clinical experts, categorizing conditions and medications into seven categories: genetic/hereditary conditions, infectious diseases, malignancies and chemotherapy, organ and stem‐cell transplantations, severe systemic conditions, immunosuppressive drugs, and autoimmune conditions associated with immunosuppressant use. Out of 137 citations, 56 studies were included. Most of the studies focused on a particular disease or therapeutic area. Frequently cited diagnoses included HIV/AIDS (17.9%) and organ transplantation (14.2%). Methotrexate, corticosteroids, TNF‐alpha inhibitors, and calcineurin inhibitors were the most common drugs used to define immunocompromised status. Building on this review and expert opinion, we developed a phenotype algorithm that combines diagnostic, therapeutic, and procedural data in a modular way to identify immunocompromised populations in EHR data sources. The proposed phenotype algorithm can be applied across diverse data sources, settings, and research questions. Future research should test its applicability across heterogeneous EHR data sources.

A proper phenotype algorithm to identify immunocompromised individuals is crucial in epidemiologic and pharmacoepidemiologic research.

This is the first systematic attempt to describe the operational definitions to identify immunocompromised populations in studies using electronic healthcare records data sources.

Based on the review results and expert opinion, we developed a phenotype algorithm to identify immunocompromised individuals when conducting research using EHR data sources. Several clinical conditions, medicines, and laboratory tests were joined using sequential logic.

A challenge when attempting to identify individuals with immunocompromised status is the dynamic nature of secondary immunodeficiencies. The proposed phenotype algorithm partially addresses this challenge by considering the time‐period most likely to define the immunocompromised status.
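One way to picture how a time period can account for transient immunocompromised states is a per-category look-back window before an index date. The sketch below is an illustration under stated assumptions, not the paper's specification: the category names, the window lengths, and the function `active_at_index` are all hypothetical, and the day counts are illustrative values only.

```python
from datetime import date, timedelta

# Hypothetical look-back windows in days. None means "any time before the
# index date". None of these values are taken from the paper.
LOOKBACK_DAYS = {
    "congenital_condition": None,   # assumed lifelong
    "organ_transplant": None,       # assumed lifelong for this sketch
    "immunosuppressive_drug": 90,   # illustrative value only
    "chemotherapy": 365,            # illustrative value only
}


def active_at_index(category, event_date, index_date):
    """True if an event of this category counts toward status at index_date."""
    if event_date > index_date:
        return False  # events after the index date never count
    window = LOOKBACK_DAYS[category]
    if window is None:
        return True  # persistent condition: any prior record counts
    return index_date - event_date <= timedelta(days=window)
```

With these assumed windows, a drug record 47 days before the index date would count, while one seven months earlier would not; a congenital diagnosis recorded decades earlier would still count.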

Immunocompromised individuals have weakened immune systems due to a congenital condition, an illness acquired later in life, or certain medical treatments. Some people might be immunocompromised only for a short period of time, and symptoms and diagnoses vary widely. Hence, identifying immunocompromised individuals in electronic healthcare records (EHRs) poses several challenges. We reviewed published studies to understand how other researchers have previously identified immunocompromised patients in EHRs when conducting pharmacoepidemiologic studies. We grouped medical conditions and medications into seven categories: genetic conditions, infections, cancer and cancer treatments, organ and stem‐cell transplants, serious systemic illnesses, immunosuppressive drugs, and certain autoimmune diseases. In total, we reviewed 56 studies. The most frequently used diagnoses to identify immunocompromised individuals were HIV/AIDS and organ transplantation. Commonly used medications included methotrexate, corticosteroids, and other drugs that suppress the immune system. Building on this review and expert opinion, we developed an algorithm that combines diagnostic codes, medicines, and medical procedures to help identify immunocompromised populations in EHR data sources, facilitating the generation of more robust evidence in future studies.

## Linked entities

- **Chemicals:** methotrexate (PubChem CID 4112)

## Full-text entities

- **Genes:** TNF (tumor necrosis factor) [NCBI Gene 7124] {aka DIF, IMD127, TNF-alpha, TNFA, TNFSF2, TNLG1F}, MTOR (mechanistic target of rapamycin kinase) [NCBI Gene 2475] {aka FRAP, FRAP1, FRAP2, RAFT1, RAPT1, SKS}, MBTPS1 (membrane bound transcription factor peptidase, site 1) [NCBI Gene 8720] {aka CAOP, PCSK8, S1P, SEDKF, SKI-1}, DHODH (dihydroorotate dehydrogenase (quinone)) [NCBI Gene 1723] {aka DHOdehase, POADS, URA1}
- **Diseases:** Hematological and solid organ malignancies (MESH:D019337), Streptococcus Group B infection (MESH:D011008), HIV/AIDS (MESH:D016263), kidney and liver disease (MESH:D008107), burns (MESH:D002056), Decompensated cirrhosis (MESH:D006333), Infectious diseases (MESH:D003141), cancer (MESH:D009369), Listeria infection (MESH:D008088), Crohn's disease (MESH:D003424), ulcerative colitis (MESH:D003093), cirrhosis (MESH:D005355), enterohepatic arthropathies (MESH:D007592), meningitis (MESH:D008580), ALPS (MESH:D056735), combined immunodeficiencies (MESH:D053632), cystic fibrosis (MESH:D003550), sepsis (MESH:D018805), neurologic damage (MESH:D020196), chronic (MESH:D002908), SCID (MESH:D016511), hepatocellular dysfunction (MESH:D018248), systemic lupus erythematosus (MESH:D008180), autoimmune conditions (MESH:D001327), progressive multifocal leukoencephalopathy (MESH:D007968), hepatitis (MESH:D056486), Cryoglobulinemia (MESH:D003449), IgG4-related disease (MESH:D000077733), ascites (MESH:D001201), bleeding esophageal varices (MESH:D004932), bacterial, viral, fungal, and opportunistic infections (MESH:D014777), systemic sclerosis (MESH:D012595), End-stage kidney disease (MESH:D007676), rheumatoid arthritis (MESH:D001172), hematological neutropenia (MESH:D006402), pemphigoid (MESH:D010391), cytomegalovirus (MESH:D003586), AIDS (MESH:D000163), secondary immunodeficiencies (MESH:D000068376), SIDs (MESH:D009202), jirovecii infection (MESH:D016720), immunodeficiencies (MESH:D007153), frailty (MESH:D000073496), portal hypertension (MESH:D006975), HCC (MESH:D006528), systemic arthritis (MESH:D001168), Haematological neutropenia (MESH:D009503), psoriatic (MESH:D015535), ankylosing spondylitis (MESH:D013167), asplenia (MESH:D059446), hemophagocytic syndromes (MESH:D051359), Pneumocystis (MESH:D011020), adult-onset Still's disease (MESH:D016706), DiGeorge syndrome (MESH:D004062), leukemia (MESH:D007938), Pneumonia (MESH:D011014), malnutrition (MESH:D044342), peritonitis (MESH:D010538), Bacterial Infections (MESH:D001424), juvenile arthritis (MESH:D001171)
- **Chemicals:** everolimus (MESH:D000068338), cobicistat (MESH:D000069547), voriconazole (MESH:D065819), prednisone (MESH:D011241), fluconazole (MESH:D015725), Sulfasalazine (MESH:D012460), bilirubin (MESH:D001663), Methotrexate (MESH:D008727), sphingosine-1-phosphate (MESH:C060506), mesalazine (MESH:D019804), itraconazole (MESH:D017964), pomalidomide (MESH:C467566), posaconazole (MESH:C101425), caspofungin (MESH:D000077336), azathioprine (MESH:D001379), amphotericin B (MESH:D000666), eculizumab (MESH:C481642), sirolimus (MESH:D020123), leflunomide (MESH:D000077339), lenalidomide (MESH:D000077269), thalidomide (MESH:D013792), J05A (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Human immunodeficiency virus 1 (no rank) [taxon 11676], Sagamiharavirus PP (species) [taxon 2956385], Human immunodeficiency virus (species) [taxon 12721]

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC13031886/full.md

---
Source: https://tomesphere.com/paper/PMC13031886