Effectiveness, barriers, and facilitating factors of strategies for active delabeling of patients with penicillin allergy labels: a systematic review
Hannah Nürnberg, Claudia Maria Denkinger, Tabea Krause, Lars Oetken, Sophie Rauer, Amelie Rapp, Lisa Marie Kern, Torsten Hoppe-Tichy, Tilman Schöning, Elham Khatamzas, Benedict Morath

TL;DR
This study reviews strategies to remove incorrect penicillin allergy labels in hospitalized patients and identifies effective methods and barriers to their implementation.
Contribution
The paper systematically evaluates delabeling strategies for penicillin allergy and identifies facilitators and barriers to their implementation.
Findings
Skin testing had the highest success rate (87%) for delabeling incorrect penicillin allergies.
Interdisciplinary team involvement and standardized algorithms facilitated successful delabeling.
Barriers included patient refusal, staff resistance, and financial limitations in smaller hospitals.
Abstract
Penicillin allergy is the most frequently reported allergy in hospitalized patients, although rarely confirmed. Given the negative outcomes associated with incorrect penicillin allergy diagnoses, implementation of programmatic delabeling strategies is important for successful antibiotic stewardship. This systematic review evaluates the effectiveness of different delabeling strategies and highlights elements and settings that facilitate or constrain their implementation. Following the PRISMA statement, PubMed/MEDLINE, EMBASE, Cochrane Library, and the grey literature databases “Worldcat” and “OpenGrey” were searched for studies reporting on interventions to identify, evaluate, or rule out incorrect penicillin allergy in hospitalized adults. Data extraction included settings, intervention types, and their effectiveness, barriers, facilitators, and regulatory factors. This review included 42…
Universitätsklinikum Heidelberg
Taxonomy
Topics: Drug-Induced Adverse Reactions · Contact Dermatitis and Allergies · Pharmacovigilance and Adverse Drug Reactions
Introduction
Penicillin allergy is the most frequently reported drug allergy in medical records with 5% to 15% of hospitalized patients claiming to be allergic, though more than 90% of those will not have a confirmed allergy [1–3]. This mislabeling can lead to a lifetime avoidance of beta-lactam antibiotics, resulting in increased utilization of less effective second-line antibiotics and an increased risk of treatment failure and mortality [4–6]. Alternative antibiotics are associated with more adverse drug reactions, longer hospital stays, higher healthcare costs, and increased rates of antibiotic resistance [7–11]. Large studies show that patients with documented penicillin allergies have higher rates of Clostridioides difficile, methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant enterococci (VRE) infections or colonization [12, 13]. Penicillin allergies also appear to have significant psychological effects on patients [14]. Given these challenges, structured testing for penicillin allergy is essential before prescribing alternative antibiotics to ensure effective antibiotic stewardship. Implementing delabeling strategies in clinical practice requires consideration of each setting’s unique characteristics, including structural, personnel, and regulatory factors [15]. The competencies of healthcare professionals involved, such as pharmacists, nurses, physicians, and clinical specialists, may impact the feasibility of these strategies.
Previously published reviews have focused on the effectiveness and safety of delabeling strategies and their feasibility of implementation by different professional groups [16–18]. Other reviews have conducted a cost analysis [10]. This systematic review is not only intended to expand the perspective on effectiveness and feasibility, but also to summarize structures and factors that enable, facilitate or restrict delabeling strategies in different settings.
Methods
This systematic review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline [19]. The detailed methodology is described in the review protocol [20]. As not all studies reported the numbers of identified, eligible, or tested patients, effectiveness outcomes were extracted and calculated in relation to the included patients for the respective intervention to allow consistent comparison across studies. Facilitators and barriers influencing the implementation of penicillin allergy delabeling strategies were categorized according to the Consolidated Framework for Implementation Research (CFIR) [21].
Eligibility criteria
Prospective studies involving adult patients (18 years and older) with a documented penicillin allergy in any hospital setting, i.e., inpatients or outpatient clinics, were included. No particular study design was excluded. Only original work published in English or German was considered.
Eligible studies needed to focus on measures to identify, evaluate, or exclude false penicillin allergy diagnoses. The primary outcome was the effectiveness of the delabeling intervention, measured by the number of patients cleared from their penicillin allergy diagnosis due to the respective delabeling intervention. Secondary outcomes had to expand the perspective on implementation and included (1) barriers and facilitators in implementing delabeling interventions and (2) measurable healthcare outcomes related to penicillin allergy, such as rates of postoperative infections, length of hospital stays, treatment costs, and use of broad-spectrum antibiotics. For inclusion, the studies only had to report on the primary outcome.
Search strategy
The databases PubMed/MEDLINE, EMBASE and Cochrane Library were searched for peer-reviewed literature from 1992 to September 3rd 2024. To ensure completeness, the databases “Worldcat” and “OpenGrey” were searched for grey literature. The search strategies used for the different databases are provided in Table S1 in the supplementary material.
Study selection process
All identified publications were collected using MS Excel® (Microsoft, Redmond, USA) and duplicates resulting from searches on different databases were removed prior to the screening process. Titles and abstracts were reviewed by three independent reviewers for the inclusion criteria.
Full text screening was performed by two reviewers on the basis of consensus. Discrepancies were resolved by a third reviewer.
Data extraction
Data were extracted to a predefined Excel sheet by one reviewer and reviewed by another. Any discrepancies arising during the data collection process were resolved by consensus or by a third reviewer. The extracted data included information on the study design, specific inpatient setting, type of intervention (e.g. risk stratification, oral challenge, skin testing or combinations) as well as the professional groups involved in the intervention and their influence on feasibility. The effectiveness of the interventions was determined by the number of patients who were cleared from their penicillin allergy label or switched to a beta-lactam antibiotic. To assess transferability, regulatory factors that significantly influenced the feasibility of the intervention were extracted. All circumstances and structures that influenced the implementation of the intervention were identified as barriers and facilitators.
Bias assessment
To assess the reliability of the included studies, a bias assessment was carried out by two reviewers using the Newcastle-Ottawa scale [22]. Disagreements were resolved by a third reviewer. The scale has three categories: selection, comparability, and outcome. The application of the scale and necessary reported content are available in the supplementary data.
Data synthesis
A descriptive analysis was performed as this review intends to comprehensively characterize available penicillin allergy delabeling strategies. Effectiveness was assessed based on the percentage of patients in the studies that were successfully delabeled as a result of the intervention. For this, effectiveness was stratified by the chosen intervention and by risk group. To illustrate the effectiveness of the various delabeling interventions, average effectiveness was calculated by adding up the number of delabeled patients across all studies and dividing by the total number of patients included in those studies. To support interpretation of the pooled effectiveness, key study characteristics such as intervention type, clinical setting, and patient risk group were narratively summarized.
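The pooled calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the class and function names are hypothetical, and the input is the three studies listed in Table 1 with skin testing as the sole intervention, whose counts add up to the 185/213 reported in the Results. The unweighted mean of per-study rates is shown only for contrast, since it weights small and large studies equally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyResult:
    """One row of extracted effectiveness data (hypothetical structure)."""
    label: str      # study identifier as listed in Table 1
    delabeled: int  # patients cleared of the penicillin allergy label
    included: int   # patients included for the intervention


def pooled_effectiveness(studies):
    """Sum of delabeled patients divided by sum of included patients."""
    total_included = sum(s.included for s in studies)
    if total_included == 0:
        raise ValueError("no included patients")
    return sum(s.delabeled for s in studies) / total_included


def mean_of_study_rates(studies):
    """Unweighted mean of per-study rates, shown only for contrast."""
    return sum(s.delabeled / s.included for s in studies) / len(studies)


# Counts taken from Table 1 (studies with skin testing as sole intervention).
skin_test_only = [
    StudyResult("Ramsey 2018 [59]", 47, 50),
    StudyResult("Arroliga 2003 [62]", 85, 96),
    StudyResult("Warrington 2000 [63]", 53, 67),
]

pooled = pooled_effectiveness(skin_test_only)
unweighted = mean_of_study_rates(skin_test_only)
print(f"pooled: {pooled:.1%}, unweighted mean: {unweighted:.1%}")
```

Pooling by patient counts lets larger studies contribute proportionally more, which matches the description of dividing total delabeled patients by total included patients.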
Results
Study inclusion
A total of 5957 articles were identified, with 596 duplicates removed prior to screening. The screening of 5361 titles and 769 abstracts resulted in 174 full texts being assessed for inclusion. Reasons for exclusion were a retrospective study design (n = 35), inclusion of children and adolescents under 18 (n = 16), no original research (n = 48), no hospital setting (n = 5), inclusion of patients with allergies other than penicillin allergy (n = 5), or penicillin allergy testing as part of initial diagnoses (n = 23). Ultimately, 42 studies were included in this systematic review (Fig. 1).
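The selection counts reported above are internally consistent, which can be checked with simple arithmetic. The sketch below only restates the figures from this paragraph; the variable names are illustrative.

```python
# Counts as reported in the study inclusion paragraph.
identified = 5957
duplicates = 596
screened_titles = identified - duplicates  # records left for title screening

full_texts_assessed = 174
exclusions = {
    "retrospective study design": 35,
    "children and adolescents under 18": 16,
    "no original research": 48,
    "no hospital setting": 5,
    "allergies other than penicillin allergy": 5,
    "penicillin allergy testing as part of initial diagnoses": 23,
}

excluded_full_texts = sum(exclusions.values())
included_studies = full_texts_assessed - excluded_full_texts

print(screened_titles, excluded_full_texts, included_studies)
```

The titles screened (5361) equal the identified records minus duplicates, and the 174 full texts minus the 132 listed exclusions leave the 42 included studies.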
Fig. 1. Study selection process
Characteristics of included studies
Most of the included studies were small, single-centre cohort studies. The studies were conducted in eight different countries with a majority in the United States (n = 18), Australia (n = 6), and Canada (n = 5) (Table 1). There was only one international multicentre study that was performed in the USA, Australia, and Canada [23]. The delabeling interventions were carried out in different clinical inpatient settings, mostly surgical (n = 5), oncology (n = 6), internal or general medicine (n = 9) wards or during surgical pre-assessment (n = 5). In 19 studies the setting was not specified, beyond the included patients being “inpatients” or “hospitalized” (Table 1).
A total of 5017 of 6269 participating patients (80%) were delabeled following penicillin allergy delabeling interventions, including skin tests followed by oral challenge (n = 2487), direct delabeling using patients’ history, interviews or questionnaires (n = 870), direct oral challenges (n = 1028), skin testing (n = 189), and outpatient testing (n = 58). Some studies only included patients with low-risk penicillin allergy for their interventions (n = 9), other studies linked the intervention to current antibiotic therapy, e.g. aztreonam, or at least prioritized patients with antibiotic therapy (n = 12).
Table 1. Characteristics of included studies

| Study | Year of publication | Country | Study design (as reported by study) | Clinical setting | Intervention | Patients included in the study | Number of included patients | Outcomes |
|---|---|---|---|---|---|---|---|---|
| R. J. Arasaratnam et al. [24] | 2024 | USA | Single-centre, quality improvement study | Medical unit | Direct delabeling, direct oral challenge | Patients with penicillin allergy documented in EHR | 154 patients included | Effectiveness: Number of delabeled patients (34% (n = 53) of patients included); Safety: Number of adverse events |
| J. Bodega-Azuara et al. [25] | 2024 | Spain | Single-centre, prospective study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with penicillin allergy, treated with non-beta lactam antibiotic therapy | 91 patients included | Effectiveness: Number of delabeled patients (84% (n = 76) of patients included); Other: Number of patients receiving beta-lactams |
| K. Drummond et al. [26] | 2024 | Australia | Multicentre, prospective cohort study | Inpatient (unspecified) | Direct delabeling, direct oral challenge | Patients with type A adverse drug reaction to penicillin | 488 patients included | Effectiveness: Number of delabeled patients (83% (n = 404) of patients included); Other: Inpatient antibiotic use post-allergy assessment |
| A. M. Hitchcock et al. [27] | 2024 | USA | Single-centre, quasi-experimental study | Orthopaedic unit (surgical) | Direct delabeling | Patients with documented beta-lactam allergy | 135 patients included in pre-intervention cohort; 66 patients included in post-intervention cohort | Effectiveness: Number of patients delabeled or with updated allergy (32% (n = 21) of patients included in post-intervention cohort); Other: Number of patients receiving cefazolin, interview length, 90-day clinical outcome |
| M. T. Krishna et al. [28] | 2024 | United Kingdom | Multicentre, prospective observational study | Medical unit, infectious diseases ward, pre-operative assessment, oncology units | Direct oral challenge | Patients with alleged or suspected reaction to any penicillin | 270 patients included | Effectiveness: Number of delabeled patients (45% (n = 122) of patients included); Safety: Number of adverse events; Feasibility: Conversion rate from screening to consent |
| D. Lanoue et al. [29] | 2024 | Canada | Single-centre, prospective intervention study | Medical unit, surgical unit | Direct delabeling, direct oral challenge | Patients with penicillin allergy | 55 patients included | Effectiveness: Number of delabeled patients (56% (n = 31) of patients included); Other: Resource utilization, costs of intervention |
| L. E. Merz et al. [30] | 2024 | USA | Single-centre, quality improvement study | Oncology | Outpatient testing | Patients with penicillin allergy documented in the EMR | 89 patients included | Effectiveness: Number of delabeled patients (44% (n = 39) of patients included), number of patients receiving a beta-lactam; Safety: Number of adverse events |
| G. J. Molina-Molina [31] | 2024 | Spain | Multicentre, prospective study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with beta-lactam allergy documented in the EMR | 249 patients included | Effectiveness: Number of delabeled patients (75% (n = 186) of patients included); Other: Use of antibiotic alternatives, demographic and clinical characteristics of patients with beta-lactam allergy labels |
| M. T. Rose et al. [32] | 2024 | Australia | Multicentre, parallel, 2-arm, open-label, randomised clinical trial | Intensive care unit | Direct oral challenge | Patients with low-risk penicillin allergy (PEN-FAST Score < 3) | 80 patients included | Effectiveness: Number of patients delabeled (49% (n = 39) of patients included); Safety: Number of patients with adverse events; Feasibility: Number of patients that consented to participate in the study |
| M. Sobrino-Garcia et al. [33] | 2024 | Spain | Single-centre, prospective study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with beta-lactam allergy | 177 patients included | Effectiveness: Number of delabeled patients (60% (n = 107) of patients included); Other: Reduction of alternative antibiotics, treatment costs |
| J. C. Y. Wong et al. [34] | 2024 | China | Multicentre, prospective, pragmatic study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with low-risk penicillin allergy who completed penicillin allergy evaluation | 228 patients included; 75 patients included in allergist cohort; 153 patients included in non-allergist cohort | Other: Difference in effectiveness (delabeling rate) and safety between allergist and non-allergists, improvement in HR-QoL of penicillin allergy evaluation (93% (n = 70) of patients included in allergist cohort; 94% (n = 144) of patients in non-allergist cohort) |
| M. B. Alnaes et al. [35] | 2023 | Norway | Single-centre, prospective study | Infectious diseases ward, allergy clinic | Direct oral challenge, outpatient testing | Patients with self-reported or EHR-documented penicillin allergy | 149 patients included | Effectiveness: Number of delabeled patients (87% (n = 130) of patients included) |
| J. Brayson et al. [36] | 2023 | USA | Single-centre, prospective study | Medical unit, surgical unit | Direct oral challenge | Patients with penicillin allergy and currently on or requiring antibiotic therapy | 132 patients included | Effectiveness: Number of delabeled patients (100% (n = 132) of patients included); Safety: Number of adverse events; Other: Medical records update by the general practitioner |
| A. M. Copaescu et al. [23] | 2023 | USA, Canada, Australia | Multicentre, parallel, 2-arm, noninferiority, open-label, international randomized clinical trial | Allergy clinic | Direct oral challenge, skin test followed by oral challenge | Patients with penicillin allergy and PEN-FAST Score < 3 | 378 patients included; 190 patients included in intervention group; 187 patients included in control group | Effectiveness: Number of delabeled patients (98% (n = 186) of patients included in intervention group, 99% (n = 186) of patients included in control group); Safety: Number of adverse events of patients |
| T. S. Li et al. [37] | 2023 | China | Single-centre, prospective study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with suspected penicillin allergy who underwent penicillin allergy evaluation | 372 patients included | Effectiveness: Number of delabeled patients (90% (n = 335) of patients included); Safety: Number of adverse events; Other: Use of second-line antibiotics |
| S. Wade et al. [38] | 2023 | United Kingdom | Single-centre, prospective quality improvement study | Pre-surgical assessment | Direct delabeling | Patients with reported penicillin allergy | 21 patients included | Effectiveness: Number of delabeled patients (100% (n = 21) of patients included); Other: Antibiotic prophylaxis, update of medical records, communication to the general practitioner |
| M. T. DesBiens et al. [39] | 2022 | USA | Single-centre, prospective observational study | Inpatient (unspecified) | Direct oral challenge | Patients with penicillin allergy and antibiotic therapy | 186 patients included | Effectiveness: Tolerance of the beta lactam challenge, with no adverse event (88% (n = 163) of patients included); Safety: Number of adverse events; Other: Improvement of antibiotic use |
| H. Bediako et al. [40] | 2022 | USA | Single-centre, prospective study | Medical unit, surgical unit | Direct delabeling | Patients with documented penicillin allergy | 33 patients included | Effectiveness: Number of delabeled patients (45% (n = 15) of patients included) |
| S. Livirya et al. [41] | 2022 | New Zealand | Single-centre, prospective cohort study | Medical unit | Direct delabeling, direct oral challenge | Patients with penicillin allergy | 150 patients included | Effectiveness: Number of delabeled patients (75% (n = 112) of patients included); Other: Number of relabeled patients after 6 months |
| K. Y. L. Chua et al. [3] | 2021 | Australia | Multicentre, prospective, comparative effectiveness study | Oncology | Direct delabeling, direct oral challenge | Patients with penicillin allergy | 361 patients included | Effectiveness: Number of delabeled patients (98% (n = 355) of patients included); Other: Antibiotic use, readmission rate, length of stay, inpatient/90-day mortality |
| S. Gaudreau et al. [42] | 2021 | Canada | Single-centre, transversal, prospective, quasi-experimental study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with penicillin allergy receiving antibiotic therapy and who have an infection that could be treated with a penicillin | 55 patients included | Effectiveness: Number of delabeled patients (47% (n = 26) of patients included); Other: Acceptance of pharmacist suggestions about antibiotic treatment, use of an antibiotic with a narrow spectrum of activity |
| Y. Ham et al. [43] | 2021 | USA | Single-centre, prospective study | Inpatient (unspecified) | Direct delabeling, direct oral challenge, skin test followed by oral challenge | Patients with penicillin allergy. Prioritization of patients with active infections whose antibiotics would change based on testing | 50 patients included | Effectiveness: Number of delabeled patients (96% (n = 48) of patients included); Safety: Number of adverse events; Other: Percentage switched to beta-lactam therapy |
| S. Kwiatkowski et al. [44] | 2021 | USA | Single-centre, prospective, quasi-experimental study | Pre-operative assessment | Direct delabeling, outpatient testing | Patients with beta-lactam allergy and a surgery where a beta-lactam antibiotic was considered first for SSI prophylaxis | 87 patients included; 50 patients included in no-intervention group; 37 patients included in intervention group | Effectiveness: Number of allergy labels updated (100% (n = 37) of patients included in the intervention group); Other: 30-day surgical site infection and CDI, acute kidney injury, allergic reactions, length of stay |
| Y. C. Song et al. [45] | 2021 | USA | Single-centre, prospective pilot study | Medical unit, surgical unit, pregnancy units | Direct delabeling | Patients with penicillin allergy documented in EHR | 12 patients included | Effectiveness: Number of delabeled patients (100% (n = 12) of patients included); Feasibility: Time spent by pharmacist |
| L. Steenvoorden et al. [46] | 2021 | Norway | Single-centre, prospective interventional study | Medical unit | Direct oral challenge | Patients with reported low-risk penicillin allergy | 57 patients included | Effectiveness: Number of delabeled patients (98% (n = 56) of patients included); Other: Prevalence of penicillin, practicability of method |
| N. P. Torney et al. [47] | 2021 | USA | Single-centre observational, prospective cohort study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with self-reported or documented history of a type 1 or unknown type of allergy reaction to penicillin that occurred more than five years prior to current admission | 90 patients included | Effectiveness: Percentage of patients who received PAST and were transitioned to a preferred β-lactam (84% (n = 76) of patients included) |
| S. Harmon et al. [48] | 2020 | USA | Single-centre, prospective study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with penicillin allergy. Prioritization of patients receiving a nonoptimal antibiotic regimen | 47 patients identified; 31 patients included | Effectiveness: Number of delabeled patients (87% (n = 27) of patients included); Other: Number of patients whose antibiotic therapy was deescalated/provider acceptance of recommendation, cost savings |
| K. L. Mann et al. [49] | 2020 | USA | Single-centre, prospective study | Medical unit | Direct delabeling | Patients with penicillin allergy documented in the EHR. Prioritization of patients with antibiotic therapy | 175 patients included | Effectiveness: Number of patients with any change to their allergy profile (76% (n = 133) of patients included); Other: Types of allergy changes, time from admission to interview, number of eligible patients successfully transitioned to non-carbapenem β-lactam |
| A. Ramsey et al. [50] | 2020 | USA | Single-centre, prospective study | Inpatient (unspecified) | Direct oral challenge, skin test followed by oral challenge | Patients with reported penicillin allergy and receiving antibiotic therapy | 100 patients included; 52 patients included in PAST-group; 48 patients included in DOC-group | Effectiveness: Number of negative test results (91% (n = 91) of patients included; 85% (n = 44) of patients included in PST-group; 98% (n = 47) of patients included in DOC-group); Safety: Number of positive test results; Other: Antibiotic use, cost savings, costs of skin prick test and direct challenge |
| J. A. Trubiano et al. [51] | 2020 | Australia | Multicentre, prospective cohort study | Inpatient (unspecified), oncology, outpatient cohort | Skin test followed by oral challenge | Patients with reported penicillin allergy | 622 patients included | Effectiveness: Number of negative test results (91% (n = 564) of patients included), number of any positive result of a penicillin allergy test (9% (n = 58) of patients included); Other: Validation of risk stratification tool |
| M. Devchand et al. [52] | 2019 | Australia | Single-centre, prospective study | Inpatient (unspecified) | Direct delabeling, oral challenge, skin test | Patients with documented penicillin allergy and antibiotic therapy | 106 patients included | Effectiveness: Number of delabeled patients (38% (n = 40) of patients included); Other: Antibiotic use |
| T. du Plessis et al. [53] | 2019 | New Zealand | Single-centre, prospective interventional study | Inpatient (unspecified) | Direct delabeling, direct oral challenge, outpatient testing | Patients with reported penicillin allergy | 250 patients included | Effectiveness: Number of delabeled patients (80% (n = 199) of patients included); Safety: Number of adverse events; Other: Antibiotic use, antibiotic cost, length of hospital stays, patients’ perceptions |
| F. Foolad et al. [54] | 2019 | USA | Single-centre, prospective study | Oncology | Skin test followed by oral challenge | Patients with type 1 reaction or unknown reaction to penicillin, receiving aztreonam | 49 patients included | Effectiveness: Number of patients with a negative skin prick test or oral challenge (94% (n = 46) of patients included); Other: Number of patients switched to beta-lactam agent, cost savings |
| L. Savic et al. [55] | 2019 | United Kingdom | Single-centre, prospective study | Pre-surgical assessment | Direct oral challenge | Patients with low-risk reaction to penicillin, occurred > 15 years ago | 74 patients included | Effectiveness: Number of delabeled patients (74% (n = 55) of patients included); Safety: Number of adverse events; Other: Surgical prophylaxis antibiotic use |
| M. Taremi et al. [56] | 2019 | USA | Single-centre, prospective, quality improvement study | Oncology | Skin test followed by oral challenge | Patients with history of possible type 1 reactions to penicillin | 100 patients included | Effectiveness: Number of delabeled patients (95% (n = 95) of patients included); Other: Antibiotic use |
| J. R. Chen et al. [57] | 2018 | USA | Single-centre, prospective, quasi-experimental study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with penicillin allergy and an active order for aztreonam | 136 included; 59 included for active screening (AS)-only; 77 included in AS-clinical decision support group | Effectiveness: Number of patients tested negative (39% (n = 53) of patients included), number of patients tested; Safety: Number of adverse events; Other: Proportion of inpatients on aztreonam receiving a skin test consult, time from admission to testing completion, whether the consultation occurred in the emergency department or an inpatient unit |
| Y. Moussa et al. [58] | 2018 | Canada | Single-centre, prospective study | Pre-surgical assessment | Skin test followed by oral challenge | Patients with history of an allergic reaction to penicillin | 194 patients included | Effectiveness: Number of delabeled patients (94% (n = 183) of included patients); Other: Perioperative antibiotic use |
| A. Ramsey et al. [59] | 2018 | USA | Single-centre, prospective study | Inpatient (unspecified) | Skin test | Patients with penicillin allergy and antibiotic therapy | 50 patients included | Effectiveness: Number of patients with a negative skin prick test (94% (n = 47) of patients included); Other: Number of patients switched to a penicillin-based antibiotic |
| J. A. Leis et al. [60] | 2017 | Canada | Multicentre, prospective study | Inpatient (unspecified) | Skin test followed by oral challenge | Patients with reported beta-lactam allergy | 90 patients included | Effectiveness: Number of patients receiving and tolerating preferred β-lactam therapy after negative skin testing (92% (n = 83) of patients included); Other: Infection related clinical outcomes |
| J. Marwood et al. [61] | 2017 | Australia | Single-centre, prospective study | Emergency department | Skin test followed by oral challenge | Patients with self-reported history of penicillin allergy | 100 patients included | Effectiveness: Perceived allergy status at the end of testing (81% (n = 81) of patients included); Safety: Overall and individually summarized adverse events |
| M. E. Arroliga et al. [62] | 2003 | USA | Single-centre, prospective observational study | Intensive care units | Skin test | Patients with documented penicillin allergy | 96 patients included | Effectiveness: Number of patients with negative penicillin skin test (89% (n = 85) of patients included); Safety: Number of adverse events after changing antimicrobial therapy to beta-lactam; Other: Percentage of negative tested patients changed to a beta-lactam antimicrobial |
| R. J. Warrington et al. [63] | 2000 | Canada | Single-centre, prospective study | Inpatients (unspecified) | Skin test | Patients with history of penicillin allergy | 67 patients included | Effectiveness: Number of patients with negative penicillin skin test (79% (n = 53) of patients included); Safety: Number of adverse events as a result of beta-lactam therapy; Other: Percentage of negative tested patients changed to a beta-lactam antimicrobial |

Abbreviations: HR-QoL: Health-related Quality of life | CDI: Clostridioides difficile infection | PAST: Penicillin allergy skin test | EHR: electronic health record | EMR: electronic medical record | SSI: surgical site infection | DOC: Direct oral challenge
Effectiveness of delabeling interventions
In total 6269 patients (n = 42 studies) were included for penicillin allergy delabeling interventions, with 5017 (80%) successfully delabeled. The effectiveness of the respective studies varied from 32% to 99%. The average effectiveness of the most common interventions is shown in Fig. 2. Skin testing, performed in three studies, had the highest success rate at 87% of included patients (185/213), followed by 84% (2513/2976) for a combination of skin testing and oral challenge, conducted in 17 studies. Direct oral challenges were performed in nine studies resulting in delabeling in 77% (912/1184) of the patients included for intervention. Interventions involving direct delabeling showed lower effectiveness (67% (230/344) of included patients), but effectiveness could be improved by combining it with direct oral challenges (80% (1081/1358) of included patients). The studies included different patient populations (e.g. low-risk individuals, patients on antibiotics or with infections) with varying sample sizes. Since the studies included different risk groups, Fig. 2 also shows how the average effectiveness is distributed across the various risk groups.
Fig. 2. Average effectiveness of interventions and combined interventions. Black stacked columns show the proportion of patients not delabeled. Delabeled patients (blue stacked) were stratified according to the different patient populations of the included studies
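As a plausibility check, the percentages quoted for each intervention can be recomputed from the reported numerators and denominators. The sketch below uses only the counts given in the text; the dictionary layout and helper name are illustrative assumptions.

```python
# (delabeled, included, percentage as reported in the text)
reported = {
    "skin testing": (185, 213, 87),
    "skin testing + oral challenge": (2513, 2976, 84),
    "direct delabeling + direct oral challenge": (1081, 1358, 80),
    "direct oral challenge": (912, 1184, 77),
    "direct delabeling": (230, 344, 67),
}


def percent(delabeled, included):
    """Pooled effectiveness as a whole-number percentage."""
    return round(100 * delabeled / included)


recomputed = {name: percent(d, n) for name, (d, n, _) in reported.items()}
ranking = sorted(recomputed, key=recomputed.get, reverse=True)

# Totals across the five listed categories versus the overall totals.
listed_delabeled = sum(d for d, _, _ in reported.values())
listed_included = sum(n for _, n, _ in reported.values())
overall = percent(5017, 6269)

for name in ranking:
    print(f"{name}: {recomputed[name]}%")
print(listed_delabeled, listed_included, overall)
```

All five recomputed values match the reported percentages. The five categories sum to 4921 of 6075 patients, slightly fewer than the overall 5017 of 6269, which is consistent with the text describing them as the most common interventions and not all interventions.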
Facilitating factors and barriers to penicillin allergy delabeling
Factors influencing the implementation of penicillin allergy delabeling were systematically evaluated in several areas according to the Consolidated Framework for Implementation Research (CFIR): Intervention Characteristics (e.g. complexity, adaptability), Outer Setting (e.g. external influences such as patient needs), Inner Setting (e.g. internal environment, resources, communication), Characteristics of Individuals (e.g. knowledge, beliefs, attitudes of people involved in the process), and Process (e.g. implementation actions and strategies). An overview of identified barriers and facilitators is presented in Table 2.
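One way to picture the CFIR-based coding is as a small data structure in which every extracted factor carries a domain and a direction (barrier or facilitator). The sketch below is an illustrative assumption about how such coding could be organized, not the authors' extraction sheet; the class names are hypothetical, and the example factors are taken from the subsections of this review.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class CfirDomain(Enum):
    INTERVENTION_CHARACTERISTICS = "Intervention Characteristics"
    OUTER_SETTING = "Outer Setting"
    INNER_SETTING = "Inner Setting"
    CHARACTERISTICS_OF_INDIVIDUALS = "Characteristics of Individuals"
    PROCESS = "Process"


@dataclass(frozen=True)
class CodedFactor:
    domain: CfirDomain
    kind: str         # "barrier" or "facilitator"
    description: str  # short label for the extracted factor


# Example factors named in the subsections of this review.
factors = [
    CodedFactor(CfirDomain.INTERVENTION_CHARACTERISTICS, "facilitator",
                "simple, standardized algorithm"),
    CodedFactor(CfirDomain.OUTER_SETTING, "barrier",
                "patient refusal to undergo testing"),
    CodedFactor(CfirDomain.INNER_SETTING, "barrier", "staff shortages"),
    CodedFactor(CfirDomain.CHARACTERISTICS_OF_INDIVIDUALS, "barrier",
                "lack of confidence and knowledge among providers"),
    CodedFactor(CfirDomain.PROCESS, "facilitator",
                "local champions or allergists guiding the teams"),
]


def group_by_domain(items):
    """Collect factor descriptions per domain, split by barrier/facilitator."""
    grouped = defaultdict(lambda: {"barrier": [], "facilitator": []})
    for factor in items:
        grouped[factor.domain][factor.kind].append(factor.description)
    return grouped


grouped = group_by_domain(factors)
for domain in CfirDomain:
    print(domain.value, grouped[domain])
```

Grouping by domain first and direction second mirrors the layout of an overview table of barriers and facilitators.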
Intervention characteristics
Several studies have highlighted the relative advantage of penicillin allergy delabeling strategies such as direct delabeling and oral provocation over skin testing. Direct delabeling and oral challenge were time- and resource-efficient as they avoid the cost and complexity of skin testing [3, 30, 35, 40, 50]. Oral provocation has the advantage of excluding false positive reactions to skin tests and reducing the duration of hospitalisation [23, 50]. The main advantage of direct delabeling was its time efficiency, which allowed integration into existing workflows, e.g. history taking by nursing staff [30, 40, 49]. However, skin testing remained valuable in acute care settings, especially when patients were unable to provide information about their allergy history due to their clinical condition [48]. Financial issues, including reimbursement and the high cost and time required for skin testing, were particularly challenging for smaller hospitals [30, 43, 55, 60]. An important support factor identified in seven studies was the use of a simple, standardized algorithm for the intervention procedure [34, 36, 40, 41, 45, 51].
Outer setting
A major barrier, reported by 16 studies, was patient refusal to undergo penicillin allergy testing or to participate in studies. Reasons included feeling unwell, being overwhelmed by other diagnoses (e.g., cancer), fear of allergic reactions, anxiety about needles, or general unease [24, 38, 42, 46, 48, 53, 54, 56, 57, 59]. Recall bias, especially in elderly or frail patients, made accurate risk stratification difficult due to inaccurate memories of the initial reaction [31, 41, 44, 58]. Further, the lack of standardized guidelines and regulatory restrictions on who can perform skin tests were barriers to progress [47, 48, 53]. Additional information is provided in Table 2.
Inner setting
In terms of the inner setting, structural factors were significant (Table 2). Five studies reported that successful delabeling programs can be delivered without an allergy service by non-specialists, pharmacists, or general physicians, most often using direct delabeling and oral provocation tests (Fig. 3) [3, 24, 29, 46, 50]. In the included studies, collaboration between key stakeholders, including specialists and non-specialists, was common [25, 36, 40, 44, 47, 54–56, 59]. While protocols often relied on allergists, collaboration between allergists and non-specialists could reduce this dependence, with allergists training and supervising the team and reducing uncertainty [28, 34, 50, 55, 58]. Hospitals with established antimicrobial stewardship programs and/or ID services might integrate delabeling into routine care more easily [3, 24, 42, 48, 52, 53, 60, 62]. However, staff shortages constrained progress and limited recruitment to weekdays [47, 54, 55]. Time constraints, short hospital stays, and early discharge were additional challenges that limited the time available for allergy assessments or patient education during follow-up [26, 28, 42, 49, 53, 57, 60].
Characteristics of individuals
Training played a key role in delabeling programs, with allergists training nurses, pharmacists, and physicians to perform tasks such as interviews, skin testing, intradermal drug testing, or oral challenges [24, 36, 42, 43, 47, 48, 55, 60]. Lack of confidence and knowledge in assessing, evaluating, or understanding the importance of delabeling was a common barrier among healthcare providers, further challenging delabeling efforts [34, 53]. In some cases, medical staff resisted delabeling, often because alternative antibiotics had already been started before allergy evaluation, or due to concerns about the reliability of skin tests, particularly for immunocompromised patients [42, 45, 54–56].
Process
The process could be supported by involving local champions or allergists to guide and supervise the teams [28, 34, 50, 55, 58]. Short hospital stays and early discharge often challenged planning of delabeling efforts, but some studies suggested that prioritizing patients currently receiving antibiotics could improve resource efficiency [24, 49, 57, 60]. The choice of intervention often depended on the clinical setting, with skin tests more commonly used in the ICU, where patients might be unable to provide allergy histories, while oral challenges and direct delabeling were more commonly used in surgical and internal medicine settings (see Supplementary Figure S1) [3, 24, 32, 36, 40, 41, 45, 46, 62].
Fig. 3. Healthcare providers involved in the delabeling process. “general”: without further specialization (prescribing pharmacists were considered general pharmacists). “ID”: specialization in infectious diseases or antimicrobial/antibiotic stewardship. “trained”: trained/educated (e.g. by allergists) to perform different process steps
Table 2. Facilitators and barriers for penicillin allergy delabeling mapped to CFIR domains and constructs
| CFIR Domain | CFIR Construct | Facilitators | Barriers |
| --- | --- | --- | --- |
| Intervention Characteristics | Relative Advantage | Direct delabeling and oral challenge save time, cost, and resources [3, 30, 35, 40, 50]; Oral challenge reduces false positive reactions [23, 50] | Skin testing seen as resource-intensive and time-consuming [30, 43, 55, 60] |
| Intervention Characteristics | Design Quality & Packaging | Standardized algorithms (checklists, risk stratification) [34, 36, 40, 41, 45, 51] | n.a. |
| Intervention Characteristics | Complexity | Simple protocols for direct delabeling and oral provocation allow for integration into existing workflows [30, 40, 49] | Complexity of skin testing [30, 43, 55, 60] |
| Outer Setting | Patient Needs & Resources | Patient reassurance through oral challenges and skin tests [46] | Patient refusal [24, 38, 40, 42, 46, 48, 53, 54, 56, 57, 59]; Lack of consent capacity and allergy information in older, critically ill or psychiatric patients [31, 32, 41, 44, 58, 62]; Patient mistrust direct delabeling [24, 38, 40, 42, 46, 48, 53, 54, 56, 57, 59]; Relabeling [25, 55] |
| Outer Setting | External Policies & Incentives | n.a. | Lack of standardized guidelines and regulatory restrictions on who can perform testing [47, 48, 53] |
| Inner Setting | Structural Characteristics | Non-specialists can perform delabeling [3, 24, 29, 46, 50] | Lack of allergy services [35, 59]; Staff shortage [47, 54, 55] |
| Inner Setting | Networks & Communication | Multidisciplinary collaboration [25, 36, 40, 44, 47, 54–56, 59]; Allergists supervising the team to reduce uncertainty [28, 34, 50, 55, 58] | n.a. |
| Inner Setting | Implementation Climate | Antibiotic stewardship programs or infectious disease services [3, 24, 42, 48, 52, 53, 60, 62] | Time constraints, short hospital stay, and early discharge [26, 28, 42, 49, 53, 57, 60] |
| Inner Setting | Readiness for Implementation | Existing workflows (e.g. medication assessment, history taking by nursing staff) [30, 40, 49]; Clear allergy documentation, electronic prescribing system [49, 61] | n.a. |
| Characteristics of Individuals | Knowledge & Beliefs | Training of pharmacists, nurses, and physicians to perform delabeling tasks [24, 36, 42, 43, 47, 48, 55, 60] | Lack of knowledge and confidence in assessing allergies [34, 53] |
| Characteristics of Individuals | Self-Efficacy | Training and guidance improve staff confidence [35] | Resistance of medical staff to delabeling [42, 45, 54–56] |
| Process | Planning | Prioritization of patients most likely to benefit (e.g. already on antibiotics) for resource efficiency [24, 49, 57, 60] | n.a. |
| Process | Engaging | Involvement of local champions and allergists to guide the process [28, 34, 50, 55, 58] | Low patient participation (e.g. missed appointments, low survey completion) [25, 30] |
| Process | Executing | Tailored interventions based on clinical setting (ICU vs. surgical wards) [3, 24, 32, 36, 40, 41, 45, 46, 62] | n.a. |
Impact on healthcare
Due to the prospective study designs, most studies reported direct changes in antibiotic use following allergy testing. Patients on antibiotics were often switched to preferred beta-lactams, and other studies compared antibiotic use between delabeled and non-delabeled patients [3, 26, 37, 44, 52, 53, 62]. However, only a few studies investigated overall changes in antibiotic consumption, showing significant reductions in aztreonam use after the interventions [54, 57]. One study also reported five (10%) surgical site infections in the standard-care group, all of which were associated with the use of alternative antibiotics, compared to none in the intervention group [44].
In terms of costs, one study focused on the costs of implementing interventions [9], while others explored direct savings in antibiotic costs [3, 48, 54, 56, 57, 59]. Antibiotic costs were found to be up to 2.5 times higher in patients with confirmed allergies than in delabeled patients [33, 53].
Bias assessment
Of the 42 included studies, none had a high risk of bias (1–3 points), 33 had a moderate risk (4–6 points), and nine had a low risk (7–9 points). The results of the bias assessment are listed in Table S2 in the supplementary material. Most studies were small, single-centre cohort studies that thoroughly reported the identification of penicillin allergy labels and the results of the tests conducted. However, some of these studies focused on narrower patient populations (such as patients on antibiotic therapy or only low-risk patients) or did not check whether patients had previously undergone a penicillin allergy test, which introduces a moderate risk of bias. Studies rated as low risk of bias generally met these criteria. Additionally, some studies considered further factors that could influence the outcome of the delabeling process by comparing testing strategies and the implementation of the intervention by different healthcare professionals.
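The point ranges quoted above can be written down as a small classification rule. The following is a minimal sketch in Python, assuming integer scores between 1 and 9 as stated in the text; the function name `risk_of_bias` and the example scores are hypothetical and are not taken from Table S2.

```python
from collections import Counter


def risk_of_bias(score: int) -> str:
    """Map a quality score to the risk-of-bias category quoted in the text.

    Ranges used: 1-3 points = high, 4-6 points = moderate, 7-9 points = low.
    """
    if not 1 <= score <= 9:
        raise ValueError(f"score must be between 1 and 9, got {score}")
    if score <= 3:
        return "high"
    if score <= 6:
        return "moderate"
    return "low"


# Hypothetical scores for illustration only; the per-study scores of the
# review are reported in its supplementary Table S2, not here.
example_scores = [5, 6, 8, 4, 7, 5]
tally = Counter(risk_of_bias(s) for s in example_scores)

for category in ("high", "moderate", "low"):
    print(f"{category}: {tally[category]}")
```

Applied to the actual scores, such a tally should reproduce the reported distribution of 0 high-risk, 33 moderate-risk, and 9 low-risk studies.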
Discussion
In this systematic review, we found that the effectiveness of delabeling strategies varied widely across the 42 included studies, ranging from 32% to 99%, depending on the setting and intervention. Skin testing and oral challenges were more effective (77–87%) than direct delabeling (67%). However, effectiveness varied even for similar interventions and was influenced by factors such as patient populations, sample sizes, and risk group classifications. The results suggest that the choice of intervention should be setting-specific to optimize implementation and effectiveness (see Supplementary Figure S1). Direct delabeling and oral challenges offered time-, cost-, and resource-efficient alternatives to skin testing and provided valuable options for easier integration into existing workflows. Other systematic reviews supported delabeling of large patient numbers without allergy specialists or skin testing [16, 64]. Oral provocation helped to minimize false positive tests and shortened hospital stays [23, 26, 35]. In addition, direct delabeling performed by non-specialists (e.g. pharmacists, medical students, or nurses) was also effective [30, 40, 49]. However, in specific settings such as intensive care or among elderly patients, skin testing prior to penicillin administration appeared more appropriate, as these patients were often unable to provide information on their allergy history [41, 62].
Understaffing has been reported as a common challenge that may constrain the implementation of penicillin allergy delabeling interventions [24, 25, 47, 55, 60, 65]. Multidisciplinary collaboration, clear communication, and standardized workflows have been suggested to have a positive influence on the implementation of penicillin allergy delabeling [25, 36, 41, 44, 45, 47, 55, 56]. In this review, pharmacists were identified as playing a key role in delabeling by performing direct delabeling, oral challenges, and, after training, skin tests [43, 47, 48, 57]. However, legal restrictions may limit the ability of pharmacists to perform certain procedures, especially in countries without medical prescribing rights for pharmacists [42, 66]. Many of the included studies were conducted outside Europe, and many of the pharmacists were specialized or even independent prescribers [36, 50, 53, 54, 56, 59, 60]. Given these extended competencies (e.g. prescribing authority, skin testing), the transferability to general pharmacists in countries without such competencies remains questionable. Developing safe, guideline-based interventions that can be implemented by non-specialized staff could further empower these professions to play a key role in penicillin allergy delabeling [67, 68].
Clinicians appeared to lack experience and skills and sometimes reported not being confident in performing delabeling [69]. Therefore, integrating delabeling training into clinical education and providing targeted education to clinical staff such as nurses, physician assistants, or medical students could support broader implementation of delabeling and address high workload challenges [70]. In addition to healthcare providers, patients themselves play a key role in the success of delabeling strategies. Patient refusal, often due to fear of allergic reactions or to a lack of motivation and understanding of the purpose of delabeling, remains a significant challenge [25, 40, 42, 46, 50, 55, 70]. Hence, simply testing and delabeling patients might be insufficient if patients do not understand or accept delabeling. Up to 36% of patients regain the penicillin label after a negative allergy evaluation [71, 72]. One included study using direct delabeling and oral challenge reported a 10% relabeling rate (4 of 41 patients) after six months, while another study found that 49% of skin test-negative patients were still labeled as allergic at discharge [41, 63]. Many interventions in the included studies lacked follow-up data or information on whether the effect was sustained.
One follow-up study found that while 94% of patient hospital records were correctly updated after delabeling, only 37% of primary care records reflected the change [73]. Future studies should explore ways to connect hospital and primary care to improve the sustainability of delabeling, e.g. by improving documentation and communication and by promoting patient understanding.
Limitations
This systematic review has some limitations. Literature that is only available in other databases or was published after September 3rd 2024, as well as relevant studies published in languages other than English or German, may be missing. In addition, most studies were quasi-experimental or single-centre studies, which often carry a higher risk of bias and might be limited in transferability. We addressed this risk through a rigorous bias assessment and reported the characteristics of the respective studies in detail to allow a better assessment of transferability. All studies were conducted in developed countries, and their findings may not be applicable to middle- or low-income countries. A major limitation across the literature was the lack of standardized risk stratification for penicillin allergy, which also impacted this review. Data were heterogeneous, and comparability between studies was limited, which precluded a formal meta-analysis. The development of national or international guidelines with standardized risk stratification would facilitate the comparison of delabeling strategies. The pooled effectiveness analysis provides a general overview of delabeling success but should be interpreted with caution, because the included studies differed in terms of design, patient selection, clinical setting, and delabeling approach. These differences are described in the narrative synthesis of study characteristics and should be taken into account when interpreting the results, as the pooled data may not fully reflect the specific contexts in which the interventions were applied.
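Why a pooled figure calls for caution can be illustrated with a short calculation. The sketch below assumes a simple count-weighted pooling (total delabeled divided by total assessed), which may differ from the method used in this review; the study names and counts are hypothetical and chosen only so that the per-study rates span a wide range.

```python
# Hypothetical (delabeled, assessed) counts per study; NOT data from the review.
studies = {
    "Study A": (32, 100),
    "Study B": (45, 50),
    "Study C": (198, 200),
}


def success_rate(delabeled: int, assessed: int) -> float:
    """Proportion of assessed patients whose allergy label was removed."""
    if assessed <= 0:
        raise ValueError("assessed must be a positive count")
    if not 0 <= delabeled <= assessed:
        raise ValueError("delabeled must lie between 0 and assessed")
    return delabeled / assessed


per_study = {name: success_rate(d, n) for name, (d, n) in studies.items()}

# Count-weighted pooling: larger studies dominate the pooled figure.
total_delabeled = sum(d for d, _ in studies.values())
total_assessed = sum(n for _, n in studies.values())
pooled = success_rate(total_delabeled, total_assessed)

for name, rate in per_study.items():
    print(f"{name}: {rate:.0%}")
print(f"Pooled: {pooled:.0%}")
print(f"Range: {min(per_study.values()):.0%} to {max(per_study.values()):.0%}")
```

In this illustration a single pooled value sits between per-study rates that differ by a factor of three, which is the kind of heterogeneity the narrative synthesis is meant to make visible.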
Although most studies used quantitative methods, many still included helpful information about factors that influence implementation, such as workflows, structural conditions, or staff behavior. Even if these aspects were not the main focus, they offered useful insights into barriers and facilitators. Mapping these findings to CFIR constructs allowed for a more systematic interpretation of implementation challenges and facilitators in different settings.
Conclusion
The effectiveness of different delabeling strategies varies, with skin testing and oral challenges demonstrating the highest success rates. A key finding from this review is the importance of using standardized algorithms and interdisciplinary approaches to support implementation of penicillin allergy delabeling. However, several barriers, including patient refusal, medical staff resistance, logistical challenges, and financial limitations, constrain widespread implementation. Addressing these challenges will require improving patient and provider education, enhancing collaboration among healthcare professionals, and developing clear protocols for delabeling. In resource-limited settings, oral challenges and direct delabeling after standardized risk stratification may offer practical alternatives to more resource-intensive methods like skin testing. To maximize the potential of penicillin allergy delabeling and improve antimicrobial therapy in patients, interdisciplinary teams should be formed that work within clearly defined processes with standardized risk stratification.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary Material 1: Supplementary data 1. Definition of different delabeling strategies. 2. Bias assessment specification. Figure S1: Interventions across healthcare providers and settings. Table S1: Search strategy. Table S2: Detailed results of bias assessment.
Supplementary Material 2
References
- 1. Sacco KA, Brigham BA, Imam TJ, Burton JS. MC., Clinical outcomes following inpatient penicillin allergy testing: A systematic review and meta-analysis. 2017. doi: 10.1111/all.13168. PMID: 28370003.
- 2. Mattingly TJ 2, et al. The cost of Self-Reported penicillin allergy: A systematic review. J Allergy Clin Immunol Pract. 2018;6(5):1649–54.e4. doi: 10.1016/j.jaip.2017.12.033. PMID: 29355644.
- 3. Skivington K et al. A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance. BMJ, 2021;374:n2061. doi: 10.1136/bmj.n2061. PMCID: PMC8482308. PMID: 34593508.
- 4. Damschroder LJ et al. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci, 2009. 4. doi: 10.1186/1748-5908-4-50. PMCID: PMC2736161. PMID: 19664226.
- 5. Wells GA, O’Connell BSD, Peterson J, Welch V, Losos M. P Tugwell. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses. 2013 [cited 15.06.2023]; Available from: https://www.ohri.ca/programs/clinical_epidemiology/oxford.asp
- 6. Li TS et al. Prospective assessment of penicillin allergy (PAPA): evaluating the performance of penicillin allergy testing and post-delabelling outcomes among Hong Kong Chinese. Asian Pac J Allergy Immunol, 2023. doi: 10.12932/AP-270922-1469. PMID: 37061932.
- 7. Song YC et al. Effectiveness and feasibility of Pharmacist-Driven penicillin allergy De-Labeling pilot program without skin testing or oral challenges. Pharm (Basel), 2021. 9(3). doi: 10.3390/pharmacy9030127. PMCID: PMC8293328. PMID: 34287342.
- 8. Moussa Y et al. De-labeling of beta-lactam allergy reduces intraoperative time and optimizes choice in antibiotic prophylaxis. Surgery, 2018. doi: 10.1016/j.surg.2018.03.004. PMID: 29751965.
