Consensus on key domains for emergency medical teams deployment evaluation: a Delphi method study

Tiffany Yeung; Daniel G. Bausch; Arlinda Cerga Pashoja; Joanna Schellenberg

PMC · DOI: 10.1186/s13031-026-00751-y · February 5, 2026

Open Access

TL;DR

This study creates a standardized framework for evaluating emergency medical teams during disasters by reaching consensus among stakeholders.

Contribution

A novel consensus-based framework for evaluating emergency medical team deployments using stakeholder input.

Findings

01

Participants reached consensus on 23 themes and 162 of 165 questions for EMT evaluation.

02

17 themes and 55 questions achieved 100% consensus among stakeholders.

03

The proposed framework is based on WHO guidelines and stakeholder feedback.

Abstract

Emergency Medical Teams (EMT) deploy to provide urgent care during and after sudden onset disasters. Although the World Health Organisation has set standards for EMTs on training, personnel, and operations, there is presently no standardised evaluation framework to assess deployments. As a step towards creation of an EMT deployment evaluation framework with an agreed upon common set of criteria, we used a two-round modified Delphi method to elicit perspectives from key stakeholders, including EMT members, researchers, funders, representatives of deploying governments, and host organisations. We asked participants to rate themes and questions relevant to EMT evaluation, derived from a previous study, using a four-point Likert scale, with consensus considered reached when 75% or more respondents rated the themes or questions ‘slightly suitable’ or ‘highly suitable’. Two rounds of the…

Keywords

Natural disasters · Humanitarian · Emergencies · Disaster aid · Evaluation · Delphi method

Taxonomy

Topics: Disaster Response and Management · Global Health and Surgery · Trauma and Emergency Care Studies

Full text

Background

An Emergency Medical Team (EMT) is composed of health professionals providing urgent care during sudden onset disasters, both “natural” and human-caused [1]. In 2014, the World Health Organization (WHO) launched the EMT Initiative to set common standards for EMT healthcare services to enhance and ensure response quality. The EMT initiative includes a process of EMT classification, encouraging all EMTs to register within the WHO EMT system, and have their services and skills validated by the initiative [2].

Initially focused on surgical care following earthquakes, the WHO EMT Initiative has adapted to new challenges, creating guidelines for infectious diseases and conflict zones. While the WHO EMT initiative sets minimum standards, countries may nevertheless accept teams that have not been classified by WHO, leading to concerns about EMT service quality and accountability. Even before the WHO initiative was established, there were concerns regarding timing, relevance, and integration of EMT activities with local health systems [3–6]. Scepticism regarding the motivations behind international aid continues to grow, particularly among beneficiaries, and media critiques often hold EMTs accountable more effectively than funders [7, 8]. Transparency in evaluation methods is critical in learning from past deployments [9]. Recent literature stresses the need for quality assurance and adherence to WHO standards [10, 11].

Despite a 2023 study highlighting the need for standardised reporting and sharing of experiences [12], the EMT sector lacks a standardised evaluation framework to assess EMT deployments. Instead, it often relies on daily reports submitted during deployment that emphasise clinical data, an approach that is not conducive to overall assessment of accountability and effectiveness and that makes it hard to draw lessons across deployments [7, 11, 13, 14]. EMT evaluations that have been performed tend to focus on funder requirements rather than improvements. There is often reluctance to share or publish data [15, 16], perhaps to avoid exposing shortcomings in expectations and performance gaps [6, 17–19], rendering it difficult to make evidence-based decisions [7, 11, 13, 14, 20]. The publications that do exist often focus on descriptive narratives and ‘lives saved’, rather than a broader range of measures that can give a better idea of how resources are used to support local health infrastructure [13, 18, 19, 21–23]. Standardised evaluations of EMT deployments could help improve adherence to the standards set by WHO [3, 24]. In its strategic priorities for 2022–2024, the Inter-Agency Standing Committee, an inter-agency forum of the United Nations and non-UN humanitarian partners founded in 1991 to strengthen humanitarian assistance, included improving ‘Accountability to Affected People,’ ensuring that no one is overlooked, and that community feedback is considered [25, 26]. However, international disaster rules and standards require improved implementation, and practical methods to support adherence must be developed.

While there are no established frameworks for EMT evaluations, various prominent organisations in the humanitarian field have developed frameworks for longer-term humanitarian assistance. Our extensive review of existing reports, documents, and guidelines related to humanitarian assistance evaluations revealed that, although most humanitarian organisations have their own evaluation frameworks, the criteria used vary widely and are generally not directly applicable to EMT evaluations, which require quicker assessments during EMT deployments than longer-term humanitarian assistance does [27].

Given the lack of an evaluation framework for EMT deployments, we sought to gather views of experts in this field on key themes and questions to be incorporated into a future EMT evaluation, with a goal of eventually producing a framework that can demonstrate accountability, foster discussions for improvement and knowledge sharing, and provide an overview of resource allocation. We focussed on the three most common types of disasters that required international aid: floods, tropical cyclones, and earthquakes and related tsunamis.

The Organisation for Economic Co-operation and Development classifies evaluations into five types: formative, process, output, outcome or performance, and impact [28]. This study focussed on process, output, and outcome evaluations, which assess short-term outputs and are the types most relevant to EMT deployments.

Methods

The reporting in this paper adheres to the Conducting and Reporting DElphi Studies (CREDES) checklist [29].

Study design

This is a mixed methods study using the Delphi method, including both qualitative and quantitative data.

Ethics approval

Ethics approval was provided by the London School of Hygiene & Tropical Medicine Ethics Committee (Ref: 29517). Participants gave written informed consent before each round of the Delphi questionnaire.

The Delphi method

The Delphi method is a widely used approach that facilitates consensus among experts, particularly in the development of standards, frameworks, and guidelines [30]. We selected the Delphi method to obtain expert consensus on a relatively novel topic. This approach allows participants to remain anonymous, thereby reducing the influence of dominant individuals and facilitating unbiased consensus-building. We employed a modified Delphi method, drawing on findings from published literature and qualitative interviews to inform the development of the first-round questionnaire [31]. As a result, the process did not begin with open-ended questions [32]. We administered two rounds of questionnaires. In addition to questions about the content of an EMT evaluation, in Round 1 we included questions related to the process – who should conduct the evaluation and how long post-EMT deployment. Figure 1 shows the stages of the Delphi method process in this research, from the preparation to analysis.

Fig. 1. Stages of the Delphi Method

In a previous study we generated common evaluation themes and questions from interviewing EMT stakeholders about their perspectives and current practices in evaluating EMT deployments [31]. Responses to interview questions regarding indicators of successful EMT deployments, necessary data for improving future operations, and participants’ prior experiences and evaluative ideals yielded a set of variables critical to deployment assessment, which we subsequently rewrote as questions, grouped thematically, and organised within the six After Action Review pillars established by the WHO to assess response to significant public health events (Table 1) [33].

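Table 1. Evaluation themes that reached consensus across two rounds of Delphi method

After Action Review Pillar | Evaluation themes | Agreement Intensity (Absolutely Suitable + Slightly Suitable)
1. Leadership | Leadership | 100%
1. Leadership | Teamwork | 100%
2. Partner coordination | Deploying organisation | 100%
2. Partner coordination | Local Communities | 93%
2. Partner coordination | Host Organisations | 100%
2. Partner coordination | World Health Organization, Emergency Medical Teams Coordination Cell and other Emergency Medical Teams | 100%
3. Information management and planning | Preparation of evaluation | 93%
3. Information management and planning | Information Management (new in Round 2) | 100%
3. Information management and planning | Documentation | 100%
4. Health operations and technical expertise | Team Members’ Experience | 100%
4. Health operations and technical expertise | Filling in gaps of needs | 86%
4. Health operations and technical expertise | Patient Numbers | 100%
4. Health operations and technical expertise | Clinical Care | 100%
4. Health operations and technical expertise | Referral of Patients | 100%
4. Health operations and technical expertise | Local engagement | 100%
4. Health operations and technical expertise | Post-deployment follow-up | 100%
5. Operations support and logistics | Arrival | 100%
5. Operations support and logistics | Quality assurance | 100%
5. Operations support and logistics | Equipment, supplies and logistics | 93%
5. Operations support and logistics | Safety (new in Round 2) | 94%
5. Operations support and logistics | Exit | 100%
6. Finance and administration | Finance & Administration | 100%
6. Finance and administration | Publications and reports | 100%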

The After Action Review was developed as part of the WHO International Health Regulations Monitoring and Evaluation Framework to evaluate nation-wide emergency response but is not specifically oriented toward individual EMTs’ actions. Rather, its pillars represent broad, overarching areas, from which WHO suggests that users design an appropriate review framework specific to each response based on the different contexts. Thus, while it served as a useful framework for development of questions and themes for this study, the WHO After Action Review Toolkit cannot serve in and of itself as an EMT evaluation.

We asked participants to rate the evaluation themes and respective questions on their perceived suitability for an EMT deployment evaluation framework, using a four-point Likert scale: ‘highly suitable’, ‘slightly suitable’, ‘slightly unsuitable’ and ‘highly unsuitable’. We then gave participants the opportunity to explain their choices, as well as to provide new suggestions to be added to the list.

Round 2 included evaluation themes and questions that had not reached consensus in the first round, along with items that received suggestions for modification and new proposals emerging from participants’ comments in Round 1. For questions that reached consensus but were accompanied by suggested revisions, we asked participants to indicate whether they preferred the original version or supported the proposed changes.
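The Round 2 selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the `Item` class, the `build_round_two` function, the task labels, and the example items are all hypothetical, and the sketch assumes that an item below the threshold is simply re-rated even if a revision was also suggested for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    text: str
    # Share of 'suitable' ratings in Round 1 (0..1); None for items not yet rated.
    agreement: Optional[float] = None
    # Reworded text proposed in participants' comments, if any.
    suggested_revision: Optional[str] = None


def build_round_two(round_one, new_proposals, threshold=0.75):
    """Return (item, task) pairs for a hypothetical Round 2 questionnaire."""
    round_two = []
    for item in round_one:
        if item.agreement < threshold:
            # No consensus in Round 1: ask participants to rate the item again.
            round_two.append((item, "re-rate"))
        elif item.suggested_revision is not None:
            # Consensus reached, but a revision was suggested: ask for a preference.
            round_two.append((item, "choose original or revision"))
        # Items with consensus and no suggested revision are not asked again.
    for text in new_proposals:
        # New proposals from Round 1 comments are rated for the first time.
        round_two.append((Item(text), "rate"))
    return round_two


# Hypothetical Round 1 results, for illustration only.
round_one = [
    Item("Question A", agreement=1.0),
    Item("Question B", agreement=0.6),
    Item("Question C", agreement=0.9, suggested_revision="Question C (reworded)"),
]
plan = build_round_two(round_one, new_proposals=["Question D"])
for item, task in plan:
    print(f"{item.text}: {task}")
```

In this sketch, Question A drops out of Round 2 because it reached consensus without any suggested change, while the other three items are carried forward with different tasks.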

We considered all participant comments provided in both rounds of the Delphi, adjusting and incorporating suggestions for new evaluation themes and questions from Round 1 into Round 2 as appropriate. We took note of any patterns to identify respondents who consistently provided low ratings.

Sampling and recruitment

A panel of 10–15 experts is considered ideal for the Delphi method [34]. Through purposeful sampling based on their roles and experience working in or with EMTs, we identified and sent invitations to 27 potential participants representing various stakeholder groups, accounting for potential non-responses (Table 2). The inclusion criteria required participants to hold a leadership role and have experience working with or within EMTs during floods, tropical cyclones, or earthquakes and related tsunamis, both before and after the COVID-19 pandemic.

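Table 2. Characteristics of participants in the Delphi study

Parameter | Category | Round 1 participants, n (%) | Round 2 participants, n (%)
Type of stakeholder | Researchers in Emergency Medical Teams and humanitarian assistance | 3 (20%) | 3 (19%)
Type of stakeholder | World Health Organization classified and unclassified Emergency Medical Teams | 8 (53%) | 9 (56%)
Type of stakeholder | Deploying governments or funders | 1 (7%) | 1 (6%)
Type of stakeholder | Host governments | 2 (13%) | 2 (13%)
Type of stakeholder | Host partners (e.g., local health services) | 1 (7%) | 1 (6%)
Gender | Male | 11 (73%) | 11 (69%)
Gender | Female | 4 (27%) | 5 (31%)
World Bank income groups of participants’ nationality | Low income | 0 (0%) | 0 (0%)
World Bank income groups of participants’ nationality | Lower-middle income | 1 (7%) | 1 (6%)
World Bank income groups of participants’ nationality | Upper-middle income | 5 (33%) | 5 (31%)
World Bank income groups of participants’ nationality | High income | 9 (60%) | 10 (63%)
World Health Organisation Region of participants’ nationality | Africa | 0 (0%) | 0 (0%)
World Health Organisation Region of participants’ nationality | Americas | 1 (7%) | 2 (13%)
World Health Organisation Region of participants’ nationality | Eastern Mediterranean | 2 (13%) | 2 (13%)
World Health Organisation Region of participants’ nationality | Europe | 7 (47%) | 7 (44%)
World Health Organisation Region of participants’ nationality | South-East Asia | 1 (7%) | 1 (6%)
World Health Organisation Region of participants’ nationality | Western Pacific | 3 (20%) | 3 (19%)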

Piloting the questionnaire

We piloted the first Delphi questionnaire with seven individuals, five with substantial experience working in EMTs and two with expertise in questionnaire administration who provided feedback on structure and flow. These individuals and their responses were not included in the subsequent study. Based on feedback from the pilot, we revised the questionnaire layout, introductory text, and wording to enhance clarity. We used an online surveying tool, Jisc Online Surveys (Jisc, United Kingdom). The questionnaire was available in English only.

Data analysis

For each round, we defined consensus as 75% or more participants rating the evaluation themes and questions as ‘slightly suitable’ or ‘highly suitable’. Questions that reached consensus but did not yield agreement on suggested modifications were retained within the ‘reached consensus’ category, with annotations indicating the lack of consensus on the proposed changes. Similarly, for suggested changes, we considered the change agreed if 75% or more participants rated ‘I agree with the change’.
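As a minimal sketch of the consensus rule stated above, the following Python computes the share of ratings in the two ‘suitable’ categories and compares it with the 75% threshold. The function names and the example ratings are hypothetical and are not taken from the study data; the study does not describe how its calculations were implemented.

```python
from collections import Counter

# The four-point Likert scale described in the text.
SUITABLE = {"highly suitable", "slightly suitable"}
SCALE = SUITABLE | {"slightly unsuitable", "highly unsuitable"}


def agreement(ratings):
    """Return the share of ratings (0..1) in the two 'suitable' categories."""
    if not ratings:
        raise ValueError("no ratings supplied")
    unknown = set(ratings) - SCALE
    if unknown:
        raise ValueError(f"unexpected rating labels: {sorted(unknown)}")
    counts = Counter(ratings)
    return sum(counts[label] for label in SUITABLE) / len(ratings)


def reached_consensus(ratings, threshold=0.75):
    """Consensus means the threshold share or more rated the item as suitable."""
    return agreement(ratings) >= threshold


# Hypothetical ratings from 16 respondents, for illustration only.
borderline = (["highly suitable"] * 8 + ["slightly suitable"] * 4
              + ["slightly unsuitable"] * 4)
below = (["highly suitable"] * 7 + ["slightly suitable"] * 4
         + ["slightly unsuitable"] * 3 + ["highly unsuitable"] * 2)

print(agreement(borderline), reached_consensus(borderline))
print(agreement(below), reached_consensus(below))
```

The first example sits exactly on the threshold (12 of 16 ratings) and counts as consensus because the rule is "75% or more"; the second (11 of 16) falls short.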

Results

Participants

Of the 27 invited participants, 15 (56%) participated in Round 1 and 16 (59%) in Round 2 (Table 2). All Round 1 participants also took part in Round 2, with one additional participant joining in the second round. The remaining invited individuals who did not participate offered no explanations for their lack of participation.

Delphi method findings

After two rounds of the Delphi process, all 23 proposed evaluation themes and 162 out of 165 questions achieved consensus (Fig. 2). Consensus was unanimous (100%) for 17 themes and 55 questions (Table 1). Notably, the majority of items receiving 100% consensus fell under the After Action Review pillar of “Health operations and technical expertise”, suggesting that this theme consistently aligns with the views and interests of participants. Many themes and questions also aligned closely with the existing literature on EMTs, indicating their prominence in current EMT publications. A full list of the evaluation themes and questions and their respective level of agreement is contained in Appendix 1.

Fig. 2. The Flow of Evaluation Themes and Questions Over Two Delphi Rounds

For the three themes and questions on which consensus was reached but without agreement on proposed changes, the disagreements were largely related to wording (e.g. “host organisation” versus “local government” or “ministry of health”) and how a particular theme (“Preparation of evaluation”) should be categorised (i.e. under “Information management and planning” or “Operations support and logistics”).

Consensus was not reached for three questions:

How long did it take for patients to arrive at the hospital? (Agreement ≤ 69%. No comments provided.)
Did the EMT do any follow-up to patients after it exited the country? (Agreement ≤ 63%. Five participants commented on the difficulty and infeasibility of doing this in the field.)
Were there repeat visits of EMT members to the disaster area after the deployment? (Agreement ≤ 69%. No comments provided.)

Conducting the evaluation

Participants’ views on who should conduct evaluations varied, but most thought it required a team approach, often including the EMT leader, and sometimes external consultants (Fig. 3).

Fig. 3. Participants' Response on Who Should be Involved in the Evaluation of an EMT Deployment

A majority (53%) of participants thought that the evaluation should take place within two months after the end of the deployment, and 93% thought it should take place within three months (Fig. 4).

Fig. 4. Participants’ Response on When Should the Evaluation be Completed

Discussion

Despite widespread agreement of the need, there is currently no standardised framework for evaluating EMT deployments. Our study represents a first step toward a solution, highlighting relevant themes and questions for such a framework for which there is significant consensus, indeed nearly unanimity, among key stakeholders.

The Delphi method proved to be a useful tool, ensuring anonymity and preventing “groupthink”, which can lead to unchallenged, poor-quality decision-making [35]. In the two-round Delphi method, consensus was achieved on all 23 evaluation themes and 162 (98%) of 165 questions for EMT deployments, including unanimous (100%) agreement on 17 themes and 55 questions. Many of these themes and questions were previously noted in the existing literature. Minor disagreements generally pertained to wording, such as “host organisation” versus “local government”, or the positioning of a particular theme within the document. For questions that reached consensus but did not reach agreement on the proposed changes, standard terms used in WHO’s Classification and Minimum Standards for EMTs, or the After Action Review guideline, could be used to guide the choice of words in future studies and a final EMT evaluation framework. The overwhelming majority (93%) of participants recommended that evaluation occur within three months post-deployment, rejecting evaluation of themes or questions that would require longer-term follow-up of patients, which they deemed unfeasible once the EMT has left the field. These views are consistent with WHO’s guidance on After Action Review, which calls for review within three months after official declaration of the end of the event. While we do not dispute the challenges involved, we think that longer-term follow-up and review, likely in collaboration with local healthcare providers, must nevertheless be considered to achieve continuity of care, as well as a full understanding of optimal approaches and the sustainable impact of EMT deployments. Issues of post-deployment follow-up have been increasingly signalled in other research [10, 36].

Participants favoured team-led evaluations, such as by the EMT leader or external consultants. EMT members are expected to undergo competency-based training prior to deploying, so the most appropriate skills are brought to the field, and team leaders are expected to undergo specific leadership training [37], which could enhance their awareness and participation in deployment evaluation. A clear command and communication structure within the EMT can facilitate decision-making [24]. Team members should be well-informed about the deployment context before deployment [38].

For EMTs to be effective, collaboration among key players (such as local governments, host organisations, the WHO, and NGOs) is essential. Each stakeholder influences EMT operations, requiring active engagement from all [39]. Timely information sharing is critical to minimise delays in responding to disaster victims, potentially reducing morbidity and mortality [12]. During sudden onset disasters, EMTs should report to local health ministries to enhance coordination, and a robust reporting system led by the WHO is essential [39, 40]. Additionally, EMTs must respect host country regulations, as accepting international aid can have political implications [41]. Effective EMT deployment requires the host country to request and approve assistance [42]. Life-saving interventions can only be performed within 48 h of a sudden onset disaster, but EMTs often arrive too late [43]. EMTs should have ready-to-deploy equipment and restock supplies locally for better self-sufficiency [14, 44]. Real-time patient data such as epidemiological data and health needs can help identify underserved populations [45], maintain care quality, and support early outbreak detection [46, 47].

EMTs provide crucial primary care when local services are compromised [21, 22]. In areas with weakened local health infrastructure, EMTs may receive patients from clinics or act as referral centres for local hospitals to prevent overwhelming local facilities [48, 49]. Building relationships with local health providers boosts community confidence and facilitates the resumption of local services [47]. As humanitarian assistance evolves, considering the perspectives of patients and communities is essential to align aid with their needs [50, 51]. Sustainability of care post-deployment is increasingly important, particularly for ongoing health needs [20, 52]. Developing follow-up protocols and extending deployments can enhance sustainability [23].

Current guidelines focus on clinical care but often overlook aspects like patient safety and mental health [53]. Shared challenges include language barriers and cultural differences, which can affect community acceptance [54, 55]. Equipment must be suitable for local conditions [12, 56], and teams must be prepared for unstable logistics [57, 58].

Effective coordination of resources is essential for targeting aid, aligning capabilities with local needs, based on rapid needs assessments after sudden onset disasters [12, 39, 47, 59]. EMTs should also publish all collected data and establish a global information network to share lessons learned with other teams [54].

The recent upheaval in the global health architecture and funding raises the question of whether WHO or another UN-based international organisation is the appropriate coordinating body for EMTs, including for deployment evaluation [62]. This question is largely beyond the scope of our study, most of which was conducted before the onset of these radical changes. Nevertheless, regardless of whether WHO or another international body is to ultimately play the primary coordinating role, as long as EMTs exist in some form, the need for evaluation will exist, and thus our research remains pertinent. Furthermore, the scrutiny that our research brings to the EMT landscape may help to assess its true value and appropriate place in the repertoire of global health offerings. In particular, transformation to ensure that the whole EMT process is appropriately community- and local stakeholder-driven is a major question and goal [63]. Moreover, in situations where conflict may arise among stakeholders, the presence of a neutral coordinating body such as WHO is crucial to ensure coherence and collective action.

We note the following limitations in our study:

Participation was below 60%, with potential selection bias due to purposive sampling, and some key stakeholders or regions may be underrepresented, limiting generalizability to all global EMT contexts.
Although data completeness was high, many participants did not provide rationales for their responses, and the study focused on identifying agreement rather than exploring reasons for disagreement, which may have limited insight into contentious areas.
Conducting the survey only in English and excluding contexts like armed conflicts may have restricted broader input.
The proposed framework still requires validation in operational EMT settings.

Conclusions

This study aimed to develop a consensus-based evaluation framework for EMT deployments using a modified Delphi method. The resulting evaluation themes and questions reflect strong stakeholder consensus and may inform future framework development. The consensus achieved suggests a shared understanding of core domains for assessing EMT deployments, particularly in the area of health operations and technical expertise. Despite a number of limitations, detailed above, the Delphi approach enabled anonymity and broad engagement, yielding perspectives that provide a valuable foundation for the development of a systematic EMT evaluation tool.

While our study does not result in, nor do we propose here, a finished EMT evaluation tool, it lays a foundation for further discussion and research toward that goal. Next steps will be to synthesize the information gained through our study into a draft EMT deployment evaluation, including refining the language and data collection methods, with clear definitions and numerators/denominators where relevant. This process must entail iterative feedback from EMT practitioners and, importantly, community members who can represent the views of those who would be on the receiving end of deployments. The scope of the evaluation will also have to be adapted for different disaster types and EMT types. The draft evaluation could then first be tested in training and simulation settings, revised as needed, and then progress to field testing after an actual deployment. This would again be followed by broad stakeholder review and appropriate revision, ultimately creating an optimized, evidence-based, practical tool that is adaptable across different disaster contexts and deployment environments, to enhance quality, accountability, and sustainability of emergency health responses worldwide.

Supplementary Information

Supplementary Material 1

Bibliography

1. United Nations Office for Disaster Risk Reduction. Definition: disaster. Prevention Web; 2023. Available from: https://www.preventionweb.net/terminology/disaster#:~:text=A%20sudden%2Donset%20disaster%20is,critical%20infrastructure%20failure%2C%20transport%20accident.
2. WHO EMT Initiative. WHO EMT Extranet. 2024. Available from: https://extranet.who.int/emt/
3. Brolin K, Hawajri O, von Schreeb J. Foreign medical teams in the Philippines after typhoon Haiyan 2013 - who were they, when did they arrive and what did they do? PLoS Curr. 2015;7. doi:10.1371/currents.dis.0cadd59590724486bffe9a0340b3e718. PMCID: PMC4447417. PMID: 26064780.
4. Inter-Agency Standing Committee. IASC Strategic Priorities 2022–2023. 2022. Available from: https://interagencystandingcommittee.org/sites/default/files/migrated/2022-01/IASC%20Strategic%20Priorities%20%282022-2023%29.pdf
5. Inter-Agency Standing Committee. IASC Strategic Priorities 2022–2024. 2024. Available from: https://interagencystandingcommittee.org/inter-agency-standing-committee/iasc-strategic-priorities-2022-2024
6. Amat Camacho N, Hughes A, Burkle FJ, Ingrassia P, Ragazzoni L, Redmond A et al. Education and Training of Emergency Medical Teams: Recommendations for a Global Operational Learning Framework. PLOS Currents. 2016;21(ecurrents.dis.292033689209611ad5e4a7a3e61520d0). doi:10.1371/currents.dis.292033689209611ad5e4a7a3e61520d0. PMCID: PMC5104687. PMID: 27917306.