Establishing Evidence Criteria for Implementation Strategies: A Delphi Study for HIV Services
Virginia Mckay, alithia zamantakis, Ana Michaela Pachicano, James Merle, Morgan Purrier, McKenzie Swan, Dennis Li, Brian Mustanski, Justin D Smith, Lisa Hirschhorn, Nanette Benbow

TL;DR
This paper introduces a new rubric to evaluate implementation strategies for HIV services, developed through expert input and tested for reliability.
Contribution
The Best Practices Rubric is a novel tool for assessing the evidence behind implementation strategies, specifically tailored for HIV services.
Findings
The rubric includes five evaluation domains and four evidence levels confirmed through Delphi rounds.
The rubric achieved 98% reliability when applied to 18 implementation strategies for pre-exposure prophylaxis.
The tool is adaptable for use in other health areas beyond HIV.
Abstract
There are no criteria specifically for evaluating the quality of implementation research and recommending to practitioners the implementation strategies likely to have impact. We describe the development and application of the Best Practices Rubric, a set of criteria to evaluate the evidence supporting implementation strategies, in the context of HIV. We developed the Best Practices Rubric from 2022–2023 in three phases. (1) We purposively selected and recruited by email participants representing a mix of expertise in HIV service delivery, quality improvement, and implementation science. We developed a draft rubric and criteria based on a literature review and key informant interviews. (2) The rubric was then informed and revised through two e-Delphi rounds using a survey delivered online through Qualtrics. The first and second round Delphi surveys consisted of 71 and 52 open and close-ended…
Taxonomy
Topics: Health Policy Implementation Science · Evaluation and Performance Assessment · Mental Health and Patient Involvement
Introduction
Implementation science is dedicated to improving the uptake and use of evidence-based interventions, practices, and policies to capitalize on scientific knowledge and impact human health. Central to the goals of implementation research is building the evidence for implementation strategies, defined as techniques or change efforts to promote the adoption, implementation, and sustainment of evidence-based interventions (EBIs) (1). In a recent review, scholars within the field of implementation science recognized that a more robust research agenda related to implementation strategies is needed to yield the promised benefits of improved EBI implementation for practitioners (2). Within this agenda was a call for more research on the effectiveness of implementation strategies. Expanding on this priority, criteria on which to evaluate evidence quality are needed to assess whether the evidence supporting the effectiveness of any given strategy is sufficient. Without criteria on which to evaluate implementation research focusing on strategies, it is difficult to recommend strategies that are likely to be the most valuable for practitioners or to identify strategies that may hold initial promise but would benefit from more robust research. Criteria are also an essential element of the creation of a compendium of evidence-based implementation strategies, which is a key dissemination approach for delivering evidence to implementers.
At the intervention level, criteria and rubrics are available to synthesize research outcomes, evaluate the quality of the research supporting an intervention, and make recommendations about its use, such as Grading of Recommendations Assessment, Development, and Evaluation (GRADE) or the criteria used by the United States Preventive Services Task Force (3, 4). These guidelines often consider different domains of research outcomes and quality, like the health outcomes, the research design, and the potential for bias in the outcomes because of the research design. Based on these guides, health institutions, like the Preventive Services Task Force, make recommendations about the best interventions across a wide set of health conditions to assist providers and organizations in making clinical and policy-level decisions. To our knowledge, no equivalent set of criteria for implementation strategies is available. As such, it is difficult to discern the quality of evidence supporting an implementation strategy and whether strategies should be recommended to practitioners to support the implementation of EBIs.
Existing criteria, like GRADE, may serve as a valuable starting point for building criteria applicable to the field of implementation research (5). Effectiveness research and its associated evaluation criteria, which heavily emphasize internal validity, consider the highest quality evidence to come from research designs like double-blind randomized controlled trials. In implementation research, internal validity tends to be more balanced with external validity so that the results are generalizable to target communities. With external validity in mind, implementation research is typically conducted in practice settings and involves assessment of the organizations and providers who will be impacted by the implementation strategy and subsequently the intervention under consideration. As a result, it is often inappropriate, impractical, and/or undesirable to leverage research designs like randomized controlled trials, because it is not possible to blind practitioners to the strategy and/or intervention or randomize at the unit of analysis (6–8). These realities make direct application of intervention-level criteria inappropriate, necessitating criteria specific to the field (3).
HIV and Implementation Research
We describe our efforts to develop a set of criteria and evaluation process for implementation strategies to address the HIV epidemic in the United States. Improvements in the US HIV epidemic have been modest over the last two decades, with disparities among communities disproportionately affected by HIV increasing (9). In an attempt to address HIV incidence, the Centers for Disease Control and Prevention have curated a repository of EBIs to support HIV prevention since the early 2000s and supported dissemination and implementation of a subset of these (10). Furthermore, major biomedical advancements, such as pre-exposure prophylaxis (PrEP), have proven to be very effective at preventing HIV. Yet many of these interventions have not been widely and equitably implemented to yield their intended benefit. Only an estimated 30% of individuals who would benefit from PrEP receive it, with growing disparities by race, gender, income, citizenship status, and intersectional marginalization (11–14). Uptake and adherence remain suboptimal along the HIV care continuum (i.e., prevention, testing, diagnosis, linkage-to-care, and treatment), indicating, in part, failed implementation and opportunities to develop evidence-informed implementation strategies (11). In 2019, the Ending the HIV Epidemic (EHE) Initiative was launched as a coordinated effort among several federal agencies to address HIV-related implementation problems. In alignment with EHE, the National Institutes of Health supported a number of mechanisms and projects to conduct research on implementation strategies (15). With the growing mass of HIV-related implementation research has come an equally growing knowledge base focusing on numerous implementation strategies targeting multiple aspects of the HIV care continuum, in a wide scope of settings, evaluating various implementation outcomes (16).
In an effort to create, synthesize, and disseminate generalizable knowledge, the Implementation Science Coordination Initiative (ISCI) was funded by the National Institutes of Health to provide technical assistance in implementation research funded by the EHE Initiative, coordinate research efforts, synthesize literature through systematic reviews, develop tools to assist researchers, and disseminate synthesized and evaluated research findings to researchers, policymakers, providers, and more (17, 18). As part of this effort, we developed a rubric to evaluate the level of evidence of HIV-related implementation strategies to identify best-practice strategies that can promote effective implementation and uptake of EBIs by HIV practitioners nationwide.
Methods
Overview
We conducted the project in three phases: (1) a literature review in tandem with key informant interviews to generate criteria for our tool, (2) a modified Delphi to evaluate and revise our initial tool and criteria, and (3) a pilot application of our rubric to a set of implementation research studies. Delphi data were collected from March 2022 to June 2023. Piloting occurred in the fall of 2023. Our data collection protocol was reviewed by the Institutional Review Board at Northwestern University and determined to be non-human subjects research. All data collection instruments have been included as a supplemental file (Supplemental File A), and data are available in a de-identified format from the first author on reasonable request. Methods and results are reported according to STROBE reporting guidelines (Supplemental File B).
Key Informant Interviews and Literature Review
We first conducted a review of the scientific and grey literature of existing compilations of criteria for assessing EBIs. We utilized this literature to construct an interview guide for key informant experts with questions focusing broadly on informants’ perceptions of the state of the field of HIV implementation research and key points that would need to be considered for or incorporated into a tool to assess evidence to define best practices in implementation strategies. We identified, recruited, and interviewed a range of experts, including implementation scientists, HIV providers and implementers, and representatives from related fields of public health research (e.g., quality improvement), and public health agency officials. All interviews were scheduled in the Spring of 2022 and were approximately 30–45 minutes long. Interviews were recorded and transcribed via Zoom. Two Ph.D.-level researchers with expertise in qualitative and mixed methods research performed an inductive, thematic process of analysis to explore patterns and categorize responses. Based on their responses, we iteratively developed a preliminary tool, criteria, and decision diagram for evaluating the quality of implementation strategy research.
Modified Delphi
Identification and Recruitment of Delphi Participants.
We conducted an asynchronous, modified Delphi in two rounds with participants of similar expertise to our key informants. Participants were recruited using snowball recommendations from those who were interviewed as key informants. Eligible participants were fluent English speakers working in the fields of HIV, mental health, substance misuse, social services, primary care, women’s health, or other related areas of public health. If participants were unable to complete the survey, they could recommend an alternative contact. After the first invitation, we sent semiweekly reminder emails for six weeks. A $50 gift card was given to participants for completing the second survey.
Data Collection and Measures.
The surveys were implemented using Qualtrics. The surveys were piloted with members of the ISCI research team to ensure question clarity. Each survey took participants approximately 45–75 minutes to complete.
First-round Delphi instrument.
This survey consisted of 71 items. The goal was to generate consensus about which aspects of the tool were most important and least important and whether we had included all the elements that participants felt were necessary. The first portion of the survey gathered demographic and basic information about the participant (e.g., age, race, ethnicity, gender), characteristics of the participant’s work (e.g., “I work primarily in… select all areas that apply”), as well as the participant’s experience in implementation research (e.g., “How would you describe your knowledge level of implementation science?”).
The second portion of the survey evaluated proposed domains (Overall Evidence of Effectiveness, Study Design Quality, Implementation Outcomes, Equity Impact, Strategy Specification, and Bundled strategies) and corresponding criteria. Participants were asked to agree or disagree (Yes/No) with the adding/dropping/combining of domains; this was followed by an open-ended question asking why they agreed to said addition/dropping/combining (if applicable). This portion also contained two 5-point Likert-type scales asking participants to rank the domains in order from most important to least important. The third portion of the survey was aimed at gaining the participant’s opinion on the specific criteria (e.g., effect size and effect direction for implementation outcomes) within each domain. For each domain, the participant was asked if there were any criteria that needed to be added/dropped (Yes/No), followed by an open-ended question asking why they would like these items added/dropped (if applicable). The participant was then provided a 5-point Likert scale in which they ranked each item from “Very unimportant” to “Very important”. These questions were repeated for all criteria in all domains.
The final portion of the survey introduced the Levels of Evidence (Best Practice Strategy, Promising Strategy, Emerging Strategy, Undetermined Strategy, and Not Recommended Strategy) and their definitions. The participant was asked if any evidence levels should be added, dropped, or combined (Yes/No), followed by an open-ended question asking why they would like these evidence levels to be added/dropped/combined (if applicable).
Second-round Delphi instrument.
This survey consisted of 52 items. Because all Round 2 participants were recruited from Round 1 respondents, the goal of this survey was to test and receive feedback on the changes to the tool made in response to the results of Round 1. The first portion of the survey gathered the same demographic and basic information as in the first round. The second portion consisted of an overview of the updated tool, including definitions of the domains, criteria, and levels of evidence, and asked for feedback on changes made from the Round 1 results. For example, in the first round of the Delphi survey, participants responded that they would like greater specificity within the criteria of the Study Design domain. In response, we split this domain into two domains for Round 2: “Study Design” and “Study Rigor and Limitations.” We presented this change to participants and asked them to agree or disagree with it (Yes/No); selecting “No” prompted an open-response question asking for further explanation. Lastly, we asked respondents to attempt to apply the criteria and give an evidence-level rating to a set of fictional cases of implementation research studies, and then allowed respondents to comment on the application and rating process.
Data Analysis and Management.
Quantitative data were managed and analyzed in Excel. Quantitative data were analyzed descriptively, primarily as percent agreement or disagreement for domains, evidence levels, and individual criteria within domains. Qualitative data were analyzed in Dedoose software and Excel, using a rapid direct qualitative content analysis approach (19). Qualitative data were analyzed by a Ph.D.-level researcher with qualitative research expertise and were intended to confirm or complement quantitative analyses.
Pilot and Application to PrEP Implementation Strategies
To ensure a high-quality process for reviewing literature and consistent application of criteria, we piloted the tool with a set of implementation strategies designed to promote the uptake of evidence-based HIV services. After two trainings, four Ph.D.-level members of the ISCI team who were also engaged in systematic reviews of HIV literature applied the criteria to an existing set of eight papers reporting on implementation strategies designed to promote PrEP uptake (20), coding a rating for each criterion. We calculated agreement as a simple percentage of identical ratings between two coders out of the total number of criteria, domain ratings, and overall rating (40 items).
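The simple percent agreement described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure or data: the function name `percent_agreement` and the two rating lists are hypothetical, and only the 40-item length is taken from the text.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two coders gave identical ratings."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same number of items")
    if not coder_a:
        raise ValueError("At least one rated item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical ratings for 40 items (criteria, domain ratings, overall rating).
coder_a = ["Best"] * 10 + ["Promising"] * 10 + ["Needs More Evidence"] * 20
coder_b = list(coder_a)
# Introduce four illustrative disagreements.
coder_b[0] = "Promising"
coder_b[11] = "Needs More Evidence"
coder_b[25] = "Promising"
coder_b[39] = "Best"

print(percent_agreement(coder_a, coder_b))  # 36 of 40 identical -> 90.0
```

Simple percent agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa would be a different measure from the one the text describes.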
Results
We report the primary results from each stage of our process as well as significant changes to the rubric made at each stage.
Literature Review and Key Informant Interviews
Our initial literature review yielded several existing rubrics, tools, criteria, and processes for evaluating evidence supporting a specific intervention (5, 21). Many had a similar structure of grouping criteria by domain (e.g., aspects of the research design or strength of the outcomes) and having different evidence ratings or levels (e.g., low, medium, high evidence strength). Conceptually, we modeled our initial tool in the same way: grouping criteria by domain and having a series of evidence levels. We conducted a total of 10 interviews. Informants reflected on different potential domains (e.g., elements of the research design) and listed specific ways that they felt research and evidence quality differed in implementation research from clinical trials. Among factors highlighted were a need to consider the context and specification of strategies, criteria specific to implementation outcomes, and consideration of the equity impact of implementation strategies on the health outcome under consideration.
Based on these results, we structured our initial tool along six domains: overall effectiveness, study design quality, implementation outcomes, equity impact, strategy specification, and a bundled strategies domain. Each domain included a set of criteria considered within each domain. For example, criteria for the implementation outcomes domain included operationalization of implementation outcomes; validity and reliability of measure used; significance and direction of effect for quantitative outcomes; and reported effects as beneficial, neutral, or harmful. We also developed and defined five evidence levels with associated recommendations: best practice strategy, promising strategy, emerging strategy, undetermined strategy, non-recommended strategy. As an example, promising strategies were described as demonstrating mostly positive outcomes that may need more rigorous examination to ensure they are having the intended effect or are generalizable to a wider context. Practitioners would be recommended to take caution when using a promising strategy in practice and ensure it is having a similar outcome as demonstrated in the original research.
Modified Delphi
For Delphi Round 1, we recruited from a pool of 68 experts. Two individuals responded stating their inability to participate, with one suggesting a replacement. Forty-one participants completed the survey, and two participants partially completed the survey, for a total of 43 participants (63% response rate). For Delphi Round 2, we recruited among the responders from Round 1, with no refusals to participate and no partial responses. Thirty participants in total completed the Round 2 survey (70% response rate). Respondent characteristics are provided in Table 1 for both Delphi rounds. Briefly, about half of respondents in both rounds self-identified as women (55.8% and 50% in Rounds 1 and 2, respectively), with the majority white (83.7%; 80%) and not Hispanic or Latino (86%; 100%). Most respondents worked in academic settings (81.4%; 80%), with most working in HIV in Round 1 but not Round 2 (83.7%; 36.7%). The largest share of respondents had 11–20 years of experience in their area of expertise (44.2%; 43.3%), and about three quarters reported experience leading implementation research projects (76.7%; 73.3%). Both complete and partially complete responses are included in the analyses.
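The reported response rates can be reproduced from the counts in the paragraph above. The sketch below is illustrative only: the helper name `response_rate` is hypothetical, and it assumes the Round 2 denominator is the 43 Round 1 responders, which is consistent with the reported 70%.

```python
def response_rate(respondents, invited):
    """Response rate as a whole-number percentage."""
    if invited <= 0:
        raise ValueError("Number invited must be positive")
    return round(100 * respondents / invited)


# Round 1: 41 complete + 2 partial responses out of 68 experts invited.
round1_respondents = 41 + 2
round1_rate = response_rate(round1_respondents, 68)

# Round 2: 30 completions, assuming all 43 Round 1 responders were invited.
round2_rate = response_rate(30, round1_respondents)

print(round1_rate, round2_rate)  # 63 70
```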
Delphi Round 1.
Table 2 presents the quantitative outcomes regarding whether the participant believed that domains should be added, dropped, or combined. More than half (58%) of participants thought no new domains should be added, while 44% of participants thought domains should be dropped or combined. When examining the evidence levels, 79% of individuals felt that no additional evidence levels were needed, while 47% thought one or more of the evidence levels could be dropped or combined.
Table 3 summarizes open-ended responses with example quotes for domains and evidence levels that were commented on most often. When reviewing the qualitative responses of those who indicated a domain should be added, most respondents suggested adding specific criteria or wanted greater clarity in how the domains and criteria within domains were defined. For example, regarding the equity domain, individuals desired greater clarity, operationalization, and description of how equity is being considered and evaluated. Of these, four sought greater clarity of equity-related outcomes, and six recommended inclusion of equity metrics or different ways of operationalizing equity. Three participants felt equity should be examined in combination with implementation outcomes. Three suggested greater consideration of community partnership development and inclusion of the target population in the development of the strategy or design of a study. Finally, participants recommended combining promising, emerging, and/or undetermined as levels of evidence and better specifying and operationalizing the levels.
Briefly, we revised the structure of our tool along five domains: study design, implementation outcomes, study rigor and limitations, strategy specification, and equity impact. These domains each included a revised set of criteria. For example, based on the recommended additions to the study design quality domain, we split this domain into two domains: (1) study design and (2) study rigor and limitations. We considered several of the comments on dropping equity but ultimately opted to keep this domain, relax the criteria, and heavily refine the description. Other cross-cutting changes included combining the criteria for bundled strategies and strategy specification. We combined two of the evidence levels (emerging and undetermined) and revised the definitions, yielding four levels: best practice, promising practice, needs more evidence, and harmful.
Delphi Round 2.
For the second round of the Delphi, we asked respondents to confirm major changes to the tool based on the first round of the Delphi (Table 2) and had respondents evaluate our proposed process for applying the criteria. Most respondents agreed with changes to the domains and evidence levels, although there remained some commentary on the equity domain. When examining the open-ended responses among those disagreeing with the changes to the equity domain, we grouped responses into individuals who did not agree with the domain (i.e., a hard no to the revisions) and others who still had additional suggestions for the domain but approved of the domain overall (i.e., a soft no with suggested revisions; Table 3). Based on these responses, we finalized the domains and made several additional adjustments to the definitions of equity, including defining which target populations can be considered in determining whether the strategy has a positive equity impact. Finally, we revised our process for applying the rubric based on the recommendation to rate each domain in addition to giving an overall rating. While this did increase time in the review process, the change allowed us to provide information on how strategies rate across all domains, enabling researchers and practitioners to compare how strategies rate on different domains or to select a strategy that is strong in a specific domain, such as equity.
Pilot Application to PrEP Implementation Strategies
To ensure a consistent, high-quality process for applying criteria to research studies examining implementation strategies, we initially piloted the rubric with existing studies on implementation strategies to promote the uptake of evidence-based HIV services. At the conclusion, we were able to achieve 90% reliable application of the criteria, resulting in dropping some criteria and clarifying other criteria and their application. Two members of the ISCI team then applied the rubric to a set of 18 implementation strategies identified through an ongoing systematic review designed to promote uptake of PrEP in a second pilot application, achieving 98% reliability. Among the 18 strategy studies, summarized in Table 4, one was assigned an overall rating as Best Practice and the remaining were assigned as Needs More Evidence. The primary domains where strategies failed to exceed the Needs More Evidence criteria were in Research Design as well as Study Rigor and Limitations. This was largely because these studies only utilized post-implementation assessment, were intended as pilot or feasibility studies, or were conducted only at a single site. Given the early state of the implementation research related to PrEP implementation in the US, we felt that this mix of ratings was relatively appropriate. While the domains that have parallels in other rating systems resulted in relatively low ratings among our studies, we observed a good mix of ratings on domains unique to our tool and implementation research (i.e., strategy specification and equity) at the Best, Promising, and Needs More Evidence levels, suggesting these domains are sufficiently discerning among the existing set of studies.
Major changes to the rubric and criteria are summarized in Table 5. The final domains and evidence levels are provided in Table 6. The final rubric with domains, criteria, evidence levels, and application instructions is presented in Table 7.
Discussion
To our knowledge, this is the first set of criteria to evaluate level of evidence for implementation strategies to establish evidence quality and serve as a basis for recommendations to practitioners. Our Best Practice rubric was initially informed by criteria used for interventions and by interviews, refined by a Delphi, and then piloted with implementation strategies. This process yielded a rating scale (i.e., best, promising, needs more evidence, and harmful) and domains (e.g., study design, implementation outcomes, rigor, and limitations), which are common to other tools and rubrics. Yet, implementation research’s unique system-level focus required tailoring of our rubric. For instance, we have outlined criteria for the research design domain that consider the realities of where implementation research is conducted and do not require blinding or randomization for strategies to receive the highest rating. To help define the criteria for these domains, we used Proctor’s recommendations for strategy specification and for implementation outcomes, as well as current commentary on research rigor in implementation science (1, 6, 22). While these helped provide structure and specific criteria at each of the evidence levels, in conducting the pilot we noted missing information, which sometimes made it difficult to evaluate the research. We recommend using the Standards for Reporting Implementation Studies (StaRI) guidelines as well as Proctor’s recommendations for strategy specification when reporting implementation research, both to provide the details needed to evaluate the research and to help potential practitioners understand what resources and efforts are needed for implementation strategies (1).
In addition to being a new resource for implementation science, to our knowledge these are also the first evidence rating criteria that consider the potential to improve equity in a health issue. Because implementation science directly impacts communities with the potential to improve or exacerbate inequities, HIV included, experts reiterated that equity was a critical domain to include. However, our work demonstrates a lack of consensus in the implementation science field about what equity in implementation science means. We have emphasized community engagement in the research process, a research focus on populations experiencing inequities, as well as equity in outcomes as a means to encourage attention to and improvement in HIV-related inequities, as many in the field have advocated (23–25). We recognize that no single implementation strategy (or intervention) is going to adequately address the deeply rooted structural determinants, like racism and homophobia, which keep inequities entrenched. However, it is our hope that by including equity improvement among the criteria for the highest quality research, we can bring additional attention to and encourage equity in HIV outcomes in the US.
Our rubric and criteria are designed to discern among studies specific to HIV implementation research, which is a rapidly growing field. There are other health areas, such as cancer, for which there are more studies leveraging more rigorous research designs to evaluate implementation strategies (26, 27). We encourage others who may consider using this rubric in their area of implementation science to review the specific criteria within each of the domains and at each of the evidence levels to ensure that the rubric appropriately discerns among available studies before application. Conversely, we received many suggestions about more stringent criteria that participants felt should be included but that we were not able to include because doing so would have resulted in few-to-no strategies identified as best practice. US-focused HIV implementation science is still in its adolescence, with many pilots and full-fledged trials underway but not yet published. It is our hope that in the future, we will be able to include more stringent criteria within the rubric so that the needed evidence quality improves over time within HIV implementation research.
There are some notable limitations to the processes used to develop the Best Practice rubric and to the criteria themselves. We used a modified Delphi approach to develop the rubric and criteria. Our use of this method did not result in consensus, but instead resulted in an approximation of consensus. In addition, we were not able to elicit opinions about the appropriateness of the rubric and tool from front-line implementers in balance with those of the research community. We hope to address this in future iterations of this work.
We envision several future directions for this tool with implications for both researchers and practitioners. Systematic reviews of HIV-related implementation strategies are currently underway (28). The next phase will entail applying these criteria to implementation strategies identified through these reviews and developing a compendium of strategies, with the intention of supporting strategy dissemination with best or promising evidence for their adoption and scale up. We recognize that a rating and recommendation is not sufficient to support uptake, and a complementary dissemination effort is underway to provide the needed information and materials for adoption and penetration. Our criteria and rating system will also yield benefits for researchers conducting HIV implementation research. Along with the systematic review, it will identify strategies for which there is already good evidence, as well as strategies that hold promise but would benefit from additional research and additional evidence supporting their effectiveness. Researchers can also use these criteria in designing studies of new strategies so that they can score better on these criteria.
Conclusion
For practitioners to fully benefit from research developing and testing implementation strategies targeting HIV services, clear evaluation criteria and recommendations are needed to assess which strategies are the most likely to have benefit and impact. Our process for developing a rubric and criteria yielded domains and criteria that are specific and appropriate for implementation research and that can be used to evaluate evidence quality in HIV-related implementation strategies. This rubric includes recommendations for practitioners about strategies for which there is best evidence and recommendations for research about strategies for which more evidence is needed. Establishing criteria to evaluate implementation strategies advances implementation science by filling a critical gap in HIV implementation research, and the approach can be extended to other areas of implementation science.
References
- 1. Proctor EK, Powell BJ, McMillen JC. Implementation strategies: recommendations for specifying and reporting. Implement Sci. 2013;8(1):139. doi:10.1186/1748-5908-8-139. PMID: 24289295. PMCID: PMC3882890.
- 2. Powell BJ, Fernandez ME, Williams NJ, Aarons GA, Beidas RS, Lewis CC, et al. Enhancing the Impact of Implementation Strategies in Healthcare: A Research Agenda. Front Public Health. 2019;7. doi:10.3389/fpubh.2019.00003. PMID: 30723713. PMCID: PMC6350272.
- 3. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336(7650):924–6. doi:10.1136/bmj.39489.470347.AD. PMID: 18436948. PMCID: PMC2335261.
- 4. Sawaya GF, Guirguis-Blake J, LeFevre M, Harris R, Petitti D, for the U.S. Preventive Services Task Force. Update on the Methods of the U.S. Preventive Services Task Force: Estimating Certainty and Magnitude of Net Benefit. Ann Intern Med. 2007;147(12):871. doi:10.7326/0003-4819-147-12-200712180-00007. PMID: 18087058.
- 5. GRADE handbook. https://gdt.gradepro.org/app/handbook/handbook.html. Accessed 17 Nov 2023.
- 6. Kilbourne A, Chinman M, Rogal S, Almirall D. Adaptive Designs in Implementation Science and Practice: Their Promise and the Need for Greater Understanding and Improved Communication. Annu Rev Public Health. 2024;45(1). doi:10.1146/annurev-publhealth-060222-014438. PMID: 37931183. PMCID: PMC11070446.
- 7. Lamont T, Barber N, de Pury J, Fulop N, Garfield-Birkbeck S, Lilford R, et al. New approaches to evaluating complex health and care systems. BMJ. 2016;352. doi:10.1136/bmj.i154. PMID: 26830458.
- 8. Schliep ME, Alonzo CN, Morris MA. Beyond RCTs: Innovations in research design and methods to advance implementation science. Evid-Based Commun Assess Interv. 2017;11(3–4):82–98.
