Transcultural adaptation of the virtual patient integration rating scale: a factor analysis
Isaza-Restrepo Andrés, Buitrago-Ricaurte Natalia, Bermúdez-Hernández Pablo, Ariza-Salamanca Daniel, Ibañez-Pinilla Milciades, Vergel John

TL;DR
This study adapts and validates a rating scale for assessing virtual patient use in Spanish-speaking medical education, ensuring it works well in Ibero-American contexts.
Contribution
The study transculturally validates the Virtual Patient Integration Rating Scale for Spanish-speaking environments, ensuring cultural relevance and reliability.
Findings
The VPIRS-E maintains the original four-dimensional structure with 61.8% total variance explained.
The scale demonstrated strong internal consistency (Cronbach’s alpha of 0.826) and test-retest reliability (intraclass correlation coefficients from 0.8 to 0.9).
The validated VPIRS-E is suitable for assessing student perceptions of virtual patients in Spanish-speaking medical education.
Abstract
Using Virtual Patients (VPs) in medical education has gained popularity, especially during the SARS-CoV-2 pandemic, which restricted traditional clinical training. VPs provide a learning platform for students to refine their clinical reasoning and decision-making skills in a risk-free environment. Although the educational benefits of VPs are well known, there is still a need for validated tools to assess student perceptions, which are key to optimizing learning outcomes. The Virtual Patient Integration Rating Scale (VPIRS) has made a valuable contribution in this regard, having been established in English-speaking contexts, but its applicability in Ibero-American countries remains poorly explored. This study aimed to fill this gap by transculturally validating VPIRS for Spanish-speaking medical education environments, ensuring it reflects cultural nuances. We conducted a two-phase…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSimulation-Based Education in Healthcare · Innovations in Medical Education · Clinical Reasoning and Diagnostic Skills
Background
Virtual patients (VPs) have been used in medical education to enhance students’ clinical skills, while ensuring that no real patients are harmed [1]. This approach facilitates early exposure to professional demands and integration of essential concepts that strengthen students’ critical reasoning [2]. Numerous studies have shown that integrating VPs into the medical curriculum improves students’ diagnostic, treatment, and communication skills [3–6]. Furthermore, due to the SARS-CoV-2 pandemic’s limitations on students’ hospital and primary care rotations [7, 8] there has been a considerable increase in the use of VPs for instructional purposes. Accordingly, the perception of VPs by students has become a relevant research topic.
Despite the increased interest in researching VPs in medical education, the current academic debate has primarily focused on the development and implementation of instructional designs that integrate this technology. The research on VPs shows a wide variance in context and methodology, making direct comparisons difficult and obscuring a clear understanding of their educational value [9]. Some studies have evaluatedthe effectiveness of VPs in enhancing students’ clinical skills compared to real patients [6, 10, 11] with most of these studies using qualitative or mixed-method approaches [12, 13]. However a challenge in this area of research is the limited literature available on tools that assess students’ perceptions of integrating VP into their curriculum. While some studies have explored this issue, their focus is on different disciplines, contexts, and purposes than those of our research [14, 15].
The Virtual Patient Integration Rating Scale (VPIRS) was specifically designed to measure medical students’ perceptions of VPs, focusing on their role in enhancing clinical reasoning and semiology training as an additional teaching and learning strategy. The VPIRS consists of 25 Likert-scale items in four domains: knowledge acquisition, learning facilitation, non-authentic teaching, and learning disabilities [14–18].
Another distinctive feature of VPIRS is its ability to independently assess clinical skills such as history taking, physical examination, and differential diagnosis, instead of combining them into broader categories. In addition, VPIRS is particularly well suited to educational environments that emphasize autonomous learning, as it allows students to interact with virtual patients in an autoregulated manner without needing constant supervision by a facilitator.
Although the VPIRS was originally developed and traduced in Slovenian [18], it has been used in English-speaking contexts. However, to the best of our knowledge, there is no validated Spanish version available for Ibero-American countries. This gap creates uncertainty about students’ perceptions of VPs in these contexts and hinders global comparative analyses. A directly translated VPIRS without thorough transcultural validation for the specific needs of Spanish-speaking medical students may yield unreliable results. Thus, the aim of this study was to translate and transculturally validate the VPIRS to accurately assess Spanish-speaking medical students’ perceptions of VPs use.
Methods
Study settings and participants
This adaptation and transcultural study involved first through sixth year medical students at the Universidad del Rosario. Participants were selected from those who had completed at least one course using VPs. Located in Bogota, Colombia, our university enrolls approximately 280 medical students per year. The six-year medical curriculum emphasizes the integration of basic/biomedical, clinical, population health, and sociohumanistic sciences. This integration revolves around the Problem-Based Learning (PBL) methodology, complemented by laboratories, lectures, and semiology training in real clinical scenarios starting in the first year.
As part of this curriculum, VPs are incorporated through the i-Human^®^ platform by Kaplan. The platform simulates patient encounters through avatars and detailed animations illustrating physiological, pathological, histological, and anatomical systems. Students are required to engage with these cases in English, with language proficiency certifications required throughout the program. Students interact with a curated set of cases selected and standardized by clinical experts as part of the university’s e-Clinic: Transcurricular Virtual Clinical Strategy. The learning experience has two main components: first, students work independently or in small groups to analyze the VP cases, focusing on history taking and physical examination. Then they present their findings in simulated medical rounds, where they receive formative feedback from facilitators. Second, students participate in real-time case evaluations under supervision, where they apply their knowledge to diagnose and manage VPs and generate analytical reports. This experience is designed to strengthen both clinical reasoning and semiological skills, and to prepare students for future interactions with real patients in clinical settings.
For the survey, we used non-probabilistic sampling to select our participants based on voluntary students’ participation and previous exposure to VPs. And following the best practices recommendations [16], we determined a sample size of 150 students. This calculation was based on the criterion of having at least 5 students per scale item. To reduce the risk of data loss or attrition, we included an additional 20% buffer, following the approach of similar studies [17].
The scale
We did a literature review to identify existing tools suitable for assessing student perceptions of VPs use. In PubMed, the search criteria was: ‘Perception*[Title/Abstract] AND ((“virtual patients“[Title/Abstract] AND “perception“[Other Term] AND “scale“[Other Term] AND “students“[Other Term]) OR “virtual patients“[Other Term]) AND “students“[Other Term].’ In Google Scholar, the search terms used were ‘(virtual patient scale perception “medical students”).’ This process led us to the Virtual Patients Integration Rating Scale (VPIRS), a tool initially developed in a Slovenian-speaking context and translated to English. The VPIRS, recognized for itspsychometric properties, assesses medical students’ perceptions of virtual patient-based learning. This instrument has 25 items and its responses are recorded on a 5-point Likert scale. Ratings range from 1, signifying strong disagreement, to 5, indicating strong agreement. The scale is divided into four distinct domains: Knowledge Acquisition and Retention, Learning Facilitation, Perceptions of Inauthentic Teaching, and Learning Disadvantages. Given its established validity and relevance to our pedagogical aims, we obtained the author’s authorization to translate and transculturally adapt the VPIRS for our study [18].
Translation, adaptation, and validation process
Our methodology comprised two phases. The first phase, drawing from Mohamad-Isa et al.’s method, involved translation and face validation [19, 20]. The second phase centered on assessing content, and construct validity, as well as internal scale consistency and reliability [16].
First phase
The VPIRS was translated into Spanish by two certified bilingual translators. Our research team reviewed and merged these translations into a single version. This Spanish version was then back-translated into English by two bilingual experts in medical education. Using the modified Delphi method, the research team refined this back-translated version to a preliminary state, ensuring it retained the original number of items and the five-point response scale (ranging from ‘strongly disagree’ to ‘strongly agree’).
We then tested this preliminary version for face validity on ten medical students of varying years, who had experience with VPs. In this method, students independently solve pre-set virtual patient cases. This trial helped evaluate the scale’s face validity and identify any idiomatic or cultural nuances (see Fig. 1). We assessed the scale’s acceptability through the discrimination index [21].
Fig. 1. Protocol for the cross-cultural validation of the scale
Second phase
We administered the revised version of the scale to medical students who had previously engaged with VPs. Our objectives were to establish construct validity, internal consistency, reproducibility, and reliability of the scale (see Fig. 2). We obtained prior informed consent from the participants and employed a test-retest methodology for this purpose. The scale was administered under the supervision of two medical education experts of the research team.
To ensure the reproducibility of our findings, we scheduled the retest after a one-month interval, coinciding with the students’ vacation period. This timing was strategic, as it minimized the likelihood of students’ further exposure to VPs between tests, thereby reducing potential bias in their responses.
Fig. 2. Established protocol to evaluate the scale’s internal consistency, and reliability
Data collection and analysis
The study began by extending invitations for voluntary participation to all of our nearly 1,500 students. Detailed explanations about the study’s purpose and methodology were provided, emphasizing the students’ rights to voluntary participation and assuring confidentiality. Informed consent was obtained and meticulously documented for those who agreed to participate. Questionnaires were administered in an online format from July to August 2022.
The reproducibility of the data was assessed using the Wilcoxon signed-rank test, which was chosen due to the nonparametric nature of the paired sample data and the fact that the distribution of the scores did not meet the assumptions of normality, as confirmed by the skewness and kurtosis coefficients, as well as the Kolmogorov-Smirnov and Shapiro-Wilk tests. While correlation-based methods (e.g., Pearson’s or Spearman’s correlation) are typically used for test-retest reliability, given the distributional characteristics of the data, the Wilcoxon test was more appropriate for capturing both the magnitude and direction of changes in individual scores over time. Then, exploratory factor analysis was conducted to examine the scale’s factor structure. This step was essential to understand the underlying dimensions of the scale. Reliability analysis, including the calculation of Cronbach’s alpha, was performed to determine the internal consistency of the items.
To further validate the factor structure identified in the exploratory and confirmatory analysis, factor analysis was undertaken. This involved applying direct oblimin rotation and the maximum likelihood method. We also calculated the Average Variance Extracted (AVE) to establish convergent validity, and correlation coefficients were computed to assess discriminant validity between the factors of the instrument. All statistical analyses were conducted using SPSS (SPSS Statistics for Windows, Version 17.0, Chicago: SPSS Inc). In our analysis, we adhered to a standard of statistical significance set at a p-value of < 0.05.
Results
This section reports the results across two distinct phases. The first phase involved translating, adapting, and transculturally validating the scale. The second phase focused on evaluating the scale’s construct validity, internal consistency, reproducibility and reliability. Furthermore, this section describes the challenges encountered during the translation phase.
Translation, adaptation, and transcultural validation of the scale
In our efforts to refine and adapt the instrument for a cultural context different from that of the original English version, we embarked on a meticulous analysis of its Spanish translation. During these sessions, five experts in medical education discussed the nuances of language and meaning. We determined and included only the relevant and representative items. We encountered a challenge in the domain of ‘Acquiring and Maintaining Knowledge.’ The wording of item 5 sparked considerable debate. In the context of virtual patient cases, the term ‘examinations’ in Spanish appeared to be ambiguous, blurring the distinction between a physical examination and a comprehensive case analysis. Similarly, the reference in item 8 to ‘important clinical presentations’ in Spanish was open to interpretation, leaving us uncertain whether it pertained to the techniques used to present patients or to the set of symptoms central to a diagnosis. We opted to specify ‘examinations’ as referring explicitly to physical examinations, since other items within the domain assess aspects related to case analysis.
In the domain of ‘Facilitation of learning’, challenges were also identified in the Spanish versions of item 16, ‘Virtual patient study obligations can be carried out concurrently’, and in item 18, ‘Virtual patients are a useful way of continued medical education (excluding thought courses)’. We noticed that the term “thought” in item 18 might be a typographical error from the original article, and the correct term should probably be “taught” (enseñado). Additionally, we encountered ambiguities in the interpretation of terms such as ‘study obligations’ and ‘continuing medical education’. Our solution to these linguistic challenges was a thorough face validation process. Ten participants answered the scale and an acceptable discrimination index was considered above 90%. This process highlighted the difficulties that participants encountered, for example, with the concept of ‘academic obligations’ in item 16. As a result, we decided to list these obligations explicitly for absolute clarity. Similarly, the Spanish version of ‘continuing medical education’ in item 18 was unfamiliar to many, so we included a concise definition.
The challenges extended beyond these items. In Domain 3, ‘Inauthentic Teaching,’ Item 19 posed a unique dilemma. The Spanish terms elicited a range of interpretations, from ‘not real education’ to ‘education using overly artificial methods.’ To prevent misinterpretation, we refined these terms and produced a final Spanish version of the instrument that was more than just a translation; it served as a cultural bridge. This version included clear instructions and carefully defined terms such as ‘academic obligations,’ ‘continuing medical education,’ and ‘inauthentic teaching,’ ensuring that each question was aligned with our audience’s linguistic and cultural ethos (detailed in Table 1 of the supplementary material 1). These modifications were critical, not mere edits, but instrumental in enhancing students’ understanding, underscoring the important aspects of each question, and embedding culturally relevant language into the instrument.
Table 1. Final Spanish version of virtual patient integration rating scale (VPIRS-E)ÍtemDimensión 1: Adquirir y retener el conocimiento1Aprender a través de pacientes virtuales me ha ofrecido mucho conocimiento nuevo.2Los pacientes virtuales me han ofrecido la oportunidad de profundizar en el conocimiento que he adquirido hasta ahora3Los diferentes métodos de enseñanza que se han utilizado con los pacientes virtuales (fotos, texto) han hecho que aprender sea más fácil.4Aprender con pacientes virtuales me ha facilitado una mejor comprensión del cuidado integral del paciente.5El examen físico de los pacientes virtuales me permite retener más fácil el conocimiento6El requisito de proponer un diagnóstico diferencial me ayudará a obtener la habilidad de razonamiento clínico y priorizar el diagnóstico más probable cuando encuentre a un paciente con síntomas similares en la sala de espera.7Tengo un mejor conocimiento de cómo tratar pacientes con síntomas similares a los estudiados en los pacientes virtuales.8He aprendido más acerca de las presentaciones clínicas de las enfermedades relevantes para el trabajo práctico.9Los casos de los pacientes virtuales son un componente curricular apropiado.10He aprendido más sobre el abordaje general del paciente mediante los pacientes virtuales.11El programa de pacientes virtuales me ha ayudado a desarrollar habilidades para una anamnesis dirigida.12El programa de pacientes virtuales me ha ayudado a desarrollar habilidades para un examen físico dirigido.13Los pacientes virtuales me han ayudado a desarrollar mi pensamiento para ampliar los diagnósticos diferenciales.14He aprendido cómo actuar con pacientes en escenarios específicos. Dimensión 2: Gestión del aprendizaje 15Las otras obligaciones académicas dejan tiempo suficiente para mi estudio independiente de los pacientes virtuales.16Las diferentes obligaciones académicas relacionadas con el paciente virtual se pueden llevar a cabo al mismo tiempo. (Obligaciones académicas entendidas como: construcción de historia clínica, revisión de literatura en torno al caso, preparación de la presentación del caso e identificación de hallazgos claves.)17Es una ventaja tener la oportunidad de decidir por mi cuenta cuándo completar mis tareas de estudio con mi paciente virtual.18Los pacientes virtuales son una herramienta útil para la educación médica continuada. (Entiéndase educación médica continuada como actividades diferentes a los cursos obligatorios) Dimensión 3: Enseñanza inauténtica (Enseñanza auténtica entendida como la estrategia que conecta a las asignaturas con el mundo real) 19La educación con pacientes virtuales no es el aprendizaje teórico auténtico.20Tengo dudas de que el tratamiento del paciente virtual se corresponda con el del paciente real21En muchas ocasiones, los casos de pacientes virtuales son demasiado ideales. Dimensión 4: Desventajas para el aprendizaje 22Los casos fueron muy exigentes para mi nivel de conocimiento.23Los casos de los pacientes virtuales ocuparon demasiado de mi tiempo.24No poder discutir los casos de pacientes virtuales con mi profesor era una desventaja.25Entendí poco los casos de los pacientes virtuales en inglés
Construct validity, internal consistence, reliability and reproducibility
In our study, 186 medical students answered the first test, and 153 completed the questionnaire. The participant group had an average age of 22 years (SD 5 years), predominantly female (73.8%). Most students (81.7%) were in their second to sixth academic years, with the remainder (18.3%) in the seventh year or higher. A comprehensive analysis yielded 3825 complete responses, with no data missing.
Construct validity: factor analysis
The overall mean response was 3.66, with a standard deviation of 0.98. All items demonstrated sufficient variance, qualifying them for inclusion in the factor analysis (refer to Table 2). The data’s suitability for factor analysis was confirmed through the Kaiser-Meyer-Olkin (KMO) and Bartlett tests. The KMO criterion yielded a value of 0.913 (ideal range 0.5 to 1), and Bartlett’s test of sphericity showed statistical significance (p < 0.0001; 300gl; Chi-square 21.256), rejecting the null hypothesis of the presence of orthogonal variability. These results indicated favorable conditions for factor analysis and the identification of factors underlying the correlation matrix of the evaluated items.
Exploratory factor analysis
Our exploratory factor analysis, visualized in the scree plot (Fig. 3), suggested a model with four distinct factors, each with an eigenvalue exceeding 1. This model was chosen as it explained a total variance of 61.8%, a substantial proportion that indicates that these factors are capturing the essence of the data.
Fig. 3. Scree plot, preliminary analysis for the identification of possible components
In the analysis of commonalities, which reveals how much of each variable’s variance is explained by the factors, we observed a range in values. The highest commonality was 0.702, indicating substantial shared variance for that variable, while the lowest was 0.538, suggesting a more moderate level of shared variance (details can be found in Table 2). This analysis sheds light on the degree to which each variable is related to the factors identified, offering a deeper insight into the data’s underlying structure.
Table 2. Descriptive statistics of evaluated items analyzed by four factorsItem n MeanSDVariance percentage (%)Total variance explained (%)Communalities11534.120.81440.10640.1060.70221534.270.78010.15850.2640.66031533.920.9846.09556.3590.59841533.781.0905.50861.8660.65451533.371.2233.80365.6690.64961534.500.6193.20068.8700.56971534.070.8792.92571.7950.56981534.150.8492.87174.6650.65891534.240.9182.58477.2490.661101533.851.0122.42879.6770.607111534.290.8082.28281.9590.516121533.621.1872.10384.0620.616131534.270.8581.87785.9390.605141533.641.0551.74887.6870.621151533.221.0901.61289.2990.652161533.650.9971.49990.7980.727171534.460.7871.40092.1980.365181534.210.9081.32593.5240.648191532.981.0851.23394.7570.612201533.261.2021.08295.8390.673211533.541.0070.99796.8360.640221532.180.9540.91297.7480.667231532.751.0960.81198.5590.668241533.401.2050.80099.3590.538251531.871.1570.641100.0000.594
Definitive factor analysis: internal consistency
We conducted a principal components analysis with Varimax rotation and Kaiser normalization, using four components. This approach explained a total of 61.8% of the variance in the data. The variance distribution among the identified factors is detailed below:
- Factor 1 accounted for 33.15% of the total variance, encompassing items 1 to 14.
- Factor 2 contributed 11.15% to the total variance, covering items 15 to 18.
- Factor 3 was responsible for 9.2% of the total variance, including items 19 to 21.
- Factor 4 represented 8.3% of the total variance, with items 22 to 25.
This distribution clarifies each factor’s contribution to the overall variance and the specific items associated with each.
In our principal component’s analysis using Varimax rotation and Kaiser normalization with four components, we clarified the variance distribution within our data, which totaled 61.8%. Specifically, Factor 1, comprising items 1 to 14, accounted for 33.15% of this variance. Factor 2, including items 15 to 18, contributed 11.15%, while Factor 3, with items 19 to 21, was responsible for 9.2% of the variance. Finally, Factor 4, encompassing items 22 to 25, represented 8.3%. This breakdown effectively delineates the contribution of each factor to the overall variance, along with the specific items grouped under each factor.
In the rotated component matrix, the four components effectively grouped the 25 items, showing satisfactory factor loadings (refer to Table 3 for details). Items 17 and 18, originally intended to measure learning management within Dimension 2, exhibited a stronger association with Dimension 1, focusing on knowledge acquisition and retention. This suggests that aspects of learning management are significantly influenced by the dimension of knowledge acquisition and retention, reflecting the intertwined nature of these educational constructs.
The four components in the rotated component matrix effectively group the 25 items, demonstrating satisfactory factor loadings (see Table 3). Notably, two items (17: “It is an advantage to have the opportunity to decide on my own when to complete my study assignments with my virtual patient” and 18: “Virtual patients are a useful tool for continuing medical education”) from dimension 2, which assesses learning management, are more strongly associated with dimension 1, assessing knowledge acquisition and retention. This finding suggests that these two items, originally designed to measure learning management, are more influenced and explained by the knowledge acquisition and retention dimension. It is plausible that learning management, being a complex and hierarchical phenomenon, involves knowledge acquisition. The theoretical constructs underlying these dimensions may also intersect, leading to the observed overlap and association.
For dimension three, including items 19 through 21 that measure inauthentic teaching, we observed that these items may be influenced by factors related to learning management, as indicated by their loading on factor 2. This connection is further reinforced by the behavior and loadings of items 15 and 16, which relate to time management and concurrency of academic obligations in the third dimension. These factors seem to influence students’ perceptions of the authenticity of the learning process. In addition, perceptions of inauthentic teaching may also influence how students perceive their ability to manage time and balance academic obligations, suggesting a bidirectional relationship between the perceived authenticity of the learning experience and the management of academic tasks in VP-based learning.
Furthermore, Item 24, which falls under the dimension 4 assessing learning disadvantages, demonstrated a closer alignment with the learning management component. This finding indicates that students perceive the absence of teacher interaction in virtual patient cases as a significant disadvantage, underscoring the crucial role of facilitator involvement in effective learning experiences with virtual patients.
Table 3. Principal component analysis of the four identified factorsItemComponent123410.774− 0.2720.148− 0.08620.750− 0.1890.150− 0.19730.722− 0.1850.1930.06740.744− 0.2190.1880.13250.612− 0.3860.2770.22260.6800.1030.030− 0.30870.667− 0.0990.338− 0.02180.7710.0110.202− 0.15590.768− 0.1650.193− 0.079100.742− 0.1910.0120.138110.677− 0.1690.021− 0.166120.683− 0.3030.1570.183130.761− 0.0270.076− 0.136140.603− 0.2150.3930.238150.313− 0.0190.744− 0.005160.2830.1350.7930.024170.4760.1520.130− 0.313180.751− 0.1400.240− 0.08319− 0.2090.7150.0870.22320− 0.2410.767− 0.0760.14021− 0.1460.779− 0.0350.106220.0500.338− 0.0450.74123− 0.2580.353− 0.4540.52024− 0.0200.419− 0.5120.31625− 0.0820.1190.0450.756
Reliability
Encouraging results were found in the reliability assessment of the scale. The overall Cronbach’s alpha for all domains was 0.826, indicating strong reliability. Breaking this down further, Domain 1, which includes items 1–14, showed an impressive alpha of 0.938. Domain 2, covering items 15–18, had an alpha of 0.705. Domain 3, with items 19–21, yielded an alpha of 0.787, and Domain 4, encompassing items 22–25, also recorded an alpha of 0.705. These values attest to the very favorable reliability of each domain within our scale.
A Cronbach’s alpha of 0.826 was found for all the domains. A very favorable reliability is established for this scale and each of the domains evaluated:
- Domain 1: items 1–14, Cronbach’s alpha: 0.938.
- Domain 2: items 15–18, Cronbach’s Alpha: 0.705.
- Domain 3: items 19–21, Cronbach’s alpha: 0.787.
- Domain 4: items 22–25, Cronbach’s alpha: 0.705.
Reproducibility
We evaluated the instrument’s reproducibility by administering it at two distinct points in time. The Wilcoxon signed-rank test revealed statistically significant differences in the scores for 7 of the 25 items evaluated. Specifically, notable differences were observed in the scores for Item 2 (Z= -3.078; negative range; p = 0.002), Item 4 (Z= -3.655; negative range; p < 0.001), Item 9 (Z= -2.481; negative range; p = 0.013), Item 12 (Z= -2.674; negative range; p = 0.008), Item 17 (Z= -3.988; negative range; p < 0.001), Item 23 (Z= -2.881; negative range; p = 0.004), and Item 24 (Z= -2.999; negative range; p = 0.003). The most significant variations in the reproducibility analysis were observed in the first domain (Z= -3.571, p < 0.005) and the fourth domain (Z= -2.59, p = 0.009). A significant difference was also noted between the first and second applications of the tool overall (Z= -3.57, p < 0.005).
Discussion
This study aimed to transculturally validate the VPIRS for assessing medical students’ perceptions regarding the integration of virtual patients into the curriculum. Our results confirm the Spanish version (VPIRS-E) as both reliable and effective in measuring these perceptions, preserving the original scale’s structure of four dimensions and 25 items. These findings ensure the VPIRS-E accurately reflects the intended constructs, affirming its applicability in Spanish-speaking educational settings.
We also found that certain terms of the scale translated into Spanish were challenging to understand, such as ‘study obligations’ (Item 16), ‘continued medical education’ (Item 18), and ‘inauthentic learning’ (Item 19). The literal translation of these terms resulted in different connotations within our cultural context compared to their original meanings in English. For instance, the term ‘inauthentic learning’ may be interpreted as artificial, synthetic, or rigid. To avoid confusion and enhance understanding, we added descriptive notes to these concepts in the survey form next to the relevant items. We recognized that students were not fully familiar with these definitions and often relied on an intuitive understanding. This careful face validation process highlighted the impact of polysemy on the interpretation of potentially unfamiliar or ambiguous concepts and underscored the critical need to account for linguistic and cultural specificities in the educational context to maintain the scale’s measurement validity. Other studies also emphasize the importance of transcultural translation and adaptation of scales to ensure their reliability, which aligns with our findings [22, 23].
Likewise, we found significant variability within the first and fourth domains (items: 2, 4, 9, 12, 17, 23, 24). This variability may stem from the first domain’s large number of items and the fourth domain’s focus on VPs’ learning disadvantages. These factors likely make these domains more prone to fluctuations over time. In this study, the scale’s second application followed a period without VP exposure, potentially altering participants’ response patterns due to changes in context and experience. However, these variations did not have a significant impact on the domains assessed, indicating consistency and correlation among the items. In light of these findings, when interpreting the results of these specific items, attention should be paid to the timing of the application of the scale and the conditions that may make certain items more susceptible to temporal variation.
Four factors were found to be consistent with those in the original English version of the VPIRS. These factors successfully categorized the scored items. This result indicates that the VPIRS-E accurately assesses the same constructs as its original counterpart. Moreover, the VPIRS-E scale showed strong overall reliability with a Cronbach’s alpha of 0.82, closely matching the original scale’s value of 0.86 [18]. VPIRS-E also exhibited enhanced reliability in the fourth domain, achieving a score of 0.705, surpassing the original’s 0.662 [18]. Furthermore, no statistically significant differences were found among the various student groups, with a Cronbach’s alpha of 0.942 reported for the first domain, 0.751 for the second, 0.711 for the third, and 0.662 for the fourth. Based on the principal component’s analysis in the factor analysis modeling, item elimination was deemed unnecessary. Since the VPIRS-E was found to have consistent validity and reproducibility among students at various levels of undergraduate medical education, this scale can be broadly used in medical schools without being limited to a specific academic year. Our participants’ responses indicate that they had a favorable perception of the use of VPs. This result is consistent with those obtained in other studies, demonstrating the consistency and cross-contextual validity of the VPIRS in various educational and cultural contexts [24–27]. These results also suggest that the use of VPs in medical education is satisfactory for students [12, 28–30].
The congruence of the VPIRS-E with the original scale has implications for future research. By ensuring that measures of students’ perception of VP use are comparable across English and Spanish-speaking populations, comparative international studies on the role of VPs in undergraduate medical education can be conducted. Comparability allows for evaluating the impact of VPs in different educational contexts and cultures, potentially leading to enhanced best practices for integrating VPs into the curriculum and improving the learning experience with VPs as a complement to real clinical experiences. Areas of research that could benefit from this comparability include clinical skills development, the influence of VPs on self-learning, self-assessment, and online learning.
While the VPIRS-E has demonstrated reliability in assessing Spanish-speaking medical students’ perceptions, acknowledging the limitations of this study is crucial. The quantitative methodology employed provides robust measures of students’ perceptions, yet it may not fully capture the deep nature of their learning experiences with VPs. Qualitative data could illuminate the reasons behind our findings, such as students’ sense of motivation or potential dissatisfaction, and offer insights into their subjective learning experiences, like the perceived realism of the VPs or their influence on clinical reasoning skills. These descriptive insights could complement our findings by explaining the “why” behind the “what,” thereby enhancing our understanding of the effective integration of VPs into medical education.
The VPIRS-E scale is developed for use in diverse educational settings with the aim of assessing not only students’ perceptions of the learning process using virtual patients, but also other aspects of their integration into the curriculum. These aspects are mainly evaluated through dimensions 2 (items 15 to 18) and 4. Specifically, these dimensions explore students’ perceptions of the time required for the instructional strategy, the role of facilitators (whether present or absent), and the opportunities provided for fostering autonomy in learning. Under the pedagogical model proposed for our program’s students, it was necessary to use a scale that assumed a much more independent student-tool interaction that did not require the constant supervision of professors/facilitators during the process. Scales such as the one proposed by Huwendiek and colleagues integrate several individual aspects of the clinical reasoning process into a single item. Considering the importance of independently evaluating how the virtual patient differentially impactsthe development of skills such as clinical history taking, physical examination, and differential diagnosis, we believe it is necessary that each item can be evaluated separately.
Based on our experience, we recommend considering specific factors of the curriculum and educational settings that may influence students’ learning processes in the analysis of VPIRS-E results. This will help obtain a contextual understanding that addresses the needs of each institution and student population. We recommend presenting scale items in descending order to mitigate the primacy effect [31]. Similarly, when administering the scale, students should be given an average of 15 min to complete the scale, and a facilitator should be present to answer any questions students may have as they review the items.
Conclusions
The reliability and validity of VPIRS-E across different student levels and its cultural adaptability highlight its significance for global medical education research. The VPIRS-E enables a more comprehensive and inclusive understanding of virtual patient integration in medical curricula by bridging linguistic and cultural gaps.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary Material 1
