Evaluating Artificial Intelligence and Traditional Learning Tools for Chest X‐Ray Interpretation: A Descriptive Study
Gurtek Singh Samra, Vashisht Ramoutar, Kelley Chen, Muiz Chaudhry, Hrithika Patel, Terese Bird, Vanessa Rodwell

TL;DR
This study compares AI tool Chester and traditional resource Radiopaedia for teaching chest X-ray interpretation to medical students.
Contribution
The study evaluates AI and traditional tools for CXR interpretation and explores student perspectives on their effectiveness.
Findings
Chester AI was efficient and useful for revision but struggled with complex pathologies.
Radiopaedia was comprehensive but less efficient for the workbook task due to its vast content.
Combining both tools may optimize learning and improve CXR interpretation skills.
Abstract
Chest X‐ray (CXR) interpretation is a fundamental yet challenging skill for medical students to master. Traditional resources like Radiopaedia offer extensive content, while newer artificial intelligence (AI) tools, such as Chester, provide pattern recognition and real‐time feedback. This study aims to evaluate Radiopaedia and Chester's effectiveness as educational tools and to explore student perspectives on AI. A teaching session on CXR interpretation fundamentals was delivered to establish a standardised baseline of knowledge among participants, followed by a live tutorial introducing students to the functionality of both Chester AI and Radiopaedia. Students engaged with both tools to answer a 25‐item workbook assessing complex CXR pathologies. CXRs were deliberately selected for their complexity to examine student engagement with online learning tools amid diagnostic uncertainty,…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
FIGURE 1| Themes | Sub‐themes | Concepts—radiopaedia | Quotes |
|---|---|---|---|
| Educational use | Revision tool | Good revision tool | ‘Good learning tool in terms of explanation underneath the [CXR]’ (Radiopaedia group) |
| ‘I'd use it less as a diagnostic tool, more like in the other way, like, okay, I'm going to revise pneumonia’ (Radiopaedia group) | |||
| ‘If I just wanted to check [a CXR pathology], I think radiopaedia would be great’ (Radiopaedia group) | |||
| Reference/diagnostic tool | Good reference if you already know the diagnosis | ‘If you just had the X‐ray with no history, more likely to use AI than Radiopaedia. But if I had a little bit of information, Radiopaedia is great’ (Radiopaedia) | |
| ‘I sort of had an idea of kind of what I thought it was first, and then went to radio’ (Radiopaedia group) | |||
| Assistive tool | Good tool to support prior knowledge | ‘But if I had a little bit of information to go off, or I sort of kind of recognise it [CXR pathology], but just wanted to check, I think Radiopaedia would be great’ (Radiopaedia group) | |
| Efficiency | Time efficiency | Manual processes take longer | ‘It would just take forever to just like search through all the possible conditions’ (Radiopaedia group) |
| ‘Just so much information’ (Radiopaedia group) | |||
| ‘Frantically scouring Radiopaedia trying to find things’ (Radiopaedia group) | |||
| Content | Information overload | ‘Really hard just to find a simple, like, chest X‐ray of tuberculosis’ (Radiopaedia group) | |
| ‘[Content] Not really relevant to me right now’ (Radiopaedia group) | |||
| ‘It would just take forever to just like search through all the possible conditions’ (Radiopaedia group) | |||
| ‘Just so much information’ (Radiopaedia group) | |||
| Ease of use | Familiar tools are beneficial | ‘Summary at the bottom … I found that quite useful’ (Radiopaedia group) | |
| ‘Easier in the filter section’ (Radiopaedia group) | |||
| Difficulty to navigate | ‘Didn't find it straightforward to use’ (Radiopaedia group) |
| Quotes |
|---|
| ‘Think an AI would be more accurate than me in identifying like areas of consolidation’ (Radiopaedia group) |
| ‘If it was more accurate it would be useful in intense situations’ (Chester AI group) |
| ‘Comes in with like a very dangerous injury … it can give the response straight away’ (Chester AI group) |
| ‘Assesses more, perhaps, risk [of developing a disease]’ (Chester AI group) |
| ‘AI picks up the changes and person interprets them’ (Radiopaedia group) |
| ‘Should be, like, checked over by an actual person’ (Radiopaedia group) |
| ‘do not think it should be used to teach’ (Chester AI group) |
| ‘If it comes up with a similar thing to you, you can have more confidence in it; if not, I value human opinion over the AI’ (Radiopaedia group) |
| ‘Like as a sector, you are putting someone's life at risk’ (Chester AI group) |
| ‘Rely on it solely, is also very dangerous, like, for patient outcomes’ (Chester AI group) |
| ‘you still need the experience of a radiologist who has lived through many different examples who has that knowledge. But I guess still human input would be needed given the anomalies that are there, for the time being’ (Chester AI group) |
| ‘It's best to have a rudimentary understanding of how to read an x‐ray before going on to Chester to give the diagnosis for like a given image’ (Chester AI group) |
| ‘So long as we do not over‐rely on it, in case it crashes’ (Radiopaedia group) |
| ‘Do not want to be relying too heavily on it’ (Chester AI group) |
| ‘If it ended up replacing radiologists’ (Chester AI group) |
| ‘An AI tool is consistently better than the radiologist. Why would I put people's lives in the hands of somebody who's less accurate’ (Chester AI group) |
| ‘5% chance doesn't work … you've got a whole team of experienced radiologists who can take over when it does happen’ (Chester AI group) |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRadiology practices and education · COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging
Background
1
The interpretation of chest X‐rays (CXRs) is a fundamental skill for medical students and healthcare professionals. However, mastering this skill can be challenging, particularly for pre‐clinical students [1]. Traditional educational resources, such as textbooks and websites like Radiopaedia [2], are commonly used to teach CXR interpretation [3]. Radiopaedia, a free online radiology resource written by radiologists, serves as a comprehensive repository of medical knowledge [4]. While such resources are comprehensive, they can be overwhelming and difficult to navigate.
In recent years, artificial intelligence (AI) has emerged as a promising tool in CXR analysis [5], providing new opportunities for clinical practice [6], alongside radiology education [7]. While some studies have reported comparable outcomes to traditional methods [8], others have highlighted potential negative impacts on learner performance [9]. Nevertheless, a significant gap remains in the literature regarding the overall effectiveness of AI in undergraduate radiology education.
Chester the AI Radiology Assistant (Chester AI) is a free machine‐learning browser‐based tool designed to interpret CXRs by identifying pathologies and providing probability scores for various conditions [10]. Chester AI also generates saliency maps to highlight areas of potential pathology (Figure 1).
Chester AI tool, showing pathology risk scores and a saliency map for predictive Lung Opacity regions on a CXR. Image is used under fair use from open website “Chester The AI Radiology Assistant” (https://mlmed.org/tools/xray/).
‘Chester the AI Radiology Assistant (Chester AI) is a free machine‐learning browser‐based tool designed to interpret CXRs by identifying pathologies and providing probability scores for various conditions’.
In this study, we aimed to evaluate Radiopaedia and Chester AI as tools for CXR interpretation among preclinical medical students and to explore students' perceptions of both tools. Additionally, we sought to provide insights to inform tutors on the integration of AI‐based resources in radiology education.
Approach
2
To evaluate Chester AI and Radiopaedia as tools for CXR interpretation, a structured teaching and assessment approach was implemented. First, an interactive teaching session covering the fundamentals of CXR interpretation was delivered to establish a standardised baseline of knowledge among the preclinical medical students. This was followed by a tutorial introducing students to the functionality of both Chester AI and Radiopaedia, ensuring familiarity with the platforms before use.
The workbook (see supplement) was designed in accordance with the institution's Learning Objectives as a tool to facilitate student engagement with Radiopaedia and Chester AI. The workbook contained 25 multiple‐choice questions (MCQs), each featuring a CXR that required students to apply systematic interpretation and diagnostic reasoning. The objective was to simulate real‐life clinical scenarios where junior clinicians, when faced with difficult or ambiguous clinical interpretation tasks, may supplement their knowledge and support their clinical reasoning with diagnostic tools or clinical knowledge repositories.
CXR cases were selected to challenge students beyond their preclinical knowledge base, prompting them to seek assistance from Radiopaedia or Chester AI to aid their decision‐making process. This allowed us to examine how students interact with these online tools when encountering uncertainty, mirroring real‐world clinical situations. This task aimed to replicate that experience in an educational setting, fostering skills in information retrieval, critical appraisal and applied clinical reasoning.
Evaluation
3
The study was approved by the Institution's Ethics Committee (Reference [redacted for peer‐review]).
Pre‐clinical medical students from [redacted for review] were invited to participate voluntarily. Students were randomly assigned to either the Chester AI group (n = 5) or Radiopaedia group (n = 5). The study included a 25‐item MCQ workbook session, followed by focus groups. Focus groups were conducted for both groups by authors GSS and VR, aiming to gather individual perspectives while encouraging interaction among participants [11]. Transcripts were line‐by‐line coded by KC, MC and HP, with key concepts grouped into emerging themes. Assistant moderator notes were reviewed to account for group dynamics. Throughout, researchers regularly compared thematic interpretations to ensure agreement on the emerging themes.
Results
3.1
The median [IQR] for the Radiopaedia group was 15 [10–17] out of a possible 25 marks, while the Chester AI group had a median score of 13 [12–16]. The median time to complete the workbook was 26 min [22–26] for the Radiopaedia group and 23 min [17–24] for the Chester AI group. There were no statistically significant differences in marks or time for completion (p > 0.05).
Key themes, quotes and concepts identified during focus group discussions are outlined in Tables 1 and 2.
Summary of Impressions—Chester AI
3.2
Students in the Chester AI group highlighted efficiency as one of its primary advantages. Students particularly valued Chester AI's real‐time feedback, highlighting its ability to provide a visual summary of clinically relevant findings in a CXR. This efficiency was perceived as beneficial for quick‐fire quizzes and pattern recognition training, making Chester AI a useful adjunct to traditional learning materials.
However, several limitations were noted. Students found that Chester AI lacked detailed explanations, making it less useful for deeper learning. Some students reported difficulty interpreting Chester's outputs, leading to uncertainty in their conclusions. The tool was perceived as useful for obvious pathologies but less reliable for subtler or ambiguous cases, leading some students to question its dependability in clinical practice. Furthermore, the lack of reference images or additional learning materials also hindered its ability to serve as a stand‐alone educational resource.
Summary of Impressions—Radiopaedia
3.3
Radiopaedia was valued for its depth of information and familiarity among students. Many participants appreciated its comprehensive content, particularly for in‐depth exploration of CXR pathologies. Some students felt that prior experience with Radiopaedia made it easier to navigate, with features like filter sections and structured summaries helping them locate relevant information more efficiently.
However, within the context of the workbook task, students found Radiopaedia to be inefficient. The platform's wealth of information, while valuable for broad learning, led to ‘information overload’ when trying to answer specific questions quickly. Many students described the process of manually searching for conditions as time‐consuming, making it difficult to work through the workbook efficiently. Overall, while Radiopaedia was highly regarded for its educational value, its vast array of content limited its usefulness in time‐sensitive settings.
Implications
4
In this study, we compared the effectiveness of Radiopaedia and Chester AI for CXR interpretation among preclinical medical students and explored students' experiences and perceptions of both tools. Our findings highlight both the strengths and limitations of each tool, suggesting potential areas for improvement and applicability in radiology education.
A prior study by Dao et al also studied Chester AI in the context of medical education, however the study focused solely on quantitative data, and no significant differences were ellicited in diagnostic performance when comparing the AI‐assisted group to t control groups [12]. Cheng et al. developed an AI tool called HipGuide to assist students in detecting hip fractures on pelvic radiographs, which, similar to Chester, generates saliency maps that highlight areas of potential fractures [13]. Their study showed higher diagnostic accuracy in the AI‐assisted group compared with the control group. Our focus group findings align with this, as students perceived Chester AI as a valuable diagnostic aid that improved efficiency and reinforced pattern recognition. However, limitations in interpretability and vague outputs impacted confidence in decision‐making, suggesting that AI assistance may be most effective when combined with structured teaching.
‘Students perceived Chester AI as a valuable diagnostic aid that improved efficiency and reinforced pattern recognition’.
Our qualitative data found that Chester AI offered promise in augmenting traditional educational resources like Radiopaedia, particularly in enhancing students' confidence in image interpretation. Chester has instant feedback provided by its heatmaps and probability‐based suggestions, which make it a more engaging and interactive tool. Key themes included the tool's versatility and its potential use as a revision resource. This aligns with previous research, suggesting that accessible AI tools for asynchronous learning may be more practical and appealing than formal integration into structured teaching programmes [14]. However, our findings also suggest that the utility of these tools should be viewed as complementary rather than substitutive. Chester AI's strength lies in its pattern recognition, accessibility and time efficiency, but its limitations in dealing with complex cases and lack of detailed explanations mean that it cannot yet replace more comprehensive educational resources.
‘Chester AI's strength lies in its pattern recognition, accessibility and time efficiency, but its limitations in dealing with complex cases and lack of detailed explanations mean that it cannot yet replace more comprehensive educational resources’.
Our study findings also raise important questions about the future of AI in radiology education. Students raised concerns regarding the use of AI in clinical settings, such as the accuracy of AI models and the potential for AI to replace human radiologists. These concerns are reflective of views in the global medical community about the role of AI in radiology [15], highlighting the need for careful consideration of how such tools are integrated into both clinical and educational environments.
Our study has several limitations to note. The generalisability of the findings is constrained by several factors, including volunteer participation, a small sample size and recruitment from a single institution, which may affect the broader applicability of our study. Also, Chester AI is still in its prototype phase; therefore, the findings may not fully reflect the potential of AI tools in CXR education once they are further developed.
Furthermore, we plan to continue working on this project by repeating our study with a larger sample size to collect sufficient data to compare groups quantitatively and evaluate this alongside the qualitative data. Seeing as Chester AI is still in its early stages, it would also be interesting to see how results may differ when this becomes more developed.
In conclusion, combining Radiopaedia's comprehensive content with Chester AI's efficient pattern recognition offers a powerful approach to CXR interpretation and radiology education. While Chester AI facilitates rapid revision and real‐time feedback, Radiopaedia offers the depth needed for a more thorough understanding. To maximise learning outcomes, tutors should also promote critical evaluation of AI‐generated results, fostering clinical reasoning and diagnostic confidence.
Author Contributions
Gurtek Singh Samra: investigation, methodology, writing – review and editing, writing – original draft, project administration, formal analysis, validation, data curation. Vashisht Ramoutar: investigation, writing – original draft, methodology. Kelley Chen: formal analysis, methodology, investigation, writing – original draft. Muiz Chaudhry: investigation, methodology, formal analysis, writing – original draft. Hrithika Patel: investigation, methodology, formal analysis, writing – original draft. Terese Bird: conceptualization, investigation, writing – review and editing, methodology, validation, visualization, supervision, resources. Vanessa Rodwell: conceptualization, investigation, methodology, validation, visualization, writing – review and editing, writing – original draft, supervision, resources, project administration.
Ethics Statement
The study was approved by the Institution's General Research Ethics Committee (Ethics Reference: 0391; Project Title: Artificial Intelligence versus Traditional Chest X‐Ray Interpretation for Undergraduate Medical Students; PI: Gurtek S Samra).
Conflicts of Interest
The authors declare no conflicts of interest.
Supporting information
Data S1 Supporting Information
Data S2 Supporting Information
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1T. Cheung , H. Harianto , M. Spanger , A. Young , and V. Wadhwa , “Low Accuracy and Confidence in Chest Radiograph Interpretation Amongst Junior Doctors and Medical Students,” Internal Medicine Journal 48, no. 7 (2018): 864–868.29984505 10.1111/imj.13946 · doi ↗ · pubmed ↗
- 2Radiopaedia. Available at: https://radiopaedia.org/ (© 2005‐2025 Radiopaedia.org)
- 3A. El‐Ali , F. Kamal , C. L. Cabral , and J. H. Squires , “Comparison of Traditional and Web‐Based Medical Student Teaching by Radiology Residents,” Journal of the American College of Radiology 16, no. 4 Pt A (2019): 492–495.30449521 10.1016/j.jacr.2018.09.048 · doi ↗ · pubmed ↗
- 4C. Ozutemiz , O. Dicle , and N. Koremezli , “How Turkish Radiology Residents Access Information Related to Their Profession in This Social Media and Smartphone Era,” European Journal of Radiology Open 2 (2015): 129–133.26937445 10.1016/j.ejro.2015.10.001PMC 4750626 · doi ↗ · pubmed ↗
- 5S. J. Adams , R. D. E. Henderson , X. Yi , and P. Babyn , “Artificial Intelligence Solutions for Analysis of X‐Ray Images,” Can Assoc Radiol J 72, no. 1 (2021): 60–72.32757950 10.1177/0846537120941671 · doi ↗ · pubmed ↗
- 6J. C. Y. Seah , C. H. M. Tang , Q. D. Buchlak , et al., “Effect of a Comprehensive Deep‐Learning Model on the Accuracy of Chest X‐Ray Interpretation by Radiologists: A Retrospective, Multireader Multicase Study,” Lancet Digital Health 3, no. 8 (2021): e 496–e 506.34219054 10.1016/S 2589-7500(21)00106-0 · doi ↗ · pubmed ↗
- 7M. J. van Kooten , C. O. Tan , E. I. S. Hofmeijer , et al., “A Framework to Integrate Artificial Intelligence Training Into Radiology Residency Programs: Preparing the Future Radiologist,” Insights Into Imaging 15, no. 1 (2024): 15.38228800 10.1186/s 13244-023-01595-3PMC 10792147 · doi ↗ · pubmed ↗
- 8L. Schropp , A. P. S. Sørensen , H. Devlin , and L. H. Matzen , “Use of Artificial Intelligence Software in Dental Education: A Study on Assisted Proximal Caries Assessment in Bitewing Radiographs,” European Journal of Dental Education 28, no. 2 (2024): 490–496.37961027 10.1111/eje.12973 · doi ↗ · pubmed ↗
