# Assessing the capabilities of AI-based large language models (AI-LLMs) in interpreting histopathological slides and scientific figures: Performance evaluation study

**Authors:** Khanisyah E. Gumilar, Grace Ariani, Priangga A. Wiratama, Rimbun, Tri H. Yuliawati, Hong Chen, Ibrahim H. Ibrahim, Cheng-Han Lin, Tai-Yu Hung, Dewanti Anggrahini, Arya S. Rajanagara, Khaled E. Omran, Zih-Ying Yu, Yu-Cheng Hsu, Erry G. Dachlan, Jer-Yen Yang, Li-Na Liao, Ming Tan

PMC · DOI: 10.37796/2211-8039.1698 · BioMedicine · 2026-03-01

## TL;DR

This study evaluates how well three AI chatbots (ChatGPT-4, Gemini Advanced, and Copilot) interpret histopathology slides and scientific figures, finding that ChatGPT-4 received the highest expert ratings.

## Contribution

The study introduces a systematic evaluation of AI-LLMs in interpreting histopathology and scientific images using expert ratings and statistical analysis.

## Key findings

- ChatGPT-4 outperformed Gemini Advanced and Copilot in interpreting histopathology and scientific images.
- ChatGPT-4 received significantly higher scores on all five rated parameters (relevance, clarity, depth, focus, and coherence).

## Abstract

Artificial intelligence-based large language models (AI-LLMs) are increasingly recognized in medicine and other scientific domains as tools to support complex tasks, such as interpreting histopathology slides and scientific figures. AI-LLMs can simplify these tasks by providing clearer explanations. By improving accessibility and comprehension, they can assist healthcare professionals in diagnosis and treatment selection, and can help students and the public understand complex scientific concepts and images.

This study explores the capability of AI-LLMs to interpret histopathological slides and scientific images, and aims to evaluate their performance in supporting diagnostics and improving comprehension in biomolecular sciences.

The study was divided into two parts: interpretation of histopathology slides and interpretation of scientific figures. Twelve histopathology images and twelve scientific figures were submitted to each of the three most frequently used chatbots (ChatGPT-4, Gemini Advanced, and Copilot). The chatbots' responses were coded and blindly assessed by expert raters on five parameters (relevance, clarity, depth, focus, and coherence), each scored on a 5-point Likert scale. Statistical analysis included one-way ANOVA and multiple linear regression.
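As a rough illustration of the one-way ANOVA step, the sketch below computes the F statistic for Likert ratings grouped by chatbot. It is not the authors' analysis code: the function name `one_way_anova_f`, the group labels, and every rating value are hypothetical placeholders, and the sketch stops at the F statistic rather than deriving a P value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Each inner list holds the ratings given to one group (here, one chatbot).
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k = len(groups)      # number of groups
    n = len(all_scores)  # total number of ratings

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of ratings around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL 5-point Likert ratings, invented for illustration only.
# These are NOT data from the study.
ratings = {
    "chatbot_a": [5, 4, 5, 4, 5, 4],
    "chatbot_b": [3, 4, 3, 3, 4, 3],
    "chatbot_c": [2, 3, 3, 2, 3, 2],
}

f_stat, df_b, df_w = one_way_anova_f(list(ratings.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A P value would additionally require the F distribution with these degrees of freedom; a statistics library such as SciPy (`scipy.stats.f_oneway`) returns both the statistic and the P value in one call.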

ChatGPT-4 outperformed Gemini Advanced and Copilot in histopathology and scientific image interpretation (P < 0.001) with significantly higher scores across all parameters (relevance, clarity, depth, focus, and coherence). ChatGPT-4’s superior performance may be due to its advanced algorithms, extensive training data, specialized modules, and user feedback.

ChatGPT-4 excels in interpreting histopathology and scientific images, which may improve diagnostic accuracy, support clinical decision-making, and reduce pathologists’ workload. It may also benefit education by enhancing students’ understanding of complex images and promoting interactive learning. ChatGPT-4 shows significant potential to improve patient care and enrich student learning.

## Full-text entities

- **Genes:** TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, STK26 (serine/threonine kinase 26) [NCBI Gene 51765] {aka MASK, MST4}, TH (tyrosine hydroxylase) [NCBI Gene 7054] {aka DYT14, DYT5b, TYH}, HSPD1 (heat shock protein family D (Hsp60) member 1) [NCBI Gene 3329] {aka CPN60, GROEL, HLD4, HSP-60, HSP60, HSP65}, HSF1 (heat shock transcription factor 1) [NCBI Gene 3297] {aka HSTF1}, CD276 (CD276 molecule) [NCBI Gene 80381] {aka 4Ig-B7-H3, B7-H3, B7H3, B7RP-2}
- **Diseases:** clear cell carcinoma (MESH:D002292), ovarian cancer (MESH:D010051), breast cancer (MESH:D001943), endometrioid grade-3 (MESH:D018269), cancer (MESH:D009369), mucinous (MESH:D002288), cervical cancer (MESH:D002583), small cell carcinoma (MESH:D018288), metastasis (MESH:D009362)
- **Chemicals:** glucose (MESH:D005947), copper (MESH:D003300), Carboplatin (MESH:D016190)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12962759/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12962759/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/PMC12962759/full.md

---
Source: https://tomesphere.com/paper/PMC12962759