# Artificial Intelligence-Powered Interpretation of Corneal Epithelial Maps: A Comparative Pilot Study of ChatGPT, Google Gemini, and Microsoft Bing

**Authors:** Ruchi Shukla, Aparajita Shukla, Ashutosh K Mishra, Pragati Garg, Nilakshi Banerjee, Swarastra P Singh, Shrinkhal

PMC · DOI: 10.7759/cureus.90779 · 2025-08-22

## TL;DR

This pilot study compares how accurately ChatGPT 4.0, Google Gemini, and Microsoft Bing interpret corneal epithelial thickness (CET) maps for three ocular surface disorders: keratoconus, vernal keratoconjunctivitis (VKC), and nasal pterygium.

## Contribution

The study introduces a novel evaluation of generative AI models for interpreting corneal epithelial thickness maps in diagnosing ocular surface disorders.

## Key findings

- ChatGPT 4.0 achieved the highest diagnostic accuracy (80%) and clinical appropriateness (87%) among the tested AI models.
- Google Gemini and Microsoft Bing performed worse, with diagnostic accuracy of 60% and 53% and clinical appropriateness of 67% and 60%, respectively.
- ChatGPT's clinical recommendations aligned with standard protocols in 87% of cases.

## Abstract

Background

This study aimed to compare the diagnostic interpretation accuracy and clinical suitability of three generative artificial intelligence (AI) models, i.e., ChatGPT 4.0, Google Gemini, and Microsoft Bing, in analyzing corneal epithelial thickness (CET) data across key ocular surface disorders, including keratoconus, vernal keratoconjunctivitis (VKC), and nasal pterygium.

Methodology

Standardized case scenarios with corresponding CET mapping data were constructed and input into all three AI platforms with the following query: “Evaluate the given CET map and provide the most likely diagnosis and appropriate clinical recommendation.” Responses were independently graded by a panel of three ophthalmologists for diagnostic accuracy and clinical appropriateness. Cases were selected based on known CET signature patterns derived from the literature, including doughnut patterns in keratoconus, superior thinning in VKC, and nasal epithelial thickening in nasal pterygium.

Results

Of the 15 AI-evaluated case scenarios (five each of keratoconus, VKC, and nasal pterygium), ChatGPT showed the highest diagnostic accuracy (80%) and clinical appropriateness (87%). Google Gemini correctly diagnosed 60% and was deemed clinically appropriate in 67%. Microsoft Bing yielded 53% correct diagnoses and 60% appropriate clinical suggestions.
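The abstract reports percentages only, not raw counts. As a quick arithmetic check, the sketch below maps each reported percentage back to a whole number of cases, assuming every percentage was computed over all 15 scenarios and rounded to the nearest integer. That assumption, along with the names `reported` and `implied_count`, is illustrative and does not come from the paper.

```python
# Denominator taken from the Results: 15 case scenarios in total.
N_CASES = 15

# Percentages as reported in the abstract.
reported = {
    "ChatGPT 4.0": {"diagnostic_accuracy": 80, "clinical_appropriateness": 87},
    "Google Gemini": {"diagnostic_accuracy": 60, "clinical_appropriateness": 67},
    "Microsoft Bing": {"diagnostic_accuracy": 53, "clinical_appropriateness": 60},
}


def implied_count(percent, n=N_CASES):
    """Return the unique count k (0..n) whose k/n rounds to `percent`, else None.

    Assumption: percentages were rounded to the nearest whole number.
    """
    matches = [k for k in range(n + 1) if round(100 * k / n) == percent]
    return matches[0] if len(matches) == 1 else None


# Case counts implied by each reported percentage under the assumption above.
implied = {
    model: {metric: implied_count(pct) for metric, pct in metrics.items()}
    for model, metrics in reported.items()
}

for model, counts in implied.items():
    print(model, counts)
```

Under this assumption every reported figure corresponds to exactly one whole-number count out of 15, so the percentages are internally consistent with the stated sample size. With only 15 cases, each single case shifts a score by roughly 6.7 percentage points, which is worth keeping in mind when comparing the models.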

Conclusions

ChatGPT 4.0 consistently outperformed Google Gemini and Microsoft Bing in the context of CET interpretation for common ocular surface diseases. These findings suggest that ChatGPT may serve as a valuable adjunct in AI-assisted ophthalmology diagnostics, particularly for ocular surface diseases where subtle epithelial remodeling is crucial for early identification. While diagnostic accuracy was the primary outcome, the clinical recommendations suggested by ChatGPT aligned with standard protocols in the majority of its responses (87%), highlighting its clinical utility.

## Linked entities

- **Diseases:** keratoconus (MONDO:0015486), vernal keratoconjunctivitis (MONDO:0019085)

## Full-text entities

- **Diseases:** ocular surface diseases (MESH:D010534), VKC (MESH:D003233), nasal pterygium (MESH:D009668), keratoconus (MESH:D007640)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12526689/full.md

---
Source: https://tomesphere.com/paper/PMC12526689