# Evaluating the Efficacy of Artificial Intelligence-Driven Chatbots in Addressing Queries on Vernal Conjunctivitis

**Authors:** Muhammad Saad, Muhammad A Moqeet, Hassan Mansoor, Shama Khan, Rabia Sharif, Fahim Ullah Khan, Ali H Naqvi, Warda Ali

PMC · DOI: 10.7759/cureus.79688 · 2025-02-26

## TL;DR

This study evaluates how well Google Gemini Advanced, an AI chatbot, can provide accurate and reliable information about vernal keratoconjunctivitis, a type of allergic eye disease.

## Contribution

The study introduces a systematic evaluation of an AI chatbot's answers to medical queries on VKC, based on ratings of accuracy, completeness, and potential harm by two independent cornea specialists.

## Key findings

- Of the 125 responses Google Gemini Advanced generated for 25 VKC-related questions, 108 (86.4%) were rated highly accurate.
- The two specialists' ratings showed high inter-rater reliability (Cronbach's alpha = 0.92), and no responses were classified as inaccurate or potentially harmful.
- Minor gaps were observed in complex treatment-related discussions and grading details.

## Abstract

Background

Vernal keratoconjunctivitis (VKC) is a recurrent allergic eye disease that requires accurate patient education to ensure proper management. AI-driven chatbots, such as Google Gemini Advanced (Mountain View, California, US), are increasingly being explored as potential tools for providing medical information. This study evaluates the accuracy, reliability, and clinical applicability of Google Gemini Advanced in addressing VKC-related queries.

Objective

To assess the performance of Google Gemini Advanced in delivering medically accurate and relevant information about VKC and to evaluate its reliability based on expert ratings.

Methods

A total of 125 responses generated by Google Gemini Advanced for 25 VKC-related questions were assessed by two independent cornea specialists. Responses were rated on accuracy, completeness, and potential harm using a 5-point Likert scale (1-5). Inter-rater reliability was measured using Cronbach’s alpha. Responses were categorized into highly accurate (score of 5), minor inconsistencies (score of 4), and inaccurate (scores 1-3).
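
As an illustration of the scoring scheme described above, the following is a minimal sketch of how Likert scores could be mapped to the three categories and how Cronbach's alpha could be computed for two raters. The function names and the example ratings are hypothetical and are not taken from the study; the alpha formula is the standard one, k/(k-1) × (1 − sum of per-rater variances / variance of summed scores), and the paper does not state which software the authors used.

```python
from statistics import variance


def categorize(score: int) -> str:
    """Map a 1-5 Likert score to the categories named in the Methods."""
    if score == 5:
        return "highly accurate"
    if score == 4:
        return "minor inconsistencies"
    if 1 <= score <= 3:
        return "inaccurate"
    raise ValueError(f"score must be between 1 and 5, got {score}")


def cronbach_alpha(ratings_by_rater: list[list[int]]) -> float:
    """Cronbach's alpha, treating each rater as one 'item'.

    ratings_by_rater[i][j] is rater i's score for response j.
    """
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("at least two raters are required")
    if len({len(r) for r in ratings_by_rater}) != 1:
        raise ValueError("all raters must score the same responses")

    # Sum of each rater's sample variance across responses.
    rater_variances = sum(variance(r) for r in ratings_by_rater)
    # Sample variance of the per-response total (summed over raters).
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    return (k / (k - 1)) * (1 - rater_variances / total_variance)


# Hypothetical ratings for five responses (NOT data from the study).
rater_a = [5, 5, 4, 5, 4]
rater_b = [5, 4, 4, 5, 4]

alpha = cronbach_alpha([rater_a, rater_b])
categories = [categorize(s) for s in rater_a]
```

With real data, each inner list would hold one specialist's 125 scores. The 95% confidence interval reported in the Results would require an additional procedure that this sketch does not cover.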

Results

Google Gemini Advanced demonstrated high inter-rater reliability (Cronbach’s alpha = 0.92, 95% CI: 0.87-0.94). Of the 125 responses, 108 (86.4%) were rated highly accurate (score of 5) while 17 (13.6%) had minor inconsistencies (score of 4) but posed no potential for harm. No responses were classified as inaccurate or potentially harmful. The combined mean score was 4.88 ± 0.31, reflecting strong agreement between raters. The chatbot consistently provided reliable information across diagnostic, treatment, and prognosis-related queries, with minor gaps in complex grading and treatment-related discussions.
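
The reported percentages follow directly from the category counts. A short sketch of that arithmetic is below; the counts are the ones stated above, while the variable names are only illustrative.

```python
# Category counts as reported in the Results (125 responses in total).
counts = {
    "highly accurate": 108,
    "minor inconsistencies": 17,
    "inaccurate": 0,
}

total = sum(counts.values())

# Share of each category as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]}/{total} = {pct}%")
```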

Discussion

The findings support the use of AI-driven chatbots like Google Gemini Advanced as potential tools for patient education in ophthalmology. The chatbot exhibited strong accuracy and consistency, particularly in addressing general VKC-related queries. However, areas for improvement remain, especially in providing detailed guidance on treatment protocols and ensuring completeness in responses to complex clinical questions.

Conclusion

Google Gemini Advanced demonstrates high reliability and accuracy in delivering medical information about VKC, making it a valuable tool for patient education. While its responses are consistent and generally accurate, expert oversight remains necessary to refine AI-generated content for clinical applications. Further research is needed to enhance AI-driven chatbots' ability to provide nuanced medical advice and integrate them safely into ophthalmic patient education and clinical decision-making.

## Linked entities

- **Diseases:** Vernal keratoconjunctivitis (MONDO:0019085), VKC (MONDO:0019085)

## Full-text entities

- **Diseases:** VKC (MESH:D003233), allergic eye disease (MESH:D005128)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11951947/full.md

---
Source: https://tomesphere.com/paper/PMC11951947