# Reliability of AI Tools in Generating Patient Education Brochures for Bariatric Surgery: An Observational Study

**Authors:** Sneha Zakkir, Khushali Dadhich, Bavanthi K V, Lakshmi Nikhita Bukkasamudram, Anjali Krishna Santhosh, Shrirampirajin Thangaraj, Pallavi Padmakar Kulkarni

PMC · DOI: 10.7759/cureus.99321 · 2025-12-15

## TL;DR

This study evaluates the readability and reliability of patient brochures on bariatric surgery generated by ChatGPT and Google Gemini.

## Contribution

The study compares the readability, reliability, and text similarity of patient education brochures on six common bariatric procedures generated by ChatGPT and Google Gemini.

## Key findings

- Both ChatGPT and Google Gemini produced brochures with similar readability and reliability scores.
- Gemini showed higher text overlap with existing literature, while ChatGPT used more original phrasing.
- Brochures from both tools read at approximately a college level, which limits accessibility for patients with lower literacy.

## Abstract

Background: Patient education plays a key role in helping individuals understand their health conditions and participate in treatment. With the growing use of artificial intelligence (AI), tools like ChatGPT (OpenAI, San Francisco, CA, USA) and Google Gemini (Mountain View, CA, USA) are increasingly being used to generate patient information. This study assessed how readable and reliable AI-generated brochures on bariatric surgery are.

Methods: A cross-sectional study was conducted in September 2024 to generate patient brochures on six common bariatric procedures using ChatGPT and Google Gemini. Each brochure was evaluated for readability using the Flesch-Kincaid metrics, and “similarity” was assessed using Quillbot to estimate text overlap with existing literature (higher similarity indicating greater overlap). Reliability was measured using the modified DISCERN score, where higher scores reflect more trustworthy health information.
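For readers unfamiliar with the Flesch-Kincaid metrics named in the Methods, the sketch below shows how the two standard scores (Flesch Reading Ease and Flesch-Kincaid Grade Level) are computed from word, sentence, and syllable counts. It is an illustration only: the abstract does not say which calculator the authors used, and the helper names (`count_syllables`, `text_counts`) and the vowel-run syllable heuristic are assumptions made here, so scores from this code will not exactly match those from a dedicated readability tool.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the authors' method):
    count runs of consecutive vowels, with a minimum of one syllable."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher means easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "Bariatric surgery changes the digestive system. It helps with weight loss."
    w, s, syl = text_counts(sample)
    print(f"words={w}, sentences={s}, syllables={syl}")
    print(f"Reading Ease: {flesch_reading_ease(w, s, syl):.1f}")
    print(f"Grade Level:  {flesch_kincaid_grade(w, s, syl):.1f}")
```

Because the grade-level score approximates a US school grade, values of about 13 or higher correspond to the college-level readability reported in the Results. Both formulas rise in difficulty with longer sentences and more syllables per word, which is why shortening sentences and replacing polysyllabic terms are the usual levers for simplification.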

Results: ChatGPT generally produced longer brochures with more sentences, while Gemini generated shorter text with slightly longer sentences. Despite these structural differences, both tools produced content with similar readability levels (approximately college level) and comparable reliability scores. Gemini showed higher similarity with pre-existing text, while ChatGPT produced more original phrasing. Overall, both tools generated patient information of good reliability but with limited accessibility for individuals with lower literacy.

Conclusion: ChatGPT and Google Gemini can produce reliable educational material on bariatric surgery, but the readability remains higher than ideal for the average patient. Human editing and simplification may be necessary to make AI-generated brochures more accessible and suitable for routine patient education.

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12803742/full.md

---
Source: https://tomesphere.com/paper/PMC12803742