# Evaluation of Gemini 2.0 AI in Classifying Breast Lesion Status From Dynamic Contrast-Enhanced MRI: A Preliminary Study

**Authors:** Nitin Chetla, Trisha Naidu, Shivam Patel, Andrew Bouras, Harshita Kacham, Vinisha Bonagiri, Nasif Zaman

PMC · DOI: 10.7759/cureus.94144 · 2025-10-08

## TL;DR

This study tests Gemini 2.0's ability to classify breast lesion status on DCE-MRI. The model flags malignant lesions with high recall (97%) but reaches only about 50% accuracy and is strongly biased toward positive and malignant calls.

## Contribution

The study is one of the first to evaluate Gemini 2.0's performance in breast lesion classification using DCE-MRI data.

## Key findings

- Prompt 1 (benign or malignant lesion vs. negative lesion status, 100 scans): Gemini 2.0 achieved 50% accuracy. It flagged every lesion case (100% recall) but misclassified every negative case (0% recall).
- Prompt 2 (benign vs. malignant, 180 scans): the model achieved 52% accuracy, with strong recall for malignant lesions (97%) but poor recall for benign lesions (7%), indicating a bias toward malignancy.

## Abstract

Introduction: Breast MRI, particularly dynamic contrast-enhanced (DCE) MRI, offers high sensitivity in detecting breast lesions but suffers from variability in interpretation. Artificial intelligence (AI) tools like Gemini 2.0 (Google AI, Mountain View, CA) may help streamline and improve diagnostic accuracy. This study evaluates Gemini 2.0’s performance in classifying breast lesion status using application programming interface (API)-based image analysis.

Methods: MRI images were sourced from the publicly available fastMRI Breast dataset, which includes axial DCE-MRI sequences acquired using a 3D Golden-angle Radial Sparse Parallel (GRASP) protocol. Images were converted from DICOM to PNG for compatibility with Gemini 2.0’s API. Two binary classification prompts were tested. Prompt 1 distinguished between (A) benign or malignant lesion and (B) negative lesion status using 100 patient scans. Prompt 2 classified (A) benign vs. (B) malignant lesions in a separate set of 180 patient scans. Responses from the Gemini 2.0 API were recorded, and performance was assessed using accuracy, precision, recall, and F1-score.
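The abstract does not describe how the DICOM-to-PNG conversion was done, so the following is only a minimal sketch of one common approach: min-max rescaling of raw intensities to 8 bits before saving. The function names `rescale_to_uint8` and `dicom_slice_to_png` are hypothetical, a single 2D slice per file is assumed, and the use of `pydicom`, NumPy, and Pillow is an assumption rather than the authors' stated tooling.

```python
def rescale_to_uint8(rows):
    """Linearly map a 2D grid of raw intensities onto the 0-255 range."""
    flat = [value for row in rows for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image carries no contrast; map it to all zeros.
        return [[0 for _ in row] for row in rows]
    scale = 255.0 / (hi - lo)
    return [[int(round((value - lo) * scale)) for value in row] for row in rows]


def dicom_slice_to_png(dicom_path, png_path):
    """Hypothetical helper: read one 2D DICOM slice and save it as an 8-bit PNG."""
    # Third-party imports are local so the rescaling logic above can be
    # used and tested without them (assumed tooling, not from the paper).
    import numpy as np
    import pydicom
    from PIL import Image

    pixels = pydicom.dcmread(dicom_path).pixel_array  # assumed to be 2D
    scaled = np.array(rescale_to_uint8(pixels.tolist()), dtype=np.uint8)
    Image.fromarray(scaled).save(png_path)
```

Min-max rescaling discards the absolute intensity scale, and other windowing choices would produce different PNGs. The choice could therefore affect what the model sees.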

Results: For prompt 1, Gemini 2.0 achieved 50% accuracy. It identified all benign or malignant lesions (100% recall) but misclassified all negative cases, yielding 0% recall for negative lesion status. The F1-score was 0.67 for benign/malignant lesions and 0 for negative lesions. For prompt 2, the model achieved 52% accuracy. It exhibited 97% recall for malignant lesions but only 7% for benign lesions, reflecting a strong bias toward malignancy. The weighted average F1-score was 0.39.
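To make the prompt 1 figures concrete, the sketch below recomputes them from a confusion matrix. The counts are not taken from the paper's tables. They are inferred from the reported percentages: with 100 scans, 100% recall on lesion cases, and 0% recall on negative cases, 50% accuracy implies 50 lesion and 50 negative scans. The function name `binary_report` is illustrative.

```python
def binary_report(tp, fn, fp, tn):
    """Per-class precision, recall, and F1 plus accuracy for a binary task."""

    def safe_div(numerator, denominator):
        # Report 0.0 when a metric is undefined (for example, no predictions
        # of a class), which matches the F1 of 0 quoted for negative cases.
        return numerator / denominator if denominator else 0.0

    def f1(precision, recall):
        return safe_div(2 * precision * recall, precision + recall)

    pos_precision = safe_div(tp, tp + fp)
    pos_recall = safe_div(tp, tp + fn)
    neg_precision = safe_div(tn, tn + fn)
    neg_recall = safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fn + fp + tn),
        "positive": {"precision": pos_precision, "recall": pos_recall,
                     "f1": f1(pos_precision, pos_recall)},
        "negative": {"precision": neg_precision, "recall": neg_recall,
                     "f1": f1(neg_precision, neg_recall)},
    }


# Inferred prompt 1 counts: every scan was labelled "benign or malignant".
report = binary_report(tp=50, fn=0, fp=50, tn=0)
print(report["accuracy"])                   # 0.5
print(round(report["positive"]["f1"], 2))   # 0.67
print(report["negative"]["f1"])             # 0.0
```

A model that answers "lesion present" for every scan reproduces these numbers exactly, which is why the 100% recall should be read alongside the 50% precision.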

Discussion: Gemini 2.0 shows promise in flagging lesion presence but lacks the discriminatory power needed to distinguish benign from malignant or negative cases reliably. The high false-positive rate and class imbalance indicate a need for algorithmic refinement and further validation. Larger and more diverse training datasets are necessary to improve performance and reduce bias. Future research should compare AI classifications to radiologist interpretations in a prospective setting to assess clinical utility.

Conclusion: Gemini 2.0 offers preliminary utility in breast lesion detection on DCE-MRI, particularly in identifying malignancy. However, its limited specificity and poor differentiation between lesion types underscore the need for continued development before clinical deployment.

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** Breast Lesion (MESH:D061325), malignancy (MESH:D009369)
- **Chemicals:** Gemini (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12594284/full.md

---
Source: https://tomesphere.com/paper/PMC12594284