# Potential and Limitations of Large Language Models for Medical Literature Analysis: A Preliminary Investigation

**Authors:** Takahiro Kamihara, Takuya Omura, Atsuya Shimizu

PMC · DOI: 10.7759/cureus.92590 · Cureus · 2025-09-17

## TL;DR

This paper compares a large language model (Google Gemini) with traditional text mining tools (VOSviewer and KH Coder) for analyzing medical literature, highlighting both the strengths and the limitations of the LLM.

## Contribution

The study demonstrates the potential of LLMs for medical literature analysis while identifying their interpretive and reproducibility limitations.

## Key findings

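- Google Gemini could extract keywords, concepts, and trends similar to those produced by traditional tools.
- LLM-generated co-occurrence network concepts looked visually similar to the networks from the traditional tools, but they reflect the model's own interpretation and could not be compared directly with statistically derived co-occurrence patterns.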
- LLMs offer conceptual summarization strengths but face challenges in reproducibility and transparency.

## Abstract

Objective

While Large Language Models (LLMs) show great promise for various medical applications, their black-box nature and the difficulty of reproducing results have been noted as significant challenges. In contrast, conventional text mining is a well-established methodology, yet its mastery remains time-consuming. This study aimed to determine if an LLM could achieve literature analysis outcomes comparable to those from traditional text mining, thereby clarifying both its utility and inherent limitations.

Methods

We analyzed the abstracts of 5,112 medical papers retrieved from PubMed using the single keyword "text mining." We used Google Gemini 2.5 (Google Inc., Mountain View, CA, USA) and instructed it to extract distinctive words, concepts, trends, and co-occurrence network concepts. These results were then qualitatively compared with those obtained from conventional text mining tools, VOSviewer and KH Coder.
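For readers unfamiliar with the term, a statistically derived co-occurrence network starts from counts of how often two terms appear in the same document. The following is a minimal sketch of that counting step only, not the algorithm used by VOSviewer or KH Coder and not the authors' pipeline; the stopword list, the `tokenize` and `cooccurrence_counts` helpers, and the toy abstracts are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "for", "from", "with", "is", "on"}


def tokenize(text):
    """Return the set of lowercase alphabetic tokens, minus stopwords and short tokens."""
    return {
        token
        for token in re.findall(r"[a-z]+", text.lower())
        if token not in STOPWORDS and len(token) > 2
    }


def cooccurrence_counts(abstracts):
    """Count, for each unordered term pair, how many abstracts contain both terms."""
    counts = Counter()
    for abstract in abstracts:
        # Sorting gives every pair a canonical (alphabetical) order.
        terms = sorted(tokenize(abstract))
        counts.update(combinations(terms, 2))
    return counts


# Toy data standing in for PubMed abstracts (not taken from the study).
toy_abstracts = [
    "Text mining of clinical records",
    "Text mining for drug discovery",
    "Clinical records and drug safety",
]

counts = cooccurrence_counts(toy_abstracts)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Each pair with a nonzero count becomes an edge whose weight is the count; dedicated tools typically add normalization, thresholding, and layout on top of this. An LLM prompted for "co-occurrence network concepts" produces no such explicit counts, which is why the two kinds of output are hard to compare directly.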

Results

Google Gemini appeared to conceptually aggregate individual words and identify research trends. The concepts for co-occurrence networks also showed visual similarity to the networks generated by the traditional tools. However, the LLM’s analytical output was based on its own unique interpretation and could not be directly compared with the statistically derived co-occurrence patterns. Furthermore, since this study relied on a visual comparison of network diagrams rather than rigorous quantitative analysis, the conclusions remain qualitative.
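To illustrate what a quantitative comparison could look like, as opposed to the visual comparison used in the study, the sketch below computes the Jaccard similarity between the edge sets of two networks. This is one possible measure chosen here for illustration; the study did not perform it, and the `edge_set` and `jaccard` helpers and both edge lists are hypothetical.

```python
def edge_set(edges):
    """Normalize edges to unordered pairs so (a, b) and (b, a) compare equal."""
    return {frozenset(edge) for edge in edges}


def jaccard(edges_a, edges_b):
    """Jaccard similarity: shared edges divided by all distinct edges."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    if not union:
        # Two empty networks are treated as identical by convention.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical edge lists, not data from the paper.
llm_edges = [("text", "mining"), ("clinical", "records"), ("drug", "discovery")]
tool_edges = [("mining", "text"), ("clinical", "records"), ("gene", "expression")]

score = jaccard(llm_edges, tool_edges)
print(f"Edge-set Jaccard similarity: {score:.2f}")
```

A measure like this presupposes that both networks use the same node vocabulary. Because the LLM aggregates individual words into concepts, some mapping between concepts and words would be needed before such a score is meaningful.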

Conclusion

Google Gemini showed an ability to extract keywords, concepts, and trends, and it produced a co-occurrence network visually similar to those generated by conventional text mining tools. While it showed particular strengths in conceptual summarization and trend detection, its limitations (its black-box nature, reproducibility challenges, and subjective interpretations) became apparent. With a proper understanding of these constraints, LLMs may serve as a valuable complementary tool, with the potential to accelerate literature analysis in medical research.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12534714/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12534714/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/PMC12534714/full.md

---
Source: https://tomesphere.com/paper/PMC12534714