From Knowledge Generation to Knowledge Verification: Examining the BioMedical Generative Capabilities of ChatGPT

Ahmed Abdeen Hamed; Alessandro Crimi; Magdalena M. Misiak; Byung Suk Lee

arXiv:2502.14714 · cs.AI · March 25, 2025

Open Access

TL;DR

This paper presents a computational method to evaluate the factual accuracy of biomedical knowledge generated by ChatGPT, combining disease association generation with ontology-based verification to assess reliability.

Contribution

It introduces a novel approach that integrates prompt-engineering and semantic verification to assess the factual correctness of biomedical information from LLMs.

Findings

01. High accuracy in disease, drug, and gene identification (88%-98%)

02. Lower accuracy in symptom identification (49%-61%)

03. Verification shows 89%-91% literature coverage for disease-drug and disease-gene pairs

Abstract

The generative capabilities of LLMs offer opportunities for accelerating tasks but raise concerns about the authenticity of the knowledge they produce. To address these concerns, we present a computational approach that evaluates the factual accuracy of biomedical knowledge generated by an LLM. Our approach consists of two processes: generating disease-centric associations and verifying these associations using the semantic framework of biomedical ontologies. Using ChatGPT as the selected LLM, we designed prompt-engineering processes to establish linkages between diseases and related drugs, symptoms, and genes, and assessed consistency across multiple ChatGPT models (e.g., GPT-turbo and GPT-4). Experimental results demonstrate high accuracy in identifying disease terms (88%-97%), drug names (90%-91%), and genetic information (88%-98%). However, symptom term identification was…
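
The verification step described in the abstract can be illustrated with a minimal sketch. It assumes exact label matching after simple normalization, which is a stand-in and not necessarily how the authors match terms. The function names (`normalize`, `verify_terms`, `accuracy`) and the toy data are hypothetical; they do not come from the paper or from any real ontology.

```python
from typing import Dict, Iterable, Set


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so matching ignores trivial differences."""
    return " ".join(term.lower().split())


def verify_terms(generated: Iterable[str], ontology_terms: Set[str]) -> Dict[str, bool]:
    """Map each generated term to True if it exactly matches an ontology label."""
    vocabulary = {normalize(t) for t in ontology_terms}
    return {term: normalize(term) in vocabulary for term in generated}


def accuracy(results: Dict[str, bool]) -> float:
    """Fraction of generated terms found in the ontology (0.0 for empty input)."""
    if not results:
        return 0.0
    return sum(results.values()) / len(results)


# Hypothetical toy data: not real ontology content or model output from the paper.
toy_symptom_ontology = {"fever", "cough", "fatigue", "headache"}
toy_generated_symptoms = ["Fever", "dry  cough", "Fatigue", "brain fog"]

results = verify_terms(toy_generated_symptoms, toy_symptom_ontology)
score = accuracy(results)
print(results)
print(f"verified: {score:.0%}")
```

Exact matching is deliberately strict here: a generated phrase such as "dry cough" fails even though "cough" is in the toy vocabulary. Free-text phrasing that does not line up with controlled ontology labels is one plausible reason a term category could score lower under this kind of check.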

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Absolute Position Encodings · Layer Normalization · Label Smoothing · Residual Connection · Adam · Softmax