MedG-KRP: Medical Graph Knowledge Representation Probing
Gabriel R. Rosenbaum, Lavender Yao Jiang, Ivaxi Sheth, Jaden Stryker,, Anton Alyakin, Daniel Alexander Alber, Nicolas K. Goff, Young Joon Fred Kwon,, John Markert, Mustafa Nasir-Moin, Jan Moritz Niehues, Karl L. Sangwon, Eunice, Yang, and Eric Karl Oermann

TL;DR
This paper introduces a knowledge graph-based method to evaluate and visualize the biomedical reasoning abilities of large language models, aiming to improve their reliability for clinical applications.
Contribution
It presents a novel approach using medical knowledge graphs to assess and interpret LLMs' reasoning in biomedical contexts, addressing limitations of traditional benchmarks.
Findings
GPT-4 performs best in human review but worst in ground truth comparison.
PalmyraMed-70b shows the opposite pattern.
The method enables visualization of LLMs' reasoning pathways in medicine.
Abstract
Large language models (LLMs) have recently emerged as powerful tools, finding many medical applications. LLMs' ability to coalesce vast amounts of information from many sources to generate a response-a process similar to that of a human expert-has led many to see potential in deploying LLMs for clinical use. However, medicine is a setting where accurate reasoning is paramount. Many researchers are questioning the effectiveness of multiple choice question answering (MCQA) benchmarks, frequently used to test LLMs. Researchers and clinicians alike must have complete confidence in LLMs' abilities for them to be deployed in a medical setting. To address this need for understanding, we introduce a knowledge graph (KG)-based method to evaluate the biomedical reasoning abilities of LLMs. Essentially, we map how LLMs link medical concepts in order to better understand how they reason. We test…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsBiomedical Text Mining and Ontologies · Topic Modeling · Semantic Web and Ontologies
MethodsAttention Is All You Need · Linear Layer · Dropout · Multi-Head Attention · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Softmax
