Addressing accuracy and hallucination of LLMs in Alzheimer's disease research through knowledge graphs

Tingxuan Xu; Jiarui Feng; Justin Melendez; Kaleigh Roberts; Donghong Cai; Mingfang Zhu; Donald Elbert; Yixin Chen; Randall J. Bateman

arXiv:2508.21238·cs.AI·September 1, 2025

Open Access

TL;DR

This study evaluates the effectiveness of GraphRAG systems in improving the accuracy and traceability of LLMs like GPT-4o for Alzheimer's disease research by constructing a specialized knowledge base and comparing response quality.

Contribution

The paper introduces a comprehensive Alzheimer's disease knowledge base for GraphRAG, compares GraphRAG's performance with that of standard GPT-4o, and assesses the traceability of its responses, advancing domain-specific LLM applications.

Findings

01. GraphRAG improves response accuracy over standard GPT-4o.

02. Enhanced traceability in GraphRAG aids scientific research.

03. The provided interface facilitates testing of LLMs in biomedical domains.

Abstract

In the past two years, large language model (LLM)-based chatbots, such as ChatGPT, have revolutionized various domains by enabling diverse task completion and question-answering capabilities. However, their application in scientific research remains constrained by challenges such as hallucinations, limited domain-specific knowledge, and a lack of explainability or traceability in their responses. Graph-based Retrieval-Augmented Generation (GraphRAG) has emerged as a promising approach to improving chatbot reliability by integrating domain-specific contextual information before response generation, addressing some limitations of standard LLMs. Despite its potential, few studies have evaluated GraphRAG in knowledge-intensive domains such as Alzheimer's disease or other biomedical fields. In this paper, we assess the quality and traceability of two…
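The abstract describes GraphRAG as supplying domain-specific context to the model before it generates a response. As a rough illustration of that idea only, and not the systems evaluated in the paper, the sketch below stores a toy knowledge graph as subject-relation-object triples, retrieves the triples whose entities appear in a question, and builds a prompt in which each fact carries a source label so an answer can be traced back to it. Every name here (`Triple`, `GRAPH`, `retrieve`, `build_prompt`) and every `placeholder-doc-*` label is a hypothetical stand-in, and the substring matching is deliberately naive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str
    source: str  # hypothetical provenance label, not a real citation


# Toy graph for illustration; the source labels are placeholders.
GRAPH = [
    Triple("APOE4", "is a genetic risk factor for", "Alzheimer's disease", "placeholder-doc-1"),
    Triple("amyloid-beta", "aggregates into", "amyloid plaques", "placeholder-doc-2"),
    Triple("tau", "forms", "neurofibrillary tangles", "placeholder-doc-3"),
]


def retrieve(question, graph):
    """Return triples whose subject or object is mentioned in the question.

    Uses case-insensitive substring matching, which is far cruder than the
    entity linking a real GraphRAG system would need.
    """
    q = question.lower()
    return [t for t in graph if t.subject.lower() in q or t.obj.lower() in q]


def build_prompt(question, graph):
    """Prepend the retrieved facts, each tagged with its source, to the question."""
    facts = retrieve(question, graph)
    lines = [
        f"[{i}] {t.subject} {t.relation} {t.obj} (source: {t.source})"
        for i, t in enumerate(facts, start=1)
    ]
    context = "\n".join(lines) if lines else "(no matching facts)"
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Cite facts by their [number]."
    )


# The resulting string would be passed to an LLM; no model call is made here.
prompt = build_prompt("How does APOE4 relate to tau?", GRAPH)
print(prompt)
```

Because the numbered facts keep their source labels in the prompt, a response that cites `[1]` or `[2]` can be checked against the graph entry it came from, which is the kind of traceability the abstract refers to.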

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · AI in Service Interactions