GraphRAG on Consumer Hardware: Benchmarking Local LLMs for Healthcare EHR Schema Retrieval

Peter Fernandes; Ria Kanjilal

arXiv:2605.20815 · cs.CL · May 21, 2026

TL;DR

This paper systematically evaluates GraphRAG for healthcare EHR schema retrieval using open-source LLMs on consumer hardware, identifying a model-size threshold and showing how retrieval design affects performance and reliability.

Contribution

It provides the first comprehensive benchmark of GraphRAG with local LLMs on real-world EHR schema documentation, highlighting practical deployment considerations.

Findings

01. Llama 3.1 produces the richest knowledge graph, with 1,172 entities.

02. Qwen 2.5 achieves the best answer quality score, 3.3/5.

03. Models below approximately 7B parameters struggle with structured output and pipeline completion.

Abstract

Graph-based Retrieval-Augmented Generation (GraphRAG) extends retrieval-augmented generation to support structured reasoning over complex corpora, but its reliability under resource-constrained, privacy-sensitive deployments remains unclear. In healthcare, where Electronic Health Record (EHR) data is complex and strictly regulated, reliance on cloud-based large language models (LLMs) introduces challenges in cost, latency, and compliance. In this work, we present a systematic evaluation of GraphRAG for EHR schema retrieval using locally deployed open-source LLMs. We implement the Microsoft GraphRAG pipeline on real-world EHR schema documentation and benchmark four models: Llama 3.1 (8B), Mistral (7B), Qwen 2.5 (7B), and Phi-4-mini (3.8B), each deployed via Ollama on a single consumer GPU (8 GB VRAM). We evaluate indexing efficiency, knowledge graph construction, query latency,…
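The abstract lists query latency among the metrics measured for models served through Ollama. As a rough illustration only, the sketch below times non-streaming requests against Ollama's local `/api/generate` endpoint and summarizes the results per model. It is not the authors' harness and does not run the GraphRAG pipeline itself. The model tags, the function names (`generate`, `summarize_latency`, `benchmark`), and the choice of summary statistics are all assumptions made for this example.

```python
import json
import statistics
import time
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Assumed Ollama model tags; run `ollama list` to see what is installed locally.
MODELS = ["llama3.1:8b", "mistral:7b", "qwen2.5:7b", "phi4-mini"]


def generate(model: str, prompt: str, url: str = OLLAMA_URL) -> tuple[str, float]:
    """Send one non-streaming request; return (answer text, wall-clock seconds)."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    return body["response"], time.perf_counter() - start


def summarize_latency(seconds: list[float]) -> dict[str, float]:
    """Reduce per-query latencies to mean, median, and worst case."""
    return {
        "mean": statistics.mean(seconds),
        "median": statistics.median(seconds),
        "max": max(seconds),
    }


def benchmark(models, prompts, ask=generate):
    """Time every prompt against every model.

    `ask` is injectable so the loop can be exercised without a running server.
    """
    results = {}
    for model in models:
        timings = [ask(model, prompt)[1] for prompt in prompts]
        results[model] = summarize_latency(timings)
    return results
```

With an Ollama server running and the models pulled, `benchmark(MODELS, ["Which table stores patient encounters?"])` would return one latency summary per model; the example prompt is hypothetical. Wall-clock timing on the client side includes model load time on the first request, so a warm-up query per model is usually worth discarding.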
