Biomedical knowledge graph-optimized prompt generation for large language models

Karthik Soman; Peter W Rose; John H Morris; Rabia E Akbas; Brett Smith; Braian Peetoom; Catalina Villouta-Reyes; Gabriel Cerono; Yongmei Shi; Angela Rizk-Jackson; Sharat Israni; Charlotte A Nelson; Sui Huang; Sergio E Baranzini

arXiv:2311.17330·cs.CL·May 15, 2024·6 cites

TL;DR

This paper introduces KG-RAG, a token-efficient retrieval augmented generation framework leveraging biomedical knowledge graphs to improve large language models' performance in biomedical domains, reducing costs and enhancing accuracy.

Contribution

The paper presents a token-optimized KG-RAG framework that cuts token consumption by more than 50% and improves the biomedical question-answering performance of LLMs.

Findings

01. Over 50% reduction in token usage without accuracy loss
02. 71% performance boost on biomedical MCQ dataset
03. Enhanced performance of GPT-3.5 and GPT-4 with KG-RAG

Abstract

Large Language Models (LLMs) are being adopted at an unprecedented rate, yet still face challenges in knowledge-intensive domains like biomedicine. Solutions such as pre-training and domain-specific fine-tuning add substantial computational overhead, requiring further domain expertise. Here, we introduce a token-optimized and robust Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging a massive biomedical KG (SPOKE) with LLMs such as Llama-2-13b, GPT-3.5-Turbo and GPT-4, to generate meaningful biomedical text rooted in established knowledge. Compared to the existing RAG technique for Knowledge Graphs, the proposed method utilizes minimal graph schema for context extraction and uses embedding methods for context pruning. This optimization in context extraction results in more than 50% reduction in token consumption without compromising the accuracy,…
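The context-pruning idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for a real sentence-embedding model, and the names `prune_context`, `min_similarity`, and `top_k`, along with their default values, are hypothetical. The sketch only shows the general pattern of scoring each piece of graph context against the question and keeping the most similar pieces, which is how pruning can shrink the prompt.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prune_context(question, statements, min_similarity=0.3, top_k=2):
    """Keep at most top_k statements whose similarity to the question
    is at least min_similarity (both thresholds are illustrative)."""
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(s)), s) for s in statements]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in kept[:top_k]]


if __name__ == "__main__":
    # Hypothetical graph statements, written as plain text for illustration.
    context = [
        "disease psoriasis associates gene HLA-B",
        "disease psoriasis presents symptom itching",
        "compound aspirin treats disease headache",
    ]
    question = "which gene is associated with the disease psoriasis"
    for statement in prune_context(question, context):
        print(statement)
```

Only the statements that survive pruning would be placed in the prompt, so fewer tokens are sent to the model. A real system would swap `embed` for a trained sentence-embedding model and tune the thresholds empirically.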

Code & Models

Repositories

BaranziniLab/KG_RAG (Official)

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques

Methods: WordPiece · BART · Linear Warmup With Linear Decay · BERT · RAG · Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Linear Layer