DrugRAG: Enhancing Pharmacy LLM Performance Through A Novel Retrieval-Augmented Generation Pipeline
Houman Kazemzadeh, Kiarash Mokhtari Dizaji, Seyed Reza Tavakoli, Farbod Davoodi, MohammadReza KarimiNejad, Parham Abed Azad, Fatemeh Latifi, Ali Sabzi, Armin Khosravi, Siavash Ahmadi, Babak Khalaj, Mohammad Hossein Rohban, Gholamali Aminian, Zohreh Amoozgar, Tahereh Javaheri

TL;DR
This paper introduces DrugRAG, a retrieval-augmented generation pipeline that enhances pharmacy question-answering accuracy of large language models by integrating external drug knowledge without altering the models.
Contribution
The study presents DrugRAG, a novel external knowledge retrieval method that significantly improves LLM performance on pharmacy tasks without changing model architecture.
Findings
DrugRAG increased accuracy by 7 to 21 percentage points across all five models it was evaluated on.
Statistically significant improvements were observed mainly in smaller and mid-sized open-source models.
Baseline accuracy across the ten benchmarked LLMs ranged from 46% to 92% on a 141-question pharmacy dataset, with GPT-5 (92%) and o3 (89%) scoring highest.
Abstract
In our study, we evaluated large language model (LLM) performance on pharmacy licensure-style question-answering tasks and developed an external knowledge integration method to improve accuracy. We benchmarked ten LLMs with varying parameter sizes (8 billion to 70+ billion) using a 141-question pharmacy dataset, measuring baseline accuracy without modification. Baseline performance ranged from 46% to 92%, with GPT-5 (92%) and o3 (89%) achieving the highest scores, while smaller open-source models showed substantially lower performance. We then developed DrugRAG, a three-step retrieval-augmented generation (RAG) pipeline that retrieves structured, evidence-based drug information and augments model prompts with contextual pharmacological evidence, operating externally and requiring no changes to model architecture or parameters. DrugRAG increased accuracy across all five evaluated models,…
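To make the idea of external prompt augmentation concrete, here is a minimal sketch in Python. The abstract does not spell out the three steps of the pipeline, so the decomposition below (term extraction, evidence lookup, prompt assembly) is an assumption, and every name in it (`DrugEntry`, `KNOWLEDGE_BASE`, `extract_drug_terms`, `retrieve_evidence`, `build_augmented_prompt`, `augment_question`) is hypothetical, as is the toy knowledge base. It is not the authors' implementation; it only illustrates how evidence can be added to a prompt without touching the model.

```python
from dataclasses import dataclass


@dataclass
class DrugEntry:
    """One structured record in a toy, hypothetical drug knowledge base."""
    name: str
    facts: str


# Illustrative entries only; a real pipeline would query a curated drug database.
KNOWLEDGE_BASE = [
    DrugEntry("warfarin", "Vitamin K antagonist anticoagulant; monitor INR."),
    DrugEntry("metformin", "Biguanide used in type 2 diabetes."),
    DrugEntry("lisinopril", "ACE inhibitor; may cause dry cough."),
]


def extract_drug_terms(question, kb):
    """Assumed step 1: find knowledge-base drug names mentioned in the question."""
    text = question.lower()
    return [entry.name for entry in kb if entry.name in text]


def retrieve_evidence(terms, kb):
    """Assumed step 2: look up the structured entries for the extracted terms."""
    wanted = set(terms)
    return [entry for entry in kb if entry.name in wanted]


def build_augmented_prompt(question, evidence):
    """Assumed step 3: prepend retrieved evidence to the original question."""
    if not evidence:
        # Nothing retrieved: pass the question through unchanged.
        return question
    context = "\n".join(f"- {e.name}: {e.facts}" for e in evidence)
    return f"Evidence:\n{context}\n\nQuestion: {question}"


def augment_question(question, kb=KNOWLEDGE_BASE):
    """Run the three assumed steps; the LLM itself is never modified."""
    terms = extract_drug_terms(question, kb)
    evidence = retrieve_evidence(terms, kb)
    return build_augmented_prompt(question, evidence)


if __name__ == "__main__":
    print(augment_question("Which lab value should be monitored for Warfarin?"))
```

The augmented string would then be sent to any LLM as an ordinary prompt, which is why an approach of this kind needs no change to model architecture or parameters.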
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare
