MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline
Mohamed Yaseen Jabarulla, Steffen Oeltze-Jafra, Philipp Beerbaum, Theodor Uden

TL;DR
This paper introduces MedDoc-Bot, a user-friendly tool for evaluating open-source large language models' ability to interpret pediatric hypertension guidelines from PDFs, combining automated metrics and expert human assessment.
Contribution
It presents a novel chatbot platform for medical document analysis that compares multiple LLMs' performance on real-world pediatric hypertension guidelines.
Findings
Llama-2 and Mistral performed well in metrics evaluation.
Llama-2 was slower with text and tabular data.
Responses from Mistral, Meditron, and Llama-2 showed reasonable fidelity and relevance.
Abstract
This research focuses on evaluating the non-commercial open-source large language models (LLMs) Meditron, MedAlpaca, Mistral, and Llama-2 for their efficacy in interpreting medical guidelines saved in PDF format. As a specific test scenario, we applied these models to the guidelines for hypertension in children and adolescents provided by the European Society of Cardiology (ESC). Leveraging Streamlit, a Python library, we developed a user-friendly medical document chatbot tool (MedDoc-Bot). This tool enables authorized users to upload PDF files and pose questions, generating interpretive responses from four locally stored LLMs. A pediatric expert provides a benchmark for evaluation by formulating questions and responses extracted from the ESC guidelines. The expert rates the model-generated responses based on their fidelity and relevance. Additionally, we evaluated the METEOR and chrF…
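To illustrate the kind of automated metric the abstract mentions, the sketch below computes a simplified character n-gram F-score in the spirit of chrF. It is not the authors' evaluation code and not the reference chrF implementation: the function names `char_ngrams` and `simple_chrf`, the whitespace stripping, and the defaults `max_n=6` and `beta=2.0` are assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after removing whitespace (a simplifying assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score in [0, 1].

    Averages character n-gram precision and recall over n = 1..max_n,
    then combines them as an F-beta score (beta > 1 favours recall).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One of the texts is too short to contain n-grams of this order.
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)


if __name__ == "__main__":
    # Hypothetical expert answer and model response, for illustration only.
    expert = "Blood pressure should be measured at every clinic visit."
    model = "Blood pressure should be checked at each clinic visit."
    print(round(simple_chrf(model, expert), 3))
```

A score of this kind rewards character-level overlap with the expert's reference answer, so it tolerates small morphological differences but does not judge clinical correctness, which is why the paper pairs automated metrics with expert ratings of fidelity and relevance.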
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education
