Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis
Chun-Hsiao Yeh, Jiayun Wang, Andrew D. Graham, Andrea J. Liu, Bo Tan, Yubei Chen, Yi Ma, Meng C. Lin

TL;DR
This paper presents MDPipe, a multi-modal diagnostic pipeline that leverages large language models to interpret imaging and clinical data for ocular surface disease diagnosis, outperforming existing methods and providing clinical rationales.
Contribution
The paper introduces a novel multi-modal pipeline using LLMs that integrates image interpretation, clinical data, and reasoning for ocular disease diagnosis, advancing beyond traditional classification approaches.
Findings
MDPipe outperforms GPT-4 and existing standards in diagnosis accuracy.
The pipeline provides clinically sound rationales for diagnoses.
It effectively integrates imaging and clinical metadata for improved insights.
Abstract
Accurate diagnosis of ocular surface diseases is critical in optometry and ophthalmology, and it hinges on integrating clinical data sources (e.g., meibography imaging and clinical metadata). Traditional human assessments lack precision in quantifying clinical observations, while current machine-based methods often treat diagnosis as a multi-class classification problem, limiting the output to a predefined closed set of curated answers without reasoning about the clinical relevance of each variable to the diagnosis. To tackle these challenges, we introduce an innovative multi-modal diagnostic pipeline (MDPipe) that employs large language models (LLMs) for ocular surface disease diagnosis. We first employ a visual translator to interpret meibography images by converting them into quantifiable morphology data, facilitating their integration with clinical metadata and enabling the communication…
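The integration step described in the abstract, where image-derived morphology data is combined with clinical metadata so that an LLM can reason over both, might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `MorphologyFeatures` fields, the `build_prompt` helper, the metadata keys, and the prompt wording are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, asdict


@dataclass
class MorphologyFeatures:
    # Hypothetical outputs of a visual translator.
    # Field names are illustrative assumptions, not the paper's feature set.
    gland_count: int
    mean_gland_length_mm: float
    atrophy_percent: float


def build_prompt(morphology: MorphologyFeatures, clinical_metadata: dict) -> str:
    """Serialize image-derived features and clinical metadata into one text prompt."""
    lines = ["Meibography morphology (from visual translator):"]
    for name, value in asdict(morphology).items():
        lines.append(f"- {name}: {value}")
    lines.append("Clinical metadata:")
    for name, value in clinical_metadata.items():
        lines.append(f"- {name}: {value}")
    # Ask for a rationale as well as a diagnosis (assumed wording).
    lines.append(
        "Question: Which ocular surface disease is most likely, "
        "and which variables support that conclusion?"
    )
    return "\n".join(lines)


if __name__ == "__main__":
    features = MorphologyFeatures(
        gland_count=18, mean_gland_length_mm=4.2, atrophy_percent=35.0
    )
    print(build_prompt(features, {"age": 54, "contact_lens_wearer": True}))
```

Turning the image into named numeric fields before prompting means both modalities reach the language model as text, which is what lets the model cite individual variables in its rationale.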
Taxonomy
Topics: Retinal Imaging and Analysis · Digital Imaging for Blood Diseases
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Layer Normalization · Dense Connections · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding
