Simplifying Multimodality: Unimodal Approach to Multimodal Challenges in Radiology with General-Domain Large Language Model
Seonhee Cho, Choonghan Kim, Jiho Lee, Chetan Chilkunda, Sujin Choi, Joo Heung Yoon

TL;DR
This paper presents MID-M, a framework that uses general-domain Large Language Models to handle multimodal radiology data through image descriptions, achieving high performance with fewer parameters and demonstrating robustness to data quality issues.
Contribution
The paper introduces MID-M, a novel unimodal approach leveraging general-domain LLMs for multimodal radiology tasks, reducing reliance on domain-specific multimodal training.
Findings
MID-M performs comparably to or better than task-specific fine-tuned LMMs.
It requires fewer parameters and less domain-specific training.
MID-M shows robustness to data quality variations.
Abstract
Recent advancements in Large Multimodal Models (LMMs) have attracted interest in their generalization capability with only a few samples in the prompt. This progress is particularly relevant to the medical domain, where the quality and sensitivity of data pose unique challenges for model training and application. However, the dependency on high-quality data for effective in-context learning raises questions about the feasibility of these models when encountering the inevitable variations and errors inherent in real-world medical data. In this paper, we introduce MID-M, a novel framework that leverages the in-context learning capabilities of a general-domain Large Language Model (LLM) to process multimodal data via image descriptions. MID-M achieves a comparable or superior performance to task-specific fine-tuned LMMs and other general-domain ones, without the extensive…
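The core idea in the abstract, replacing each image with a text description so that a text-only LLM can do in-context learning, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Example` class, the `build_prompt` function, the prompt layout, and the sample descriptions are all hypothetical, and the step that turns an image into a description is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One few-shot demonstration (hypothetical structure, not from the paper)."""
    image_description: str  # text standing in for the image, produced by an assumed image-to-text step
    question: str           # task input that accompanies the image
    answer: str             # reference output shown to the model


def build_prompt(examples: List[Example], query_description: str, query_question: str) -> str:
    """Assemble a text-only few-shot prompt from described images."""
    blocks = [
        f"Image description: {ex.image_description}\n"
        f"Question: {ex.question}\n"
        f"Answer: {ex.answer}"
        for ex in examples
    ]
    # The final block leaves the answer empty for the LLM to complete.
    blocks.append(
        f"Image description: {query_description}\n"
        f"Question: {query_question}\n"
        f"Answer:"
    )
    return "\n\n".join(blocks)


# Made-up demonstrations, for illustration only.
shots = [
    Example("Frontal chest radiograph with clear lung fields.", "Is there a pleural effusion?", "No"),
    Example("Frontal chest radiograph with blunting of the left costophrenic angle.", "Is there a pleural effusion?", "Yes"),
]
prompt = build_prompt(shots, "Frontal chest radiograph with an enlarged cardiac silhouette.", "Is there cardiomegaly?")
print(prompt)
```

The resulting string can be sent to any text-only LLM. Because the images enter the prompt only as text, the model itself needs no multimodal training, which is the property the abstract emphasizes.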
Taxonomy
Topics: Radiology practices and education · Topic Modeling · Artificial Intelligence in Healthcare and Education
