MedVisionLlama: Leveraging Pre-Trained Large Language Model Layers to Enhance Medical Image Segmentation
Gurucharan Marthi Krishna Kumar, Aman Chadha, Janine Mendola, Amir Shmuel

TL;DR
This paper introduces MedVisionLlama, a novel approach that integrates pre-trained Large Language Model transformer layers into Vision Transformers to significantly improve medical image segmentation accuracy across various modalities.
Contribution
It proposes a hybrid model combining frozen LLM transformer blocks with Vision Transformers, along with a new attention mechanism and multi-scale fusion for enhanced segmentation performance.
Findings
Average Dice score increased from 0.74 to 0.79
Significant improvements in accuracy and precision
Effective integration of LLMs into vision models
Abstract
Large Language Models (LLMs), known for their versatility in textual data, are increasingly being explored for their potential to enhance medical image segmentation, a crucial task for accurate diagnostic imaging. This study explores enhancing Vision Transformers (ViTs) for medical image segmentation by integrating pre-trained LLM transformer blocks. Our approach, which incorporates a frozen LLM transformer block into the encoder of a ViT-based model, leads to substantial improvements in segmentation performance across various medical imaging modalities. We propose a Hybrid Attention Mechanism that combines global and local feature learning with a Multi-Scale Fusion Block for aggregating features across different scales. The enhanced model shows significant performance gains, including an average Dice score increase from 0.74 to 0.79 and improvements in accuracy, precision, and the…
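For readers unfamiliar with the metric reported above, the Dice score measures overlap between a predicted mask and a ground-truth mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch in plain Python, assuming flattened binary masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two flat binary masks of equal length.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # |A ∩ B|: positions where both masks are foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # |A| + |B|: total foreground count across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(dice_score(predicted, ground_truth))
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixels, so the reported change from 0.74 to 0.79 corresponds to a larger average overlap with the reference segmentations.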
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Topic Modeling · COVID-19 diagnosis using AI
Methods: Softmax · Attention Is All You Need
