Large Language Models for Multimodal Deformable Image Registration

Mingrui Ma; Weijie Wang; Jie Ning; Jianfeng He; Nicu Sebe; Bruno Lepri

arXiv:2408.10703·cs.CV·August 21, 2024

TL;DR

This paper introduces LLM-Morph, a novel framework that leverages pre-trained Large Language Models combined with CNNs and adapters to improve multimodal deformable image registration across different medical imaging modalities.

Contribution

The paper proposes a new coarse-to-fine MDIR framework that effectively aligns features from different modalities using pre-trained LLMs and multi-scale deformation fields.

Findings

01. Demonstrates effectiveness on MR-CT Abdomen and Brain datasets.
02. Shows potential of pre-trained LLMs for medical image registration.
03. Achieves improved registration accuracy over existing methods.

Abstract

The challenge of Multimodal Deformable Image Registration (MDIR) lies in the conversion and alignment of features between images of different modalities. Generative models (GMs) cannot retain enough of the necessary information from the source modality when converting it to the target one, while non-GMs struggle to align features across the two modalities. In this paper, we propose a novel coarse-to-fine MDIR framework, LLM-Morph, which can be applied to various pre-trained Large Language Models (LLMs) and addresses these concerns by aligning the deep features of medical images from different modalities. Specifically, we first use a CNN encoder to extract deep visual features from cross-modal image pairs. Second, we use the first adapter to adjust these tokens and apply LoRA to fine-tune the weights of the pre-trained LLMs, both aimed at eliminating the domain gap between the pre-trained LLMs and the MDIR task. Third, for…
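The abstract names LoRA as the mechanism for adapting the pre-trained LLM weights. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below adds a trainable low-rank update to a frozen weight matrix. The class name `LoRALinear`, the layer sizes, the rank `r`, the scaling `alpha`, and the random stand-in weights are all assumptions made for this example.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, d_in, d_out, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        # Hypothetical stand-in for a frozen pre-trained weight matrix.
        self.W = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # A gets small random values and B starts at zero,
        # so the low-rank update contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        update = matvec(self.B, matvec(self.A, x))  # low-rank path
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        # Only A and B would be updated during fine-tuning; W stays fixed.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


layer = LoRALinear(d_in=8, d_out=8, r=2)
token = [0.1 * i for i in range(8)]  # hypothetical feature token
out_before = layer.forward(token)
frozen_out = matvec(layer.W, token)
full_params = len(layer.W) * len(layer.W[0])
```

With these assumed sizes, the two low-rank factors hold half as many values as the full weight matrix. In practice the saving is far larger, since LLM layers have dimensions in the thousands while the rank typically stays small.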

Code & Models

Repositories

ninjannn/llm-morph (Official)

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: Adapter · ALIGN