Adapting Frozen Mono-modal Backbones for Multi-modal Registration via Contrast-Agnostic Instance Optimization
Yi Zhang, Yidong Zhao, Qian Tao

TL;DR
This paper introduces a method for multi-modal image registration that leverages frozen mono-modal models combined with style transfer and lightweight adaptation, avoiding expensive full fine-tuning.
Contribution
It presents a novel framework that integrates style transfer and instance optimization with frozen mono-modal models for efficient multi-modal registration.
Findings
Achieves consistent improvements over state-of-the-art mono-modal models.
Ranks second on the multi-modal subset and third on the out-of-domain subset by Dice score.
Demonstrates effectiveness of combining frozen models with modality adaptation.
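The rankings above are reported by Dice score, which measures the overlap between an anatomical label in the fixed image and the same label after warping the moving image. The sketch below shows the standard definition, 2|A∩B| / (|A| + |B|), for a single binary label. The function name `dice_score`, the flat 0/1 list representation, and the empty-mask convention are assumptions made for illustration; the evaluation protocol actually used in the paper (for example, how labels are averaged) is not stated in this summary.

```python
def dice_score(fixed_mask, warped_mask):
    """Dice overlap between two binary masks given as equal-length flat sequences of 0/1."""
    if len(fixed_mask) != len(warped_mask):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as foreground in both masks.
    intersection = sum(1 for a, b in zip(fixed_mask, warped_mask) if a and b)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for a in fixed_mask if a) + sum(1 for b in warped_mask if b)
    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    # Identical masks overlap perfectly; disjoint masks do not overlap at all.
    print(dice_score([1, 1, 0, 0], [1, 1, 0, 0]))  # 1.0
    print(dice_score([1, 1, 0, 0], [0, 0, 1, 1]))  # 0.0
```

A score of 1 means the warped label coincides exactly with the fixed label, and 0 means no overlap, so higher values indicate better anatomical alignment.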
Abstract
Deformable image registration remains a central challenge in medical image analysis, particularly in multi-modal scenarios where intensity distributions vary significantly across scans. While deep learning methods provide efficient feed-forward predictions, they often fail to generalize robustly under distribution shifts at test time. A straightforward remedy is full network fine-tuning, yet for modern architectures such as Transformers or deep U-Nets, this adaptation is prohibitively expensive in both memory and runtime when operating in 3D. Moreover, naive fine-tuning is prone to performance degradation under drastic domain shifts. In this work, we propose a registration framework that integrates a frozen pretrained mono-modal registration model with a lightweight adaptation pipeline for multi-modal image registration.…
