EI: Early Intervention for Multimodal Imaging based Disease Recognition

Qijie Wei; Hailan Lin; Xirong Li

arXiv:2603.17514·cs.CV·April 7, 2026

TL;DR

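This paper introduces Early Intervention (EI), a framework for disease recognition from multimodal medical images. EI uses high-level semantic tokens from the reference modalities to steer the target modality's embedding at an early stage, and pairs this with Mixture of Low-varied-Ranks Adaptation (MoR), a parameter-efficient fine-tuning method.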

Contribution

The paper proposes a new early intervention approach and a low-rank adaptation method to better utilize multimodal data and pretrained vision models in medical diagnosis.
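To make the "low-rank adaptation" idea concrete, here is a minimal sketch of a generic low-rank update applied to one frozen linear layer. It is not the paper's MoR method, whose details are not given on this page; the function name `lora_forward`, the layer sizes, and the rank are all hypothetical choices for illustration.

```python
import numpy as np

def lora_forward(x, w_frozen, a, b, scale=1.0):
    """Frozen linear layer plus a trainable low-rank correction (x @ a) @ b."""
    return x @ w_frozen + scale * (x @ a) @ b

rng = np.random.default_rng(0)
d_in, d_out, rank = 768, 768, 8            # hypothetical sizes, not from the paper

w_frozen = rng.normal(size=(d_in, d_out))  # pretrained weight, kept fixed
a = rng.normal(size=(d_in, rank)) * 0.01   # trainable down-projection
b = np.zeros((rank, d_out))                # trainable up-projection, zero-initialised

x = rng.normal(size=(2, d_in))
y = lora_forward(x, w_frozen, a, b)

# Trainable parameters: full fine-tuning vs. the low-rank factors only.
full_params = d_in * d_out
lora_params = rank * (d_in + d_out)
print(full_params, lora_params)
```

Because `b` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the two small factors would need gradient updates.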

Findings

01. EI outperforms baseline methods on three public datasets.

02. The proposed Mixture of Low-varied-Ranks Adaptation (MoR) method is parameter-efficient and effective.

03. Early intervention improves multimodal embedding quality.

Abstract

Current methods for multimodal medical imaging based disease recognition face two major challenges. First, the prevailing "fusion after unimodal image embedding" paradigm cannot fully leverage the complementary and correlated information in the multimodal data. Second, the scarcity of labeled multimodal medical images, coupled with their significant domain shift from natural images, hinders the use of cutting-edge Vision Foundation Models (VFMs) for medical image embedding. To jointly address the challenges, we propose a novel Early Intervention (EI) framework. Treating one modality as target and the rest as reference, EI harnesses high-level semantic tokens from the reference as intervention tokens to steer the target modality's embedding process at an early stage. Furthermore, we introduce Mixture of Low-varied-Ranks Adaptation (MoR), a parameter-efficient fine-tuning method that…
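The early-intervention idea described above can be illustrated with a rough sketch. The abstract does not say how the intervention tokens are injected, so this example assumes one plausible mechanism: the intervention tokens are concatenated with the target modality's patch tokens before a self-attention layer, and only the target positions are kept afterwards. All names, shapes, and the single-head attention are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a (n_tokens, dim) array."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def early_intervention_layer(target_tokens, intervention_tokens, w_q, w_k, w_v):
    """Let target tokens attend to intervention tokens inside an early layer.

    Assumed mechanism (not confirmed by the source): concatenate, attend,
    then keep only the rows that correspond to the target tokens.
    """
    joint = np.concatenate([intervention_tokens, target_tokens], axis=0)
    out = self_attention(joint, w_q, w_k, w_v)
    return out[intervention_tokens.shape[0]:]

rng = np.random.default_rng(0)
dim = 16
target_tokens = rng.normal(size=(49, dim))       # patch tokens of the target modality
intervention_tokens = rng.normal(size=(4, dim))  # stand-in for high-level reference tokens
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

steered = early_intervention_layer(target_tokens, intervention_tokens, w_q, w_k, w_v)
plain = self_attention(target_tokens, w_q, w_k, w_v)
```

In this sketch `steered` has the same shape as `plain` but different values, since every target token's attention now also covers the reference-derived tokens. This contrasts with the "fusion after unimodal image embedding" paradigm, where the two modalities would only meet after each had been embedded separately.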
