Vision-Language Model Based Multi-Expert Fusion for CT Image Classification

Jianfa Bai; Kejin Lu; Runtian Yuan; Qingqiu Li; Jilan Xu; Junlin Hou; Yuejie Zhang; Rui Feng

arXiv:2603.15154·eess.IV·March 18, 2026

Open Access

TL;DR

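This paper introduces a three-stage, source-aware multi-expert framework for multi-source COVID-19 CT classification. It combines a lung-aware 3D expert with two MedSigLIP-based experts (one slice-wise, one Transformer-based for inter-slice context) and fuses them using a predicted source identity. Reported results include a Stage 1 macro-F1 of 0.9711 and Stage 2 AUC scores over 0.985.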

Contribution

It presents a novel three-stage multi-expert fusion approach that explicitly models source information to improve multi-source COVID-19 CT classification robustness.

Findings

01. The Stage 1 model achieved a macro-F1 of 0.9711.

02. The Stage 2 experts achieved AUC scores over 0.985.

03. The source classifier reached over 91% accuracy.

Abstract

Robust detection of COVID-19 from chest CT remains challenging in multi-institutional settings due to substantial source shift, source imbalance, and hidden test-source identities. In this work, we propose a three-stage source-aware multi-expert framework for multi-source COVID-19 CT classification. First, we build a lung-aware 3D expert by combining original CT volumes and lung-extracted CT volumes for volumetric classification. Second, we develop two MedSigLIP-based experts: a slice-wise representation and probability learning module, and a Transformer-based inter-slice context modeling module for capturing cross-slice dependency. Third, we train a source classifier to predict the latent source identity of each test scan. By leveraging the predicted source information, we perform model fusion and voting based on different experts. On the validation set covering all four sources, the…
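
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion rule, so everything concrete here is an assumption: the per-source weight table `SOURCE_WEIGHTS`, the 0.5 threshold, and the function names `fuse_scan` and `majority_vote` are hypothetical, and the numbers are placeholders rather than values from the paper. The sketch only shows the general idea of picking expert weights from a predicted source and, alternatively, voting across experts.

```python
import numpy as np

# Hypothetical per-source weights over three experts, in this order:
# (lung-aware 3D, slice-wise MedSigLIP, Transformer inter-slice).
# Illustrative placeholders only; the paper's actual rule is not stated here.
SOURCE_WEIGHTS = {
    0: np.array([0.50, 0.25, 0.25]),
    1: np.array([0.20, 0.40, 0.40]),
    2: np.array([1 / 3, 1 / 3, 1 / 3]),
    3: np.array([0.25, 0.25, 0.50]),
}


def fuse_scan(expert_probs, source_probs, threshold=0.5):
    """Weighted fusion of expert COVID-19 probabilities for one scan.

    expert_probs: three probabilities, one per expert.
    source_probs: source-classifier output over the four sources.
    Returns (label, fused_probability, predicted_source).
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    # The test-time source is hidden, so use the classifier's best guess.
    source = int(np.argmax(source_probs))
    fused = float(np.dot(SOURCE_WEIGHTS[source], expert_probs))
    return int(fused >= threshold), fused, source


def majority_vote(expert_probs, threshold=0.5):
    """Simple majority vote over thresholded expert decisions."""
    votes = np.asarray(expert_probs, dtype=float) >= threshold
    return int(votes.sum() * 2 > votes.size)


if __name__ == "__main__":
    probs = [0.9, 0.2, 0.3]          # made-up expert outputs
    src = [0.7, 0.1, 0.1, 0.1]       # made-up source-classifier output
    print(fuse_scan(probs, src))
    print(majority_vote(probs))
```

With these made-up inputs, weighted fusion and majority voting disagree: the confident 3D expert dominates under the weights assumed for source 0, while two of the three experts vote negative. This is the kind of case where a source-dependent choice between fusion strategies could matter.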

Taxonomy

Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications