MedDChest: A Content-Aware Multimodal Foundational Vision Model for Thoracic Imaging

Mahmoud Soliman; Islam Osman; Mohamed S. Shehata; Rasika Rajapakshe

arXiv:2511.04016 · cs.CV · November 7, 2025

TL;DR

MedDChest is a Vision Transformer pre-trained from scratch on more than 1.2 million thoracic images (chest X-ray and CT) using a novel content-aware crop augmentation. On thoracic diagnostic tasks it performs significantly better than models pre-trained on natural images.

Contribution

We introduce MedDChest, a domain-specific vision transformer trained from scratch on multimodal thoracic images, with a new content-aware data augmentation technique for improved medical imaging analysis.

Findings

01 MedDChest outperforms ImageNet-pretrained models on thoracic diagnostic tasks.

02 The content-aware augmentation enhances model focus on relevant anatomical regions.

03 Large-scale in-domain pre-training improves feature extraction for medical imaging.

Abstract

The performance of vision models in medical imaging is often hindered by the prevailing paradigm of fine-tuning backbones pre-trained on out-of-domain natural images. To address this fundamental domain gap, we propose MedDChest, a new foundational Vision Transformer (ViT) model optimized specifically for thoracic imaging. We pre-trained MedDChest from scratch on a massive, curated, multimodal dataset of over 1.2 million images, encompassing different modalities including Chest X-ray and Computed Tomography (CT) compiled from 10 public sources. A core technical contribution of our work is Guided Random Resized Crops, a novel content-aware data augmentation strategy that biases sampling towards anatomically relevant regions, overcoming the inefficiency of standard cropping techniques on medical scans. We validate our model's effectiveness by fine-tuning it on a diverse set of downstream…
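The abstract does not spell out how Guided Random Resized Crops biases its sampling, so the following is only a minimal sketch of one plausible reading: draw several standard random-resized-crop candidates and pick one with probability proportional to how much content it covers. The function names, the binary content mask, the candidate count, and the scale and ratio ranges are all assumptions made for illustration, not the authors' implementation. Plain Python lists are used to keep the example dependency-free.

```python
import math
import random


def foreground_fraction(mask, top, left, height, width):
    """Share of pixels inside the box that the mask marks as content."""
    hits = sum(sum(row[left:left + width]) for row in mask[top:top + height])
    return hits / (height * width)


def guided_crop_box(mask, scale=(0.2, 1.0), ratio=(3 / 4, 4 / 3),
                    n_candidates=10, rng=None):
    """Return (top, left, height, width) of a content-biased random crop.

    `mask` is a hypothetical 2D list of booleans flagging anatomically
    relevant pixels (for example, from a simple intensity threshold).
    """
    rng = rng or random.Random()
    h, w = len(mask), len(mask[0])
    boxes, weights = [], []
    for _ in range(n_candidates):
        # Standard random-resized-crop sampling: area fraction and
        # log-uniform aspect ratio, clamped to the image bounds.
        area = rng.uniform(*scale) * h * w
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        cw = min(max(round(math.sqrt(area * aspect)), 1), w)
        ch = min(max(round(math.sqrt(area / aspect)), 1), h)
        top = rng.randint(0, h - ch)
        left = rng.randint(0, w - cw)
        boxes.append((top, left, ch, cw))
        weights.append(foreground_fraction(mask, top, left, ch, cw))
    if sum(weights) == 0:
        # No candidate touches content: fall back to a uniform choice.
        return rng.choice(boxes)
    # Assumed guidance rule: candidates covering more content are likelier.
    return rng.choices(boxes, weights=weights, k=1)[0]


# Toy 64x64 mask with a 16x16 block of "anatomy" in the centre.
toy_mask = [[24 <= r < 40 and 24 <= c < 40 for c in range(64)]
            for r in range(64)]
box = guided_crop_box(toy_mask, rng=random.Random(0))
```

Compared with uniform random cropping, this kind of weighting makes crops that land entirely on empty background unlikely, which is the inefficiency the abstract attributes to standard cropping on medical scans.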

Taxonomy

Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications