RadJEPA: Radiology Encoder for Chest X-Rays via Joint Embedding Predictive Architecture

Anas Anwarul Haq Khan; Mariam Husain; Kshitij Jadhav

arXiv:2601.15891 · cs.CV · May 19, 2026

TL;DR

RadJEPA is a self-supervised framework for training chest X-ray encoders. It learns by predicting latent representations of masked image regions, uses no language supervision, and is reported to outperform existing methods such as Rad-DINO.

Contribution

The paper introduces a latent-space prediction objective for self-supervised learning on radiology images, removing the need for paired image-text data.

Findings

01

RadJEPA surpasses state-of-the-art methods such as Rad-DINO on multiple benchmarks.

02

The model is pre-trained solely on unlabeled chest X-ray images, and the resulting encoder transfers to several downstream tasks.

03

It demonstrates strong performance in disease classification, segmentation, and report generation.

Abstract

Recent advances in medical vision-language models guide the learning of visual representations; however, this form of supervision is constrained by the availability of paired image-text data, raising the question of whether robust radiology encoders can be learned without relying on language supervision. In this work, we introduce RadJEPA, a self-supervised framework built on a Joint Embedding Predictive Architecture that learns without language supervision. Pre-trained solely on unlabeled chest X-ray images, the model learns to predict latent representations of masked image regions. This predictive objective differs fundamentally from both image-text pre-training and DINO-style self-distillation: rather than aligning global representations across views or modalities, RadJEPA explicitly models latent-space prediction. We evaluate the learned encoder on disease classification, semantic…
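
The latent-space prediction objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear-plus-tanh encoder, the pooled-context predictor, the mean-squared-error loss, and the momentum (EMA) update of the target encoder are all assumptions borrowed from generic JEPA-style setups, and every name below (encode, jepa_loss, ema_update, w_context, w_target, w_predictor, pos) is hypothetical.

```python
import numpy as np

def encode(patches, weights):
    """Toy per-patch encoder (assumed stand-in for a vision transformer)."""
    return np.tanh(patches @ weights)

def jepa_loss(patches, pos, target_mask, w_context, w_target, w_predictor):
    """Latent-prediction loss for one image.

    patches:     (num_patches, patch_dim) flattened image patches
    pos:         (num_patches, latent_dim) positional embeddings
    target_mask: (num_patches,) boolean, True where a patch is masked out
    """
    # Target latents are computed from the full image and treated as
    # constants: in a real training loop no gradient flows through them.
    targets = encode(patches, w_target)[target_mask]
    # The context encoder only sees the visible (unmasked) patches.
    context = encode(patches[~target_mask], w_context)
    # Assumed predictor: pooled context plus the position of each masked
    # patch, mapped linearly to one predicted latent per masked patch.
    queries = context.mean(axis=0) + pos[target_mask]
    predicted = queries @ w_predictor
    # The loss compares latents, not pixels.
    return float(np.mean((predicted - targets) ** 2))

def ema_update(w_target, w_context, momentum=0.996):
    """Assumed momentum update that lets the target encoder trail the context encoder."""
    return momentum * w_target + (1.0 - momentum) * w_context

# Tiny synthetic example: 16 patches of dimension 8, latent dimension 4.
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))
pos = rng.normal(size=(16, 4))
w_context = rng.normal(size=(8, 4))
w_target = w_context.copy()
w_predictor = rng.normal(size=(4, 4))

target_mask = np.zeros(16, dtype=bool)
target_mask[5:9] = True  # mask a contiguous block of four patches

loss = jepa_loss(patches, pos, target_mask, w_context, w_target, w_predictor)
w_target = ema_update(w_target, w_context)
```

The point of the sketch is the shape of the objective: the prediction target is a latent vector for each masked region rather than pixel values, a caption, or a global embedding of another view, which is the distinction the abstract draws against image-text pre-training and DINO-style self-distillation.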

Code & Models

Models

AIDElab-IITBombay/RadJEPA (Hugging Face model · 1.1k downloads · 9 likes)

Taxonomy

Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education