Unleashing the Power of Image-Tabular Self-Supervised Learning via Breaking Cross-Tabular Barriers

Yibing Fu; Yunpeng Zhao; Zhitao Zeng; Cheng Chen; Yueming Jin

arXiv:2512.14026 · cs.CV · December 17, 2025

Open Access

TL;DR

This paper introduces CITab, a novel self-supervised learning framework that enhances multi-modal medical image and tabular data analysis by overcoming inter-tabular barriers, improving transferability and scalability across diverse datasets.

Contribution

CITab combines a semantic-aware tabular modeling mechanism with a prototype-guided mixture-of-linear layer (P-MoLin) to better handle heterogeneous tabular data in multi-modal SSL.
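The summary above does not describe how the prototype-guided mixture-of-linear layer works internally. As a rough illustration of the general idea only, the sketch below assumes a soft routing scheme: each input is compared with a set of prototypes by cosine similarity, and the similarities weight the outputs of several linear experts. The class name `PrototypeMixtureOfLinear`, the `temperature` parameter, the random initialization, and the routing rule are all hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)


class PrototypeMixtureOfLinear:
    """Hypothetical sketch, NOT the CITab implementation.

    Assumption: one prototype per linear expert; the gate is a softmax
    over temperature-scaled cosine similarities to the prototypes.
    """

    def __init__(self, d_in, d_out, n_experts, temperature=0.1, seed=0):
        rng = random.Random(seed)
        # One prototype vector per expert, living in the input space.
        self.prototypes = [[rng.gauss(0, 1) for _ in range(d_in)]
                           for _ in range(n_experts)]
        # One (d_out x d_in) weight matrix and one bias vector per expert.
        self.weights = [[[rng.gauss(0, 1) / math.sqrt(d_in)
                          for _ in range(d_in)]
                         for _ in range(d_out)]
                        for _ in range(n_experts)]
        self.biases = [[0.0] * d_out for _ in range(n_experts)]
        self.temperature = temperature

    def gate(self, x):
        """Soft assignment of x to the experts via prototype similarity."""
        sims = [cosine(x, p) / self.temperature for p in self.prototypes]
        return softmax(sims)

    def forward(self, x):
        """Gate-weighted sum of the experts' linear outputs."""
        g = self.gate(x)
        out = [0.0] * len(self.biases[0])
        for g_k, W, b in zip(g, self.weights, self.biases):
            for j, (row, b_j) in enumerate(zip(W, b)):
                out[j] += g_k * (sum(w * xi for w, xi in zip(row, x)) + b_j)
        return out


layer = PrototypeMixtureOfLinear(d_in=4, d_out=2, n_experts=3)
x = [0.5, -1.0, 2.0, 0.1]
gate = layer.gate(x)   # three non-negative weights that sum to 1
y = layer.forward(x)   # output vector of length 2
```

In this sketch, an input that lies close to one prototype is handled mostly by that prototype's linear expert, which is one plausible way a layer could specialize on different kinds of tabular features. A trained version would learn the prototypes and expert weights rather than sampling them at random.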

Findings

01

CITab outperforms existing methods on Alzheimer's diagnosis across multiple datasets.

02

The semantic-aware modeling improves transferability of learned representations.

03

The P-MoLin module enhances feature specialization for diverse tabular data.

Abstract

Multi-modal learning that integrates medical images and tabular data has significantly advanced clinical decision-making in recent years. Self-Supervised Learning (SSL) has emerged as a powerful paradigm for pretraining these models on large-scale unlabeled image-tabular data, aiming to learn discriminative representations. However, existing SSL methods for image-tabular representation learning are often confined to specific data cohorts, mainly because their tabular modeling mechanisms are too rigid to handle heterogeneous tabular data. This inter-tabular barrier hinders multi-modal SSL methods from effectively learning transferable medical knowledge shared across diverse cohorts. In this paper, we propose a novel SSL framework, namely CITab, designed to learn powerful multi-modal feature representations in a cross-tabular manner. We design the tabular modeling mechanism from a…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis