Disentangling Shared and Task-Specific Representations from Multi-Modal Clinical Data

He Lyu; Huolin Zeng; Junren Wang; Huazhen Yang; Linchao He; Yong Chen; Zhirui Li; Andreas Maier; Siming Bayer; Huan Song

arXiv:2605.03570·cs.LG·May 6, 2026

TL;DR

This paper introduces OrthTD, a multi-task learning framework that combines a unified Transformer for multimodal fusion with orthogonal decomposition to separate shared and task-specific features in multimodal clinical data, with the aim of improving multi-outcome prediction.

Contribution

The paper proposes OrthTD, which splits patient representations into shared and task-specific subspaces under an orthogonality constraint, to improve multi-outcome clinical prediction from multimodal data.

Findings

01. OrthTD achieved an average AUC of 87.5% and AUPRC of 37.2% on real-world data.

02. OrthTD outperformed existing tabular and multi-task methods in predictive accuracy.

03. Enforcing orthogonality reduced redundancy and improved detection of rare events.

Abstract

Real-world clinical data is inherently multimodal, providing complementary evidence that mirrors the practical necessity of jointly assessing multiple related outcomes. Although multi-task learning can improve efficiency by sharing information across outcomes, existing approaches often fail to balance shared representation learning with outcome-specific modeling. Hard parameter sharing can trigger negative transfer when task gradients conflict, while flexible sharing may still entangle shared and task-specific signals. To address this, we propose a multi-task framework built on a unified Transformer for multimodal fusion, augmented with Orthogonal Task Decomposition (OrthTD) to split patient representations into shared and task-specific subspaces and impose a geometric orthogonality constraint to reduce redundancy and isolate task-specific signals. We evaluated OrthTD on a real-world…
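The orthogonality constraint described in the abstract can be illustrated with a small sketch. The excerpt does not give the paper's loss, so everything here is an assumption made for illustration: the function name `orthogonality_penalty`, the use of a squared cosine similarity between each patient's shared and task-specific embedding, and the toy arrays. It shows one common way to penalize overlap between two representations, not the authors' implementation.

```python
import numpy as np


def orthogonality_penalty(shared, specific, eps=1e-8):
    """Mean squared cosine similarity between paired embeddings.

    shared, specific: arrays of shape (batch, dim), one row per patient.
    Returns 0.0 when every pair is orthogonal and approaches 1.0 when
    every pair is parallel. Hypothetical loss form, not taken from the paper.
    """
    shared = np.asarray(shared, dtype=float)
    specific = np.asarray(specific, dtype=float)
    # Normalise rows so the penalty does not depend on embedding magnitude.
    s = shared / (np.linalg.norm(shared, axis=1, keepdims=True) + eps)
    t = specific / (np.linalg.norm(specific, axis=1, keepdims=True) + eps)
    cosine = np.sum(s * t, axis=1)  # per-patient cosine similarity
    return float(np.mean(cosine ** 2))


# Toy embeddings (illustrative values only).
shared_emb = np.array([[1.0, 0.0], [2.0, 0.0]])
orthogonal_emb = np.array([[0.0, 1.0], [0.0, 3.0]])

penalty_orthogonal = orthogonality_penalty(shared_emb, orthogonal_emb)
penalty_redundant = orthogonality_penalty(shared_emb, shared_emb)
```

In a multi-task setting, a term of this kind would typically be added, with a weighting coefficient, to the sum of the per-outcome prediction losses, once per task-specific branch. Whether OrthTD weights or aggregates the term this way is not stated in the excerpt.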
