Disentangling Shared and Task-Specific Representations from Multi-Modal Clinical Data
He Lyu, Huolin Zeng, Junren Wang, Huazhen Yang, Linchao He, Yong Chen, Zhirui Li, Andreas Maier, Siming Bayer, Huan Song

TL;DR
This paper introduces OrthTD, a novel multi-task learning framework using a Transformer and orthogonal decomposition to effectively disentangle shared and task-specific features in multimodal clinical data, improving outcome prediction.
Contribution
The paper proposes OrthTD, a new method that separates shared and task-specific representations with orthogonality constraints, enhancing multi-outcome clinical predictions from multimodal data.
Findings
OrthTD achieved an average AUC of 87.5% and AUPRC of 37.2% on real-world data.
OrthTD outperformed existing tabular and multi-task methods in predictive accuracy.
Enforcing orthogonality reduced redundancy and improved detection of rare events.
Abstract
Real-world clinical data is inherently multimodal, providing complementary evidence that mirrors the practical necessity of jointly assessing multiple related outcomes. Although multi-task learning can improve efficiency by sharing information across outcomes, existing approaches often fail to balance shared representation learning with outcome-specific modeling. Hard parameter sharing can trigger negative transfer when task gradients conflict, while flexible sharing may still entangle shared and task-specific signals. To address this, we propose a multi-task framework built on a unified Transformer for multimodal fusion, augmented with Orthogonal Task Decomposition (OrthTD) to split patient representations into shared and task-specific subspaces and impose a geometric orthogonality constraint to reduce redundancy and isolate task-specific signals. We evaluated OrthTD on a real-world…
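The abstract does not spell out the exact form of the geometric orthogonality constraint, so the following is only a minimal sketch of one common choice, not the authors' implementation. It assumes the penalty is the squared Frobenius norm of S^T T, where the rows of S and T are per-patient shared and task-specific embeddings. The names `cross_gram` and `orthogonality_penalty` are hypothetical and introduced here for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def cross_gram(shared: Sequence[Sequence[float]],
               specific: Sequence[Sequence[float]]) -> Matrix:
    """Return S^T T, where each row of S and T is one patient's embedding."""
    if len(shared) != len(specific):
        raise ValueError("shared and specific need one row per patient")
    d_shared, d_specific = len(shared[0]), len(specific[0])
    gram = [[0.0] * d_specific for _ in range(d_shared)]
    # Accumulate the outer product of each patient's two embeddings.
    for s_row, t_row in zip(shared, specific):
        for i, s_val in enumerate(s_row):
            for j, t_val in enumerate(t_row):
                gram[i][j] += s_val * t_val
    return gram


def orthogonality_penalty(shared: Sequence[Sequence[float]],
                          specific: Sequence[Sequence[float]]) -> float:
    """Squared Frobenius norm of S^T T (assumed form of the constraint).

    The value is zero exactly when every shared feature column is
    orthogonal to every task-specific feature column over the batch.
    """
    return sum(v * v for row in cross_gram(shared, specific) for v in row)


# Two patients, two shared features, one task-specific feature.
shared = [[1.0, 0.0], [1.0, 0.0]]
disentangled = [[1.0], [-1.0]]      # orthogonal to both shared columns
entangled = [[1.0, 0.0], [1.0, 0.0]]  # duplicates the shared features

print(orthogonality_penalty(shared, disentangled))  # 0.0
print(orthogonality_penalty(shared, entangled))     # 4.0
```

In a training loop, a term like this would typically be added to the sum of per-task losses with a weighting coefficient, so that a task-specific subspace that merely copies the shared one is penalized. Whether OrthTD weights or normalizes the term this way is not stated in the abstract.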
