Task Scarcity and Label Leakage in Relational Transfer Learning
Francisco Galuppo Azevedo, Clarissa Lima Loures, Denis Oliveira Correa

TL;DR
This paper investigates how limited supervision in relational models causes label leakage, and proposes a gradient projection method to improve transfer performance across tasks.
Contribution
It introduces a gradient projection technique to mitigate label leakage in relational transfer learning models with limited task diversity.
Findings
Gradient projection improves within-dataset transfer on RelBench by 0.145 AUROC on average.
The method often recovers near single-task performance.
The results suggest that limited task diversity, not just limited data, constrains relational foundation models.
Abstract
Training relational foundation models requires learning representations that transfer across tasks, yet available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this using K-Space, a modular architecture combining frozen pretrained tabular encoders with a lightweight message-passing core. To suppress leakage, we introduce a gradient projection method that removes label-predictive directions from representation updates. On RelBench, this improves within-dataset transfer by +0.145 AUROC on average, often recovering near single-task performance. Our results suggest that limited task diversity, not just limited data, constrains relational foundation models.
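The abstract describes the projection only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the label-predictive directions are already available as rows of a matrix (for example, the weights of a linear probe trained to predict labels from the representations; that choice is an assumption here), and it removes the component of each representation gradient that lies in their span. The function name `project_out` and all variable names are hypothetical.

```python
import numpy as np

def project_out(grad, directions, eps=1e-12):
    """Remove from `grad` its component inside span(directions).

    grad:       (n, d) array, one gradient per row (assumed to be taken
                with respect to the learned representations)
    directions: (k, d) array of hypothetical label-predictive directions
    """
    grad = np.asarray(grad, dtype=float)
    directions = np.atleast_2d(np.asarray(directions, dtype=float))
    # Orthonormal basis for the span of the directions, via SVD.
    # Rows with near-zero singular values are dropped for stability.
    _, s, vt = np.linalg.svd(directions, full_matrices=False)
    basis = vt[s > eps * max(s.max(), 1.0)]  # shape (r, d)
    # Subtract the orthogonal projection onto that span.
    return grad - (grad @ basis.T) @ basis

# Toy usage with random stand-ins for a probe direction and a gradient batch.
rng = np.random.default_rng(0)
w = rng.normal(size=(1, 8))   # stand-in for one label-predictive direction
g = rng.normal(size=(4, 8))   # stand-in for a batch of gradients
g_proj = project_out(g, w)
# After projection, the gradients have no component along w.
print(np.allclose(g_proj @ w.T, 0.0))
```

Under these assumptions, an update built from `g_proj` cannot move the representation along the removed directions, which is one way to read "removes label-predictive directions from representation updates". How the directions are estimated, and at which layer the projection is applied, is not stated in the abstract.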
