TL;DR
This paper evaluates the robustness of Graph Self-Supervised Learning (GSSL) methods on noisy, text-derived biomedical graphs and proposes a framework that improves performance and offers practical guidance for real-world applications.
Contribution
The paper introduces NATD-GSSL, a unified framework for GSSL on noisy biomedical graphs that combines automatic graph construction, graph refinement, and GSSL, and it provides the first systematic robustness analysis in this context.
Findings
Relation reconstruction is highly sensitive to noise but benefits from schemas.
Feature reconstruction remains robust and comparable to clean graphs.
Bidirectional GNNs outperform unidirectional ones on noisy graphs.
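To make the unidirectional versus bidirectional distinction concrete, here is a minimal Python sketch. It assumes that "bidirectional" means messages flow along each edge in both directions, which is one common reading and may differ from the paper's actual architecture. The function names, the toy edge list, and the scalar features are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def add_reverse_edges(edges):
    """Return the edge list plus a reversed copy of each edge (deduplicated)."""
    seen = set()
    out = []
    for src, dst in edges:
        for edge in ((src, dst), (dst, src)):
            if edge not in seen:
                seen.add(edge)
                out.append(edge)
    return out


def mean_aggregate(features, edges):
    """One round of mean aggregation over incoming edges (src -> dst).

    Nodes with no incoming edge keep their own feature.
    """
    incoming = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(features[src])
    return {
        node: (sum(incoming[node]) / len(incoming[node]) if incoming[node] else value)
        for node, value in features.items()
    }


# Hypothetical toy graph: two terms pointing at a shared type node.
edges = [("aspirin", "drug"), ("ibuprofen", "drug")]
features = {"aspirin": 1.0, "ibuprofen": 3.0, "drug": 0.0}

# Unidirectional: only "drug" receives messages.
uni = mean_aggregate(features, edges)

# Bidirectional: the term nodes also receive a message from "drug".
bi = mean_aggregate(features, add_reverse_edges(edges))
```

In the unidirectional pass the two term nodes never receive information from their neighborhood, while in the bidirectional pass every node does. This is only an intuition for why direction handling can matter on sparse or noisy extracted graphs, not an explanation taken from the paper.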
Abstract
Graph Self-Supervised Learning (GSSL) offers a powerful paradigm for learning graph representations without labeled data. However, existing work assumes clean, manually curated graphs. Recent advances in NLP enable the large-scale automatic extraction of knowledge graphs from text, opening new opportunities for GSSL while introducing substantial real-world noise. This type of noise remains largely unexplored, as prior robustness studies typically rely on synthetic perturbations. To address this gap, we present the first comprehensive evaluation of GSSL methods on text-driven graphs for unsupervised term typing. We introduce Noise-Aware Text-Driven Graph GSSL (NATD-GSSL), a unified framework that combines automatic graph construction, graph refinement, and GSSL. Our evaluation follows a dual-graph protocol that contrasts a noisy graph derived from MedMentions with a clean Unified Medical…
