Robustness in Text-Attributed Graph Learning: Insights, Trade-offs, and New Defenses

Runlin Lei; Lu Yi; Mingguo He; Pengyu Qiu; Zhewei Wei; Yongchao Liu; Chuntao Hong

arXiv:2510.17185·cs.LG·October 21, 2025

Open Access · 3 Reviews

TL;DR

This paper systematically evaluates the robustness of various text-attributed graph learning models against diverse perturbations, revealing inherent trade-offs and vulnerabilities, and proposes SFT-auto, a new framework for balanced robustness.

Contribution

It introduces a comprehensive evaluation framework for robustness in TAG learning and proposes SFT-auto, a novel method for balanced defense against textual and structural attacks.

Findings

01. Models exhibit robustness trade-offs between text and structure.

02. The performance of GNNs and RGNNs depends on the text encoder and the attack type.

03. GraphLLMs are highly vulnerable to training data corruption.

Abstract

While Graph Neural Networks (GNNs) and Large Language Models (LLMs) are powerful approaches for learning on Text-Attributed Graphs (TAGs), a comprehensive understanding of their robustness remains elusive. Current evaluations are fragmented, failing to systematically investigate the distinct effects of textual and structural perturbations across diverse models and attack scenarios. To address these limitations, we introduce a unified and comprehensive framework to evaluate robustness in TAG learning. Our framework evaluates classical GNNs, robust GNNs (RGNNs), and GraphLLMs across ten datasets from four domains, under diverse text-based, structure-based, and hybrid perturbations in both poisoning and evasion scenarios. Our extensive analysis reveals multiple findings, among which three are particularly noteworthy: 1) models have inherent robustness trade-offs between text and structure,…

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. The paper includes extensive baselines and experiments.
2. It offers novel insights into robustness in text-attributed graph learning.
3. The topic is highly relevant and timely.

Weaknesses

1. The experiments use a fixed perturbation rate; it would be more informative to select a few representative models and show robustness across different perturbation rates.
2. Although effective, the strategies in the SFT-auto pipeline are not very novel.
3. The study mainly reports average ranks across datasets rather than raw accuracy or significance tests. This approach may obscure real performance gaps and overstate the robustness of SFT-auto.
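Reviewer 01's third point can be illustrated with a minimal sketch. The method names and accuracy values below are hypothetical, invented purely for illustration, and are not taken from the paper. The sketch ranks two methods per dataset (ties are not handled) and compares average rank with mean accuracy.

```python
from statistics import mean

# Hypothetical accuracies (%) on three datasets; NOT results from the paper.
acc = {
    "method_a": [80.1, 75.2, 60.0],
    "method_b": [80.0, 75.0, 72.0],
}

def average_ranks(acc):
    """Return the mean per-dataset rank of each method (1 = best, ties ignored)."""
    names = list(acc)
    n_datasets = len(next(iter(acc.values())))
    ranks = {name: [] for name in names}
    for d in range(n_datasets):
        # Sort methods by accuracy on dataset d, highest first.
        ordered = sorted(names, key=lambda m: acc[m][d], reverse=True)
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return {name: mean(r) for name, r in ranks.items()}

avg_rank = average_ranks(acc)
mean_acc = {name: mean(values) for name, values in acc.items()}

for name in acc:
    print(f"{name}: average rank {avg_rank[name]:.2f}, mean accuracy {mean_acc[name]:.2f}")
```

In this toy setting, method_a wins two datasets by margins of 0.1 and 0.2 points and loses the third by 12 points, so it has the better average rank but the lower mean accuracy. Reporting raw accuracy alongside ranks would expose that gap.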

Reviewer 02 · Rating 4 · Confidence 4

Strengths

* Investigating the distinct effect of textual perturbations (compared to small-epsilon perturbations) on node features is interesting and more realistic than in previous studies.
* The text-structure trade-off is an interesting insight, and the proposed method is a good remedy for it.
* There are many empirical results and insights. However, given that it is a purely empirical paper, this is also in a way a necessity.
* Code is provided.
* Comparing robust GNNs with Graph LLMs in a (unified) adversarial

Weaknesses

1. Comparison Table 1 feels outdated and slightly misleading. For GNNs and RGNNs, the table mentions only the rather old Graph Robustness Benchmark (GRB) and omits recent work on GNNs and RGNNs. For example, Gosch et al. 2023, which the paper cites, includes 8 datasets, GMA, and evasion, inductive, and adaptive attacks, yet it is not mentioned; instead, the 5 datasets of the GRB are highlighted. I do think that the breadth of experiments in the submitted work is quite good, but I don't think for metrics su

Reviewer 03 · Rating 2 · Confidence 4

Strengths

* The authors provide a comprehensive, large-scale empirical evaluation across a wide range of models, datasets, and attack settings.
* The writing and the figures are well-structured and easy to follow.
* Various experiments are provided in the appendix.

Weaknesses

* Some of the statements are incorrect or incorrectly attributed:
  * In Section 3.1, the paper says that spectral methods demonstrate superior performance against poisoning attacks, citing [1]. However, [1] does not include any experiments or analysis in the poisoning setting.
  * In Section 3.2, GNNGuard and RUNG are marked as spectral methods, although both are spatial methods.
* More recent work such as [2,3] could be included for the "improving structure" category for a more reliable empirical eva

Taxonomy

Topics: Advanced Graph Neural Networks · Topic Modeling · Adversarial Robustness in Machine Learning