Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark

Suyeon Kim; SeongKu Kang; Dongwoo Kim; Jungseul Ok; Hwanjo Yu

arXiv:2506.12468 · cs.LG · June 18, 2025

TL;DR

This paper introduces BeGIN, a benchmark for evaluating GNNs under realistic, instance-dependent label noise. It shows where current methods struggle and is intended to guide future work on robustness.

Contribution

The paper presents a new benchmark that simulates realistic label noise with both algorithmic and LLM-based methods, and evaluates noise-robustness strategies for GNNs. This addresses a gap in existing studies, which commonly rely on class-dependent noise.

Findings

01. LLM-based corruption poses significant challenges for GNNs.

02. Node-specific parameterization improves robustness against label noise.

03. The benchmark enables systematic evaluation of noise-handling strategies.

Abstract

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in node classification tasks but struggle with label noise in real-world data. Existing studies on graph learning with label noise commonly rely on class-dependent label noise, overlooking the complexities of instance-dependent noise and falling short of capturing real-world corruption patterns. We introduce BeGIN (Benchmarking for Graphs with Instance-dependent Noise), a new benchmark that provides realistic graph datasets with various noise types and comprehensively evaluates noise-handling strategies across GNN architectures, noisy label detection, and noise-robust learning. To simulate instance-dependent corruptions, BeGIN introduces algorithmic methods and LLM-based simulations. Our experiments reveal the challenges of instance-dependent noise, particularly LLM-based corruption, and underscore the importance of…
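
To make the distinction in the abstract concrete, the sketch below contrasts class-dependent noise, where the flip probability depends only on the clean class, with one simple form of instance-dependent noise, where a node's own features decide whether it is flipped. This is an illustration only and is not BeGIN's corruption procedure: the function names, the `scores` input (a hypothetical per-node affinity to each class), and the margin-based selection rule are all assumptions made for the example.

```python
import random


def class_dependent_flip(labels, transition, rng):
    """Sample a noisy label from transition[y], i.e. P(noisy = k | clean = y).

    Every node of the same class is treated identically; features play no role.
    """
    classes = list(range(len(transition)))
    return [rng.choices(classes, weights=transition[y])[0] for y in labels]


def instance_dependent_flip(labels, scores, noise_rate, rng=None):
    """Flip the nodes that look most confusable given their own scores.

    scores[i][k] is a hypothetical affinity of node i to class k (assumption:
    e.g. similarity to a class centroid). Each node's candidate wrong label is
    its highest-scoring class other than the clean one, and the nodes with the
    smallest margin between clean and wrong class are flipped first.
    This rule is deterministic, so rng is accepted only for a uniform signature.
    """
    margins = []
    for i, y in enumerate(labels):
        wrong = max(
            (k for k in range(len(scores[i])) if k != y),
            key=lambda k: scores[i][k],
        )
        margins.append((scores[i][y] - scores[i][wrong], i, wrong))
    margins.sort()  # smallest margin (most ambiguous node) first

    n_flip = round(noise_rate * len(labels))
    noisy = list(labels)
    for _, i, wrong in margins[:n_flip]:
        noisy[i] = wrong
    return noisy


rng = random.Random(0)
labels = [0, 0, 1, 1]
# Toy affinities: nodes 1 and 3 sit close to the class boundary.
scores = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.48, 0.52]]
transition = [[0.8, 0.2], [0.2, 0.8]]

class_noisy = class_dependent_flip(labels, transition, rng)
instance_noisy = instance_dependent_flip(labels, scores, noise_rate=0.5)
```

With class-dependent noise, any node of a class is equally likely to be corrupted. With the instance-dependent rule, the corrupted nodes are exactly the ambiguous ones, so the resulting noise pattern is correlated with the features a model learns from.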

Code & Models

Repositories

kimsu55/begin (PyTorch, official)
