Leveraging Large Language Models for Effective Label-free Node Classification in Text-Attributed Graphs

Taiyan Zhang; Renchi Yang; Yurui Lai; Mingyu Yan; Xiaochun Ye; Dongrui Fan

arXiv:2412.11983·cs.LG·May 19, 2025

Open Access · 1 Repo · 1 Dataset

TL;DR

This paper introduces Locle, a cost-effective framework that leverages large language models and graph neural networks for label-free node classification in text-attributed graphs, significantly reducing labeling costs while improving accuracy.

Contribution

Locle is a novel active self-training framework that combines LLMs and GNNs to perform label-free node classification efficiently, addressing noisy labels and query costs.

Findings

1. Locle outperforms state-of-the-art methods on five benchmark datasets.

2. Locle achieves an 8.08% accuracy improvement on DBLP with minimal query cost.

3. Locle reduces labeling costs to less than one cent.

Abstract

Graph neural networks (GNNs) have become the preferred models for node classification in graph data due to their robust capabilities in integrating graph structures and attributes. However, these models heavily depend on a substantial amount of high-quality labeled data for training, which is often costly to obtain. With the rise of large language models (LLMs), a promising approach is to utilize their exceptional zero-shot capabilities and extensive knowledge for node labeling. Despite encouraging results, this approach either requires numerous queries to LLMs or suffers from reduced performance due to noisy labels generated by LLMs. To address these challenges, we introduce Locle, an active self-training framework that does Label-free node Classification with LLMs cost-Effectively. Locle iteratively identifies small sets of "critical" samples using GNNs and extracts informative…
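
The abstract describes an iterative loop in which a model flags a small set of "critical" nodes and an LLM labels them. The sketch below illustrates the general shape of such an active self-training loop; it is not Locle's actual algorithm. Several pieces are assumptions made for illustration: a nearest-centroid classifier stands in for the GNN (so graph edges are ignored), `mock_llm_label` is a hypothetical noisy oracle rather than a real LLM call, "critical" is approximated by a small prediction margin, and the data is a synthetic two-class toy set.

```python
import math
import random


def mock_llm_label(node, true_labels, noise, rng):
    # Hypothetical stand-in for an LLM query: returns the true binary label,
    # flipped with probability `noise` to mimic noisy LLM annotations.
    y = true_labels[node]
    return 1 - y if rng.random() < noise else y


def fit_centroids(features, labeled):
    # Stand-in for GNN training: mean feature vector of each labeled class.
    sums, counts = {}, {}
    for node, y in labeled.items():
        acc = sums.setdefault(y, [0.0] * len(features[node]))
        for d, v in enumerate(features[node]):
            acc[d] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(x, cents):
    # Returns (class, margin). The margin is the gap between the two nearest
    # centroids; a small margin means an uncertain prediction.
    dists = sorted((math.dist(x, c), y) for y, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def active_self_training(features, true_labels, rounds=3, budget=4,
                         noise=0.1, pseudo_margin=2.0, seed=0):
    rng = random.Random(seed)
    nodes = range(len(features))
    labeled, queries = {}, 0
    # Seed round: query the mock LLM on a few random nodes.
    for node in rng.sample(nodes, budget):
        labeled[node] = mock_llm_label(node, true_labels, noise, rng)
        queries += 1
    for _ in range(rounds):
        cents = fit_centroids(features, labeled)
        scored = []
        for node in nodes:
            if node not in labeled:
                pred, margin = predict(features[node], cents)
                scored.append((margin, pred, node))
        scored.sort()
        # Active step: send the least confident nodes to the mock LLM.
        for _, _, node in scored[:budget]:
            labeled[node] = mock_llm_label(node, true_labels, noise, rng)
            queries += 1
        # Self-training step: accept confident predictions as pseudo-labels.
        for margin, pred, node in scored[budget:]:
            if margin >= pseudo_margin:
                labeled[node] = pred
    cents = fit_centroids(features, labeled)
    preds = [predict(x, cents)[0] for x in features]
    return preds, queries


# Synthetic toy data: two 2-D Gaussian clusters, 30 nodes per class.
data_rng = random.Random(1)
features = [[data_rng.gauss(c * 4, 1), data_rng.gauss(c * 4, 1)]
            for c in (0, 1) for _ in range(30)]
true_labels = [0] * 30 + [1] * 30

preds, queries = active_self_training(features, true_labels)
accuracy = sum(p == t for p, t in zip(preds, true_labels)) / len(true_labels)
print(f"LLM queries: {queries}, accuracy: {accuracy:.2f}")
```

The query count is bounded by `budget * (rounds + 1)`, which is the kind of knob that controls labeling cost in a setup like this. Swapping in a real GNN and a real LLM call would replace `fit_centroids`, `predict`, and `mock_llm_label` respectively.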

Code & Models

Repositories

hkbu-lagas/locle
PyTorch · Official

Datasets

lannester/Locle
dataset · 10 downloads

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Text and Document Classification Technologies