LATEX-GCL: Large Language Models (LLMs)-Based Data Augmentation for Text-Attributed Graph Contrastive Learning

Haoran Yang; Xiangyu Zhao; Sirui Huang; Qing Li; Guandong Xu

arXiv:2409.01145·cs.SI·September 4, 2024

Open Access

TL;DR

LATEX-GCL is a graph contrastive learning framework that uses Large Language Models to generate textual augmentations for Text-Attributed Graphs, aiming to avoid the information and semantic loss of embedding-level augmentation and to improve contrastive learning performance.

Contribution

The paper proposes LATEX-GCL, a new GCL framework that uses LLMs for effective textual augmentation in TAGs, addressing key challenges of information and semantic loss.

Findings

01 Outperforms existing methods on four TAG datasets

02 Demonstrates the effectiveness of LLM-based augmentations

03 Provides reproducible code and datasets

Abstract

Graph Contrastive Learning (GCL) is a potent paradigm for self-supervised graph learning that has attracted attention across various application scenarios. However, GCL for learning on Text-Attributed Graphs (TAGs) has yet to be explored, because conventional augmentation techniques like feature embedding masking cannot directly process textual attributes on TAGs. A naive strategy for applying GCL to TAGs is to encode the textual attributes into feature embeddings via a language model and then feed the embeddings into the following GCL module for processing. Such a strategy faces three key challenges: I) failure to avoid information loss, II) semantic loss during the text encoding phase, and III) implicit augmentation constraints that lead to uncontrollable and incomprehensible results. In this paper, we propose a novel GCL framework named LATEX-GCL to utilize Large Language Models…
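For illustration only, the naive strategy described in the abstract (encode text into embeddings, augment the embeddings by feature masking, then contrast two views) can be sketched as follows. This is not the authors' code. Every name here is hypothetical: `encode_text` is a hashed bag-of-words stand-in for a real language model, `aggregate` is a single mean-aggregation step standing in for a graph encoder, and `info_nce` is a common contrastive objective chosen as an assumption, since the abstract does not name the loss.

```python
import hashlib
import math
import random


def encode_text(text, dim=16):
    """Stand-in for a language-model encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def mask_features(embedding, drop_prob, rng):
    """Feature embedding masking: zero each dimension with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in embedding]


def aggregate(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {i: [i] for i in range(len(features))}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(len(features))
    ]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss: node i in view_a is the positive for node i in view_b."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy text-attributed graph: three nodes joined in a path.
    texts = ["graph neural networks", "contrastive learning on graphs",
             "large language models"]
    edges = [(0, 1), (1, 2)]
    rng = random.Random(0)
    embeddings = [encode_text(t) for t in texts]
    # Two views built by masking the embeddings, not the text itself.
    view_a = aggregate([mask_features(e, 0.2, rng) for e in embeddings], edges)
    view_b = aggregate([mask_features(e, 0.2, rng) for e in embeddings], edges)
    print(info_nce(view_a, view_b))
```

Note that the masking acts on embedding dimensions that have no direct textual meaning, which is one way to read the "uncontrollable and incomprehensible" augmentation problem the abstract raises.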

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks

Methods: Softmax · Attention Is All You Need · Contrastive Learning