TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs

Zhuofeng Li; Zixing Gou; Xiangnan Zhang; Zhongyuan Liu; Sirui Li; Yuntong Hu; Chen Ling; Zheng Zhang; Liang Zhao

arXiv:2406.10310 · cs.CL · November 26, 2024 · 2 cites

TL;DR

TEG-DB introduces a large-scale, diverse dataset of Textual-Edge Graphs with rich descriptions on nodes and edges, enabling advanced research in graph analysis using natural language information.

Contribution

We present TEG-DB, the first comprehensive benchmark dataset with detailed textual annotations on both nodes and edges across multiple domains.

Findings

01. Current models show limited ability to utilize rich textual edge information.

02. Benchmark results highlight the need for specialized methods to exploit textual edge data.

03. TEG-DB facilitates future research in textual-edge graph analysis.

Abstract

Text-Attributed Graphs (TAGs) augment graph structures with natural language descriptions, facilitating detailed depictions of data and their interconnections across various real-world settings. However, existing TAG datasets predominantly feature textual information only at the nodes, with edges typically represented by mere binary or categorical attributes. This lack of rich textual edge annotations significantly limits the exploration of contextual relationships between entities, hindering deeper insights into graph-structured data. To address this gap, we introduce Textual-Edge Graphs Datasets and Benchmark (TEG-DB), a comprehensive and diverse collection of benchmark textual-edge datasets featuring rich textual descriptions on nodes and edges. The TEG-DB datasets are large-scale and encompass a wide range of domains, from citation networks to social networks. In addition, we…
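To make the distinction in the abstract concrete, here is a minimal Python sketch of a graph that carries free text on both nodes and edges, rather than text on nodes and only a binary or categorical label on edges. The class `TextualEdgeGraph`, its methods, and the sample citation data are hypothetical illustrations; they are not the TEG-DB data format or the API of the official repository.

```python
from dataclasses import dataclass, field


@dataclass
class TextualEdgeGraph:
    """Toy container with free text on nodes and edges (hypothetical, not the TEG-DB format)."""

    node_text: dict[str, str] = field(default_factory=dict)
    edge_text: dict[tuple[str, str], str] = field(default_factory=dict)

    def add_node(self, node_id: str, text: str) -> None:
        self.node_text[node_id] = text

    def add_edge(self, src: str, dst: str, text: str) -> None:
        # Require both endpoints so every edge text has node text to pair with.
        if src not in self.node_text or dst not in self.node_text:
            raise KeyError("both endpoints must be added before the edge")
        self.edge_text[(src, dst)] = text

    def context(self, node_id: str) -> list[str]:
        """Return the texts of all edges touching node_id, in insertion order."""
        return [t for (s, d), t in self.edge_text.items() if node_id in (s, d)]


# Invented sample data in the style of a citation network.
g = TextualEdgeGraph()
g.add_node("paper_a", "A study of message passing on heterogeneous graphs.")
g.add_node("paper_b", "An earlier survey of graph neural networks.")
g.add_edge("paper_a", "paper_b", "We follow the taxonomy proposed in this survey.")

print(g.context("paper_b"))
```

In a node-only TAG, the edge from `paper_a` to `paper_b` would reduce to a "cites" flag; keeping the citing sentence on the edge preserves the context of the relationship that the abstract describes as missing from existing datasets.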

Code & Models

Repositories

zhuofeng-li/teg-benchmark (PyTorch, official)

Videos

TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs · SlidesLive

Taxonomy

Topics: Semantic Web and Ontologies