AutoG: Towards automatic graph construction from tabular data
Zhikai Chen, Han Xie, Jian Zhang, Xiang Song, Jiliang Tang, Huzefa Rangwala, George Karypis

TL;DR
This paper introduces AutoG, an LLM-based method for automatically constructing high-quality graphs from tabular data, addressing a key gap in graph machine learning by formalizing the problem and providing datasets for evaluation.
Contribution
The paper formalizes the graph construction problem from tabular data, introduces datasets for evaluation, and proposes AutoG, an LLM-based approach that automates high-quality graph generation.
Findings
AutoG generates graphs that match human expert quality.
Graph quality significantly impacts downstream task performance.
AutoG outperforms existing methods in various scenarios.
Abstract
Recent years have witnessed significant advancements in graph machine learning (GML), with its applications spanning numerous domains. However, the focus of GML has predominantly been on developing powerful models, often overlooking a crucial initial step: constructing suitable graphs from common data formats, such as tabular data. This construction process is fundamental to applying graph-based models, yet it remains largely understudied and lacks formalization. Our research aims to address this gap by formalizing the graph construction problem and proposing an effective solution. We identify two critical challenges to achieve this goal: 1. The absence of dedicated datasets to formalize and evaluate the effectiveness of graph construction methods, and 2. Existing automatic construction methods can only be applied to some specific cases, while tedious human engineering is required to…
Peer Reviews
Decision: ICLR 2025 Poster
- S1: The proposal introduces a carefully devised LLM prompt (Appendix D).
- S2: The evaluation experiments show that the proposal significantly outperforms existing techniques and achieves results close to those of manual graph generation.
- S3: Section 3.1 identifies C4 (graph variations), which is indeed an important challenge.
- S4: Generating co-author relationships as edges is definitely effective for node classification of homophily graphs.
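As a rough illustration of the edge type mentioned in S4, the sketch below derives co-author edges from a paper-author link table. The function name `coauthor_edges`, the row layout, and the sample rows are assumptions made for this example; they are not taken from the paper or its code.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_edges(paper_author_rows):
    """Return sorted, undirected author-author edges for authors who share a paper.

    `paper_author_rows` is an iterable of (paper_id, author_id) pairs,
    i.e. a hypothetical link table between papers and authors.
    """
    authors_by_paper = defaultdict(set)
    for paper_id, author_id in paper_author_rows:
        authors_by_paper[paper_id].add(author_id)

    edges = set()
    for authors in authors_by_paper.values():
        # Sorting each author group gives every undirected edge one canonical form.
        for a, b in combinations(sorted(authors), 2):
            edges.add((a, b))
    return sorted(edges)


# Made-up sample rows, for illustration only.
rows = [
    ("p1", "alice"), ("p1", "bob"),
    ("p2", "bob"), ("p2", "carol"), ("p2", "dave"),
]
edges = coauthor_edges(rows)
```

Edges of this kind connect authors directly instead of routing through paper nodes, which is one plausible reason they would help on graphs where connected nodes tend to share labels.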
- W1: The problem definition is not explicitly stated.
- W2: Although the proposal utilizes carefully designed LLM prompts, it relies on standard techniques in LLMs like few-shot learning and chain of thought (CoT), making the novelty unclear.
- W3: While the quantitative oracle evaluating GML with a validation set seems effective, Table 4 suggests the oracle may not be essential. The GML results are highly dependent on the choice of label selection (e.g., venue vs. year), making the conclusi
* The general observation is an important one: the conversion of tabular data into graph form cannot be taken for granted. Existing graph benchmarks based on tables avoid the hard cases.
* The setup with an LLM to generate candidates is pretty nice. Restricting its generation freedom using function calls is a good idea as well.
* It remains unclear how well this method performs on the long tail. The evaluation averages over many cases, so the results might be dominated by the very common ones.
* The paper only considers well-behaved rectangular tables (relational database style), with column names and all. Even the data types are given. There are also many web tables around, with large datasets available. One could certainly question whether the chosen setting is realistic now.
* The datasets are pretty small.
1. In the field of GML methods for tabular data, this paper is the first to address graph construction evaluation, and it introduces a benchmark for it.
2. The authors identify five key challenges in converting tabular data into graphs and propose AutoG, an agent-based approach. They design actions for the large language model (LLM) to utilize its prior knowledge for augmenting graph construction. Additionally, they implement a feedback mechanism to calibrate the LLM's output based on the validati
My main concern is with the "metrics used in the proposed benchmark for evaluating the conversion of tabular data into graphs (T2G)." As stated in line 216, the authors use the performance of fixed GML models (RGCN, RGAT) trained on the generated graphs for quantitative evaluation. However, as a benchmark for T2G, it should be independent of specific downstream GML methods. Thus, the evaluation is limited to these two models and lacks generalizability to all GML models. Until it is demonstrated
Taxonomy
Topics: Semantic Web and Ontologies · Data Mining Algorithms and Applications · Natural Language Processing Techniques
Methods: Sparse Evolutionary Training · Focus
