HyperJoin: LLM-augmented Hypergraph Link Prediction for Joinable Table Discovery
Shiyuan Liu, Jianwei Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, Ying Zhang

TL;DR
HyperJoin introduces a hypergraph-based framework augmented with large language models to improve joinable table discovery by capturing structural interactions and enhancing result coherence, outperforming existing methods.
Contribution
The paper proposes a hypergraph-based, LLM-augmented approach to joinable table discovery that addresses two weaknesses of previous methods: limited structural modeling and incoherent result sets.
Findings
Achieves a 21.4% improvement in Precision@15
Achieves a 17.2% improvement in Recall@15
Demonstrates superior performance over baseline methods
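For readers unfamiliar with the metrics above, the following is a minimal sketch of how Precision@k and Recall@k are commonly computed for a ranked list of candidate columns. The function names and the toy column identifiers are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k returned columns that are truly joinable.

    Assumes `ranked` contains no duplicates and divides by k, which is
    the common convention (hypothetical helper, not the paper's code).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for col in ranked[:k] if col in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all truly joinable columns that appear in the top k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for col in ranked[:k] if col in relevant)
    return hits / len(relevant)


# Toy example with made-up column identifiers.
ranked_columns = ["t1.id", "t2.name", "t3.user_id", "t4.city"]
joinable_columns = {"t1.id", "t3.user_id", "t5.uid"}

p_at_2 = precision_at_k(ranked_columns, joinable_columns, k=2)
r_at_4 = recall_at_k(ranked_columns, joinable_columns, k=4)
```

In this toy case one of the top two results is joinable, and two of the three joinable columns appear in the top four. The reported figures use k = 15.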
Abstract
As a pivotal task in data lake management, joinable table discovery has attracted widespread interest. While existing language model-based methods achieve remarkable performance by combining offline column representation learning with online ranking, their design insufficiently accounts for the underlying structural interactions: (1) offline, they directly model tables into isolated or pairwise columns, thereby struggling to capture the rich inter-table and intra-table structural information; and (2) online, they rank candidate columns based solely on query-candidate similarity, ignoring the mutual interactions among the candidates, leading to incoherent result sets. To address these limitations, we propose HyperJoin, a large language model (LLM)-augmented Hypergraph framework for Joinable table discovery. Specifically, we first construct a hypergraph to model tables using both the…
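To make the online limitation described in the abstract concrete, the sketch below shows similarity-only ranking: each candidate column is scored against the query independently and the top k are returned. This illustrates the baseline behaviour the abstract criticizes, not HyperJoin's own method. The embeddings, column names, and function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k candidates most similar to the query.

    Each score depends only on the (query, candidate) pair; relationships
    among the candidates themselves never enter the computation.
    """
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D embeddings, assumed to come from some offline column encoder.
query_embedding = [1.0, 0.0]
candidate_embeddings = {
    "t1.id": [1.0, 0.0],
    "t2.key": [0.0, 1.0],
    "t3.uid": [1.0, 1.0],
}

top_two = rank_by_similarity(query_embedding, candidate_embeddings, k=2)
top_two_names = [name for name, _ in top_two]
```

Because every score is computed in isolation, nothing in this procedure checks whether the returned columns are mutually compatible, which is the source of the incoherent result sets the abstract mentions.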
Taxonomy
Topics: Data Quality and Management · Advanced Graph Neural Networks · Advanced Text Analysis Techniques
