LLM-FK: Multi-Agent LLM Reasoning for Foreign Key Detection in Large-Scale Complex Databases
Zijian Tang, Ying Zhang, Sibo Cai, Ruoxuan Wang

TL;DR
This paper introduces LLM-FK, a multi-agent framework that leverages large language models to accurately and efficiently detect foreign keys in large-scale, complex databases, overcoming limitations of traditional heuristic methods.
Contribution
The paper presents the first fully automated multi-agent system for foreign key detection using LLMs, addressing scalability, ambiguity, and consistency challenges in complex databases.
Findings
Achieves over 93% F1-score on five benchmark datasets.
Reduces search space by 100 to 1000 times without losing true FKs.
Outperforms existing methods by 15% on the MusicBrainz database.
Abstract
Detecting missing foreign keys (FKs) requires accurately modeling semantic dependencies across database schemas, which conventional heuristic-based methods are fundamentally limited in capturing. We propose LLM-FK, the first fully automated multi-agent framework for FK detection, designed to address three core challenges that hinder naive LLM-based solutions in large-scale complex databases: combinatorial search space explosion, ambiguous inference under limited context, and global inconsistency arising from isolated local predictions. LLM-FK coordinates four specialized agents: a Profiler that decomposes the FK detection problem into the task of validating FK candidate column pairs and prunes the search space via a unique-key-driven schema decomposition strategy; an Interpreter that injects self-augmented domain knowledge; a Refiner that constructs compact structural representations…
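The abstract describes the Profiler as reducing FK detection to validating candidate column pairs and pruning them with a unique-key-driven strategy. The sketch below is not the paper's algorithm; it only illustrates the general idea that a foreign key must reference a unique column, so restricting the referenced side to unique columns shrinks the candidate set. The toy schema `TABLES` and the helper names (`is_unique`, `naive_pairs`, `pruned_pairs`) are hypothetical, and on real schemas the reduction would be far larger than on this toy data.

```python
from itertools import product

# Hypothetical toy schema (an assumption for illustration): table -> column -> values.
TABLES = {
    "artist": {"id": [1, 2, 3], "name": ["A", "B", "B"]},
    "album": {"id": [10, 11, 12], "artist_id": [1, 1, 3], "title": ["X", "Y", "Z"]},
}


def columns(tables):
    """List every (table, column) pair in the schema."""
    return [(t, c) for t, cols in tables.items() for c in cols]


def is_unique(values):
    """A column can be referenced by an FK only if its values have no duplicates."""
    return len(values) > 0 and len(set(values)) == len(values)


def naive_pairs(tables):
    """Every ordered (referencing, referenced) pair of distinct columns."""
    cols = columns(tables)
    return [(a, b) for a, b in product(cols, cols) if a != b]


def pruned_pairs(tables):
    """Keep only pairs whose referenced column is unique."""
    cols = columns(tables)
    unique_cols = [(t, c) for t, c in cols if is_unique(tables[t][c])]
    return [(a, b) for a in cols for b in unique_cols if a != b]


naive = naive_pairs(TABLES)
pruned = pruned_pairs(TABLES)
print(len(naive), len(pruned))
```

The true relationship `album.artist_id -> artist.id` survives the pruning, while pairs that point at non-unique columns such as `artist.name` are discarded before any more expensive validation step would run.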
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Web Application Security Vulnerabilities
