Toward Multi-Database Query Reasoning for Text2Cypher
Makbule Gulcin Ozsoy

TL;DR
This paper introduces a structured approach that lets large language models reason across multiple graph databases when translating natural language questions into queries, targeting real-world settings where data is distributed across sources.
Contribution
It formalizes the multi-database reasoning problem for Text2Cypher, proposing a three-phase roadmap for database routing, decomposition, and heterogeneous query reasoning.
Findings
Formalization of multi-database reasoning for Text2Cypher
Identification of challenges in source selection and query decomposition
Roadmap for future research in multi-database natural language interfaces
Abstract
Large language models have significantly improved natural language interfaces to databases by translating user questions into executable queries. In particular, Text2Cypher focuses on generating Cypher queries for graph databases, enabling users to access graph data without query language expertise. Most existing Text2Cypher systems assume a single preselected graph database, where queries are generated over a known schema. However, real-world systems are often distributed across multiple independent graph databases organized by domain or system boundaries, where relevant information may span multiple sources. To address this limitation, we propose a shift from single-database query generation to multi-database query reasoning. Instead of assuming a fixed execution context, the system must reason about (i) relevant databases, (ii) how to decompose a question across them, and (iii) how…
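To make step (i) concrete, here is a minimal sketch of database routing under stated assumptions. It is not the paper's method: the schemas, database names, and the `tokenize` and `route` helpers are all hypothetical, and plain keyword overlap between the question and each schema's node labels stands in for whatever LLM-based or learned selection a real system would use.

```python
import re

# Hypothetical schemas: database name -> node labels.
# These names are illustrative and do not come from the paper.
SCHEMAS = {
    "hr": {"Employee", "Department"},
    "sales": {"Customer", "Order", "Product"},
    "logistics": {"Shipment", "Warehouse", "Order"},
}


def tokenize(text):
    """Lowercase word tokens with trailing 's' stripped (a crude plural fix)."""
    return {tok.rstrip("s") for tok in re.findall(r"[a-z]+", text.lower())}


def route(question, schemas):
    """Return databases whose node labels overlap the question, best first.

    Assumption: keyword overlap is only a stand-in for a real relevance model.
    """
    q_tokens = tokenize(question)
    scores = {}
    for name, labels in schemas.items():
        label_tokens = set()
        for label in labels:
            label_tokens |= tokenize(label)
        overlap = q_tokens & label_tokens
        if overlap:
            scores[name] = len(overlap)
    # Higher overlap first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


question = "Which customers placed orders shipped from the Berlin warehouse?"
selected = route(question, SCHEMAS)
print(selected)
```

In this toy setup the question touches both the sales and logistics schemas, so more than one database is selected. That is the case in which the later steps, decomposing the question and combining per-database results, become necessary.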
