Enhancing Text2Cypher with Schema Filtering
Makbule Gulcin Ozsoy

TL;DR
This paper investigates schema filtering techniques for Text2Cypher query generation. Filtering reduces prompt noise and token costs and benefits smaller models the most; the paper also analyzes its impact on performance and token efficiency.
Contribution
It introduces and evaluates various schema filtering methods for Text2Cypher, demonstrating their effectiveness in optimizing query generation and reducing computational costs.
Findings
Schema filtering reduces token length and costs.
Smaller models benefit more from schema filtering.
Larger models are less affected by schema filtering.
Abstract
Knowledge graphs represent complex data using nodes, relationships, and properties. Cypher, a powerful query language for graph databases, enables efficient modeling and querying. Recent advancements in large language models allow the translation of natural language questions into Cypher queries, a task known as Text2Cypher. A common approach is to incorporate the database schema into the prompt. However, complex schemas can introduce noise, increase hallucinations, and raise computational costs. Schema filtering addresses these challenges by including only relevant schema elements, improving query generation while reducing token costs. This work explores various schema filtering methods for the Text2Cypher task and analyzes their impact on token length, performance, and cost. Results show that schema filtering effectively optimizes Text2Cypher, especially for smaller models. Consistent with prior research, we find…
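To make the idea of "including only relevant schema elements" concrete, here is a minimal sketch of one possible filter. It is not the paper's method: the toy schema, the function names (`filter_schema`, `render`), the keyword-matching heuristic, and the whitespace-based token count are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy schema: node labels with properties, plus relationship triples.
SCHEMA = {
    "nodes": {
        "Person": ["name", "born"],
        "Movie": ["title", "released"],
        "Company": ["name", "founded"],
    },
    "relationships": [
        ("Person", "ACTED_IN", "Movie"),
        ("Person", "WORKS_AT", "Company"),
    ],
}


def filter_schema(schema, question):
    """Keep node labels named in the question (singular or naive plural),
    and relationships whose two endpoints were both kept."""
    words = set(re.findall(r"[a-z]+", question.lower()))

    def mentioned(label):
        lowered = label.lower()
        return lowered in words or lowered + "s" in words

    kept_nodes = {
        label: props for label, props in schema["nodes"].items() if mentioned(label)
    }
    kept_rels = [
        (src, rel, dst)
        for src, rel, dst in schema["relationships"]
        if src in kept_nodes and dst in kept_nodes
    ]
    return {"nodes": kept_nodes, "relationships": kept_rels}


def render(schema):
    """Serialize a schema into prompt text, one element per line."""
    lines = [
        f"(:{label} {{{', '.join(props)}}})"
        for label, props in schema["nodes"].items()
    ]
    lines += [f"(:{src})-[:{rel}]->(:{dst})" for src, rel, dst in schema["relationships"]]
    return "\n".join(lines)


if __name__ == "__main__":
    question = "Which movies did each person act in?"
    filtered = filter_schema(SCHEMA, question)
    print(render(filtered))
    # Crude proxy for prompt length: whitespace-separated tokens, not model tokens.
    print(len(render(SCHEMA).split()), "->", len(render(filtered).split()))
```

A real system would likely use a stronger relevance signal than exact keyword matches (for example, embeddings or an entity extractor) and a model-specific tokenizer to measure cost; the sketch only shows where filtering sits between the full schema and the prompt.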
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Data Quality and Management
