Text2VectorSQL: Towards a Unified Interface for Vector Search and SQL Queries
Zhengren Wang, Dongwen Yao, Bozhou Li, Dongsheng Ma, Bo Li, Zhiyu Li, Feiyu Xiong, Bin Cui, Linpeng Tang, Wentao Zhang

TL;DR
This paper introduces Text2VectorSQL, a unified natural language interface for querying both structured and unstructured data, supported by a new dataset, benchmark, and evaluation metrics to advance research in integrated data retrieval.
Contribution
It formalizes the Text2VectorSQL task, provides a scalable data synthesis pipeline, creates the first comprehensive benchmark, and proposes novel evaluation metrics for unified data querying.
Findings
Baseline models perform well, but recall degrades when SQL filters are combined with vector search.
This integrated setting exposes how difficult it is to maintain high recall in unified queries.
The work establishes foundational tools and benchmarks for future research in unified data interfaces.
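The recall degradation noted above can be illustrated with a small sketch. It is not taken from the paper. It assumes one common cause of the effect, applying the SQL filter only after a top-k vector search ("post-filtering"), and the paper may attribute the degradation to other mechanisms. The `papers` table, its columns, the sample embeddings, and the `l2_distance` SQL function are all hypothetical and exist only for this illustration. SQLite has no built-in vector distance, so the function is registered from Python.

```python
import json
import math
import sqlite3


def l2_distance(a_json, b_json):
    """Euclidean distance between two JSON-encoded vectors (hypothetical helper)."""
    a, b = json.loads(a_json), json.loads(b_json)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_demo_db():
    """Create an in-memory table with a structured column and an embedding column."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("l2_distance", 2, l2_distance)
    conn.execute(
        "CREATE TABLE papers (id INTEGER PRIMARY KEY, year INTEGER, embedding TEXT)"
    )
    rows = [  # toy data, invented for this example
        (1, 2019, [0.0, 0.1]),
        (2, 2019, [0.1, 0.0]),
        (3, 2019, [0.1, 0.1]),
        (4, 2024, [0.05, 0.05]),
        (5, 2024, [0.6, 0.5]),
        (6, 2024, [0.9, 0.9]),
    ]
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?, ?)",
        [(i, y, json.dumps(e)) for i, y, e in rows],
    )
    return conn


def exact_filtered_topk(conn, query, k, year):
    """Ground truth: filter first, then rank every matching row by distance."""
    sql = (
        "SELECT id FROM papers WHERE year = ? "
        "ORDER BY l2_distance(embedding, ?) LIMIT ?"
    )
    return [r[0] for r in conn.execute(sql, (year, json.dumps(query), k))]


def post_filtered_topk(conn, query, k, year):
    """Take the global top-k by distance first, then apply the SQL filter."""
    sql = (
        "SELECT id FROM ("
        "  SELECT id, year, l2_distance(embedding, ?) AS dist"
        "  FROM papers ORDER BY dist LIMIT ?"
        ") WHERE year = ? ORDER BY dist"
    )
    return [r[0] for r in conn.execute(sql, (json.dumps(query), k, year))]


def recall(retrieved, expected):
    """Fraction of the expected ids that were actually retrieved."""
    return len(set(retrieved) & set(expected)) / len(expected)


conn = build_demo_db()
query = [0.0, 0.0]
expected = exact_filtered_topk(conn, query, k=2, year=2024)
retrieved = post_filtered_topk(conn, query, k=2, year=2024)
print(expected, retrieved, recall(retrieved, expected))
```

In this toy data the two nearest rows overall include only one row from 2024, so filtering after the top-2 search returns a single match while the exact answer has two, giving a recall of 0.5. Filtering before ranking avoids the loss here, though real systems with approximate indexes face further trade-offs that this sketch does not model.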
Abstract
The proliferation of unstructured data poses a fundamental challenge to traditional database interfaces. While Text-to-SQL has democratized access to structured data, it remains incapable of interpreting semantic or multi-modal queries. Concurrently, vector search has emerged as the de facto standard for querying unstructured data, but its integration with SQL, termed VectorSQL, still relies on manual query crafting and lacks standardized evaluation methodologies, creating a significant gap between its potential and practical application. To bridge this fundamental gap, we introduce and formalize Text2VectorSQL, a novel task to establish a unified natural language interface for seamlessly querying both structured and unstructured data. To catalyze research in this new domain, we present a comprehensive foundational ecosystem, including: (1) A scalable and robust pipeline for…
Taxonomy
Topics: Data Quality and Management · Information Retrieval and Search Behavior · Advanced Database Systems and Queries
