UQE: A Query Engine for Unstructured Databases
Hanjun Dai, Bethany Yixin Wang, Xingchen Wan, Bo Dai, Sherry Yang, Azade Nova, Pengcheng Yin, Phitchaya Mangpo Phothilimthana, Charles Sutton, Dale Schuurmans

TL;DR
UQE is a universal query engine that uses large language models to analyze unstructured data such as images and conversations. Queries are written in a SQL-like language, with the aim of flexible, efficient, and accurate analytics.
Contribution
The paper presents UQE, a novel system that combines LLMs, sampling, optimization, and compiler techniques to enable flexible querying of unstructured data using a universal query language.
Findings
Effective analysis of images, dialogs, and reviews.
Supports complex queries like semantic retrieval and aggregation.
Demonstrates efficiency and accuracy in unstructured data analytics.
Abstract
Analytics on structured data is a mature field with many successful methods. However, most real-world data exists in unstructured form, such as images and conversations. We investigate the potential of Large Language Models (LLMs) to enable unstructured data analytics. In particular, we propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections. This engine accepts queries in a Universal Query Language (UQL), a dialect of SQL that provides full natural language flexibility in specifying conditions and operators. The new engine leverages the ability of LLMs to conduct analysis of unstructured data, while also allowing us to exploit advances in sampling and optimization techniques to achieve efficient and accurate query execution. In addition, we borrow techniques from classical compiler theory to better orchestrate the…
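To make the role of sampling concrete, here is a minimal sketch of how an aggregate over unstructured rows could be estimated without judging every row. It uses plain uniform sampling, which is an assumption for illustration and not necessarily the sampling scheme UQE uses. The names `estimate_count` and `looks_positive` are hypothetical, and the keyword check merely stands in for an LLM judgment on a natural-language condition.

```python
import random
from typing import Callable, Sequence


def estimate_count(
    rows: Sequence[str],
    predicate: Callable[[str], bool],
    budget: int,
    seed: int = 0,
) -> float:
    """Estimate how many rows satisfy `predicate` using at most `budget` calls."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    if not rows:
        return 0.0
    rng = random.Random(seed)
    n = min(budget, len(rows))
    sample = rng.sample(list(rows), n)  # uniform sample without replacement
    hits = sum(1 for row in sample if predicate(row))
    return len(rows) * hits / n  # scale the sample fraction to the full table


def looks_positive(review: str) -> bool:
    # Hypothetical stand-in for an LLM judgment such as "the review is positive".
    return "great" in review.lower()


# Toy data: 300 positive and 700 negative reviews.
reviews = ["Great plot"] * 300 + ["Too long and dull"] * 700
estimate = estimate_count(reviews, looks_positive, budget=100)
print(f"estimated positive reviews: {estimate:.0f} of {len(reviews)}")
```

With a budget of 100, the predicate runs on a tenth of the rows. The estimate is unbiased but noisy, and spending the full budget (one call per row) recovers the exact count.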
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Quality and Management
