CAESURA: Language Models as Multi-Modal Query Planners
Matthias Urban, Carsten Binnig

TL;DR
This paper introduces CAESURA, a novel approach using language models like GPT-4 to generate multi-modal query plans from natural language, enabling querying of diverse data types beyond traditional relational databases.
Contribution
It presents a first GPT-4-based prototype for multi-modal query planning, demonstrating that language models can feasibly translate natural language queries into complex, multi-modal query plans.
Findings
Feasibility shown on two datasets
Prototype demonstrates multi-modal query planning
Discussion of future improvements
Abstract
Traditional query planners translate SQL queries into query plans to be executed over relational data. However, it is impossible to query other data modalities, such as images, text, or video stored in modern data systems such as data lakes, using these query planners. In this paper, we propose Language-Model-Driven Query Planning, a new paradigm of query planning that uses Language Models to translate natural language queries into executable query plans. Different from relational query planners, the resulting query plans can contain complex operators that are able to process arbitrary modalities. As part of this paper, we present a first GPT-4-based prototype called CAESURA and show the general feasibility of this idea on two datasets. Finally, we discuss several ideas to improve the query planning capabilities of today's Language Models.
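To make the idea of a multi-modal query plan concrete, here is a minimal sketch of how such a plan could be represented and executed. It is not CAESURA's implementation: the `Step` class, the operator names, the `FAKE_CAPTIONS` lookup (a stand-in for a real vision model), and the example table are all assumptions made for illustration. The plan itself is hand-written here; in the paradigm described above, a language model would produce it from the natural language query.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a vision model: maps an image path to a caption.
FAKE_CAPTIONS = {
    "img/1.png": "a painting of a horse",
    "img/2.png": "a photo of a city",
    "img/3.png": "a sketch of a horse",
}


@dataclass
class Step:
    operator: str  # name of a registered operator (hypothetical names)
    args: dict     # operator-specific keyword arguments


def sql_filter(rows, column, value):
    """Relational operator: keep rows where row[column] == value."""
    return [r for r in rows if r[column] == value]


def image_select(rows, column, keyword):
    """Multi-modal operator (stubbed): keep rows whose image caption mentions keyword."""
    return [r for r in rows if keyword in FAKE_CAPTIONS[r[column]]]


def count(rows):
    """Aggregate operator: return a single row holding the row count."""
    return [{"count": len(rows)}]


OPERATORS = {
    "sql_filter": sql_filter,
    "image_select": image_select,
    "count": count,
}


def execute(plan, rows):
    """Run each step in order, feeding one operator's output into the next."""
    for step in plan:
        rows = OPERATORS[step.operator](rows, **step.args)
    return rows


# A plan a language model might emit for:
# "How many artworks from 1900 depict a horse?"
plan = [
    Step("sql_filter", {"column": "year", "value": 1900}),
    Step("image_select", {"column": "image", "keyword": "horse"}),
    Step("count", {}),
]

table = [
    {"title": "A", "year": 1900, "image": "img/1.png"},
    {"title": "B", "year": 1900, "image": "img/2.png"},
    {"title": "C", "year": 1850, "image": "img/3.png"},
]

result = execute(plan, table)
print(result)
```

The point of the sketch is that relational and non-relational operators share one plan and one executor, so a single query can mix a filter over a table column with a predicate over image content.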
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
