CodexDB: Generating Code for Processing SQL Queries using GPT-3 Codex
Immanuel Trummer

TL;DR
CodexDB is a customizable SQL processing engine built on GPT-3 Codex. It decomposes complex SQL queries into simple processing steps described in natural language, and Codex translates those steps into executable code. An early prototype generates correct code for a majority of WikiSQL queries.
Contribution
Introduces CodexDB, a framework on top of GPT-3 Codex that generates query processing code from natural language descriptions of processing steps, so that users can customize the engine's internals through natural language instructions.
Findings
An early prototype generates correct code for a majority of WikiSQL queries
Supports customization via natural language instructions
Decomposes complex queries into simpler steps
Abstract
CodexDB is an SQL processing engine whose internals can be customized via natural language instructions. CodexDB is based on OpenAI's GPT-3 Codex model which translates text into code. It is a framework on top of GPT-3 Codex that decomposes complex SQL queries into a series of simple processing steps, described in natural language. Processing steps are enriched with user-provided instructions and descriptions of database properties. Codex translates the resulting text into query processing code. An early prototype of CodexDB is able to generate correct code for a majority of queries of the WikiSQL benchmark and can be customized in various ways.
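The prompt assembly described in the abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation: the names `PromptSpec` and `build_prompt`, the comment-style prompt layout, and the example table are all assumptions, the processing steps are written by hand rather than derived from a query, and no call to Codex is made.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Inputs for one code-generation prompt (hypothetical structure)."""
    table: str
    columns: list[str]
    steps: list[str]
    instructions: list[str] = field(default_factory=list)


def build_prompt(spec: PromptSpec) -> str:
    """Render database properties, user instructions, and numbered steps as comment text."""
    # Description of database properties (assumed format).
    lines = [f"# Table {spec.table} with columns: {', '.join(spec.columns)}"]
    # User-provided instructions that customize the generated code.
    lines += [f"# Instruction: {text}" for text in spec.instructions]
    # Simple processing steps, described in natural language.
    lines += [f"# Step {i}: {step}" for i, step in enumerate(spec.steps, start=1)]
    # Cue for a code-generation model to continue with code.
    lines.append("# Python code implementing the steps above:")
    return "\n".join(lines)


# Hand-written steps for a query such as:
#   SELECT MAX(points) FROM players WHERE team = 'Lakers'
spec = PromptSpec(
    table="players",
    columns=["name", "team", "points"],
    steps=[
        "Load the table from players.csv.",
        "Keep rows where team equals 'Lakers'.",
        "Return the maximum of the points column.",
    ],
    instructions=["Use the pandas library."],
)
prompt = build_prompt(spec)
print(prompt)
```

In this sketch, changing the `instructions` list (for example, asking for a different library or for logging after each step) changes the text a model would see without touching the steps themselves, which mirrors the kind of customization the abstract describes.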
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Scientific Computing and Data Management · Advanced Database Systems and Queries
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Adam · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Dense Connections · Attention Dropout
