LLM-Enhanced Data Management

Xuanhe Zhou; Xinyang Zhao; Guoliang Li

arXiv:2402.02643 · cs.DB · February 6, 2024 · 2 cites

TL;DR

This paper introduces LLMDB, a novel LLM-enhanced data management framework that improves accuracy, reduces costs, and avoids hallucination by embedding domain knowledge, using vector databases, and deploying multi-round inference, demonstrated in real-world scenarios.

Contribution

The paper presents LLMDB, a new paradigm integrating LLMs with domain knowledge, vector databases, and multi-round inference to enhance data management tasks.

Findings

01. Effective in query rewrite, database diagnosis, and data analytics.

02. Reduces LLM costs via vector database caching.

03. Achieves high accuracy and avoids hallucination.
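
The caching idea in finding 02 can be illustrated with a small sketch: before calling the LLM, look for a previously answered request whose embedding is close enough to the new one, and reuse that answer. This is not LLMDB's implementation. The `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and `SemanticCache`, `fake_llm`, and the 0.8 threshold are hypothetical names and values chosen for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (assumption: stands in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache: reuse an answer when a similar request was seen."""

    def __init__(self, llm, threshold: float = 0.8):
        self.llm = llm              # callable: request text -> answer text
        self.threshold = threshold  # assumed similarity cutoff for a hit
        self.entries = []           # (embedding, answer) pairs
        self.llm_calls = 0          # how many requests reached the LLM

    def ask(self, request: str) -> str:
        query = embed(request)
        # Linear scan for the nearest stored request; a vector database
        # would replace this with an indexed nearest-neighbour search.
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]          # cache hit: no LLM call
        self.llm_calls += 1         # cache miss: pay for one LLM call
        answer = self.llm(request)
        self.entries.append((query, answer))
        return answer


def fake_llm(request: str) -> str:
    """Placeholder for a real LLM call."""
    return f"answer to: {request}"


cache = SemanticCache(fake_llm)
first = cache.ask("why is query 17 slow")
second = cache.ask("why is query 17 slow today")       # similar: served from cache
third = cache.ask("rewrite this join as a subquery")   # unrelated: calls the LLM
print(cache.llm_calls)
```

The threshold trades cost against accuracy: a lower value saves more LLM calls but risks returning an answer to a request that only looks similar.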

Abstract

Machine learning (ML) techniques for optimizing data management problems have been extensively studied and widely deployed over the past five years. However, traditional ML methods are limited in generalizability (adapting to different scenarios) and inference ability (understanding the context). Fortunately, large language models (LLMs) have shown high generalizability and human-competitive abilities in understanding context, which makes them promising for data management tasks (e.g., database diagnosis, database tuning). However, existing LLMs have several limitations: hallucination, high cost, and low accuracy on complicated tasks. To address these challenges, we design LLMDB, an LLM-enhanced data management paradigm that offers generalizability and high inference ability while avoiding hallucination, reducing LLM cost, and achieving high accuracy. LLMDB embeds domain-specific knowledge to…

Code & Models

Repositories

tsinghuadatabasegroup/db-gpt (official)

Taxonomy

Topics: Advanced Database Systems and Queries · Simulation Techniques and Applications