LLM-Powered Proactive Data Systems
Sepanta Zeighami, Yiming Lin, Shreya Shankar, Aditya Parameswaran

TL;DR
This paper advocates for proactive data systems powered by LLMs that understand, rework, and optimize user inputs and data, moving beyond reactive, black-box approaches to improve efficiency and correctness.
Contribution
It introduces a new proactive framework for data systems leveraging LLMs to understand and manipulate data and queries, enabling more intelligent and user-aware operations.
Findings
Proposed a proactive data system framework using LLMs.
Demonstrated improved efficiency in real-world tasks.
Outlined future research directions in proactive data management.
Abstract
With the power of LLMs, we now have the ability to query data that was previously impossible to query, including text, images, and video. However, despite this enormous potential, most present-day data systems that leverage LLMs are reactive, reflecting our community's desire to map LLMs to known abstractions. Most data systems treat LLMs as an opaque black box that operates on user inputs and data as is, optimizing them much like any other approximate, expensive UDFs, in conjunction with other relational operators. Such data systems do as they are told, but fail to understand and leverage what the LLM is being asked to do (i.e., the underlying operations, which may be error-prone), the data the LLM is operating on (e.g., long, complex documents), or what the user really needs. They don't take advantage of the characteristics of the operations and/or the data at hand, or ensure…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
