The Stretto Execution Engine for LLM-Augmented Data Systems
Gabriele Sanmartino, Matthias Urban, Paolo Papotti, Carsten Binnig

TL;DR
Stretto is an execution engine for LLM-augmented data systems that balances runtime efficiency against accuracy through a holistic query planning approach and flexible operator implementations.
Contribution
It introduces a holistic query planning method using constrained optimization and a new KV-caching technique for fine-grained execution choices.
Findings
Outperforms existing systems in efficiency and accuracy guarantees.
Effectively manages runtime-accuracy trade-offs in LLM-augmented data queries.
Provides end-to-end query guarantees with optimized operator selection.
Abstract
LLM-augmented data systems enable semantic querying over structured and unstructured data, but executing queries with LLM-powered operators introduces a fundamental runtime-accuracy trade-off. In this paper, we present Stretto, a new execution engine that provides end-to-end query guarantees while efficiently navigating this trade-off in a holistic manner. For this, Stretto formulates query planning as a constrained optimization problem and uses a gradient-based optimizer to jointly select operator implementations and allocate error budgets across pipelines. Moreover, to enable fine-grained execution choices, Stretto introduces a novel idea on how KV-caching can be used to realize a spectrum of different physical operators that transform a sparse design space into a dense continuum of runtime-accuracy trade-offs. Experiments show that Stretto outperforms state-of-the-art systems while…
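To make the planning problem in the abstract concrete, here is a minimal sketch of choosing one implementation per operator under an end-to-end error budget. It is not Stretto's method: the abstract describes a gradient-based optimizer, whereas this sketch uses plain exhaustive enumeration over a tiny search space. The operator names, the runtime and error figures, the `plan` function, and the assumption that per-operator errors add up (a union-bound style composition) are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical (runtime_seconds, expected_error) profiles per operator.
# Each operator has several physical implementations, from cheap and
# inaccurate to expensive and accurate.
OPERATORS = {
    "filter": [(1.0, 0.10), (4.0, 0.03), (9.0, 0.01)],
    "join": [(2.0, 0.08), (6.0, 0.02)],
}


def plan(operators, error_budget):
    """Return (total_runtime, choice) for the cheapest plan whose summed
    error stays within error_budget, or None if no plan qualifies.

    Assumption: per-operator errors compose additively.
    """
    names = list(operators)
    best = None
    # Enumerate every combination of implementation indices (brute force).
    for combo in product(*(range(len(operators[n])) for n in names)):
        runtime = sum(operators[n][i][0] for n, i in zip(names, combo))
        error = sum(operators[n][i][1] for n, i in zip(names, combo))
        if error <= error_budget and (best is None or runtime < best[0]):
            best = (runtime, dict(zip(names, combo)))
    return best


if __name__ == "__main__":
    print(plan(OPERATORS, error_budget=0.06))
```

Enumeration only works for a handful of operators and implementations. With a dense continuum of implementations, as the abstract describes, the search space is too large to enumerate, which is one motivation for a continuous, gradient-based formulation.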
Taxonomy
Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Cloud Computing and Resource Management
