PLOP: Cost-Based Placement of Semantic Operators in Hybrid Query Plans
Qiuyang Mang, Yufan Xiang, Hangrui Zhou, Runyuan He, Jiaxiang Yu, Hanchen Li, Aditya Parameswaran, and Alvin Cheung

TL;DR
PLOP is a cost-based optimizer that chooses where to place semantic operators in hybrid query plans so as to minimize total execution cost, which covers both large language model (LLM) calls and traditional database processing.
Contribution
It introduces a dynamic programming-based cost model for optimal semantic filter placement, balancing LLM and relational costs in hybrid queries.
Findings
PLOP achieves up to 1.5× speedup and 4.29× cost reduction.
It maintains high output quality with an average F1 of 0.85.
PLOP outperforms six publicly available systems in cost and accuracy.
Abstract
Recent database systems have introduced semantic operators that leverage large language models (LLMs) to filter, join, and project over structured data using natural language predicates. In practice, these operators are combined with traditional relational operators, e.g., equi-joins, producing hybrid query plans whose execution cost depends on both expensive LLM calls and conventional database processing. A key optimization question is where to place each semantic operator relative to the relational operators in the plan: placing them earlier reduces the data that subsequent operators process, but requires more LLM calls; placing them later reduces LLM calls through deduplication, but forces relational operators to process larger intermediate data. Existing systems either ignore this placement question or apply simple heuristics without considering the full cost trade-off. We present…
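The placement trade-off described in the abstract can be illustrated with a toy example. The sketch below is not PLOP's cost model or algorithm. It is a minimal illustration in which every name (`PlanStats`, `filter_first`, `join_first`, `total_cost`), the sample data, and the per-call and per-row prices are hypothetical, and the "semantic predicate" is a plain Python function standing in for an LLM call. It assumes LLM calls are deduplicated on the text value and that relational cost is proportional to the number of rows entering the join.

```python
from dataclasses import dataclass

@dataclass
class PlanStats:
    result: list          # rows returned by the plan
    llm_calls: int        # deduplicated semantic-predicate evaluations
    join_input_rows: int  # rows the equi-join has to process

def semantic_predicate(text):
    # Hypothetical stand-in for an LLM call ("is this review positive?").
    return "great" in text or "love" in text

def filter_first(reviews, product_ids):
    # Early placement: evaluate the predicate on every distinct text in the
    # base table, then join only the surviving rows.
    verdict = {t: semantic_predicate(t) for t in {text for _, text in reviews}}
    kept = [row for row in reviews if verdict[row[1]]]
    result = [row for row in kept if row[0] in product_ids]
    return PlanStats(result, len(verdict), len(kept) + len(product_ids))

def join_first(reviews, product_ids):
    # Late placement: join the full table first, then evaluate the predicate
    # only on the distinct texts that survive the join.
    joined = [row for row in reviews if row[0] in product_ids]
    verdict = {t: semantic_predicate(t) for t in {text for _, text in joined}}
    result = [row for row in joined if verdict[row[1]]]
    return PlanStats(result, len(verdict), len(reviews) + len(product_ids))

def total_cost(stats, llm_cost_per_call, cost_per_row):
    # Assumed linear cost: LLM spend plus relational processing.
    return stats.llm_calls * llm_cost_per_call + stats.join_input_rows * cost_per_row

# Hypothetical data: (product_id, review_text) rows and the product ids
# that survive some relational filter on the other side of the join.
reviews = [
    (1, "great battery"), (1, "screen cracked"), (2, "great battery"),
    (3, "arrived late"), (4, "screen cracked"), (4, "love it"),
]
product_ids = {1, 2}

early = filter_first(reviews, product_ids)
late = join_first(reviews, product_ids)
for name, stats in (("filter first", early), ("join first", late)):
    print(name, stats.llm_calls, stats.join_input_rows,
          total_cost(stats, llm_cost_per_call=1.0, cost_per_row=0.01))
```

Both placements return the same rows, but the early placement issues more LLM calls while the late placement feeds more rows into the join. Which one is cheaper depends on the assumed ratio between the per-call and per-row prices, which is why a cost-based choice is needed rather than a fixed heuristic.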
