Stochastic Package Queries in Probabilistic Databases
Matteo Brucato, Nishant Yadav, Azza Abouzied, Peter J. Haas, and Alexandra Meliou

TL;DR
This paper introduces stochastic package queries (SPQs) for probabilistic databases, which support decision-making over uncertain data, together with a novel SummarySearch algorithm that outperforms traditional Monte Carlo methods.
Contribution
It presents a SQL extension for specifying SPQs and a new SummarySearch algorithm that efficiently approximates large stochastic optimization problems in probabilistic databases.
Findings
SummarySearch is significantly faster than prior methods.
The approach produces high-quality, feasible packages.
Experimental results demonstrate substantial performance improvements.
Abstract
We provide methods for in-database support of decision making under uncertainty. Many important decision problems correspond to selecting a package (bag of tuples in a relational database) that jointly satisfy a set of constraints while minimizing some overall cost function; in most real-world problems, the data is uncertain. We provide methods for specifying -- via a SQL extension -- and processing stochastic package queries (SPQs), in order to solve optimization problems over uncertain data, right where the data resides. Prior work in stochastic programming uses Monte Carlo methods where the original stochastic optimization problem is approximated by a large deterministic optimization problem that incorporates many scenarios, i.e., sample realizations of the uncertain data values. For large database tables, however, a huge number of scenarios is required, leading to poor performance…
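To make the scenario-based Monte Carlo approximation described in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's SummarySearch algorithm and does not use the paper's SQL extension. All names (`sample_scenarios`, `best_package`, the `(cost, mean_gain, std_gain)` item layout, and the Gaussian uncertainty model) are illustrative assumptions. For simplicity the sketch treats a package as a set rather than a bag, and it replaces the deterministic solver with brute-force enumeration, which is only workable for a handful of tuples.

```python
import itertools
import random


def sample_scenarios(items, num_scenarios, seed=0):
    """Draw scenarios: each scenario is one sampled realization of every item's gain.

    items: list of (cost, mean_gain, std_gain) tuples (hypothetical layout).
    The Gaussian model is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, std) for (_cost, mean, std) in items]
        for _ in range(num_scenarios)
    ]


def best_package(items, scenarios, min_gain, min_prob):
    """Return (cost, indices) of the cheapest package whose total gain reaches
    min_gain in at least a min_prob fraction of the scenarios, or None.

    The probabilistic constraint is replaced by a count over sampled
    scenarios, which turns the stochastic problem into a deterministic one.
    """
    n = len(items)
    best = None
    for size in range(n + 1):
        for subset in itertools.combinations(range(n), size):
            cost = sum(items[i][0] for i in subset)
            satisfied = sum(
                1 for s in scenarios if sum(s[i] for i in subset) >= min_gain
            )
            if satisfied / len(scenarios) >= min_prob:
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


if __name__ == "__main__":
    # Hypothetical table: (cost, mean_gain, std_gain) per tuple.
    table = [(1.0, 1.0, 0.5), (2.0, 3.0, 1.0), (5.0, 10.0, 4.0), (3.0, 4.0, 0.2)]
    scenarios = sample_scenarios(table, num_scenarios=1000, seed=42)
    print(best_package(table, scenarios, min_gain=4.0, min_prob=0.9))
```

Each candidate package is checked against every scenario, so the work grows with both the number of tuples and the number of scenarios. This illustrates why, as the abstract notes, the approach performs poorly when large tables require a huge number of scenarios.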
