Stochastic SketchRefine: Scaling In-Database Decision-Making under Uncertainty to Millions of Tuples
Riddho R. Haque, Anh L. Mai, Matteo Brucato, Azza Abouzied, Peter J. Haas, Alexandra Meliou

TL;DR
This paper introduces Stochastic SketchRefine, a scalable framework for solving large stochastic package queries efficiently by linearizing risk constraints and dividing the problem into manageable subproblems, enabling fast, high-quality decision-making under uncertainty.
Contribution
It presents two novel methods: risk-constraint linearization (RCL) for scalable ILP formulation and Stochastic SketchRefine for divide-and-conquer optimization, significantly improving processing speed for large uncertain datasets.
Findings
Achieves orders of magnitude faster runtime than existing methods.
Produces high-quality packages with near-optimal expected outcomes.
Handles high-variance data effectively in large-scale stochastic optimization.
Abstract
Decision making under uncertainty often requires choosing packages, or bags of tuples, that collectively optimize expected outcomes while limiting risks. Processing Stochastic Package Queries (SPQs) involves solving very large optimization problems on uncertain data. Monte Carlo methods create numerous scenarios, or sample realizations of the stochastic attributes of all the tuples, and generate packages with optimal objective values across these scenarios. The number of scenarios needed for accurate approximation - and hence the size of the optimization problem when using prior methods - increases with variance in the data, and the search space of the optimization problem increases exponentially with the number of tuples in the relation. Existing solvers take hours to process SPQs on large relations containing stochastic attributes with high variance. Besides enriching the SPaQL…
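To make the scenario idea in the abstract concrete, here is a minimal sketch of how a candidate package could be scored against Monte Carlo scenarios. It is not the paper's algorithm: the independent normal distributions, the function names `generate_scenarios` and `evaluate_package`, and the `threshold` parameter are all assumptions chosen for illustration.

```python
import random


def generate_scenarios(means, stddevs, num_scenarios, seed=0):
    """Draw scenarios: each one holds a sampled value for every tuple.

    Assumption: each tuple's stochastic attribute is an independent
    normal variable with the given mean and standard deviation.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(m, s) for m, s in zip(means, stddevs)]
        for _ in range(num_scenarios)
    ]


def evaluate_package(package, scenarios, threshold):
    """Score a package (a bag given as one multiplicity per tuple).

    Returns the package total averaged over all scenarios and the
    fraction of scenarios in which the total reaches `threshold`.
    """
    totals = [
        sum(count * value for count, value in zip(package, scenario))
        for scenario in scenarios
    ]
    expected_total = sum(totals) / len(totals)
    satisfied_fraction = sum(t >= threshold for t in totals) / len(totals)
    return expected_total, satisfied_fraction


if __name__ == "__main__":
    # Hypothetical data: three tuples with increasing variance.
    scenarios = generate_scenarios([1.0, 2.0, 3.0], [0.1, 1.0, 3.0], 1000)
    print(evaluate_package([2, 1, 0], scenarios, threshold=3.0))
```

The satisfied fraction estimates the probability that a risk constraint holds. Higher-variance attributes need more scenarios before that estimate stabilizes, which is the growth in problem size the abstract describes.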
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms
