Scaling Package Queries to a Billion Tuples via Hierarchical Partitioning and Customized Optimization
Anh L. Mai, Pengyu Wang, Azza Abouzied, Matteo Brucato, Peter J. Haas, and Alexandra Meliou

TL;DR
This paper introduces Progressive Shading, a scalable algorithm for package queries that efficiently handles billions of tuples using hierarchical partitioning and customized optimization, significantly improving over prior methods.
Contribution
It presents a novel hierarchical partitioning approach and optimized ILP/LP solvers that enable package query processing over billions of tuples.
Findings
Scales to billions of tuples efficiently.
Handles very tight constraints gracefully.
Outperforms traditional partitioning schemes.
Abstract
A package query returns a package - a multiset of tuples - that maximizes or minimizes a linear objective function subject to linear constraints, thereby enabling in-database decision support. Prior work has established the equivalence of package queries to Integer Linear Programs (ILPs) and developed the SketchRefine algorithm for package query processing. While this algorithm was an important first step toward supporting prescriptive analytics scalably inside a relational database, it struggles when the data size grows beyond a few hundred million tuples or when the constraints become very tight. In this paper, we present Progressive Shading, a novel algorithm for processing package queries that can scale efficiently to billions of tuples and gracefully handle tight constraints. Progressive Shading solves a sequence of optimization problems over a hierarchy of relations, each…
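To make the definition of a package query concrete, the sketch below brute-forces the underlying ILP on a toy relation. It is not Progressive Shading or SketchRefine; exhaustive enumeration is used only to illustrate what "a multiset of tuples that optimizes a linear objective subject to linear constraints" means. The function name `solve_package_query`, the `max_count` repetition bound, and the `meals` data are all hypothetical and chosen for illustration.

```python
from itertools import product


def solve_package_query(rows, objective, constraints, max_count=1, maximize=True):
    """Brute-force a tiny package query (illustrative only, exponential time).

    rows        -- list of dicts, one per tuple of the relation
    objective   -- attribute whose package-wide sum is optimized
    constraints -- list of (attribute, lower, upper) bounds on package-wide
                   sums; use None for a missing bound
    max_count   -- assumed cap on how often one tuple may repeat in the package
    """
    best_value, best_counts = None, None
    # Each candidate package is a vector of multiplicities, one per tuple.
    for counts in product(range(max_count + 1), repeat=len(rows)):
        feasible = True
        for attr, lo, hi in constraints:
            total = sum(c * r[attr] for c, r in zip(counts, rows))
            if (lo is not None and total < lo) or (hi is not None and total > hi):
                feasible = False
                break
        if not feasible:
            continue
        value = sum(c * r[objective] for c, r in zip(counts, rows))
        better = (
            best_value is None
            or (maximize and value > best_value)
            or (not maximize and value < best_value)
        )
        if better:
            best_value, best_counts = value, counts
    return best_value, best_counts


# Hypothetical toy relation: choose meals that maximize total protein
# while keeping total calories between 500 and 900.
meals = [
    {"name": "oats", "protein": 10, "kcal": 300},
    {"name": "eggs", "protein": 18, "kcal": 250},
    {"name": "rice", "protein": 6, "kcal": 400},
    {"name": "tofu", "protein": 20, "kcal": 350},
]

value, counts = solve_package_query(
    meals, "protein", [("kcal", 500, 900)], max_count=2
)
package = {m["name"]: c for m, c in zip(meals, counts) if c > 0}
print(value, package)
```

The search space here has (max_count + 1)^n candidate packages, which is why exhaustive search is hopeless beyond a handful of tuples and why the abstract turns to ILP formulations and partitioning-based algorithms instead.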
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Constraint Satisfaction and Optimization
