JanusAQP: Efficient Partition Tree Maintenance for Dynamic Approximate Query Processing
Xi Liang, Stavros Sintos, Sanjay Krishnan

TL;DR
JanusAQP is a dynamic approximate query processing (AQP) system that maintains partition tree synopses under frequent insertions and deletions, reducing approximation error while sustaining high update rates and low query latency.
Contribution
The paper introduces methods for initializing, maintaining, and re-optimizing dynamic partition trees for approximate query processing, improving both accuracy and update performance.
Findings
Reduces the error of a state-of-the-art baseline by more than 60% while using only 10% storage cost.
Processes more than 100K updates per second on a single node, with millisecond query latency.
Supports SUM, COUNT, AVG, MIN, and MAX queries under insertions and deletions.
Abstract
Approximate query processing over dynamic databases, i.e., under insertions/deletions, has applications ranging from high-frequency trading to internet-of-things analytics. We present JanusAQP, a new dynamic AQP system, which supports SUM, COUNT, AVG, MIN, and MAX queries under insertions and deletions to the dataset. JanusAQP extends static partition tree synopses, which are hierarchical aggregations of datasets, into the dynamic setting. This paper contributes new methods for: (1) efficient initialization of the data synopsis in the presence of incoming data, (2) maintenance of the data synopsis under insertions/deletions, and (3) re-optimization of the partitioning to reduce the approximation error. JanusAQP reduces the error of a state-of-the-art baseline by more than 60% using only 10% storage cost. JanusAQP can process more than 100K updates per second in a single node setting and…
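To make the idea of a partition tree synopsis concrete, here is a minimal sketch in Python. It is not JanusAQP's implementation: it assumes a one-dimensional key domain, a fixed binary partitioning, and a uniform spread of tuples inside each leaf, and the names `Node`, `build`, `update`, and `query` are hypothetical. Each node stores a COUNT and a SUM for its key range; an insertion or deletion adjusts the aggregates along one root-to-leaf path, and a range query combines exact aggregates from fully covered nodes with a proportional estimate from partially covered leaves. MIN and MAX, sampling, and re-optimization of the partitioning are left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    lo: float                       # inclusive lower bound of the key range
    hi: float                       # exclusive upper bound of the key range
    count: int = 0                  # number of tuples whose key falls in [lo, hi)
    total: float = 0.0              # sum of the values of those tuples
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(lo: float, hi: float, depth: int) -> Node:
    """Build an empty binary partition tree by halving [lo, hi) `depth` times."""
    node = Node(lo, hi)
    if depth > 0:
        mid = (lo + hi) / 2
        node.left = build(lo, mid, depth - 1)
        node.right = build(mid, hi, depth - 1)
    return node


def update(root: Node, key: float, value: float, sign: int) -> None:
    """Apply an insertion (sign=+1) or deletion (sign=-1) along one root-to-leaf path."""
    node: Optional[Node] = root
    while node is not None:
        node.count += sign
        node.total += sign * value
        if node.left is None:       # reached a leaf
            break
        node = node.left if key < node.left.hi else node.right


def query(node: Node, a: float, b: float) -> Tuple[float, float]:
    """Estimate (COUNT, SUM) over keys in [a, b)."""
    if b <= node.lo or a >= node.hi:            # no overlap
        return 0.0, 0.0
    if a <= node.lo and node.hi <= b:           # fully covered: exact aggregates
        return float(node.count), node.total
    if node.left is None:                       # partially covered leaf
        # Assumption of this sketch: tuples are spread uniformly inside a leaf.
        frac = (min(b, node.hi) - max(a, node.lo)) / (node.hi - node.lo)
        return node.count * frac, node.total * frac
    lc, ls = query(node.left, a, b)
    rc, rs = query(node.right, a, b)
    return lc + rc, ls + rs


if __name__ == "__main__":
    tree = build(0.0, 8.0, depth=3)             # eight leaves of width 1
    update(tree, key=1.5, value=10.0, sign=+1)
    update(tree, key=2.5, value=20.0, sign=+1)
    update(tree, key=2.5, value=20.0, sign=-1)  # delete the second tuple
    cnt, total = query(tree, 0.0, 4.0)
    print(cnt, total, total / cnt if cnt else None)  # COUNT, SUM, AVG
```

In this sketch an update costs time proportional to the tree depth, and AVG is derived as SUM divided by COUNT. The approximation error comes entirely from partially covered leaves, which is why the choice of partition boundaries matters for accuracy.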
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Data Quality and Management
