DBSP: Automatic Incremental View Maintenance for Rich Query Languages
Mihai Budiu, Frank McSherry, Leonid Ryzhyk, Val Tannen

TL;DR
This paper introduces DBSP, a simple but expressive language for computations over data streams, and gives a general algorithm that solves incremental view maintenance for arbitrary DBSP programs. Because many rich query languages can be modeled in DBSP, the algorithm yields efficient incremental view maintenance for all of them.
Contribution
The paper presents a unified approach to incremental view maintenance: rather than devising a separate technique for each query language, it models the query in DBSP and applies one general incrementalization algorithm.
Findings
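DBSP can model the full relational queries, grouping and aggregation, monotonic and non-monotonic recursion, and streaming aggregation.
The incremental view maintenance algorithm applies to arbitrary DBSP programs, so every language modeled in DBSP inherits an efficient maintenance technique.
Earlier solutions targeted restricted classes of languages, such as the relational algebra or Datalog, and do not naturally generalize to richer languages; DBSP covers both.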
Abstract
Incremental view maintenance has been for a long time a central problem in database theory. Many solutions have been proposed for restricted classes of database languages, such as the relational algebra, or Datalog. These techniques do not naturally generalize to richer languages. In this paper we give a general solution to this problem in 3 steps: (1) we describe a simple but expressive language called DBSP for describing computations over data streams; (2) we give a general algorithm for solving the incremental view maintenance problem for arbitrary DBSP programs, and (3) we show how to model many rich database query languages (including the full relational queries, grouping and aggregation, monotonic and non-monotonic recursion, and streaming aggregation) using DBSP. As a consequence, we obtain efficient incremental view maintenance techniques for all these rich languages.
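To make the idea of maintaining a view from changes concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes details that this summary does not state: tables are represented as Z-sets (rows mapped to integer weights, with negative weights meaning deletion), a stream of changes is a list of such Z-sets, and an incremental query is the composition "integrate the changes, run the query, differentiate the results". All function names are hypothetical. The sketch compares that naive composition with an optimized incremental join that only touches the new changes and the accumulated inputs.

```python
from collections import defaultdict

def zadd(a, b):
    """Pointwise sum of two Z-sets; rows whose weight becomes 0 are dropped."""
    out = defaultdict(int)
    for z in (a, b):
        for row, w in z.items():
            out[row] += w
    return {r: w for r, w in out.items() if w != 0}

def zneg(a):
    """Negate every weight (turns insertions into deletions)."""
    return {r: -w for r, w in a.items()}

def join(a, b):
    """Equi-join on the first field of each row; weights multiply."""
    out = defaultdict(int)
    for (k1, x), w1 in a.items():
        for (k2, y), w2 in b.items():
            if k1 == k2:
                out[(k1, x, y)] += w1 * w2
    return {r: w for r, w in out.items() if w != 0}

def integrate(stream):
    """Running sum: turns a stream of changes into a stream of snapshots."""
    acc, out = {}, []
    for delta in stream:
        acc = zadd(acc, delta)
        out.append(acc)
    return out

def differentiate(stream):
    """Inverse of integrate: turns snapshots back into changes."""
    prev, out = {}, []
    for snap in stream:
        out.append(zadd(snap, zneg(prev)))
        prev = snap
    return out

def naive_join_deltas(da, db):
    """Recompute the full join at every step, then diff consecutive views."""
    views = [join(a, b) for a, b in zip(integrate(da), integrate(db))]
    return differentiate(views)

def incremental_join_deltas(da, db):
    """Use bilinearity: d(A join B) = dA join dB + A_prev join dB + dA join B_prev."""
    acc_a, acc_b, out = {}, {}, []
    for x, y in zip(da, db):
        out.append(zadd(join(x, y), zadd(join(acc_a, y), join(x, acc_b))))
        acc_a, acc_b = zadd(acc_a, x), zadd(acc_b, y)
    return out

# Three time steps of changes to two tables (illustrative data only).
da = [{(1, "a"): 1}, {(2, "b"): 1}, {(1, "a"): -1}]
db = [{(1, "x"): 1}, {}, {(2, "y"): 1}]
deltas = incremental_join_deltas(da, db)
```

Both functions produce the same stream of view changes, but the incremental version never rebuilds the whole view; its work per step depends on the size of the changes and the accumulated inputs it joins them against.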
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Distributed systems and fault tolerance
