The Marriage of Incremental and Approximate Computing
Dhanya R Krishnan

TL;DR
This paper introduces IncApprox, a system that combines incremental and approximate computing by biasing sampling towards memoized data, enabling efficient, low-latency data analytics with bounded error.
Contribution
It proposes a novel online stratified sampling algorithm that marries incremental and approximate computing paradigms, implemented in Apache Spark Streaming.
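The idea of biasing a stratified sample towards memoized data can be illustrated with a minimal sketch. It is a simplified offline version written for illustration, not the paper's online algorithm or its Spark Streaming implementation. The function name `biased_stratified_sample`, the `(item_id, stratum, value)` tuple layout, and the per-stratum `sample_sizes` dictionary are all assumptions. The sketch also leaves out whatever the actual system does to keep the error bounded.

```python
import random
from collections import defaultdict


def biased_stratified_sample(items, memoized_ids, sample_sizes, seed=0):
    """Hypothetical helper: pick a per-stratum sample, preferring memoized items.

    items:        iterable of (item_id, stratum, value) tuples (assumed layout)
    memoized_ids: set of item ids whose results were cached by an earlier run
    sample_sizes: dict mapping stratum -> number of items to draw
    Returns a dict mapping stratum -> list of sampled (item_id, value) pairs.
    """
    rng = random.Random(seed)

    # Group the input by stratum so each stratum is sampled independently.
    by_stratum = defaultdict(list)
    for item_id, stratum, value in items:
        by_stratum[stratum].append((item_id, value))

    sample = {}
    for stratum, members in by_stratum.items():
        k = min(sample_sizes.get(stratum, 0), len(members))
        cached = [m for m in members if m[0] in memoized_ids]
        fresh = [m for m in members if m[0] not in memoized_ids]

        # Draw from memoized items first, then fill the remainder
        # with items that would have to be computed from scratch.
        chosen = rng.sample(cached, min(k, len(cached)))
        chosen += rng.sample(fresh, k - len(chosen))
        sample[stratum] = chosen
    return sample
```

In this sketch, every memoized item chosen for the sample is one sub-computation that can be reused rather than re-executed, which is where the savings from combining the two paradigms would come from.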
Findings
IncApprox achieves low-latency analytics with bounded error.
The system effectively combines benefits of incremental and approximate computing.
Experimental results show improved efficiency over traditional methods.
Abstract
Most data analytics systems that require low-latency execution and efficient utilization of computing resources increasingly adopt two computational paradigms: incremental and approximate computing. Incremental computation updates the output incrementally instead of re-computing everything from scratch for successive runs of a job with input changes. Approximate computation returns an approximate output for a job instead of the exact output. Both paradigms rely on computing over a subset of data items instead of computing over the entire dataset, but they differ in their means for skipping parts of the computation. Incremental computing relies on memoizing intermediate results of sub-computations and reusing these memoized results across jobs for sub-computations that are unaffected by the changed input. Approximate computing relies on representative sampling of the…
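The memoization side of the abstract can be sketched with a toy example: a sum computed chunk by chunk, where partial sums are cached and reused when a chunk is unchanged between runs. The function `incremental_sum`, the chunking of the input into tuples, and the plain dictionary used as a memo table are all assumptions made for illustration, not details taken from the paper.

```python
def incremental_sum(chunks, memo):
    """Hypothetical helper: sum all chunks, reusing cached partial sums.

    chunks: list of tuples of numbers (tuples are hashable, so they can key the memo)
    memo:   dict mapping chunk -> partial sum, shared across successive runs
    Returns (total, recomputed), where recomputed counts chunks not found in memo.
    """
    total = 0
    recomputed = 0
    for chunk in chunks:
        if chunk not in memo:
            # Sub-computation affected by changed input: compute and memoize it.
            memo[chunk] = sum(chunk)
            recomputed += 1
        # Unchanged chunks are served from the memo table.
        total += memo[chunk]
    return total, recomputed


memo = {}
first_run = incremental_sum([(1, 2), (3, 4)], memo)   # both chunks computed
second_run = incremental_sum([(1, 2), (5, 6)], memo)  # only the changed chunk computed
```

On the second run only the chunk that changed is recomputed, while the unchanged chunk's partial sum is reused from the memo table.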
Taxonomy
Topics: Advanced Data Storage Technologies · Cloud Computing and Resource Management · Advanced Database Systems and Queries
