Approximating Aggregated SQL Queries With LSTM Networks
Nir Regev, Lior Rokach, Asaf Shabtai

TL;DR
This paper introduces 'Hunch', a lightweight LSTM-based approach for approximate query processing that significantly reduces query latency and improves prediction accuracy compared to existing methods, enabling near real-time data analytics.
Contribution
The paper presents a novel LSTM-based method for AQP that achieves high throughput and accuracy, outperforming state-of-the-art engines in query latency and prediction quality.
Findings
Predicted query results with 1-4% normalized root-mean-square error (NRMSE)
Predicted up to 120,000 queries per second
Single query latency under 2 milliseconds
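As a reading aid for the accuracy figure above, the following is a minimal sketch of how an NRMSE can be computed for a batch of approximated aggregate results. The summary does not say which normalization the paper uses; dividing the RMSE by the range of the true values is an assumption made here (dividing by the mean is another common convention), and the function name `nrmse` and the sample numbers are hypothetical.

```python
import math


def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values.

    Assumption: range normalization. The source does not state which
    normalization is behind the reported 1-4% figure.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    spread = max(actual) - min(actual)
    if spread == 0:
        raise ValueError("range of actual values is zero; NRMSE is undefined")
    return math.sqrt(mse) / spread


# Hypothetical exact results of four aggregate queries and their approximations.
exact = [100.0, 200.0, 300.0, 400.0]
approx = [110.0, 190.0, 310.0, 390.0]
print(f"NRMSE = {nrmse(exact, approx):.2%}")
```

Under this convention, an NRMSE of 1-4% means the typical prediction error is a few percent of the spread of the true query results.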
Abstract
Despite continuous investments in data technologies, the latency of querying data still poses a significant challenge. Modern analytic solutions require near real-time responsiveness both to make them interactive and to support automated processing. Current technologies (Hadoop, Spark, Dataflow) scan the dataset to execute queries. They focus on providing scalable data storage to maximize task execution speed. We argue that these solutions fail to offer an adequate level of interactivity since they depend on continual access to data. In this paper we present a method for query approximation, also known as approximate query processing (AQP), that reduces the need to scan data during inference (query calculation), thus enabling a rapid query processing tool. We use an LSTM network to learn the relationship between queries and their results, and to provide a rapid inference layer for…
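To make the idea of answering a query without scanning data more concrete, here is a toy sketch of an LSTM forward pass that maps a tokenized SQL string to a single number. It is not the authors' architecture: the whitespace tokenizer, the scalar hidden state, the token-to-number encoding, and the random untrained weights are all assumptions chosen to keep the example dependency-free, and every name in it (`tokenize`, `build_vocab`, `lstm_step`, `predict`) is hypothetical. A real model would be trained on pairs of queries and their exact results.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tokenize(query):
    # Assumption: naive whitespace tokenization of the SQL text.
    return query.lower().split()


def build_vocab(queries):
    # Assign each distinct token an integer id starting at 1.
    ids = {}
    for q in queries:
        for tok in tokenize(q):
            ids.setdefault(tok, len(ids) + 1)
    return ids


def lstm_step(x, h, c, params):
    """One LSTM step with scalar input, hidden state, and cell state."""
    pre = {name: w_x * x + w_h * h + b for name, (w_x, w_h, b) in params.items()}
    i = sigmoid(pre["input"])      # input gate
    f = sigmoid(pre["forget"])     # forget gate
    o = sigmoid(pre["output"])     # output gate
    g = math.tanh(pre["cand"])     # candidate cell value
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def predict(query, vocab, params, w_out, b_out):
    """Run the query tokens through the LSTM; no data is scanned."""
    h, c = 0.0, 0.0
    for tok in tokenize(query):
        x = vocab[tok] / len(vocab)  # crude scalar encoding of the token
        h, c = lstm_step(x, h, c, params)
    return w_out * h + b_out, h


rng = random.Random(0)  # fixed seed so the sketch is reproducible
params = {
    name: tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
    for name in ("input", "forget", "output", "cand")
}
query = "SELECT COUNT(*) FROM events WHERE country = 'US'"
vocab = build_vocab([query])
estimate, last_hidden = predict(query, vocab, params, w_out=1.0, b_out=0.0)
print(estimate)  # meaningless until the weights are trained
```

The point of the sketch is the cost profile: inference touches only the query text and a fixed set of weights, so its latency does not depend on the size of the underlying dataset.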
Taxonomy
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
