PIQUE: Progressive Integrated QUery Operator with Pay-As-You-Go Enrichment
Dhrubajyoti Ghosh, Roberto Yus, Yasser Altowim, Sharad Mehrotra

TL;DR
This paper introduces PIQUE, an operator that enriches data progressively during query processing, so that answer quality improves faster during interactive exploratory analysis of big data.
Contribution
The paper presents a new operator, PIQUE, that supports prioritized, progressive data enrichment during query execution. It targets a limitation of offline enrichment: data stays unavailable for analysis between ingestion and enrichment.
Findings
PIQUE improves answer quality at a faster rate than the baseline methods.
Progressive enrichment enables more interactive and timely data analysis.
The approach is effective across various data types and enrichment functions.
Abstract
Big data today, in the form of text, images, video, and sensor data, needs to be enriched (i.e., annotated with tags) before it can be effectively queried or analyzed. Data enrichment (which, depending upon the application, could be compiled code, declarative queries, or expensive machine learning and/or signal processing techniques) often cannot be performed in its entirety as a pre-processing step at the time of data ingestion. Enriching data as a separate offline step after ingestion makes it unavailable for analysis during the period between ingestion and enrichment. To bridge this gap, this paper explores a novel approach that supports progressive data enrichment during query processing in order to support interactive exploratory analysis. Our approach is based on integrating an operator, entitled PIQUE, to support a prioritized execution of the enrichment functions during query…
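The abstract is cut off before it explains how PIQUE prioritizes enrichment, so the Python sketch below only illustrates the general idea of pay-as-you-go enrichment inside a query loop. It is not the paper's algorithm. Every name in it (EnrichFn, uncertainty, progressive_query), the 0.5 starting probability, the "uncertainty per unit cost" ranking, and the fixed per-epoch budget are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, List, NamedTuple, Set


class EnrichFn(NamedTuple):
    """Hypothetical enrichment function (not an API from the paper)."""
    name: str
    cost: float                  # assumed cost units per call
    run: Callable[[str], float]  # returns P(object satisfies the query predicate)


def uncertainty(p: float) -> float:
    """1.0 when p == 0.5 (most uncertain), 0.0 when p is 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def progressive_query(
    objects: List[str],
    functions: List[EnrichFn],
    threshold: float,
    budget_per_epoch: float,
) -> Iterator[Set[str]]:
    """Yield a refreshed answer set after each epoch of prioritized enrichment."""
    prob: Dict[str, float] = {o: 0.5 for o in objects}  # assumed prior: no evidence yet
    next_fn: Dict[str, int] = {o: 0 for o in objects}   # index of next function per object

    while True:
        # Objects that still have an enrichment function left to run.
        candidates = [o for o in objects if next_fn[o] < len(functions)]
        if not candidates:
            break
        # Assumed priority: most uncertain objects per unit of cost go first.
        candidates.sort(
            key=lambda o: uncertainty(prob[o]) / functions[next_fn[o]].cost,
            reverse=True,
        )
        spent = 0.0
        ran_any = False
        for o in candidates:
            fn = functions[next_fn[o]]
            if spent + fn.cost > budget_per_epoch:
                continue  # does not fit in this epoch's remaining budget
            prob[o] = fn.run(o)  # the latest function's output replaces the estimate
            next_fn[o] += 1
            spent += fn.cost
            ran_any = True
        if not ran_any:
            break  # budget is too small for any remaining function
        yield {o for o in objects if prob[o] >= threshold}


# Toy data: a cheap, coarse tagger followed by an expensive, precise one.
cheap_scores = {"a": 0.9, "b": 0.5, "c": 0.2}
exact_scores = {"a": 1.0, "b": 1.0, "c": 0.0}
demo_functions = [
    EnrichFn("cheap", 1.0, lambda o: cheap_scores[o]),
    EnrichFn("expensive", 4.0, lambda o: exact_scores[o]),
]
demo_answers = list(
    progressive_query(["a", "b", "c"], demo_functions,
                      threshold=0.8, budget_per_epoch=4.0)
)
```

In this toy run each entry of demo_answers is the answer set a user would see after one more epoch, so the result can be inspected early and grows more reliable as further enrichment is paid for. A real system would need a principled benefit estimate and a way to combine the outputs of several functions, both of which this sketch leaves out.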
Taxonomy
Topics: Customer churn and segmentation · Advanced Database Systems and Queries · Data Management and Algorithms
