Analytical Engines With Context-Rich Processing: Towards Efficient Next-Generation Analytics
Viktor Sanca, Anastasia Ailamaki

TL;DR
This paper proposes a new analytical engine that integrates context-rich data processing, online data integration, and adaptive optimization to efficiently handle complex, heterogeneous data sources and analytical workloads.
Contribution
It introduces a co-optimized system combining model-assisted similarity, holistic cost-based optimization, and adaptive execution for next-generation analytics.
Findings
Proposes online data integration via model-assisted similarity operations.
Develops a holistic pipeline optimization across relational and model-based operators.
Envisions adaptive execution for heterogeneous hardware and workloads.
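To make the first finding concrete, here is a minimal Python sketch of a similarity join that matches strings by embedding distance rather than exact equality. It is an illustration under stated assumptions, not the authors' implementation: `embed`, `cosine`, and `similarity_join` are hypothetical names, and the character-trigram `embed` function is only a toy stand-in for the learned model the paper has in mind.

```python
from collections import Counter
from math import sqrt


def embed(text, n=3):
    """Toy stand-in for a learned embedding model (assumption):
    a sparse vector of character n-gram counts."""
    padded = f"  {text.lower().strip()}  "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_join(left, right, threshold=0.5):
    """Nested-loop join: emit (left, right, score) for every pair whose
    embeddings have cosine similarity of at least `threshold`."""
    right_vecs = [(r, embed(r)) for r in right]  # embed each right value once
    matches = []
    for l in left:
        l_vec = embed(l)
        for r, r_vec in right_vecs:
            score = cosine(l_vec, r_vec)
            if score >= threshold:
                matches.append((l, r, score))
    return matches
```

The nested loop embeds each value once and compares every pair. A cost-based optimizer of the kind the findings describe could presumably weigh the embedding cost against the comparison cost when ordering such an operator with relational filters, but that placement logic is outside this sketch.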
Abstract
As modern data pipelines continue to collect, produce, and store a variety of data formats, extracting and combining value from traditional and context-rich sources such as strings, text, video, audio, and logs becomes a manual process where such formats are unsuitable for RDBMS. To tap into the dark data, domain experts analyze and extract insights and integrate them into the data repositories. This process can involve out-of-DBMS, ad-hoc analysis, and processing resulting in ETL, engineering effort, and suboptimal performance. While AI systems based on ML models can automate the analysis process, they often further generate context-rich answers. Using multiple sources of truth, for either training the models or in the form of knowledge bases, further exacerbates the problem of consolidating the data of interest. We envision an analytical engine co-optimized with components that…
Taxonomy
Topics: Data Quality and Management · Advanced Database Systems and Queries · Semantic Web and Ontologies
