Holistic Cube Analysis: A Query Framework for Data Insights
Xi Wu, Shaleen Deep, Joe Benassi, Fengan Li, Yaqi Zhang, Uyeong Jang, James Foster, Stella Kim, Yujing Sun, Long Nguyen, Stratis Viglas, Somesh Jha, John Cieslewicz, Jeffrey F. Naughton

TL;DR
Holistic Cube Analysis (HoCA) is a framework that enhances relational queries with new data types and operators to facilitate complex data insight searches, enabling powerful analysis across diverse applications.
Contribution
HoCA introduces a novel data type AbstractCube and operators like cube crawling and join, expanding the capabilities of relational models for data insight exploration.
Findings
Implemented and deployed HoCA at Google.
Attracted over 30 teams across various fields.
Enabled novel analyses like recurrent crawling.
Abstract
Many data insight questions can be viewed as searching a large space of tables and finding the important ones, where the notion of importance is defined in some ad hoc, user-defined manner. This paper presents Holistic Cube Analysis (HoCA), a framework that augments the capabilities of relational queries for such problems. HoCA first augments the relational data model and introduces a new data type, AbstractCube, defined as a function that maps a region-features pair to a relational table (a region is a tuple that specifies values for a set of dimensions). AbstractCube provides a logical form of data, and HoCA operators are cube-to-cube transformations. We describe two basic but fundamental HoCA operators, cube crawling and cube join (with many possible extensions). Cube crawling explores a region space and outputs a cube that maps regions to signal vectors. Cube join, in turn, is…
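To make the abstract's definitions concrete, here is a minimal Python sketch of a cube as a function from a (region, features) pair to a table, and of a crawl that maps each region to a signal vector. It is an illustration only and not HoCA's actual API: the names abstract_cube, region_space, crawl, and revenue_signals, the toy data, and the choice of (row count, total revenue) as the signal vector are all assumptions made for this example.

```python
from itertools import combinations

# Toy fact table; the columns and values are hypothetical.
ROWS = [
    {"country": "US", "device": "mobile", "revenue": 10.0},
    {"country": "US", "device": "desktop", "revenue": 20.0},
    {"country": "CA", "device": "mobile", "revenue": 5.0},
    {"country": "CA", "device": "desktop", "revenue": 7.0},
]
DIMENSIONS = ("country", "device")


def abstract_cube(region, features):
    """Map a (region, features) pair to a table (a list of row dicts).

    region fixes values for a subset of the dimensions;
    features names the columns to keep in the resulting table.
    """
    return [
        {f: row[f] for f in features}
        for row in ROWS
        if all(row[d] == v for d, v in region.items())
    ]


def region_space(dimensions, rows):
    """Yield every region: each subset of dimensions paired with each
    combination of values observed for that subset in the data."""
    for k in range(len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            observed = sorted({tuple(row[d] for d in dims) for row in rows})
            for values in observed:
                yield dict(zip(dims, values))


def crawl(cube, regions, features, signal):
    """Visit each region and return a mapping from region to signal vector."""
    result = {}
    for region in regions:
        key = tuple(sorted(region.items()))  # hashable form of the region
        result[key] = signal(cube(region, features))
    return result


def revenue_signals(table):
    """Assumed signal vector: (row count, total revenue)."""
    values = [row["revenue"] for row in table]
    return (len(values), sum(values))


signals = crawl(
    abstract_cube,
    region_space(DIMENSIONS, ROWS),
    ("revenue",),
    revenue_signals,
)
for region, vector in signals.items():
    print(region, vector)
```

In this sketch the empty region stands for the whole table, and a user-defined notion of importance would be a ranking or filter applied to the resulting signal vectors.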
Taxonomy
Topics: Software System Performance and Reliability · Web Data Mining and Analysis · Advanced Database Systems and Queries
