
TL;DR
This paper introduces Dimensional Data Analysis (DDA), a technique for quickly understanding the structure and anomalies of large datasets. It leverages existing schemas and adds minimal overhead to big data systems.
Contribution
The paper presents DDA, a novel method that efficiently analyzes big data structures and anomalies using existing schemas, reducing human effort and computational overhead.
Findings
DDA effectively identifies data structure and anomalies.
DDA has low overhead and integrates with existing systems.
Performance measurements show DDA's efficiency on various datasets.
Abstract
The ability to collect and analyze large amounts of data is a growing problem within the scientific community. The growing gap between data and users calls for innovative tools that address the challenges posed by big data volume, velocity and variety. One of the main challenges associated with big data variety is automatically understanding the underlying structures and patterns of the data. Such an understanding is a prerequisite to the application of advanced analytics to the data. Further, big data sets often contain anomalies and errors that are difficult to know a priori. Current approaches to understanding data structure are drawn from traditional database ontology design. These approaches are effective, but often require too much human involvement to keep up with the volume, velocity and variety of data encountered by big data systems. Dimensional Data…
