AIDE: An Automated Sample-based Approach for Interactive Data Exploration

Kyriaki Dimitriadou; Olga Papaemmanouil; Yanlei Diao

arXiv:1510.08897 · cs.DB · November 2, 2015 · 1 citation

TL;DR

AIDE is an automated framework that steers users through complex datasets by strategically sampling the data, collecting relevance feedback on those samples, and predicting the patterns the user is interested in, which reduces both user effort and wait time during exploration.

Contribution

The paper introduces AIDE, an automated data exploration system that combines classification algorithms with data management optimization techniques to discover interesting data patterns with minimal user input.

Findings

01. High accuracy in predicting common conjunctive queries

02. Effective prediction of complex disjunctive queries with limited samples

03. Interactive performance with user wait times under a few seconds

Abstract

In this paper, we argue that database systems be augmented with an automated data exploration service that methodically steers users through the data in a meaningful way. Such an automated system is crucial for deriving insights from complex datasets found in many big data applications, such as scientific and healthcare applications, as well as for reducing the human effort of data exploration. Towards this end, we present AIDE, an Automatic Interactive Data Exploration framework that assists users in discovering new interesting data patterns and eliminates expensive ad-hoc exploratory queries. AIDE relies on a seamless integration of classification algorithms and data management optimization techniques that collectively strive to accurately learn the user's interests based on their relevance feedback on strategically collected samples. We present a number of exploration techniques as well…
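
The feedback loop described in the abstract can be illustrated with a toy sketch. The abstract states only that AIDE applies classification algorithms to relevance feedback on strategically collected samples, so the code below is not the paper's method. It assumes a two-attribute dataset, a simulated user (`is_relevant`), and a deliberately simple stand-in learner that predicts the tightest bounding box around the samples labeled relevant, which corresponds to a conjunctive range query. All function and parameter names (`explore`, `inside`, `grow`, `bounding_box`, `batch`, `margin`) are hypothetical.

```python
import random


def inside(p, box):
    """True if point p = (x, y) lies in box = (xlo, xhi, ylo, yhi)."""
    xlo, xhi, ylo, yhi = box
    return xlo <= p[0] <= xhi and ylo <= p[1] <= yhi


def grow(box, margin):
    """Expand a box by `margin` on every side."""
    xlo, xhi, ylo, yhi = box
    return (xlo - margin, xhi + margin, ylo - margin, yhi + margin)


def bounding_box(pts):
    """Tightest axis-aligned box around a non-empty list of points."""
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), max(xs), min(ys), max(ys))


def explore(points, is_relevant, rounds=10, batch=20, margin=5.0, seed=0):
    """Toy relevance-feedback loop (an assumption, not AIDE's algorithm).

    Returns the predicted box, or None if no relevant sample was found.
    """
    rng = random.Random(seed)
    pool = list(range(len(points)))  # indices of still-unlabeled points
    rng.shuffle(pool)
    relevant, box = [], None
    for _ in range(rounds):
        if box is None:
            # No relevant sample yet: show the user random points.
            picked = pool[:batch]
        else:
            # Prefer points just outside the current box, since their
            # labels decide whether the predicted boundary should move.
            ring = grow(box, margin)
            picked = [i for i in pool
                      if inside(points[i], ring)
                      and not inside(points[i], box)][:batch]
            if not picked:
                picked = pool[:batch]  # nothing nearby: fall back to random
        taken = set(picked)
        pool = [i for i in pool if i not in taken]
        # "Ask the user" for feedback on each picked sample.
        relevant += [points[i] for i in picked if is_relevant(points[i])]
        if relevant:
            box = bounding_box(relevant)
    return box


if __name__ == "__main__":
    data_rng = random.Random(42)
    points = [(data_rng.uniform(0, 100), data_rng.uniform(0, 100))
              for _ in range(2000)]
    target = (20.0, 40.0, 50.0, 70.0)  # hidden user interest (assumed)
    box = explore(points, lambda p: inside(p, target))
    if box is not None:
        print("x BETWEEN %.1f AND %.1f AND y BETWEEN %.1f AND %.1f" % box)
```

Because the box is built only from samples the simulated user marked relevant, it never extends beyond a rectangular target region, so this learner errs toward missing relevant tuples rather than returning irrelevant ones. It also cannot represent disjunctive interests, which is one reason a real system would need a more expressive classifier.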

Taxonomy

Topics: Data Stream Mining Techniques · Data Management and Algorithms · Advanced Database Systems and Queries