# Explaining Aggregates for Exploratory Analytics

**Authors:** Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou

arXiv: 1812.11346 · 2020-03-13

## TL;DR

XAXA introduces a novel explanation mechanism for aggregate queries in exploratory data analysis, enabling understanding of data subspaces through learned parametric functions without additional database access.

## Contribution

The paper presents XAXA, a method that explains aggregate query results with parametric piecewise-linear functions learned online from observed queries and their answers, improving interpretability in exploratory analytics.

## Key findings

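- XAXA explains an aggregate query with a set of parametric piecewise-linear functions, obtained from a three-fold joint optimization problem and a statistical learning model.
- Model training only monitors aggregate queries and their answers online; explanations for future queries are then computed without any database access.
- Explanation accuracy and efficiency are evaluated with theoretically grounded metrics over real-world and synthetic datasets and query workloads.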

## Abstract

Analysts wishing to explore multivariate data spaces typically pose queries involving selection operators, i.e., range or radius queries, which define data subspaces of possible interest, and then use aggregation functions, the results of which determine their exploratory analytics interests. However, such aggregate query (AQ) results are simple scalars and, as such, convey limited information about the queried subspaces for exploratory analysis. We address this shortcoming, aiding analysts to explore and understand data subspaces, by contributing a novel explanation mechanism coined XAXA: eXplaining Aggregates for eXploratory Analytics. XAXA's novel AQ explanations are represented using functions obtained by a three-fold joint optimization problem. Explanations assume the form of a set of parametric piecewise-linear functions acquired through a statistical learning model. A key feature of the proposed solution is that model training is performed by only monitoring AQs and their answers on-line. In XAXA, explanations for future AQs can be computed without any database (DB) access and can be used to further explore the queried data subspaces, without issuing any more queries to the DB. We evaluate the explanation accuracy and efficiency of XAXA through theoretically grounded metrics over real-world and synthetic datasets and query workloads.
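
To make the idea of a piecewise-linear explanation concrete, the following is a minimal Python sketch, not the paper's method. It assumes a single fixed query center, a COUNT aggregate over radius queries, hand-picked knot positions, and an ordinary least-squares fit on a hinge basis; XAXA's three-fold joint optimization and its online learning model are not reproduced here. The names `radius_count`, `hinge_features`, `fit_explanation`, and `explain` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def radius_count(data, center, radius):
    """Ground-truth AQ: COUNT of points within `radius` of `center` (touches the data)."""
    return int(np.sum(np.linalg.norm(data - center, axis=1) <= radius))

def hinge_features(radii, knots):
    """Basis [1, r, max(0, r - k_1), ...] for a continuous piecewise-linear function of r."""
    radii = np.asarray(radii, dtype=float)
    cols = [np.ones_like(radii), radii]
    cols += [np.maximum(0.0, radii - k) for k in knots]
    return np.column_stack(cols)

def fit_explanation(radii, answers, knots):
    """Least-squares fit of a piecewise-linear function to logged (radius, answer) pairs."""
    X = hinge_features(radii, knots)
    coef, *_ = np.linalg.lstsq(X, np.asarray(answers, dtype=float), rcond=None)
    return coef

def explain(coef, knots, radii):
    """Evaluate the fitted explanation at new radii; no access to the data is needed."""
    return hinge_features(radii, knots) @ coef

# Synthetic "database" and a log of past radius queries around one center (assumptions).
rng = np.random.default_rng(0)
data = rng.normal(size=(5000, 2))
center = np.zeros(2)
logged_radii = rng.uniform(0.1, 2.0, size=200)
logged_answers = [radius_count(data, center, r) for r in logged_radii]

# Train only on the observed queries and their answers.
knots = np.linspace(0.5, 1.5, 3)  # hand-picked breakpoints, an assumption
coef = fit_explanation(logged_radii, logged_answers, knots)

# Explore how the aggregate varies with the radius using the model alone.
new_radii = np.linspace(0.2, 1.9, 5)
predicted = explain(coef, knots, new_radii)
actual = np.array([radius_count(data, center, r) for r in new_radii])
print("mean absolute error:", np.mean(np.abs(predicted - actual)))
```

The fitted function plays the role of an explanation in this sketch: instead of a single scalar, the analyst gets a curve describing how the answer changes as the radius of the queried subspace grows, and can evaluate it for unseen radii without issuing further queries. Only the final comparison against `radius_count` touches the data, and it is there purely to check the sketch.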

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.11346/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1812.11346/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1812.11346/full.md

---
Source: https://tomesphere.com/paper/1812.11346