Reptile: Aggregation-level Explanations for Hierarchical Data
Zezhou Huang, Eugene Wu

TL;DR
Reptile is a system that explains anomalies in hierarchical data by recommending attributes to drill down, using a multi-level model that leverages data hierarchy to identify and resolve data errors efficiently.
Contribution
Reptile introduces a hierarchical explanation system with a multi-level model and optimizations for identifying data errors and anomalies in hierarchical datasets.
Findings
Reduces runtime by more than 6x compared to the baseline.
Correctly identifies 21 out of 30 data errors in COVID-19 data.
Successfully resolves 20 out of 22 user-reported complaints.
Abstract
Recent query explanation systems help users understand anomalies in aggregation results by proposing predicates that describe input records that, if deleted, would resolve the anomalies. However, it can be difficult for users to understand how a predicate was chosen, and these approaches are limited to errors that can be resolved through deletion. In contrast, data errors may be due to group-wise errors, such as missing records or systematic value errors. This paper presents Reptile, an explanation system for hierarchical data. Given an anomalous aggregate query result, Reptile recommends the next drill-down attribute, and ranks the drill-down groups by the extent to which repairing each group's statistics to their expected values resolves the anomaly. Reptile efficiently trains a multi-level model that leverages the data's hierarchy to estimate the expected values, and uses a factorised…
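The ranking idea in the abstract can be illustrated with a minimal sketch. It is not Reptile's implementation: it assumes a simple SUM aggregate, takes each group's expected value as a given input (Reptile estimates these with a multi-level model, which is not reproduced here), and uses hypothetical names such as `rank_groups_by_repair` and `target_total`. Each group is scored by how much the gap between the aggregate and the value the user expected shrinks when that group alone is repaired.

```python
from typing import Dict, List, Tuple


def rank_groups_by_repair(
    observed: Dict[str, float],
    expected: Dict[str, float],
    target_total: float,
) -> List[Tuple[str, float]]:
    """Rank drill-down groups for a SUM aggregate (illustrative only).

    observed:     per-group statistic actually present in the data
    expected:     per-group statistic assumed to be correct (hypothetical
                  input; Reptile would estimate it with a model)
    target_total: the aggregate value the user expected to see
    """
    total = sum(observed.values())
    baseline_gap = abs(total - target_total)

    scores = []
    for group, value in observed.items():
        # Repair only this group: swap its observed value for the expected one.
        repaired_total = total - value + expected[group]
        gap_after = abs(repaired_total - target_total)
        # Positive score: the repair moves the aggregate toward the target.
        scores.append((group, baseline_gap - gap_after))

    # Groups whose repair resolves most of the anomaly come first.
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores


# Made-up numbers: group "B" looks like it is missing records.
observed = {"A": 100.0, "B": 20.0, "C": 95.0}
expected = {"A": 98.0, "B": 90.0, "C": 97.0}
ranking = rank_groups_by_repair(observed, expected, target_total=285.0)
print(ranking)
```

With these made-up inputs, repairing group "B" closes the whole gap, so it ranks first. A negative score means the repair would move the aggregate further from the target.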
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Data Stream Mining Techniques
