Introducing explainable supervised machine learning into interactive feedback loops for statistical production systems
Carlos Mougan, George Kanellos, Johannes Micheler, Jose Martinez, Thomas Gottron

TL;DR
This paper integrates explainable supervised machine learning into an interactive feedback loop within a statistical production system to enhance data quality assurance by prioritizing exceptions and reducing user workload.
Contribution
It introduces a novel application of explainable machine learning to optimize exception handling in statistical data quality processes, addressing the challenge of limited labeled data.
Findings
Improved exception prioritization reduces user intervention time.
Enhanced data quality through targeted exception identification.
Developed an explainable AI taxonomy for system needs.
Abstract
Statistical production systems cover multiple steps from the collection, aggregation, and integration of data to tasks like data quality assurance and dissemination. While the context of data quality assurance is one of the most promising fields for applying machine learning, the lack of curated and labeled training data is often a limiting factor. The statistical production system for the Centralised Securities Database features an interactive feedback loop between data collected by the European Central Bank and data quality assurance performed by data quality managers at National Central Banks. The quality assurance feedback loop is based on a set of rule-based checks for raising exceptions, upon which the user either confirms the data or corrects an actual error. In this paper we use the information received from this feedback loop to optimize the exceptions presented to the…
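The feedback loop described in the abstract can be illustrated with a small sketch: past user decisions (confirm vs. correct) serve as labels, and open exceptions are ranked by how likely they are to be real errors. This is not the paper's model. It swaps in a deliberately simple per-rule smoothed error rate so that the score itself is the explanation. The rule names, the record layout, and the helper names `fit_error_rates` and `prioritize` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical feedback records: (rule_id, was_corrected).
# was_corrected=True  -> the user corrected an actual error.
# was_corrected=False -> the user confirmed the data as valid.
feedback = [
    ("price_jump", True),
    ("price_jump", True),
    ("price_jump", False),
    ("missing_issuer", False),
    ("missing_issuer", False),
    ("missing_issuer", False),
    ("amount_outlier", True),
]


def fit_error_rates(records, prior_errors=1.0, prior_total=2.0):
    """Estimate, per rule, the smoothed share of exceptions that were real errors."""
    errors = defaultdict(float)
    totals = defaultdict(float)
    for rule_id, was_corrected in records:
        totals[rule_id] += 1
        if was_corrected:
            errors[rule_id] += 1
    # Additive smoothing keeps rarely seen rules from getting extreme scores.
    return {
        rule: (errors[rule] + prior_errors) / (totals[rule] + prior_total)
        for rule in totals
    }


def prioritize(exceptions, rates, default=0.5):
    """Rank open exceptions so that likely errors come first.

    Each result is (score, exception_id, rule_id); rules without any
    feedback yet fall back to the neutral default score.
    """
    scored = [(rates.get(rule, default), exc_id, rule) for exc_id, rule in exceptions]
    return sorted(scored, key=lambda item: -item[0])


rates = fit_error_rates(feedback)
open_exceptions = [("E1", "missing_issuer"), ("E2", "price_jump"), ("E3", "new_rule")]
ranked = prioritize(open_exceptions, rates)

for score, exc_id, rule in ranked:
    print(f"{exc_id}: rule={rule}, estimated error probability={score:.2f}")
```

In a setting like the one the abstract describes, a ranking of this kind would let data quality managers look first at the exceptions most likely to need a correction. A real system would presumably use richer features than the rule identifier alone.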
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Quality and Management · Data Analysis with R
