Internal Data Imputation in Data Warehouse Dimensions

Yuzhao Yang (IRIT-SIG); Fatma Abdelhedi; Jérôme Darmont (ERIC); Franck Ravat (IRIT-SIG); Olivier Teste (IRIT-SIG)

arXiv:2110.01228 · cs.DB · October 5, 2021

Open Access

TL;DR

This paper introduces an internal data imputation method for the dimensions of multidimensional data warehouses. It fills in missing values from the data already in the warehouse, using intra-dimension and inter-dimension relationships, so that analysis runs on more complete data.

Contribution

It presents a novel imputation approach specifically designed for dimension tables in data warehouses, considering intra- and inter-dimension relationships.

Findings

01. Effective imputation of missing dimension data

02. Reduces time and effort compared to existing methods

03. Enhances data completeness for better analysis

Abstract

Missing values are common in multidimensional data warehouses. They can reduce the usefulness of the data, because analysis on a multidimensional data warehouse is carried out through dimensions with hierarchies, along which we can roll up or drill down to different parameters of analysis. It is therefore essential to complete these missing values in order to carry out better analysis. Existing data imputation methods that are suited to numeric data can be applied to fact tables but not to dimension tables. Other data imputation methods require extra time and effort. As a consequence, we propose in this article an internal data imputation method for multidimensional data warehouses, based on the existing data and taking into account intra-dimension and inter-dimension relationships.
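
To make the idea of using intra-dimension relationships concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the function name `impute_by_hierarchy`, the `city` to `country` hierarchy, and the sample rows are all hypothetical. It assumes a strict hierarchy in which each lower-level value rolls up to exactly one higher-level value, so a missing parent can be copied from another row of the same dimension that shares the child value.

```python
from collections import Counter, defaultdict


def impute_by_hierarchy(rows, child, parent):
    """Fill missing `parent` values from rows that share the same `child` value.

    Assumption (not from the paper): the hierarchy is strict, so each child
    value rolls up to exactly one parent value.
    """
    # Collect the parent values already observed for each child value.
    observed = defaultdict(Counter)
    for row in rows:
        if row.get(child) is not None and row.get(parent) is not None:
            observed[row[child]][row[parent]] += 1

    completed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        if new_row.get(parent) is None and new_row.get(child) is not None:
            candidates = observed.get(new_row[child])
            # Impute only when the existing data is unambiguous.
            if candidates and len(candidates) == 1:
                new_row[parent] = next(iter(candidates))
        completed.append(new_row)
    return completed


# Hypothetical customer dimension with a city -> country hierarchy.
customers = [
    {"id": 1, "city": "Toulouse", "country": "France"},
    {"id": 2, "city": "Toulouse", "country": None},
    {"id": 3, "city": "Lyon", "country": None},
]
completed = impute_by_hierarchy(customers, child="city", parent="country")
```

In this sketch, row 2 receives its country from row 1 because both share the same city, while row 3 stays incomplete because no other row in the dimension mentions Lyon. A value like that would need some other source of evidence, such as the inter-dimension relationships the abstract mentions, which this sketch does not cover.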

Taxonomy

Topics: Data Mining Algorithms and Applications · Data Quality and Management · Advanced Database Systems and Queries