Prioritizing Technical Debt in Database Normalization Using Portfolio Theory and Data Quality Metrics
Mashel Albarak, Rami Bahsoon

TL;DR
This paper introduces a pragmatic approach to prioritize database normalization efforts by applying Portfolio Theory and data quality metrics to mitigate technical debt and enhance database performance and integrity.
Contribution
It proposes a novel method combining Portfolio Theory with data quality and performance metrics to prioritize normalization of database tables, focusing on cost-effective debt reduction.
Findings
Effective reduction of normalization debt demonstrated in case study
Improved data quality and performance metrics after applying the method
Prioritization based on I/O cost and data inconsistency risk enhances normalization strategy
Abstract
Database normalization is one of the main principles for designing relational databases. The benefits of normalization can be observed through improved data quality and performance, among other qualities. We explore a new context of technical debt manifestation, which is linked to ill-normalized databases. This debt can have a long-term impact, causing systematic degradation of database qualities. Such degradation can be likened to accumulated interest on a debt. We claim that debts are likely to materialize for tables below the fourth normal form. Practically, achieving fourth normal form for all the tables in the database is a costly and idealistic exercise. Therefore, we propose a pragmatic approach to prioritize tables that should be normalized to the fourth normal form based on the metaphoric debt and interest of the ill-normalized tables, observed on data quality and performance.…
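To make the portfolio framing concrete, the sketch below computes the two standard Markowitz quantities (weighted expected value and portfolio risk) over a set of tables treated as assets. It is only an illustration under stated assumptions: the table names, I/O costs, risk figures, the zero-correlation matrix, and the `rank_tables` heuristic (cost multiplied by risk) are all hypothetical and are not taken from the paper, whose exact formulation is not given in this summary.

```python
from math import sqrt


def portfolio_expected_value(weights, values):
    """Weighted sum of per-table values (standard portfolio expected return)."""
    return sum(w * v for w, v in zip(weights, values))


def portfolio_risk(weights, risks, corr):
    """Portfolio standard deviation: sqrt(sum_i sum_j w_i w_j s_i s_j rho_ij)."""
    n = len(weights)
    variance = sum(
        weights[i] * weights[j] * risks[i] * risks[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return sqrt(variance)


def rank_tables(names, io_costs, risks):
    """Hypothetical heuristic: order tables by I/O cost times inconsistency risk."""
    scored = [(name, cost * risk) for name, cost, risk in zip(names, io_costs, risks)]
    return [name for name, _ in sorted(scored, key=lambda item: item[1], reverse=True)]


# Hypothetical inputs: three ill-normalized tables with made-up measurements.
tables = ["orders", "customers", "invoices"]
io_cost = [120.0, 40.0, 80.0]   # assumed I/O cost per table (arbitrary units)
risk = [0.30, 0.10, 0.20]       # assumed data inconsistency risk per table

# Assumption: weight each table by its share of total I/O cost.
total_cost = sum(io_cost)
weights = [c / total_cost for c in io_cost]

# Assumption: table risks are uncorrelated (identity correlation matrix).
corr = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

print("expected I/O cost:", portfolio_expected_value(weights, io_cost))
print("portfolio risk:", portfolio_risk(weights, risk, corr))
print("suggested order:", rank_tables(tables, io_cost, risk))
```

With these made-up inputs, the heuristic places the table that is both costly to access and most exposed to inconsistency first; a real analysis would substitute measured values and a justified correlation structure.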
