The effects of data preprocessing on probability of default model fairness
Di Wu

TL;DR
This paper examines how data preprocessing, especially Truncated SVD, influences the fairness and accuracy of probability of default models in financial credit risk assessment.
Contribution
It provides an analysis of the impact of different preprocessing techniques, including SVD, on model fairness and performance in credit risk modeling.
Findings
Preprocessing with SVD can improve model fairness.
Certain preprocessing techniques enhance model accuracy.
Data preprocessing affects bias and discrimination in predictions.
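The summary does not say which fairness metric the paper uses. As a minimal sketch of how bias in default predictions can be quantified, the example below computes the demographic parity difference, assuming binary predictions (1 = predicted default) and a single protected attribute. The function names and the toy data are hypothetical and are not taken from the paper.

```python
def selection_rate(y_pred, group, value):
    """Share of applicants with `group == value` that are predicted to default."""
    flagged = [p for p, g in zip(y_pred, group) if g == value]
    return sum(flagged) / len(flagged)


def demographic_parity_difference(y_pred, group):
    """Largest gap in predicted-default rate between any two groups (0 = parity)."""
    rates = [selection_rate(y_pred, group, v) for v in set(group)]
    return max(rates) - min(rates)


# Toy data (hypothetical): predictions for two demographic groups "a" and "b".
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_difference(y_pred, group)
print(f"demographic parity difference: {gap:.2f}")
```

A value near zero means the model flags both groups at similar rates. Comparing this gap before and after a preprocessing step is one way to check whether that step changed the model's fairness.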
Abstract
In the context of financial credit risk evaluation, the fairness of machine learning models has become a critical concern, especially given the potential for biased predictions that disproportionately affect certain demographic groups. This study investigates the impact of data preprocessing, with a specific focus on Truncated Singular Value Decomposition (SVD), on the fairness and performance of probability of default models. Various preprocessing techniques, including SVD, were applied to a comprehensive dataset sourced from Kaggle to assess their effect on model accuracy, discriminatory power, and fairness.
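To illustrate the preprocessing step named in the abstract, here is a minimal sketch of Truncated SVD as a dimensionality reduction applied to a feature matrix. It uses NumPy on synthetic data, not the Kaggle dataset, and the helper names, the matrix shape, and the choice of k = 3 are assumptions made for illustration. The paper's actual implementation is not described in this summary.

```python
import numpy as np


def truncated_svd_fit(X, k):
    """Return the top-k right singular vectors of X (no mean-centering)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k]


def truncated_svd_transform(X, components):
    """Project the rows of X onto the retained singular directions."""
    return X @ components.T


# Synthetic stand-in for a credit feature matrix: 200 applicants, 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

components = truncated_svd_fit(X, k=3)
X_reduced = truncated_svd_transform(X, components)

# Relative reconstruction error shows how much information the truncation drops.
X_approx = X_reduced @ components
rel_error = np.linalg.norm(X - X_approx) / np.linalg.norm(X)
print(X_reduced.shape, round(float(rel_error), 3))
```

In a modeling workflow the components would normally be fitted on the training split only and then reused to transform the test split, so that no test information leaks into the preprocessing. The reduced matrix then replaces the raw features as input to the probability of default model.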
