The effects of data preprocessing on probability of default model fairness

Di Wu

arXiv:2408.15452 · econ.EM · August 29, 2024

TL;DR

This paper examines how data preprocessing, especially Truncated SVD, influences the fairness and accuracy of probability of default models in financial credit risk assessment.

Contribution

It provides an analysis of the impact of different preprocessing techniques, including SVD, on model fairness and performance in credit risk modeling.

Findings

01. Preprocessing with SVD can improve model fairness.

02. Certain preprocessing techniques enhance model accuracy.

03. Data preprocessing affects bias and discrimination in predictions.

Abstract

In the context of financial credit risk evaluation, the fairness of machine learning models has become a critical concern, especially given the potential for biased predictions that disproportionately affect certain demographic groups. This study investigates the impact of data preprocessing, with a specific focus on Truncated Singular Value Decomposition (SVD), on the fairness and performance of probability of default models. Using a comprehensive dataset sourced from Kaggle, various preprocessing techniques, including SVD, were applied to assess their effect on model accuracy, discriminatory power, and fairness.
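To make the preprocessing step concrete, the following is a minimal sketch of Truncated SVD applied to a feature matrix, followed by one simple group-fairness check. It is not the paper's pipeline: the data is synthetic, the function names `truncated_svd_features` and `demographic_parity_gap` are hypothetical, and the choice of demographic parity as the fairness measure is an assumption, since the abstract does not name the metrics used.

```python
import numpy as np


def truncated_svd_features(X, k):
    """Project X onto its top-k right singular vectors (no mean-centering)."""
    # Thin SVD: X = U @ diag(s) @ Vt, with singular values in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:k]  # shape (k, n_features)
    return X @ components.T, components, s


def demographic_parity_gap(y_pred, group):
    """Absolute difference in predicted-default rates between groups 0 and 1."""
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    rate_0 = y_pred[group == 0].mean()
    rate_1 = y_pred[group == 1].mean()
    return abs(rate_0 - rate_1)


# Synthetic stand-in for a borrower feature matrix (assumption, not the Kaggle data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Reduce 10 features to 3 latent components.
Z, components, s = truncated_svd_features(X, k=3)

# Rank-3 reconstruction of X from the reduced features.
X_approx = Z @ components

# Toy binary default predictions and a binary protected attribute (both made up).
y_pred = np.array([1, 0, 1, 1, 0, 0, 0, 1])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
gap = demographic_parity_gap(y_pred, group)
```

In a fuller experiment, the reduced matrix `Z` would be passed to a classifier in place of `X`, and accuracy, discriminatory power, and the fairness gap would be compared with and without the SVD step. Note that the projection mixes all input columns, so a protected attribute left in `X` can still influence every component.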
