Fighting Sampling Bias: A Framework for Training and Evaluating Credit Scoring Models
Nikita Kozodoi, Stefan Lessmann, Morteza Alamgir, Luis Moreira-Matias, Konstantinos Papakonstantinou

TL;DR
This paper introduces bias-aware methods for training and evaluating credit scoring models to address sampling bias, demonstrating improved predictive performance and profitability through extensive experiments and a Bayesian evaluation framework.
Contribution
It proposes a reject inference framework for bias-aware self-learning and a Bayesian evaluation method to better estimate scorecard performance under sampling bias.
Findings
Bias-aware self-learning improves predictive accuracy.
Bayesian evaluation provides more reliable performance estimates.
Potential profit increase of about 8% using Bayesian evaluation.
Abstract
Scoring models support decision-making in financial institutions. Their estimation and evaluation are based on the data of previously accepted applicants with known repayment behavior. This creates sampling bias: the available labeled data offers a partial picture of the distribution of candidate borrowers, which the model is supposed to score. The paper addresses the adverse effect of sampling bias on model training and evaluation. To improve scorecard training, we propose bias-aware self-learning - a reject inference framework that augments the biased training data by inferring labels for selected rejected applications. For scorecard evaluation, we propose a Bayesian framework that extends standard accuracy measures to the biased setting and provides a reliable estimate of future scorecard performance. Extensive experiments on synthetic and real-world data confirm the superiority of…
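The augmentation idea in the abstract (inferring labels for selected rejected applications and adding them to the training data) can be illustrated with a minimal sketch of plain self-training with confidence thresholds. This is not the paper's bias-aware procedure, whose details are not given on this page; it is a generic baseline. The function names, the thresholds `low` and `high`, the hand-written logistic regression, and the synthetic one-feature data are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # Probability of default (label 1); the bias term is stored last in w.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])

def fit_logistic(X, y, epochs=300, lr=0.1):
    # Plain batch gradient descent on the logistic loss (illustrative only).
    n_features = len(X[0])
    w = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def self_learning(X_acc, y_acc, X_rej, low=0.1, high=0.9, max_rounds=5):
    # Hypothetical helper: generic self-training, NOT the paper's algorithm.
    X_train, y_train = list(X_acc), list(y_acc)
    pool = list(X_rej)
    n_added = 0
    for _ in range(max_rounds):
        w = fit_logistic(X_train, y_train)
        remaining, added_this_round = [], 0
        for x in pool:
            p = predict_proba(w, x)
            if p >= high:      # confidently predicted bad: pseudo-label 1
                X_train.append(x)
                y_train.append(1)
                added_this_round += 1
            elif p <= low:     # confidently predicted good: pseudo-label 0
                X_train.append(x)
                y_train.append(0)
                added_this_round += 1
            else:              # uncertain rejects stay unlabeled
                remaining.append(x)
        pool = remaining
        n_added += added_this_round
        if added_this_round == 0:
            break
    # Final fit on accepts plus all pseudo-labeled rejects.
    return fit_logistic(X_train, y_train), n_added

# Synthetic data (assumption): one risk feature; higher values default more often.
random.seed(0)
applicants = [[random.gauss(0.0, 1.0)] for _ in range(200)]
labels = [1 if random.random() < sigmoid(2.0 * x[0]) else 0 for x in applicants]

# Past acceptance policy: only low-risk applicants were accepted and labeled.
X_acc = [x for x in applicants if x[0] < 0.5]
y_acc = [y for x, y in zip(applicants, labels) if x[0] < 0.5]
X_rej = [x for x in applicants if x[0] >= 0.5]  # true labels treated as unobserved

w, n_added = self_learning(X_acc, y_acc, X_rej)
print("pseudo-labeled rejects:", n_added, "of", len(X_rej))
```

The thresholds control how many rejects are pseudo-labeled: looser values add more data but risk reinforcing the model's own errors, which is one reason a plain baseline like this may need additional safeguards under sampling bias.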
Taxonomy
Topics: Credit Risk and Financial Regulations
Methods: Self-Learning
