Risk-Sensitive Machine Learning for Financial Decision Modeling Under Imbalanced Data: Evidence from Bank Telemarketing
Bowen Dong, Xinyu Zhang, Yang Liu, Tianhui Zhang, Xianchen Liu, Lingmin Hou, Lingyi Meng, Zhen Guo, Aliya Mulati

TL;DR
This paper combines synthetic minority oversampling with cost-sensitive learning to predict bank telemarketing subscriptions from severely imbalanced data.
Contribution
The study introduces a novel combination of synthetic oversampling and cost-sensitive learning for financial decision modeling.
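The two ingredients named here can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `smote_like` and `balanced_class_weights`, the toy points, and the toy labels are all hypothetical. It assumes purely numeric features and uses the common "balanced" weighting heuristic (total samples divided by number of classes times class count), which the summary does not confirm the paper uses.

```python
import math
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, n_classes = len(labels), len(counts)
    return {c: n / (n_classes * cnt) for c, cnt in counts.items()}


# Toy data (hypothetical): four minority points in two numeric features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6, k=2)

# Toy labels with roughly the 11.3% positive rate quoted in the abstract.
labels = [1] * 113 + [0] * 887
weights = balanced_class_weights(labels)
print(new_points[:2], weights)
```

Oversampling changes the training data, while class weights change the loss; the sketch keeps them as separate functions so either can be applied alone or together.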
Findings
Ensemble models, particularly CatBoost, XGBoost, and LightGBM, outperformed traditional baselines in predicting telemarketing outcomes.
The best model achieved an F1-score of 0.540 and a recall of 0.812 for the positive class.
SHAP analysis identified campaign duration and macroeconomic indicators as key predictors.
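The reported F1-score and recall together pin down the precision, since F1 is the harmonic mean of the two. The short calculation below assumes both figures describe the same model on the positive class and treats the rounded values as exact, so the result is only approximate; `implied_precision` is a hypothetical helper name.

```python
def implied_precision(f1, recall):
    """Solve F1 = 2 * P * R / (P + R) for the precision P."""
    return f1 * recall / (2 * recall - f1)


precision = implied_precision(f1=0.540, recall=0.812)
print(f"implied precision = {precision:.3f}")
```

Under those assumptions the implied precision is roughly 0.40, meaning the high recall is bought with a substantial number of false positives, which is the usual trade-off when oversampling and class weighting push a model toward the minority class.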
Abstract
Bank telemarketing campaigns often experience low subscription rates due to customer heterogeneity and severe class imbalance, which pose challenges for reliable predictive modeling. This study investigates a data-driven approach that integrates synthetic minority oversampling and cost-sensitive learning to improve the prediction of telemarketing outcomes. Experiments are conducted using the Portuguese Bank Marketing dataset, comprising 41,188 instances with a positive response rate of 11.3%. Eight machine learning models are evaluated under a unified preprocessing pipeline and five-fold stratified cross-validation, including Logistic Regression, Decision Tree, Random Forest, and Ensemble methods. The results show that Ensemble models, particularly CatBoost, XGBoost, and LightGBM, achieve improved performance compared with traditional baselines, with notable gains in minority-class…
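The five-fold stratified cross-validation mentioned in the abstract can be sketched as follows. This is an illustrative, standard-library-only version, not the authors' pipeline: `stratified_folds` and the toy labels are hypothetical, and the comment about resampling only the training folds reflects common practice for avoiding leakage rather than anything the abstract states.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that each keep roughly the
    same class proportions as the full label list."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin across the folds.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


# Toy labels with roughly the 11.3% positive rate quoted in the abstract.
labels = [1] * 113 + [0] * 887
folds = stratified_folds(labels, k=5)
pos_counts = [sum(labels[i] for i in fold) for fold in folds]

for held_out, test_idx in enumerate(folds):
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # Assumption: any oversampling or class weighting would be fit on
    # train_idx only, so test_idx keeps the original class imbalance.
    print(held_out, len(train_idx), len(test_idx), pos_counts[held_out])
```

Keeping the held-out fold at its natural class ratio matters because metrics such as F1 and recall are only meaningful when measured on data distributed like the real campaign.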
Taxonomy
Topics: Imbalanced Data Classification Techniques · Financial Distress and Bankruptcy Prediction · Customer churn and segmentation
