Corporate Financial Distress Prediction: Based on Multi-source Data and Feature Selection
Yi Ding, Chun Yan

TL;DR
This paper proposes a multi-source data integration and feature selection approach for financial distress prediction, demonstrating improved accuracy using a novel MRMR-SVM-RFE method on Chinese listed companies.
Contribution
It introduces a new indicator system combining internal management, external market, and online public opinion data, and applies MRMR-SVM-RFE feature selection to enhance prediction accuracy.
Findings
MRMR-SVM-RFE effectively extracts relevant features.
Selected features improve prediction accuracy.
The BP model outperforms the other classifiers tested (SVM and GBDT).
Abstract
The advent of the era of big data provides new ideas for financial distress prediction. To evaluate the financial status of listed companies more accurately, this study establishes a financial distress prediction indicator system based on multi-source data by integrating three data sources: the company's internal management, the external market, and online public opinion. To address the redundancy and dimensional explosion that arise when multi-source data are integrated, the study applies feature selection to the fused data and builds a financial distress prediction model based on maximum relevance and minimum redundancy and support vector machine recursive feature elimination (MRMR-SVM-RFE). To verify the effectiveness of the model, we used back propagation (BP), support vector machine (SVM), and gradient boosted decision tree (GBDT) classification algorithms, and conducted an empirical study on…
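To make the "maximum relevance and minimum redundancy" idea concrete, the following is a minimal sketch of a greedy MRMR ranking, not the authors' implementation. It assumes absolute Pearson correlation as a stand-in for the relevance and redundancy measures (the abstract does not say which measure the paper uses), and it omits the SVM-RFE stage entirely. The function names, feature names, and toy values are all hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_rank(columns, target, k):
    """Greedily pick k feature names, scoring each candidate as
    relevance to the target minus mean redundancy with features already picked.

    Assumption: |Pearson r| is used for both terms purely for illustration.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = []
    remaining = list(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = mean(abs(pearson(columns[name], columns[s])) for s in selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 = distressed, 0 = healthy.
distressed = [0, 0, 0, 1, 1, 1]
features = {
    "debt_ratio": [0.25, 0.30, 0.30, 0.70, 0.75, 0.80],  # internal indicator
    "leverage": [0.20, 0.30, 0.25, 0.70, 0.80, 0.75],    # near-duplicate of debt_ratio
    "sentiment": [0.60, 0.10, 0.30, 0.90, 0.50, 0.40],   # online-opinion indicator
}

top_two = mrmr_rank(features, distressed, k=2)
print(top_two)
```

On this toy data the near-duplicate `leverage` column is passed over in favour of `sentiment`, even though `leverage` is individually more correlated with the label. That is the behaviour the redundancy penalty is meant to produce when fused data sources carry overlapping indicators.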
Taxonomy
Topics: Financial Distress and Bankruptcy Prediction · AI and HR Technologies · Imbalanced Data Classification Techniques
