Multimodal Generative Models for Bankruptcy Prediction Using Textual Data
Rogelio A. Mancisidor, Kjersti Aas

TL;DR
This paper introduces a multimodal model that predicts bankruptcy using accounting, market, and textual data, effectively handling missing textual data by learning from complete samples and outperforming traditional models.
Contribution
The study proposes the CMMD model, enabling bankruptcy prediction with incomplete textual data by learning multimodal representations from complete samples.
Findings
The model achieves higher prediction accuracy than traditional classifiers.
It can generate textual information from non-textual data modalities.
The approach extends bankruptcy prediction to more companies by handling missing textual data.
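The idea behind these findings, training on samples that have every modality and then predicting when text is missing, can be illustrated with a toy sketch. This is not the CMMD architecture, which this summary does not specify. It is a simplified linear stand-in under stated assumptions: the data are synthetic, the joint representation is a PCA projection, and all names (`x_num`, `x_txt`, `predict_without_text`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def fit_logistic(z, y, steps=500, lr=0.1):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(z.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(z @ w + b)
        w -= lr * z.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic stand-ins (assumption): 4 non-text features (accounting/market)
# and 3 text-derived features that partly correlate with the non-text ones.
n = 400
x_num = rng.normal(size=(n, 4))
x_txt = 0.8 * x_num[:, :3] + 0.6 * rng.normal(size=(n, 3))
y = (x_num[:, 0] + x_txt[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(float)

# Training: every modality is available for each sample.
x_full = np.hstack([x_num, x_txt])
mean_full = x_full.mean(axis=0)

# Joint representation (assumption): top-k principal directions of the
# complete data, standing in for a learned multimodal embedding.
_, _, vt = np.linalg.svd(x_full - mean_full, full_matrices=False)
k = 3
z_full = (x_full - mean_full) @ vt[:k].T

# Learn a least-squares map from non-text features (plus intercept)
# to the joint representation, so text is not needed at test time.
x_num_aug = np.hstack([x_num, np.ones((n, 1))])
proj, *_ = np.linalg.lstsq(x_num_aug, z_full, rcond=None)

# Classifier trained on the joint representation.
w, b = fit_logistic(z_full, y)

def predict_without_text(x_num_new):
    """Bankruptcy probability from non-text features only."""
    aug = np.hstack([x_num_new, np.ones((len(x_num_new), 1))])
    z_hat = aug @ proj  # approximate joint representation
    return sigmoid(z_hat @ w + b)

probs = predict_without_text(rng.normal(size=(5, 4)))
print(probs.shape)
```

The design choice shown here is that the classifier never sees raw inputs, only the shared representation, so any company with non-text data can be scored once that representation can be approximated.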
Abstract
Textual data from financial filings, e.g., the Management's Discussion & Analysis (MDA) section in Form 10-K, has been used to improve the prediction accuracy of bankruptcy models. In practice, however, we cannot obtain the MDA section for all public companies, which limits the use of MDA data in traditional bankruptcy models, as they need complete data to make predictions. The two main reasons for the lack of MDA are: (i) not all companies are obliged to submit the MDA and (ii) technical problems arise when crawling and scraping the MDA section. To address this limitation, this research introduces the Conditional Multimodal Discriminative (CMMD) model, which learns multimodal representations that embed information from accounting, market, and textual data modalities. The CMMD model needs a sample with all data modalities for model training. At test time, the CMMD model only needs access…
Taxonomy
Topics: Financial Distress and Bankruptcy Prediction
Methods: Test
