Next-Year Bankruptcy Prediction from Textual Data: Benchmark and Baselines

Henri Arno; Klaas Mulier; Joke Baeck; Thomas Demeester

arXiv:2208.11334·cs.CL·August 25, 2022

TL;DR

This paper establishes a benchmark dataset and evaluation framework for bankruptcy prediction using unstructured textual data, compares classical and neural models, and highlights the effectiveness of simple bag-of-words approaches.

Contribution

It introduces a standardized benchmark for textual bankruptcy prediction and evaluates baseline models, providing a foundation for future research in this area.

Findings

01

A lightweight bag-of-words model based on static in-domain word representations obtains surprisingly good results.

02

Results improve further when textual data from several years is taken into account.

03

Several classical and neural baseline models are evaluated on the benchmark, and the benefits and flaws of the different strategies are discussed.

Abstract

Models for bankruptcy prediction are useful in several real-world scenarios, and multiple research contributions have been devoted to the task, based on structured (numerical) as well as unstructured (textual) data. However, the lack of a common benchmark dataset and evaluation strategy impedes the objective comparison between models. This paper introduces such a benchmark for the unstructured data scenario, based on novel and established datasets, in order to stimulate further research into the task. We describe and evaluate several classical and neural baseline models, and discuss benefits and flaws of different strategies. In particular, we find that a lightweight bag-of-words model based on static in-domain word representations obtains surprisingly good results, especially when taking textual data from several years into account. These results are critically assessed, and discussed…
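
To make the bag-of-words idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written embedding table (`TOY_VECTORS`), hand-picked classifier weights, and that several years of text are combined by simple averaging; all three are illustrative assumptions, and the function names are hypothetical. In practice the vectors would be static word representations trained on in-domain text and the weights would be learned from labeled data.

```python
import math

# Hypothetical toy embedding table (assumption): stands in for static
# word vectors trained on in-domain financial text.
TOY_VECTORS = {
    "loss": [1.0, 0.0],
    "default": [0.9, 0.1],
    "growth": [0.0, 1.0],
    "profit": [0.1, 0.9],
}


def embed_document(text, vectors=TOY_VECTORS):
    """Average the vectors of all known tokens; word order is ignored."""
    dim = len(next(iter(vectors.values())))
    known = [vectors[t] for t in text.lower().split() if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def embed_company(yearly_texts, vectors=TOY_VECTORS):
    """Combine several years of text by averaging the per-year vectors.

    Averaging across years is an assumption made for this sketch.
    """
    if not yearly_texts:
        raise ValueError("need at least one year of text")
    docs = [embed_document(t, vectors) for t in yearly_texts]
    dim = len(docs[0])
    return [sum(d[i] for d in docs) / len(docs) for i in range(dim)]


def bankruptcy_score(features, weights, bias=0.0):
    """Logistic score in (0, 1); higher means more bankruptcy-like."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hand-picked weights (assumption); a real model would fit these.
WEIGHTS = [2.0, -2.0]

distressed = embed_company(["loss default", "loss"])
healthy = embed_company(["growth profit"])
print(bankruptcy_score(distressed, WEIGHTS) > bankruptcy_score(healthy, WEIGHTS))
```

The point of the sketch is only the shape of the pipeline: each text becomes an order-free average of word vectors, multiple years are pooled into one feature vector, and a linear classifier scores the result.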

Taxonomy

Topics: Financial Distress and Bankruptcy Prediction · Stock Market Forecasting Methods