Are Foundation Models Useful for Bankruptcy Prediction?
Marcin Kostrzewa, Oleksii Furman, Roman Furman, Sebastian Tomczak, Maciej Zięba

TL;DR
This study systematically compares foundation models like Llama-3.3-70B-Instruct and TabPFN with classical machine learning methods for bankruptcy prediction, finding that traditional models outperform foundation models in accuracy and reliability.
Contribution
First systematic evaluation of foundation models versus classical methods for bankruptcy prediction using large, imbalanced datasets.
Findings
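Classical models such as XGBoost and CatBoost consistently outperform foundation models across all prediction horizons.
LLM-based approaches produce unreliable probability estimates, which limits their use in risk-sensitive financial settings.
TabPFN is competitive with simpler baselines, but its substantial computational cost is not matched by a performance benefit.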
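To make the finding about unreliable probability estimates concrete, the following is a minimal sketch of two common reliability checks, the Brier score and a binned expected calibration error. It is not taken from the paper: the function names, the bin count, and the toy data are all assumptions made for illustration, and the summary above does not say which calibration metrics the authors used.

```python
# Illustrative sketch only: hypothetical helper names and made-up data,
# not the paper's evaluation protocol.

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted mean gap between predicted and observed rates per bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))

    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += len(members) / n * abs(observed - predicted)
    return ece


# Toy imbalanced sample: 2 bankruptcies (label 1) among 20 firms.
y_true = [1, 0, 0, 0, 0, 0] + [1] + [0] * 13

# Scores that always match the 10% base rate.
base_rate_scores = [0.1] * 20

# Coarse, overconfident scores: 0.8 for the first six firms, 0.2 for the rest.
coarse_scores = [0.8] * 6 + [0.2] * 14

for name, scores in [("base rate", base_rate_scores), ("coarse", coarse_scores)]:
    print(name, brier_score(y_true, scores),
          expected_calibration_error(y_true, scores))
```

On this toy sample the coarse scores come out worse on both measures than the constant base-rate scores. A model whose scores do not track observed default frequencies can fail in this way even when it ranks firms reasonably well.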
Abstract
Foundation models have shown promise across various financial applications, yet their effectiveness for corporate bankruptcy prediction remains systematically unevaluated against established methods. We study bankruptcy forecasting using Llama-3.3-70B-Instruct and TabPFN, evaluated on large, highly imbalanced datasets of over one million company records from the Visegrád Group. We provide the first systematic comparison of foundation models against classical machine learning baselines for this task. Our results show that models such as XGBoost and CatBoost consistently outperform foundation models across all prediction horizons. LLM-based approaches suffer from unreliable probability estimates, undermining their use in risk-sensitive financial settings. TabPFN, while competitive with simpler baselines, requires substantial computational resources with costs not justified by…
Taxonomy
Topics: Financial Distress and Bankruptcy Prediction · Corporate Insolvency and Governance · Imbalanced Data Classification Techniques
