LendNova: Towards Automated Credit Risk Assessment with Language Models
Kiarash Shamsi, Danijel Novokmet, Joshua Peters, Mao Lin Liu, Paul K Edwards, Vahab Khoshdel

TL;DR
LendNova introduces an automated credit risk assessment pipeline using language models to analyze raw credit records, reducing manual effort and improving scalability in financial risk evaluation.
Contribution
It presents the first practical end-to-end system leveraging NLP and language models for direct analysis of raw credit data, bypassing manual feature engineering.
Findings
Effective risk signal extraction from raw text
Reduced manual preprocessing costs
Potential for scalable, accurate credit risk assessment
Abstract
Credit risk assessment is essential in the financial sector, but has traditionally depended on costly feature-based models that often fail to utilize all available information in raw credit records. This paper introduces LendNova, the first practical automated end-to-end pipeline for credit risk assessment, which exploits that information by leveraging advanced NLP techniques and language models. LendNova transforms risk modeling by operating directly on raw, jargon-heavy credit bureau text, using a language model that learns task-relevant representations without manual feature engineering. By automatically capturing patterns and risk signals embedded in the text, it replaces manual preprocessing steps, reducing costs and improving scalability. Evaluation on real-world data further demonstrates its strong potential in accurate and efficient risk…
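To make the idea of scoring raw credit text without hand-built features concrete, here is a minimal sketch. It is not LendNova's method: it swaps the language model for a deliberately simple bag-of-tokens logistic regression trained with plain SGD, and the record strings, labels, and all function names are hypothetical, invented only for illustration. The only point it shares with the abstract is that the model sees raw text and learns its own weights per token, with no manually defined features such as "number of delinquencies".

```python
import math
import re

def tokenize(record: str) -> list[str]:
    """Split a raw record into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", record.lower())

def build_vocab(records: list[str]) -> dict[str, int]:
    """Assign an index to every token seen in the training records."""
    vocab: dict[str, int] = {}
    for record in records:
        for tok in tokenize(record):
            vocab.setdefault(tok, len(vocab))
    return vocab

def featurize(record: str, vocab: dict[str, int]) -> dict[int, float]:
    """Sparse token counts; tokens outside the vocabulary are ignored."""
    feats: dict[int, float] = {}
    for tok in tokenize(record):
        if tok in vocab:
            idx = vocab[tok]
            feats[idx] = feats.get(idx, 0.0) + 1.0
    return feats

def predict_proba(weights: list[float], bias: float, feats: dict[int, float]) -> float:
    """Logistic score in (0, 1); higher means riskier under the toy labels."""
    z = bias + sum(weights[i] * v for i, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(records, labels, epochs=200, lr=0.1):
    """Fit logistic regression on token counts with per-example SGD."""
    vocab = build_vocab(records)
    weights = [0.0] * len(vocab)
    bias = 0.0
    data = [(featurize(r, vocab), y) for r, y in zip(records, labels)]
    for _ in range(epochs):
        for feats, y in data:
            err = predict_proba(weights, bias, feats) - y  # gradient of log loss w.r.t. z
            bias -= lr * err
            for i, v in feats.items():
                weights[i] -= lr * err * v
    return vocab, weights, bias

# Hypothetical, made-up records in a bureau-like shorthand (1 = risky, 0 = not).
RECORDS = [
    "ACCT 01 REVOLVING STATUS CURRENT NO DELINQ 24M",
    "ACCT 02 INSTALLMENT PAID AS AGREED",
    "ACCT 03 REVOLVING 90DPD CHARGEOFF COLLECTION",
    "ACCT 04 INSTALLMENT 60DPD COLLECTION OPEN",
]
LABELS = [0, 0, 1, 1]

vocab, weights, bias = train(RECORDS, LABELS)
scores = [predict_proba(weights, bias, featurize(r, vocab)) for r in RECORDS]
```

A real system along the lines the abstract describes would replace `featurize` and the linear model with a language model that encodes the whole record, which is what would let it handle jargon and context that a token-count model cannot.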
Taxonomy
Topics: Financial Distress and Bankruptcy Prediction · Credit Risk and Financial Regulations · Machine Learning in Healthcare
