Capsule Network-Based Multimodal Fusion for Mortgage Risk Assessment from Unstructured Data Sources
Mahsa Tavakoli, Rohitash Chandra, Cristian Bravo

TL;DR
This paper introduces a novel multimodal deep learning framework using unstructured data sources and capsule network-based fusion to improve mortgage risk assessment accuracy and interpretability.
Contribution
The study presents a new capsule network-inspired fusion strategy tailored for multimodal data integration in mortgage risk prediction.
Findings
The multimodal framework outperforms unimodal models in predictive accuracy.
The capsule-based fusion surpasses traditional fusion methods such as addition and concatenation.
GradCAM visualizations enhance the interpretability of the predictions.
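For readers unfamiliar with the two baseline fusion methods named in the findings, the sketch below illustrates them on toy vectors. It is only an illustration under stated assumptions: the function names, the three-dimensional embeddings, and their values are hypothetical and do not come from the paper, and the capsule-based fusion itself is not reproduced here.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_by_addition(embeddings: Sequence[Vector]) -> Vector:
    """Element-wise sum of modality embeddings (all must have equal length)."""
    dims = {len(e) for e in embeddings}
    if len(dims) != 1:
        raise ValueError("addition fusion needs embeddings of equal length")
    return [sum(values) for values in zip(*embeddings)]


def fuse_by_concatenation(embeddings: Sequence[Vector]) -> Vector:
    """Join modality embeddings end to end; lengths may differ."""
    fused: Vector = []
    for embedding in embeddings:
        fused.extend(embedding)
    return fused


# Hypothetical per-modality embeddings for one mortgage record.
text_emb = [0.5, -1.0, 2.0]       # e.g. from a text encoder
image_emb = [1.5, 0.0, -2.0]      # e.g. from an image encoder
sentiment_emb = [0.0, 1.0, 1.0]   # e.g. from a sentiment model

added = fuse_by_addition([text_emb, image_emb, sentiment_emb])
concatenated = fuse_by_concatenation([text_emb, image_emb, sentiment_emb])

print(added)              # [2.0, 0.0, 1.0]
print(len(concatenated))  # 9
```

Addition keeps the fused vector the same size as each input but requires every modality to share one dimensionality, while concatenation preserves each modality's features separately at the cost of a longer vector.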
Abstract
Mortgage risk assessment traditionally relies on structured financial data, which is often proprietary, confidential, and costly. In this study, we propose a novel multimodal deep learning framework that uses cost-free, publicly available, unstructured data sources, including textual information, images, and sentiment scores, to generate credit scores that approximate commercial scorecards. Our framework adopts a two-phase approach. In the unimodal phase, we identify the best-performing models for each modality, i.e. BERT for text, VGG for image data, and a multilayer perceptron for sentiment-based features. In the fusion phase, we introduce the capsule-based fusion network (FusionCapsNet), a novel fusion strategy inspired by capsule networks, but fundamentally redesigned for multimodal integration. Unlike standard capsule networks, our method adapts a specific mechanism in capsule…
