Beyond Semantic Similarity: A Two-Phase Non-Parametric Retrieval Workflow for Corporate Credit Underwriting
Linus Ng Junjia, Ezekiel Tee Kongquan, Kelvin Heng, Kenneth Zhu Ke, Zhao Jing Yuan

TL;DR
This paper introduces a two-phase retrieval system for corporate credit analysis that improves the relevance of information extracted from complex financial documents, reducing document review time from hours to about three minutes in deployment.
Contribution
It presents a novel non-parametric, utility-focused retrieval architecture that separates candidate retrieval from utility ranking, tailored for multilingual, document-heavy financial analysis.
Findings
Outperforms naive retrieval baselines on proprietary multilingual financial documents.
Reduces document review time from hours to about three minutes in real-world deployment.
Enhances decision-support workflows with a utility-aware retrieval approach.
Abstract
Corporate credit underwriting requires analysts to extract actionable evidence from long, heterogeneous financial documents spanning hundreds of pages and multiple languages. Standard Retrieval-Augmented Generation (RAG) pipelines optimize for semantic similarity, which frequently surfaces passages that are topically related but lack decision utility, a problem we term the similarity-utility gap. We propose a two-phase non-parametric retrieval architecture that separates high-recall candidate retrieval from high-precision utility ranking. The first phase combines lexical and dense multilingual retrieval to construct a broad candidate pool. The second phase applies an adaptive retrieval controller that filters candidates using query intent and document structure signals, followed by an LLM-as-a-Judge utility scoring mechanism that ranks passages by analytical usefulness rather than…
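The two-phase split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Passage`, `lexical_score`, `dense_score`, `phase_one`, and `phase_two` are hypothetical, the two scorers are toy stand-ins for a real lexical retriever and a multilingual embedding model, the section filter is a simplified proxy for the adaptive retrieval controller, and `judge` is a placeholder callable where an LLM-as-a-Judge call would go.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Passage:
    text: str
    section: str  # document-structure signal (assumed label, e.g. "financials")


def lexical_score(query: str, passage: Passage) -> float:
    """Toy lexical stand-in: fraction of query tokens present in the passage."""
    q = set(query.lower().split())
    p = set(passage.text.lower().split())
    return len(q & p) / len(q) if q else 0.0


def dense_score(query: str, passage: Passage) -> float:
    """Toy dense stand-in: cosine similarity over character-trigram counts."""
    def grams(s: str) -> Counter:
        s = s.lower()
        return Counter(s[i:i + 3] for i in range(len(s) - 2))

    a, b = grams(query), grams(passage.text)
    dot = sum(a[g] * b[g] for g in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def phase_one(query: str, corpus: List[Passage], k: int) -> List[Passage]:
    """High-recall pool: union of the top-k passages under each scorer."""
    pool: List[Passage] = []
    for scorer in (lexical_score, dense_score):
        ranked = sorted(corpus, key=lambda p: scorer(query, p), reverse=True)
        for p in ranked[:k]:
            if p not in pool:
                pool.append(p)
    return pool


def phase_two(query: str, pool: List[Passage], allowed_sections: Set[str],
              judge: Callable[[str, Passage], float], top_n: int) -> List[Passage]:
    """High-precision step: filter on structure, then rank by judged utility."""
    kept = [p for p in pool if p.section in allowed_sections]
    return sorted(kept, key=lambda p: judge(query, p), reverse=True)[:top_n]
```

In this sketch, similarity only decides which passages enter the candidate pool; the final ordering comes entirely from the `judge` score, which is the separation of candidate retrieval from utility ranking that the abstract describes.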
