AI4Contracts: LLM & RAG-Powered Encoding of Financial Derivative Contracts
Maruf Ahmed Mridul, Ian Sloyan, Aparna Gupta, Oshani Seneviratne

TL;DR
This paper presents CDMizer, a novel framework combining LLMs and RAG for structured extraction and validation of financial contracts, enhancing scalability and accuracy in document understanding.
Contribution
Introduction of CDMizer, a template-driven, hierarchical, LLM- and RAG-based system for structured text transformation and validation of financial derivatives contracts.
Findings
Effective transformation of OTC contracts into CDM schema
Improved scalability and schema adherence in structured extraction
Validated framework with an LLM-powered evaluation method
Abstract
Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) are reshaping how AI systems extract and organize information from unstructured text. A key challenge is designing AI methods that can incrementally extract, structure, and validate information while preserving hierarchical and contextual relationships. We introduce CDMizer, a template-driven, LLM- and RAG-based framework for structured text transformation. By leveraging depth-based retrieval and hierarchical generation, CDMizer ensures a controlled, modular process that aligns generated outputs with a predefined schema. Its template-driven approach guarantees syntactic correctness, schema adherence, and improved scalability, addressing key limitations of direct generation methods. Additionally, we propose an LLM-powered evaluation framework to assess the completeness and accuracy of structured representations.…
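To make the template-driven, hierarchical idea concrete, here is a minimal Python sketch. It is not CDMizer's implementation, which this summary does not describe in detail. It assumes a nested dictionary as the template and a caller-supplied `fill_leaf` function standing in for the retrieval-plus-LLM step. The field names (`trade`, `notional`, `currency`, `effectiveDate`) are hypothetical placeholders, not actual CDM schema paths.

```python
from typing import Any, Callable, Tuple

Path = Tuple[str, ...]


def fill_template(template: dict, fill_leaf: Callable[[Path], Any], path: Path = ()) -> dict:
    """Walk a nested template depth-first and fill each leaf.

    The output can only contain keys that the template already defines,
    so its structure is fixed by the template rather than by the model.
    `fill_leaf` is a hypothetical stand-in for a retrieval + LLM call.
    """
    filled = {}
    for key, value in template.items():
        child_path = path + (key,)
        if isinstance(value, dict):
            # Nested section: recurse one level deeper.
            filled[key] = fill_template(value, fill_leaf, child_path)
        else:
            # Leaf field: ask the extractor for a value at this path.
            filled[key] = fill_leaf(child_path)
    return filled


def same_shape(template: Any, filled: Any) -> bool:
    """Check that `filled` has exactly the key structure of `template`."""
    if isinstance(template, dict):
        return (
            isinstance(filled, dict)
            and template.keys() == filled.keys()
            and all(same_shape(template[k], filled[k]) for k in template)
        )
    return not isinstance(filled, dict)


# Hypothetical template and pre-extracted facts, for illustration only.
template = {"trade": {"notional": None, "currency": None}, "effectiveDate": None}
facts = {
    ("trade", "notional"): 1000000,
    ("trade", "currency"): "USD",
    ("effectiveDate",): "2024-01-15",
}

result = fill_template(template, lambda p: facts.get(p))
```

Because the traversal only visits keys present in the template, the shape check is a sanity test rather than a repair step. In a real pipeline the leaf filler would be the component most likely to need validation.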
Taxonomy
Topics: Insurance and Financial Risk Management · Private Equity and Venture Capital
