LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents
Ahmed Masry, Amir Hajian

TL;DR
LongFin is a multimodal model for understanding long financial documents. It processes up to 4,000 tokens and is trained on LongForms, a new dataset that reflects industrial challenges in financial document analysis.
Contribution
The paper introduces LongFin, a novel multimodal model for long financial documents, and presents the LongForms dataset to address industrial challenges in document understanding.
Findings
LongFin outperforms existing models on the LongForms dataset.
LongFin maintains competitive performance on single-page benchmarks.
The LongForms dataset captures real-world industrial financial document challenges.
Abstract
Document AI is a growing research field that focuses on the comprehension and extraction of information from scanned and digital documents to make everyday business operations more efficient. Numerous downstream tasks and datasets have been introduced to facilitate the training of AI models capable of parsing and extracting information from various document types such as receipts and scanned forms. Despite these advancements, both existing datasets and models fail to address critical challenges that arise in industrial contexts. Existing datasets primarily comprise short documents consisting of a single page, while existing models are constrained by a limited maximum length, often set at 512 tokens. Consequently, the practical application of these methods in financial services, where documents can span multiple pages, is severely impeded. To overcome these challenges, we introduce…
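To make the length constraint in the abstract concrete, the following minimal sketch compares how much of a document fits under a 512-token limit versus a 4,000-token limit. It is plain arithmetic, not LongFin's implementation; the function name `coverage` and the 3,000-token document length are hypothetical and chosen only for illustration.

```python
import math


def coverage(num_tokens: int, max_len: int) -> dict:
    """Summarize how a fixed maximum input length handles a document.

    Hypothetical helper for illustration only, not part of LongFin.
    """
    kept = min(num_tokens, max_len)
    return {
        # Tokens the model sees if the input is simply truncated.
        "kept": kept,
        # Tokens lost to truncation.
        "dropped": num_tokens - kept,
        # Non-overlapping windows needed to cover the whole document.
        "windows_needed": math.ceil(num_tokens / max_len),
    }


# Assumed length of a multi-page financial document (illustrative value).
doc_tokens = 3000

short_ctx = coverage(doc_tokens, 512)   # limit typical of existing models
long_ctx = coverage(doc_tokens, 4000)   # limit stated for LongFin

print(short_ctx)
print(long_ctx)
```

Under these assumed numbers, a 512-token model either discards most of the document or has to split it into several windows that cannot see each other, while a 4,000-token model reads it in a single pass.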
Taxonomy
Topics: Handwritten Text Recognition Techniques · Topic Modeling
Methods: Sparse Evolutionary Training
