FATURA: A Multi-Layout Invoice Image Dataset for Document Analysis and Understanding
Mahmoud Limam, Marwa Dhiaf, Yousri Kessentini

TL;DR
FATURA is the largest publicly available invoice image dataset, comprising 10,000 images across 50 distinct layouts. It is designed to support research in document analysis and understanding, especially for tasks that go beyond text transcription.
Contribution
The paper introduces FATURA, a comprehensive, multi-layout invoice image dataset with extensive annotations, filling a critical gap in resources for document analysis research.
Findings
Established benchmarks for various document analysis tasks.
Demonstrated the dataset's utility across different training and evaluation scenarios.
Provided a publicly accessible resource to advance invoice document understanding.
Abstract
Document analysis and understanding models often require extensive annotated data to be trained. However, various document-related tasks extend beyond mere text transcription, requiring both textual content and precise bounding-box annotations to identify different document elements. Collecting such data becomes particularly challenging, especially in the context of invoices, where privacy concerns add an additional layer of complexity. In this paper, we introduce FATURA, a pivotal resource for researchers in the field of document analysis and understanding. FATURA is a highly diverse dataset featuring multi-layout, annotated invoice document images. Comprising 10,000 invoices with 50 distinct layouts, it represents the largest openly accessible image dataset of invoice documents known to date. We also provide comprehensive benchmarks for various document analysis and understanding…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Music and Audio Processing
