ReceiptSense: Beyond Traditional OCR -- A Dataset for Receipt Understanding
Abdelrahman Abdallah, Mohamed Mounis, Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Mohamed Mahmoud, Ibrahim Abdelhalim, Mohamed Elkasaby, Yasser ElBendary, Adam Jatowt

TL;DR
ReceiptSense introduces a large, annotated multilingual receipt dataset with diverse tasks and benchmarks, enabling improved research in receipt understanding and extraction, especially for complex scripts like Arabic.
Contribution
The paper presents a comprehensive, publicly available dataset for Arabic-English receipt understanding, including annotations, QA pairs, and baseline evaluations, advancing multilingual receipt processing research.
Findings
Baseline methods show room for improvement on receipt understanding tasks.
The dataset effectively captures complex receipt layouts and multilingual content.
Initial benchmarks demonstrate the dataset's utility for object detection and information extraction.
Abstract
Multilingual OCR and information extraction from receipts remain challenging, particularly for complex scripts like Arabic. We introduce ReceiptSense, a comprehensive dataset designed for Arabic-English receipt understanding. It comprises 20,000 annotated receipts from diverse retail settings, 30,000 OCR-annotated images, and 10,000 item-level annotations, along with a new Receipt QA subset of 1,265 receipt images, each paired with 40 question-answer pairs, to support LLM evaluation for receipt understanding. The dataset captures merchant names, item descriptions, prices, receipt numbers, and dates to support object detection, OCR, and information extraction tasks. We establish baseline performance using traditional methods (Tesseract OCR) and advanced neural networks, demonstrating the dataset's effectiveness for processing complex, noisy real-world receipt layouts. Our publicly accessible dataset…
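To make the annotated fields concrete, here is a minimal sketch of how one receipt record might be represented in code. The class and field names (`ReceiptRecord`, `ReceiptItem`, `QAPair`, `image_path`, and so on) are hypothetical and chosen only for illustration; the abstract lists the annotated fields but does not describe the dataset's actual file format. The final constants work out the size of the Receipt QA subset from the figures given in the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReceiptItem:
    """One line item: a description and its price."""
    description: str
    price: float


@dataclass
class QAPair:
    """One question-answer pair attached to a receipt image."""
    question: str
    answer: str


@dataclass
class ReceiptRecord:
    # Hypothetical schema: the field names are illustrative assumptions,
    # not the dataset's published format.
    image_path: str
    merchant_name: Optional[str] = None
    receipt_number: Optional[str] = None
    date: Optional[str] = None
    items: List[ReceiptItem] = field(default_factory=list)
    qa_pairs: List[QAPair] = field(default_factory=list)

    def items_total(self) -> float:
        """Sum of the item prices recorded for this receipt."""
        return sum(item.price for item in self.items)


# Size of the Receipt QA subset implied by the abstract's figures:
# 1,265 images with 40 question-answer pairs each.
QA_IMAGES = 1265
QA_PAIRS_PER_IMAGE = 40
TOTAL_QA_PAIRS = QA_IMAGES * QA_PAIRS_PER_IMAGE  # 50,600 pairs
```

Keeping most fields optional reflects that noisy real-world receipts may lack a legible merchant name, number, or date.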
Taxonomy
Topics: Natural Language Processing Techniques
