Benchmarking OCR Pipelines with Adaptive Enhancement for Multi-Domain Retail Bill Digitization
Vijaysinh Gaikwad

TL;DR
This paper introduces a comprehensive adaptive OCR pipeline for retail bill digitization across multiple domains, significantly improving accuracy and speed over baseline methods.
Contribution
It presents a novel, integrated OCR system with adaptive enhancement, quality analysis, and NLP correction, benchmarked on a diverse real-world retail dataset.
Findings
Achieved 18.4% character error rate (CER) and 27.6% word error rate (WER), an improvement of more than 26% over the baseline.
Reduced processing time to 3.64 seconds per image, 6.4x faster than EasyOCR.
Enhanced image quality with an average PSNR of 28.7 dB on medium- and low-quality images.
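The three metrics reported above (CER, WER, PSNR) have standard definitions, sketched below in plain Python. This is a minimal illustration under stated assumptions, not the paper's evaluation code: the function names are hypothetical, text is compared without any normalization, and PSNR assumes 8-bit pixel values passed as flat sequences.

```python
import math


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def psnr(original, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel sequences."""
    if len(original) != len(enhanced) or len(original) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, enhanced)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Note that a single misread character counts as one error out of many for CER but makes the whole word wrong for WER, which is why WER is typically the higher of the two on OCR output.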
Abstract
The digitization of multi-domain retail billing documents remains a challenging task due to variability in scan quality, layout heterogeneity, and domain diversity across commercial sectors. This paper proposes and benchmarks an intelligent, quality-aware adaptive Optical Character Recognition (OCR) pipeline for retail bill digitization spanning five domains: grocery stores, restaurants, hardware shops, footwear outlets, and clothing retailers. The proposed system integrates a Convolutional Neural Network (CNN)-based image enhancement module trained via self-supervised denoising, a Laplacian variance-based image quality analyzer with three-tier routing, a confidence-driven adaptive feedback loop with iterative retry, and an NLP-based post-OCR correction layer. Experiments were conducted on a real-world dataset of 360 heterogeneous retail bill images. Ground truth for quantitative…
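As a rough illustration of two components named in the abstract, the sketch below computes a Laplacian-variance sharpness score with three-tier routing and wraps an OCR call in a confidence-driven retry loop. Everything specific here is an assumption for illustration: the thresholds (50 and 200), the tier labels, the default confidence cutoff, and the `recognize` and `enhance` callables are hypothetical and are not taken from the paper.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a 2D list of grayscale intensities. Sharp images have strong
    edges, so the Laplacian response varies more and the score is higher.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(gray[y - 1][x] + gray[y + 1][x]
                             + gray[y][x - 1] + gray[y][x + 1]
                             - 4 * gray[y][x])
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def route_by_quality(gray, low_threshold=50.0, high_threshold=200.0):
    """Map a sharpness score to one of three tiers (thresholds are assumed)."""
    score = laplacian_variance(gray)
    if score >= high_threshold:
        return "high"
    if score >= low_threshold:
        return "medium"
    return "low"


def ocr_with_retry(image, recognize, enhance, min_confidence=0.8, max_retries=2):
    """Retry OCR on an enhanced image while confidence stays below a cutoff.

    `recognize(image)` must return `(text, confidence)` and `enhance(image)`
    must return a new image; both are hypothetical placeholders.
    The best-scoring result seen so far is returned.
    """
    best_text, best_conf = recognize(image)
    for _ in range(max_retries):
        if best_conf >= min_confidence:
            break
        image = enhance(image)
        text, conf = recognize(image)
        if conf > best_conf:
            best_text, best_conf = text, conf
    return best_text, best_conf
```

In practice the thresholds would need to be tuned on the target dataset, since the Laplacian variance depends on image resolution and content as well as blur.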
