Confidence Prediction for Lexicon-Free OCR
Noam Mor, Lior Wolf

TL;DR
This paper introduces two confidence prediction techniques for lexicon-free OCR systems, significantly reducing misreads and improving reliability in real-world applications where explicit confidence scores are essential.
Contribution
The paper presents novel confidence measurement methods tailored for lexicon-free OCR, addressing the challenge of error filtering without lexicon assistance.
Findings
Significant reduction in OCR misreads using proposed confidence methods
Effective confidence prediction demonstrated on standard benchmarks
Proprietary dataset results confirm practical applicability
Abstract
Having a reliable accuracy score is crucial for real-world applications of OCR, since such systems are judged by the number of false readings. Lexicon-based OCR systems, which deal with what is essentially a multi-class classification problem, often employ methods that explicitly take the lexicon into account in order to improve accuracy. However, in lexicon-free scenarios, filtering errors requires an explicit confidence calculation. In this work we present two explicit confidence measurement techniques and show that they achieve a significant reduction in misreads on both standard benchmarks and a proprietary dataset.
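To make the idea of filtering by explicit confidence concrete, here is a minimal sketch of a generic baseline. It is not either of the paper's two techniques, which this summary does not describe. It assumes the recognizer emits one probability distribution per character position, scores a reading by the product of the chosen characters' probabilities, and rejects readings below a threshold. The names `greedy_read` and `filter_reading`, the alphabet, and the threshold values are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def greedy_read(char_probs: Sequence[Sequence[float]], alphabet: str) -> Tuple[str, float]:
    """Pick the most likely character at each position.

    Confidence is the product of the chosen characters' probabilities
    (an assumed baseline score, not the paper's method).
    """
    chars: List[str] = []
    log_conf = 0.0
    for dist in char_probs:
        best = max(range(len(dist)), key=dist.__getitem__)
        chars.append(alphabet[best])
        # Clamp to avoid log(0) on degenerate distributions.
        log_conf += math.log(max(dist[best], 1e-12))
    return "".join(chars), math.exp(log_conf)


def filter_reading(text: str, confidence: float, threshold: float) -> Optional[str]:
    """Return the reading if it is confident enough, otherwise reject it (None)."""
    return text if confidence >= threshold else None


# Hypothetical two-character alphabet and per-position distributions.
text, conf = greedy_read([[0.9, 0.1], [0.2, 0.8]], alphabet="ab")
accepted = filter_reading(text, conf, threshold=0.5)
rejected = filter_reading(text, conf, threshold=0.8)
```

Raising the threshold trades coverage for fewer misreads: more readings are rejected, but those that pass are less likely to be wrong. In practice the threshold would be tuned on held-out data.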
