Confidence Prediction for Lexicon-Free OCR

Noam Mor; Lior Wolf

arXiv:1805.11161 · cs.CV · July 17, 2018

TL;DR

This paper introduces two explicit confidence measurement techniques for lexicon-free OCR and reports that they significantly reduce misreads on both standard benchmarks and a proprietary dataset.

Contribution

The paper presents two explicit confidence measurement techniques for lexicon-free OCR, where errors must be filtered without the help of a lexicon.

Findings

01 Significant reduction in OCR misreads using the proposed confidence methods

02 Effective confidence prediction demonstrated on standard benchmarks

03 Results on a proprietary dataset confirm practical applicability

Abstract

Having a reliable accuracy score is crucial for real-world applications of OCR, since such systems are judged by the number of false readings. Lexicon-based OCR systems, which deal with what is essentially a multi-class classification problem, often employ methods that explicitly take the lexicon into account in order to improve accuracy. In lexicon-free scenarios, however, filtering errors requires an explicit confidence calculation. In this work we present two explicit confidence measurement techniques and show that they achieve a significant reduction in misreads on both standard benchmarks and a proprietary dataset.
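
To make the idea of filtering by explicit confidence concrete, here is a minimal sketch. It is not one of the paper's two techniques, which this page does not describe. It assumes a hypothetical recognizer that returns one probability per predicted character, scores each reading by the geometric mean of those probabilities, and rejects readings that fall below a threshold. The function names, the scoring rule, and the threshold value are all illustrative assumptions.

```python
import math
from typing import List, Tuple


def sequence_confidence(char_probs: List[float]) -> float:
    """Length-normalised confidence: geometric mean of per-character probabilities.

    Illustrative scoring rule only; it is an assumption, not the paper's method.
    """
    if not char_probs:
        return 0.0
    # Sum logs instead of multiplying to avoid underflow on long strings.
    log_sum = sum(math.log(max(p, 1e-12)) for p in char_probs)
    return math.exp(log_sum / len(char_probs))


def filter_readings(
    readings: List[Tuple[str, List[float]]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep only readings whose confidence reaches the threshold."""
    accepted = []
    for text, probs in readings:
        conf = sequence_confidence(probs)
        if conf >= threshold:
            accepted.append((text, conf))
    return accepted


if __name__ == "__main__":
    # Hypothetical recognizer output: (text, per-character probabilities).
    readings = [
        ("abc", [0.9, 0.9, 0.9]),
        ("xyz", [0.9, 0.2, 0.9]),  # one uncertain character lowers the score
    ]
    for text, conf in filter_readings(readings, threshold=0.8):
        print(f"{text}: {conf:.3f}")
```

Raising the threshold trades coverage for fewer misreads: more readings are rejected, but those that pass are less likely to be wrong, which matches the abstract's point that such systems are judged by their false readings.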
