TL;DR
PopEval introduces a character-level evaluation method for OCR that aligns more closely with human judgment and is compatible with existing word-level datasets, improving assessment accuracy for modern OCR applications.
Contribution
It proposes a novel character-level evaluation approach for OCR that better reflects human qualitative judgment and is compatible with existing word-level benchmarks.
Findings
PopEval aligns more closely with human evaluation than existing methods.
PopEval is compatible with word-level benchmark datasets.
The method is effective for current end-to-end OCR tasks.
Abstract
OCR applications used to focus mainly on scanned documents, but interest has now shifted towards natural scenes. Despite this shift, existing evaluation methods are still based on older criteria better suited to the earlier focus. In this paper, we propose PopEval, a novel evaluation approach for recent OCR interests. The new and existing evaluation algorithms were compared on various datasets and OCR models. The proposed evaluation algorithm was closer to human qualitative evaluation than the existing methods. Although the algorithm was devised as a character-level approach, the comparative experiment revealed that PopEval is also compatible with existing benchmark datasets annotated at the word level. The proposed evaluation algorithm is not only applicable to current…
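To make the word-level versus character-level distinction concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the ground-truth and predicted boxes have already been paired and ignores box geometry entirely; the function names `word_level_score` and `char_level_score` are hypothetical. It only illustrates why counting matched characters can give partial credit where exact word matching gives none.

```python
from collections import Counter


def _precision_recall(gt_counts, pred_counts):
    # Multiset intersection: each item is matched at most as many
    # times as it appears on both sides.
    matched = sum((gt_counts & pred_counts).values())
    precision = matched / max(sum(pred_counts.values()), 1)
    recall = matched / max(sum(gt_counts.values()), 1)
    return precision, recall


def word_level_score(gt_words, pred_words):
    """Exact-match scoring: a word counts only if every character is right."""
    return _precision_recall(Counter(gt_words), Counter(pred_words))


def char_level_score(gt_words, pred_words):
    """Illustrative character-level scoring (an assumption, not PopEval itself):
    count characters shared between ground truth and prediction."""
    gt_chars = Counter("".join(gt_words))
    pred_chars = Counter("".join(pred_words))
    return _precision_recall(gt_chars, pred_chars)


if __name__ == "__main__":
    gt = ["HELLO", "WORLD"]
    pred = ["HELL0", "WORLD"]  # one character misread: O -> 0
    print("word-level:", word_level_score(gt, pred))
    print("char-level:", char_level_score(gt, pred))
```

With a single misread character, the word-level score drops to 0.5 for both precision and recall, while the character-level score stays at 0.9, which is closer to how a human reader would rate the output.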
