Improving Word Recognition using Multiple Hypotheses and Deep Embeddings

Siddhant Bansal; Praveen Krishnan; C.V. Jawahar

arXiv:2010.14411·cs.CV·October 28, 2020

TL;DR

This paper introduces EmbedNet and CAB, which combine multiple recognizer hypotheses, deep embeddings, and confidence scores to improve Hindi word recognition accuracy by around 10% absolute.

Contribution

The paper presents EmbedNet, an embedding network trained with a triplet loss, and the Confidence based Accuracy Booster (CAB), a plug-and-play module. Together they improve recognition accuracy over existing methods.
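As a rough illustration of the triplet loss named above, the sketch below uses the standard margin-based formulation on plain Python lists. The function names, the margin value, and the toy vectors are assumptions made for this example; they are not taken from the paper or its repository.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (margin value is an assumption).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy example: the anchor stands in for a word image embedding, the
# positive for the embedding of its correct transcription, and the
# negative for the embedding of a wrong transcription.
image_emb = [0.0, 0.0]
correct_text_emb = [0.0, 1.0]
wrong_text_emb = [3.0, 4.0]

loss_good = triplet_loss(image_emb, correct_text_emb, wrong_text_emb)
loss_bad = triplet_loss(image_emb, wrong_text_emb, correct_text_emb)
```

Here `loss_good` is zero because the correct transcription is already much closer to the image embedding, while `loss_bad` is positive and would push the embeddings apart during training.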

Findings

01 Achieves an absolute improvement of around 10% in word recognition accuracy.

02 Uses deep embeddings and recognizer confidence scores to select better predictions.

03 Evaluated systematically on collections of Hindi books.

Abstract

We propose a novel scheme for improving word recognition accuracy using word image embeddings. We use a trained text recognizer, which can predict multiple text hypotheses for a given word image. Our fusion scheme improves the recognition process by utilizing the word image and text embeddings obtained from a trained word image embedding network. We propose EmbedNet, which is trained using a triplet loss for learning a suitable embedding space where the embedding of the word image lies closer to the embedding of the corresponding text transcription. The updated embedding space thus helps in choosing the correct prediction with higher confidence. To further improve the accuracy, we propose a plug-and-play module called Confidence based Accuracy Booster (CAB). The CAB module takes in the confidence scores obtained from the text recognizer and Euclidean distances between the embeddings…
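The selection step described in the abstract, picking among several text hypotheses using embedding distances, might be sketched as follows. This is a minimal sketch under stated assumptions: the function `pick_hypothesis`, the two-dimensional toy embeddings, and the placeholder hypothesis strings are all hypothetical, and the sketch covers only the nearest-embedding choice, not the CAB module.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_hypothesis(image_embedding, hypotheses):
    """Return the hypothesis text whose embedding is nearest to the image.

    `hypotheses` is a list of (text, text_embedding) pairs; both the
    pair layout and this function are assumptions for illustration.
    """
    best_text, _ = min(
        hypotheses,
        key=lambda pair: euclidean(image_embedding, pair[1]),
    )
    return best_text


# Toy data: three candidate transcriptions for one word image, each
# with a made-up text embedding.
image_emb = [0.1, 0.9]
candidates = [
    ("hypothesis_a", [1.0, 0.0]),
    ("hypothesis_b", [0.0, 1.0]),
    ("hypothesis_c", [0.7, 0.7]),
]

chosen = pick_hypothesis(image_emb, candidates)
```

In this toy setup `chosen` is `"hypothesis_b"`, since its embedding lies closest to the image embedding. In practice the embeddings would come from a trained network rather than hand-written vectors.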

Code & Models

Repositories

Sid2697/Word-recognition-EmbedNet-CAB
PyTorch · Official

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Triplet Loss