Offline Detection of Misspelled Handwritten Words by Convolving Recognition Model Features with Text Labels
Andrey Totev, Tomas Ward

TL;DR
This paper presents a novel method for detecting misspelled handwritten words by convolving recognition features with text labels, improving accuracy in out-of-vocabulary scenarios for handwriting recognition.
Contribution
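It introduces an unrestricted binary classifier that compares a handwriting image to text. The classifier pairs a handwriting recognition (HWR) feature extractor with a multimodal classification head, and the head is trained entirely on synthetic data, enhancing misspelling detection beyond traditional HWR models.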
Findings
Achieves 19.5% average precision increase over state-of-the-art HWR models.
Maintains high recall in misspelled word detection.
Trains the classification head entirely on synthetic data, reducing reliance on labeled real data.
Abstract
Offline handwriting recognition (HWR) has improved significantly with the advent of deep learning architectures in recent years. Nevertheless, it remains a challenging problem and practical applications often rely on post-processing techniques for restricting the predicted words via lexicons or language models. Despite their enhanced performance, such systems are less usable in contexts where out-of-vocabulary words are anticipated, e.g. for detecting misspelled words in school assessments. To that end, we introduce the task of comparing a handwriting image to text. To solve the problem, we propose an unrestricted binary classifier, consisting of a HWR feature extractor and a multimodal classification head which convolves the feature extractor output with the vector representation of the input text. Our model's classification head is trained entirely on synthetic data created using a…
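The abstract describes the classification head only at a high level, so the following is a minimal sketch of one possible reading of "convolving" recognition features with a text label, not the authors' model. It assumes the feature extractor emits a (T, C) matrix of per-frame character scores and that the text is one-hot encoded; each character vector then acts as a 1x1 kernel over the frame sequence. The names `fake_features`, `response_map`, and `match_score` are hypothetical, and the untrained alignment score stands in for the trained head.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed character set


def one_hot_text(word, alphabet=ALPHABET):
    """Encode a word as an (L, C) one-hot matrix, one row per character.
    Raises ValueError for characters outside the alphabet."""
    idx = np.array([alphabet.index(ch) for ch in word], dtype=int)
    out = np.zeros((len(word), len(alphabet)))
    out[np.arange(len(word)), idx] = 1.0
    return out


def fake_features(word, frames_per_char=2):
    """Hypothetical stand-in for an HWR feature extractor: every character
    occupies a fixed number of frames with a perfect one-hot score."""
    return np.repeat(one_hot_text(word), frames_per_char, axis=0)


def response_map(features, text_vectors):
    """Apply each character vector as a 1x1 kernel over the (T, C) features.
    Entry [t, l] is the response of frame t to the l-th character."""
    return features @ text_vectors.T


def match_score(features, word):
    """Untrained stand-in for a classification head: the best monotonic
    alignment that assigns every frame to a character, averaged over frames."""
    R = response_map(features, one_hot_text(word))
    T, L = R.shape
    if L == 0 or L > T:
        return 0.0
    # best[l]: best total response so far with the current frame on character l
    best = np.full(L, -np.inf)
    best[0] = R[0, 0]
    for t in range(1, T):
        advance = np.concatenate(([-np.inf], best[:-1]))  # move to next char
        best = R[t] + np.maximum(best, advance)           # or stay on same char
    return float(best[-1] / T)


features = fake_features("world")
print(match_score(features, "world"))  # label matches the image content
print(match_score(features, "wrld"))   # label with a missing letter
```

A matching label scores higher than a misspelled one because every frame finds a responsive character. This toy score cannot tell doubled letters apart from single ones, which is one reason a learned head over the response map would be preferable in practice.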
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling
