Adaptive Text Recognition through Visual Matching

Chuhan Zhang; Ankush Gupta; Andrew Zisserman

arXiv:2009.06610 · cs.CV · September 15, 2020 · 1 citation

TL;DR

This paper presents a text recognition model based on shape matching that generalizes across fonts, languages, and character sets without retraining, outperforming existing methods.

Contribution

The proposed model decouples visual and linguistic learning, enabling shape matching for better generalization and class flexibility in text recognition tasks.

Findings

01. Generalizes to unseen fonts without new exemplars

02. Flexibly changes number of classes with different exemplars

03. Handles new languages and characters without retraining

Abstract

In this work, our objective is to address the problems of generalization and flexibility for text recognition in documents. We introduce a new model that exploits the repetitive nature of characters in languages, and decouples the visual representation learning and linguistic modelling stages. By doing this, we turn text recognition into a shape matching problem, and thereby achieve generalization in appearance and flexibility in classes. We evaluate the new model on both synthetic and real datasets across different alphabets and show that it can handle challenges that traditional architectures are not able to solve without expensive retraining, including: (i) it can generalize to unseen fonts without new exemplars from them; (ii) it can flexibly change the number of classes, simply by changing the exemplars provided; and (iii) it can generalize to new languages and new characters that…
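
The shape-matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes feature vectors for the glyph exemplars and for each position along a text line are already available, and the names `similarity_map` and `match_line`, the one-hot toy features, and the per-position argmax decoding are all hypothetical choices made for illustration. The paper's separate linguistic modelling stage is left out entirely.

```python
import numpy as np


def similarity_map(exemplar_feats, line_feats):
    """Cosine similarity between every exemplar and every line position.

    exemplar_feats: (n_classes, d) array, one feature vector per glyph exemplar.
    line_feats:     (n_positions, d) array, one feature vector per line position.
    Returns an (n_classes, n_positions) array.
    """
    e = exemplar_feats / np.linalg.norm(exemplar_feats, axis=1, keepdims=True)
    x = line_feats / np.linalg.norm(line_feats, axis=1, keepdims=True)
    return e @ x.T


def match_line(exemplar_feats, alphabet, line_feats):
    """Label each line position with the class of its most similar exemplar."""
    best = similarity_map(exemplar_feats, line_feats).argmax(axis=0)
    return "".join(alphabet[i] for i in best)


# Hypothetical toy features: one-hot vectors stand in for learned glyph features.
alphabet = ["a", "b", "c"]
exemplars = np.eye(4)[:3]             # 3 classes, 4-dimensional features
line = exemplars[[2, 0, 1]] + 0.05    # slightly perturbed copies spelling "cab"
decoded = match_line(exemplars, alphabet, line)

# Adding a class only requires adding an exemplar; nothing is retrained.
alphabet_ext = alphabet + ["d"]
exemplars_ext = np.eye(4)             # 4 classes
line_ext = exemplars_ext[[3, 0, 1]] + 0.05
decoded_ext = match_line(exemplars_ext, alphabet_ext, line_ext)

print(decoded, decoded_ext)  # prints "cab dab"
```

In this sketch the set of classes is defined only by the rows of the exemplar array, which is one way to read the abstract's claim that the number of classes can be changed simply by changing the exemplars provided.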

Taxonomy

Topics: Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications