Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation
Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song

TL;DR
This paper proposes a unified text recognition model using knowledge distillation to effectively handle both scene and handwritten text, achieving comparable or better performance than specialized models.
Contribution
It introduces a novel knowledge distillation (KD) framework tailored to variable-length sequential text data, unifying scene and handwritten text recognition in a single model.
Findings
The unified model performs on par with or better than separate specialized models.
The proposed distillation losses effectively handle variable-length sequences.
Naive baselines are less effective than the proposed approach.
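To make the variable-length point concrete, here is a minimal sketch of a masked, per-time-step logit distillation loss. It is not the paper's formulation: the function names (`log_softmax`, `masked_kd_loss`), the temperature value, and the single teacher-student pairing are all assumptions chosen for illustration. It only shows one common way to keep padded time steps from contributing to a KD objective when sequences in a batch have different lengths.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over one time step's logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]

def masked_kd_loss(teacher_logits, student_logits, lengths, temperature=2.0):
    """KL(teacher || student) averaged over valid (non-padded) time steps.

    teacher_logits, student_logits: nested lists shaped [batch][time][vocab],
        padded to a common number of time steps (hypothetical layout).
    lengths: true sequence length for each batch item.
    """
    total_kl, valid_steps = 0.0, 0
    for t_seq, s_seq, length in zip(teacher_logits, student_logits, lengths):
        # Slicing to the true length is the "mask": padded steps are skipped.
        for t_step, s_step in zip(t_seq[:length], s_seq[:length]):
            log_p = log_softmax(t_step, temperature)  # teacher
            log_q = log_softmax(s_step, temperature)  # student
            total_kl += sum(math.exp(lp) * (lp - lq)
                            for lp, lq in zip(log_p, log_q))
            valid_steps += 1
    # The T^2 factor is the usual rescaling for softened targets.
    return (temperature ** 2) * total_kl / max(valid_steps, 1)

# One sequence of true length 2, padded to 3 time steps.
teacher = [[[2.0, 0.0, -1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]]
student = [[[1.0, 0.5, -1.0], [0.0, 1.0, 0.0], [5.0, -5.0, 0.0]]]
loss = masked_kd_loss(teacher, student, lengths=[2])
```

Averaging over valid steps rather than over the padded tensor keeps short and long words on a comparable scale, which is the kind of issue a sequence-level KD loss has to account for.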
Abstract
Text recognition remains a fundamental and extensively researched topic in computer vision, largely owing to its wide array of commercial applications. The challenging nature of the very problem however dictated a fragmentation of research efforts: Scene Text Recognition (STR) that deals with text in everyday scenes, and Handwriting Text Recognition (HTR) that tackles hand-written text. In this paper, for the first time, we argue for their unification -- we aim for a single model that can compete favourably with two separate state-of-the-art STR and HTR models. We first show that cross-utilisation of STR and HTR models triggers significant performance drops due to differences in their inherent challenges. We then tackle their union by introducing a knowledge distillation (KD) based framework. This is however non-trivial, largely due to the variable-length and sequential nature of text…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Natural Language Processing Techniques
Methods: Knowledge Distillation
