Pho(SC)-CTC -- A Hybrid Approach Towards Zero-shot Word Image Recognition
Ravi Bhatt, Anuj Rai, Narayanan C. Krishnan, and Sukalpa Chanda

TL;DR
This paper introduces Pho(SC)-CTC, a hybrid zero-shot word recognition model combining Pho(SC)Net features with CTC, demonstrating effectiveness on historical and synthetic datasets for recognizing unseen words.
Contribution
It presents a novel hybrid approach that leverages Pho(SC)Net features with CTC for zero-shot word recognition in historical documents, improving recognition of unseen words.
Findings
Effective on historical document datasets
Outperforms previous zero-shot methods
Validates approach on synthetic handwritten data
Abstract
Annotating words in a historical document image archive for word image recognition demands time and skilled human resources (such as historians and paleographers). In a real-life scenario, obtaining sample images for all possible words is also not feasible. However, zero-shot learning methods can be used to recognize unseen/out-of-lexicon words in such historical document images. Building on Pho(SC)Net, the previous state-of-the-art method for zero-shot word recognition, we propose a hybrid model, Pho(SC)-CTC, that feeds the rich features learned by Pho(SC)Net into a connectionist temporal classification (CTC) framework, which performs the final classification. Encouraging results were obtained on two publicly available historical document datasets and one synthetic handwritten dataset, which supports the efficacy of both Pho(SC)-CTC and Pho(SC)Net.
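The CTC stage mentioned in the abstract outputs a character sequence rather than picking from a fixed lexicon, which is what lets it spell out words never seen in training. As a rough illustration only, the sketch below shows generic greedy CTC decoding in plain Python: take the best class per frame, merge consecutive repeats, and drop blanks. It is not the authors' implementation; the function name `ctc_greedy_decode`, the blank index 0, and the toy alphabet are assumptions made for this example.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class 0 is the CTC blank; class k >= 1 maps to alphabet[k - 1]


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, remove blanks."""
    # Best-scoring class index for every time step (frame).
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    chars: List[str] = []
    previous = BLANK
    for idx in best_path:
        # Emit a character only when it is not blank and differs from the
        # previous frame's class, so "aa" collapses to "a" unless a blank
        # separates the two frames.
        if idx != BLANK and idx != previous:
            chars.append(alphabet[idx - 1])
        previous = idx
    return "".join(chars)


# Toy example: one-hot frame scores for the class path [1, 1, 0, 1, 2, 2, 0, 3].
toy_path = [1, 1, 0, 1, 2, 2, 0, 3]
toy_scores = [[1.0 if j == i else 0.0 for j in range(4)] for i in toy_path]
decoded = ctc_greedy_decode(toy_scores, "abc")
```

In the toy example, the blank between the two runs of class 1 keeps both occurrences of "a", so the decoded string contains a doubled letter. In a real system the frame scores would come from the network's per-frame output over the character set.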
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling
