Learning Deep Representations for Word Spotting Under Weak Supervision
Neha Gurjar, Sebastian Sudholt, Gernot A. Fink

TL;DR
This paper presents a CNN-based method for handwritten word spotting that is trained under weak supervision on synthetically generated data plus a small subset of the real training partition, cutting manual annotation effort while achieving results competitive with the state of the art.
Contribution
Introduces a weakly supervised learning approach combining synthetic and limited real data for effective handwritten word spotting with less manual annotation.
Findings
Achieves results highly competitive with the state of the art while using only a fraction of the annotation effort.
Reduces training time compared to fully supervised methods.
Maintains high accuracy using synthetic data and minimal real annotations.
Abstract
Convolutional Neural Networks have made their mark in various fields of computer vision in recent years. They have achieved state-of-the-art performance in the field of document analysis as well. However, CNNs require a large amount of annotated training data and, hence, great manual effort. In our approach, we introduce a method to drastically reduce the manual annotation effort while retaining the high performance of a CNN for word spotting in handwritten documents. The model is learned with weak supervision using a combination of synthetically generated training data and a small subset of the training partition of the handwritten data set. We show that the network achieves results highly competitive to the state-of-the-art in word spotting with shorter training times and a fraction of the annotation effort.
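The data setup described in the abstract can be illustrated with a minimal sketch. It only shows one plausible way to pool synthetic samples with a small annotated subset of the real training partition. The function name `build_weak_training_set`, the `real_fraction` parameter, and the sample representation are all hypothetical and not taken from the paper, which may combine the two sources differently (for example, in separate training stages).

```python
import random


def build_weak_training_set(synthetic, real, real_fraction=0.1, seed=0):
    """Pool all synthetic samples with a small random subset of real ones.

    Hypothetical helper for illustration only; the paper's actual training
    schedule and sampling strategy are not specified in the abstract.
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must lie in [0, 1]")

    rng = random.Random(seed)

    # Only this subset would need manual annotation.
    n_real = round(len(real) * real_fraction)
    real_subset = rng.sample(list(real), n_real)

    # Synthetic samples come with labels for free, so all of them are kept.
    combined = list(synthetic) + real_subset
    rng.shuffle(combined)
    return combined, real_subset


# Toy stand-ins for (image, transcription) pairs.
synthetic = [("syn", i) for i in range(100)]
real = [("real", i) for i in range(50)]

combined, real_subset = build_weak_training_set(synthetic, real, real_fraction=0.1)
```

With these toy inputs, only 5 of the 50 real samples would need annotation, while the combined set holds 105 samples in total.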
