Curriculum Learning for Handwritten Text Line Recognition

Jérôme Louradour; Christopher Kermorvant

arXiv:1312.1737 · cs.LG · December 9, 2013 · 1 citation

TL;DR

This paper applies curriculum learning to handwritten text line recognition: the RNN is trained on short sequences first, then on all available training sequences, which speeds up training and, in some cases, improves recognition performance.

Contribution

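It proposes a simple curriculum learning method for sequence recognition tasks: learn to recognize short sequences before training on all available sequences. This accelerates stochastic gradient descent training of RNNs for handwritten text recognition and, in some cases, improves accuracy.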

Findings

1. Curriculum learning significantly speeds up RNN training for text recognition.
2. Recognition performance improves significantly in some cases across the three databases tested (Rimes, IAM, OpenHaRT).
3. A simple implementation of the strategy is enough to obtain these gains.

Abstract

Recurrent Neural Networks (RNN) have recently achieved the best performance in off-line Handwriting Text Recognition. At the same time, learning RNN by gradient descent leads to slow convergence, and training times are particularly long when the training database consists of full lines of text. In this paper, we propose an easy way to accelerate stochastic gradient descent in this set-up, and in the general context of learning to recognize sequences. The principle is called Curriculum Learning, or shaping. The idea is to first learn to recognize short sequences before training on all available training sequences. Experiments on three different handwritten text databases (Rimes, IAM, OpenHaRT) show that a simple implementation of this strategy can significantly speed up the training of RNN for Text Recognition, and even significantly improve performance in some cases.
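
The idea of learning short sequences first can be illustrated with a minimal sketch. This is not the authors' implementation: the linear length schedule, the `warmup_epochs` parameter, and the function names `length_threshold` and `curriculum_indices` are assumptions made for illustration only. The sketch selects, at each epoch, the training lines whose label length is below a limit that grows until the full training set is in use.

```python
from typing import List, Sequence


def length_threshold(lengths: Sequence[int], epoch: int, warmup_epochs: int) -> int:
    """Longest sequence length admitted at a given 0-based epoch.

    Assumption for this sketch: the limit grows linearly from the shortest
    to the longest length over `warmup_epochs` epochs.
    """
    shortest, longest = min(lengths), max(lengths)
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return longest  # curriculum finished: every sequence is admitted
    return shortest + (longest - shortest) * epoch // warmup_epochs


def curriculum_indices(lengths: Sequence[int], epoch: int, warmup_epochs: int) -> List[int]:
    """Indices of the training sequences to use at this epoch."""
    limit = length_threshold(lengths, epoch, warmup_epochs)
    return [i for i, n in enumerate(lengths) if n <= limit]


if __name__ == "__main__":
    # Hypothetical label lengths (in characters) of six text lines.
    label_lengths = [3, 12, 7, 25, 40, 5]
    for epoch in range(5):
        print(epoch, curriculum_indices(label_lengths, epoch, warmup_epochs=4))
```

In a training loop, the returned indices would select the subset of lines fed to stochastic gradient descent at that epoch. A probabilistic variant that samples short lines more often, rather than excluding long ones outright, is an equally plausible reading of the strategy.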

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Image Processing and 3D Reconstruction
