A limited-size ensemble of homogeneous CNN/LSTMs for high-performance word classification

Mahya Ameryan; Lambert Schomaker

arXiv:1912.03223 · cs.CL · February 3, 2021

TL;DR

This paper introduces a small ensemble of homogeneous CNN and LSTM networks with data augmentation and voting to achieve high accuracy in handwritten word recognition, outperforming previous methods on standard benchmarks.

Contribution

The paper presents an end-to-end convolutional LSTM network ensemble that handles both geometric variation and sequence variability in handwritten text recognition while keeping the ensemble small (five networks).

Findings

01. Achieved 96.6% accuracy on RIMES dataset

02. Ensemble of five networks outperforms state-of-the-art methods

03. Effective on both modern and historical handwritten datasets

Abstract

In recent years, long short-term memory neural networks (LSTMs) have been applied quite successfully to problems in handwritten text recognition. However, their strength lies more in handling sequences of variable length than in handling geometric variability of the image patterns. Furthermore, the best results for LSTMs are often based on large-scale training of an ensemble of network instances. In this paper, an end-to-end convolutional LSTM neural network is used to handle both geometric variation and sequence variability. We show that high performance can be reached on a common benchmark set with just five such networks, given proper data augmentation, a proper coding scheme and a proper voting scheme. The networks have similar architectures (Convolutional Neural Network (CNN): five layers, bidirectional LSTM (BiLSTM): three layers followed by a connectionist temporal…
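The abstract mentions combining five networks through a voting scheme but the excerpt here does not spell that scheme out. As a rough illustration only, the sketch below assumes simple plurality voting over the word hypotheses of the individual networks, with ties going to the earliest network in the list. The function name `plurality_vote`, the tie-break rule, and the sample words are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def plurality_vote(predictions):
    """Return the word hypothesis proposed by the most networks.

    Assumption (not from the paper): ties are broken in favour of the
    hypothesis that appears first in the input list.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the list in its original order so the tie-break is deterministic.
    for word in predictions:
        if counts[word] == best:
            return word


# Hypothetical outputs of five networks for one word image.
ensemble_outputs = ["maison", "maison", "raison", "maison", "saison"]
winner = plurality_vote(ensemble_outputs)
```

With five voters, a hypothesis needs only a relative majority to win, so a single network that misreads one character does not change the result as long as the others agree.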

Taxonomy

Methods: Test · Sigmoid Activation · Tanh Activation · Long Short-Term Memory