A Study on Effects of Implicit and Explicit Language Model Information for DBLSTM-CTC Based Handwriting Recognition

Qi Liu; Lijuan Wang; Qiang Huo

arXiv:2008.01532·cs.CL·August 5, 2020

TL;DR

This paper investigates how implicit and explicit language model information affects DBLSTM-CTC based handwriting recognition. It shows that using an explicit language model in decoding still helps even when the DBLSTM is trained on one million lines of sentences, and it introduces a GPU-based tool for large-scale CTC training.

Contribution

The paper compares recognition performance with and without an explicit language model in decoding for DBLSTM-CTC handwriting recognition, and develops a GPU-based training tool for large-scale CTC training.

Findings

01. Explicit language models improve recognition accuracy.

02. Large-scale training benefits from GPU-based BPTT.

03. Implicit models alone are insufficient for optimal performance.

Abstract

Deep Bidirectional Long Short-Term Memory (DBLSTM) with a Connectionist Temporal Classification (CTC) output layer has been established as one of the state-of-the-art solutions for handwriting recognition. It is well known that a DBLSTM trained with a CTC objective function learns both local character image dependency for character modeling and long-range contextual dependency for implicit language modeling. In this paper, we study the effects of implicit and explicit language model information for DBLSTM-CTC based handwriting recognition by comparing recognition performance with and without an explicit language model in decoding. It is observed that even when one million lines of training sentences are used to train the DBLSTM, using an explicit language model is still helpful. To deal with such a large-scale training problem, a GPU-based training tool has been developed for CTC…
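To make the comparison in the abstract concrete, the following is a minimal sketch of decoding with and without an explicit language model. It is not the paper's decoder: it assumes a toy alphabet, hand-written per-frame CTC posteriors, a two-word candidate list, and made-up language model probabilities, and it uses simple candidate rescoring. The names `ctc_greedy_decode`, `ctc_log_prob`, `rescore`, and `lm_weight` are hypothetical and chosen only for illustration.

```python
import math

BLANK = 0            # assumed convention: index 0 is the CTC blank
ALPHABET = "-caot"   # toy alphabet; "-" stands for the blank

def ctc_greedy_decode(probs, alphabet):
    """Best-path decoding: argmax per frame, collapse repeats, drop blanks."""
    path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]
    out, prev = [], BLANK
    for k in path:
        if k != BLANK and k != prev:
            out.append(alphabet[k])
        prev = k
    return "".join(out)

def ctc_log_prob(probs, labels):
    """log P(labels | frames) via the CTC forward algorithm.

    Uses plain probabilities, so it is only suitable for short toy inputs.
    """
    ext = [BLANK]
    for label in labels:
        ext += [label, BLANK]          # interleave blanks: -, l1, -, l2, -
    size = len(ext)
    alpha = [0.0] * size
    alpha[0] = probs[0][BLANK]
    if size > 1:
        alpha[1] = probs[0][ext[1]]
    for frame in probs[1:]:
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame[ext[s]]
        alpha = new
    prob = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return math.log(prob) if prob > 0 else float("-inf")

def rescore(probs, candidates, lm_log_prob, lm_weight=1.0):
    """Pick the candidate maximizing log P_ctc + lm_weight * log P_lm."""
    def score(word):
        labels = [ALPHABET.index(ch) for ch in word]
        return ctc_log_prob(probs, labels) + lm_weight * lm_log_prob[word]
    return max(candidates, key=score)

# Toy posteriors over [blank, c, a, o, t]; the middle frame slightly
# prefers "o" over "a", so the network alone reads "cot".
PROBS = [
    [0.1, 0.9, 0.00, 0.00, 0.0],
    [0.0, 0.0, 0.45, 0.55, 0.0],
    [0.1, 0.0, 0.00, 0.00, 0.9],
]
# Made-up explicit language model that prefers "cat".
LM = {"cat": math.log(0.7), "cot": math.log(0.3)}

print(ctc_greedy_decode(PROBS, ALPHABET))                   # network only
print(rescore(PROBS, ["cat", "cot"], LM, lm_weight=1.0))    # with explicit LM
```

In this toy setting the network-only output is "cot", while adding the explicit language model score flips the decision to "cat". Setting `lm_weight=0.0` recovers the network-only ranking of the candidates.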
