A Study on Effects of Implicit and Explicit Language Model Information for DBLSTM-CTC Based Handwriting Recognition
Qi Liu, Lijuan Wang, Qiang Huo

TL;DR
This paper investigates how implicit and explicit language model information affect handwriting recognition using DBLSTM-CTC, showing explicit models improve performance even with large training data, and introduces a GPU-based training tool.
Contribution
It compares implicit and explicit language model effects in DBLSTM-CTC handwriting recognition and develops a GPU-based training tool for large-scale CTC training.
Findings
Using an explicit language model in decoding improves recognition accuracy, even when the DBLSTM is trained on one million lines of text.
Large-scale training benefits from GPU-based BPTT.
The implicit language model learned by the DBLSTM through CTC training does not make an explicit language model redundant.
Abstract
Deep Bidirectional Long Short-Term Memory (DBLSTM) with a Connectionist Temporal Classification (CTC) output layer has been established as one of the state-of-the-art solutions for handwriting recognition. It is well known that a DBLSTM trained with a CTC objective function learns both local character image dependency for character modeling and long-range contextual dependency for implicit language modeling. In this paper, we study the effects of implicit and explicit language model information for DBLSTM-CTC based handwriting recognition by comparing recognition performance with and without an explicit language model in decoding. We observe that even when one million lines of training sentences are used to train the DBLSTM, an explicit language model is still helpful. To deal with such a large-scale training problem, a GPU-based training tool has been developed for CTC…
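To make the comparison in the abstract concrete, the following is a minimal sketch of decoding with and without an explicit language model. It is not the authors' decoder, which the abstract does not describe. The blank symbol, the function names, the candidate scores, and the `lm_weight` parameter are all hypothetical. Best-path decoding relies only on what the network has learned (the implicit language model), while rescoring adds a weighted score from an external language model.

```python
BLANK = "_"  # hypothetical symbol standing in for the CTC blank label


def ctc_collapse(path):
    """Map a frame-level CTC path to a label string:
    merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return "".join(out)


def best_path_decode(frame_probs, alphabet):
    """Decode without an explicit language model:
    take the most probable label in each frame, then collapse."""
    path = []
    for probs in frame_probs:
        best = max(range(len(probs)), key=lambda i: probs[i])
        path.append(alphabet[best])
    return ctc_collapse(path)


def rescore(candidates, lm_logprob, lm_weight=1.0):
    """Decode with an explicit language model (simple rescoring):
    candidates is a list of (text, network_log_score) pairs, and
    lm_logprob is any callable returning a log-probability for a text."""
    best = max(candidates, key=lambda c: c[1] + lm_weight * lm_logprob(c[0]))
    return best[0]
```

As a usage illustration with made-up scores: given the candidates `[("cot", -1.0), ("cat", -1.2)]` and a language model that strongly prefers "cat", `rescore` with `lm_weight=0.0` returns the network's top choice "cot", while `lm_weight=1.0` returns "cat".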
